

Organization Name

Google LLC


Honglak Lee of Mountain View CA (US)

Shixiang Gu of Mountain View CA (US)

Sergey Levine of Berkeley CA (US)


This abstract first appeared for US patent application 18673510 titled 'DATA-EFFICIENT HIERARCHICAL REINFORCEMENT LEARNING'.

Abstract: Training and/or utilizing a hierarchical reinforcement learning (HRL) model for robotic control. The HRL model can include at least a higher-level policy model and a lower-level policy model. Some implementations relate to techniques that enable more efficient off-policy training to be utilized in training of the higher-level policy model and/or the lower-level policy model. Some of those implementations utilize off-policy correction, which re-labels higher-level actions of experience data, generated in the past utilizing a previously trained version of the HRL model, with modified higher-level actions. The modified higher-level actions are then utilized to off-policy train the higher-level policy model. This can enable effective off-policy training despite the lower-level policy model being a different version at training time (relative to the version when the experience data was collected).
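The abstract does not say how the modified higher-level actions are chosen. As a minimal sketch of the re-labeling step, assume a higher-level action is a goal vector, and that the modified goal is the candidate under which the current lower-level policy best explains the lower-level actions recorded in the old experience. The function names, the unit-variance Gaussian likelihood, and the toy proportional-controller policies are all hypothetical and are not taken from the application.

```python
import random


def lower_level_log_prob(policy, states, actions, goal):
    """Unnormalized log-likelihood of recorded actions given a goal.

    Assumption: the policy returns the mean of a unit-variance Gaussian,
    so the log-likelihood is -0.5 * squared error, up to a constant.
    """
    total = 0.0
    for state, action in zip(states, actions):
        mean = policy(state, goal)
        total += -0.5 * sum((a - m) ** 2 for a, m in zip(action, mean))
    return total


def relabel_higher_level_action(policy, states, actions, original_goal,
                                num_candidates=8, noise=0.5, rng=None):
    """Pick the candidate goal that best explains the recorded actions."""
    rng = rng or random.Random(0)
    # The original goal is always a candidate, so the result is never worse.
    candidates = [list(original_goal)]
    for _ in range(num_candidates):
        candidates.append([g + rng.gauss(0.0, noise) for g in original_goal])
    return max(
        candidates,
        key=lambda g: lower_level_log_prob(policy, states, actions, g),
    )


# Hypothetical lower-level policies: proportional controllers toward a goal.
def old_policy(state, goal):
    return [1.0 * (g - s) for s, g in zip(state, goal)]


def current_policy(state, goal):
    return [0.5 * (g - s) for s, g in zip(state, goal)]


# Experience collected in the past with the old lower-level policy.
states = [[0.0, 0.0], [0.2, 0.1], [0.4, 0.3]]
old_goal = [1.0, 1.0]
actions = [old_policy(s, old_goal) for s in states]

# Re-label the stored higher-level action for the current lower-level policy.
new_goal = relabel_higher_level_action(current_policy, states, actions, old_goal)
```

Under these assumptions, the re-labeled goal would replace the stored higher-level action before the transition is used to train the higher-level policy model off-policy.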

Key Features and Innovation:

  • Hierarchical reinforcement learning (HRL) model for robotic control
  • Higher-level policy model and lower-level policy model
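  • Off-policy training techniques that make training more efficient
  • Off-policy correction that re-labels higher-level actions in past experience data
  • Effective off-policy training even when the lower-level policy model has changed since the experience data was collected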
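To illustrate how the two policy levels listed above could interact, here is a hedged sketch of a rollout loop. It assumes the higher-level policy issues a new higher-level action (treated here as a goal) every `goal_period` steps, and the lower-level policy picks an action at every step given the state and the current goal. The function `rollout`, the fixed goal period, and the one-dimensional toy policies are illustrative assumptions, not details from the application.

```python
def rollout(higher_policy, lower_policy, step_env, state, num_steps, goal_period):
    """Run a two-level policy and collect higher-level transitions.

    Each transition stores the lower-level states and actions seen while
    its goal was active, so it can be re-labeled later if needed.
    """
    transitions = []
    segment = None
    goal = None
    for t in range(num_steps):
        if t % goal_period == 0:
            if segment is not None:
                # Close the previous segment at the current state.
                segment["next_state"] = state
                transitions.append(segment)
            goal = higher_policy(state)
            segment = {"state": state, "goal": goal,
                       "low_states": [], "low_actions": []}
        action = lower_policy(state, goal)
        segment["low_states"].append(state)
        segment["low_actions"].append(action)
        state = step_env(state, action)
    if segment is not None:
        segment["next_state"] = state
        transitions.append(segment)
    return transitions


# Hypothetical one-dimensional toy problem.
def higher_policy(state):
    return state + 1.0           # aim one unit ahead of the current state


def lower_policy(state, goal):
    return 0.5 * (goal - state)  # move halfway toward the goal


def step_env(state, action):
    return state + action        # trivial dynamics


transitions = rollout(higher_policy, lower_policy, step_env,
                      state=0.0, num_steps=6, goal_period=3)
```

In this sketch, each stored transition keeps the lower-level states and actions alongside the goal, which is the information an off-policy correction step would need when the lower-level policy model later changes.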

Potential Applications:

  • Robotics
  • Autonomous systems
  • Industrial automation
  • Process control

Problems Solved:

  • Efficient training of hierarchical reinforcement learning models
  • Overcoming version differences in training data and policy models

Benefits:

  • Improved robotic control performance
  • Enhanced efficiency in training
  • Adaptability to different versions of policy models

Commercial Applications:

Title: Advanced Robotic Control Systems Utilizing Hierarchical Reinforcement Learning

Potential commercial uses include:

  • Manufacturing automation
  • Warehouse logistics
  • Autonomous vehicles
  • Healthcare robotics

Prior Art: Research in the field of reinforcement learning and robotic control systems can provide valuable insights into prior art related to this technology.

Frequently Updated Research: Stay updated on advancements in reinforcement learning algorithms, robotic control systems, and hierarchical reinforcement learning models for the latest developments in the field.

Questions about Hierarchical Reinforcement Learning:

1. How does hierarchical reinforcement learning differ from traditional reinforcement learning methods?

Hierarchical reinforcement learning learns policies at multiple levels (here, at least a higher-level policy model and a lower-level policy model), whereas traditional reinforcement learning learns a single policy. The layered structure allows for more complex decision-making.

2. What are the advantages of utilizing off-policy training in hierarchical reinforcement learning models?

Off-policy training enables more efficient learning by reusing past experience data, even after the policy models have been updated.
