DEMONSTRATION-DRIVEN REINFORCEMENT LEARNING

Organization Name

DeepMind Technologies Limited

Inventor(s)

Oleg O. Sushkov of London (GB)

Todor Bozhinov Davchev of Edinburgh (GB)

Jonathan Karl Scholz of London (GB)

This abstract first appeared for US patent application 20240412063 titled 'DEMONSTRATION-DRIVEN REINFORCEMENT LEARNING'.

Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a reinforcement learning system to select actions to be performed by an agent interacting with an environment to perform a particular task. In one aspect, one of the methods includes obtaining a training sequence comprising a respective training observation at each of a plurality of time steps; obtaining demonstration data comprising one or more demonstration sequences; generating a new training sequence from the training sequence and the demonstration data; and training a goal-conditioned policy neural network on the new training sequence through reinforcement learning.
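The abstract does not say how the new training sequence is generated from the training sequence and the demonstration data. The Python sketch below shows one possible reading, in which each training step is relabeled with a goal sampled from a demonstration sequence. This relabeling scheme is an assumption for illustration and is not taken from the application. The sparse reward rule and the names `Step`, `relabel_with_demonstration`, and `tolerance` are likewise hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Observation = Tuple[float, ...]


@dataclass(frozen=True)
class Step:
    """One step of the new training sequence (hypothetical structure)."""
    observation: Observation
    goal: Observation
    reward: float


def relabel_with_demonstration(
    training_observations: Sequence[Observation],
    demonstrations: Sequence[Sequence[Observation]],
    rng: random.Random,
    tolerance: float = 1e-6,
) -> List[Step]:
    """Attach a demonstration-derived goal to every training observation.

    Assumption: the goal is a single observation sampled uniformly from one
    randomly chosen demonstration sequence, and the reward is 1.0 only when
    the training observation matches that goal within `tolerance`.
    """
    demonstration = rng.choice(list(demonstrations))
    goal = rng.choice(list(demonstration))
    new_sequence = []
    for observation in training_observations:
        # Largest per-coordinate gap between the observation and the goal.
        distance = max(abs(a - b) for a, b in zip(observation, goal))
        reward = 1.0 if distance <= tolerance else 0.0
        new_sequence.append(Step(observation, goal, reward))
    return new_sequence


# Usage: one demonstration containing one observation, so the goal is fixed.
training = [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]
demos = [[(1.0, 1.0)]]
new_seq = relabel_with_demonstration(training, demos, random.Random(0))
```

In this sketch the resulting sequence of (observation, goal, reward) steps would then be passed to whatever reinforcement learning update trains the goal-conditioned policy; that training step is left out here because the abstract gives no detail on it.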