20240042600. DATA-DRIVEN ROBOT CONTROL simplified abstract (DeepMind Technologies Limited)

From WikiPatents

DATA-DRIVEN ROBOT CONTROL

Organization Name

DeepMind Technologies Limited

Inventor(s)

Serkan Cabi of London (GB)

Ziyu Wang of Markham (CA)

Alexander Novikov of London (GB)

Ksenia Konyushkova of London (GB)

Sergio Gomez Colmenarejo of London (GB)

Scott Ellison Reed of Atlanta, GA (US)

Misha Man Ray Denil of London (GB)

Jonathan Karl Scholz of London (GB)

Oleg O. Sushkov of London (GB)

Rae Chan Jeong of North York (CA)

David Barker of Reading (GB)

David Budden of London (GB)

Mel Vecerik of London (GB)

Yusuf Aytar of London (GB)

Joao Ferdinando Gomes De Freitas of London (GB)

DATA-DRIVEN ROBOT CONTROL - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240042600 titled 'DATA-DRIVEN ROBOT CONTROL'.

Simplified Explanation

Methods, systems, and apparatus are disclosed for data-driven robotic control. The patent application describes a method that maintains robot experience data, obtains annotation data, trains a reward model on the annotation data, and generates task-specific training data for a particular task. The task-specific training data is generated by processing observations from a subset of the robot experience data with the trained reward model to produce reward predictions, which are then associated with the corresponding experiences. A policy neural network is trained on this task-specific training data to generate a control policy for a robot performing the particular task.

  • The patent application describes a method for data-driven robotic control.
  • The method involves maintaining robot experience data and obtaining annotation data.
  • A reward model is trained on the annotation data.
  • Task-specific training data is generated by processing observations from a subset of the robot experience data using the trained reward model.
  • The reward predictions generated from the trained reward model are associated with the corresponding experiences.
  • A policy neural network is trained on the task-specific training data to generate a control policy for a robot performing a particular task.
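The steps above can be sketched in code. This is a minimal illustrative toy, not the patent's implementation: the least-squares reward model, the synthetic experiences, and all variable names are assumptions, and the final policy-training step is reduced to a placeholder that simply selects the highest-reward experience rather than fitting a policy neural network.

```python
# Hypothetical sketch of the described pipeline; the linear reward model and
# synthetic data are illustrative assumptions, not the patent's design.
import numpy as np

rng = np.random.default_rng(0)

# 1. Maintain robot experience data: (observation, action) pairs.
experiences = [(rng.normal(size=3), int(rng.integers(0, 2))) for _ in range(100)]

# 2. Obtain annotation data: task-specific reward labels for some observations.
annotated_obs = np.stack([obs for obs, _ in experiences[:30]])
annotated_rewards = annotated_obs @ np.array([1.0, -0.5, 0.2])  # stand-in labels

# 3. Train a reward model on the annotation data (here: a least-squares fit).
weights, *_ = np.linalg.lstsq(annotated_obs, annotated_rewards, rcond=None)

# 4. Generate task-specific training data: run the trained reward model over a
#    second subset of experiences and attach each reward prediction.
task_data = [(obs, act, float(obs @ weights)) for obs, act in experiences[30:]]

# 5. Train the policy on the reward-annotated data (placeholder: pick the
#    highest-reward experience instead of fitting a policy neural network).
best_obs, best_act, best_reward = max(task_data, key=lambda e: e[2])
print(f"{len(task_data)} reward-labelled experiences; "
      f"best predicted reward {best_reward:.2f}")
```

Because the stand-in labels are exactly linear in the observations, the least-squares step recovers the labelling weights; in the claimed method the reward model would instead be learned from human annotations of logged robot experience.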

Potential applications of this technology:

  • Robotic control systems that can learn from experience and adapt to different tasks.
  • Automation of complex tasks in various industries such as manufacturing, logistics, and healthcare.
  • Autonomous robots capable of performing a wide range of tasks without explicit programming.

Problems solved by this technology:

  • Traditional robotic control methods often require manual programming for each specific task, which can be time-consuming and costly.
  • Adapting robots to new tasks or environments can be challenging without a data-driven approach.
  • This technology enables robots to learn from experience and improve their performance over time.

Benefits of this technology:

  • Increased efficiency and productivity in industries that rely on robotic automation.
  • Flexibility and adaptability of robotic systems to perform different tasks.
  • Reduced need for manual programming and maintenance of robotic control systems.
  • Improved performance and accuracy of robots through continuous learning and optimization.


Original Abstract Submitted

methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. one of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.