US Patent Application 18171845. INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING COMPUTER PROGRAM PRODUCT simplified abstract

From WikiPatents

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING COMPUTER PROGRAM PRODUCT

Organization Name

KABUSHIKI KAISHA TOSHIBA

Inventor(s)

Gaku Minamoto of Kawasaki (JP)

Toshimitsu Kaneko of Kawasaki (JP)

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING COMPUTER PROGRAM PRODUCT - A simplified explanation of the abstract

This abstract first appeared for US patent application 18171845, titled 'INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING COMPUTER PROGRAM PRODUCT'.

Simplified Explanation

The patent application describes an information processing device (Device A) for a mobile robot, which the abstract gives as an exemplary device. The device comprises four units:

  • An acquisition unit (Unit A) acquires the current state of the mobile robot.
  • A first action-value-function specifying unit (Unit B) learns a first inference model by reinforcement learning and specifies a first action-value-function for the mobile robot based on the current state and the first inference model.
  • A second action-value-function specifying unit (Unit C) specifies a second action-value-function for the mobile robot based on the current state and a second inference model. The second inference model is not a parameter update target, that is, its parameters are not updated by the learning.
  • An action determination unit (Unit D) determines the first action of the mobile robot based on both the first and the second action-value-functions.
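
The arrangement described above can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the abstract does not say how the two action-value-functions are combined or what form the inference models take. The sketch assumes tabular models, a one-step Q-learning update for the first model, and a weighted sum followed by an argmax for action determination. The names TabularQ, determine_action, ACTIONS, and weight are hypothetical.

```python
# Hypothetical action set for a mobile robot (not taken from the application).
ACTIONS = ["forward", "left", "right"]


class TabularQ:
    """Tabular action-value function Q(state, action); unseen entries are 0.0."""

    def __init__(self, actions, trainable, table=None):
        self.actions = list(actions)
        self.trainable = trainable      # False: not a parameter update target
        self.table = dict(table or {})  # maps (state, action) -> value

    def values(self, state):
        """Return Q(state, a) for every action, in the order of self.actions."""
        return [self.table.get((state, a), 0.0) for a in self.actions]

    def update(self, state, action, reward, next_state, lr=0.5, gamma=0.9):
        """One-step Q-learning update (an assumed learning rule)."""
        if not self.trainable:
            return  # the second model keeps its parameters unchanged
        target = reward + gamma * max(self.values(next_state))
        old = self.table.get((state, action), 0.0)
        self.table[(state, action)] = old + lr * (target - old)


def determine_action(state, learned_q, fixed_q, weight=0.5):
    """Pick the action maximizing a weighted sum of both value functions.

    The weighted sum is an assumption made for illustration only.
    """
    q1 = learned_q.values(state)  # first action-value-function
    q2 = fixed_q.values(state)    # second action-value-function
    combined = [weight * a + (1.0 - weight) * b for a, b in zip(q1, q2)]
    best = max(range(len(combined)), key=combined.__getitem__)
    return learned_q.actions[best]
```

In this sketch, a fixed model built with `TabularQ(ACTIONS, trainable=False, table={...})` steers the choice while the learned model is still untrained, and the learned model takes over as its values grow through `update`. Calling `update` on the fixed model has no effect.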


Original Abstract Submitted

An information processing device A includes an acquisition unit A, a first action-value-function specifying unit B, a second action-value-function specifying unit C, and an action determination unit D. The acquisition unit A acquires a current state of a mobile robot as an exemplary device. The first action-value-function specifying unit B has functioning of learning a first inference model by reinforcement learning, and specifies a first action-value-function of the mobile robot based on the current state and the first inference model. The second action-value-function specifying unit C specifies a second action-value-function of the mobile robot based on the current state and a second inference model that is not a parameter update target. The action determination unit D determines a first action of the mobile robot based on the first action-value-function and the second action-value-function.