20230177348. REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION simplified abstract (RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY)

From WikiPatents

REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION

Organization Name

RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY

Inventor(s)

Min Jong Yoo of Suwon-si (KR)

Gwang Pyo Yoo of Suwon-si (KR)

Hong Uk Woo of Suwon-si (KR)

REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20230177348, titled 'REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION'.

Simplified Explanation

The patent application describes a reinforcement learning method that uses a task decomposition inference model in a time-variant environment. Here are the key points:

  • The method selects paired transitions from a dataset, where each pair shares a common characteristic that is invariant over time and differs in an environmental characteristic that changes over time.
  • A cycle generative adversarial network (CycleGAN) is used to identify these paired transitions.
  • An autoencoder is trained to embed the time-variant and time-invariant parts of each paired transition into a latent space.
  • Reinforcement learning is then performed, using the trained autoencoder, on transitions collected in the time-variant environment.
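The decomposition step above can be sketched as a toy autoencoder whose latent code is split into a time-invariant half and a time-variant half, with a reconstruction loss plus an alignment loss that pulls the invariant codes of a paired transition together. Everything here (the dimensions, the single linear encoder/decoder layers, the specific loss terms) is a hypothetical illustration under simplifying assumptions, not the patent's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8   # transition feature size (assumed for illustration)
Z_INV = 3       # time-invariant latent dimensions
Z_VAR = 3       # time-variant latent dimensions

# Encoder/decoder reduced to single linear maps, purely for illustration.
W_enc = rng.normal(scale=0.1, size=(STATE_DIM, Z_INV + Z_VAR))
W_dec = rng.normal(scale=0.1, size=(Z_INV + Z_VAR, STATE_DIM))

def encode(x):
    """Map a transition vector to (z_inv, z_var): the time-invariant
    and time-variant halves of the latent code."""
    z = x @ W_enc
    return z[..., :Z_INV], z[..., Z_INV:]

def decode(z_inv, z_var):
    """Reconstruct the transition from the concatenated latent code."""
    return np.concatenate([z_inv, z_var], axis=-1) @ W_dec

# A stand-in "paired transition": same underlying task, different
# environment dynamics, here faked as a random perturbation.
x_a = rng.normal(size=(STATE_DIM,))
x_b = x_a + rng.normal(scale=0.5, size=(STATE_DIM,))

zi_a, zv_a = encode(x_a)
zi_b, zv_b = encode(x_b)

# Reconstruction loss: the autoencoder must preserve each transition.
recon = np.mean((decode(zi_a, zv_a) - x_a) ** 2)

# Alignment loss: paired transitions should share the invariant code,
# while the variant code is free to absorb the environmental change.
align = np.mean((zi_a - zi_b) ** 2)

loss = recon + align
```

A real implementation would train nonlinear encoder/decoder networks by gradient descent on such a combined loss, and a downstream RL agent would then condition on the learned latent codes; this sketch only shows how the latent split separates what is shared across a pair from what differs.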

Potential applications of this technology:

  • This method can be applied to various reinforcement learning tasks in dynamic environments, such as robotics, autonomous vehicles, and game playing.
  • It can be used to improve the efficiency and effectiveness of learning in time-variant environments.

Problems solved by this technology:

  • Traditional reinforcement learning methods struggle to adapt to time-variant environments, as they do not consider the changing characteristics of the environment.
  • This method addresses this problem by decomposing the task into time-invariant and time-variant parts, allowing for more effective learning in dynamic environments.

Benefits of this technology:

  • By considering both time-invariant and time-variant characteristics, this method enables more accurate and efficient learning in dynamic environments.
  • It allows for better adaptation to changing environmental conditions, leading to improved performance in tasks that require real-time decision making.


Original Abstract Submitted

according to an exemplary embodiment of the present invention, a reinforcement learning method using a task decomposition inference model in a time-variant environment includes selecting a plurality of paired transitions having a time-invariant common characteristic and a time-variant different environmental characteristic from a dataset including a plurality of transition data, based on a cycle generative adversarial network (gan), training an auto encoder to embed each of the time-variant part and the time-invariant part with respect to the plurality of paired transitions into a latent space, and performing reinforcement learning on a transition corresponding to data collected in the time-variant environment, using the trained auto encoder.