20230177348. REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION simplified abstract (RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY)
REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION
Organization Name
RESEARCH & BUSINESS FOUNDATION SUNGKYUNKWAN UNIVERSITY
Inventor(s)
Gwang Pyo Yoo of Suwon-si (KR)
REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20230177348, titled 'REINFORCEMENT LEARNING METHOD AND APPARATUS USING TASK DECOMPOSITION'.
Simplified Explanation
The patent application describes a reinforcement learning method that uses a task decomposition inference model in a time-variant environment. Here are the key points:
- The method selects paired transitions from a dataset, where each pair shares a common characteristic that does not change over time (time-invariant) but differs in an environmental characteristic that does change over time (time-variant).
- A cycle generative adversarial network (cycle GAN) is used to identify these paired transitions.
- An autoencoder is trained to embed the time-variant and time-invariant parts of each paired transition into a latent space.
- Reinforcement learning is then performed on transitions collected in the time-variant environment, using the trained autoencoder.
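The steps above can be illustrated with a toy sketch of a split-latent autoencoder. This is a hypothetical illustration only: the patent does not disclose an architecture, so the linear encoder/decoder, the dimensions, and the pairing loss below are assumptions. The key idea shown is that the latent space is partitioned into a time-invariant code `z_inv` and a time-variant code `z_var`, and that paired transitions (found, per the abstract, via a cycle GAN) are pushed to agree on `z_inv`.

```python
import numpy as np

rng = np.random.default_rng(0)


class SplitLatentAutoencoder:
    """Toy linear autoencoder whose latent space is split into a
    time-invariant part (z_inv) and a time-variant part (z_var).
    Hypothetical sketch; the patent does not specify this architecture."""

    def __init__(self, obs_dim, inv_dim, var_dim):
        # Randomly initialized linear maps stand in for trained networks.
        self.W_inv = rng.normal(size=(inv_dim, obs_dim)) * 0.1
        self.W_var = rng.normal(size=(var_dim, obs_dim)) * 0.1
        self.W_dec = rng.normal(size=(obs_dim, inv_dim + var_dim)) * 0.1

    def encode(self, x):
        # Embed a transition into the two latent parts.
        return self.W_inv @ x, self.W_var @ x

    def decode(self, z_inv, z_var):
        # Reconstruct the transition from the combined latent code.
        return self.W_dec @ np.concatenate([z_inv, z_var])

    def reconstruction_loss(self, x):
        z_inv, z_var = self.encode(x)
        return float(np.mean((self.decode(z_inv, z_var) - x) ** 2))


def pairing_loss(ae, x_a, x_b):
    """Paired transitions share a time-invariant characteristic, so their
    time-invariant codes are encouraged to agree (assumed training signal)."""
    z_inv_a, _ = ae.encode(x_a)
    z_inv_b, _ = ae.encode(x_b)
    return float(np.mean((z_inv_a - z_inv_b) ** 2))
```

In a full pipeline, gradient descent would minimize the reconstruction and pairing losses jointly, and the resulting latent codes would serve as the state representation for the downstream reinforcement-learning step.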
Potential applications of this technology:
- This method can be applied to various reinforcement learning tasks in dynamic environments, such as robotics, autonomous vehicles, and game playing.
- It can be used to improve the efficiency and effectiveness of learning in time-variant environments.
Problems solved by this technology:
- Traditional reinforcement learning methods struggle in time-variant environments because they do not model the environment's changing characteristics.
- This method addresses the problem by decomposing the task into time-invariant and time-variant parts, allowing more effective learning in dynamic environments.
Benefits of this technology:
- By considering both time-invariant and time-variant characteristics, this method enables more accurate and efficient learning in dynamic environments.
- It allows for better adaptation to changing environmental conditions, leading to improved performance in tasks that require real-time decision making.
Original Abstract Submitted
according to an exemplary embodiment of the present invention, a reinforcement learning method using a task decomposition inference model in a time-variant environment includes selecting a plurality of paired transitions having a time-invariant common characteristic and a time-variant different environmental characteristic from a dataset including a plurality of transition data, based on a cycle generative adversarial network (gan), training an auto encoder to embed each of the time-variant part and the time-invariant part with respect to the plurality of paired transitions into a latent space, and performing reinforcement learning on a transition corresponding to data collected in the time-variant environment, using the trained auto encoder.