REINFORCEMENT LEARNING (RL) POLICY WITH GUIDED META RL

Organization Name

HONDA MOTOR CO., LTD.

Inventor(s)

Kanghoon Lee of Daejeon (KR)

Jiachen Li of Mountain View CA (US)

David F. Isele of Sunnyvale CA (US)

Jinkyoo Park of Palo Alto CA (US)

This abstract first appeared for US patent application 18455056, titled 'REINFORCEMENT LEARNING (RL) POLICY WITH GUIDED META RL'.

Original Abstract Submitted

According to one aspect, a system for generating a reinforcement learning (RL) policy with guided meta RL is provided. The system may include a processor and a memory. The memory may store one or more instructions. The processor may execute one or more of the instructions stored on the memory to perform one or more acts, actions, and/or steps, such as generating an initial RL policy for an ego-vehicle based on an intelligent driver model (IDM), generating a set of RL guiding policies for a set of social agents based on the initial RL policy and a set of preferences, generating a meta-RL guided policy based on the set of RL guiding policies, and generating an RL policy with guided meta RL for the ego-vehicle based on the meta-RL guided policy and the IDM.
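
The abstract names the intelligent driver model (IDM) as the basis for both the initial and the final ego-vehicle policy but does not spell it out. As a point of reference, the following is a minimal sketch of the standard IDM car-following acceleration rule. The class name `IDMParams`, the function name `idm_acceleration`, and every default parameter value are illustrative assumptions and are not taken from the application; the clamp on the dynamic part of the desired gap is a common variant rather than something the abstract specifies.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IDMParams:
    """Hypothetical IDM parameters; the values are illustrative only."""
    desired_speed: float = 30.0   # v0, free-road target speed (m/s)
    time_headway: float = 1.5     # T, desired time gap to the leader (s)
    min_gap: float = 2.0          # s0, standstill bumper-to-bumper gap (m)
    max_accel: float = 1.0        # a, maximum acceleration (m/s^2)
    comfort_decel: float = 1.5    # b, comfortable braking deceleration (m/s^2)
    exponent: float = 4.0         # delta, free-road acceleration exponent


def idm_acceleration(speed: float, gap: float, approach_rate: float,
                     p: IDMParams = IDMParams()) -> float:
    """Return the IDM longitudinal acceleration for a following vehicle.

    speed:         follower speed (m/s)
    gap:           bumper-to-bumper distance to the leader (m), must be > 0;
                   pass math.inf when there is no leader
    approach_rate: follower speed minus leader speed (m/s)
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    # Desired gap: standstill gap plus a speed-dependent dynamic part.
    # The dynamic part is clamped at zero (assumed variant) so that a
    # fast-receding leader cannot shrink the desired gap below min_gap.
    dynamic = (speed * p.time_headway
               + speed * approach_rate
               / (2.0 * math.sqrt(p.max_accel * p.comfort_decel)))
    desired_gap = p.min_gap + max(0.0, dynamic)
    free_road_term = (speed / p.desired_speed) ** p.exponent
    interaction_term = (desired_gap / gap) ** 2
    return p.max_accel * (1.0 - free_road_term - interaction_term)
```

In a pipeline like the one the abstract describes, a rule of this kind could supply baseline longitudinal actions that a learned policy then refines; how the application actually combines the IDM with the RL policy is not stated in the abstract.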