17936459. METHOD AND SYSTEM FOR MODELING PERSONALIZED CAR-FOLLOWING DRIVING STYLES WITH MODEL-FREE INVERSE REINFORCEMENT LEARNING simplified abstract (TOYOTA JIDOSHA KABUSHIKI KAISHA)

From WikiPatents

Organization Name

TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor(s)

Ziran Wang of San Jose CA (US)

Kyungtae Han of Palo Alto CA (US)

Rohit Gupta of Santa Clara CA (US)

METHOD AND SYSTEM FOR MODELING PERSONALIZED CAR-FOLLOWING DRIVING STYLES WITH MODEL-FREE INVERSE REINFORCEMENT LEARNING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17936459, titled 'METHOD AND SYSTEM FOR MODELING PERSONALIZED CAR-FOLLOWING DRIVING STYLES WITH MODEL-FREE INVERSE REINFORCEMENT LEARNING'.

Simplified Explanation

The method described in the abstract uses inverse reinforcement learning to learn a reward function for each of a plurality of first vehicles, groups those vehicles into clusters based on their reward functions, and computes a centroid reward function for each cluster. When a second vehicle is introduced, its data is compared against the first vehicles' data to find the most similar first vehicle; the second vehicle is then assigned to that vehicle's cluster and controlled using that cluster's centroid reward function.

  • Learning reward functions for vehicles using inverse reinforcement learning
  • Associating vehicles with clusters based on reward functions
  • Determining centroid reward functions for clusters
  • Comparing data of a second vehicle with first vehicle data
  • Associating the second vehicle with a cluster based on similarity
  • Controlling the second vehicle based on the centroid reward function of the cluster
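The learning and clustering steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes (as in many IRL formulations, though the abstract does not say so) that each learned reward function is linear in a few hand-picked car-following features, so a reward function reduces to a weight vector that can be clustered, with cluster centroids playing the role of the centroid reward functions. The feature choice, the placeholder IRL step, and the k-means routine are all illustrative assumptions.

```python
import numpy as np

def learn_reward_weights(trajectories):
    """Stand-in for model-free IRL: return one reward-weight vector per driver.

    Each trajectory is a (T, d) array of per-step car-following features
    (e.g. gap, relative speed). A real IRL step would be used here; averaging
    the feature statistics is only a placeholder to keep the sketch runnable.
    """
    return np.array([traj.mean(axis=0) for traj in trajectories])

def kmeans(weights, k, iters=50, seed=0):
    """Tiny k-means over reward-weight vectors.

    The returned centroids act as the cluster-level 'centroid reward
    functions'; labels give each driver's cluster assignment.
    """
    rng = np.random.default_rng(seed)
    centroids = weights[rng.choice(len(weights), size=k, replace=False)]
    for _ in range(iters):
        # Assign each driver's weight vector to the nearest centroid.
        dists = np.linalg.norm(weights[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its cluster's weight vectors.
        new = np.array([weights[labels == j].mean(axis=0) if (labels == j).any()
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids
```

With synthetic data, two drivers whose learned weights are similar end up in the same cluster, and the cluster centroid summarizes their shared driving style.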

Potential Applications

This technology could be applied to personalized adaptive cruise control and other autonomous driving systems, as well as to robotics and machine-learning pipelines that adapt decision-making to individual user behavior.

Problems Solved

This technology improves the efficiency and performance of autonomous vehicles by learning driving styles from the behavior of other vehicles and adapting the controlled vehicle's decision-making accordingly.

Benefits

The benefits of this technology include enhanced safety, optimized traffic flow, reduced energy consumption, and improved overall transportation systems.

Potential Commercial Applications

One potential commercial application of this technology could be in the development of advanced autonomous driving systems for cars, trucks, and drones.

Possible Prior Art

One possible prior art could be research on clustering algorithms for data analysis in machine learning and robotics.

Unanswered Questions

How does this technology ensure data privacy and security in the learning process?

The abstract does not mention how the privacy and security of the vehicle data are maintained during the learning process.

What are the computational requirements for implementing this method on a large scale?

The abstract does not provide information on the computational resources needed to scale this method for a large number of vehicles in real-world applications.


Original Abstract Submitted

A method may include learning reward functions for a plurality of first vehicles based on first vehicle data associated with the plurality of first vehicles using inverse reinforcement learning, associating each vehicle of the plurality of first vehicles with one cluster among a plurality of clusters based on the reward functions, determining a centroid reward function for each of the clusters based on the reward functions associated with each cluster, performing a comparison between second vehicle data associated with a second vehicle and the first vehicle data, determining a vehicle among the plurality of first vehicles having associated first vehicle data that is most similar to the second vehicle data based on the comparison, associating the second vehicle with the cluster associated with the determined vehicle, and controlling operation of the second vehicle based on the centroid reward function of the cluster associated with the second vehicle.
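The abstract's final steps, matching a second vehicle to its most similar first vehicle and then controlling it with the inherited cluster's centroid reward function, might look like the following minimal sketch. The Euclidean similarity measure, the summary features, and the greedy one-step acceleration search are all illustrative assumptions, not details from the patent.

```python
import numpy as np

def assign_cluster(second_features, first_features, labels):
    """Find the first vehicle whose feature summary is most similar to the
    second vehicle's (Euclidean distance here, an illustrative choice) and
    inherit that vehicle's cluster label."""
    nearest = np.linalg.norm(first_features - second_features, axis=1).argmin()
    return labels[nearest]

def target_acceleration(state, centroid_weights):
    """Greedy one-step control: evaluate candidate accelerations under the
    (assumed linear) centroid reward and pick the best-scoring one."""
    candidates = np.linspace(-3.0, 2.0, 51)  # candidate accelerations, m/s^2
    feats = np.array([[state["gap"], state["rel_speed"], a] for a in candidates])
    return candidates[(feats @ centroid_weights).argmax()]
```

A practical controller would plan over a horizon rather than one step, but the sketch shows how a centroid reward function, once inherited via the nearest-neighbor match, directly drives the control decision.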