EXPERIENCE SELECTION IN REINFORCEMENT LEARNING

This abstract first appeared for US patent application 18895583, titled 'EXPERIENCE SELECTION IN REINFORCEMENT LEARNING'.

Original Abstract Submitted

Techniques described herein include selecting experience data for use when training or retraining a model. In one example, this disclosure describes a method that includes generating a plurality of trajectories, each comprising a contiguous sequence of instances of experience data, where each instance of experience data in the contiguous sequence has an error value associated with that instance of experience data; determining, for each of the trajectories, a sorted order of the instances of experience data, wherein the sorted order is based on the error value associated with each of the instances of experience data; selecting, based on a distribution function applied to the sorted order of the instances of experience data in at least one of the trajectories, a subset of instances of the experience data; and retraining a reinforcement learning model, using the subset of instances of experience data, to predict an optimal action to take in a state.
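
The steps named in the abstract (sort each trajectory by error, apply a distribution function to the sorted order, select a subset, retrain) can be illustrated with a minimal Python sketch. This is not the applicant's implementation. The abstract does not say which distribution function or model is used, so the sketch assumes a rank-based power law (weight proportional to (1/rank)^alpha) and a tabular Q-learning update as stand-ins. All names (Experience, sort_by_error, rank_weights, select_subset, retrain_q_table) and parameters (alpha, k, lr, gamma) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Experience:
    """One instance of experience data (hypothetical layout)."""
    state: int
    action: int
    reward: float
    next_state: int
    error: float  # assumed to be something like an absolute TD error


def sort_by_error(trajectory):
    """Return the instances of one trajectory, largest error first."""
    return sorted(trajectory, key=lambda e: e.error, reverse=True)


def rank_weights(n, alpha=1.0):
    """Assumed distribution function: rank r (1-based) gets weight (1/r)**alpha,
    normalized so the weights sum to 1."""
    raw = [(1.0 / r) ** alpha for r in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def select_subset(trajectories, k, alpha=1.0, rng=None):
    """Draw k instances per non-empty trajectory, favoring high-error ranks.
    Sampling is with replacement, so an instance may appear more than once."""
    rng = rng or random.Random()
    selected = []
    for trajectory in trajectories:
        if not trajectory:
            continue  # nothing to rank in an empty trajectory
        ordered = sort_by_error(trajectory)
        weights = rank_weights(len(ordered), alpha)
        selected.extend(rng.choices(ordered, weights=weights, k=k))
    return selected


def retrain_q_table(q, batch, n_actions, lr=0.1, gamma=0.9):
    """Stand-in for the retraining step: one tabular Q-learning update
    per selected instance. q maps (state, action) to a value estimate."""
    for e in batch:
        best_next = max(q.get((e.next_state, a), 0.0) for a in range(n_actions))
        old = q.get((e.state, e.action), 0.0)
        q[(e.state, e.action)] = old + lr * (e.reward + gamma * best_next - old)
    return q


def best_action(q, state, n_actions):
    """Predict the action with the highest estimated value in a state."""
    return max(range(n_actions), key=lambda a: q.get((state, a), 0.0))
```

Under these assumptions, a larger alpha concentrates the selection on the highest-error instances of each trajectory, while alpha = 0 reduces to uniform sampling. A call such as select_subset(trajectories, k=4, rng=random.Random(0)) followed by retrain_q_table({}, batch, n_actions=2) would run the whole pipeline on small integer-coded states.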