US Patent Application 18348587. ROBUST TRAINING IN THE PRESENCE OF LABEL NOISE simplified abstract

From WikiPatents

ROBUST TRAINING IN THE PRESENCE OF LABEL NOISE

Organization Name

Google LLC


Inventor(s)

Zizhao Zhang of San Jose CA (US)

Sercan Omer Arik of San Francisco CA (US)

Tomas Jon Pfister of Foster City CA (US)

Han Zhang of Sunnyvale CA (US)

ROBUST TRAINING IN THE PRESENCE OF LABEL NOISE - A simplified explanation of the abstract

This abstract first appeared for US patent application 18348587 titled 'ROBUST TRAINING IN THE PRESENCE OF LABEL NOISE'.

Simplified Explanation

The patent application describes a method for training a model on labeled training samples whose given labels may be inaccurate. The method involves the following steps:

  • Obtaining a set of labeled training samples, each associated with a given label
  • For each sample, generating a pseudo label and estimating a weight that indicates how accurate the given label is
  • Determining whether the weight of each labeled training sample satisfies a weight threshold
  • Adding the sample to a set of cleanly labeled samples if the weight threshold is satisfied
  • Adding the sample to a set of mislabeled samples if the weight threshold is not satisfied
  • Training the model on the cleanly labeled samples using their given labels
  • Training the model on the mislabeled samples using their corresponding pseudo labels
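
The partitioning steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the applicants' implementation: the names `Sample` and `partition_by_weight` are hypothetical, the pseudo labels and weights are taken as precomputed inputs because the abstract does not say how they are produced, and "satisfies the weight threshold" is assumed here to mean `weight >= weight_threshold`.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    """A labeled training sample (hypothetical container for illustration)."""
    features: Sequence[float]
    given_label: int


# Each training pair is (features, label the model should be trained on).
TrainingPair = Tuple[Sequence[float], int]


def partition_by_weight(
    samples: Sequence[Sample],
    pseudo_labels: Sequence[int],
    weights: Sequence[float],
    weight_threshold: float,
) -> Tuple[List[TrainingPair], List[TrainingPair]]:
    """Split samples into cleanly labeled and mislabeled training sets.

    Assumption: a weight "satisfies" the threshold when it is greater than
    or equal to it. The abstract does not specify the comparison.
    """
    if not (len(samples) == len(pseudo_labels) == len(weights)):
        raise ValueError("samples, pseudo_labels and weights must align")

    clean: List[TrainingPair] = []
    mislabeled: List[TrainingPair] = []
    for sample, pseudo_label, weight in zip(samples, pseudo_labels, weights):
        if weight >= weight_threshold:
            # Trusted sample: keep the given label as the training target.
            clean.append((sample.features, sample.given_label))
        else:
            # Suspect sample: train on the pseudo label instead.
            mislabeled.append((sample.features, pseudo_label))
    return clean, mislabeled
```

Both returned sets would then be passed to whatever training routine is in use; the only difference between them is which label each sample carries.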


Original Abstract Submitted

A method for training a model comprises obtaining a set of labeled training samples each associated with a given label. For each labeled training sample, the method includes generating a pseudo label and estimating a weight of the labeled training sample indicative of an accuracy of the given label. The method also includes determining whether the weight of the labeled training sample satisfies a weight threshold. When the weight of the labeled training sample satisfies the weight threshold, the method includes adding the labeled training sample to a set of cleanly labeled training samples. Otherwise, the method includes adding the labeled training sample to a set of mislabeled training samples. The method includes training the model with the set of cleanly labeled training samples using corresponding given labels and the set of mislabeled training samples using corresponding pseudo labels.
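
One possible way to read the final training step is as a single objective over both sets. The notation below is an assumption made for illustration and does not come from the application: $f_\theta$ is the model, $\ell$ is a per-sample loss such as cross-entropy, $y_i$ is the given label, $\hat{y}_i$ is the pseudo label, $w_i$ is the estimated weight, and $\tau$ is the weight threshold. The abstract does not state how the two terms are combined, so the unweighted sum is also an assumption.

```latex
\mathcal{C} = \{\, i : w_i \ge \tau \,\}, \qquad
\mathcal{M} = \{\, i : w_i < \tau \,\}

\mathcal{L}(\theta)
  = \sum_{i \in \mathcal{C}} \ell\bigl(f_\theta(x_i),\, y_i\bigr)
  + \sum_{i \in \mathcal{M}} \ell\bigl(f_\theta(x_i),\, \hat{y}_i\bigr)
```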