18422856. Contrastive Pre-Training for Language Tasks simplified abstract (Google LLC)

From WikiPatents

Contrastive Pre-Training for Language Tasks

Organization Name

Google LLC

Inventor(s)

Thang Minh Luong of Santa Clara, CA (US)

Quoc V. Le of Sunnyvale, CA (US)

Kevin Stefan Clark of San Francisco, CA (US)

Contrastive Pre-Training for Language Tasks - A simplified explanation of the abstract

This abstract first appeared for US patent application 18422856 titled 'Contrastive Pre-Training for Language Tasks'.

Simplified Explanation

The patent application describes a method for training a machine-learned language encoding model using a contrastive learning task, in which the encoder learns to distinguish input tokens from plausible alternatives. Some of the input tokens are masked out and replaced with samples from a generator, and the encoder is trained to predict whether each token is original or a replacement.

  • The method trains a machine-learned language encoding model through a contrastive learning task.
  • The encoder learns to distinguish input tokens from plausible alternatives.
  • Some tokens are masked out and replaced with samples from a generator during training.
  • The encoder is trained to predict whether each token is original or a replacement (a minimal sketch of this step follows the list).
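
The following is a minimal sketch of one training step of this scheme, assuming a PyTorch setup; the generator and discriminator modules, the masking probability, and the loss weighting are illustrative placeholders rather than the patent's specific implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_pretraining_step(tokens, generator, discriminator,
                                 mask_id, mask_prob=0.15):
    """One hypothetical training step: mask some tokens, replace them with
    generator samples, and train the encoder (discriminator) to spot them.

    tokens:        LongTensor [batch, seq_len] of input token ids
    generator:     small masked language model -> logits [batch, seq_len, vocab]
    discriminator: encoder -> one logit per position [batch, seq_len]
    """
    # 1. Choose roughly `mask_prob` (e.g. 15%) of the positions to mask out.
    mask = torch.bernoulli(
        torch.full(tokens.shape, mask_prob, device=tokens.device)).bool()
    masked_tokens = tokens.masked_fill(mask, mask_id)

    # 2. The generator proposes plausible replacements for the masked positions
    #    and is itself trained with an ordinary masked-language-model loss.
    gen_logits = generator(masked_tokens)                       # [B, T, V]
    gen_loss = F.cross_entropy(gen_logits[mask], tokens[mask])
    with torch.no_grad():
        probs = F.softmax(gen_logits[mask], dim=-1)
        sampled = torch.multinomial(probs, 1).squeeze(-1)
    corrupted = masked_tokens.clone()
    corrupted[mask] = sampled

    # 3. The encoder predicts, for every position, whether the token was
    #    replaced (label 1) or comes from the original input (label 0).
    labels = (corrupted != tokens).float()
    disc_logits = discriminator(corrupted)                      # [B, T]
    disc_loss = F.binary_cross_entropy_with_logits(disc_logits, labels)

    # Joint objective; the 50x weighting is an illustrative choice.
    return gen_loss + 50.0 * disc_loss
```

In this sketch the generator's samples are detached from the computation graph, so the encoder's training signal comes only from the binary original-vs-replacement prediction.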

Potential Applications

This technology could be applied in natural language processing tasks, such as text generation, machine translation, and sentiment analysis.

Problems Solved

This technology helps improve the performance of language encoding models by training them to distinguish between original tokens and replacements, leading to better language understanding and generation.

Benefits

The method enhances the capabilities of machine-learned language encoding models, making them more accurate and effective in various natural language processing tasks.

Potential Commercial Applications

This technology could be valuable for companies working on language processing applications, such as chatbots, virtual assistants, and automated translation services.

Possible Prior Art

Potential prior art includes the use of contrastive learning tasks to train machine learning models in other domains, such as image recognition and speech processing.

What are the specific techniques used in the masking and replacement process during training?

During training, some subset of the original input tokens is masked out, typically around 15%. These masked tokens are then replaced with samples from a generator, which could be a small masked language model.
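
As an illustration of this masking-and-replacement step, here is a toy PyTorch example; the vocabulary, the hand-built generator logits, and the fixed masked position are made up for readability and stand in for a small masked language model.

```python
import torch
import torch.nn.functional as F

# Toy vocabulary, purely for illustration; a real setup uses a subword vocab.
vocab = ["[MASK]", "the", "chef", "cooked", "ate", "meal", "ran"]
tok = {w: i for i, w in enumerate(vocab)}

# "the chef cooked the meal"
original = torch.tensor([tok["the"], tok["chef"], tok["cooked"],
                         tok["the"], tok["meal"]])

# In practice roughly 15% of positions are chosen at random; here the verb
# position is masked deterministically so the example stays readable.
masked_pos = 2
masked = original.clone()
masked[masked_pos] = tok["[MASK]"]

# Stand-in for a small masked language model: hand-built logits that put most
# of the probability mass on contextually plausible fillers ("cooked", "ate").
gen_logits = torch.full((len(vocab),), -4.0)
gen_logits[tok["cooked"]] = 2.0
gen_logits[tok["ate"]] = 1.8

sampled = torch.multinomial(F.softmax(gen_logits, dim=-1), 1).item()
corrupted = masked.clone()
corrupted[masked_pos] = sampled
print([vocab[i] for i in corrupted])  # e.g. ['the', 'chef', 'ate', 'the', 'meal']
```

Because replacements are sampled from the generator's distribution rather than drawn uniformly from the vocabulary, they tend to be plausible in context, which makes the encoder's detection task harder and more informative.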

How does the encoder differentiate between original tokens and replacements during training?

The encoder is trained to predict whether each token comes from the original data or is a replacement produced by the generator. This is achieved through the contrastive learning task, in which the encoder learns to distinguish input tokens from plausible alternatives.
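
A minimal sketch of how such a per-token original-vs-replacement prediction could be set up in PyTorch is shown below; the TokenDiscriminator module and its dimensions are hypothetical, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenDiscriminator(nn.Module):
    """Hypothetical encoder with a per-token binary head: for each position it
    outputs one logit scoring 'replacement' vs 'original'."""

    def __init__(self, vocab_size, d_model=64, nhead=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 1)     # one logit per token

    def forward(self, token_ids):             # [batch, seq_len]
        hidden = self.encoder(self.embed(token_ids))
        return self.head(hidden).squeeze(-1)  # [batch, seq_len]

def replaced_token_loss(model, original, corrupted):
    # Label is 1 where the corrupted sequence differs from the original, else 0.
    labels = (corrupted != original).float()
    logits = model(corrupted)
    return F.binary_cross_entropy_with_logits(logits, labels)
```

The labels are derived directly by comparing the corrupted sequence with the original, so every position contributes to the training signal, not only the masked ones.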


Original Abstract Submitted

Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.