18990884. Contrastive Pre-Training for Language Tasks (Google LLC)
Organization Name
Google LLC
Inventor(s)
Thang Minh Luong of Santa Clara, CA (US)
Kevin Stefan Clark of San Francisco, CA (US)
This abstract first appeared for US patent application 18990884, titled 'Contrastive Pre-Training for Language Tasks'.
Original Abstract Submitted
Systems and methods are provided that train a machine-learned language encoding model through the use of a contrastive learning task. In particular, the present disclosure describes a contrastive learning task where the encoder learns to distinguish input tokens from plausible alternatives. In some implementations, on each training example the proposed method masks out some subset (e.g., 15%) of the original input tokens, replaces the masked tokens with samples from a “generator” (e.g., which may be a small masked language model), and then trains the encoder to predict whether each token comes from the original data or is a replacement produced by the generator.
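The corruption step described in the abstract can be illustrated with a minimal sketch. The function below is a hypothetical data-preparation helper (not from the patent itself): it masks roughly 15% of the input positions, fills each masked position with a sample from a stand-in `generator_sample` callable (representing the small masked language model), and emits per-token labels indicating whether each position holds the original token or a replacement, which is what the encoder would be trained to predict.

```python
import random

def make_replaced_token_detection_example(tokens, generator_sample,
                                          mask_frac=0.15, rng=None):
    """Build one training example for the replaced-token-detection task.

    tokens: list of input tokens (e.g. word pieces).
    generator_sample: callable(position, tokens) -> a plausible replacement
        token; stands in for the small masked-language-model "generator".
    Returns (corrupted_tokens, labels), where labels[i] is 1 if position i
    still holds the original token and 0 if the generator replaced it.
    """
    rng = rng or random.Random()
    # Choose e.g. 15% of positions to corrupt (at least one).
    n_mask = max(1, round(mask_frac * len(tokens)))
    masked_positions = set(rng.sample(range(len(tokens)), n_mask))

    corrupted, labels = [], []
    for i, tok in enumerate(tokens):
        if i in masked_positions:
            replacement = generator_sample(i, tokens)
            corrupted.append(replacement)
            # If the generator happens to reproduce the original token,
            # the position still counts as "original" for the encoder.
            labels.append(1 if replacement == tok else 0)
        else:
            corrupted.append(tok)
            labels.append(1)
    return corrupted, labels
```

A real implementation would draw `generator_sample` from the generator's softmax over the vocabulary; here any callable works, e.g. `lambda i, toks: "banana"` for a toy demonstration.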