US Patent Application 18089684: Modeling Ambiguity in Neural Machine Translation (simplified abstract)


Modeling Ambiguity in Neural Machine Translation

Organization Name

Google LLC


Inventor(s)

Felix Stahlberg of Berlin (DE)

Shankar Kumar of New York, NY (US)

Modeling Ambiguity in Neural Machine Translation - A simplified explanation of the abstract

This abstract first appeared for US patent application 18089684, titled 'Modeling Ambiguity in Neural Machine Translation'.

Simplified Explanation

The technology aims to improve neural machine translation by modeling ambiguity: cases where a single source text has more than one valid translation.

  • An encoder module receives a given text exemplar and generates an encoded representation of it.
  • A decoder module receives the encoded representation and a set of translation prefixes.
  • For each pair of exemplar and translation prefix, the decoder module outputs an unbounded function (raw scores) over a set of tokens.
  • Each token in the exemplar's vocabulary is assigned a probability between 0 and 1 at each time step.
  • A logits module converts the unbounded function into a bounded conditional probability for each token.
  • These probabilities are not normalized over the vocabulary at each time step, so unlike a softmax distribution they need not sum to 1.
  • A loss function module identifies whether each target text in a set of target texts is a valid translation of the exemplar.
  • The loss function module combines a positive loss component and a scaled negative loss component (see the sketch after this list).
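
A per-token bounded probability that is not normalized over the vocabulary is consistent with applying an element-wise sigmoid to the decoder's unbounded scores, and the described loss pairs a positive term (for valid translations) with a scaled negative term (for invalid ones). The following is a minimal, hypothetical PyTorch sketch of those two pieces only; the names `UnnormalizedLogitsModule` and `ambiguity_loss`, the sigmoid choice, and the binary cross-entropy-style formulation are assumptions for illustration, not the application's actual implementation.

```python
# Hypothetical sketch of the logits module and loss described above.
# Sigmoid bounding and the BCE-style loss terms are assumptions.
import torch
import torch.nn as nn

class UnnormalizedLogitsModule(nn.Module):
    """Maps unbounded decoder scores to bounded per-token probabilities.

    An element-wise sigmoid bounds each score to (0, 1) independently,
    so the probabilities are NOT normalized over the vocabulary:
    they need not sum to 1 at each time step, unlike a softmax.
    """
    def forward(self, unbounded_scores: torch.Tensor) -> torch.Tensor:
        # unbounded_scores: (batch, time_steps, vocab_size)
        return torch.sigmoid(unbounded_scores)

def ambiguity_loss(token_probs: torch.Tensor,
                   target_tokens: torch.Tensor,
                   is_valid_translation: torch.Tensor,
                   negative_scale: float = 0.1) -> torch.Tensor:
    """Positive loss for valid translations, scaled negative loss otherwise.

    token_probs:          (batch, time_steps, vocab_size) bounded probabilities
    target_tokens:        (batch, time_steps) token ids of each target text
    is_valid_translation: (batch,) 1.0 if the target is a valid translation
    """
    # Probability assigned to each observed target token.
    p = token_probs.gather(-1, target_tokens.unsqueeze(-1)).squeeze(-1)
    eps = 1e-9
    # Positive component: raise token probabilities of valid translations.
    positive = -torch.log(p + eps).mean(dim=-1)
    # Negative component: lower token probabilities of invalid translations,
    # down-weighted by negative_scale (the "scaled" part of the claim).
    negative = -torch.log(1.0 - p + eps).mean(dim=-1)
    per_example = (is_valid_translation * positive
                   + negative_scale * (1.0 - is_valid_translation) * negative)
    return per_example.mean()

# Usage with random inputs:
probs = UnnormalizedLogitsModule()(torch.randn(2, 5, 100))
loss = ambiguity_loss(probs, torch.randint(0, 100, (2, 5)),
                      torch.tensor([1.0, 0.0]))
```

Under these assumptions, the design rationale is that each token's probability is bounded independently, so several competing translations of an ambiguous source can all receive high probability at the same time step, whereas a softmax distribution would force them to split probability mass.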


Original Abstract Submitted

The technology addresses ambiguity in neural machine translation. An encoder module receives a given text exemplar and generates an encoded representation of it. A decoder module receives the encoded representation and a set of translation prefixes. The decoder module outputs an unbounded function corresponding to a set of tokens associated with each pair of the given text exemplar and translation prefix from the set of translation prefixes. Each token is assigned a probability between 0 and 1 in a vocabulary of the exemplar at each time step. A logits module generates, based on the unbounded function, a corresponding bounded conditional probability for each token, wherein the probabilities are not normalized over the vocabulary at each time step. A loss function module having a positive loss component and a scaled negative loss component identifies whether each target text of a set of target texts is a valid translation of the exemplar.
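
To make the "not normalized over the vocabulary" point concrete, the short snippet below contrasts element-wise sigmoid bounding with a softmax on the same unbounded scores; the scores and the word glosses are invented for exposition and do not come from the application.

```python
import torch

# Two equally plausible translations of an ambiguous word, one implausible one.
scores = torch.tensor([4.0, 4.0, -3.0])        # e.g., "bank", "bench", "cat"
print(torch.sigmoid(scores))                   # ~[0.982, 0.982, 0.047]: both readings score high
print(torch.softmax(scores, dim=0))            # ~[0.500, 0.500, 0.000]: forced to split mass
```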