18403939. ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS simplified abstract (Google LLC)

ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Noam M. Shazeer of Palo Alto, CA (US)

Lukasz Mieczyslaw Kaiser of San Francisco, CA (US)

Etienne Pot of Palo Alto, CA (US)

Mohammad Saleh of Santa Clara, CA (US)

Ben David Goodrich of San Francisco, CA (US)

Peter J. Liu of Santa Clara, CA (US)

Ryan Sepassi of Beverly Hills, CA (US)

ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18403939, titled 'ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS'.

The patent application describes methods, systems, and apparatus for generating an output sequence from an input sequence using a self-attention decoder neural network.

  • The method involves, at each generation time step, building a combined sequence (the input sequence followed by the output tokens generated so far), processing it with the neural network to produce a score distribution over possible output tokens, and selecting the next output token from that distribution (a code sketch follows this list).
  • This approach enables generating complex output sequences from input sequences, improving the accuracy and efficiency of sequence generation tasks.
  • The use of a self-attention decoder neural network enables the model to focus on different parts of the input sequence at each generation time step, enhancing the quality of the generated output.
  • By selecting output tokens based on the score distribution, the system can adaptively generate sequences that are contextually relevant and coherent.
  • The innovation described in the patent application can be applied to various natural language processing tasks, such as machine translation, text summarization, and dialogue generation.
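
The loop below is a minimal Python sketch of this procedure, assuming a hypothetical decoder callable that maps a token sequence to vocabulary scores; the separator token and greedy selection are illustrative choices, not details taken from the patent text:

  from typing import Callable, Sequence

  def generate(
      decoder: Callable[[Sequence[int]], Sequence[float]],  # tokens -> vocab scores
      input_tokens: list[int],
      separator_id: int,
      end_id: int,
      max_steps: int = 64,
  ) -> list[int]:
      """Greedy decoder-only generation over a combined sequence."""
      output_tokens: list[int] = []
      for _ in range(max_steps):
          # Combined sequence: the input followed by the outputs generated so far.
          combined = input_tokens + [separator_id] + output_tokens
          # One decoder pass defines a score distribution over next tokens.
          scores = decoder(combined)
          # Greedy selection here; sampling or beam search are common
          # alternatives also consistent with "selecting, using the time
          # step output, an output token".
          next_token = max(range(len(scores)), key=scores.__getitem__)
          if next_token == end_id:
              break
          output_tokens.append(next_token)
      return output_tokens

  # Toy usage with a stand-in scorer (3-token vocabulary, token 1 = end):
  toy = lambda seq: [0.1, 0.2, 0.7] if len(seq) < 5 else [0.0, 0.9, 0.1]
  print(generate(toy, input_tokens=[0, 2], separator_id=1, end_id=1))  # [2, 2]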

Potential Applications:

  • Machine translation
  • Text summarization
  • Dialogue generation
  • Sentiment analysis

Problems Solved:

  • Improving the accuracy and efficiency of sequence generation tasks
  • Enhancing the quality of generated output sequences
  • Adapting output sequences based on context

Benefits:

  • Increased accuracy in sequence generation
  • Enhanced coherence and relevance of output sequences
  • Improved efficiency in natural language processing tasks

Commercial Applications:

  • Natural language processing software development
  • AI-powered chatbots
  • Automated content generation tools

Questions about the technology:

  1. How does the self-attention decoder neural network improve the generation of output sequences?
  2. What are the potential limitations of this approach in real-world applications?


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
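
For intuition, here is an illustrative sketch (assumed for exposition, not the patent's implementation) of the causal self-attention at the heart of such a decoder: each position attends only to itself and earlier positions, so the time step output computed at the last position of the combined sequence depends on the input and every output token generated so far.

  import numpy as np

  def causal_self_attention(x: np.ndarray) -> np.ndarray:
      """Single-head self-attention with a causal mask; x has shape (T, d)."""
      T, d = x.shape
      scores = x @ x.T / np.sqrt(d)            # (T, T) logits; identity Q/K/V for brevity
      future = np.triu(np.ones((T, T)), k=1)   # 1 marks positions after the query
      scores = np.where(future == 1, -1e9, scores)
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
      return weights @ x                       # (T, d) attended representations

  x = np.random.randn(6, 8)     # a combined sequence of 6 token vectors
  h = causal_self_attention(x)  # h[-1] conditions on the entire sequence

A full model would add learned query, key, and value projections, multiple heads, and a projection of the final position's representation to vocabulary logits before selecting the next output token.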