18403939. ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS simplified abstract (Google LLC)

ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS

Organization Name

Google LLC

Inventor(s)

Noam M. Shazeer of Palo Alto, CA (US)

Lukasz Mieczyslaw Kaiser of San Francisco, CA (US)

Etienne Pot of Palo Alto, CA (US)

Mohammad Saleh of Santa Clara, CA (US)

Ben David Goodrich of San Francisco, CA (US)

Peter J. Liu of Santa Clara, CA (US)

Ryan Sepassi of Beverly Hills, CA (US)

ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18403939, titled 'ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS'.

The patent application describes methods, systems, and apparatus for generating an output sequence from an input sequence using a self-attention decoder neural network.

  • The method involves, at each generation time step, building a combined sequence (the input sequence followed by the output tokens generated so far), processing it with the neural network to produce a score distribution over possible output tokens, and selecting the next output token from that distribution (a code sketch follows this list).
  • This approach enables generating complex output sequences from input sequences, improving the accuracy and efficiency of sequence generation tasks.
  • The use of a self-attention decoder neural network enables the model to focus on different parts of the input sequence at each generation time step, enhancing the quality of the generated output.
  • By selecting output tokens based on the score distribution, the system can adaptively generate sequences that are contextually relevant and coherent.
  • The innovation described in the patent application can be applied to various natural language processing tasks, such as machine translation, text summarization, and dialogue generation.
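
The loop below is a minimal Python sketch of this procedure, assuming a hypothetical decoder callable that maps a token sequence to vocabulary scores; the separator token and greedy selection are illustrative choices, not details taken from the patent text:

  from typing import Callable, Sequence

  def generate(
      decoder: Callable[[Sequence[int]], Sequence[float]],  # tokens -> vocab scores
      input_tokens: list[int],
      separator_id: int,
      end_id: int,
      max_steps: int = 64,
  ) -> list[int]:
      """Greedy decoder-only generation over a combined sequence."""
      output_tokens: list[int] = []
      for _ in range(max_steps):
          # Combined sequence: the input followed by the outputs generated so far.
          combined = input_tokens + [separator_id] + output_tokens
          # One decoder pass defines a score distribution over next tokens.
          scores = decoder(combined)
          # Greedy selection here; sampling or beam search are common
          # alternatives also consistent with "selecting, using the time
          # step output, an output token".
          next_token = max(range(len(scores)), key=scores.__getitem__)
          if next_token == end_id:
              break
          output_tokens.append(next_token)
      return output_tokens

  # Toy usage with a stand-in scorer (3-token vocabulary, token 1 = end):
  toy = lambda seq: [0.1, 0.2, 0.7] if len(seq) < 5 else [0.0, 0.9, 0.1]
  print(generate(toy, input_tokens=[0, 2], separator_id=1, end_id=1))  # [2, 2]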

Potential Applications:

  • Machine translation
  • Text summarization
  • Dialogue generation
  • Sentiment analysis

Problems Solved:

  • Improving the accuracy and efficiency of sequence generation tasks
  • Enhancing the quality of generated output sequences
  • Adapting output sequences based on context

Benefits:

  • Increased accuracy in sequence generation
  • Enhanced coherence and relevance of output sequences
  • Improved efficiency in natural language processing tasks

Commercial Applications:

  • Natural language processing software development
  • AI-powered chatbots
  • Automated content generation tools

Questions about the technology:

  1. How does the self-attention decoder neural network improve the generation of output sequences?
  2. What are the potential limitations of this approach in real-world applications?


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
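
For intuition, here is an illustrative sketch (assumed for exposition, not the patent's implementation) of the causal self-attention at the heart of such a decoder: each position attends only to itself and earlier positions, so the time step output computed at the last position of the combined sequence depends on the input and every output token generated so far.

  import numpy as np

  def causal_self_attention(x: np.ndarray) -> np.ndarray:
      """Single-head self-attention with a causal mask; x has shape (T, d)."""
      T, d = x.shape
      scores = x @ x.T / np.sqrt(d)            # (T, T) logits; identity Q/K/V for brevity
      future = np.triu(np.ones((T, T)), k=1)   # 1 marks positions after the query
      scores = np.where(future == 1, -1e9, scores)
      weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
      weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
      return weights @ x                       # (T, d) attended representations

  x = np.random.randn(6, 8)     # a combined sequence of 6 token vectors
  h = causal_self_attention(x)  # h[-1] conditions on the entire sequence

A full model would add learned query, key, and value projections, multiple heads, and a projection of the final position's representation to vocabulary logits before selecting the next output token.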