18403939. ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS simplified abstract (Google LLC)
Contents
ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS
Organization Name
Google LLC
Inventor(s)
Noam M. Shazeer of Palo Alto CA (US)
Lukasz Mieczyslaw Kaiser of San Francisco CA (US)
Etienne Pot of Palo Alto CA (US)
Mohammad Saleh of Santa Clara CA (US)
Ben David Goodrich of San Francisco CA (US)
Peter J. Liu of Santa Clara CA (US)
Ryan Sepassi of Beverly Hills CA (US)
ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS - A simplified explanation of the abstract
This abstract first appeared for US patent application 18403939, titled 'ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS'.
The patent application describes methods, systems, and apparatus for generating an output sequence from an input sequence using a self-attention decoder neural network.
- At each generation time step, the method builds a combined sequence (the input sequence followed by the output tokens generated so far), processes it with the self-attention decoder to produce a score distribution over possible output tokens, and selects the next output token using that distribution.
- This approach allows for the generation of complex output sequences based on input sequences, improving the accuracy and efficiency of sequence generation tasks.
- The use of a self-attention decoder neural network enables the model to focus on different parts of the input sequence at each generation time step, enhancing the quality of the generated output.
- By selecting output tokens based on the score distribution, the system can adaptively generate sequences that are contextually relevant and coherent.
- The innovation described in the patent application can be applied to various natural language processing tasks, such as machine translation, text summarization, and dialogue generation.
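The generation loop described above can be sketched in Python. This is a minimal illustration, not the patented implementation: the `decoder` function below is a toy stub (it scores vocabulary tokens by how often they appear in the combined sequence), standing in for the self-attention decoder neural network, and the vocabulary and token names are invented for the example.

```python
# Illustrative sketch of the decoder-only generation loop.
# `decoder` is a toy stand-in for the self-attention decoder network.
from collections import Counter

VOCAB = ["<eos>", "hello", "world", "foo"]

def decoder(combined_sequence):
    """Stub: return a score distribution over VOCAB for the next token."""
    counts = Counter(combined_sequence)
    scores = [1.0 + counts[tok] for tok in VOCAB]
    total = sum(scores)
    return [s / total for s in scores]

def generate(input_sequence, max_steps=10):
    output_tokens = []
    for _ in range(max_steps):
        # 1. Combined sequence: input followed by tokens generated so far.
        combined = input_sequence + output_tokens
        # 2. Process the combined sequence to get a score distribution.
        distribution = decoder(combined)
        # 3. Select the next output token (greedy: highest-scoring token).
        next_token = VOCAB[max(range(len(VOCAB)), key=distribution.__getitem__)]
        if next_token == "<eos>":
            break
        output_tokens.append(next_token)
    return output_tokens
```

In a real model the `decoder` call would run the self-attention network; greedy selection shown here is one option, and sampling from the distribution is another.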
Potential Applications:
- Machine translation
- Text summarization
- Dialogue generation
- Sentiment analysis
Problems Solved:
- Improving the accuracy and efficiency of sequence generation tasks
- Enhancing the quality of generated output sequences
- Adapting output sequences based on context
Benefits:
- Increased accuracy in sequence generation
- Enhanced coherence and relevance of output sequences
- Improved efficiency in natural language processing tasks
Commercial Applications:
- Natural language processing software development
- AI-powered chatbots
- Automated content generation tools
Questions about the technology:
1. How does the self-attention decoder neural network improve the generation of output sequences?
2. What are the potential limitations of this approach in real-world applications?
Original Abstract Submitted
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. One of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
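A single generation time step from the abstract (combine, score, select) could be sketched as follows. This is a hedged illustration: `score_fn` is a placeholder for the self-attention decoder's time step output (logits over the vocabulary), and the softmax normalization is one common way to turn such scores into a distribution, not something the abstract specifies.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_step(input_sequence, generated_so_far, score_fn, vocab):
    # Combined sequence: input tokens followed by already-generated output tokens.
    combined = input_sequence + generated_so_far
    # Time-step output: scores over the vocabulary, normalized to a distribution.
    distribution = softmax(score_fn(combined))
    # Selection: greedy argmax here; sampling from `distribution` is an alternative.
    best = max(range(len(vocab)), key=distribution.__getitem__)
    return vocab[best], distribution
```

Repeating `decode_step` and appending each selected token to `generated_so_far` yields the full output sequence.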
- Google LLC
Classification Codes:
- G06N3/08
- G06N3/045
- CPC G06N3/08