Google LLC (20240211751). ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS simplified abstract
ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS
Organization Name
Google LLC
Inventor(s)
Noam M. Shazeer of Palo Alto CA (US)
Lukasz Mieczyslaw Kaiser of San Francisco CA (US)
Etienne Pot of Palo Alto CA (US)
Mohammad Saleh of Santa Clara CA (US)
Ben David Goodrich of San Francisco CA (US)
Peter J. Liu of Santa Clara CA (US)
Ryan Sepassi of Beverly Hills CA (US)
ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240211751, titled 'ATTENTION-BASED DECODER-ONLY SEQUENCE TRANSDUCTION NEURAL NETWORKS'.
The abstract of the patent application describes methods, systems, and apparatus for generating an output sequence from an input sequence using a self-attention decoder neural network.
- At each generation time step, a combined sequence is generated that includes the input sequence followed by the output tokens already generated.
- The combined sequence is processed using a self-attention decoder neural network to generate a score distribution over possible output tokens.
- An output token is selected from the set of possible output tokens based on the score distribution to be the next output token in the sequence.
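The three steps above can be sketched as a simple generation loop. This is an illustrative toy, not the patented implementation: the `score_tokens` function is a hypothetical stand-in for the self-attention decoder neural network (here it just copies the input, as in a toy copy task), and greedy selection is only one possible way to choose a token from the score distribution.

```python
VOCAB = ["<eos>", "a", "b", "c"]

def score_tokens(combined, input_len):
    """Stand-in for the self-attention decoder network: maps the
    combined sequence to a score distribution over possible output
    tokens. Toy behavior: copy the input, then emit <eos>."""
    n_generated = len(combined) - input_len
    if n_generated < input_len:
        target = combined[n_generated]
    else:
        target = "<eos>"
    return [1.0 if tok == target else 0.0 for tok in VOCAB]

def generate(input_sequence, max_steps=10):
    output = []
    for _ in range(max_steps):
        # 1. Combined sequence: the input followed by the output
        #    tokens already generated at this generation time step.
        combined = list(input_sequence) + output
        # 2. Process the combined sequence to get a score
        #    distribution over the set of possible output tokens.
        scores = score_tokens(combined, len(input_sequence))
        # 3. Select the next output token using the distribution
        #    (greedy here; sampling or beam search also fit the claim).
        next_token = VOCAB[max(range(len(VOCAB)), key=scores.__getitem__)]
        if next_token == "<eos>":
            break
        output.append(next_token)
    return output

print(generate(["a", "b", "c"]))  # the toy copy task returns ['a', 'b', 'c']
```

Note that, unlike an encoder-decoder architecture, a single decoder attends over both the input sequence and the partial output, which is why the combined sequence is rebuilt at every time step.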
Potential Applications
- Natural language processing
- Machine translation
- Speech recognition

Problems Solved
- Efficient generation of output sequences from input sequences
- Improved accuracy in predicting next output tokens

Benefits
- Enhanced performance in sequence generation tasks
- Increased efficiency in neural network processing

Commercial Applications
- Language translation software
- Chatbots and virtual assistants
- Automated content generation tools

Questions about the technology
1. How does the self-attention decoder neural network improve sequence generation compared to other methods?
2. What are the potential limitations of using this technology in real-world applications?

Frequently Updated Research
- Stay updated on advancements in neural network architectures for sequence generation tasks.
Original Abstract Submitted
methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating an output sequence from an input sequence. one of the methods includes, at each of a plurality of generation time steps: generating a combined sequence for the generation time step that includes the input sequence followed by the output tokens that have already been generated as of the generation time step; processing the combined sequence using a self-attention decoder neural network to generate a time step output that defines a score distribution over a set of possible output tokens; and selecting, using the time step output, an output token from the set of possible output tokens as the next output token in the output sequence.
- Google LLC
- G06N3/08
- G06N3/045
- CPC G06N3/08