18336687. DIMENSIONAL ATTENTION FOR WEIGHT ALLOCATION IN LANGUAGE MODELS OR OTHER MACHINE LEARNING MODELS simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)



Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Suhel Jaber of San Jose, CA (US)

Brendon Christopher Beachy Eby of Chicago, IL (US)

Sai Ajay Modukuri of San Francisco, CA (US)

DIMENSIONAL ATTENTION FOR WEIGHT ALLOCATION IN LANGUAGE MODELS OR OTHER MACHINE LEARNING MODELS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18336687 titled 'DIMENSIONAL ATTENTION FOR WEIGHT ALLOCATION IN LANGUAGE MODELS OR OTHER MACHINE LEARNING MODELS'.

Simplified Explanation

The method processes an input containing multiple tokens using a machine learning model that performs attention over both the dimensions of the tokens and the dimensions of the embedding vectors representing those tokens, so that different dimensions of each token are weighted differently. The attention results are then used to generate an output embedding vector for a query token. The key steps (illustrated by the sketch after this list) are:

  • Obtaining an input containing multiple tokens
  • Processing the input using a machine learning model
  • Performing attention over multiple dimensions of the tokens
  • Performing attention over multiple dimensions of embedding vectors
  • Weighting different dimensions of each token differently
  • Generating an output embedding vector for a query token based on the attention
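The abstract does not spell out the exact computation, but the steps above suggest an attention variant that scores and normalizes each embedding dimension separately, rather than collapsing query-key similarity into a single scalar per token pair. Below is a minimal PyTorch sketch of that reading; the class name, the linear projections, and the element-wise scoring are illustrative assumptions, not details taken from the patent.

```python
import torch
import torch.nn as nn

class DimensionalAttention(nn.Module):
    """Sketch of attention that weights each embedding dimension of each
    token separately, instead of assigning one scalar weight per token pair.
    This is one plausible reading of the abstract, not the patented method."""

    def __init__(self, d_model: int):
        super().__init__()
        # Hypothetical query/key/value projections, as in a standard transformer.
        self.q = nn.Linear(d_model, d_model, bias=False)
        self.k = nn.Linear(d_model, d_model, bias=False)
        self.v = nn.Linear(d_model, d_model, bias=False)
        self.scale = d_model ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (seq_len, d_model) token embeddings.
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Element-wise scores of shape (seq_q, seq_k, d_model). A standard
        # transformer would sum over the last axis here (a dot product);
        # keeping it gives one score per dimension of each key token.
        scores = q.unsqueeze(1) * k.unsqueeze(0) * self.scale
        # Normalize over key positions independently for every dimension,
        # so different dimensions of the same token receive different weights.
        weights = scores.softmax(dim=1)
        # Per-dimension weighted sum of values yields the output embedding
        # vector for each query token.
        return (weights * v.unsqueeze(0)).sum(dim=1)  # (seq_len, d_model)

# Usage: output embedding vectors for a 5-token input with 8-dim embeddings.
attn = DimensionalAttention(d_model=8)
tokens = torch.randn(5, 8)
print(attn(tokens).shape)  # torch.Size([5, 8])
```

Here the softmax over key positions is applied independently for each embedding dimension, which is one way to realize "weighting different dimensions of each token differently"; the patent's actual formulation may combine the token-level and dimension-level attention in another way.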

Potential Applications

  • Natural language processing
  • Sentiment analysis
  • Information retrieval systems

Problems Solved

  • Improving accuracy of machine learning models
  • Enhancing understanding of relationships between tokens
  • Handling complex inputs with multiple dimensions

Benefits

  • Increased efficiency in processing inputs
  • Improved performance in generating output embeddings
  • Enhanced capabilities in analyzing and interpreting data


Original Abstract Submitted

A method includes obtaining an input containing multiple tokens. The method also includes processing the input using a machine learning model. Processing the input includes performing attention over both (i) multiple dimensions of the tokens contained in the input and (ii) multiple dimensions of embedding vectors used to represent the tokens contained in the input so that different dimensions of each of at least some of the tokens are weighted differently. In addition, the method includes generating an output embedding vector for a query token of the multiple tokens based on the attention.