US Patent Application 18044842: Modeling Dependencies with Global Self-Attention Neural Networks (simplified abstract)


Modeling Dependencies with Global Self-Attention Neural Networks

Organization Name

Google LLC


Inventor(s)

Zhuoran Shen of Newcastle, WA (US)

Raviteja Vemulapalli of Seattle, WA (US)

Irwan Bello of San Francisco, CA (US)

Xuhui Jia of Seattle, WA (US)

Ching-Hui Chen of Shoreline, WA (US)

Modeling Dependencies with Global Self-Attention Neural Networks - A simplified explanation of the abstract

This abstract first appeared for US patent application 18044842, titled 'Modeling Dependencies with Global Self-Attention Neural Networks'.

Simplified Explanation

The patent application describes a system for modeling dependencies throughout a neural network using a global self-attention model with a content attention layer and a positional attention layer that operate in parallel.

  • The model takes input data comprising content values and context positions.
  • The content attention layer generates output features for each context position by applying a global attention operation to the content values, independent of the context positions.
  • The positional attention layer generates an attention map for each context position based on the content values of that position and its neighboring positions.
  • The output is determined from the output features produced by the content attention layer and the attention maps produced by the positional attention layer.
  • The model improves efficiency and can be used throughout a deep network; a code sketch of this two-branch design follows this list.
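Below is a minimal PyTorch sketch of the two-branch design described above. The module name GlobalSelfAttention, the tensor shapes, the 7x7 neighborhood, and the simple sum used to combine the two branches are illustrative assumptions, not details taken from the application.

  # Minimal sketch of a global self-attention module, loosely following
  # the abstract: a content attention branch (global, position-independent)
  # and a positional attention branch (local neighborhood) run in parallel.
  # All names and sizes here are assumptions for illustration.
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class GlobalSelfAttention(nn.Module):
      def __init__(self, dim: int, window: int = 7):
          super().__init__()
          self.to_qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
          # Learned relative-position embeddings for the local neighborhood
          # used by the positional attention branch (assumed form).
          self.rel_pos = nn.Parameter(torch.randn(window * window, dim) * 0.02)
          self.window = window

      def forward(self, x: torch.Tensor) -> torch.Tensor:
          # x: (batch, dim, height, width) feature map of "content values".
          b, d, h, w = x.shape
          q, k, v = self.to_qkv(x).chunk(3, dim=1)
          q = q.flatten(2).transpose(1, 2)       # (b, h*w, d)
          k = k.flatten(2).transpose(1, 2)       # (b, h*w, d)
          v_flat = v.flatten(2).transpose(1, 2)  # (b, h*w, d)

          # Content attention: softmax the keys over all positions, then
          # summarize the whole map into a (d, d) context that every query
          # reads from. Cost is linear in h*w, not quadratic, and the
          # operation ignores the context positions entirely.
          context = k.softmax(dim=1).transpose(1, 2) @ v_flat  # (b, d, d)
          content_out = q @ context                            # (b, h*w, d)

          # Positional attention: each position attends to a window x window
          # neighborhood of content values via relative-position logits.
          pad = self.window // 2
          v_patches = F.unfold(v, self.window, padding=pad)    # (b, d*win*win, h*w)
          v_patches = v_patches.view(b, d, self.window ** 2, h * w)
          logits = q @ self.rel_pos.t()                        # (b, h*w, win*win)
          attn = logits.softmax(dim=-1)                        # attention map per position
          pos_out = torch.einsum('bnk,bdkn->bnd', attn, v_patches)

          # Combine the two branches. A plain sum is an assumption; the
          # application only says output is "determined based on" both.
          out = (content_out + pos_out).transpose(1, 2).reshape(b, d, h, w)
          return out

A short usage example under the same assumptions:

  layer = GlobalSelfAttention(dim=64)
  feats = torch.randn(2, 64, 32, 32)
  print(layer(feats).shape)  # torch.Size([2, 64, 32, 32])

The key design point the sketch tries to capture is efficiency: the content branch reduces global attention to a (d, d) summary, so the layer scales linearly with the number of context positions and can plausibly be stacked throughout a deep network.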


Original Abstract Submitted

The present disclosure provides systems, methods, and computer program products for modeling dependencies throughout a network using a global-self attention model with a content attention layer and a positional attention layer that operate in parallel. The model receives input data comprising content values and context positions. The content attention layer generates one or more output features for each context position based on a global attention operation applied to the content values independent of the context positions. The positional attention layer generates an attention map for each of the context positions based on one or more content values of the respective context position and associated neighboring positions. Output is determined based on the output features generated by the content attention layer and the attention map generated for each context position by the positional attention layer. The model improves efficiency and can be used throughout a deep network.