17720681. METHOD AND APPARATUS WITH SELF-ATTENTION-BASED IMAGE RECOGNITION simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)
METHOD AND APPARATUS WITH SELF-ATTENTION-BASED IMAGE RECOGNITION
Organization Name
SAMSUNG ELECTRONICS CO., LTD.
Inventor(s)
Seohyung Lee of Yongin-si (KR)
METHOD AND APPARATUS WITH SELF-ATTENTION-BASED IMAGE RECOGNITION - A simplified explanation of the abstract
This abstract first appeared for US patent application 17720681 titled 'METHOD AND APPARATUS WITH SELF-ATTENTION-BASED IMAGE RECOGNITION'.
Simplified Explanation
The patent application describes a self-attention method that operates on a three-dimensional (3D) feature map. The key steps are:
- The method starts by obtaining a 3D feature map.
- It then generates 3D query data and 3D key data by performing a convolution operation on the 3D feature map.
- Next, it generates two-dimensional (2D) vertical data by vertically projecting the 3D query data and the 3D key data.
- It also generates 2D horizontal data by horizontally projecting the 3D query data and the 3D key data.
- An intermediate attention result is determined through a multiplication based on the 2D vertical data and the 2D horizontal data.
- Finally, a final attention result is determined through a multiplication based on the intermediate attention result and the 3D feature map.
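The steps above can be sketched in NumPy. This is only one possible reading of the abstract, not the implementation claimed in the application. The function name `projected_self_attention`, the weight matrices `w_query` and `w_key`, the use of a 1x1 convolution, mean-pooling as the projection, and the softmax scaling are all assumptions introduced for illustration; the abstract does not specify any of them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def projected_self_attention(feature_map, w_query, w_key):
    """Hypothetical sketch of the six steps listed above.

    feature_map: (C, H, W) array, the 3D feature map.
    w_query, w_key: (D, C) arrays, weights of an assumed 1x1 convolution.
    """
    # Step 2: 3D query and key data via a 1x1 convolution (assumed kernel size).
    query = np.einsum("dc,chw->dhw", w_query, feature_map)
    key = np.einsum("dc,chw->dhw", w_key, feature_map)
    d = query.shape[0]

    # Step 3: "vertical" 2D data. Assumption: average over the width axis,
    # then score every row against every other row.
    q_v, k_v = query.mean(axis=2), key.mean(axis=2)      # each (D, H)
    vertical = softmax(q_v.T @ k_v / np.sqrt(d))         # (H, H)

    # Step 4: "horizontal" 2D data. Assumption: average over the height axis.
    q_h, k_h = query.mean(axis=1), key.mean(axis=1)      # each (D, W)
    horizontal = softmax(q_h.T @ k_h / np.sqrt(d))       # (W, W)

    # Step 5: intermediate attention result as the product of the two maps.
    intermediate = np.einsum("ik,jl->ijkl", vertical, horizontal)  # (H, W, H, W)

    # Step 6: final attention result by multiplying with the feature map.
    final = np.einsum("ijkl,ckl->cij", intermediate, feature_map)  # (C, H, W)
    return intermediate, final


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 5, 7))    # C=8, H=5, W=7
wq = rng.standard_normal((4, 8))      # D=4
wk = rng.standard_normal((4, 8))
inter, out = projected_self_attention(x, wq, wk)
print(inter.shape, out.shape)
```

In this sketch the output has the same shape as the input feature map, and the attention weights for each output position sum to one. Materializing the full `intermediate` tensor is done here only to mirror the wording of the steps; a practical version would more likely apply the two 2D maps one after the other.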
Potential applications of this technology:
- Image recognition and classification: The self-attention method can be used to enhance the performance of image recognition and classification algorithms by improving the focus on relevant features within the image.
- Natural language processing: Self-attention is also widely used on textual data, although this application is framed around 3D feature maps for image recognition, so any use on text would require adapting the projection steps.
- Video analysis: The self-attention technique can be utilized to analyze and extract meaningful information from video data, enabling applications such as video summarization and object tracking.
Problems solved by this technology:
- Enhanced feature extraction: The self-attention method helps to identify and focus on important features within the input data, improving the accuracy and efficiency of various tasks.
- Handling complex and high-dimensional data: By utilizing 3D feature maps and projections, this method can effectively process and analyze complex and high-dimensional data, such as images and videos.
Benefits of this technology:
- Improved accuracy: The self-attention method allows for better identification and understanding of important features, leading to improved accuracy in various applications.
- Efficient processing: By computing attention on 2D projections rather than on the full 3D query and key data, this method can reduce the computational burden of the attention step.
- Versatility: The self-attention technique can be applied to various domains and types of data, making it a versatile tool for different applications.
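To make the efficiency point concrete, the short sketch below compares how many attention scores would need to be computed in two cases. It assumes, purely for illustration, that conventional self-attention scores every spatial position against every other position, and that the 2D vertical and horizontal data are an H x H and a W x W score matrix respectively. The abstract does not state these sizes, and the function name `attention_score_entries` is hypothetical.

```python
def attention_score_entries(height, width):
    """Count attention scores under the two assumed schemes."""
    positions = height * width
    # Conventional self-attention: one score per pair of spatial positions.
    full = positions * positions
    # Assumed projected scheme: one H x H map plus one W x W map.
    projected = height * height + width * width
    return full, projected


full, projected = attention_score_entries(32, 32)
print(full, projected)  # 1048576 2048
```

Under these assumptions a 32 x 32 feature map needs 2,048 scores instead of 1,048,576, which is the kind of saving the benefit above refers to. The actual saving depends on how the application defines the projections and multiplications.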
Original Abstract Submitted
A method with self-attention includes: obtaining a three-dimensional (3D) feature map; generating 3D query data and 3D key data by performing a convolution operation based on the 3D feature map; generating two-dimensional (2D) vertical data based on a vertical projection of the 3D query data and the 3D key data; generating 2D horizontal data based on a horizontal projection of the 3D query data and the 3D key data; determining an intermediate attention result through a multiplication based on the 2D vertical data and the 2D horizontal data; and determining a final attention result through a multiplication based on the intermediate attention result and the 3D feature map.