17987060. METHOD AND APPARATUS WITH MULTI-MODAL FEATURE FUSION simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

METHOD AND APPARATUS WITH MULTI-MODAL FEATURE FUSION

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Hao Wang of Beijing (CN)

Weiming Li of Beijing (CN)

Qiang Wang of Beijing (CN)

Jiyeon Kim of Suwon-si (KR)

Hyun Sung Chang of Suwon-si (KR)

Sunghoon Hong of Suwon-si (KR)

METHOD AND APPARATUS WITH MULTI-MODAL FEATURE FUSION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17987060 titled 'METHOD AND APPARATUS WITH MULTI-MODAL FEATURE FUSION'.

Simplified Explanation

The patent application describes a method, apparatus, electronic device, and computer-readable storage medium for multi-modal feature fusion.

  • The method generates 3D feature information and 2D feature information from a color image and a depth image.
  • The 3D and 2D feature information is fused using an attention mechanism to produce fused feature information.
  • Image processing is then performed on the fused feature information to generate predicted image information (a rough sketch of these steps follows this list).
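
As a rough illustration of the three steps above, here is a minimal PyTorch sketch, not the patented implementation: it assumes a small CNN encoder per modality (with the depth branch standing in for the 3D feature path), a single cross-attention layer as the fusion mechanism, and a 1x1 convolution as the image-processing head. All module names and layer choices are hypothetical.

```python
# Hypothetical sketch of multi-modal feature fusion with attention.
import torch
import torch.nn as nn


class MultiModalFusion(nn.Module):
    def __init__(self, dim=64, num_heads=4, out_channels=1):
        super().__init__()
        # 2D branch: features extracted from the RGB color image.
        self.color_encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        # "3D" branch: features derived from the depth image
        # (assumption: the depth map stands in for 3D geometry here).
        self.depth_encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        # Attention mechanism for fusion: 2D features attend to 3D features.
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Image-processing head that maps fused features to a prediction.
        self.head = nn.Conv2d(dim, out_channels, 1)

    def forward(self, color, depth):
        f2d = self.color_encoder(color)          # (B, C, H, W)
        f3d = self.depth_encoder(depth)          # (B, C, H, W)
        b, c, h, w = f2d.shape
        # Flatten spatial dims into token sequences for attention.
        q = f2d.flatten(2).transpose(1, 2)       # (B, H*W, C)
        kv = f3d.flatten(2).transpose(1, 2)      # (B, H*W, C)
        fused, _ = self.attn(q, kv, kv)          # cross-attention fusion
        fused = fused.transpose(1, 2).reshape(b, c, h, w)
        return self.head(fused)                  # predicted image information


# Usage with dummy inputs: a 3-channel color image and a 1-channel depth image.
model = MultiModalFusion()
color = torch.randn(1, 3, 32, 32)
depth = torch.randn(1, 1, 32, 32)
pred = model(color, depth)
print(pred.shape)  # torch.Size([1, 1, 32, 32])
```

Using the 2D features as queries and the 3D features as keys and values is only one plausible reading of "fusing based on an attention mechanism"; the application does not specify the attention direction or architecture.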

Potential Applications

This technology has potential applications in various fields, including:

  • Computer vision and image processing
  • Augmented reality and virtual reality systems
  • Robotics and autonomous systems
  • Medical imaging and diagnostics
  • Surveillance and security systems

Problems Solved

The technology addresses the following problems:

  • Integrating 3D and 2D feature information to improve image processing and analysis.
  • Fusing multi-modal features efficiently to enhance the accuracy and reliability of predictions.
  • Overcoming the limitations of traditional methods that rely solely on either 3D or 2D information.

Benefits

The technology offers several benefits:

  • Improved image processing and analysis by leveraging both 3D and 2D feature information.
  • Enhanced accuracy and reliability of predictions through the fusion of multi-modal features.
  • Increased efficiency in generating predicted image information.
  • Potential for applications in various fields, including computer vision, augmented reality, robotics, and medical imaging.


Original Abstract Submitted

A method, apparatus, electronic device, and non-transitory computer-readable storage medium with multi-modal feature fusion are provided. The method includes generating three-dimensional (3D) feature information and two-dimensional (2D) feature information based on a color image and a depth image, generating fused feature information by fusing the 3D feature information and the 2D feature information based on an attention mechanism, and generating predicted image information by performing image processing based on the fused feature information.