Samsung Electronics Co., Ltd. (20240346673). VIDEO DEPTH ESTIMATION BASED ON TEMPORAL ATTENTION simplified abstract


VIDEO DEPTH ESTIMATION BASED ON TEMPORAL ATTENTION

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Haoyu Ren of San Diego CA (US)

Mostafa El-khamy of San Diego CA (US)

Jungwon Lee of San Diego CA (US)

VIDEO DEPTH ESTIMATION BASED ON TEMPORAL ATTENTION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240346673 titled 'VIDEO DEPTH ESTIMATION BASED ON TEMPORAL ATTENTION'.

Simplified Explanation

This patent application describes a method for detecting depth based on multiple video frames captured at different times. The method involves convolving input frames to generate feature maps, calculating a temporal attention map based on these feature maps, and applying the attention map to generate a final feature map with temporal attention.
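For readers who prefer code, the following is a minimal sketch of that pipeline in PyTorch. The shared convolutional encoder, cosine-similarity attention weights, and softmax fusion are illustrative assumptions, not details taken from the patent application.

```python
# A minimal sketch of the described pipeline, assuming a PyTorch-style model.
# The encoder, similarity measure, and fusion rule are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalAttentionDepthFeatures(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        # Shared encoder applied to each input frame (RGB -> feature map).
        self.encoder = nn.Conv2d(3, channels, kernel_size=3, padding=1)

    def forward(self, frames):
        # frames: list of three tensors [B, 3, H, W] captured at different times.
        feats = [self.encoder(f) for f in frames]            # three feature maps
        ref = feats[1]                                        # e.g. the middle frame
        # Per-pixel similarity between the reference features and each frame's
        # features (one weight map per compared pair of feature maps).
        weights = torch.stack(
            [F.cosine_similarity(ref, f, dim=1, eps=1e-6) for f in feats], dim=1
        )                                                     # [B, 3, H, W]
        attn = torch.softmax(weights, dim=1)                  # temporal attention map
        stacked = torch.stack(feats, dim=1)                   # [B, 3, C, H, W]
        # Weighted combination = feature map with temporal attention.
        return (attn.unsqueeze(2) * stacked).sum(dim=1)       # [B, C, H, W]

# Usage: fused = TemporalAttentionDepthFeatures()([frame_t0, frame_t1, frame_t2])
```

A depth-prediction head (not shown) would then regress depth from the fused feature map.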

Key Features and Innovation

- Utilizes multiple video frames captured at different times for depth detection
- Generates feature maps through convolution of the input frames
- Calculates a temporal attention map to highlight important features across frames
- Applies the attention map to enhance the final feature map with temporal information

Potential Applications

- Depth sensing in augmented reality applications
- Object detection and tracking in surveillance systems
- 3D reconstruction in computer vision tasks

Problems Solved

- Enhances depth detection accuracy by considering temporal information
- Improves object recognition in dynamic scenes
- Enables more robust depth estimation in varying lighting conditions

Benefits

- Enhanced depth perception in real-time applications
- Improved object tracking and recognition accuracy
- Increased reliability of depth sensing systems

Commercial Applications

This technology can be applied in industries such as augmented reality, surveillance, and computer vision for enhanced depth sensing capabilities. It can improve the accuracy and reliability of depth detection systems, leading to better performance in various applications.

Prior Art

Prior research in depth detection methods using convolutional neural networks and temporal information can be relevant to this technology. Researchers have explored similar approaches to improve depth estimation accuracy in dynamic scenes.

Frequently Updated Research

Ongoing research on depth estimation algorithms, temporal attention mechanisms, and convolutional neural networks can provide valuable insight into how this technology may evolve; following the latest developments in these areas can help improve the performance of depth sensing systems.

Questions about Depth Detection Technology

1. How does this method compare to traditional depth detection techniques? This method improves depth detection accuracy by considering temporal information from multiple frames, leading to more robust results compared to traditional techniques.

2. What are the potential challenges in implementing this technology in real-time applications? Implementing this technology in real-time applications may require efficient computational resources and optimized algorithms to process multiple frames quickly and accurately.


Original Abstract Submitted

a method of depth detection based on a plurality of video frames includes receiving a plurality of input frames including a first input frame, a second input frame, and a third input frame respectively corresponding to different capture times, convolving the first to third input frames to generate a first feature map, a second feature map, and a third feature map corresponding to the different capture times, calculating a temporal attention map based on the first to third feature maps, the temporal attention map including a plurality of weights corresponding to different pairs of feature maps from among the first to third feature maps, each weight of the plurality of weights indicating a similarity level of a corresponding pair of feature maps, and applying the temporal attention map to the first to third feature maps to generate a feature map with temporal attention.
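As a rough illustration of the claim language above, the sketch below computes one similarity weight per pair of feature maps and then uses those weights to blend the per-frame features. The cosine-similarity measure, the per-frame aggregation, and the softmax fusion are assumptions made for illustration; the abstract does not specify how the weights are computed or applied.

```python
# Illustrative reading of the claim: one weight per pair of feature maps,
# each weight indicating how similar that pair is. Not the filing's code.
import itertools
import torch
import torch.nn.functional as F

def temporal_attention_map(feature_maps):
    """feature_maps: list of tensors [C, H, W], one per capture time.
    Returns {(i, j): weight} with one similarity weight per pair of feature maps."""
    flat = [f.flatten() for f in feature_maps]
    return {
        (i, j): F.cosine_similarity(flat[i], flat[j], dim=0).item()
        for i, j in itertools.combinations(range(len(feature_maps)), 2)
    }

def apply_temporal_attention(feature_maps, pair_weights):
    # Aggregate each frame's pair weights into a per-frame weight, normalize
    # with softmax, and blend the feature maps (an assumed fusion rule).
    per_frame = torch.tensor([
        sum(w for pair, w in pair_weights.items() if i in pair)
        for i in range(len(feature_maps))
    ])
    alpha = torch.softmax(per_frame, dim=0)
    return sum(a * f for a, f in zip(alpha, feature_maps))

# Example with random tensors standing in for the convolved first to third frames:
feats = [torch.randn(32, 64, 64) for _ in range(3)]
weights = temporal_attention_map(feats)            # pairwise similarity weights
fused = apply_temporal_attention(feats, weights)   # feature map with temporal attention
```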