18603946. MEMORY-GUIDED VIDEO OBJECT DETECTION simplified abstract (Google LLC)
MEMORY-GUIDED VIDEO OBJECT DETECTION
Organization Name
Google LLC
Inventor(s)
Dmitry Kalenichenko of Los Angeles CA (US)
Menglong Zhu of Playa Vista CA (US)
Marie Charisse White of Mountain View CA (US)
Yinxiao Li of Sunnyvale CA (US)
MEMORY-GUIDED VIDEO OBJECT DETECTION - A simplified explanation of the abstract
This abstract first appeared for US patent application 18603946, titled 'MEMORY-GUIDED VIDEO OBJECT DETECTION'.
The abstract describes systems and methods for detecting objects in a video using an interleaved object detection model with feature extractor networks and a shared memory layer.
- Input a video with multiple frames into the object detection model.
- Select a feature extractor network to analyze each frame.
- Analyze frames to determine features.
- Update the features using features from a previous frame that are stored in the shared memory layer.
- Detect objects in frames based on updated features.
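The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method claimed in the application: the two toy extractors, the `SharedMemory` class, the equal-weight averaging update, the round-robin selection, and the threshold-based `detect` head are all hypothetical stand-ins chosen to keep the example self-contained. The abstract itself does not specify any of these details.

```python
from typing import Callable, List, Optional, Sequence

Frame = List[float]      # stand-in for an image: a flat list of pixel values
Features = List[float]   # stand-in for a feature map


def extractor_a(frame: Frame) -> Features:
    """Hypothetical extractor: summarises the whole frame as [mean, max]."""
    return [sum(frame) / len(frame), max(frame)]


def extractor_b(frame: Frame) -> Features:
    """Hypothetical second extractor: [mean, max] over the first half of the frame."""
    half = frame[: max(1, len(frame) // 2)]
    return [sum(half) / len(half), max(half)]


class SharedMemory:
    """Holds features carried over from previously processed frames."""

    def __init__(self) -> None:
        self.state: Optional[Features] = None

    def update(self, features: Features) -> Features:
        # Assumption: blend stored and new features with an equal-weight average.
        if self.state is None:
            self.state = list(features)
        else:
            self.state = [0.5 * s + 0.5 * f for s, f in zip(self.state, features)]
        return self.state


def detect(features: Features, threshold: float = 0.5) -> bool:
    """Toy detection head: report an object when the peak feature exceeds the threshold."""
    return features[1] > threshold


def run_interleaved(
    frames: Sequence[Frame],
    extractors: Sequence[Callable[[Frame], Features]],
) -> List[bool]:
    memory = SharedMemory()
    results = []
    for index, frame in enumerate(frames):
        extractor = extractors[index % len(extractors)]  # select (round-robin assumption)
        features = extractor(frame)                      # analyze the frame
        updated = memory.update(features)                # fuse with stored features
        results.append(detect(updated))                  # detect from updated features
    return results


if __name__ == "__main__":
    dark, bright = [0.0] * 4, [1.0] * 4
    print(run_interleaved([dark, dark, bright, bright], [extractor_a, extractor_b]))
```

Because every extractor reads from and writes to the same memory object, each detection depends on earlier frames as well as the current one, whichever extractor happened to run.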
Key Features and Innovation:
- Utilizes multiple feature extractor networks for object detection.
- Incorporates shared memory layer for storing and updating features.
- Enables efficient and accurate object detection in videos.
Potential Applications:
- Video surveillance systems.
- Autonomous vehicles.
- Augmented reality applications.
Problems Solved:
- Enhances object detection accuracy in videos.
- Improves efficiency of analyzing video frames.
- Enables real-time object detection in videos.
Benefits:
- Increased accuracy in detecting objects.
- Faster processing of video frames.
- Enhanced performance in various applications.
Commercial Applications:
Object detection technology can be utilized in various industries such as security, transportation, and entertainment for improved object recognition and tracking capabilities.
Questions about Object Detection Technology:
1. How does the shared memory layer improve the efficiency of object detection in videos?
2. What are the potential limitations of using multiple feature extractor networks for analyzing video frames?
Original Abstract Submitted
Systems and methods for detecting objects in a video are provided. A method can include inputting a video comprising a plurality of frames into an interleaved object detection model comprising a plurality of feature extractor networks and a shared memory layer. For each of one or more frames, the operations can include selecting one of the plurality of feature extractor networks to analyze the one or more frames, analyzing the one or more frames by the selected feature extractor network to determine one or more features of the one or more frames, determining an updated set of features based at least in part on the one or more features and one or more previously extracted features extracted from a previous frame stored in the shared memory layer, and detecting an object in the one or more frames based at least in part on the updated set of features.
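The abstract says that one of the feature extractor networks is selected for each frame but does not say how. As one possible illustration, the sketch below uses a fixed-interval policy for a two-extractor model. The function name `select_extractor` and the `interval` parameter are hypothetical, and the policy itself is an assumption rather than something stated in the application.

```python
def select_extractor(frame_index: int, interval: int) -> int:
    """Return the index (0 or 1) of the extractor to run on this frame.

    Assumed policy: extractor 0 runs on every `interval`-th frame,
    and extractor 1 runs on all frames in between.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return 0 if frame_index % interval == 0 else 1


if __name__ == "__main__":
    # Schedule for the first six frames with interval = 3.
    print([select_extractor(i, 3) for i in range(6)])
```

Any other policy with the same signature could be substituted, since the rest of the pipeline only needs to know which extractor to call for the current frame.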