18243555. LANDMARK DETECTION WITH AN ITERATIVE NEURAL NETWORK simplified abstract (NVIDIA Corporation)

LANDMARK DETECTION WITH AN ITERATIVE NEURAL NETWORK

Organization Name

NVIDIA Corporation

Inventor(s)

Pavlo Molchanov of Mountain View CA (US)

Jan Kautz of Lexington MA (US)

Arash Vahdat of San Mateo CA (US)

Hongxu Yin of San Jose CA (US)

Paul Micaelli of Edinburgh (GB)

LANDMARK DETECTION WITH AN ITERATIVE NEURAL NETWORK - A simplified explanation of the abstract

This abstract first appeared for US patent application 18243555, titled 'LANDMARK DETECTION WITH AN ITERATIVE NEURAL NETWORK'.

Simplified Explanation

Landmark detection is the detection of landmarks within an image or a video. It is used in many computer vision tasks, such as emotion recognition, face identity verification, hand tracking, gesture recognition, and eye gaze tracking. Current landmark detection methods rely on cascaded computation, either through cascaded networks or through an ensemble of multiple models. They start with an initial guess of the landmarks and iteratively produce corrected landmarks that match the input more closely. However, the iterations these methods require typically increase the training memory cost linearly, and the methods have no obvious stopping criterion. They also tend to produce jittery landmark detection results on video. The present disclosure improves on current methods by performing landmark detection with an iterative neural network. When detecting landmarks in video, it also reduces jitter by reusing hidden states from previous frames.

  • The innovation in this patent application is the use of an iterative neural network for landmark detection, in place of the cascaded networks or model ensembles that current methods rely on.
  • For video, the technology reduces jitter in the detected landmarks by reusing hidden states from previous frames, which yields more stable landmark tracking.
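
The abstract does not describe the network itself, so the following is only a minimal sketch of the general idea: one shared update step applied repeatedly, a hidden state carried from frame to frame, and a simple stopping rule. Everything in it is an assumption made for illustration, including the function names (init_params, refine, track_video), the random untrained weights, the tanh update, and the tolerance-based stopping test. It is not the method claimed in the application.

```python
import numpy as np

def init_params(rng, feat_dim, n_landmarks, hidden_dim, scale=0.1):
    """Random weights for the toy model (a trained model would learn these)."""
    return {
        "W_f": rng.normal(0.0, scale, (feat_dim, hidden_dim)),
        "W_l": rng.normal(0.0, scale, (2 * n_landmarks, hidden_dim)),
        "W_h": rng.normal(0.0, scale, (hidden_dim, hidden_dim)),
        "W_out": rng.normal(0.0, scale, (hidden_dim, 2 * n_landmarks)),
    }

def refine(features, landmarks, hidden, params, max_iters=20, tol=1e-3):
    """Refine one frame's landmarks by repeating a single shared update step.

    Returns the refined landmarks, the final hidden state, and the number
    of iterations actually run.
    """
    steps = 0
    for steps in range(1, max_iters + 1):
        # The same weights are reused at every iteration (no cascade of
        # separate networks).
        pre = (features @ params["W_f"]
               + landmarks.ravel() @ params["W_l"]
               + hidden @ params["W_h"])
        hidden = np.tanh(pre)
        delta = (hidden @ params["W_out"]).reshape(landmarks.shape)
        landmarks = landmarks + delta
        # Hypothetical stopping rule: stop once the correction is small.
        if np.linalg.norm(delta) < tol:
            break
    return landmarks, hidden, steps

def track_video(frame_features, init_landmarks, params, hidden_dim):
    """Run refine() on each frame, reusing the previous frame's hidden state."""
    hidden = np.zeros(hidden_dim)
    results = []
    for features in frame_features:
        # Each frame restarts from the initial guess (an assumption); only
        # the hidden state is carried over from the previous frame.
        landmarks, hidden, _ = refine(features, init_landmarks, hidden, params)
        results.append(landmarks)
    return np.stack(results)
```

With untrained random weights the output is meaningless as a detection result; the sketch only shows where the shared step, the stopping test, and the carried-over hidden state would sit in such a loop.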

Potential Applications

The technology can be applied in various fields such as:

  • Emotion recognition
  • Face identity verification
  • Hand tracking
  • Gesture recognition
  • Eye gaze tracking

Problems Solved

The technology addresses the following issues:

  • Training memory cost that grows linearly with the number of iterations in current landmark detection methods
  • Lack of an obvious stopping criterion in current methods
  • Jitter in landmark detection results for video
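
The abstract does not say how jitter is measured. As a rough illustration of the last item, one common-sense proxy is the mean frame-to-frame displacement of each landmark. The function name mean_jitter and the choice of metric are assumptions, not taken from the application. For a moving subject this number also includes real motion, so it is most informative on near-static footage.

```python
import numpy as np

def mean_jitter(tracks):
    """Mean Euclidean displacement of landmarks between consecutive frames.

    tracks: array-like of shape (frames, landmarks, 2) holding (x, y)
    positions. Returns 0.0 when fewer than two frames are given.
    """
    tracks = np.asarray(tracks, dtype=float)
    if tracks.shape[0] < 2:
        return 0.0
    step = np.diff(tracks, axis=0)  # shape (frames - 1, landmarks, 2)
    return float(np.linalg.norm(step, axis=-1).mean())
```

A perfectly still set of landmarks gives 0.0, and larger values indicate more frame-to-frame movement in the detections.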

Benefits

The benefits of this technology include:

  • Improved landmark detection relative to cascaded-network and ensemble methods
  • Reduction in jitter for video applications
  • More efficient memory usage during training

Potential Commercial Applications

The technology can be commercially applied in industries such as:

  • Healthcare for patient monitoring
  • Security for facial recognition systems
  • Entertainment for gesture-based interfaces

Possible Prior Art

Possible prior art includes deep learning models for facial landmark detection, which have been widely studied in recent years.

Unanswered Questions

How does the iterative neural network compare to traditional methods in terms of computational efficiency?

The abstract does not directly compare the iterative neural network with traditional methods in terms of computational efficiency. It is unclear whether the new approach is more computationally efficient or whether it involves trade-offs in speed or resource usage.

What are the potential limitations or challenges of implementing this technology in real-world applications?

The abstract does not discuss limitations or challenges of implementing this technology in real-world applications. Any constraints or obstacles that arise when deploying it in practical settings remain to be described.


Original Abstract Submitted

Landmark detection refers to the detection of landmarks within an image or a video, and is used in many computer vision tasks such emotion recognition, face identity verification, hand tracking, gesture recognition, and eye gaze tracking. Current landmark detection methods rely on a cascaded computation through cascaded networks or an ensemble of multiple models, which starts with an initial guess of the landmarks and iteratively produces corrected landmarks which match the input more finely. However, the iterations required by current methods typically increase the training memory cost linearly, and do not have an obvious stopping criteria. Moreover, these methods tend to exhibit jitter in landmark detection results for video. The present disclosure improves current landmark detection methods by providing landmark detection using an iterative neural network. Furthermore, when detecting landmarks in video, the present disclosure provides for a reduction in jitter due to reuse of previous hidden states from previous frames.