18464912. Pose Empowered RGB-Flow Net simplified abstract (GOOGLE LLC)

From WikiPatents

Pose Empowered RGB-Flow Net

Organization Name

GOOGLE LLC

Inventor(s)

Yinxiao Li of Mountain View CA (US)

Zhichao Lu of Mountain View CA (US)

Xuehan Xiong of Mountain View CA (US)

Jonathan Huang of San Carlos CA (US)

Pose Empowered RGB-Flow Net - A simplified explanation of the abstract

This abstract first appeared for US patent application 18464912, titled 'Pose Empowered RGB-Flow Net'.

Simplified Explanation

The patent application describes a method for analyzing video data of an actor performing an activity using neural networks.

  • The video data is processed to generate three input streams: a spatial stream of images representing spatial features of the actor, a temporal stream representing the actor's motion, and a pose stream of images representing the actor's pose.
  • These input streams are then processed by at least one neural network.
  • The neural network classifies the activity based on the information from the spatial, temporal, and pose input streams.
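
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method claimed in the application: frame differencing stands in for the motion stream, the pose images are rendered from hypothetical joint coordinates, the per-stream "network" is a toy scoring function, and averaging the per-stream scores is only one possible way to combine them. All function names and parameters are made up for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]  # one grayscale image as rows of pixel values


def temporal_stream(frames: Sequence[Image]) -> List[Image]:
    """Assumed stand-in for motion: absolute difference of consecutive frames."""
    return [
        [[abs(b - a) for a, b in zip(row_prev, row_next)]
         for row_prev, row_next in zip(prev, nxt)]
        for prev, nxt in zip(frames, frames[1:])
    ]


def pose_stream(keypoints_per_frame: Sequence[Sequence[Tuple[int, int]]],
                height: int, width: int) -> List[Image]:
    """Render hypothetical (row, col) joint coordinates as binary pose images."""
    images = []
    for keypoints in keypoints_per_frame:
        img = [[0.0] * width for _ in range(height)]
        for r, c in keypoints:
            img[r][c] = 1.0
        images.append(img)
    return images


def softmax(logits: Sequence[float]) -> List[float]:
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stream_scores(images: Sequence[Image], weights: Sequence[float]) -> List[float]:
    """Toy placeholder for a neural network: mean pixel value times a
    per-class weight, passed through softmax."""
    count = sum(len(row) for img in images for row in img)
    mean = sum(v for img in images for row in img for v in row) / count
    return softmax([w * mean for w in weights])


def classify_activity(frames: Sequence[Image],
                      keypoints_per_frame: Sequence[Sequence[Tuple[int, int]]],
                      labels: Sequence[str],
                      weights: Dict[str, Sequence[float]]) -> Tuple[str, List[float]]:
    """Build the three streams, score each one, and average the scores."""
    height, width = len(frames[0]), len(frames[0][0])
    streams = {
        "spatial": list(frames),
        "temporal": temporal_stream(frames),
        "pose": pose_stream(keypoints_per_frame, height, width),
    }
    per_stream = [stream_scores(streams[name], weights[name]) for name in streams]
    fused = [sum(col) / len(per_stream) for col in zip(*per_stream)]
    return labels[fused.index(max(fused))], fused
```

In a real system each call to `stream_scores` would be replaced by a trained network, and the streams could just as well be combined inside a single network rather than by averaging; the abstract only requires "at least one neural network".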

Potential applications of this technology:

  • Video surveillance systems that can automatically detect and classify activities performed by individuals.
  • Sports analysis tools that can analyze the movements and poses of athletes to provide insights and feedback.
  • Virtual reality and augmented reality applications that can track and analyze the movements of users for interactive experiences.

Problems solved by this technology:

  • Automating the analysis of video data to classify activities, reducing the need for manual review and analysis.
  • Providing a more comprehensive understanding of an actor's activity by considering spatial, temporal, and pose information together.
  • Supporting timely classification of activities, which could allow prompt response or feedback where processing speed permits.

Benefits of this technology:

  • Potentially more accurate activity classification than approaches that rely on appearance or motion alone, because pose provides an additional cue.
  • Ability to analyze complex activities by combining spatial, temporal, and pose information.
  • Potential for a wide range of applications in various industries, including security, sports, and entertainment.


Original Abstract Submitted

A method includes receiving video data that includes a series of frames of image data. Here, the video data is representative of an actor performing an activity. The method also includes processing the video data to generate a spatial input stream including a series of spatial images representative of spatial features of the actor performing the activity, a temporal input stream representative of motion of the actor performing the activity, and a pose input stream including a series of images representative of a pose of the actor performing the activity. Using at least one neural network, the method also includes processing the temporal input stream, the spatial input stream, and the pose input stream. The method also includes classifying, by the at least one neural network, the activity based on the temporal input stream, the spatial input stream, and the pose input stream.