18634794. ACTION LOCALIZATION IN VIDEOS USING LEARNED QUERIES simplified abstract (GOOGLE LLC)

ACTION LOCALIZATION IN VIDEOS USING LEARNED QUERIES

Organization Name

GOOGLE LLC

Inventor(s)

Alexey Alexeevich Gritsenko of Amsterdam (NL)

Xuehan Xiong of Mountain View CA (US)

Josip Djolonga of Zurich (CH)

Mostafa Dehghani of Amsterdam (NL)

Chen Sun of San Francisco CA (US)

Mario Lucic of Adliswil (CH)

Cordelia Luise Schmid of Saint Ismier (FR)

Anurag Arnab of Grenoble (FR)

ACTION LOCALIZATION IN VIDEOS USING LEARNED QUERIES - A simplified explanation of the abstract

This abstract first appeared for US patent application 18634794, titled 'ACTION LOCALIZATION IN VIDEOS USING LEARNED QUERIES'.

The patent application describes methods, systems, and apparatus for performing action localization on an input video. For each agent depicted in the video, the system generates data specifying a bounding box and an action in each of one or more video frames.

  • The system maintains a set of query vectors and uses them, together with the input video, to generate an action localization output.
  • For each agent depicted in the video, the output specifies, for each of one or more video frames, a bounding box that depicts the agent and the action the agent is performing in that frame.
  • Each action is selected from a predefined set of actions.
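
The shape of the action localization output described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the pixel-coordinate box convention, and the `ACTIONS` vocabulary are hypothetical and do not come from the patent application, which says only that each agent gets a bounding box and an action per frame.

```python
from dataclasses import dataclass, field

# Hypothetical action vocabulary; the abstract only says "a set of actions".
ACTIONS = ("stand", "walk", "run", "sit")


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates (an assumed convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class FrameDetection:
    """One agent in one frame: where it is and what it is doing."""
    frame_index: int
    box: BoundingBox
    action: str

    def __post_init__(self):
        # Reject actions outside the fixed set.
        if self.action not in ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")


@dataclass
class AgentTrack:
    """All per-frame detections for a single agent."""
    agent_id: int
    detections: list[FrameDetection] = field(default_factory=list)


def actions_by_frame(track: AgentTrack) -> dict[int, str]:
    """Map each frame index to the action the agent performs in that frame."""
    return {d.frame_index: d.action for d in track.detections}


track = AgentTrack(agent_id=0, detections=[
    FrameDetection(0, BoundingBox(10, 20, 50, 120), "walk"),
    FrameDetection(1, BoundingBox(14, 20, 54, 120), "run"),
])
print(actions_by_frame(track))  # {0: 'walk', 1: 'run'}
```

Keeping one track per agent mirrors the "for each agent, for each frame" wording of the abstract, and it lets the action change from frame to frame while the agent identity stays fixed.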

Potential Applications:

  • Video surveillance systems
  • Sports analysis and coaching
  • Video editing and post-production
  • Virtual reality and augmented reality applications

Problems Solved:

  • Efficiently identifying and tracking actions in videos
  • Automating the process of action localization
  • Enhancing the analysis of video content

Benefits:

  • Improved accuracy in identifying actions in videos
  • Time-saving in manual video analysis tasks
  • Enhanced user experience in interactive video applications

Commercial Applications:

  • Security and surveillance industry
  • Entertainment and media production companies
  • Sports analytics and broadcasting
  • Virtual reality gaming development

Questions about Action Localization:

1. How does action localization in videos benefit security and surveillance systems?

  - Action localization helps security systems identify and track specific actions of interest in surveillance footage, which enhances security monitoring.

2. What are the potential challenges in implementing action localization technology in virtual reality applications?

  - Challenges may include real-time processing requirements, accuracy in action recognition, and integration with existing VR platforms.


Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing action localization on an input video. In particular, a system maintains a set of query vectors and uses the input video and the set of query vectors to generate an action localization output for the input video. The action localization output includes, for each of one or more agents depicted in the video, data specifying, for each of one or more video frames in the video, a respective bounding box in the video frame that depicts the agent and a respective action from a set of actions that is being performed by the agent in the video frame.
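
The abstract does not say how the query vectors and the input video are combined. One plausible reading, assumed here in the style of query-based detectors, is that each query attends over per-frame video features and two small heads turn the result into a box and an action. The sketch below follows that assumption; the function names, the single-head attention, the normalized box coordinates, and the random stand-in weights are all hypothetical and are not taken from the patent application.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [dot(row, vec) for row in matrix]


def attend(query, tokens):
    """Single-head dot-product attention of one query over video tokens."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    dim = len(tokens[0])
    return [sum(w * t[i] for w, t in zip(weights, tokens)) for i in range(dim)]


def localize(queries, frame_tokens, box_head, action_head, actions):
    """For each query (one candidate agent) and each frame, emit a box and an action."""
    output = []
    for agent_id, query in enumerate(queries):
        for frame_index, tokens in enumerate(frame_tokens):
            feature = attend(query, tokens)
            # Sigmoid keeps box values in [0, 1] (assumed normalized coordinates).
            box = [1.0 / (1.0 + math.exp(-v)) for v in matvec(box_head, feature)]
            probs = softmax(matvec(action_head, feature))
            action = actions[probs.index(max(probs))]
            output.append((agent_id, frame_index, box, action))
    return output


# Toy inputs: random numbers stand in for learned queries, heads, and video features.
rng = random.Random(0)
DIM = 4
ACTIONS = ("stand", "walk", "run")  # hypothetical action set


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


queries = rand_matrix(2, DIM)                           # 2 candidate agents
frame_tokens = [rand_matrix(5, DIM) for _ in range(3)]  # 3 frames, 5 tokens each
box_head = rand_matrix(4, DIM)                          # 4 box coordinates
action_head = rand_matrix(len(ACTIONS), DIM)

results = localize(queries, frame_tokens, box_head, action_head, ACTIONS)
print(len(results))  # one entry per (agent, frame) pair
```

In a trained system the queries and heads would be learned parameters and the tokens would come from a video encoder; here they are random, so the boxes and actions carry no meaning beyond showing the data flow.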