Google LLC (20240428586). Systems and Methods for Improved Video Understanding

From WikiPatents

Systems and Methods for Improved Video Understanding

Organization Name

Google LLC

Inventor(s)

Anurag Arnab of Grenoble (FR)

Mostafa Dehghani of Amsterdam (NL)

Georg Heigold of Aachen (DE)

Chen Sun of San Francisco CA (US)

Mario Lucic of Adliswil (CH)

Cordelia Luise Schmid of Saint-Ismier (FR)

This abstract first appeared for US patent application 20240428586, titled 'Systems and Methods for Improved Video Understanding'.

Original Abstract Submitted

A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of spatiotemporal representations from the video data, the plurality of spatiotemporal representations comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of spatiotemporal representations as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
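
Illustrative Sketch

The four steps in the abstract (obtain frames, extract spatiotemporal representations, pass them to a video transformer encoder, receive a classification output) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the method claimed in the application. The abstract does not say how the representations are extracted or how the encoder is built, so the non-overlapping tubelet tokens, the learned classification token, the single-head pre-norm encoder layers, and every name and size here (extract_tubelets, init_params, classify_video, the 2x4x4 tubelet shape, d_model=32, five classes) are hypothetical choices. The weights are random and untrained, so the output only demonstrates the data flow.

```python
import numpy as np

def extract_tubelets(video, t=2, h=4, w=4):
    """Split a (T, H, W, C) video into non-overlapping t x h x w tubelets,
    each flattened to one vector (an assumed form of 'spatiotemporal representation')."""
    T, H, W, C = video.shape
    assert T % t == 0 and H % h == 0 and W % w == 0
    x = video.reshape(T // t, t, H // h, h, W // w, w, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)  # (nt, nh, nw, t, h, w, C)
    return x.reshape(-1, t * h * w * C)

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def encoder_layer(x, p):
    """One pre-norm transformer encoder layer with single-head self-attention (assumed design)."""
    y = layer_norm(x)
    q, k, v = y @ p["wq"], y @ p["wk"], y @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    x = x + (attn @ v) @ p["wo"]
    y = layer_norm(x)
    return x + np.maximum(y @ p["w1"], 0.0) @ p["w2"]  # ReLU feed-forward block

def init_params(rng, token_dim, n_tokens, d_model=32, n_layers=2, n_classes=5):
    """Random, untrained weights; all sizes are hypothetical."""
    w = lambda *shape: rng.normal(0.0, 0.02, size=shape)
    layers = [
        {"wq": w(d_model, d_model), "wk": w(d_model, d_model),
         "wv": w(d_model, d_model), "wo": w(d_model, d_model),
         "w1": w(d_model, 4 * d_model), "w2": w(4 * d_model, d_model)}
        for _ in range(n_layers)
    ]
    return {"embed": w(token_dim, d_model), "cls": w(1, d_model),
            "pos": w(n_tokens + 1, d_model), "layers": layers,
            "head": w(d_model, n_classes)}

def classify_video(video, params):
    """Tubelet tokens -> transformer encoder -> class probabilities."""
    tokens = extract_tubelets(video) @ params["embed"]
    x = np.concatenate([params["cls"], tokens], axis=0) + params["pos"]
    for layer in params["layers"]:
        x = encoder_layer(x, layer)
    logits = layer_norm(x[0]) @ params["head"]  # read out the classification token
    return softmax(logits)

rng = np.random.default_rng(0)
video = rng.random((8, 16, 16, 3))  # 8 synthetic frames of 16x16 RGB
tubelets = extract_tubelets(video)
params = init_params(rng, token_dim=tubelets.shape[1], n_tokens=tubelets.shape[0])
probs = classify_video(video, params)
print(probs.shape, int(probs.argmax()))
```

Because each tubelet spans more than one frame, every token already mixes spatial and temporal content before attention is applied, which is one plausible reading of "spatiotemporal representations" in the abstract.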