Google LLC (20240428586). Systems and Methods for Improved Video Understanding
Organization Name
Google LLC
Inventor(s)
Mostafa Dehghani of Amsterdam (NL)
Chen Sun of San Francisco CA (US)
Cordelia Luise Schmid of Saint-Ismier (FR)
This abstract first appeared for US patent application 20240428586 titled 'Systems and Methods for Improved Video Understanding'.
Original Abstract Submitted
A computer-implemented method for classifying video data with improved accuracy includes obtaining, by a computing system comprising one or more computing devices, video data comprising a plurality of video frames; extracting, by the computing system, a plurality of spatiotemporal representations from the video data, the plurality of spatiotemporal representations comprising a representation of spatiotemporal information in the video data; providing, by the computing system, the plurality of spatiotemporal representations as input to a video understanding model, the video understanding model comprising a video transformer encoder model; and receiving, by the computing system, a classification output from the video understanding model.
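The four steps in the abstract (obtain frames, extract spatiotemporal representations, run a transformer encoder, receive a classification) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the spatiotemporal representations are modeled as non-overlapping "tubelets" (small blocks spanning several frames), the encoder is a tiny single-head, pre-norm transformer with random weights, positional embeddings are omitted, and the names extract_tubelets, encoder_layer, init_params, and classify_video are hypothetical.

```python
import numpy as np

def extract_tubelets(video, t=2, h=4, w=4):
    """Split a (T, H, W, C) video into non-overlapping t x h x w tubelets.

    Each tubelet spans t frames, so every token carries both spatial and
    temporal information. Returns an array of shape (num_tokens, t*h*w*C).
    """
    T, H, W, C = video.shape
    assert T % t == 0 and H % h == 0 and W % w == 0, "dims must divide evenly"
    x = video.reshape(T // t, t, H // h, h, W // w, w, C)
    x = x.transpose(0, 2, 4, 1, 3, 5, 6)  # block indices first, block contents last
    return x.reshape(-1, t * h * w * C)

def layer_norm(x, eps=1e-6):
    mu = x.mean(-1, keepdims=True)
    var = x.var(-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def encoder_layer(x, p):
    """One pre-norm encoder layer: single-head self-attention plus a ReLU MLP."""
    y = layer_norm(x)
    q, k, v = y @ p["wq"], y @ p["wk"], y @ p["wv"]
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (tokens, tokens) weights
    x = x + (attn @ v) @ p["wo"]                    # residual connection
    y = layer_norm(x)
    return x + np.maximum(y @ p["w1"], 0.0) @ p["w2"]

def init_params(rng, token_dim, d_model, num_classes, num_layers=2):
    """Random (untrained) weights; a real model would learn these."""
    def w(*shape):
        return rng.normal(0.0, 0.02, shape)
    layers = [
        {"wq": w(d_model, d_model), "wk": w(d_model, d_model),
         "wv": w(d_model, d_model), "wo": w(d_model, d_model),
         "w1": w(d_model, 4 * d_model), "w2": w(4 * d_model, d_model)}
        for _ in range(num_layers)
    ]
    return {"embed": w(token_dim, d_model), "layers": layers,
            "head": w(d_model, num_classes)}

def classify_video(video, params):
    """Tubelets -> linear embedding -> encoder layers -> mean pool -> class probs."""
    tokens = extract_tubelets(video) @ params["embed"]
    for p in params["layers"]:
        tokens = encoder_layer(tokens, p)
    pooled = layer_norm(tokens).mean(axis=0)
    return softmax(pooled @ params["head"])

# Usage: a synthetic 8-frame, 16x16 RGB clip and 5 hypothetical classes.
rng = np.random.default_rng(0)
video = rng.random((8, 16, 16, 3))
params = init_params(rng, token_dim=2 * 4 * 4 * 3, d_model=32, num_classes=5)
probs = classify_video(video, params)
print(probs.shape, float(probs.sum()))
```

Because the weights are random, the probabilities are meaningless; the sketch only shows how data flows from frames to a classification output. Mean pooling over tokens is one simple design choice; a learned classification token is a common alternative.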