19014029. OPEN-VOCABULARY OBJECT DETECTION IN IMAGES (GOOGLE LLC)

From WikiPatents


OPEN-VOCABULARY OBJECT DETECTION IN IMAGES

Organization Name

GOOGLE LLC

Inventor(s)

Matthias Johannes Lorenz Minderer of Zürich CH

Alexey Alexeevich Gritsenko of Amsterdam NL

Austin Charles Stone of San Francisco CA US

Dirk Weissenborn of Berlin DE

Alexey Dosovitskiy of Berlin DE

Neil Matthew Tinmouth Houlsby of Zürich CH

This abstract first appeared for US patent application 19014029, titled 'OPEN-VOCABULARY OBJECT DETECTION IN IMAGES'.

Original Abstract Submitted

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection. In one aspect, a method comprises: obtaining: (i) an image, and (ii) a set of one or more query embeddings, wherein each query embedding represents a respective category of object; processing the image and the set of query embeddings using an object detection neural network to generate object detection data for the image, comprising: processing the image using an image encoding subnetwork of the object detection neural network to generate a set of object embeddings; processing each object embedding using a localization subnetwork to generate localization data defining a corresponding region of the image; and processing: (i) the set of object embeddings, and (ii) the set of query embeddings, using a classification subnetwork to generate, for each object embedding, a respective classification score distribution over the set of query embeddings.
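The data flow described in the abstract can be sketched as a toy pipeline. This is only an illustration under stated assumptions, not the claimed implementation: the abstract does not say how any subnetwork is built, so the patch-based encoder, the sigmoid box head, the dot-product similarity, and the softmax are all assumed here, random weights stand in for trained parameters, and every function name (encode_image, localize, classify, detect) is hypothetical.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def encode_image(image, patch, weights):
    """Toy image encoding subnetwork: one object embedding per
    non-overlapping patch (the patch scheme is an assumption)."""
    embeddings = []
    for top in range(0, len(image) - patch + 1, patch):
        for left in range(0, len(image[0]) - patch + 1, patch):
            pixels = [image[top + r][left + c]
                      for r in range(patch) for c in range(patch)]
            embeddings.append(matvec(weights, pixels))
    return embeddings


def localize(embedding, weights):
    """Toy localization subnetwork: maps one object embedding to a box
    (cx, cy, w, h), each squashed into (0, 1) by a sigmoid (assumption)."""
    return [1.0 / (1.0 + math.exp(-v)) for v in matvec(weights, embedding)]


def classify(object_embeddings, query_embeddings):
    """Toy classification subnetwork: for each object embedding, a score
    distribution over the queries. Dot-product similarity followed by a
    softmax is an assumption; the abstract does not specify the scoring."""
    return [
        softmax([sum(o * q for o, q in zip(obj, query))
                 for query in query_embeddings])
        for obj in object_embeddings
    ]


def detect(image, query_embeddings, patch=2, dim=4, seed=0):
    """Run the three stages and return (boxes, score distributions)."""
    rng = random.Random(seed)
    encoder_weights = random_matrix(dim, patch * patch, rng)
    box_weights = random_matrix(4, dim, rng)
    objects = encode_image(image, patch, encoder_weights)
    boxes = [localize(obj, box_weights) for obj in objects]
    scores = classify(objects, query_embeddings)
    return boxes, scores


if __name__ == "__main__":
    # A 4x4 grayscale "image" and three query embeddings, one per category.
    image = [[0.1 * (r + c) for c in range(4)] for r in range(4)]
    queries = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
    boxes, scores = detect(image, queries)
    for box, dist in zip(boxes, scores):
        print([round(v, 3) for v in box], [round(p, 3) for p in dist])
```

In this sketch the categories enter only through the query embeddings, so changing the set of queries changes what is scored without touching the encoder or the box head. That matches the abstract's description of a score distribution computed over the supplied queries.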