19014029. OPEN-VOCABULARY OBJECT DETECTION IN IMAGES (GOOGLE LLC)
Organization Name
GOOGLE LLC
Inventor(s)
Matthias Johannes Lorenz Minderer of Zürich CH
Alexey Alexeevich Gritsenko of Amsterdam NL
Austin Charles Stone of San Francisco CA US
Alexey Dosovitskiy of Berlin DE
Neil Matthew Tinmouth Houlsby of Zürich CH
This abstract first appeared for US patent application 19014029, titled 'OPEN-VOCABULARY OBJECT DETECTION IN IMAGES'.
Original Abstract Submitted
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for object detection. In one aspect, a method comprises: obtaining: (i) an image, and (ii) a set of one or more query embeddings, wherein each query embedding represents a respective category of object; processing the image and the set of query embeddings using an object detection neural network to generate object detection data for the image, comprising: processing the image using an image encoding subnetwork of the object detection neural network to generate a set of object embeddings; processing each object embedding using a localization subnetwork to generate localization data defining a corresponding region of the image; and processing: (i) the set of object embeddings, and (ii) the set of query embeddings, using a classification subnetwork to generate, for each object embedding, a respective classification score distribution over the set of query embeddings.
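The classification and localization steps in the abstract can be illustrated with a minimal sketch. It is not the claimed implementation: the abstract does not say how scores are computed or normalized, so the cosine similarity, the softmax, the `temperature` parameter, the linear-plus-sigmoid box head, and all function names below are assumptions made for illustration. The image encoding subnetwork is omitted, and the object embeddings are taken as given.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Vector) -> List[float]:
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(object_embeddings, query_embeddings, temperature=0.1):
    """Hypothetical classification head.

    For each object embedding, returns a score distribution over the
    query embeddings. Cosine similarity, softmax, and `temperature`
    are assumptions, not details taken from the abstract.
    """
    queries = [l2_normalize(q) for q in query_embeddings]
    scores = []
    for obj in object_embeddings:
        obj = l2_normalize(obj)
        logits = [dot(obj, q) / temperature for q in queries]
        scores.append(softmax(logits))
    return scores


def localize(object_embedding, weights, bias):
    """Hypothetical localization head.

    A linear map followed by a sigmoid, giving one value in (0, 1)
    per row of `weights`, e.g. a normalized (cx, cy, w, h) box.
    """
    return [
        1.0 / (1.0 + math.exp(-(dot(row, object_embedding) + b)))
        for row, b in zip(weights, bias)
    ]


# Two object embeddings scored against two query embeddings,
# each query standing for one object category.
object_embeddings = [[1.0, 0.0], [0.0, 2.0]]
query_embeddings = [[1.0, 0.0], [0.0, 1.0]]
scores = classify(object_embeddings, query_embeddings)
```

Because the categories enter only as query embeddings, changing the vocabulary in this sketch means passing a different `query_embeddings` list; nothing in the heads depends on a fixed set of classes.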
- GOOGLE LLC
- Matthias Johannes Lorenz Minderer of Zürich CH
- Alexey Alexeevich Gritsenko of Amsterdam NL
- Austin Charles Stone of San Francisco CA US
- Dirk Weissenborn of Berlin DE
- Alexey Dosovitskiy of Berlin DE
- Neil Matthew Tinmouth Houlsby of Zürich CH
- G06V10/764
- G06F40/40
- G06V10/22
- G06V10/74
- G06V10/774
- G06V10/776
- G06V10/82
- CPC G06V10/764