20250181631. Using Larg (Microsoft Technology Licensing, LLC)
USING LARGE GENERATIVE MODELS WITH IMPROVED GROUNDING TO IMPROVE IMAGE CONTEXT QUERIES
Abstract: The disclosure describes utilizing an image query system to improve response accuracy and reduce the computational steps and resources used in responding to natural language queries of input images. In various implementations, the image query system utilizes grounding information from one or more sources to determine accurate information for an input image. For example, the image query system uses a single comprehensive image prompt to obtain extensive visual image grounding information for the input image from a visual-based large generative model. Additionally, or in alternative implementations, the image query system obtains reverse image search grounding information for the input image. The image query system then utilizes the grounding information with a large generative language model to generate text query responses to image-based queries of the input image more accurately and efficiently.
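Illustrative sketch (not part of the application): the flow described in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation. Every name here (Grounding, COMPREHENSIVE_IMAGE_PROMPT, gather_grounding, build_grounded_prompt, answer_image_query) is hypothetical, and the vision model, language model, and reverse image search are passed in as plain callables because the abstract does not name any specific model or API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Grounding:
    """Hypothetical container for grounding text gathered for one input image."""
    visual: str = ""  # description from the visual-based generative model
    reverse_search: List[str] = field(default_factory=list)  # reverse image search hits


# Assumed wording: one comprehensive prompt sent once per image,
# rather than a series of narrow prompts.
COMPREHENSIVE_IMAGE_PROMPT = (
    "Describe this image in detail: objects, text, people, setting, and actions."
)


def gather_grounding(
    image: bytes,
    vision_model: Callable[[bytes, str], str],
    reverse_search: Optional[Callable[[bytes], List[str]]] = None,
) -> Grounding:
    """Collect visual grounding and, if a search callable is given, search grounding."""
    visual = vision_model(image, COMPREHENSIVE_IMAGE_PROMPT)
    hits = reverse_search(image) if reverse_search else []
    return Grounding(visual=visual, reverse_search=list(hits))


def build_grounded_prompt(query: str, grounding: Grounding) -> str:
    """Combine the grounding text and the user's query into one text prompt."""
    parts = ["Image description:", grounding.visual]
    if grounding.reverse_search:
        parts.append("Reverse image search results:")
        parts.extend(f"- {hit}" for hit in grounding.reverse_search)
    parts.append(f"Question: {query}")
    return "\n".join(parts)


def answer_image_query(
    image: bytes,
    query: str,
    vision_model: Callable[[bytes, str], str],
    language_model: Callable[[str], str],
    reverse_search: Optional[Callable[[bytes], List[str]]] = None,
) -> str:
    """Answer a natural language query about an image using grounded text only."""
    grounding = gather_grounding(image, vision_model, reverse_search)
    return language_model(build_grounded_prompt(query, grounding))
```

In this sketch the vision model is called once per image, and the language model sees only text. That is one plausible reading of how a single comprehensive prompt could reduce the number of model calls. Reverse image search is optional, matching the abstract's "additionally, or in alternative implementations".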
Inventor(s): Vishrav CHAUDHARY, Bradley Moore ABRAMS, Kamal GINOTRA, Owais Khan MOHAMMED, Barun PATRA, Michael Lawrence VALENZUELA
CPC Classification: G06F16/532 (Query formulation, e.g. graphical querying)