18502747. USING LARGE LANGUAGE MODELS FOR SIMILARITY DETERMINATIONS IN CONTENT GENERATION SYSTEMS AND APPLICATIONS (NVIDIA Corporation)

From WikiPatents
Revision as of 07:27, 19 December 2024 by Wikipatents (talk | contribs) (Creating a new page)

USING LARGE LANGUAGE MODELS FOR SIMILARITY DETERMINATIONS IN CONTENT GENERATION SYSTEMS AND APPLICATIONS

Organization Name

NVIDIA Corporation

Inventor(s)

Denis Laprise of Palo Alto CA (US)

Shuang Wu of Fremont CA (US)

Ge Cong of Pleasanton CA (US)

Mark Wheeler of Saratoga CA (US)

This abstract first appeared for US patent application 18502747 titled 'USING LARGE LANGUAGE MODELS FOR SIMILARITY DETERMINATIONS IN CONTENT GENERATION SYSTEMS AND APPLICATIONS'.



Original Abstract Submitted

Approaches presented herein provide for the ability to process, store, index, and search geospatial information such as maps with flexible granularity. A set of observations, such as may include sensor data captured for a region of an environment, can be fed as input to a language model. The language model can generate a tokenized description of the region, as may include a text string of tokens encapsulating semantics, topology, geometry, and/or other aspects of the region. A feature vector or embeddings for the region can be generated based on the tokenized description, and a similarity search performed against a vector database, for example, to identify similar feature vectors corresponding to similar regions or domains. Labels or other information associated with these similar feature vectors can be automatically applied to the example region. Clustering of feature vectors or other embeddings can also be performed based in part on the similarity.
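
The pipeline described in the abstract (tokenized description, feature vector, similarity search, label transfer) can be illustrated with a minimal sketch. It is not the method claimed in the application. Everything here is an assumption made for illustration: the token strings are hand-written stand-ins for what a language model might emit, the embedding is a plain normalized bag-of-tokens vector rather than a learned one, and `VectorIndex` is a hypothetical brute-force substitute for a vector database. Clustering is omitted.

```python
import math
from collections import Counter


def embed(description):
    """Toy embedding: L2-normalized token counts stored as a sparse dict."""
    counts = Counter(description.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}


def cosine(a, b):
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


class VectorIndex:
    """Hypothetical brute-force stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (vector, label) pairs

    def add(self, description, label):
        self.items.append((embed(description), label))

    def search(self, description, k=3):
        """Return the k most similar stored entries as (score, label) pairs."""
        query = embed(description)
        scored = [(cosine(query, vec), label) for vec, label in self.items]
        return sorted(scored, key=lambda s: s[0], reverse=True)[:k]


def propose_label(index, description, k=3):
    """Majority vote over the labels of the k most similar stored regions."""
    votes = Counter(label for _, label in index.search(description, k))
    return votes.most_common(1)[0][0]


# Hand-written token strings standing in for language-model output.
index = VectorIndex()
index.add("four_way intersection traffic_light crosswalk two_lane",
          "signalized_intersection")
index.add("four_way intersection traffic_light crosswalk turn_lane",
          "signalized_intersection")
index.add("straight highway three_lane median no_crosswalk", "highway_segment")
index.add("straight highway two_lane shoulder no_crosswalk", "highway_segment")

new_region = "four_way intersection traffic_light crosswalk bike_lane"
label = propose_label(index, new_region, k=3)
print(label)  # signalized_intersection
```

The new region shares four of its five tokens with each stored intersection entry and none with the highway entries, so the two nearest neighbors both carry the intersection label and the vote transfers that label to the new region. A production system would presumably replace the linear scan with an approximate nearest-neighbor index and the token counts with model-generated embeddings.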