17934590. SEARCHING A DATA SOURCE USING EMBEDDINGS OF A VECTOR SPACE simplified abstract (International Business Machines Corporation)

From WikiPatents

SEARCHING A DATA SOURCE USING EMBEDDINGS OF A VECTOR SPACE

Organization Name

International Business Machines Corporation

Inventor(s)

Richard Obinna Osuala of Munich (DE)

Dominik Moritz Stein of Oberding (DE)

Andrea Giovannini of Zurich (CH)

SEARCHING A DATA SOURCE USING EMBEDDINGS OF A VECTOR SPACE - A simplified explanation of the abstract

This abstract first appeared for US patent application 17934590, titled 'SEARCHING A DATA SOURCE USING EMBEDDINGS OF A VECTOR SPACE'.

Simplified Explanation

The abstract describes a method for querying a data source whose data objects are represented by embeddings in a vector space. A processor inputs a received query and at least one token to a trained embedding generation model, which returns a set of embeddings in that vector space. The set contains an embedding of the query and one embedding per token, where each token's embedding is a prediction of the embedding of a supplement of the query. The data object embeddings are then searched for matches with this set, yielding search result embeddings. The data objects represented by the search result embeddings are determined, and at least part of them may be provided.

  • A trained embedding generation model receives the query and at least one token
  • The model returns an embedding of the query plus, for each token, a predicted embedding of a supplement of the query
  • Data object embeddings that match this set are found, and the data objects they represent are determined
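
The flow described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent application: the bag-of-words `embed_text` function stands in for a learned embedding, `toy_embedding_model` stands in for the trained embedding generation model, the token name `[SUPP_1]` and its fixed supplement are hypothetical, and cosine similarity with a threshold is only one possible way to decide that two embeddings "match".

```python
import math
from typing import Dict, List, Sequence

# Hypothetical toy vocabulary; a real system would use a learned vector space.
VOCAB = ["laptop", "notebook", "battery", "charger", "phone", "case"]

# Hypothetical lookup: the supplement each special token stands for.
# A trained model would predict the supplement embedding from the query instead.
TOY_SUPPLEMENTS = {"[SUPP_1]": "notebook", "[SUPP_2]": "charger"}


def embed_text(text: str) -> List[float]:
    """Toy bag-of-words embedding over VOCAB (stand-in for a learned embedding)."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def toy_embedding_model(query: str, tokens: Sequence[str]) -> List[List[float]]:
    """Return the query embedding followed by one supplement embedding per token."""
    embeddings = [embed_text(query)]
    for token in tokens:
        embeddings.append(embed_text(TOY_SUPPLEMENTS[token]))
    return embeddings


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: str, tokens: Sequence[str], data_objects: Dict[str, str],
           threshold: float = 0.5) -> List[str]:
    """Return ids of data objects whose embedding matches any query-side embedding."""
    # Data source represented by data object embeddings in the same vector space.
    index = {obj_id: embed_text(text) for obj_id, text in data_objects.items()}
    query_side = toy_embedding_model(query, tokens)
    scored = []
    for obj_id, obj_embedding in index.items():
        best = max(cosine_similarity(q, obj_embedding) for q in query_side)
        if best >= threshold:
            scored.append((best, obj_id))
    # Highest similarity first; ties broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [obj_id for _, obj_id in scored]


DATA_OBJECTS = {
    "doc1": "laptop battery",
    "doc2": "notebook case",
    "doc3": "phone charger",
    "doc4": "phone case",
}

print(search("laptop", [], DATA_OBJECTS))            # ['doc1']
print(search("laptop", ["[SUPP_1]"], DATA_OBJECTS))  # ['doc1', 'doc2']
```

In this toy setup the query embedding alone retrieves only `doc1`, while the additional token embedding also retrieves `doc2`, which shares no word with the query. That is the kind of effect the supplement embeddings appear intended to provide, although the abstract does not say how the model is trained or how matches are scored.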

Potential Applications

The technology can be applied in:

  • Information retrieval systems
  • Recommendation systems
  • Natural language processing tasks

Problems Solved

This technology addresses:

  • Efficient searching and matching of data object embeddings
  • Enhancing query processing and data retrieval accuracy

Benefits

The benefits of this technology include:

  • Improved search results
  • Enhanced data object matching
  • Increased efficiency in data retrieval processes

Potential Commercial Applications

The technology can be utilized in:

  • E-commerce platforms for product recommendations
  • Search engines for improved query results
  • Data analytics tools for enhanced data processing

Possible Prior Art

Possible prior art for this technology includes:

  • Similar methods used in information retrieval systems
  • Existing techniques in natural language processing for embedding generation

Unanswered Questions

How does the trained embedding generation model handle different types of data objects?

The abstract does not specify how the model adapts to various data object types and structures.

What is the computational complexity of the searching and matching process for large datasets?

The abstract does not provide information on the scalability of the technology for handling extensive data sources.
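
As a point of reference only, the cost of a naive baseline can be estimated. The sketch below assumes an exhaustive comparison in which every query-side embedding is compared with every data object embedding using a dot product. This is an assumption for illustration: the abstract does not state which search strategy is used, and the function name and example sizes are hypothetical.

```python
def exhaustive_search_cost(num_query_embeddings: int,
                           num_data_objects: int,
                           dim: int) -> int:
    """Multiply-add count for comparing every query-side embedding
    with every data object embedding of dimension `dim`."""
    return num_query_embeddings * num_data_objects * dim


# Hypothetical sizes: one query embedding plus two token embeddings,
# one million data objects, 384-dimensional vectors.
print(exhaustive_search_cost(3, 1_000_000, 384))  # 1152000000
```

Under this assumption the work grows linearly with the number of data objects and with the number of tokens supplied alongside the query.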


Original Abstract Submitted

In several aspects for querying a data source represented by data object embeddings in a vector space, a processor inputs, to a trained embedding generation model, a received query and at least one token for receiving from the trained embedding generation model a set of embeddings of the vector space. The set of embeddings comprises an embedding of the received query and at least one embedding of the at least one token respectively, wherein the embedding of each token is a prediction of an embedding of a supplement of the query. The data object embeddings may be searched for data object embeddings that match the set of embeddings. This may result in search result embeddings of the set of embeddings. Data objects that are represented by the search result embeddings may be determined. At least part of the determined data objects may be provided.