Qualcomm Incorporated (20240134908). SOUND SEARCH simplified abstract

From WikiPatents

SOUND SEARCH

Organization Name

Qualcomm Incorporated

Inventor(s)

Rehana Mahfuz of San Diego, CA (US)

Yinyi Guo of San Diego, CA (US)

Erik Visser of San Diego, CA (US)

SOUND SEARCH - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240134908 titled 'SOUND SEARCH'.

Simplified Explanation

The abstract describes a device whose processors turn a query into one or more query caption embeddings, compare them against caption embeddings precomputed for a set of media files using a similarity metric, and generate search results identifying the media files whose captions best match the query.

  • The device uses processors to generate query caption embeddings based on a query.
  • The system selects caption embeddings from a set of media files based on similarity metrics.
  • Search results are generated to identify media files associated with the selected caption embeddings.
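The selection step above can be illustrated with a small sketch. The abstract does not name the similarity metric, so cosine similarity is assumed here, and the embedding vectors are toy values standing in for the output of a learned text encoder:

```python
import math

def cosine_similarity(a, b):
    """Assumed similarity metric between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_captions(query_emb, caption_embs, top_k=2):
    """Rank caption embeddings by similarity to the query embedding
    and keep the top_k matches (file name, score)."""
    scores = [(name, cosine_similarity(query_emb, emb))
              for name, emb in caption_embs.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]

# Hypothetical caption embeddings for three media files.
caption_embs = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rain.wav": [0.1, 0.9, 0.2],
    "thunder.wav": [0.2, 0.8, 0.5],
}
# Hypothetical embedding of a query such as "heavy rain falling".
query_emb = [0.1, 0.85, 0.3]

print(select_captions(query_emb, caption_embs))
```

With these toy vectors, the rain and thunder captions rank above the dog bark, so their media files would appear in the search results.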

Potential Applications

The technology described in this patent application could be applied in the following areas:

  • Content recommendation systems
  • Audio search engines
  • Multimedia content retrieval systems

Problems Solved

The technology addresses the following issues:

  • Improving search accuracy for sound-based queries
  • Enhancing user experience in searching for specific audio content

Benefits

The technology offers the following benefits:

  • Efficient retrieval of relevant media files based on sound descriptions
  • Enhanced search capabilities for audio content
  • Improved user satisfaction in finding specific sound-based information

Potential Commercial Applications

The technology could be commercially applied in:

  • Music streaming services
  • Podcast platforms
  • Audio editing software

Possible Prior Art

One possible prior art for this technology could be the use of text-based search algorithms in multimedia content retrieval systems.

Unanswered Questions

How does the system handle variations in sound descriptions?

The abstract does not specify how the system accounts for different ways of describing sounds in the query and caption embeddings.

What is the computational complexity of the similarity metric used in selecting caption embeddings?

The abstract does not provide information on the computational resources required for calculating the similarity metric between query and caption embeddings.


Original Abstract Submitted

a device includes one or more processors configured to generate one or more query caption embeddings based on a query. the processor(s) are further configured to select one or more caption embeddings from among a set of embeddings associated with a set of media files of a file repository. each caption embedding represents a corresponding sound caption, and each sound caption includes a natural-language text description of a sound. the caption embedding(s) are selected based on a similarity metric indicative of similarity between the caption embedding(s) and the query caption embedding(s). the processor(s) are further configured to generate search results identifying one or more first media files of the set of media files. each of the first media file(s) is associated with at least one of the caption embedding(s).
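Read as pseudocode, the abstract's steps can be sketched end to end. The encoder is not specified in the abstract, so `embed` below is a hypothetical stand-in (a fixed-vocabulary bag-of-words vector) used only to keep the example self-contained; a dot product stands in for the similarity metric:

```python
from dataclasses import dataclass

@dataclass
class MediaFile:
    path: str
    sound_caption: str  # natural-language text description of a sound

# Tiny fixed vocabulary for the toy encoder (an assumption, not the patent's method).
VOCAB = ["dog", "barking", "rain", "window", "loudly", "hitting"]

def embed(text: str) -> list[float]:
    """Hypothetical stand-in for a learned text encoder:
    counts vocabulary words to produce a fixed-size vector."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def search(query: str, repository: list[MediaFile], top_k: int = 1) -> list[str]:
    """Steps from the abstract: embed the query, compare it against each
    file's caption embedding, and return the best-matching media files."""
    q = embed(query)
    ranked = sorted(repository,
                    key=lambda m: dot(q, embed(m.sound_caption)),
                    reverse=True)
    return [m.path for m in ranked[:top_k]]

repo = [
    MediaFile("clip1.wav", "a dog barking loudly"),
    MediaFile("clip2.wav", "rain hitting a window"),
]
print(search("dog barking", repo))  # → ['clip1.wav']
```

Here `repo` plays the role of the file repository, and the returned paths correspond to the "first media files" identified in the search results.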