RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO

Organization Name

Inventor(s)

Diego Melendo Casado of Mountain View CA (US)

Ignacio Lopez Moreno of Brooklyn NY (US)

Javier Gonzalez-Dominguez of Madrid (ES)

RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240221737, titled 'RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO'.

The technology described in this document involves a computer-implemented method that detects a user's utterance while a speaker device is playing audio and, in response, reduces the speaker device's audio output level.

The method includes receiving, at a processing system, a first signal that contains both the output of a speaker device and an additional audio signal.
It determines that the additional audio signal corresponds to a user's utterance, based at least in part on a model trained to identify the output of the speaker device.
On making that determination, the method initiates a reduction in the speaker device's audio output level.
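
The three steps can be sketched in Python as a small control loop. This is a minimal illustration under stated assumptions, not the method claimed in the application: `SpeakerDevice`, `process_first_signal`, `make_toy_model`, the 0.3 reduction factor, and the 0.01 energy threshold are all hypothetical names and values, and the toy model merely stands in for the trained model by comparing the received signal against a known playback waveform.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SpeakerDevice:
    """Hypothetical stand-in for a speaker with an adjustable output level."""
    output_level: float = 1.0

    def reduce_output(self, factor: float) -> None:
        # Scale the current level down; the factor is an illustrative choice.
        self.output_level *= factor


def energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of a frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def make_toy_model(
    known_playback: Sequence[float], threshold: float = 0.01
) -> Callable[[Sequence[float]], bool]:
    """Build a placeholder for the trained model (assumption, not from the source).

    The returned function treats whatever remains after subtracting the known
    playback as the additional audio and flags it as a user utterance when its
    energy exceeds the threshold.
    """
    def is_user_utterance(first_signal: Sequence[float]) -> bool:
        residual = [m - p for m, p in zip(first_signal, known_playback)]
        return energy(residual) > threshold

    return is_user_utterance


def process_first_signal(
    first_signal: Sequence[float],
    speaker: SpeakerDevice,
    is_user_utterance: Callable[[Sequence[float]], bool],
    reduction_factor: float = 0.3,
) -> bool:
    """Receive the first signal, classify it, and reduce output on a detection."""
    if is_user_utterance(first_signal):
        speaker.reduce_output(reduction_factor)
        return True
    return False


if __name__ == "__main__":
    playback = [0.1, -0.1, 0.1, -0.1]
    voice = [0.5, 0.4, -0.5, -0.4]
    mixed = [p + v for p, v in zip(playback, voice)]

    speaker = SpeakerDevice()
    model = make_toy_model(playback)
    print(process_first_signal(playback, speaker, model), speaker.output_level)
    print(process_first_signal(mixed, speaker, model), speaker.output_level)
```

Passing the model in as a callable keeps the control flow independent of how the utterance decision is made, so a real trained classifier could replace the toy one without changing `process_first_signal`.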

Potential Applications:
- Voice-controlled devices
- Noise-canceling systems
- Speech recognition technology

Problems Solved:
- Improving user experience by reducing background noise during user interactions
- Enhancing the accuracy of voice recognition systems

Benefits:
- Clearer communication between users and devices
- Enhanced privacy by reducing unintended audio output

Commercial Applications:
- Smart home devices
- Virtual assistants
- Communication systems

Questions about the technology:
1. How does the technology differentiate between background noise and user utterances?
2. What are the potential limitations of this method in real-world environments?

Frequently Updated Research:
- Stay updated on advancements in speech recognition technology and noise reduction algorithms to enhance the performance of the system.

Original Abstract Submitted

The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.