Google LLC (20240221737). RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO simplified abstract
RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO
Organization Name
Google LLC
Inventor(s)
Diego Melendo Casado of Mountain View CA (US)
Ignacio Lopez Moreno of Brooklyn NY (US)
Javier Gonzalez-dominguez of Madrid (ES)
RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240221737 titled 'RECOGNIZING SPEECH IN THE PRESENCE OF ADDITIONAL AUDIO'.
The technology described in this document is a computer-implemented method that detects a user's utterance while a speaker device is producing audio and, in response, reduces the speaker device's audio output level.
- A processing system receives a first signal that includes the output of a speaker device together with an additional audio signal.
- Using a model trained to identify the speaker device's output, the processing system determines that the additional audio signal corresponds to a user's utterance.
- The method initiates a reduction in the speaker device's audio output level upon detecting the user's utterance.
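The three steps above can be sketched as a small control loop. This is only an illustrative outline, not the implementation described in the application: the `SpeakerDevice` class, the `process_frame` function, the `is_user_utterance` callable (standing in for the trained model), and the `duck_factor` value are all hypothetical names and parameters chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SpeakerDevice:
    """Hypothetical stand-in for a speaker with an adjustable output level."""
    level: float = 1.0  # 1.0 means full volume (assumed scale)

    def reduce_level(self, factor: float) -> None:
        # Scale the output level down and clamp it to the range [0, 1].
        self.level = max(0.0, min(1.0, self.level * factor))


def process_frame(
    frame: Sequence[float],
    is_user_utterance: Callable[[Sequence[float]], bool],
    speaker: SpeakerDevice,
    duck_factor: float = 0.25,  # assumed reduction factor, not from the source
) -> bool:
    """Apply the three steps to one captured frame.

    Returns True if a reduction of the output level was initiated.
    """
    # Step 1: `frame` is the received signal (speaker output plus any
    # additional audio picked up at the same time).
    # Step 2: ask the (assumed) trained model whether the additional audio
    # corresponds to a user's utterance.
    if is_user_utterance(frame):
        # Step 3: initiate a reduction in the speaker's output level.
        speaker.reduce_level(duck_factor)
        return True
    return False
```

In use, a caller would pass each captured frame to `process_frame` along with whatever detector it has; the detector is kept as a plain callable here so that the sketch makes no claim about how the model itself works.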
Potential Applications:
- Voice-controlled devices
- Noise-canceling systems
- Speech recognition technology
Problems Solved:
- Improving user experience by reducing background noise during user interactions
- Enhancing the accuracy of voice recognition systems
Benefits:
- Clearer communication between users and devices
- Enhanced privacy by reducing unintended audio output
Commercial Applications:
- Smart home devices
- Virtual assistants
- Communication systems
Questions about the technology:
1. How does the technology differentiate between background noise and user utterances?
2. What are the potential limitations of this method in real-world environments?
Frequently Updated Research:
- Stay updated on advancements in speech recognition technology and noise reduction algorithms to enhance the performance of the system.
Original Abstract Submitted
The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a processing system, a first signal including an output of a speaker device and an additional audio signal. The method also includes determining, by the processing system, based at least in part on a model trained to identify the output of the speaker device, that the additional audio signal corresponds to an utterance of a user. The method further includes initiating a reduction in an audio output level of the speaker device based on determining that the additional audio signal corresponds to the utterance of the user.
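The abstract does not say what form the "model trained to identify the output of the speaker device" takes. Purely as a toy illustration of the idea, the sketch below fits a one-class model to the frame energy of speaker-only audio and flags frames that deviate from it. The class `SpeakerOutputModel`, the RMS-energy feature, and the three-sigma threshold are assumptions made for this example, and an energy deviation on its own would not establish that the extra audio is a user's utterance.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


class SpeakerOutputModel:
    """Toy one-class model of speaker-only audio (hypothetical, for illustration)."""

    def __init__(self, threshold_sigmas: float = 3.0) -> None:
        self.threshold_sigmas = threshold_sigmas  # assumed decision threshold
        self.mu = 0.0
        self.sigma = 1.0

    def fit(self, speaker_only_frames: List[Sequence[float]]) -> None:
        # "Training": record the mean and spread of speaker-only frame energy.
        energies = [rms(f) for f in speaker_only_frames]
        self.mu = mean(energies)
        # Fall back to a tiny spread if all training energies are identical.
        self.sigma = pstdev(energies) or 1e-6

    def looks_like_speaker_only(self, frame: Sequence[float]) -> bool:
        # A frame matches the model if its energy is close to the learned mean.
        return abs(rms(frame) - self.mu) <= self.threshold_sigmas * self.sigma


# Usage: fit on frames known to contain only speaker output, then test new frames.
model = SpeakerOutputModel()
model.fit([[0.10, -0.10, 0.10, -0.10],
           [0.12, -0.12, 0.12, -0.12],
           [0.08, -0.08, 0.08, -0.08]])
quiet_frame = [0.10, -0.10, 0.10, -0.10]   # resembles the training data
loud_frame = [0.90, -0.90, 0.90, -0.90]    # far more energy than the speaker alone
```

A frame for which `looks_like_speaker_only` returns False is only a candidate for containing additional audio; a real system would need a further step to decide whether that audio is speech from a user.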