20240029731. Voice Detection By Multiple Devices simplified abstract (Sonos, Inc.)

From WikiPatents

Voice Detection By Multiple Devices

Organization Name

Sonos, Inc.

Inventor(s)

Jonathon Reilly of Cambridge MA (US)

Gregory Burlingame of Woburn MA (US)

Christopher Butts of Evanston IL (US)

Romi Kadri of Cambridge MA (US)

Jonathan P. Lang of Santa Barbara CA (US)

Voice Detection By Multiple Devices - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240029731, titled 'Voice Detection By Multiple Devices'.

Simplified Explanation

The patent application describes techniques for voice detection using multiple networked microphone devices (NMDs). One or more servers receive, via a network interface, data representing multiple audio recordings of a voice input spoken by a user. Each audio recording is captured by a respective NMD of the multiple NMDs, and the voice input includes a detected wake-word.

Based on the sound pressure levels of the multiple audio recordings, the servers select a particular NMD and forgo selecting the other NMDs. The servers then send data representing a playback command to the selected NMD via the network interface. This playback command corresponds to a voice command in the voice input captured in the audio recordings, and it causes the selected NMD to play back audio content accordingly.

  • The patent application describes a system for voice detection using multiple NMDs.
  • The system receives audio recordings of a voice input spoken by a user.
  • Each audio recording is captured by a respective NMD from the multiple NMDs.
  • The system selects a particular NMD based on the sound pressure levels of the audio recordings.
  • The selected NMD receives a playback command corresponding to a voice command in the voice input.
  • The selected NMD plays back audio content according to the playback command.
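
The selection step summarized above can be illustrated with a minimal Python sketch. It is not the method claimed in the application: the abstract says only that the selection is "based on" sound pressure levels, so choosing the recording with the highest level is an assumption made here for illustration. The names `Recording`, `select_nmd`, and `build_playback_command`, the device identifiers, and the command dictionary format are all hypothetical, and the samples are assumed to be calibrated pressure values in pascals.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence

# Standard reference pressure for dB SPL in air (20 micropascals).
REFERENCE_PRESSURE_PA = 20e-6


@dataclass
class Recording:
    """One NMD's recording of the voice input (hypothetical structure)."""
    nmd_id: str                   # hypothetical device identifier
    samples_pa: Sequence[float]   # assumed: calibrated pressure samples in pascals


def sound_pressure_level_db(samples_pa: Sequence[float]) -> float:
    """Return the sound pressure level in dB SPL: 20 * log10(p_rms / p_ref)."""
    if not samples_pa:
        raise ValueError("recording has no samples")
    p_rms = math.sqrt(sum(s * s for s in samples_pa) / len(samples_pa))
    if p_rms == 0.0:
        return float("-inf")  # silence
    return 20.0 * math.log10(p_rms / REFERENCE_PRESSURE_PA)


def select_nmd(recordings: List[Recording]) -> str:
    """Select one NMD and forgo the rest.

    Assumption: the recording with the highest sound pressure level wins.
    The abstract does not state the exact selection rule.
    """
    if not recordings:
        raise ValueError("no recordings received")
    best = max(recordings, key=lambda r: sound_pressure_level_db(r.samples_pa))
    return best.nmd_id


def build_playback_command(recordings: List[Recording], voice_command: str) -> Dict[str, str]:
    """Build the data a server might send to the selected NMD (hypothetical format)."""
    return {"target_nmd": select_nmd(recordings), "command": voice_command}


if __name__ == "__main__":
    recordings = [
        Recording("kitchen", [0.02, -0.02, 0.02, -0.02]),
        Recording("living_room", [0.2, -0.2, 0.2, -0.2]),
    ]
    print(build_playback_command(recordings, "play jazz"))
```

In this example the "living_room" recording has the larger RMS pressure, so that device is the one selected. A real system would also need wake-word detection, speech recognition to extract the voice command, and network transport, none of which is modeled in this sketch.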

Potential applications of this technology:

  • Voice assistants: The system can be used in voice assistant devices to accurately detect and process voice commands.
  • Smart speakers: This technology can enhance the performance of smart speakers by improving voice detection and command execution.
  • Voice-controlled devices: It can be applied in various voice-controlled devices such as smartphones, home automation systems, and automotive systems.

Problems solved by this technology:

  • Improved voice detection: When several NMDs record the same voice input, the servers compare the sound pressure levels of the recordings and select a single NMD, so that only one device acts on the command.
  • Efficient command execution: The system ensures that the selected NMD receives the appropriate playback command, leading to efficient execution of voice commands.

Benefits of this technology:

  • Enhanced user experience: Accurate voice detection and efficient command execution result in a smoother and more reliable user experience.
  • Improved performance in noisy environments: Selecting an NMD based on the sound pressure levels of the recordings may help when background noise affects some devices more than others.
  • Increased reliability: By using multiple NMDs, the system provides redundancy and improves the reliability of voice detection and command execution.


Original Abstract Submitted

Disclosed herein are example techniques for voice detection by multiple NMDs. An example implementation may involve one or more servers receiving, via a network interface, data representing multiple audio recordings of a voice input spoken by a given user, each audio recording recorded by a respective NMD of the multiple NMDs, wherein the voice input comprises a detected wake-word. Based on respective sound pressure levels of the multiple audio recordings of the voice input, the servers (i) select a particular NMD of the multiple NMDs and (ii) forego selection of other NMDs of the multiple NMDs. The servers send, via the network interface to the particular NMD, data representing a playback command that corresponds to a voice command in the voice input represented in the multiple audio recordings, wherein the data representing the playback command causes the particular NMD to play back audio content according to the playback command.