18432282. END-TO-END STREAMING KEYWORD SPOTTING simplified abstract (GOOGLE LLC)
Contents
- 1 END-TO-END STREAMING KEYWORD SPOTTING
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 END-TO-END STREAMING KEYWORD SPOTTING - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Potential Applications
- 1.6 Problems Solved
- 1.7 Benefits
- 1.8 Potential Commercial Applications
- 1.9 Possible Prior Art
- 1.10 Original Abstract Submitted
END-TO-END STREAMING KEYWORD SPOTTING
Organization Name
GOOGLE LLC
Inventor(s)
Raziel Alvarez Guevara of Menlo Park CA (US)
Hyun Jin Park of Palo Alto CA (US)
END-TO-END STREAMING KEYWORD SPOTTING - A simplified explanation of the abstract
This abstract first appeared for US patent application 18432282, titled 'END-TO-END STREAMING KEYWORD SPOTTING'.
Simplified Explanation
The abstract describes a method for detecting a hotword in streaming audio using a neural network built from sequentially stacked SVDF layers. Each neuron in the network filters audio features in two stages, and the network generates a probability score for the presence of the hotword.
- The method uses a neural network with SVDF layers to detect a hotword in streaming audio.
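- Each neuron in the network includes a memory component and two filtering stages: the first stage filters the audio features of each input frame individually and writes the result to the memory component, and the second stage filters all the filtered features held in that memory.
- The probability score generated by the network is compared against a hotword detection threshold; if the threshold is satisfied, the user device initiates a wake-up process to handle the terms that follow.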
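The two filtering stages can be illustrated with a minimal Python sketch. This is not the patented implementation: the class name `SVDFNeuron`, the parameter names `feature_filter` and `time_filter`, the fixed-length memory, and all weight values are assumptions chosen for illustration, and a real network would stack many such neurons across several layers and learn the weights from data.

```python
from collections import deque


class SVDFNeuron:
    """Illustrative SVDF-style neuron (hypothetical names and weights)."""

    def __init__(self, feature_filter, time_filter):
        # Stage 1 weights: one per audio feature in a frame.
        self.feature_filter = list(feature_filter)
        # Stage 2 weights: one per slot in the memory component.
        self.time_filter = list(time_filter)
        # Fixed-size memory of recent stage 1 outputs, oldest first.
        size = len(self.time_filter)
        self.memory = deque([0.0] * size, maxlen=size)

    def step(self, frame):
        if len(frame) != len(self.feature_filter):
            raise ValueError("frame size must match feature filter size")
        # Stage 1: filter the features of this frame on its own.
        filtered = sum(w * x for w, x in zip(self.feature_filter, frame))
        # Write to memory; the oldest entry is dropped automatically.
        self.memory.append(filtered)
        # Stage 2: filter everything currently held in memory.
        return sum(w * m for w, m in zip(self.time_filter, self.memory))


# Feed three two-feature frames through one neuron, one frame at a time.
neuron = SVDFNeuron(feature_filter=[0.5, -0.25], time_filter=[0.1, 0.2, 0.7])
frames = [[1.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
outputs = [neuron.step(frame) for frame in frames]
```

Because each call to `step` touches only the newest frame and a small fixed-size memory, the neuron can run on streaming audio without buffering the whole utterance.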
Potential Applications
This technology can be applied in:
- Voice-activated devices
- Speech recognition systems
- Virtual assistants
Problems Solved
This technology helps in:
- Improving accuracy in detecting specific keywords in audio streams
- Enhancing user experience with voice-controlled devices
- Enabling hands-free operation of devices
Benefits
The benefits of this technology include:
- Efficient detection of hotwords in streaming audio
- Quick response time in initiating actions based on detected keywords
- Enhanced user interaction with devices through voice commands
Potential Commercial Applications
This technology can commercially benefit:
- Smart home devices
- Automotive voice control systems
- Customer service chatbots
Possible Prior Art
Possible prior art includes earlier deep learning models for speech recognition and keyword detection in audio streams.
Unanswered Questions
How does this technology compare to existing methods for hotword detection in streaming audio?
The abstract does not include a comparison. It describes a neural network with SVDF layers for hotword detection, which may offer improved accuracy and efficiency compared to traditional methods.
What are the limitations of this technology in real-world applications?
The abstract does not state any limitations. Possible limitations include the need for significant computational resources for real-time processing and challenges in adapting to different accents or languages.
Original Abstract Submitted
A method for detecting a hotword includes receiving a sequence of input frames that characterize streaming audio captured by a user device and generating a probability score indicating a presence of a hotword in the streaming audio using a memorized neural network. The network includes sequentially-stacked single value decomposition filter (SVDF) layers and each SVDF layer includes at least one neuron. Each neuron includes a respective memory component, a first stage configured to perform filtering on audio features of each input frame individually and output to the memory component, and a second stage configured to perform filtering on all the filtered audio features residing in the respective memory component. The method also includes determining whether the probability score satisfies a hotword detection threshold and initiating a wake-up process on the user device for processing additional terms.
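The final decision step in the abstract, comparing the probability score with a hotword detection threshold and then initiating a wake-up process, might be sketched as follows. The function names `first_detection` and `run_stream`, the threshold value 0.8, the sample scores, and the reading of "satisfies" as "greater than or equal to" are all assumptions for illustration; the abstract does not specify them.

```python
def first_detection(scores, threshold=0.8):
    """Return the index of the first score that meets the threshold, or None."""
    for index, score in enumerate(scores):
        # Assumption: "satisfies the threshold" means score >= threshold.
        if score >= threshold:
            return index
    return None


def run_stream(scores, threshold, wake_up):
    """Call the wake_up callback once if any score meets the threshold."""
    hit = first_detection(scores, threshold)
    if hit is not None:
        wake_up(hit)  # hypothetical hook standing in for the wake-up process
    return hit


# Per-frame probability scores, as a hotword network might emit them.
events = []
hit = run_stream([0.05, 0.10, 0.62, 0.91, 0.97], 0.8, events.append)
```

Here `events.append` stands in for whatever the device does on wake-up, such as handing the following audio to a full speech recognizer.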