LATENCY REDUCTION FOR MULTI-STAGE SPEECH RECOGNITION

Organization Name

QUALCOMM Incorporated

Inventor(s)

Sachin Raghunath Abdagire of San Diego CA (US)

LATENCY REDUCTION FOR MULTI-STAGE SPEECH RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240274127 titled 'LATENCY REDUCTION FOR MULTI-STAGE SPEECH RECOGNITION'.

Simplified Explanation: The patent application describes a two-stage keyword detection system that skips the second stage when the first stage already produces a high enough score, which reduces detection latency.

The system receives audio samples in a first audio frame and uses a first keyword detection model to compute a keyword detection score for that frame.
If that score exceeds a first threshold, the first model also scores each additional audio frame that arrives.
Each of those scores is compared to a second threshold that is greater than the first.
If a score exceeds the second threshold, processing of the first and additional frames by a second keyword detection model is skipped.

Key Features and Innovation:

Uses a first keyword detection model to score each incoming audio frame.
Skips the second keyword detection model when a score exceeds the higher, second threshold.
Reduces latency by avoiding second-stage processing of frames the first stage has already scored above that threshold.

Potential Applications:

Speech recognition systems
Voice-controlled devices
Audio transcription software

Problems Solved:

Reducing latency in multi-stage speech recognition
Avoiding redundant second-stage processing of audio frames whose first-stage score already exceeds the higher threshold

Benefits:

Lower latency between receiving audio frames and reaching a keyword detection decision
Less computation whenever the second keyword detection model is skipped
More responsive speech recognition applications

Commercial Applications: The technology could be applied in industries such as telecommunications, smart home devices, and transcription services to reduce the latency of keyword detection in audio processing systems.

Questions about Audio Sample Processing:

1. How does the system determine which frames to skip processing?
2. What are the potential limitations of using keyword detection models in audio sample analysis?

Original Abstract Submitted

Systems and techniques are provided for processing one or more audio samples. For example, a process can include receiving one or more audio samples in a first audio frame and determining, using a first keyword detection model, a first keyword detection score for the first audio frame. One or more audio samples can be received in additional audio frames. Based on the first keyword detection score exceeding a first threshold, the first keyword detection model can be used to determine a keyword detection score for each audio frame of the additional audio frames. The respective keyword detection score for each audio frame of the additional audio frames can be compared to a second threshold that is greater than the first threshold. Based on the respective keyword detection score exceeding the second threshold, using a second keyword detection model to process the first audio frame and the additional audio frames can be skipped.
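
The control flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name detect_keyword, the callables first_stage and second_stage, and the returned labels are all hypothetical. The abstract does not say what happens when the first score stays at or below the first threshold, whether one frame above the second threshold is enough to skip the second model, or what the second model does otherwise, so the sketch picks one plausible reading of each and marks it as an assumption in the comments.

```python
from typing import Callable, List, Sequence

Frame = Sequence[float]                      # one audio frame = a run of samples
FirstStage = Callable[[Frame], float]        # hypothetical: score for a single frame
SecondStage = Callable[[List[Frame]], bool]  # hypothetical: verdict over buffered frames


def detect_keyword(
    frames: Sequence[Frame],
    first_stage: FirstStage,
    second_stage: SecondStage,
    low_threshold: float,
    high_threshold: float,
) -> str:
    """Return a label naming which stage decided the outcome."""
    # The abstract requires the second threshold to be greater than the first.
    if high_threshold <= low_threshold:
        raise ValueError("high_threshold must be greater than low_threshold")
    if len(frames) == 0:
        return "no_keyword"

    first_frame, additional_frames = frames[0], frames[1:]

    # First model scores the first frame against the first (lower) threshold.
    # Assumption: a score at or below it means there is no candidate keyword.
    if first_stage(first_frame) <= low_threshold:
        return "no_keyword"

    buffered = [first_frame]
    for frame in additional_frames:
        buffered.append(frame)
        # Assumption: one additional frame above the higher threshold suffices.
        if first_stage(frame) > high_threshold:
            return "accepted_by_first_stage"  # second model is never called

    # Assumption: otherwise all buffered frames are handed to the second model.
    if second_stage(buffered):
        return "accepted_by_second_stage"
    return "rejected_by_second_stage"
```

In this sketch the latency saving comes from the early return inside the loop: when a first-stage score clears the higher threshold, the second model is never invoked on the buffered frames.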

QUALCOMM Incorporated (20240274127). LATENCY REDUCTION FOR MULTI-STAGE SPEECH RECOGNITION simplified abstract
