Google LLC (20240161769). Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech simplified abstract

From WikiPatents

Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech

Organization Name

Google LLC

Inventor(s)

Jacob Garrison of Seattle WA (US)

Jacob Scott Peplinski of Chandler AZ (US)

Joel Shor of Tokyo (JP)

Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240161769, titled 'Method for Detecting and Classifying Coughs or Other Non-Semantic Sounds Using Audio Feature Set Learned from Speech'.

Simplified Explanation

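The patent application describes a method for detecting coughs in an audio stream. The stream is pre-processed into segments, each segment is embedded by a model trained on speech with a self-supervised triplet loss, and a separate detection model scores each segment for coughs.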

  • Pre-processing steps are performed on the audio stream to create an input audio sequence with time-separated audio segments.
  • An embedding is generated for each segment by an embedding model that was trained, with a self-supervised triplet loss, to learn an audio feature set from speech audio clips.
  • The embedding for each segment is passed to a cough detection model, which outputs the probability that the segment contains a cough episode.
  • Cough metrics are generated for each detected cough episode in the input audio sequence.
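
The steps above can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not specify segment length, decision threshold, or which metrics are reported, so all of those are assumptions here. In particular, `toy_embedding` and `toy_cough_probability` are hypothetical stand-ins (simple energy features and a fixed logistic function) for the speech-trained embedding model and the cough detection model, which are not described in enough detail to reproduce.

```python
import math

def split_into_segments(samples, sample_rate, segment_seconds=1.0):
    """Cut a mono signal into consecutive, non-overlapping segments.
    A trailing partial segment is dropped. Segment length is an assumption."""
    size = int(sample_rate * segment_seconds)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]

def toy_embedding(segment):
    """Hypothetical stand-in for the embedding model:
    returns [RMS energy, zero-crossing rate] for one segment."""
    rms = math.sqrt(sum(x * x for x in segment) / len(segment))
    zcr = sum(1 for a, b in zip(segment, segment[1:]) if a * b < 0) / len(segment)
    return [rms, zcr]

def toy_cough_probability(embedding):
    """Hypothetical stand-in for the detection model:
    a fixed logistic function of the RMS energy."""
    return 1.0 / (1.0 + math.exp(-(20.0 * embedding[0] - 5.0)))

def group_episodes(probabilities, threshold=0.5):
    """Merge runs of consecutive above-threshold segments into
    (start_index, end_index_exclusive) pairs. The threshold is an assumption."""
    episodes, start = [], None
    for i, p in enumerate(probabilities):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            episodes.append((start, i))
            start = None
    if start is not None:
        episodes.append((start, len(probabilities)))
    return episodes

def cough_metrics(probabilities, episodes, segment_seconds=1.0):
    """Example per-episode metrics (names are illustrative only)."""
    return [
        {
            "start_s": s * segment_seconds,
            "duration_s": (e - s) * segment_seconds,
            "mean_probability": sum(probabilities[s:e]) / (e - s),
        }
        for s, e in episodes
    ]

def tone(amplitude, n):
    """Synthetic test signal alternating between +amplitude and -amplitude."""
    return [amplitude if i % 2 == 0 else -amplitude for i in range(n)]

# Five one-second segments at 100 Hz: quiet, loud, loud, quiet, quiet.
sample_rate = 100
signal = tone(0.01, 100) + tone(0.8, 200) + tone(0.01, 200)

segments = split_into_segments(signal, sample_rate)
probabilities = [toy_cough_probability(toy_embedding(s)) for s in segments]
episodes = group_episodes(probabilities)   # [(1, 3)]
metrics = cough_metrics(probabilities, episodes)
```

In this toy run the two loud segments are merged into a single episode starting at 1.0 s and lasting 2.0 s. In a real system the two stand-in functions would be replaced by trained models, and only the surrounding segmentation and grouping logic would look broadly similar.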

Potential Applications

This technology could be applied in healthcare settings for monitoring patients with respiratory conditions, in smart home devices for detecting coughs for safety or health purposes, and in call centers for quality assurance purposes.

Problems Solved

This technology solves the problem of accurately detecting cough episodes in audio streams, which can be challenging due to background noise and variations in cough sounds.

Benefits

The benefits of this technology include early detection of cough episodes, improved monitoring of respiratory conditions, and enhanced quality assurance in various industries.

Potential Commercial Applications

The technology could be used in healthcare monitoring devices, smart home systems, call center software, and audio analysis tools for various industries.

Possible Prior Art

Possible prior art includes existing cough detection algorithms used in healthcare settings or in audio processing software.

Unanswered Questions

How does the self-supervised triplet loss embedding model improve cough detection accuracy?

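According to the abstract, the embedding model learns its audio feature set from speech audio clips using a self-supervised triplet loss. A triplet loss trains a model to place related audio segments closer together in the embedding space than unrelated ones, which can make non-semantic sounds such as coughs easier for a downstream detection model to separate. The abstract does not quantify any accuracy gain.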
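
For readers unfamiliar with the term, a triplet loss in its common hinge form can be sketched as below. The abstract does not give the exact formulation, distance function, or margin used, so the Euclidean distance and `margin=1.0` are assumptions. How triplets are chosen is also an assumption: one common self-supervised choice is to take the anchor and positive from the same audio clip and the negative from a different clip, which needs no labels.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-form triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).
    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

anchor = [0.0, 0.0]
positive = [0.0, 1.0]       # distance 1 from the anchor

far_negative = [3.0, 4.0]   # distance 5 from the anchor
near_negative = [1.0, 0.0]  # distance 1 from the anchor

loss_far = triplet_loss(anchor, positive, far_negative)    # 0.0
loss_near = triplet_loss(anchor, positive, near_negative)  # 1.0
```

The well-separated negative contributes no loss, while the negative that sits as close to the anchor as the positive is penalized, which is what pushes unrelated segments apart during training.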

What are the limitations of this method in detecting coughs in real-world environments?

The method may face challenges in noisy environments or with variations in cough sounds that are not well-represented in the training data.


Original Abstract Submitted

A method of detecting a cough in an audio stream includes a step of performing one or more pre-processing steps on the audio stream to generate an input audio sequence comprising a plurality of time-separated audio segments. An embedding is generated by a self-supervised triplet loss embedding model for each of the segments of the input audio sequence using an audio feature set, the embedding model having been trained to learn the audio feature set in a self-supervised triplet loss manner from a plurality of speech audio clips from a speech dataset. The embedding for each of the segments is provided to a model performing cough detection inference. This model generates a probability that each of the segments of the input audio sequence includes a cough episode. The method includes generating cough metrics for each of the cough episodes detected in the input audio sequence.