18055739. DETECTING AND CLASSIFYING FILLER WORDS IN AUDIO USING NEURAL NETWORKS simplified abstract (ADOBE INC.)

From WikiPatents

DETECTING AND CLASSIFYING FILLER WORDS IN AUDIO USING NEURAL NETWORKS

Organization Name

ADOBE INC.

Inventor(s)

Justin Salamon of San Francisco CA (US)

Juan-Pablo Caceres Chomali of San Francisco CA (US)

Ge Zhu of Rochester NY (US)

Nicholas J. Bryan of Belmont CA (US)

DETECTING AND CLASSIFYING FILLER WORDS IN AUDIO USING NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18055739, titled 'DETECTING AND CLASSIFYING FILLER WORDS IN AUDIO USING NEURAL NETWORKS'.

Simplified Explanation

The patent application describes a process for detecting filler words in audio using trained neural networks.

  • The system receives an audio input and analyzes it to identify filler word candidates.
  • A filler word classification model assigns each candidate to one of a set of categories.
  • The system generates an output audio sequence that identifies the candidates classified into the filler words category as filler words.
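
The three steps above can be sketched in Python as a small pipeline. This is only an illustration of the data flow under stated assumptions, not the applicants' implementation: the abstract does not describe the network architectures, so the candidate detector and the classifier are passed in as plain callables, and the names `Candidate`, `FILLER`, `detect_fillers`, `toy_find_candidates`, and `toy_classify` are all hypothetical. The toy stand-ins use a fixed list and a duration rule where the application would use trained neural networks.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Candidate:
    start: float        # seconds from the start of the audio
    end: float          # seconds from the start of the audio
    category: str = ""  # filled in by the classifier


# Hypothetical category name; the abstract only says "a set of categories".
FILLER = "filler_word"


def detect_fillers(
    audio: Sequence[float],
    find_candidates: Callable[[Sequence[float]], List[Candidate]],
    classify: Callable[[Sequence[float], Candidate], str],
) -> dict:
    """Run the three steps: find candidates, classify each, report fillers."""
    candidates = find_candidates(audio)
    for cand in candidates:
        cand.category = classify(audio, cand)
    # Keep only the candidates that landed in the filler words category.
    identified = [c for c in candidates if c.category == FILLER]
    # The output pairs the audio with the identification of filler words.
    return {"audio": audio, "identified_filler_words": identified}


# Toy stand-ins (assumptions for illustration, NOT trained networks).
def toy_find_candidates(audio: Sequence[float]) -> List[Candidate]:
    return [Candidate(0.5, 0.8), Candidate(2.0, 2.1), Candidate(3.2, 3.6)]


def toy_classify(audio: Sequence[float], cand: Candidate) -> str:
    # Pretend that anything lasting at least 0.25 s is a filler word.
    return FILLER if cand.end - cand.start >= 0.25 else "other"


result = detect_fillers([0.0] * 16, toy_find_candidates, toy_classify)
for cand in result["identified_filler_words"]:
    print(f"filler word from {cand.start}s to {cand.end}s")
```

Passing the two models in as callables keeps the sketch independent of any particular neural network library; a real system would substitute trained models for the toy functions.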

Potential Applications

This technology could be applied in:

  • Media editing software to automatically detect and remove filler words in audio recordings.
  • Language learning tools to help users identify and improve their speech patterns.
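
As a rough illustration of the media editing use case, the sketch below cuts identified filler word segments out of a list of audio samples. It assumes the segments arrive as (start, end) times in seconds, which the abstract does not specify; the function name `cut_segments` and its parameters are hypothetical, and a real editor would likely also crossfade at the cut points.

```python
from typing import List, Sequence, Tuple


def cut_segments(
    samples: Sequence[float],
    sample_rate: int,
    segments: Sequence[Tuple[float, float]],
) -> List[float]:
    """Return a copy of `samples` with every (start, end) segment removed."""
    kept: List[float] = []
    cursor = 0  # index of the first sample not yet handled
    for start, end in sorted(segments):
        # Convert times in seconds to sample indices, clamped to valid range.
        lo = max(cursor, int(round(start * sample_rate)))
        hi = min(len(samples), int(round(end * sample_rate)))
        if lo > cursor:
            kept.extend(samples[cursor:lo])  # keep audio before the segment
        cursor = max(cursor, hi)             # skip over the segment itself
    kept.extend(samples[cursor:])            # keep whatever follows
    return kept


# Ten samples at 10 samples per second, with two segments to remove.
cleaned = cut_segments(list(range(10)), 10, [(0.2, 0.4), (0.7, 0.9)])
print(cleaned)
```

Sorting the segments and tracking a cursor means overlapping or out-of-order segments are handled without removing any sample twice.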

Problems Solved

This technology addresses the following issues:

  • Time-consuming manual detection of filler words in audio recordings.
  • Inaccurate identification of filler words leading to inefficient editing processes.

Benefits

The benefits of this technology include:

  • Increased efficiency in editing audio recordings.
  • Improved accuracy in identifying filler words.
  • Enhanced user experience in language learning applications.

Potential Commercial Applications

This technology could commercially benefit:

  • Media production companies looking to streamline their editing processes.
  • Language learning platforms seeking to enhance their speech analysis tools.

Possible Prior Art

Possible prior art includes speech recognition software that identifies and transcribes spoken words but does not specifically target filler word detection.

Unanswered Questions

How does the system handle different accents and speech patterns in the audio input?

The patent application does not provide details on how the system adapts to variations in speech patterns and accents.

What is the accuracy rate of the filler word detection process compared to manual detection methods?

The patent application does not mention the accuracy rate of the system in detecting filler words.


Original Abstract Submitted

Embodiments are disclosed for performing a filler word detection process on input audio by a media editing system using trained neural networks. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving an input including an audio sequence, analyzing the audio sequence to determine filler word candidates, classifying, by a filler word classification model, each filler word candidate of the filler word candidates into one of a set of categories, and generating an output audio sequence, the output audio sequence including an identification of a subset of the filler word candidates in a filler words category of the set of categories as identified filler words.