18371233. PATCHED MULTI-CONDITION TRAINING FOR ROBUST SPEECH RECOGNITION simplified abstract (Samsung Electronics Co., Ltd.)

PATCHED MULTI-CONDITION TRAINING FOR ROBUST SPEECH RECOGNITION

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Pablo Peso Parada of Staines (GB)

Agnieszka Dobrowolska of Staines (GB)

Karthikeyan Saravanan of Staines (GB)

Mete Ozay of Staines (GB)

PATCHED MULTI-CONDITION TRAINING FOR ROBUST SPEECH RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 18371233, titled 'PATCHED MULTI-CONDITION TRAINING FOR ROBUST SPEECH RECOGNITION'.

Simplified Explanation

The abstract describes a method for obtaining a patched signal that is used to train a model for speech recognition, audio recognition, or both. The method modifies a first signal (a speech or audio signal) to obtain at least one second signal. The first signal and each second signal are then divided into multiple patches, and selected patches from both are mixed to obtain the patched signal.

  • The method involves obtaining a signal, modifying it, and combining patches of both versions into a patched signal for training a model.
  • The first signal can be a speech signal, an audio signal, or both.
  • The second signal is obtained by modifying the first signal.
  • Both the first and second signals are divided into multiple patches.
  • Each patch contains a respective part of the original signal it belongs to.
  • Selected patches from both signals are mixed to create a patched signal.
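
The steps listed above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the method as claimed in the application: the abstract does not say how the first signal is modified, how long a patch is, or how patches are selected and mixed. Here the modification is assumed to be additive Gaussian noise, patches are assumed to be fixed-length consecutive segments, and "mixing" is assumed to mean picking, at each patch position, either the first-signal patch or the second-signal patch at random and concatenating the picks in time order. The names `modify_signal`, `split_into_patches`, and `make_patched_signal` are hypothetical.

```python
import numpy as np


def modify_signal(first, noise_scale=0.1, rng=None):
    """Return a modified copy of `first` (assumption: additive Gaussian noise)."""
    rng = np.random.default_rng() if rng is None else rng
    return first + noise_scale * rng.standard_normal(first.shape)


def split_into_patches(signal, patch_len):
    """Divide a 1-D signal into consecutive patches of at most `patch_len` samples."""
    return [signal[i:i + patch_len] for i in range(0, len(signal), patch_len)]


def make_patched_signal(first, second, patch_len, rng=None):
    """Mix patches of `first` and `second` into one patched signal.

    Assumption: at each patch position, either the first-signal patch or the
    second-signal patch is chosen at random, and the chosen patches are
    concatenated in their original time order.
    """
    rng = np.random.default_rng() if rng is None else rng
    first_patches = split_into_patches(first, patch_len)
    second_patches = split_into_patches(second, patch_len)
    # True means "take this patch from the second (modified) signal".
    use_second = rng.random(len(first_patches)) < 0.5
    chosen = [
        s if flag else f
        for f, s, flag in zip(first_patches, second_patches, use_second)
    ]
    return np.concatenate(chosen), use_second


# Usage with a synthetic stand-in for a speech or audio signal.
rng = np.random.default_rng(0)
first = np.sin(np.linspace(0.0, 20.0, 1000))
second = modify_signal(first, rng=rng)
patched, use_second = make_patched_signal(first, second, patch_len=100, rng=rng)
```

The resulting `patched` array has the same length as the first signal, and each 100-sample segment comes unchanged from either the first or the second signal. A real pipeline would likely use other modifications (for example reverberation or recorded noise) and could mix more than one second signal.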

Potential applications of this technology:

  • Speech recognition systems: The method can be used to train models for speech recognition, improving the accuracy and performance of speech recognition systems.
  • Audio recognition systems: The method can also be applied to train models for audio recognition, enabling better identification and classification of audio signals.

Problems solved by this technology:

  • Training data preparation: Because the second signal is derived from the first, the method can generate additional training examples automatically by dividing the signals into patches and mixing them, without requiring new recordings.
  • Model performance improvement: By using the patched signal for training, the model can learn from a more diverse and representative set of data, potentially improving its performance in recognizing speech and audio.

Benefits of this technology:

  • Enhanced accuracy: The use of patched signals for training can lead to improved accuracy in speech and audio recognition systems.
  • Efficient data preparation: Patched signals are produced from existing signals, which may make preparing training data less time-consuming than collecting recordings under each condition.
  • Increased model robustness: Training the model with a diverse set of patches can enhance its ability to handle different variations and conditions in speech and audio signals.


Original Abstract Submitted

A method of obtaining a patched signal for training a model for use in at least one of a speech and an audio recognition is disclosed. The method comprises obtaining a first signal, wherein the first signal is at least one of a speech and an audio signal, modifying the first signal to obtain at least one second signal, dividing the first signal and the at least one second signal respectively into a plurality of first patches and a plurality of second patches, wherein each one of the plurality of first patches comprises a respective part of the first signal and each one of the plurality of second patches comprises a respective part of the at least one second signal and mixing selected ones of the plurality of first patches and the plurality of second patches to obtain a patched signal.