20240038216. LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)
LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM
Organization Name
INTERNATIONAL BUSINESS MACHINES CORPORATION
LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240038216, titled 'LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM'.
Simplified Explanation
The abstract describes an example system whose processor receives encoded audio from the encoder of a pre-trained speech-to-text (STT) model. The processor also trains a language identification (LID) classifier, using training samples labeled by language, to detect the language of that encoded audio.
- The system includes a processor that receives encoded audio from an STT model encoder.
- The processor is responsible for training a language identification (LID) classifier.
- The LID classifier is trained using labeled training samples.
- The purpose of the LID classifier is to detect the language of the encoded audio.
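The training flow summarized above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not say what kind of classifier is used, so a linear softmax classifier over mean-pooled frames is assumed here, and `synthetic_encoded_audio` is a hypothetical stand-in that fabricates frame vectors in place of real STT encoder output. The names `LidClassifier`, `mean_pool`, and the language codes are likewise illustrative.

```python
import math
import random

def mean_pool(frames):
    """Average a sequence of encoder frames into one utterance-level vector."""
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class LidClassifier:
    """Linear softmax classifier over pooled encoder outputs (assumed design)."""

    def __init__(self, dim, languages):
        self.languages = list(languages)
        self.weights = [[0.0] * dim for _ in self.languages]
        self.bias = [0.0] * len(self.languages)

    def probabilities(self, frames):
        x = mean_pool(frames)
        scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
                  for w, b in zip(self.weights, self.bias)]
        return x, softmax(scores)

    def train(self, samples, epochs=50, lr=0.5):
        """samples: list of (encoded_frames, language_label) pairs."""
        for _ in range(epochs):
            for frames, label in samples:
                x, probs = self.probabilities(frames)
                target = self.languages.index(label)
                for k, p in enumerate(probs):
                    # Gradient of cross-entropy loss w.r.t. the class-k score.
                    grad = p - (1.0 if k == target else 0.0)
                    self.weights[k] = [w - lr * grad * x_i
                                       for w, x_i in zip(self.weights[k], x)]
                    self.bias[k] -= lr * grad

    def predict(self, frames):
        _, probs = self.probabilities(frames)
        return self.languages[probs.index(max(probs))]

def synthetic_encoded_audio(rng, center, n_frames=20, noise=0.3):
    """Hypothetical stand-in for STT encoder output: noisy frames around a center."""
    return [[c + rng.gauss(0.0, noise) for c in center] for _ in range(n_frames)]

rng = random.Random(0)
# Made-up per-language centers; real encoder outputs would not look like this.
centers = {"en": [1.0, 0.0, 0.0, 0.0],
           "es": [0.0, 1.0, 0.0, 0.0],
           "de": [0.0, 0.0, 1.0, 0.0]}
train_set = [(synthetic_encoded_audio(rng, c), lang)
             for lang, c in centers.items() for _ in range(30)]
rng.shuffle(train_set)
test_set = [(synthetic_encoded_audio(rng, c), lang)
            for lang, c in centers.items() for _ in range(10)]

clf = LidClassifier(dim=4, languages=centers)
clf.train(train_set)
accuracy = sum(clf.predict(f) == lang for f, lang in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this sketch the encoder is never updated; only the small classifier on top of its output is trained, which mirrors the abstract's split between a pre-trained STT encoder and a separately trained LID classifier.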
Potential Applications:
- Speech recognition systems: The technology can be applied in speech recognition systems to accurately identify the language being spoken.
- Multilingual transcription services: The system can be used in transcription services to automatically determine the language of the audio being transcribed.
- Language-specific content filtering: It can be utilized in content filtering systems to identify and filter content based on the language it is in.
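For the multilingual transcription use case listed above, one plausible way to consume an LID result is to route each utterance to a language-specific transcriber. The sketch below is an assumption about how a downstream service might be wired, not something the application describes: `route_by_language`, the `min_confidence` threshold, and the placeholder transcribers are all hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

def route_by_language(
    lid_scores: Dict[str, float],
    transcribers: Dict[str, Callable[[bytes], str]],
    audio: bytes,
    min_confidence: float = 0.6,
) -> Tuple[Optional[str], Optional[str]]:
    """Pick the most probable language and hand the audio to its transcriber.

    Returns (language, transcript). Both are None when the top score is below
    min_confidence or no transcriber is registered for that language.
    """
    language, score = max(lid_scores.items(), key=lambda item: item[1])
    if score < min_confidence or language not in transcribers:
        return None, None
    return language, transcribers[language](audio)

# Placeholder transcribers standing in for real language-specific STT models.
transcribers = {
    "en": lambda audio: f"<English transcript of {len(audio)} bytes>",
    "es": lambda audio: f"<Spanish transcript of {len(audio)} bytes>",
}

confident = route_by_language({"en": 0.91, "es": 0.07, "de": 0.02},
                              transcribers, b"\x00" * 320)
uncertain = route_by_language({"en": 0.45, "es": 0.40, "de": 0.15},
                              transcribers, b"\x00" * 320)
print(confident)
print(uncertain)
```

Returning no transcript for low-confidence predictions is a design choice in this sketch; a real service might instead fall back to a default language or ask for a longer audio segment.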
Problems Solved by this Technology:
- Language detection accuracy: The system aims to improve language detection on encoded audio, a task made difficult by variations in accents, dialects, and speech patterns.
- Efficient language identification: Because the classifier works on the output of an existing pre-trained STT encoder, language identification can reuse that encoder's audio representation instead of processing the audio separately.
- Adaptability to new languages: The system can be trained with labeled samples to detect new languages, allowing it to adapt to a wide range of languages.
Benefits of this Technology:
- Enhanced speech-to-text accuracy: By accurately identifying the language, the system can optimize the speech-to-text conversion process, resulting in improved accuracy.
- Automation of language identification: The technology automates the language identification process, reducing the need for manual intervention.
- Scalability and versatility: The system can be trained to detect multiple languages, making it scalable and versatile for various applications.
Original Abstract Submitted
An example system includes a processor to receive encoded audio from an encoder of a pre-trained speech-to-text (STT) model. The processor is to further train a language identification (LID) classifier to detect a language of the encoded audio using training samples labeled by language.