LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240038216 titled 'LANGUAGE IDENTIFICATION CLASSIFIER TRAINED USING ENCODED AUDIO FROM ENCODER OF PRE-TRAINED SPEECH-TO-TEXT SYSTEM'.

Simplified Explanation

The abstract describes a system whose processor receives encoded audio from the encoder of a pre-trained speech-to-text (STT) model. The processor also trains a language identification (LID) classifier, using training samples labeled by language, to detect the language of that encoded audio.

The system includes a processor that receives encoded audio from an STT model encoder.
The processor is responsible for training a language identification (LID) classifier.
The LID classifier is trained using labeled training samples.
The purpose of the LID classifier is to detect the language of the encoded audio.
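The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not specify the classifier architecture, the pooling step, or the training procedure. Here `fake_stt_encoder` is a hypothetical stand-in that produces synthetic frame vectors in place of a real pre-trained STT encoder, the classifier is a plain linear softmax model, and names such as `LidClassifier`, `mean_pool`, `DIM`, and `LANGS` are invented for the example.

```python
import math
import random

random.seed(0)  # deterministic synthetic data

DIM = 8               # hypothetical encoder feature size
LANGS = ["en", "es"]  # hypothetical label set

def fake_stt_encoder(lang, n_frames=20):
    """Stand-in for the frozen STT encoder: one feature vector per frame.
    Synthetic assumption: frames cluster around +1 for "en", -1 for "es"."""
    center = 1.0 if lang == "en" else -1.0
    return [[random.gauss(center, 1.0) for _ in range(DIM)]
            for _ in range(n_frames)]

def mean_pool(frames):
    """Collapse a variable-length frame sequence into one fixed-size vector."""
    return [sum(col) / len(frames) for col in zip(*frames)]

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

class LidClassifier:
    """Linear softmax classifier over pooled encoder features."""

    def __init__(self, dim, n_classes):
        self.weights = [[0.0] * dim for _ in range(n_classes)]
        self.biases = [0.0] * n_classes

    def probabilities(self, x):
        scores = [sum(w * v for w, v in zip(row, x)) + b
                  for row, b in zip(self.weights, self.biases)]
        return softmax(scores)

    def train_step(self, x, label, lr=0.1):
        """One SGD step on cross-entropy loss; only the classifier changes."""
        probs = self.probabilities(x)
        for k, row in enumerate(self.weights):
            grad = probs[k] - (1.0 if k == label else 0.0)
            self.biases[k] -= lr * grad
            for d, v in enumerate(x):
                row[d] -= lr * grad * v

    def predict(self, x):
        probs = self.probabilities(x)
        return max(range(len(probs)), key=probs.__getitem__)

def make_samples(n_per_lang):
    """Build (pooled encoded audio, language label) training pairs."""
    samples = [(mean_pool(fake_stt_encoder(lang)), LANGS.index(lang))
               for lang in LANGS for _ in range(n_per_lang)]
    random.shuffle(samples)
    return samples

train_set = make_samples(100)
test_set = make_samples(50)

clf = LidClassifier(DIM, len(LANGS))
for _ in range(5):  # a few passes over the labeled samples
    for features, label in train_set:
        clf.train_step(features, label)

accuracy = sum(clf.predict(x) == y for x, y in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this sketch the encoder is never updated; only the small classifier on top of its output is trained, which mirrors the division of labor the abstract describes.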

Potential Applications:

Speech recognition systems: The technology can be applied in speech recognition systems to accurately identify the language being spoken.
Multilingual transcription services: The system can be used in transcription services to automatically determine the language of the audio being transcribed.
Language-specific content filtering: It can be utilized in content filtering systems to identify and filter content based on the language it is in.
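As a rough illustration of the multilingual transcription use case, the sketch below routes audio to a language-specific transcriber based on LID output. Everything here is hypothetical: the abstract does not describe routing, and `pick_language`, `transcribe`, the `min_confidence` threshold, the fallback language, and the dummy transcribers are assumptions made for the example.

```python
from typing import Callable, Dict, Tuple

Transcriber = Callable[[bytes], str]

def pick_language(lid_probs: Dict[str, float],
                  min_confidence: float,
                  fallback: str) -> str:
    """Return the most probable language, or the fallback if confidence is low."""
    lang, prob = max(lid_probs.items(), key=lambda item: item[1])
    return lang if prob >= min_confidence else fallback

def transcribe(audio: bytes,
               lid_probs: Dict[str, float],
               transcribers: Dict[str, Transcriber],
               min_confidence: float = 0.6,
               fallback: str = "en") -> Tuple[str, str]:
    """Route audio to the transcriber for the detected language."""
    lang = pick_language(lid_probs, min_confidence, fallback)
    if lang not in transcribers:  # no model for the detected language
        lang = fallback
    return lang, transcribers[lang](audio)

# Dummy transcribers standing in for real per-language STT models.
transcribers: Dict[str, Transcriber] = {
    "en": lambda audio: f"<en transcript of {len(audio)} bytes>",
    "es": lambda audio: f"<es transcript of {len(audio)} bytes>",
}

audio = b"\x00" * 4
print(transcribe(audio, {"en": 0.1, "es": 0.9}, transcribers))
print(transcribe(audio, {"en": 0.45, "es": 0.55}, transcribers))
```

The confidence threshold is a design choice for the sketch: when the classifier is unsure, falling back to a default language avoids sending audio to the wrong model on a near-tie.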

Problems Solved by this Technology:

Language detection accuracy: The system targets accurate language detection on encoded audio, a task made challenging by variations in accents, dialects, and speech patterns.
Efficient language identification: Because the classifier works on the output of an existing pre-trained STT encoder, language identification can reuse that encoder instead of processing raw audio separately.
Adaptability to new languages: The system can be trained with labeled samples to detect new languages, allowing it to adapt to a wide range of languages.

Benefits of this Technology:

Enhanced speech-to-text accuracy: By accurately identifying the language, the system can optimize the speech-to-text conversion process, resulting in improved accuracy.
Automation of language identification: The technology automates the language identification process, reducing the need for manual intervention.
Scalability and versatility: The system can be trained to detect multiple languages, making it scalable and versatile for various applications.

Original Abstract Submitted

An example system includes a processor to receive encoded audio from an encoder of a pre-trained speech-to-text (STT) model. The processor is to further train a language identification (LID) classifier to detect a language of the encoded audio using training samples labeled by language.
