International Business Machines Corporation (20240127801). DOMAIN ADAPTIVE SPEECH RECOGNITION USING ARTIFICIAL INTELLIGENCE simplified abstract
DOMAIN ADAPTIVE SPEECH RECOGNITION USING ARTIFICIAL INTELLIGENCE
Organization Name
International Business Machines Corporation
Inventor(s)
DOMAIN ADAPTIVE SPEECH RECOGNITION USING ARTIFICIAL INTELLIGENCE - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240127801 titled 'DOMAIN ADAPTIVE SPEECH RECOGNITION USING ARTIFICIAL INTELLIGENCE'.
Simplified Explanation
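The patent application describes methods, systems, and computer program products for domain adaptive speech recognition using artificial intelligence. The method converts a sequence of phonemes from the input speech into candidate spellings (graphemes), selects a subset of those candidates for a target phoneme-grapheme pair, produces a first recognition output using a biasing language model together with an AI-based speech recognition model, substitutes graphemes from the target pair into that output to produce a second output, and then performs automated actions based on the second output.
- Generating language data candidates, each comprising one or more graphemes, from a sequence of phonemes using an AI-based data conversion model
- Determining, for a target pair of phonemes and graphemes, a subset of graphemes from the candidates
- Generating a first speech recognition output from that subset using at least one biasing language model and an AI-based speech recognition model
- Generating a second speech recognition output by replacing at least part of the subset of graphemes in the first output with graphemes from the target pair
- Performing automated actions based on the second speech recognition output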
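The first two steps can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the lookup table `PHONEME_TO_GRAPHEMES` stands in for the AI-based data conversion model, the target name "kylie" is a made-up example, and ranking candidates by string similarity is only one plausible way to choose the subset, since the abstract does not say how the subset is determined.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical phoneme-to-grapheme options (ARPAbet-style symbols). This lookup
# table is a stand-in for the AI-based data conversion model, which the
# abstract does not describe in detail.
PHONEME_TO_GRAPHEMES = {
    "K": ["k", "c"],
    "AY": ["i", "y"],
    "L": ["l"],
    "IY": ["ie", "y", "ee"],
}


def generate_candidates(phonemes):
    """Expand a phoneme sequence into every spelling the toy table allows."""
    options = [PHONEME_TO_GRAPHEMES[p] for p in phonemes]
    return ["".join(parts) for parts in product(*options)]


def select_subset(candidates, target_spelling, size=3):
    """Return the `size` candidates most similar to the target spelling.

    Assumption: string similarity is used as the selection rule purely for
    illustration; the abstract does not specify the rule.
    """
    others = [c for c in candidates if c != target_spelling]
    ranked = sorted(
        others,
        key=lambda c: SequenceMatcher(None, c, target_spelling).ratio(),
        reverse=True,
    )
    return ranked[:size]


# Hypothetical target pair: a domain-specific name and its phonemes.
target_phonemes = ["K", "AY", "L", "IY"]
target_spelling = "kylie"

candidates = generate_candidates(target_phonemes)
subset = select_subset(candidates, target_spelling)
print(len(candidates), subset)
```

In this toy setup the subset holds alternative spellings that sound like the target word, which is one reading of why the later steps bias recognition toward them and then swap in the target spelling.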
Potential Applications
This technology can be applied in various fields such as virtual assistants, customer service chatbots, transcription services, and language translation tools.
Problems Solved
This technology helps improve the accuracy and efficiency of speech recognition systems, especially in domain-specific contexts where traditional models may struggle to accurately transcribe speech.
Benefits
The benefits of this technology include enhanced speech recognition performance, increased adaptability to different domains, and improved user experience in speech-to-text applications.
Potential Commercial Applications
Potential commercial applications of this technology include speech-to-text transcription services, virtual assistant platforms, customer service automation tools, and language translation services.
Possible Prior Art
One possible area of prior art is the use of neural network models for speech recognition, which have been widely studied and implemented in various applications.
Unanswered Questions
How does this technology compare to existing speech recognition systems in terms of accuracy and adaptability?
This article does not provide a direct comparison between this technology and existing speech recognition systems.
What are the potential limitations or challenges of implementing this technology in real-world applications?
The article does not address the potential limitations or challenges of implementing this technology in real-world applications.
Original Abstract Submitted
Methods, systems, and computer program products for domain adaptive speech recognition using artificial intelligence are provided herein. A computer-implemented method includes generating a set of language data candidates, each language data candidate comprising one or more graphemes, by processing a sequence of phonemes related to input speech data using an artificial intelligence-based data conversion model; determining, for a target pair of phonemes and graphemes, a subset of graphemes from the set of language data candidates; generating a first speech recognition output by processing the subset of graphemes using at least one biasing language model and an artificial intelligence-based speech recognition model; generating a second speech recognition output by replacing at least a portion of the subset of graphemes in the first speech recognition output with at least one of the graphemes from the target pair; and performing automated actions based on the second speech recognition output.
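The two recognition outputs named in the abstract can be sketched in Python as a toy example. This is not the applicant's implementation: the functions `bias_rescore` and `replace_with_target`, the n-best list with its scores, and the fixed per-word bonus are all hypothetical, and the bonus is a deliberately simplified stand-in for a biasing language model combined with a speech recognition model.

```python
def bias_rescore(hypotheses, bias_words, bonus=1.0):
    """Pick the hypothesis with the best score after rewarding biased words.

    `hypotheses` is a list of (text, log_score) pairs. Adding a fixed bonus per
    matching word is an assumed, simplified form of language-model biasing.
    """
    def biased_score(item):
        text, score = item
        hits = sum(1 for word in text.split() if word in bias_words)
        return score + bonus * hits

    best_text, _ = max(hypotheses, key=biased_score)
    return best_text


def replace_with_target(text, subset, target_spelling):
    """Swap any word from the subset for the target spelling."""
    return " ".join(target_spelling if w in subset else w for w in text.split())


# Hypothetical data: a subset of sound-alike spellings, the target spelling,
# and an n-best list as a recognizer might return it.
subset = {"kylee", "kilie", "cylie"}
target_spelling = "kylie"
n_best = [
    ("call kylee tomorrow", -4.5),
    ("call highly tomorrow", -4.0),
    ("call kilie tomorrow", -4.2),
]

first_output = bias_rescore(n_best, subset)
second_output = replace_with_target(first_output, subset, target_spelling)
print(first_output)   # call kilie tomorrow
print(second_output)  # call kylie tomorrow
```

Without the bonus, the unrelated hypothesis "call highly tomorrow" would win on raw score. The bias pulls recognition toward a sound-alike spelling, and the replacement step then restores the domain-specific spelling before any automated action is taken.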