17526350. GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS simplified abstract (International Business Machines Corporation)
Organization Name
International Business Machines Corporation
Inventor(s)
Samuel Thomas of White Plains, NY (US)
This abstract first appeared for US patent application 17526350, titled 'GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS'.
Simplified Explanation
The patent application describes a computer-implemented method for training a neural transducer for speech recognition. The method initializes the neural transducer with three components: a prediction network, an encoder network, and a joint network.
- The prediction network is expanded into multiple prediction-net branches, each dedicated to a specific sub-task within speech recognition.
- The entire neural transducer is then trained on data sets covering all of the specific sub-tasks, so every component learns from every sub-task.
- The trained neural transducer is obtained by fusing the multiple prediction-net branches into a single model.
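The structure described above can be sketched in plain Python. This is a toy illustration, not the patent's implementation: the `linear` "networks", the vector dimensions, and the sub-task names (phoneme, language model, acoustic) are all assumptions, and real transducers use recurrent or transformer modules with a trained joint network.

```python
# Toy sketch of a transducer with per-sub-task prediction-net branches.
# All components here are hypothetical stand-ins for real neural networks.
import random

random.seed(0)

def linear(dim_in, dim_out):
    """A toy 'network': a random weight matrix applied to an input vector."""
    w = [[random.uniform(-0.1, 0.1) for _ in range(dim_in)]
         for _ in range(dim_out)]
    return lambda x: [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

class NeuralTransducer:
    def __init__(self, dim, sub_tasks):
        self.encoder = linear(dim, dim)            # acoustic encoder network
        # One prediction-net branch per specific sub-task.
        self.branches = {task: linear(dim, dim) for task in sub_tasks}
        self.joint = linear(2 * dim, dim)          # joint network

    def forward(self, acoustic, label, task):
        enc = self.encoder(acoustic)               # encode acoustic features
        pred = self.branches[task](label)          # sub-task-specific branch
        return self.joint(enc + pred)              # concatenate, then join

sub_tasks = ["phoneme", "language_model", "acoustic"]
model = NeuralTransducer(dim=4, sub_tasks=sub_tasks)
out = model.forward([1, 0, 0, 0], [0, 1, 0, 0], "phoneme")
print(len(out))  # joint-network output dimensionality: 4
```

During training, batches from each sub-task's data set would be routed through that sub-task's branch while the shared encoder and joint network see all of them, which is how a single transducer can learn from several sub-tasks at once.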
Potential Applications
- Speech recognition systems for various applications such as virtual assistants, transcription services, and voice-controlled devices.
- Language translation services that can convert spoken words into written text in different languages.
- Accessibility tools for individuals with speech impairments, enabling them to communicate more effectively.
Problems Solved
- Enhances the accuracy and performance of speech recognition systems by training the neural transducer on multiple specific sub-tasks.
- Addresses the challenge of handling different aspects of speech recognition, such as phoneme recognition, language modeling, and acoustic modeling, within a single system.
- Provides a more efficient and effective method for training neural transducers for speech recognition.
Benefits
- Improved accuracy and reliability of speech recognition systems, leading to better user experiences.
- Increased flexibility and adaptability of the neural transducer by training it on various specific sub-tasks.
- Simplified training process by fusing the outputs of multiple prediction-net branches, reducing the need for separate training for each sub-task.
Original Abstract Submitted
A computer-implemented method for training a neural transducer for speech recognition is provided. The method includes initializing the neural transducer having a prediction network and an encoder network and a joint network. The method further includes expanding the prediction network by changing the prediction network to a plurality of prediction-net branches. Each of the prediction-net branches is a prediction network for a respective specific sub-task from among a plurality of specific sub-tasks. The method also includes training, by a hardware processor, an entirety of the neural transducer by using training data sets for all of the plurality of specific sub-tasks. The method additionally includes obtaining a trained neural transducer by fusing the plurality of prediction-net branches.
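The final fusion step can be pictured as combining the branches' outputs into one prediction function. The abstract does not specify the fusion operator, so output averaging below is purely an assumption, and the branch functions are toy stand-ins.

```python
# Hypothetical fusion of prediction-net branches by output averaging.
# The patent does not specify the fusion operator; averaging is an assumption.
def fuse_branches(branches):
    """Return one prediction function averaging all branch outputs."""
    def fused(x):
        outputs = [branch(x) for branch in branches]
        return [sum(vals) / len(vals) for vals in zip(*outputs)]
    return fused

phoneme_branch = lambda x: [2 * v for v in x]  # toy sub-task branch
lm_branch = lambda x: [v + 1 for v in x]       # toy sub-task branch

fused = fuse_branches([phoneme_branch, lm_branch])
print(fused([1.0, 3.0]))  # [2.0, 5.0]: elementwise mean of branch outputs
```

After fusion, the resulting single prediction network replaces the separate branches, so the deployed transducer has the standard three-component shape while retaining what each branch learned.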