17526350. GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS simplified abstract (International Business Machines Corporation)

From WikiPatents


Organization Name

International Business Machines Corporation

Inventor(s)

Takashi Fukuda of Tokyo (JP)

Samuel Thomas of White Plains, NY (US)

GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17526350 titled 'GLOBAL NEURAL TRANSDUCER MODELS LEVERAGING SUB-TASK NETWORKS'.

Simplified Explanation

The patent application describes a computer-implemented method for training a neural transducer for speech recognition. The method begins by initializing the neural transducer with its three standard components: a prediction network, an encoder network, and a joint network.

  • The prediction network is expanded into multiple prediction-net branches, each dedicated to a specific sub-task within speech recognition.
  • The entire neural transducer is then trained using training data sets for all of the specific sub-tasks, so the shared components and every branch learn together.
  • The trained neural transducer is obtained by fusing the multiple prediction-net branches into a single model.
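The steps above can be sketched as a toy model. This is a minimal numpy illustration, not the patent's implementation: the dimensions, the linear/tanh layers, the number of branches, and the choice of averaging as the fusion rule are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, ENC, HID = 16, 8, 8, 8  # toy sizes, chosen arbitrarily

def prediction_branch(seed):
    """One prediction-net branch: a toy projection of label embeddings."""
    r = np.random.default_rng(seed)
    W = r.standard_normal((EMB, HID)) * 0.1
    def run(label_emb):
        return np.tanh(label_emb @ W)   # (T_label, HID)
    return run

# Expand the prediction network into one branch per sub-task
# (the sub-task split itself is hypothetical here).
branches = [prediction_branch(s) for s in range(3)]

def encoder(features):
    """Toy encoder network: projection of acoustic features."""
    W = rng.standard_normal((ENC, HID)) * 0.1
    return np.tanh(features @ W)        # (T_audio, HID)

def joint(enc_out, pred_out):
    """Joint network: combine encoder and prediction states, score the vocab."""
    Wj = rng.standard_normal((HID, VOCAB)) * 0.1
    z = enc_out[:, None, :] + pred_out[None, :, :]   # (T_audio, T_label, HID)
    return z @ Wj                                    # logits over the vocabulary

# Fuse the branches -- averaging their outputs is one simple fusion choice.
feats = rng.standard_normal((5, ENC))     # 5 audio frames
labels = rng.standard_normal((4, EMB))    # 4 label embeddings
pred_fused = np.mean([b(labels) for b in branches], axis=0)
logits = joint(encoder(feats), pred_fused)
print(logits.shape)   # (5, 4, 16): audio frames x label positions x vocab
```

The joint network's output grid over audio frames and label positions is the standard neural-transducer (RNN-T) structure; only the multi-branch prediction network and the fusion step reflect the method described here.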

Potential Applications

  • Speech recognition systems for various applications such as virtual assistants, transcription services, and voice-controlled devices.
  • Language translation services that can convert spoken words into written text in different languages.
  • Accessibility tools for individuals with speech impairments, enabling them to communicate more effectively.

Problems Solved

  • Enhances the accuracy and performance of speech recognition systems by training the neural transducer on multiple specific sub-tasks.
  • Addresses the challenge of handling different aspects of speech recognition, such as phoneme recognition, language modeling, and acoustic modeling, within a single system.
  • Provides a more efficient and effective method for training neural transducers for speech recognition.

Benefits

  • Improved accuracy and reliability of speech recognition systems, leading to better user experiences.
  • Increased flexibility and adaptability of the neural transducer by training it on various specific sub-tasks.
  • Simplified training process: fusing the prediction-net branches yields one model, reducing the need for separate training and deployment of a model per sub-task.


Original Abstract Submitted

A computer-implemented method for training a neural transducer for speech recognition is provided. The method includes initializing the neural transducer having a prediction network and an encoder network and a joint network. The method further includes expanding the prediction network by changing the prediction network to a plurality of prediction-net branches. Each of the prediction-net branches is a prediction network for a respective specific sub-task from among a plurality of specific sub-tasks. The method also includes training, by a hardware processor, an entirety of the neural transducer by using training data sets for all of the plurality of specific sub-tasks. The method additionally includes obtaining a trained neural transducer by fusing the plurality of prediction-net branches.
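The training schedule in the abstract (train the entire transducer on data sets for all sub-tasks, then fuse the branches) can be sketched with a deliberately simplified toy model. Everything here is an assumption for illustration: the sub-task names, the linear model standing in for the transducer, and averaging the branch weights as the fusion step.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 4  # toy feature dimension

# Hypothetical per-sub-task training data: (features, target) pairs.
subtask_data = {
    "subtask_a": [(rng.standard_normal(D), 0.0) for _ in range(8)],
    "subtask_b": [(rng.standard_normal(D), 1.0) for _ in range(8)],
}

shared_w = np.zeros(D)                             # stand-in for shared encoder/joint parameters
branch_w = {k: np.zeros(D) for k in subtask_data}  # one prediction-net branch per sub-task

lr = 0.05
for epoch in range(50):
    # Train the *entirety* of the model using data sets for all sub-tasks.
    for task, data in subtask_data.items():
        for x, y in data:
            pred = x @ (shared_w + branch_w[task])
            err = pred - y
            shared_w -= lr * err * x         # shared parameters see every sub-task
            branch_w[task] -= lr * err * x   # each branch sees only its own sub-task

# Obtain a single trained model by fusing the branches (simple averaging here).
fused_w = shared_w + np.mean(list(branch_w.values()), axis=0)
print(fused_w.shape)   # (4,)
```

The point of the sketch is the data flow, not the model: shared parameters are updated by every sub-task's data while each branch specializes, and fusion collapses the branches into one set of weights for inference.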