US Patent Application 17662435: SELF-SUPERVISED SPEECH RECOGNITION (simplified abstract)

SELF-SUPERVISED SPEECH RECOGNITION

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION


Inventor(s)

Cheng-I Lai of Cambridge MA (US)

Yang Zhang of Cambridge MA (US)

Kaizhi Qian of Champaign IL (US)

Chuang Gan of Cambridge MA (US)

James R. Glass of Winchester MA (US)

Alexander Haojan Liu of Malden MA (US)

SELF-SUPERVISED SPEECH RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17662435, titled 'SELF-SUPERVISED SPEECH RECOGNITION'.

Simplified Explanation

The patent application describes a method for improving the performance of a pruned self-supervised learning (SSL) speech model through a finetuning process.

  • Using one or more computer processors, the method obtains an initial subnetwork at a target sparsity, together with a pruning mask, from a pre-trained SSL speech model.
  • The initial subnetwork is adjusted by zeroing out the weights specified by the pruning mask.
  • A new subnetwork is then trained from the adjusted subnetwork.
  • To restore the target sparsity, the method prunes the lowest-magnitude weights in the new subnetwork, regardless of network structure (i.e., unstructured pruning).
  • Finally, the finetuned subnetwork is used to classify audio segments.

Overall, this patent application presents a technique for refining a pruned self-supervised learning speech model by alternately zeroing out, retraining, and re-pruning network weights while maintaining a target sparsity, so that the resulting sparse subnetwork performs well on audio classification tasks.
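
To make the loop concrete, here is a minimal, hypothetical PyTorch sketch of this prune, adjust, and re-prune cycle. The toy two-layer model, the random training batches, the learning rate, and the 50% target sparsity are illustrative assumptions, not details from the application.

```python
import torch
import torch.nn as nn

def global_magnitude_mask(model, sparsity):
    # Global unstructured magnitude pruning: rank all weights by |w|
    # across the whole network, regardless of layer structure, and
    # mask out the smallest fraction given by `sparsity`.
    all_w = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(sparsity * all_w.numel()))
    threshold = all_w.kthvalue(k).values
    return [(p.detach().abs() > threshold).float() for p in model.parameters()]

def apply_mask(model, mask):
    # Zero out the masked weights in place.
    with torch.no_grad():
        for p, m in zip(model.parameters(), mask):
            p.mul_(m)

def finetune(model, mask, batches, loss_fn, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for x, y in batches:
        apply_mask(model, mask)        # zero out the masked weights ...
        loss = loss_fn(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()                     # ... but let updates revive them
    return model

# Toy stand-in for a pre-trained SSL speech model (assumption for the demo).
model = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 10))
batches = [(torch.randn(8, 40), torch.randint(0, 10, (8,))) for _ in range(5)]
loss_fn = nn.CrossEntropyLoss()

target = 0.5                                     # assumed target sparsity
mask = global_magnitude_mask(model, target)      # initial subnetwork + mask
finetune(model, mask, batches, loss_fn)          # adjust and retrain
apply_mask(model, global_magnitude_mask(model, target))  # re-prune to target
```

One detail worth noting in this style of loop: the optimizer step can regrow weights that the mask zeroed out, which is why the final re-pruning pass is needed to restore the target sparsity.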


Original Abstract Submitted

One or more computer processors obtain an initial subnetwork at a target sparsity and an initial pruning mask from a pre-trained self-supervised learning (SSL) speech model. The one or more computer processors finetune the initial subnetwork, comprising: the one or more computer processors zero out one or more masked weights in the initial subnetwork specified by the initial pruning mask; the one or more computer processors train a new subnetwork from the zeroed out subnetwork; the one or more computer processors prune one or more weights of lowest magnitude in the new subnetwork regardless of network structure to satisfy the target sparsity. The one or more computer processors classify an audio segment with the finetuned subnetwork.
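
Continuing the sketch above, the abstract's final step, classifying an audio segment with the finetuned subnetwork, might look like the following; the 40-dimensional feature vector is the same toy stand-in, not a detail from the application.

```python
# Hypothetical inference with the finetuned sparse subnetwork from the
# sketch above; `segment` stands in for real acoustic features
# extracted from an audio segment.
with torch.no_grad():
    segment = torch.randn(1, 40)
    predicted_class = model(segment).argmax(dim=-1)
```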