Unified Cascaded Encoder ASR model for Dynamic Model Sizes: abstract simplified (18182925)

From WikiPatents
  • This abstract appeared for patent application number 18182925, titled 'Unified Cascaded Encoder ASR model for Dynamic Model Sizes'

Simplified Explanation

The abstract describes a model for automated speech recognition (ASR) built from two encoders and two decoders arranged in a cascade. The first encoder takes a sequence of acoustic frames as input and generates a higher order feature representation for each frame. The first decoder uses this representation to generate a probability distribution over possible speech recognition hypotheses. The second encoder takes the higher order feature representation from the first encoder, rather than the raw acoustic frames, and generates a second higher order feature representation for each frame. The second decoder then uses this second representation to generate another probability distribution over possible speech recognition hypotheses.
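
The data flow described above can be illustrated with a minimal Python sketch. It is not the implementation from the patent application: the class name CascadedASR, the helper functions, the layer sizes, and the random weights are all hypothetical stand-ins for trained networks, and each decoder here simply emits one distribution over a tiny vocabulary per frame. The only thing the sketch aims to show is the wiring: the first encoder feeds both the first decoder and the second encoder, and the second encoder feeds the second decoder.

```python
import math
import random


def make_layer(in_dim, out_dim, rng):
    """Random weight matrix standing in for a trained network (illustrative only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def linear(weights, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def encode(weights, vector):
    """Toy encoder step: linear map followed by tanh."""
    return [math.tanh(v) for v in linear(weights, vector)]


def softmax(logits):
    """Numerically stable softmax that turns logits into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class CascadedASR:
    """Hypothetical toy model with two encoders and two decoders in a cascade."""

    def __init__(self, frame_dim=4, hidden_dim=3, vocab_size=5, seed=0):
        rng = random.Random(seed)
        self.enc1 = make_layer(frame_dim, hidden_dim, rng)   # first encoder
        self.dec1 = make_layer(hidden_dim, vocab_size, rng)  # first decoder
        self.enc2 = make_layer(hidden_dim, hidden_dim, rng)  # second encoder
        self.dec2 = make_layer(hidden_dim, vocab_size, rng)  # second decoder

    def forward(self, frames):
        # First encoder: acoustic frames -> first higher order features.
        first_feats = [encode(self.enc1, f) for f in frames]
        # First decoder: first features -> first probability distributions.
        first_probs = [softmax(linear(self.dec1, h)) for h in first_feats]
        # Second encoder consumes the first encoder's output, not the raw frames.
        second_feats = [encode(self.enc2, h) for h in first_feats]
        # Second decoder: second features -> second probability distributions.
        second_probs = [softmax(linear(self.dec2, h)) for h in second_feats]
        return first_probs, second_probs


if __name__ == "__main__":
    model = CascadedASR()
    frames = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.1, 0.0, -0.2]]
    first, second = model.forward(frames)
    print(len(first), len(second))
```

In this sketch the first decoder's output is available without running the second encoder at all, which is one plausible reading of how a cascade could serve different model sizes; the abstract itself does not spell out how the two outputs are selected or combined.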


Original Abstract Submitted

An automated speech recognition (ASR) model includes a first encoder, a first decoder, a second encoder, and a second decoder. The first encoder receives, as input, a sequence of acoustic frames, and generates, at each of a plurality of output steps, a first higher order feature representation for a corresponding acoustic frame in the sequence of acoustic frames. The first decoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a first probability distribution over possible speech recognition hypotheses. The second encoder receives, as input, the first higher order feature representation generated by the first encoder, and generates a second higher order feature representation for a corresponding first higher order feature frame. The second decoder receives, as input, the second higher order feature representation generated by the second encoder, and generates a second probability distribution over possible speech recognition hypotheses.