SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM

Organization Name

Honda Motor Co., Ltd.

Inventor(s)

SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18453338, titled 'SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM'.

Simplified Explanation

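The patent application describes a speech recognition device that runs two learned end-to-end (E2E) models on the same speech feature amount: the first outputs text and the second outputs phonemes. Each model tags the vocabulary that belongs to a specific class. The device replaces the tagged vocabulary in the text with the tagged phonemes, then converts those phonemes into text for the final output. The device consists of the following parts: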

Acquisition part: Acquires a speech signal.
Speech feature amount calculation part: Calculates a speech feature amount.
First speech recognition part: Based on the speech feature amount, performs speech recognition using a learned first E2E model, attaches a first tag to the vocabulary of a specific class in the recognized text, and outputs the tagged text.
Second speech recognition part: Based on the speech feature amount, performs speech recognition using a learned second E2E model, attaches a second tag to the vocabulary of a specific class in the recognized phonemes, and outputs the tagged phonemes.
Phoneme replacement part: Replaces the vocabulary carrying the first tag with the phonemes carrying the second tag.
Output part: Converts the phonemes carrying the second tag into text and outputs the result.
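
The replacement and output steps can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the abstract does not say how tags are written or how phonemes are converted into text, so the `<t1>`/`<t2>` tag syntax, the `LEXICON` table, the function names, and the sample recognizer outputs are all assumptions made for illustration. The two E2E models are not modeled; their tagged outputs are passed in as plain strings.

```python
import re

# Assumed tag syntax: <t1>...</t1> marks the first tag (in text),
# <t2>...</t2> marks the second tag (in phonemes).
TEXT_TAG = re.compile(r"<t1>(.*?)</t1>")
PHONEME_TAG = re.compile(r"<t2>(.*?)</t2>")

# Hypothetical lexicon for the specific class: phoneme string -> written form.
LEXICON = {"h o N d a": "Honda", "t o: k y o:": "Tokyo"}


def replace_tagged_spans(tagged_text, tagged_phonemes):
    """Phoneme replacement part: swap each first-tag span in the text
    for the second-tag phoneme span at the same position."""
    text_spans = TEXT_TAG.findall(tagged_text)
    phoneme_spans = PHONEME_TAG.findall(tagged_phonemes)
    if len(text_spans) != len(phoneme_spans):
        # The spans cannot be paired one to one, so keep the recognized
        # words and only strip the first tags (a choice made in this sketch).
        return TEXT_TAG.sub(r"\1", tagged_text)
    spans = iter(phoneme_spans)
    return TEXT_TAG.sub(lambda _m: "<t2>" + next(spans) + "</t2>", tagged_text)


def phonemes_to_text(mixed, lexicon):
    """Output part: convert every second-tag phoneme span into text."""
    def convert(match):
        phonemes = match.group(1).strip()
        # Fall back to the raw phonemes when the lexicon has no entry.
        return lexicon.get(phonemes, phonemes)
    return PHONEME_TAG.sub(convert, mixed)


def recognize(tagged_text, tagged_phonemes, lexicon=LEXICON):
    """Run the replacement step followed by the output step."""
    mixed = replace_tagged_spans(tagged_text, tagged_phonemes)
    return phonemes_to_text(mixed, lexicon)


if __name__ == "__main__":
    # Made-up outputs of the first (text) and second (phoneme) models.
    first = "drive to <t1>Handa</t1> please"
    second = "d r a i v t u: <t2>h o N d a</t2> p l i: z"
    print(recognize(first, second))
```

In this sketch the tagged spans are paired by their order of appearance, and only the tagged portion of the text is taken from the phoneme output; the rest of the sentence stays as the first model produced it.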

Potential Applications

This technology can be used in various applications such as:

Voice-controlled devices
Speech-to-text transcription software
Language translation tools

Problems Solved

The technology solves the following problems:

Improving speech recognition accuracy
Enhancing text output quality
Streamlining the speech-to-text conversion process

Benefits

The benefits of this technology include:

Increased efficiency in converting speech to text
Enhanced user experience in voice-activated systems
Improved accessibility for individuals with speech impairments

Potential Commercial Applications

The technology can be commercially applied in:

Virtual assistants
Call center automation systems
Dictation software

Possible Prior Art

Possible prior art includes earlier speech recognition systems that use deep learning models.

Unanswered Questions

How does the device handle background noise during speech recognition?

The abstract does not describe how the device deals with background noise.

What languages are supported by the speech recognition device?

The abstract does not specify which languages the device can recognize and convert into text.

Original Abstract Submitted

A speech recognition device includes: an acquisition part, acquiring a speech signal; a speech feature amount calculation part, calculating a speech feature amount; a first speech recognition part, based on the speech feature amount, performing speech recognition using a learned first E2E model, attaching a first tag to a vocabulary portion of a specific class in text that is a recognition result, and outputting the same; a second speech recognition part, based on the speech feature amount, performing speech recognition using a learned second E2E model, attaching a second tag to a vocabulary portion of a specific class in a phoneme that is a recognition result, and outputting the same; a phoneme replacement part, replacing a vocabulary with the first tag with a phoneme with the second tag; and an output part, converting the phoneme with the second tag into text and outputting the same.

18453338. SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM simplified abstract (Honda Motor Co., Ltd.)
