18453338. SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM simplified abstract (Honda Motor Co., Ltd.)

From WikiPatents

SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM

Organization Name

Honda Motor Co., Ltd.

Inventor(s)

Yui Sudo of Saitama (JP)

Kazuhiro Nakadai of Saitama (JP)

Kazuya Hata of Saitama (JP)

SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18453338, titled 'SPEECH RECOGNITION DEVICE, SPEECH RECOGNITION METHOD, AND NON-TRANSITORY COMPUTER-READABLE MEDIUM'.

Simplified Explanation

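The patent application describes a speech recognition device that runs two learned end-to-end (E2E) models on the same speech feature amount: the first outputs text and the second outputs phonemes. Each model tags the vocabulary portion of a specific class in its own output. The tagged vocabulary in the text is then replaced with the tagged phonemes, and those phonemes are converted into text for the final output.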

  • Acquisition part: Acquires a speech signal.
  • Speech feature amount calculation part: Calculates a speech feature amount.
  • First speech recognition part: Based on the speech feature amount, performs speech recognition using a learned first E2E model, attaches a first tag to the vocabulary portion of a specific class in the text that is its recognition result, and outputs the tagged text.
  • Second speech recognition part: Based on the speech feature amount, performs speech recognition using a learned second E2E model, attaches a second tag to the vocabulary portion of a specific class in the phonemes that are its recognition result, and outputs the tagged phonemes.
  • Phoneme replacement part: Replaces the vocabulary carrying the first tag with the phonemes carrying the second tag.
  • Output part: Converts the phonemes carrying the second tag into text and outputs the result.
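
The replacement and output steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed in the application: the abstract does not say how tags are encoded, how tagged spans in the two results are paired, or how phonemes are converted to text. The `<t1>`/`<t2>` tag syntax, pairing spans by order of appearance, the lookup-table lexicon, and all function names are hypothetical, and the two E2E models are represented only by their already-tagged outputs.

```python
import re

# Hypothetical tag syntax; the abstract does not specify how tags are encoded.
FIRST_TAG = re.compile(r"<t1>(.*?)</t1>")    # tagged vocabulary in the text result
SECOND_TAG = re.compile(r"<t2>(.*?)</t2>")   # tagged vocabulary in the phoneme result


def replace_with_phonemes(text_result, phoneme_result):
    """Swap each first-tagged span in the text for a second-tagged phoneme span.

    Spans are paired by order of appearance, which is an assumption.
    """
    text_spans = FIRST_TAG.findall(text_result)
    phoneme_spans = SECOND_TAG.findall(phoneme_result)
    if len(text_spans) != len(phoneme_spans):
        # The two results disagree on the number of tagged spans, so they
        # cannot be paired by order; keep the recognized text and drop the tags.
        return FIRST_TAG.sub(lambda m: m.group(1), text_result)
    spans = iter(phoneme_spans)
    return FIRST_TAG.sub(lambda m: "<t2>" + next(spans) + "</t2>", text_result)


def phonemes_to_text(mixed, lexicon):
    """Convert each second-tagged phoneme span to text with a lookup table."""
    def convert(match):
        phonemes = match.group(1).strip()
        # Phoneme strings missing from the lexicon are passed through unchanged.
        return lexicon.get(phonemes, phonemes)
    return SECOND_TAG.sub(convert, mixed)


def recognize(text_result, phoneme_result, lexicon):
    """Phoneme replacement part followed by the output part."""
    return phonemes_to_text(replace_with_phonemes(text_result, phoneme_result), lexicon)


# Made-up example: the text model mis-transcribes a name, the phoneme model
# captures its pronunciation, and the lexicon maps that pronunciation to text.
lexicon = {"s uw d ow": "Sudo"}
text_out = "call <t1>pseudo</t1> now"
phoneme_out = "k ao l <t2>s uw d ow</t2> n aw"
print(recognize(text_out, phoneme_out, lexicon))  # prints: call Sudo now
```

Keeping the lexicon separate from both models is a choice made for this sketch: it shows how class-specific vocabulary could be changed without retraining, although the abstract does not state that this is the intended mechanism.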

Potential Applications

This technology can be used in various applications such as:

  • Voice-controlled devices
  • Speech-to-text transcription software
  • Language translation tools

Problems Solved

The technology solves the following problems:

  • Improving speech recognition accuracy
  • Enhancing text output quality
  • Streamlining the speech-to-text conversion process

Benefits

The benefits of this technology include:

  • Increased efficiency in converting speech to text
  • Enhanced user experience in voice-activated systems
  • Improved accessibility for individuals with speech impairments

Potential Commercial Applications

The technology can be commercially applied in:

  • Virtual assistants
  • Call center automation systems
  • Dictation software

Possible Prior Art

Possible prior art includes the use of deep learning models in speech recognition systems.

Unanswered Questions

How does the device handle background noise during speech recognition?

The patent application does not provide details on how the device deals with background noise interference.

What languages are supported by the speech recognition device?

The patent application does not specify the languages that the device can recognize and convert into text.


Original Abstract Submitted

A speech recognition device includes: an acquisition part, acquiring a speech signal; a speech feature amount calculation part, calculating a speech feature amount; a first speech recognition part, based on the speech feature amount, performing speech recognition using a learned first E2E model, attaching a first tag to a vocabulary portion of a specific class in text that is a recognition result, and outputting the same; a second speech recognition part, based on the speech feature amount, performing speech recognition using a learned second E2E model, attaching a second tag to a vocabulary portion of a specific class in a phoneme that is a recognition result, and outputting the same; a phoneme replacement part, replacing a vocabulary with the first tag with a phoneme with the second tag; and an output part, converting the phoneme with the second tag into text and outputting the same.