Samsung Electronics Co., Ltd. (20240127812). METHOD AND SYSTEM FOR AUTO-CORRECTION OF AN ONGOING SPEECH COMMAND simplified abstract

From WikiPatents

METHOD AND SYSTEM FOR AUTO-CORRECTION OF AN ONGOING SPEECH COMMAND

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Prashant Inbavaluthi of Noida (IN)

Vikas Kapur of Noida (IN)

Ramakant Singh of Noida (IN)

METHOD AND SYSTEM FOR AUTO-CORRECTION OF AN ONGOING SPEECH COMMAND - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240127812, titled 'METHOD AND SYSTEM FOR AUTO-CORRECTION OF AN ONGOING SPEECH COMMAND'.

Simplified Explanation

The patent application describes a voice assistant system that corrects an ongoing spoken command. The system converts the command to text, extracts features from both the audio and the text, determines how the audio and the text are connected, tags replacement, cue, and correction words, and decodes the revised text on the fly.

  • A voice assistant receives a voice command as input from the user
  • A speech-to-text converter converts the voice command into text
  • A feature extractor extracts acoustic features from the raw waveform and textual features from the converted text to determine nearby context tokens
  • A multi-modal unified attention sequence tagger determines the connection between the audio and the text
  • The tagger then tags replacement, cue, and correction words sequentially, based on that connection
  • An on-the-fly decoder decodes the revised text and displays it on the user interface
  • The decoded text is sent to an NLP component to generate a response
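
The tagging and decoding steps above can be illustrated with a minimal Python sketch. It is not the patented method: the learned multi-modal tagger is replaced by a toy rule that looks for a fixed list of cue phrases in the text alone, and the audio side is ignored. The names `tag_tokens`, `decode`, and `CUE_PHRASES`, and the reading of "replacement" as the word being replaced, are assumptions made for illustration.

```python
# Tag names follow the abstract; their exact semantics here are an assumption.
KEEP, REPLACEMENT, CUE, CORRECTION = "O", "REPLACEMENT", "CUE", "CORRECTION"

# Hypothetical cue phrases; a learned tagger would not rely on a fixed list.
CUE_PHRASES = [("no", "i", "mean"), ("i", "mean"), ("sorry",), ("actually",)]


def tag_tokens(tokens):
    """Toy rule-based stand-in for the sequence tagger.

    Finds the first cue phrase that has a word on both sides, tags the word
    before it as the replacement and the word after it as the correction.
    """
    tags = [KEEP] * len(tokens)
    lowered = [t.lower() for t in tokens]
    for i in range(1, len(tokens)):
        for phrase in CUE_PHRASES:
            end = i + len(phrase)
            if tuple(lowered[i:end]) == phrase and end < len(tokens):
                tags[i - 1] = REPLACEMENT
                for j in range(i, end):
                    tags[j] = CUE
                tags[end] = CORRECTION
                return tags
    return tags


def decode(tokens, tags):
    """Build the revised text: swap in the correction, drop the cue words."""
    correction = [tok for tok, tag in zip(tokens, tags) if tag == CORRECTION]
    out = []
    for tok, tag in zip(tokens, tags):
        if tag == KEEP:
            out.append(tok)
        elif tag == REPLACEMENT:
            out.extend(correction)
        # CUE and CORRECTION tokens are not copied at their original position.
    return " ".join(out)


tokens = "call John no I mean Jane".split()
tags = tag_tokens(tokens)
revised = decode(tokens, tags)
print(revised)  # call Jane
```

In this sketch the revised string is what would be shown on the user interface and passed on for response generation. A command with no cue phrase passes through unchanged.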

Potential Applications

This technology can be applied in:

  • Voice-controlled devices
  • Speech recognition systems
  • Language translation tools

Problems Solved

  • Improving accuracy of voice command interpretation
  • Enhancing user experience with voice assistants
  • Streamlining communication between users and devices

Benefits

  • Faster and more accurate voice command processing
  • Real-time feedback and correction for users
  • Seamless integration of audio and text data

Potential Commercial Applications

  • Smart home devices
  • Virtual assistants in cars
  • Language learning applications

Possible Prior Art

Possible prior art includes existing speech recognition systems that use similar techniques to process voice commands and generate responses.

Unanswered Questions

How does the system handle different accents and speech patterns from users?

The system's ability to adapt to various accents and speech patterns could impact its overall accuracy and user experience.

What measures are in place to ensure user privacy and data security?

Because the system processes sensitive voice data, it must address privacy and data security concerns to earn user trust and comply with regulations.


Original Abstract Submitted

The system includes a voice assistant receiving a voice command as input from user. The speech to text convertor converts the voice command into a text. A feature extractor extracts acoustic features from raw waveform of voice command and textual features from converted text for determining nearby context tokens. A multi modal unified attention sequence tagger determines a connection between the audio and the text based on an individual contextual embedding and a fused contextual embedding at context tokens level. It further tags replacement, cue and correction words sequentially based on determined connection between the audio and the text. An on-the-fly decoder decodes revised text on-the-fly based on tagged replacement, cue and correction words, to display the decoded revised text on user interface and sends the decoded revised text to NLP to generate a response corresponding to the input speech.
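
The abstract does not say how the individual and fused contextual embeddings are computed. One common way to relate two modalities at the token level is scaled dot-product cross-attention, sketched below in plain Python as an assumption rather than as the patent's architecture. Each text-token vector attends over the audio-frame vectors, and the attended audio summary is concatenated to the token's own vector. The function name `fuse` and the requirement that both modalities share one embedding size are hypothetical choices for this sketch.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse(text_vecs, audio_vecs):
    """Return one fused vector per text token.

    Each fused vector is the token's own embedding followed by an
    attention-weighted average of the audio-frame embeddings.
    Assumes text and audio embeddings have the same dimension.
    """
    dim = len(audio_vecs[0])
    fused = []
    for t in text_vecs:
        # Similarity of this token to every audio frame, scaled by sqrt(dim).
        weights = softmax([dot(t, a) / math.sqrt(dim) for a in audio_vecs])
        attended = [
            sum(w * a[d] for w, a in zip(weights, audio_vecs))
            for d in range(dim)
        ]
        fused.append(list(t) + attended)
    return fused


text_vecs = [[1.0, 0.0], [0.0, 1.0]]               # two text tokens
audio_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three audio frames
fused = fuse(text_vecs, audio_vecs)
```

In a full system, a tagger would consume the fused vectors, possibly alongside the individual text and audio vectors, to predict a tag for each token.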