18491266. SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH simplified abstract (Intel Corporation)

From WikiPatents
Revision as of 06:02, 26 April 2024 by Wikipatents (talk | contribs) (Creating a new page)

SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH

Organization Name

Intel Corporation

Inventor(s)

Jessica M. Christian of Redwood City CA (US)

Peter Graff of San Jose CA (US)

Crystal A. Nakatsu of San Jose CA (US)

Beth Ann Hockey of Sunnyvale CA (US)

SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH - A simplified explanation of the abstract

This abstract first appeared for US patent application 18491266, titled 'SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH'.

Simplified Explanation

The abstract describes a system and method for providing non-lexical cues in synthesized speech. Processor circuitry generates a breathing cue and a prosody cue, each identified by its own markup language tag, and determines an insertion point in the text for each. The cues are inserted at those points, and speech synthesis is then triggered from the text together with both cues.

  • Processor circuitry generates a breathing cue to enhance the speech to be synthesized from text.
  • A first insertion point for the breathing cue is determined in the text; the breathing cue is identified by a first markup language tag.
  • The processor circuitry also generates a prosody cue to enhance the speech.
  • A second insertion point for the prosody cue is determined in the text; the prosody cue is identified by a second markup language tag.
  • Each cue is inserted at its insertion point based on its tag, and synthesis of the speech is triggered from the text, the breathing cue, and the prosody cue.
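The insertion step can be illustrated with a minimal Python sketch. It assumes that insertion points are character offsets and that cues are represented by self-closing tags; the tag names `<breath/>` and `<prosody-cue/>`, the `Cue` class, and the `insert_cues` function are hypothetical and do not come from the patent application.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Cue:
    tag: str     # markup tag identifying the cue (hypothetical tag names)
    offset: int  # assumed representation: character offset of the insertion point


def insert_cues(text: str, cues: Sequence[Cue]) -> str:
    """Return a copy of text with each cue's tag inserted at its offset."""
    for cue in cues:
        if not 0 <= cue.offset <= len(text):
            raise ValueError(f"offset {cue.offset} is outside the text")
    marked = text
    # Work from the highest offset down so earlier offsets remain valid.
    for cue in sorted(cues, key=lambda c: c.offset, reverse=True):
        marked = marked[:cue.offset] + cue.tag + marked[cue.offset:]
    return marked


marked_text = insert_cues(
    "Well I think so.",
    [Cue("<breath/>", 0), Cue("<prosody-cue/>", 5)],
)
```

Inserting from the last offset backwards means no offset has to be recomputed after an earlier tag lengthens the text.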

Potential Applications

This technology could be applied in various fields such as:

  • Assistive technology for individuals with visual impairments.
  • Language learning applications.
  • Virtual assistants and chatbots.

Problems Solved

This technology helps improve the naturalness and expressiveness of synthesized speech by incorporating non-lexical cues like breathing and prosody.

Benefits

  • Enhanced user experience in speech synthesis applications.
  • Improved communication for individuals with disabilities.
  • More engaging and natural interactions with AI-powered systems.

Potential Commercial Applications

  • Speech synthesis software for various industries.
  • Accessibility tools for individuals with disabilities.
  • Language learning platforms.

Possible Prior Art

The use of markup language tags to insert cues into text for speech synthesis is potential prior art in this field; the specific implementation of breathing and prosody cues may be the novel aspect of this technology.

Unanswered Questions

How does this technology compare to existing methods of enhancing synthesized speech?

This article does not provide a direct comparison to existing methods or technologies in the field of speech synthesis enhancement.

What are the limitations or challenges of implementing non-lexical cues in synthesized speech?

This article does not address the limitations or challenges of implementing non-lexical cues in synthesized speech.


Original Abstract Submitted

Systems and methods are disclosed for providing non-lexical cues in synthesized speech. An example system includes processor circuitry to generate a breathing cue to enhance speech to be synthesized from text; determine a first insertion point of the breathing cue in the text, wherein the breathing cue is identified by a first tag of a markup language; generate a prosody cue to enhance speech to be synthesized from the text; determine a second insertion point of the prosody cue in the text, wherein the prosody cue is identified by a second tag of the markup language; insert the breathing cue at the first insertion point based on the first tag and the prosody cue at the second insertion point based on the second tag; and trigger a synthesis of the speech from the text, the breathing cue, and the prosody cue.
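The sequence of steps in the abstract can be sketched end to end in Python. This is only an illustration under stated assumptions: the abstract does not say how insertion points are determined, so both heuristics below (breathe at the start of the second sentence, place the prosody cue before the longest word) are invented for the example, as are the tag names and the `synthesize` callback that stands in for a real speech synthesizer.

```python
import re
from typing import Callable

BREATH_TAG = "<breath/>"        # hypothetical "first tag"
PROSODY_TAG = "<prosody-cue/>"  # hypothetical "second tag"


def breathing_insertion_point(text: str) -> int:
    """Assumed heuristic: start of the second sentence, or 0 if there is none."""
    match = re.search(r"[.!?]\s+", text)
    return match.end() if match else 0


def prosody_insertion_point(text: str) -> int:
    """Assumed heuristic: just before the longest word, or 0 if there are no words."""
    words = list(re.finditer(r"\w+", text))
    if not words:
        return 0
    return max(words, key=lambda m: len(m.group())).start()


def synthesize_with_cues(text: str, synthesize: Callable[[str], None]) -> str:
    """Insert both cues at their insertion points, then trigger synthesis."""
    points = [
        (breathing_insertion_point(text), BREATH_TAG),
        (prosody_insertion_point(text), PROSODY_TAG),
    ]
    marked = text
    # Insert from the highest offset down so the lower offset stays valid.
    for offset, tag in sorted(points, reverse=True):
        marked = marked[:offset] + tag + marked[offset:]
    synthesize(marked)  # stand-in for triggering a speech synthesizer
    return marked


spoken = []  # records what the stub synthesizer receives
result = synthesize_with_cues("I agree. That is remarkable news.", spoken.append)
```

Passing the synthesizer in as a callback keeps the sketch independent of any particular text-to-speech engine.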