Intel Corporation (20240127789). SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH simplified abstract
SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH
Organization Name
Intel Corporation
Inventor(s)
Jessica M. Christian of Redwood City CA (US)
Peter Graff of San Jose CA (US)
Crystal A. Nakatsu of San Jose CA (US)
Beth Ann Hockey of Sunnyvale CA (US)
SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240127789 titled 'SYSTEMS AND METHODS FOR PROVIDING NON-LEXICAL CUES IN SYNTHESIZED SPEECH'.
Simplified Explanation
The abstract describes a system and method for providing non-lexical cues in synthesized speech, such as breathing and prosody cues, to enhance the naturalness of the speech. The cues are inserted into the text based on markup language tags and then used during the synthesis of the speech.
- Processor circuitry generates breathing and prosody cues to enhance synthesized speech.
- Cues are inserted into the text at specific points based on markup language tags.
- The cues are then used during the synthesis of the speech to improve its naturalness.
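The insertion step can be sketched in Python as follows. The tag names `<breath/>` and `<pitch-rise/>`, the `Cue` class, and the `insert_cues` helper are all hypothetical: the abstract does not name a specific markup language or say how insertion points are chosen, so the offsets here are picked by hand purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cue:
    tag: str       # markup tag identifying the cue (hypothetical tag names)
    position: int  # character offset in the ORIGINAL text


def insert_cues(text: str, cues: List[Cue]) -> str:
    """Insert each cue's tag at its offset in the original text."""
    # Work from the rightmost offset to the leftmost so that earlier
    # offsets are not shifted by tags that have already been inserted.
    for cue in sorted(cues, key=lambda c: c.position, reverse=True):
        text = text[:cue.position] + cue.tag + text[cue.position:]
    return text


text = "I checked the forecast. It will rain tomorrow."

# Hand-picked insertion points (an assumption, not the patent's method):
first_point = text.index(". ") + 2   # start of the second sentence
second_point = text.index("rain")    # just before the word "rain"

marked = insert_cues(text, [
    Cue("<breath/>", first_point),       # breathing cue, first tag
    Cue("<pitch-rise/>", second_point),  # prosody cue, second tag
])
print(marked)
```

Inserting from right to left is a small design choice that lets every insertion point be expressed relative to the unmodified text, which keeps the two cues independent of each other.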
Potential Applications
This technology could be applied in various fields such as:
- Assistive technology for individuals with speech impairments.
- Virtual assistants and chatbots to improve the naturalness of their responses.
Problems Solved
This technology addresses the following issues:
- Lack of naturalness in synthesized speech.
- Difficulty in conveying emotions and intentions through synthesized speech.
Benefits
The benefits of this technology include:
- Enhanced naturalness and expressiveness in synthesized speech.
- Improved user experience in applications utilizing synthesized speech.
Potential Commercial Applications
Potential commercial applications of this technology include:
- Speech synthesis software for various industries.
- Voice-enabled devices and applications.
Possible Prior Art
Possible prior art in this field includes the use of markup language tags to enhance text-to-speech synthesis by inserting cues for pauses, emphasis, and intonation.
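One widely used example of such markup is the W3C Speech Synthesis Markup Language (SSML), which defines `break`, `emphasis`, and `prosody` elements. The sketch below builds a small SSML fragment with Python's standard `xml.etree.ElementTree` module. The sentence content and attribute values are illustrative, and a complete SSML document would also declare a version, language, and namespace on the `speak` element, which are omitted here for brevity. Whether the patent application relies on SSML is not stated in the abstract.

```python
import xml.etree.ElementTree as ET

# Root element of an SSML fragment.
speak = ET.Element("speak")
speak.text = "The meeting starts at noon."

# A pause cue: <break time="500ms"/>
pause = ET.SubElement(speak, "break", {"time": "500ms"})
pause.tail = " Please be "

# An emphasis cue wrapping the words to stress.
emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
emphasis.text = "on time"
emphasis.tail = ". "

# An intonation cue: higher pitch and slower rate for the closing phrase.
prosody = ET.SubElement(speak, "prosody", {"pitch": "high", "rate": "slow"})
prosody.text = "Thank you."

ssml = ET.tostring(speak, encoding="unicode")
print(ssml)
```

Building the markup with an XML library rather than string concatenation guarantees that the result is well-formed before it is handed to a speech engine.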
Unanswered Questions
How does this technology handle different languages and accents in speech synthesis?
The abstract does not explain how the system adapts to different languages and accents during speech synthesis.
What is the impact of these non-lexical cues on the overall performance of the synthesized speech?
The abstract does not discuss how breathing and prosody cues affect the overall quality and intelligibility of the synthesized speech.
Original Abstract Submitted
Systems and methods are disclosed for providing non-lexical cues in synthesized speech. An example system includes processor circuitry to generate a breathing cue to enhance speech to be synthesized from text; determine a first insertion point of the breathing cue in the text, wherein the breathing cue is identified by a first tag of a markup language; generate a prosody cue to enhance speech to be synthesized from the text; determine a second insertion point of the prosody cue in the text, wherein the prosody cue is identified by a second tag of the markup language; insert the breathing cue at the first insertion point based on the first tag and the prosody cue at the second insertion point based on the second tag; and trigger a synthesis of the speech from the text, the breathing cue, and the prosody cue.
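The final step of the abstract, synthesizing speech "from the text, the breathing cue, and the prosody cue", implies that a consumer reads the marked-up text and separates ordinary words from cues. A minimal Python sketch of that consumer side follows. The tags `<breath/>` and `<pitch-rise/>`, the `to_events` function, and the `synthesize` stand-in are hypothetical names chosen for illustration; the abstract does not specify the tag syntax or the speech engine.

```python
import re

# Hypothetical cue tags mapped to the kind of cue they identify.
CUE_TAGS = {"<breath/>": "breathing", "<pitch-rise/>": "prosody"}

# A capturing group makes re.split keep the tags in its output.
TAG_PATTERN = re.compile("(" + "|".join(re.escape(t) for t in CUE_TAGS) + ")")


def to_events(marked_text):
    """Split marked-up text into an ordered list of (kind, value) events."""
    events = []
    for piece in TAG_PATTERN.split(marked_text):
        if not piece:
            continue  # re.split yields empty strings next to leading/adjacent tags
        if piece in CUE_TAGS:
            events.append((CUE_TAGS[piece], piece))
        else:
            events.append(("text", piece))
    return events


def synthesize(events):
    """Stand-in for a speech engine: describes what it would render, in order."""
    return [f"{kind}: {value.strip()}" for kind, value in events]


marked = "Let me think. <breath/>The answer is <pitch-rise/>forty-two."
events = to_events(marked)
for line in synthesize(events):
    print(line)
```

Keeping cues as separate events in the same ordered stream as the text lets a real engine decide how to render each one, for example by playing a breath sound or adjusting pitch on the following words.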