Recorded Books, Inc. (20240265910). METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION simplified abstract
METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION
Organization Name
Recorded Books, Inc.
Inventor(s)
Dion Michael Pyland of Elberta AL (US)
John Michael Shea of Vienna VA (US)
METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240265910, titled 'METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION'.
Simplified Explanation
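The patent application describes a system in which text-to-speech (TTS) audio generated in a user's voice is selectively replaced with the user's own recordings.
- A TTS model, trained on audio of the user, generates initial audio content for a set of text
- The initial audio content is analyzed to identify a subset of the text that the user should record
- The user records that subset of text
- The user recording is used to update the corresponding portions of the initial audio content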
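The selection step, in which analysis of the initial audio content determines which text the user is asked to record, could be sketched as follows. The abstract does not say what the analysis consists of, so this is only an illustration under stated assumptions: the `Segment` type, the per-segment `quality` score, and the `threshold` parameter are all hypothetical and do not come from the patent application.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    index: int      # position of the sentence within the set of text
    text: str       # the sentence itself
    quality: float  # hypothetical analysis score in [0, 1]; higher is better


def select_for_rerecording(segments, threshold=0.8):
    """Return the segments whose (assumed) quality score is below the threshold."""
    return [s for s in segments if s.quality < threshold]


segments = [
    Segment(0, "Call me Ishmael.", 0.95),
    Segment(1, "Some years ago, never mind how long precisely.", 0.62),
    Segment(2, "I thought I would sail about a little.", 0.88),
]

to_record = select_for_rerecording(segments)
print([s.index for s in to_record])  # → [1]
```

Under these assumptions, only the second sentence would be sent to the user for recording; the rest of the TTS audio would be kept as generated.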
Key Features and Innovation
- Uses the user's own voice to enhance text-to-speech generated audio
- Allows for personalized audio content based on user recordings
- Improves the quality and accuracy of the audio content by replacing selected portions with the user's recordings
Potential Applications
- Personalized voice assistants
- Interactive audio content creation
- Language learning tools
Problems Solved
- Enhances user experience with more natural-sounding audio content
- Improves the accuracy of text-to-speech technology
- Allows for customization of audio content based on user preferences
Benefits
- Enhanced user engagement
- Improved audio content quality
- Personalized user experience
Commercial Applications
- Customized voice assistant services for businesses
- Educational platforms with personalized audio content
- Audio book services with user-recorded updates
Prior Art
There may be prior art related to incorporating user recordings into text-to-speech generated audio content. Researchers and patent databases can be consulted to explore existing technologies in this field.
Frequently Updated Research
Research on improving text-to-speech technology and user interaction with audio content may provide insights into the ongoing development of similar systems.
Questions about the Technology
How does this technology improve user experience with audio content?
This technology enhances user experience by allowing for personalized and natural-sounding audio content based on user recordings.
What are the potential commercial applications of this technology?
The technology can be applied to customized voice assistant services, educational platforms, and audio book services to provide personalized audio content for users.
Original Abstract Submitted
In an embodiment, a set of text is received. Initial audio content substantially corresponding to a voice of a user, associated with the set of text and generated by a text-to-speech (TTS) model that was trained using training data that includes audio of the user, is received. A subset of text from the set of text that is to be recorded by the user is identified, based on analysis of the initial audio content. A signal indicating that the subset of text are to be recorded by the user is sent to cause the second compute device to generate a user recording. A representation of the user recording is received. Portions of the initial audio content associated with the subset of text are caused to be updated using the user recording to generate updated audio content.
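The final step of the abstract, updating portions of the initial audio content with the user recording, might look like the minimal sketch below. It assumes audio is stored as one clip per text segment (here, a plain list of sample values keyed by segment index); that representation and the function names `update_audio` and `concatenate` are hypothetical choices made for illustration, not details from the application.

```python
def update_audio(initial_audio, user_recordings):
    """Return a copy of the initial audio with user recordings substituted in.

    initial_audio:   dict mapping segment index -> TTS clip (list of samples)
    user_recordings: dict mapping segment index -> user clip, for the subset only
    """
    unknown = set(user_recordings) - set(initial_audio)
    if unknown:
        raise KeyError(f"recordings for unknown segments: {sorted(unknown)}")
    updated = dict(initial_audio)    # leave the original mapping untouched
    updated.update(user_recordings)  # user clips replace the TTS clips
    return updated


def concatenate(audio_by_segment):
    """Join the clips in segment order into a single list of samples."""
    samples = []
    for index in sorted(audio_by_segment):
        samples.extend(audio_by_segment[index])
    return samples


initial = {0: [0.1, 0.2], 1: [0.3], 2: [0.4, 0.5]}
recordings = {1: [0.9, 0.8]}  # the user recorded only segment 1

updated = update_audio(initial, recordings)
print(concatenate(updated))  # → [0.1, 0.2, 0.9, 0.8, 0.4, 0.5]
```

Keeping the clips keyed by segment means only the re-recorded portions change, which matches the abstract's description of updating portions of the audio rather than regenerating all of it.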