Recorded Books, Inc. (20240265910). METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION simplified abstract

From WikiPatents

METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION

Organization Name

Recorded Books, Inc.

Inventor(s)

Dion Michael Pyland of Elberta AL (US)

John Michael Shea of Vienna VA (US)

METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240265910, titled 'METHOD AND APPARATUS FOR AUDIO CONTENT CREATION VIA A COMBINATION OF A TEXT-TO-SPEECH MODEL AND HUMAN NARRATION'.

Simplified Explanation

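The patent application describes a system in which audio generated by a text-to-speech (TTS) model trained on a user's voice is selectively updated with recordings made by that user.

  • A TTS model trained on audio of the user generates initial audio content for a set of text
  • Based on analysis of the initial audio content, a subset of the text is identified for the user to record
  • The user's recording is used to update the corresponding portions of the initial audio content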
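
The update step in the list above can be illustrated with a minimal Python sketch. The application does not specify a data model, so everything here is an assumption made for illustration: the `Segment` class, the `apply_user_recordings` function, and the use of byte strings as stand-ins for real audio are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """One span of text with its audio (hypothetical data model)."""
    text: str
    audio: bytes          # stand-in for real audio samples
    source: str = "tts"   # "tts" for model output, "user" for a human recording


def apply_user_recordings(
    segments: List[Segment], recordings: Dict[int, bytes]
) -> List[Segment]:
    """Return a new list in which each segment whose index appears in
    `recordings` carries the user's audio instead of the TTS audio."""
    updated = []
    for index, segment in enumerate(segments):
        if index in recordings:
            updated.append(replace(segment, audio=recordings[index], source="user"))
        else:
            updated.append(segment)
    return updated


# Initial audio content: every segment comes from the TTS model.
initial = [
    Segment("Chapter one.", b"tts-0"),
    Segment("It was a dark night.", b"tts-1"),
    Segment("The end.", b"tts-2"),
]

# The user re-records only the second segment (index 1).
updated = apply_user_recordings(initial, {1: b"user-1"})
```

Returning a new list rather than mutating the input keeps the initial audio content available for comparison; this is a design choice of the sketch, not something the application states.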

Key Features and Innovation

  • Uses the user's own voice to enhance text-to-speech generated audio
  • Allows for personalized audio content based on user recordings
  • Improves the quality and accuracy of audio content by replacing selected portions with the user's recording

Potential Applications

  • Personalized voice assistants
  • Interactive audio content creation
  • Language learning tools

Problems Solved

  • Enhances user experience with more natural-sounding audio content
  • Improves the accuracy of text-to-speech technology
  • Allows for customization of audio content based on user preferences

Benefits

  • Enhanced user engagement
  • Improved audio content quality
  • Personalized user experience

Commercial Applications

  • Customized voice assistant services for businesses
  • Educational platforms with personalized audio content
  • Audio book services with user-recorded updates

Prior Art

There may be prior art related to incorporating user recordings into text-to-speech generated audio content. A search of patent databases and the research literature would be needed to identify existing technologies in this field.

Frequently Updated Research

Research on improving text-to-speech technology and user interaction with audio content may provide insights into the ongoing development of similar systems.

Questions about the Technology

How does this technology improve user experience with audio content?

This technology enhances user experience by allowing for personalized and natural-sounding audio content based on user recordings.

What are the potential commercial applications of this technology?

The technology can be applied to customized voice assistant services, educational platforms, and audio book services to provide personalized audio content for users.


Original Abstract Submitted

In an embodiment, a set of text is received. Initial audio content substantially corresponding to a voice of a user, associated with the set of text and generated by a text-to-speech (TTS) model that was trained using training data that includes audio of the user, is received. A subset of text from the set of text that is to be recorded by the user is identified, based on analysis of the initial audio content. A signal indicating that the subset of text are to be recorded by the user is sent to cause the second compute device to generate a user recording. A representation of the user recording is received. Portions of the initial audio content associated with the subset of text are caused to be updated using the user recording to generate updated audio content.
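
The abstract says the subset of text is identified "based on analysis of the initial audio content" but does not say what that analysis is. The following Python sketch assumes, purely for illustration, that each sentence of TTS audio has already been given a quality score between 0 and 1 and that low-scoring sentences are flagged for human narration. The function names, the score, the threshold, and the message format are all hypothetical and are not taken from the application.

```python
from typing import Dict, List, Sequence


def select_for_rerecording(
    sentences: Sequence[str], scores: Sequence[float], threshold: float = 0.8
) -> List[int]:
    """Return indices of sentences whose (assumed) quality score falls
    below `threshold`, i.e. the subset the user would be asked to record."""
    if len(sentences) != len(scores):
        raise ValueError("each sentence needs exactly one score")
    return [i for i, score in enumerate(scores) if score < threshold]


def build_recording_request(sentences: Sequence[str], indices: Sequence[int]) -> Dict:
    """Build a hypothetical message that could be sent to the user's
    device to prompt a recording of the selected sentences."""
    return {
        "type": "record_request",
        "items": [{"index": i, "text": sentences[i]} for i in indices],
    }


sentences = ["Chapter one.", "It was a dark night.", "The end."]
scores = [0.95, 0.42, 0.88]  # assumed per-sentence scores for the TTS audio

to_record = select_for_rerecording(sentences, scores)
request = build_recording_request(sentences, to_record)
```

With these example scores only the second sentence falls below the threshold, so the request contains a single item. Any real analysis (for example pronunciation checks or listener review) would replace the assumed scores.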