17991443. METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TEXT TO SPEECH simplified abstract (Dell Products L.P.)
METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TEXT TO SPEECH
Organization Name
Inventor(s)
METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TEXT TO SPEECH - A simplified explanation of the abstract
This abstract first appeared for US patent application 17991443, titled 'METHOD, DEVICE, AND COMPUTER PROGRAM PRODUCT FOR TEXT TO SPEECH'.
The present disclosure pertains to a method, device, and computer program product for text-to-speech synthesis. The method involves encoding the style of a reference waveform from one speaker and transferring this style to a spectrogram generated from input text, resulting in a style-transferred spectrogram that is then converted into a speech waveform.
- Comparative learning framework for flexible and effective speech synthesis with a target speaker's style
- Lightweight speech style transfer capability
- High-quality and recognizable speech synthesis features
- Effective speaker feature learning
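The "comparative learning framework" in the list above is not spelled out in the abstract. As a rough illustration only, the sketch below shows one common way such an objective can pull style features of the same speaker together and push those of a different speaker apart. The triplet margin loss, the cosine similarity measure, the function names, and the margin value are all assumptions made for this example, not details taken from the application.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity between two style feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_style_loss(anchor, same_speaker, other_speaker, margin=0.2):
    """Hypothetical loss: zero once the anchor is more similar to its own
    speaker than to the other speaker by at least `margin`."""
    return max(0.0, margin - cosine(anchor, same_speaker) + cosine(anchor, other_speaker))


# Style features that are already well separated incur no penalty.
well_separated = triplet_style_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])

# Features that confuse the two speakers are penalised.
confused = triplet_style_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In a trained system the vectors would come from a learned style encoder; here they are hand-written two-dimensional toys so the behaviour of the loss is easy to see.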
Potential Applications:
- Speech synthesis for applications such as virtual and voice assistants and audiobooks
- Personalized speech synthesis for individuals with unique speech styles
Problems Solved:
- Efficiently synthesizing speech with different styles
- Enabling personalized speech synthesis for various applications
Benefits:
- Improved speech synthesis quality
- Enhanced speaker feature learning
- Personalized and customizable speech synthesis capabilities
Commercial Applications: Advanced Text-to-Speech Synthesis Technology for Personalized Applications
This technology can be used in industries such as entertainment, education, customer service, and accessibility services to provide personalized, high-quality speech synthesis solutions.
Prior Art: There have been advancements in speech synthesis technology, but the ability to transfer speech styles efficiently and effectively remains a challenge.
Frequently Updated Research: Researchers are continuously exploring new methods and techniques to enhance speech synthesis technology, including style transfer capabilities and speaker feature learning.
Questions about Text-to-Speech Synthesis Technology:
1. How does this technology improve the efficiency of speech synthesis with different styles?
2. What are the potential applications of personalized speech synthesis in various industries?
Original Abstract Submitted
Embodiments of the present disclosure relate to a method, a device, and a computer program product for text to speech. The method includes encoding a reference waveform of a first speaker to obtain an encoded style feature separated from a second speaker. The method further includes transferring the encoded style feature to a spectrogram obtained by encoding an input text, to obtain a style transferred spectrogram. The method further includes converting the style transferred spectrogram into a time-domain speech waveform. According to the method for text to speech in the present disclosure, a comparative learning framework can also flexibly and effectively synthesize speech with a style of a target speaker, thus realizing lightweight speech style transfer, making it possible to learn high-quality and recognizable features of speech synthesis, and realizing effective speaker feature learning. In addition, the model will be beneficial to other downstream tasks.
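The three steps named in the abstract (encode a reference waveform into a style feature, transfer that feature onto a spectrogram derived from the input text, convert the result into a time-domain waveform) can be traced in a toy sketch. Everything below is an assumption for illustration: the abstract does not describe the encoders, the transfer operation, or the vocoder, so simple hand-written stand-ins are used instead of learned models. In particular, the transfer step here matches the mean and spread of the spectrogram to those of the style feature, which is one familiar style-transfer trick and not necessarily what the application does.

```python
import math
from dataclasses import dataclass
from statistics import fmean, pstdev


@dataclass
class StyleFeature:
    """Toy style feature: mean and spread of frame energy (an assumption)."""
    mean: float
    std: float


def encode_style(waveform: list[float], frame: int = 4) -> StyleFeature:
    """Step 1 stand-in: summarise a reference waveform as a style feature."""
    energies = [
        fmean([abs(s) for s in waveform[i:i + frame]])
        for i in range(0, len(waveform) - frame + 1, frame)
    ]
    return StyleFeature(mean=fmean(energies), std=pstdev(energies))


def text_to_spectrogram(text: str, bands: int = 4) -> list[list[float]]:
    """Step 2a stand-in: one frame per character, derived from its code point."""
    return [[((ord(ch) * (b + 1)) % 17) / 16.0 for b in range(bands)] for ch in text]


def transfer_style(spec: list[list[float]], style: StyleFeature) -> list[list[float]]:
    """Step 2b stand-in: normalise the spectrogram, then rescale it so its
    overall mean and spread match the style feature."""
    values = [v for frame in spec for v in frame]
    mu = fmean(values)
    sigma = pstdev(values) or 1.0  # guard against a constant spectrogram
    return [[style.mean + style.std * (v - mu) / sigma for v in frame] for frame in spec]


def spectrogram_to_waveform(spec: list[list[float]], hop: int = 8) -> list[float]:
    """Step 3 stand-in: each band drives one sinusoid; `hop` samples per frame."""
    wave = []
    for t, frame in enumerate(spec):
        for n in range(hop):
            k = t * hop + n
            wave.append(sum(
                amp * math.sin(2 * math.pi * (b + 1) * k / (4 * hop))
                for b, amp in enumerate(frame)
            ))
    return wave


# Synthetic "reference speaker" waveform with a slowly varying amplitude.
reference = [math.sin(0.3 * n) * (1 + 0.5 * math.sin(0.05 * n)) for n in range(200)]

style = encode_style(reference)
styled = transfer_style(text_to_spectrogram("hello"), style)
speech = spectrogram_to_waveform(styled)
```

The point of the sketch is the data flow: the style feature comes only from the reference waveform, the spectrogram content comes only from the text, and the two are combined before any waveform is produced.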