17987034. METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR SPEECH SYNTHESIS simplified abstract (Dell Products L.P.)

From WikiPatents

METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR SPEECH SYNTHESIS

Organization Name

Dell Products L.P.

Inventor(s)

Zijia Wang of WeiFang (CN)

Zhisong Liu of Shenzhen (CN)

Zhen Jia of Shanghai (CN)

METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR SPEECH SYNTHESIS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17987034, titled 'METHOD, ELECTRONIC DEVICE, AND COMPUTER PROGRAM PRODUCT FOR SPEECH SYNTHESIS'.

The abstract describes a method, electronic device, and computer program product for speech synthesis. The method involves extracting voice feature vectors of multiple speakers from their audio, calculating two loss functions, and generating a speech synthesis model based on both.

  • Utilizes voice feature vectors of speakers to optimize and train a speech synthesis model.
  • Calculates a first loss function from the distances between the speakers' voice feature vectors, and a second loss function from texts and their corresponding real audios.
  • Generates a speech synthesis model to output high-quality audio with target voice features based on texts.
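
The first loss function described above can be illustrated with a small sketch. The abstract only says that this loss is based on distances between the speakers' voice feature vectors; it does not give a formula or say whether distances are minimized or maximized. The version below is an assumption for illustration: it uses one vector per speaker and a margin-based penalty that grows when two different speakers' vectors are closer than a chosen margin. The names `euclidean`, `speaker_distance_loss`, and `margin` are hypothetical and do not come from the patent.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def speaker_distance_loss(vectors, margin=1.0):
    """Hypothetical distance-based loss over per-speaker feature vectors.

    Assumption (not stated in the abstract): vectors of different
    speakers should be at least `margin` apart, so each pair that is
    closer than `margin` contributes a penalty of (margin - distance).
    Returns the mean penalty over all speaker pairs.
    """
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    penalties = [max(0.0, margin - euclidean(a, b)) for a, b in pairs]
    return sum(penalties) / len(pairs)


if __name__ == "__main__":
    # Three toy 2-dimensional "voice feature vectors", one per speaker.
    speakers = [[0.0, 0.0], [0.0, 0.5], [3.0, 4.0]]
    print(speaker_distance_loss(speakers, margin=1.0))
```

In a real system the vectors would come from a trained speaker encoder and the loss would be computed on tensors in a deep learning framework; plain Python lists are used here only to keep the sketch self-contained.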

Potential Applications:

  • Speech synthesis for virtual assistants
  • Voice cloning for personalized audio messages
  • Language translation with natural-sounding voices

Problems Solved:

  • Improving the quality and accuracy of speech synthesis
  • Enhancing the naturalness and expressiveness of synthesized speech

Benefits:

  • Customizable voice features for specific applications
  • Enhanced user experience with more natural-sounding speech
  • Efficient training and optimization of speech synthesis models

Commercial Applications:

  • Voice-enabled devices and applications
  • Language learning platforms
  • Customer service automation systems

Prior Art: There are existing methods for speech synthesis using voice feature vectors, but this approach combines multiple loss functions for improved model training and optimization.

Frequently Updated Research: Ongoing research focuses on refining the calculation of loss functions and further enhancing the naturalness of synthesized speech.

Questions about Speech Synthesis:

1. How does this method differ from traditional speech synthesis techniques?

This method incorporates multiple loss functions for improved model training and optimization, resulting in higher-quality synthesized speech.

2. What are the potential limitations of using voice feature vectors for speech synthesis?

Voice feature vectors may not capture all nuances of a speaker's voice, leading to potential limitations in accurately replicating natural speech patterns.


Original Abstract Submitted

Embodiments of the present disclosure provide a method, an electronic device, and a computer program product for speech synthesis. The method for speech synthesis includes: extracting a plurality of voice feature vectors of a plurality of speakers from a plurality of audios corresponding to the plurality of speakers; calculating a first loss function based on distances between the plurality of voice feature vectors of the plurality of speakers; calculating a second loss function according to a plurality of texts and a plurality of corresponding real audios; and generating a speech synthesis model based on the first loss function and the second loss function. By implementing the method, the speech synthesis model can be optimized and trained, so that a high-quality audio with target voice features can be outputted based on the texts.
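
The last two steps of the abstract, computing a second loss from texts and their real audios and then building the model from both losses, can be sketched as follows. The abstract does not specify how the second loss is computed or how the two losses are combined. This sketch assumes a mean absolute error between the features the model predicts for a text and the features of the corresponding real audio, and a weighted sum as the combined training objective. The names `reconstruction_loss`, `total_loss`, and `weight` are hypothetical.

```python
def reconstruction_loss(predicted, real):
    """Mean absolute error between predicted and real audio features.

    Assumption: both inputs are flat, equal-length lists of numbers,
    for example flattened spectrogram frames. The abstract does not
    name a specific error measure.
    """
    if len(predicted) != len(real):
        raise ValueError("predicted and real must have the same length")
    if not real:
        raise ValueError("inputs must not be empty")
    return sum(abs(p - r) for p, r in zip(predicted, real)) / len(real)


def total_loss(first_loss, second_loss, weight=1.0):
    """Hypothetical combined objective: first loss plus weighted second loss."""
    return first_loss + weight * second_loss


if __name__ == "__main__":
    # Toy values: features predicted from a text vs. the real audio's features.
    second = reconstruction_loss([1.0, 2.0], [1.5, 1.0])
    # 0.5 stands in for a first-loss value computed from speaker vectors.
    print(total_loss(0.5, second, weight=2.0))
```

A training loop would minimize this combined value over many text and audio pairs; the weighting between the two terms is a design choice that the abstract leaves open.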