Sanas.ai Inc. (20240347070). SYSTEM AND METHOD FOR AUTOMATIC ALIGNMENT OF PHONETIC CONTENT FOR REAL-TIME ACCENT CONVERSION simplified abstract



Organization Name

Sanas.ai Inc.

Inventor(s)

Lukas Pfeifenberger of Salzburg (AT)

Shawn Zhang of Palo Alto CA (US)

SYSTEM AND METHOD FOR AUTOMATIC ALIGNMENT OF PHONETIC CONTENT FOR REAL-TIME ACCENT CONVERSION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240347070, titled 'SYSTEM AND METHOD FOR AUTOMATIC ALIGNMENT OF PHONETIC CONTENT FOR REAL-TIME ACCENT CONVERSION'.

The disclosed technology involves methods, systems, and media for real-time accent conversion. Phonetic embedding vectors are obtained for source accent phonetic content from input audio data. A trained machine learning model transforms these vectors into vectors that correspond to the phonetic characteristics of speech data in a target accent. An alignment between the original and transformed vectors is then determined, and the speech data is aligned to the phonetic content based on that alignment to generate output audio data representing the target accent.

  • Obtaining phonetic embedding vectors for source accent phonetic content from input audio data
  • Applying a machine learning model to transform these vectors to match the target accent
  • Determining alignment by maximizing cosine distance between the source and transformed vectors
  • Aligning speech data based on the determined alignment to generate output audio data representing the target accent
  • Efficient and seamless accent conversion in real-time applications
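
The alignment step in the list above can be illustrated with a small sketch. The abstract does not say how the alignment is computed beyond "maximizing a cosine distance", so everything here is an assumption: the function names `cosine_similarity` and `align_frames` are hypothetical, the toy vectors are made up, and the sketch treats the quantity being maximized as a cosine similarity score (larger means closer), picking the best-scoring transformed vector for each source vector independently. It is not the applicant's method.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def align_frames(source_vectors, transformed_vectors):
    """For each source vector, return the index of the transformed vector
    with the highest cosine similarity (assumed reading of the abstract)."""
    alignment = []
    for src in source_vectors:
        scores = [cosine_similarity(src, tgt) for tgt in transformed_vectors]
        best_index = max(range(len(scores)), key=scores.__getitem__)
        alignment.append(best_index)
    return alignment


# Toy 2-dimensional "phonetic embedding vectors" (illustrative values only).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
transformed = [[0.1, 0.9], [0.9, 0.1], [0.7, 0.7]]

alignment = align_frames(source, transformed)
print(alignment)  # [1, 0, 2]
```

A per-vector argmax like this ignores time ordering. A real-time system would more plausibly constrain the alignment to be monotonic, but the abstract gives no detail on that point.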

Potential Applications

  • Language learning platforms
  • Voice assistant technology
  • Dubbing and voice-over services
  • Speech therapy and accent modification programs

Problems Solved

  • Efficient real-time accent conversion
  • Seamless integration of different accents in audio data
  • Improved user experience in language-related applications

Benefits

  • Enhanced communication across different accents
  • Personalized user experience in language learning
  • Increased accessibility for individuals with speech differences

Commercial Applications

Real-Time Accent Conversion Technology for Language Learning Platforms

This technology can be utilized in language learning platforms to provide personalized accent conversion services for learners, enhancing their language acquisition experience. Additionally, it can be integrated into voice assistant technology to improve communication between users with different accents.

Questions about Real-Time Accent Conversion Technology

1. How does this technology impact language learning platforms?

This technology enhances language learning platforms by providing personalized accent conversion services for learners, improving their language acquisition experience.

2. What are the potential commercial applications of real-time accent conversion technology?

The commercial applications of this technology include language learning platforms, voice assistant technology, dubbing and voice-over services, and speech therapy programs.


Original Abstract Submitted

The disclosed technology relates to methods, accent conversion systems, and non-transitory computer readable media for real-time accent conversion. In some examples, a set of phonetic embedding vectors is obtained for phonetic content representing a source accent and obtained from input audio data. A trained machine learning model is applied to the set of phonetic embedding vectors to generate a set of transformed phonetic embedding vectors corresponding to phonetic characteristics of speech data in a target accent. An alignment is determined by maximizing a cosine distance between the set of phonetic embedding vectors and the set of transformed phonetic embedding vectors. The speech data is then aligned to the phonetic content based on the determined alignment to generate output audio data representing the target accent. The disclosed technology transforms phonetic characteristics of a source accent to match the target accent more closely for efficient and seamless accent conversion in real-time applications.