Meta Platforms, Inc. (20240211704). GLOBALIZATION OF VIDEOS USING AUTOMATED VOICE DUBBING simplified abstract

From WikiPatents

GLOBALIZATION OF VIDEOS USING AUTOMATED VOICE DUBBING

Organization Name

Meta Platforms, Inc.

Inventor(s)

Charles Patrick Mason Griffin of Menlo Park CA (US)

Prakash Chandra of Fremont CA (US)

Carlos Lourenco of Dublin CA (US)

Amit Agarwal of Newark CA (US)

GLOBALIZATION OF VIDEOS USING AUTOMATED VOICE DUBBING - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240211704 titled 'GLOBALIZATION OF VIDEOS USING AUTOMATED VOICE DUBBING'.

Simplified Explanation: The audio processing system described in the patent application separates the original audio into background noise and per-speaker audio data, translates each speaker's speech from one language to another, and then transmits the encoded audio data to a user device.

  • The system separates background noise, first speaker audio data, and second speaker audio data.
  • It recognizes and converts speech from the first speaker to text, translates it to a second language, and converts it back to speech.
  • It does the same for the second speaker, translating their speech into the same second language and converting it back to speech.
  • The system then generates encoded audio data for transmission to a user device.
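
The steps above can be read as a per-speaker pipeline. The following is a minimal sketch of that flow, not the applicant's implementation: every name in it (`DubbingPipeline`, `SeparatedAudio`, `mix`, and the five stage callables) is hypothetical, and the stages are injected as plain functions so that any separation, recognition, translation, or synthesis backend could be plugged in. Mixing the background track back in before encoding is also an assumption; the abstract only says that encoded audio data is generated.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Samples = List[float]  # hypothetical representation: mono PCM samples


@dataclass
class SeparatedAudio:
    """Result of the separation step: one background track plus one track per speaker."""
    background: Samples
    speakers: Dict[str, Samples]  # speaker id -> that speaker's samples


def mix(tracks: List[Samples]) -> Samples:
    """Sum tracks sample by sample; shorter tracks simply end early."""
    length = max((len(t) for t in tracks), default=0)
    return [sum(t[i] for t in tracks if i < len(t)) for i in range(length)]


@dataclass
class DubbingPipeline:
    # Each stage is an injected callable (all hypothetical placeholders).
    separate: Callable[[Samples], SeparatedAudio]   # source separation
    recognize: Callable[[Samples], str]             # speech -> text
    translate: Callable[[str, str], str]            # (text, target language) -> text
    synthesize: Callable[[str, str], Samples]       # (text, speaker id) -> speech
    encode: Callable[[Samples], bytes]              # samples -> encoded audio

    def run(self, original: Samples, target_language: str) -> bytes:
        parts = self.separate(original)
        dubbed: List[Samples] = []
        # Each speaker is recognized, translated, and re-voiced independently.
        for speaker_id, samples in parts.speakers.items():
            text = self.recognize(samples)
            translated = self.translate(text, target_language)
            dubbed.append(self.synthesize(translated, speaker_id))
        # Assumption: the background track is mixed back in before encoding.
        return self.encode(mix([parts.background, *dubbed]))
```

Passing the speaker id to `synthesize` is one way a backend could keep a distinct voice per speaker; the abstract itself does not say how the synthesized voices are chosen.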

Potential Applications: As the title suggests, the primary use is dubbing videos for audiences who speak another language. The same approach could extend to conference calls, meetings, and other multi-speaker audio that needs to be understood in a second language.

Problems Solved: This technology addresses the challenge of translating audio that mixes several speakers with background noise, so that each speaker's speech can be translated and re-voiced separately.

Benefits: The system allows for seamless communication between speakers of different languages, improving understanding and collaboration in multilingual environments.

Commercial Applications: The technology could be valuable for multinational companies, language interpretation services, and any organization that deals with multilingual communication on a regular basis.

Prior Art: Prior art related to this technology may include research on speech recognition, translation, and audio processing systems in the field of artificial intelligence and machine learning.

Frequently Updated Research: Researchers may be exploring improvements in speech recognition accuracy, translation algorithms, and real-time audio processing capabilities in similar systems.

Questions about Audio Processing System:

  1. How does the system handle background noise in the audio data?
  2. What are the potential limitations of real-time translation in audio processing systems?


Original Abstract Submitted

an audio processing system includes: a receiver configured to receive the original audio data; a processor configured to execute the instructions stored in the memory to cause the audio processing system to: separate a background noise audio data, a first speaker audio data, and a second speaker audio data; recognize first speaker speech, convert the first speaker speech to first speaker text, translate the first speaker text to a second language text, and convert the second language text to a second speech; recognize second speaker speech, convert the second speaker speech to second speaker text, translate the second speaker text to the second language text, and convert the second language text of the second speaker to a second speech for the second speaker; and generate encoded audio data; and a transmitter configured to transmit the encoded audio data to a content user device.