17937692. PERSONALIZED MULTI-MODAL SPOKEN LANGUAGE IDENTIFICATION simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

PERSONALIZED MULTI-MODAL SPOKEN LANGUAGE IDENTIFICATION

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Divya Neelagiri of Dublin CA (US)

Cindy Sushen Tseng of Santa Clara CA (US)

Vijendra Raj Apsingekar of San Jose CA (US)

PERSONALIZED MULTI-MODAL SPOKEN LANGUAGE IDENTIFICATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17937692, titled 'PERSONALIZED MULTI-MODAL SPOKEN LANGUAGE IDENTIFICATION'.

Simplified Explanation

The patent application describes a method for identifying the spoken language of a person based on an audio input captured by an electronic device. The method involves the following steps:

  • Obtaining an audio input of a person speaking, captured by an electronic device.
  • For each of multiple language types, determining a first probability that the person is speaking that language by applying a trained spoken language identification model to the audio input.
  • For each language type, determining at least one second probability that the person is speaking that language, based on at least one characteristic of the person or the electronic device.
  • Calculating a score for each language type as a weighted sum of the first and second probabilities.
  • Identifying the language type with the highest score as the spoken language of the person in the audio input.
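
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the application: the function names, the language codes, the weights (0.7 and 0.3), and the idea of using a device-locale prior as the "characteristic of the electronic device" are all hypothetical, and the first probabilities are hard-coded rather than produced by a real trained model.

```python
from typing import Dict, List


def score_languages(
    audio_probs: Dict[str, float],
    context_probs: List[Dict[str, float]],
    audio_weight: float,
    context_weights: List[float],
) -> Dict[str, float]:
    """Compute a weighted-sum score for each language type.

    audio_probs:     first probabilities (stand-in for the trained model's output).
    context_probs:   one dict of second probabilities per characteristic
                     of the person or device (hypothetical sources).
    audio_weight:    weight applied to the first probability (assumed value).
    context_weights: one weight per entry in context_probs (assumed values).
    """
    if len(context_probs) != len(context_weights):
        raise ValueError("need exactly one weight per context probability source")

    scores: Dict[str, float] = {}
    for lang, p_audio in audio_probs.items():
        score = audio_weight * p_audio
        for weight, probs in zip(context_weights, context_probs):
            # A language missing from a context source contributes 0 from that source.
            score += weight * probs.get(lang, 0.0)
        scores[lang] = score
    return scores


def identify_language(scores: Dict[str, float]) -> str:
    """Return the language type with the highest score."""
    return max(scores, key=lambda lang: scores[lang])


if __name__ == "__main__":
    # Hypothetical first probabilities from the audio model.
    audio = {"en": 0.45, "es": 0.50, "ko": 0.05}
    # Hypothetical second probabilities derived from the device locale.
    locale_prior = {"en": 0.90, "es": 0.05, "ko": 0.05}

    scores = score_languages(audio, [locale_prior], 0.7, [0.3])
    print(identify_language(scores))  # en
```

In this made-up example the audio model alone slightly favors "es" (0.50 versus 0.45), but the device-based probability tips the weighted sum toward "en". How the weights are chosen is not stated in the abstract.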

Potential applications of this technology:

  • Language recognition in voice assistants: This method can be used to accurately identify the language spoken by a user, allowing voice assistants to respond in the appropriate language.
  • Multilingual call centers: Call centers that handle calls in multiple languages can use this method to automatically route calls to agents who are fluent in the language spoken by the caller.
  • Language learning applications: Language learning platforms could use this method to detect which language a learner is speaking at a given moment and adapt prompts or feedback accordingly.

Problems solved by this technology:

  • Accurate language identification: The method improves the accuracy of identifying the spoken language by considering both the audio input and additional characteristics of the person or the electronic device.
  • Reliable language recognition: Because each score is a weighted sum of probabilities, evidence about the person or the electronic device can tip the decision when the audio model alone is uncertain between languages.
  • Adaptability to different languages: The method can be trained to recognize multiple language types, making it adaptable to various language scenarios.

Benefits of this technology:

  • Enhanced user experience: Voice-controlled devices and applications can better understand and respond to users in their preferred language, improving the overall user experience.
  • Streamlined communication: In multilingual environments, this method can help streamline communication by automatically identifying the spoken language and routing the conversation to an appropriate recipient.
  • Time and cost savings: By automating language recognition, businesses can save time and resources that would otherwise be spent manually identifying languages in various contexts.


Original Abstract Submitted

A method includes obtaining an audio input of a person speaking, where the audio input is captured by an electronic device. The method also includes, for each of multiple language types, (i) determining a first probability that the person is speaking in the language type by applying a trained spoken language identification model to the audio input, (ii) determining at least one second probability that the person is speaking in the language type based on at least one characteristic of the person or the electronic device, and (iii) determining a score for the language type based on a weighted sum of the first and second probabilities. The method further includes identifying the language type associated with a highest score as a spoken language of the person in the audio input.
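
One way to write the scoring rule in the abstract compactly is shown below. The notation is an assumption introduced here for illustration and does not appear in the application: $\ell$ ranges over the language types, $p^{(1)}_{\ell}$ is the first probability from the trained model, $p^{(2,k)}_{\ell}$ is the $k$-th second probability (one per characteristic of the person or the electronic device, $K \ge 1$), and $w_0, w_1, \dots, w_K$ are the weights of the sum.

```latex
s_{\ell} \;=\; w_0 \, p^{(1)}_{\ell} \;+\; \sum_{k=1}^{K} w_k \, p^{(2,k)}_{\ell},
\qquad
\hat{\ell} \;=\; \arg\max_{\ell} \, s_{\ell}
```

Here $\hat{\ell}$ is the language type identified as the spoken language. The abstract does not say how the weights are set or whether they sum to one.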