Adobe Inc. (20240257798). SPOKEN LANGUAGE RECOGNITION simplified abstract

From WikiPatents

SPOKEN LANGUAGE RECOGNITION

Organization Name

Adobe Inc.

Inventor(s)

Oriol Nieto-Caballero of Oakland CA (US)

Zeyu Jin of San Francisco CA (US)

Justin Jonathan Salamon of San Francisco CA (US)

Franck Dernoncourt of Spokane WA (US)

SPOKEN LANGUAGE RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240257798, titled 'SPOKEN LANGUAGE RECOGNITION'.

The technology described in this patent application utilizes a neural network with an efficient and lightweight architecture to recognize spoken language from audio signals.

  • Features are generated from the audio signal, such as by converting it to a normalized spectrogram.
  • These features are input to the neural network, which includes convolutional layers and an output activation layer.
  • Each neuron in the output activation layer corresponds to a language and generates an activation value.
  • Based on the activation values, the system provides an indication of zero or more languages, drawn from a fixed set, that are present in the audio signal.
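
The first three steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the architecture claimed in the application: the frame sizes, the single 1-D convolution over time, the average pooling, the sigmoid output, the random weights, and the three-language set are all hypothetical choices made to keep the example small.

```python
import numpy as np

LANGUAGES = ["en", "es", "fr"]  # hypothetical language set


def normalized_spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram scaled to zero mean and unit variance."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    spec = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))  # (time, frequency)
    return (spec - spec.mean()) / (spec.std() + 1e-8)


def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution over time, then ReLU.

    x: (time, in_ch); kernels: (width, in_ch, out_ch); bias: (out_ch,)
    """
    width = kernels.shape[0]
    out = np.stack([
        np.tensordot(x[t : t + width], kernels, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0] - width + 1)
    ])
    return np.maximum(out + bias, 0.0)


def language_activations(signal, params):
    """Features -> convolutional layer -> one activation per language."""
    feats = normalized_spectrogram(signal)
    hidden = conv1d_relu(feats, params["conv_w"], params["conv_b"])
    pooled = hidden.mean(axis=0)  # average over time (an assumption)
    logits = pooled @ params["out_w"] + params["out_b"]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid output (an assumption)


rng = np.random.default_rng(0)
n_bins = 256 // 2 + 1  # frequency bins produced by rfft for frame_len=256
params = {  # untrained random weights, for shape illustration only
    "conv_w": rng.normal(scale=0.1, size=(5, n_bins, 16)),
    "conv_b": np.zeros(16),
    "out_w": rng.normal(scale=0.1, size=(16, len(LANGUAGES))),
    "out_b": np.zeros(len(LANGUAGES)),
}
audio = rng.normal(size=16000)  # noise standing in for one second of speech
activations = language_activations(audio, params)
print(dict(zip(LANGUAGES, activations.round(3))))
```

Because each output neuron has its own independent activation, the values do not need to sum to one, which is what allows several languages (or none) to be indicated for the same signal.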

Potential Applications:

  • Speech recognition systems
  • Language translation tools
  • Voice-controlled devices

Problems Solved:

  • Improving accuracy and efficiency of spoken language recognition
  • Enhancing multilingual capabilities in technology

Benefits:

  • Faster and more accurate language recognition
  • Enhanced user experience in voice-controlled applications
  • Improved accessibility for multilingual users

Commercial Applications:

  • Integration into smart speakers and virtual assistants
  • Language learning applications
  • Customer service chatbots

Questions about the Technology:

  1. How does the neural network differentiate between different languages in the audio signal?
  2. What are the potential limitations of this technology in real-world applications?

Frequently Updated Research:

  • Stay updated on advancements in neural network architectures for speech recognition
  • Monitor developments in multilingual language processing algorithms


Original Abstract Submitted

Some aspects of the technology described herein employ a neural network with an efficient and lightweight architecture to perform spoken language recognition. Given an audio signal comprising speech, features are generated from the audio signal, for instance, by converting the audio signal to a normalized spectrogram. The features are input to the neural network, which has one or more convolutional layers and an output activation layer. Each neuron of the output activation layer corresponds to a language from a set of languages and generates an activation value. Based on the activation values, an indication of zero or more languages from the set of languages is provided for the audio signal.
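
The final step, turning per-language activation values into an indication of zero or more languages, could look like the sketch below. The abstract does not say how the indication is derived; the fixed threshold of 0.5, the function name `detected_languages`, and the language codes are assumptions made for illustration.

```python
def detected_languages(activations, languages, threshold=0.5):
    """Return every language whose activation meets the threshold.

    The result may be empty, which corresponds to "zero languages".
    The threshold rule is an assumption, not taken from the abstract.
    """
    if len(activations) != len(languages):
        raise ValueError("expected one activation value per language")
    return [lang for lang, a in zip(languages, activations) if a >= threshold]


languages = ["en", "es", "fr"]  # hypothetical language set
print(detected_languages([0.91, 0.12, 0.07], languages))  # ['en']
print(detected_languages([0.80, 0.65, 0.10], languages))  # ['en', 'es']
print(detected_languages([0.20, 0.10, 0.05], languages))  # []
```

The empty result in the last call is the "zero languages" case, for example when the audio contains speech in a language outside the set.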