18609362. Phrase Extraction for ASR Models simplified abstract (Google LLC)

From WikiPatents

Phrase Extraction for ASR Models

Organization Name

Google LLC

Inventor(s)

Ehsan Amid of Mountain View CA (US)

Om Dipakbhai Thakkar of Sunnyvale CA (US)

Rajiv Mathews of Sunnyvale CA (US)

Francoise Beaufays of Mountain View CA (US)

Phrase Extraction for ASR Models - A simplified explanation of the abstract

This abstract first appeared for US patent application 18609362, titled 'Phrase Extraction for ASR Models'.

Simplified Explanation:

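The method described in the abstract modifies audio data to hide a specific phrase, then runs a trained ASR model on the modified audio. If the model's transcription still contains the hidden phrase, the model is reported as having leaked that phrase from its training data.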

Key Features and Innovation:

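  • Audio data is altered to obscure a particular phrase recited in the utterance.
  • A trained ASR model predicts the transcription of the altered audio.
  • If the predicted transcription still contains the obscured phrase, the model is flagged as having leaked it from the training data set.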
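
The first feature, obscuring a phrase in the audio, can be illustrated with a minimal sketch. The abstract does not say how the audio is modified, so replacing the phrase's time span with silence is an assumption made here for illustration. The function name `mask_phrase` and the parameters `start_s` and `end_s` are hypothetical, and the sketch assumes the phrase's start and end times are already known.

```python
from typing import List


def mask_phrase(
    samples: List[float], sample_rate: int, start_s: float, end_s: float
) -> List[float]:
    """Return a copy of `samples` with [start_s, end_s) replaced by silence.

    Assumption: silence is only one possible way to obfuscate a phrase;
    the abstract does not name a specific technique.
    """
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if not 0 <= start_s <= end_s:
        raise ValueError("expected 0 <= start_s <= end_s")
    # Convert times to sample indices, clamped to the length of the audio.
    start = min(int(start_s * sample_rate), len(samples))
    end = min(int(end_s * sample_rate), len(samples))
    # Build a new list so the caller's original audio is left untouched.
    return samples[:start] + [0.0] * (end - start) + samples[end:]
```

For example, with a sample rate of 4 samples per second, masking the span from 0.5 s to 1.0 s zeroes the samples at indices 2 and 3 and leaves the rest unchanged.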

Potential Applications: This technology could be used to audit ASR models in privacy-sensitive applications, checking whether certain phrases from the training data can be leaked through the model's transcriptions.

Problems Solved: This technology addresses the issue of ASR models unintentionally revealing sensitive phrases from their training data in the transcriptions they produce.

Benefits:

  • Enhanced privacy protection for sensitive information.
  • Improved control over the information revealed by ASR models.

Commercial Applications: Potential commercial applications include secure transcription services for industries that handle confidential information, such as the healthcare and legal sectors.

Prior Art: Researchers interested in this technology may want to explore prior work in the fields of audio data obfuscation and ASR model training techniques.

Frequently Updated Research: Stay updated on advancements in audio data privacy protection and ASR model security measures to ensure the most effective implementation of this technology.

Questions about Audio Data Obfuscation:

  1. How does audio data obfuscation differ from encryption in terms of protecting sensitive information?
  2. What are the potential limitations of using audio data obfuscation techniques in real-world applications?


Original Abstract Submitted

A method of phrase extraction for ASR models includes obtaining audio data characterizing an utterance and a corresponding ground-truth transcription of the utterance and modifying the audio data to obfuscate a particular phrase recited in the utterance. The method also includes processing, using a trained ASR model, the modified audio data to generate a predicted transcription of the utterance, and determining whether the predicted transcription includes the particular phrase by comparing the predicted transcription of the utterance to the ground-truth transcription of the utterance. When the predicted transcription includes the particular phrase, the method includes generating an output indicating that the trained ASR model leaked the particular phrase from a training data set used to train the ASR model.
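
The comparison and output steps of the abstract can be sketched roughly as follows. This is one possible reading, not the patented implementation: the `transcribe` callable stands in for a trained ASR model, the word-level matching in `contains_phrase` is an assumed comparison rule, and all names (`check_leak`, `contains_phrase`, `_tokens`) are hypothetical.

```python
import re
from typing import Any, Callable, Dict, List


def _tokens(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def contains_phrase(transcript: str, phrase: str) -> bool:
    """Return True if the words of `phrase` occur contiguously in `transcript`."""
    words, target = _tokens(transcript), _tokens(phrase)
    n = len(target)
    return n > 0 and any(
        words[i:i + n] == target for i in range(len(words) - n + 1)
    )


def check_leak(
    modified_audio: Any,
    ground_truth: str,
    phrase: str,
    transcribe: Callable[[Any], str],
) -> Dict[str, Any]:
    """Transcribe obfuscated audio and report whether the phrase reappears.

    Assumptions: `modified_audio` already has the phrase obfuscated, and
    `transcribe` wraps whatever trained ASR model is being audited.
    """
    # The ground truth confirms the phrase was actually spoken in the utterance.
    if not contains_phrase(ground_truth, phrase):
        raise ValueError("phrase does not occur in the ground-truth transcription")
    predicted = transcribe(modified_audio)
    # The phrase was hidden in the audio, so recovering it suggests the
    # model reproduced it from its training data.
    leaked = contains_phrase(predicted, phrase)
    return {"phrase": phrase, "predicted": predicted, "leaked": leaked}
```

A stub such as `lambda audio: "my pin is 1234"` can stand in for `transcribe` when trying the sketch out. A real audit would pass a function that calls the model under test.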