17809202. ARTIFICIAL INTELLIGENCE FACTSHEET GENERATION FOR SPEECH RECOGNITION simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)

From WikiPatents

ARTIFICIAL INTELLIGENCE FACTSHEET GENERATION FOR SPEECH RECOGNITION

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Shreya Khare of Bangalore (IN)

Ashish R. Mittal of Bengaluru (IN)

Saneem Ahmed Chemmengath of Bangalore (IN)

Samarth Bharadwaj of Bangalore (IN)

Karthik Sankaranarayanan of Bangalore (IN)

ARTIFICIAL INTELLIGENCE FACTSHEET GENERATION FOR SPEECH RECOGNITION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17809202, titled 'ARTIFICIAL INTELLIGENCE FACTSHEET GENERATION FOR SPEECH RECOGNITION'.

Simplified Explanation

The patent application describes a method, system, and computer program product that automatically generates AI factsheets and uses them to customize speech to text (STT) models. The key points are:

  • The method takes audio data containing human speech as input.
  • It uses a first speech to text model to convert the audio data into text.
  • Transcription errors made by the first speech to text model are identified in the generated text.
  • AI factsheets are then generated to describe the model metadata of the first speech to text model.
  • Based on the identified errors and the AI factsheets, a second speech to text model is generated specifically customized to the user.
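
The error identification step can be illustrated with a minimal Python sketch. The abstract does not say how transcription errors are found, so this example assumes a trusted reference transcript is available and compares it word by word with the model output. The function name `find_transcription_errors` and the sample sentences are hypothetical and do not come from the patent application.

```python
from difflib import SequenceMatcher


def find_transcription_errors(reference: str, hypothesis: str):
    """Return word-level differences between a reference transcript and
    the text produced by a speech to text model.

    Assumption: a human-verified reference transcript exists. The patent
    abstract does not specify how errors are actually identified.
    """
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()

    # Align the two word sequences; autojunk is disabled so that frequent
    # words are never silently ignored during alignment.
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)

    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # tag is one of "replace", "delete", or "insert"
            errors.append((tag, ref_words[i1:i2], hyp_words[j1:j2]))
    return errors


if __name__ == "__main__":
    reference = "please refill my metformin prescription"
    hypothesis = "please refill my met foreman prescription"
    for error in find_transcription_errors(reference, hypothesis):
        print(error)
```

Each returned tuple names the kind of mismatch along with the reference words and the words the model produced in their place, which is one plausible form for the "set of transcription errors" the method works with.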

Potential applications of this technology:

  • Improving accuracy and customization of speech to text services.
  • Enhancing transcription services for various industries, such as legal, medical, and media.
  • Enabling better voice assistants and voice-controlled devices.
  • Assisting in language learning and pronunciation improvement.

Problems solved by this technology:

  • Reducing transcription errors in speech to text conversion.
  • Providing customized speech to text models tailored to individual users.
  • Streamlining the process of generating AI factsheets for model metadata.

Benefits of this technology:

  • Improved accuracy and efficiency in converting speech to text.
  • Customized models that better understand and transcribe individual users' speech patterns.
  • Enhanced user experience with voice-controlled devices and applications.
  • Time and cost savings in transcription services.


Original Abstract Submitted

A method, system, and computer program product for automated artificial intelligence (AI) factsheet generation for modeling and model customization in speech to text (STT) services. The method receives audio data for a user. The audio data contains human speech. Text data is generated, using a first speech to text model, to represent the human speech of the audio data. A set of transcription errors of the first speech to text model are identified. A set of AI factsheets are generated to describe model metadata for the first speech to text model. Based on the set of transcription errors and the set of AI factsheets, the method generates a second speech to text model customized to the user.
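
To make the factsheet idea more concrete, the following Python sketch shows one way model metadata and observed errors could be recorded and then used to decide what a customized model should learn. It is an illustration under stated assumptions, not the method claimed in the application: the `Factsheet` fields, the `build_factsheet` and `customization_vocabulary` helpers, and the idea of customizing through a list of frequently missed words are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Factsheet:
    """Hypothetical factsheet: metadata about a speech to text model."""
    model_name: str
    language: str
    utterances_evaluated: int
    # Maps each misrecognized reference word to how often it was missed.
    missed_words: dict = field(default_factory=dict)


def build_factsheet(model_name, language, missed_per_utterance):
    """Summarize per-utterance missed words into a single factsheet.

    missed_per_utterance: one list of misrecognized words per utterance
    (assumed to come from an earlier error identification step).
    """
    counts = Counter()
    for missed in missed_per_utterance:
        counts.update(missed)
    return Factsheet(model_name, language, len(missed_per_utterance), dict(counts))


def customization_vocabulary(factsheet, min_count=2):
    """Pick words missed at least min_count times as customization targets.

    Assumption: the second, user-specific model is customized with a word
    list. The abstract does not describe the customization mechanism.
    """
    return sorted(w for w, c in factsheet.missed_words.items() if c >= min_count)


if __name__ == "__main__":
    sheet = build_factsheet(
        "baseline-stt",  # hypothetical model name
        "en-US",
        [["metformin"], ["metformin", "lisinopril"], []],
    )
    print(sheet)
    print(customization_vocabulary(sheet))
```

Keeping the error summary inside the factsheet means the customization step needs only the factsheet as input, which mirrors the abstract's statement that the second model is generated from both the transcription errors and the factsheets.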