INTERNATIONAL BUSINESS MACHINES CORPORATION (20240233703). PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER simplified abstract

From WikiPatents

PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Jun Su of Beijing (CN)

Yang Liang of Beijing (CN)

Terry James Hoffman of Austin TX (US)

Su Liu of Austin TX (US)

PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240233703, titled 'PROVIDING A REPOSITORY OF AUDIO FILES HAVING PRONUNCIATIONS FOR TEXT STRINGS TO PROVIDE TO A SPEECH SYNTHESIZER'.

The patent application describes a computer program product, system, and method for managing a repository of audio files containing pronunciations for text strings to be used by a speech synthesizer.

  • The repository includes data structures for text strings found in documents.
  • Each data structure for a text string includes attributes for how the text is presented in the document and audio files with pronunciations for the text.
  • When a search text string and attribute are received from the speech synthesizer, the system identifies a matching data structure in the repository.
  • The system then retrieves the corresponding audio file for the search text string and attribute to be output by the speech synthesizer.
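
The lookup described in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation claimed in the application: the class names, the `find_audio` method, the attribute labels such as `context:name`, and the audio file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TextStringEntry:
    """One repository record: a text string, its presentation attributes, and audio files."""
    text: str
    # Hypothetical attribute labels describing how the text appears in a document.
    attributes: frozenset[str]
    # Identifiers (here, file names) of audio files holding pronunciations.
    audio_files: list[str] = field(default_factory=list)


class PronunciationRepository:
    """Hypothetical in-memory repository of text-string entries."""

    def __init__(self) -> None:
        self._entries: list[TextStringEntry] = []

    def add(self, entry: TextStringEntry) -> None:
        self._entries.append(entry)

    def find_audio(self, search_text: str, search_attribute: str) -> Optional[str]:
        """Return an audio file from the first entry matching both text and attribute."""
        for entry in self._entries:
            matches = entry.text == search_text and search_attribute in entry.attributes
            if matches and entry.audio_files:
                return entry.audio_files[0]
        return None  # no matching data structure in the repository


repo = PronunciationRepository()
repo.add(TextStringEntry("Dr.", frozenset({"context:name"}), ["doctor.wav"]))
repo.add(TextStringEntry("Dr.", frozenset({"context:address"}), ["drive.wav"]))

print(repo.find_audio("Dr.", "context:name"))
print(repo.find_audio("Dr.", "context:address"))
print(repo.find_audio("Dr.", "context:heading"))
```

In this sketch the same text string maps to different audio files depending on the attribute supplied with the search, which is the behavior the summary attributes to the repository.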

Potential Applications:

  • Language learning applications
  • Accessibility tools for individuals with speech impairments
  • Automated transcription services

Problems Solved:

  • Providing accurate pronunciations for text strings in documents
  • Enhancing the capabilities of speech synthesizers

Benefits:

  • Improved accessibility for users with speech impairments
  • Enhanced language learning experiences
  • Efficient transcription and speech synthesis processes

Commercial Applications:

Title: "Enhanced Speech Synthesis Repository System"

This technology could be utilized in educational software, transcription services, and communication devices for individuals with disabilities. The market implications include increased efficiency in speech synthesis processes and improved accessibility for users.

Questions about the technology:

  1. How does this technology improve the accuracy of pronunciations for text strings?
  2. What are the potential limitations of using this system in real-time speech synthesis applications?


Original Abstract Submitted

Provided are a computer program product, system, and method for providing a repository of audio files having pronunciations for text strings to provide to a speech synthesizer. The repository has data structures for text strings in documents. A data structure for a text string indicates at least one attribute of a presentation of the text string in the document and at least one audio file providing at least one audio pronunciation of the text string. A search text string and a search attribute are received from the speech synthesizer. A determination is made of a data structure in the repository including a text string and an attribute matching the search text string and the search attribute, respectively. An audio file, indicated in the determined data structure, is returned to the speech synthesizer to output for the search text string in a document being processed by the speech synthesizer.
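
Seen from the speech synthesizer's side, the flow in the abstract might look something like the sketch below. It is illustrative only: the `Token` type, the `AUDIO_INDEX` dictionary keyed by (text string, attribute), the attribute labels, and the file names are hypothetical. The fallback to ordinary synthesis when nothing matches is also an assumption; the abstract does not say what happens in that case.

```python
from typing import NamedTuple


class Token(NamedTuple):
    """A text string from a document plus one hypothetical presentation attribute."""
    text: str
    attribute: str


# Hypothetical repository index: (text string, attribute) -> audio file name.
AUDIO_INDEX: dict[tuple[str, str], str] = {
    ("Dr.", "context:name"): "doctor.wav",
    ("Dr.", "context:address"): "drive.wav",
}


def plan_output(tokens: list[Token]) -> list[tuple[str, str]]:
    """Decide, per token, whether to play a stored audio file or synthesize the text."""
    plan: list[tuple[str, str]] = []
    for token in tokens:
        # Search by text string and attribute together, as the abstract describes.
        audio = AUDIO_INDEX.get((token.text, token.attribute))
        if audio is not None:
            plan.append(("play_audio", audio))
        else:
            # Assumed fallback: let the synthesizer pronounce the text itself.
            plan.append(("synthesize", token.text))
    return plan


document = [
    Token("Dr.", "context:name"),
    Token("Smith", "context:name"),
    Token("Dr.", "context:address"),
]
print(plan_output(document))
```

Keying a dictionary on the (text, attribute) pair gives a constant-time exact match, which is one plausible design when each search carries a single attribute; matching on several attributes at once would need a different structure.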