18683786. SPEECH SYNTHESIS APPARATUS, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM simplified abstract (NIPPON TELEGRAPH AND TELEPHONE CORPORATION)

From WikiPatents
Revision as of 06:09, 18 October 2024 by Wikipatents (talk | contribs) (Creating a new page)

SPEECH SYNTHESIS APPARATUS, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM

Organization Name

NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventor(s)

Yusuke Ijima of Musashino-shi, Tokyo (JP)

Tomoki Koriyama of Tokyo (JP)

Shinnosuke Takamichi of Tokyo (JP)

SPEECH SYNTHESIS APPARATUS, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18683786, titled 'SPEECH SYNTHESIS APPARATUS, SPEECH SYNTHESIS METHOD, AND SPEECH SYNTHESIS PROGRAM'.

The speech synthesis apparatus described in the abstract includes a memory and a processor that work together to generate a speech synthesis model for reading out a text associated with an image.

  • The processor obtains utterance information on subjects to be uttered, which are texts contained in data on a book.
  • It also acquires image information on images contained in the data on the book.
  • Additionally, the processor obtains speech data corresponding to the subjects to be uttered.
  • Based on the obtained utterance information, image information, and speech data, the processor generates a speech synthesis model for reading out a text associated with an image.
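
The data flow in the steps above can be sketched in Python. This is a minimal illustration, not the applicant's implementation: the abstract does not say how texts, images, and speech are linked or what kind of model is trained, so the names `Utterance`, `TrainingExample`, and `build_training_examples`, the id-based lookup, and the feature-vector representation are all assumptions made for the example. Only the assembly of training inputs is shown; model generation itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Utterance:
    """One text to be uttered, taken from the book data (hypothetical layout)."""
    utterance_id: str
    text: str
    image_id: str  # assumed link from a text to its associated image


@dataclass
class TrainingExample:
    """One (text, image, speech) triple that a model could be trained on."""
    text: str
    image_features: Sequence[float]
    speech: Sequence[float]


def build_training_examples(
    utterances: List[Utterance],
    image_features: Dict[str, Sequence[float]],
    speech_data: Dict[str, Sequence[float]],
) -> List[TrainingExample]:
    """Join utterance, image, and speech information into training triples.

    Utterances with no matching image features or no recorded speech are
    skipped, since all three inputs are needed for each example.
    """
    examples: List[TrainingExample] = []
    for utt in utterances:
        feats = image_features.get(utt.image_id)
        wave = speech_data.get(utt.utterance_id)
        if feats is None or wave is None:
            continue
        examples.append(TrainingExample(utt.text, feats, wave))
    return examples


# Toy data: two texts, but only the first has recorded speech.
utterances = [
    Utterance("u1", "The fox ran into the forest.", "img1"),
    Utterance("u2", "It was getting dark.", "img1"),
]
image_features = {"img1": [0.1, 0.7, 0.2]}
speech_data = {"u1": [0.0, 0.01, -0.02]}

examples = build_training_examples(utterances, image_features, speech_data)
print(len(examples), examples[0].text)
```

A real system would pass triples like these to a training routine; the choice of image features and model architecture is outside what the abstract describes.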

Potential Applications:

  • This technology can be used in e-book readers to provide a more interactive and engaging reading experience.
  • It can also be utilized in educational settings to assist students in learning and comprehension.
  • The speech synthesis model can be integrated into virtual assistants to enhance their capabilities in reading out information.

Problems Solved:

  • May enhance accessibility for visually impaired individuals by reading aloud the text associated with images in books.
  • Improves the overall user experience by combining text, images, and synthesized speech.

Benefits:

  • Increases the inclusivity of digital content by reading out text associated with images.
  • Enhances the learning experience by offering a multi-modal approach to reading.
  • Extends virtual assistants by enabling them to read out text associated with images.

Commercial Applications:

Title: Innovative Speech Synthesis Technology for Enhanced Reading Experiences

This technology can be commercialized in e-book readers, educational software, and virtual assistant devices to provide users with a more immersive and interactive experience while consuming digital content.

Questions about Speech Synthesis Technology:

1. How does this technology improve accessibility for visually impaired individuals?

  • By reading aloud the text associated with images in books, it can make illustrated content more inclusive and engaging for visually impaired users.

2. What are the potential educational applications of this speech synthesis technology?

  • It can be used in educational settings to assist students in learning and comprehension by offering a multi-modal approach to reading.


Original Abstract Submitted

A speech synthesis apparatus according to the present disclosure includes a memory and a processor coupled to the memory. The processor is configured to: obtain utterance information on subjects to be uttered, wherein the subjects to be uttered are texts contained in data on a book, obtain image information on images that are contained in the data on the book, obtain speech data corresponding to the subjects to be uttered; and generate, based on the obtained utterance information, the obtained image information, and the obtained speech data, a speech synthesis model for reading out a text associated with an image.