18392369. ELECTRONIC APPARATUS AND CONTROL METHOD THEREOF simplified abstract (Samsung Electronics Co., Ltd.)

ELECTRONIC APPARATUS AND CONTROL METHOD THEREOF

Organization Name

Samsung Electronics Co., Ltd.

Inventor(s)

Sichen Jin of Suwon-si (KR)

Kwangyoun Kim of Seoul (KR)

Sungsoo Kim of Suwon-si (KR)

Junmo Park of Suwon-si (KR)

Dhairya Sandhyana of Suwon-si (KR)

Changwoo Han of Suwon-si (KR)

ELECTRONIC APPARATUS AND CONTROL METHOD THEREOF - A simplified explanation of the abstract

This abstract first appeared for US patent application 18392369 titled 'ELECTRONIC APPARATUS AND CONTROL METHOD THEREOF'.

Simplified Explanation

The electronic apparatus described in the patent application extracts objects and characters from image data, identifies them, generates a bias keyword list from the identified object names and characters, converts speech data to text using that list together with a language contextual model, and displays the converted text as a caption. Its main components are listed below; a sketch of the pipeline follows the list.

  • A communication interface that receives content comprising image data and speech data
  • A memory that stores a language contextual model trained on the relevance between words
  • A display that shows the converted text
  • A processor that extracts objects and characters, identifies object names, generates the bias keyword list, converts the speech data to text, and controls the display
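
The patent describes the apparatus at this architectural level without naming particular models or libraries. The Python sketch below is a rough illustration only: detect_objects, recognize_characters, and speech_to_text are hypothetical stand-ins for the detector, OCR engine, and recognizer, not Samsung's implementation.

```python
# A minimal, hypothetical sketch of the captioning pipeline described above.
# detect_objects, recognize_characters, and speech_to_text are stand-ins;
# the patent does not specify particular models or libraries.

from dataclasses import dataclass


@dataclass
class Content:
    image_data: bytes   # received over the communication interface
    speech_data: bytes


def detect_objects(image_data: bytes) -> list[str]:
    """Hypothetical object detector: returns names of objects in the image."""
    return ["guitar", "microphone"]


def recognize_characters(image_data: bytes) -> list[str]:
    """Hypothetical OCR step: returns character strings found in the image."""
    return ["Live Concert 2024"]


def build_bias_keyword_list(object_names: list[str], characters: list[str]) -> list[str]:
    """Merge identified object names and characters into image-related keywords."""
    keywords = {name.lower() for name in object_names}
    keywords.update(word.lower() for text in characters for word in text.split())
    return sorted(keywords)


def speech_to_text(speech_data: bytes, bias_keywords: list[str]) -> str:
    """Hypothetical recognizer: the bias keywords would steer decoding via the
    stored language contextual model (mechanism not disclosed in the patent)."""
    return "Welcome to the live concert, please enjoy the guitar solo."


def caption(content: Content) -> str:
    objects = detect_objects(content.image_data)
    characters = recognize_characters(content.image_data)
    bias_keywords = build_bias_keyword_list(objects, characters)
    return speech_to_text(content.speech_data, bias_keywords)


if __name__ == "__main__":
    print(caption(Content(image_data=b"...", speech_data=b"...")))
```

The design point worth noting is the data flow: keywords derived from the image are fed into the speech recognizer, steering it toward vocabulary likely to occur in the accompanying audio.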

Potential Applications

This technology could be applied in:

  • Automatic captioning for images and videos
  • Language translation services
  • Assistive technologies for individuals with hearing impairments

Problems Solved

This technology addresses the following issues:

  • Improving accessibility to visual content for individuals with hearing impairments
  • Enhancing user experience by providing accurate and relevant captions for images and videos

Benefits

The benefits of this technology include:

  • Increased accessibility to multimedia content
  • Improved user engagement with visual media
  • Enhanced communication for individuals with hearing impairments

Potential Commercial Applications

Potential commercial applications of this technology include:

  • Integration into social media platforms for automatic captioning of user-generated content
  • Inclusion in smart devices for real-time translation and captioning services

Possible Prior Art

Possible prior art includes existing speech-to-text and image-recognition systems already used for captioning and translation services.

Unanswered Questions

How does the accuracy of the object and character identification impact the overall performance of the system?

The accuracy of object and character identification is crucial for generating a relevant bias keyword list and, in turn, for converting speech data accurately. Higher identification accuracy leads to more precise captions and a better user experience; a misidentified object or misread character can bias the recognizer toward words that never occur in the audio.
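As one illustration of that sensitivity, the sketch below filters detections by confidence before they enter the bias keyword list. The Detection structure and the 0.8 threshold are assumptions for this example, not part of the patent.

```python
# Hypothetical confidence filtering before keywords enter the bias list.
# The Detection structure and threshold are assumptions for illustration.

from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float  # 0.0 .. 1.0 from the detector or OCR engine


def filter_detections(detections: list[Detection], threshold: float = 0.8) -> list[str]:
    """Keep only confident identifications: a low-confidence label that slips
    through would bias the recognizer toward words absent from the audio."""
    return [d.label for d in detections if d.confidence >= threshold]


if __name__ == "__main__":
    detections = [
        Detection("guitar", 0.95),  # correct identification, kept
        Detection("rifle", 0.41),   # likely misidentification, dropped
    ]
    print(filter_detections(detections))  # ['guitar']
```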

What are the limitations of the language contextual model in handling different languages and dialects?

The language contextual model may have limitations in handling languages and dialects that are not well-represented in the training data. This could affect the accuracy of speech-to-text conversion and the relevance of the generated captions.


Original Abstract Submitted

An electronic apparatus and a control method thereof are provided. The electronic apparatus includes a communication interface configured to receive content comprising image data and speech data; a memory configured to store a language contextual model trained with relevance between words; a display; and a processor configured to: extract an object and a character included in the image data, identify an object name of the object and the character, generate a bias keyword list comprising an image-related word that is associated with the image data, based on the identified object name and the identified character, convert the speech data to a text based on the bias keyword list and the language contextual model, and control the display to display the text that is converted from the speech data, as a caption.
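
The abstract does not disclose how the bias keyword list and the language contextual model interact during conversion. A common technique consistent with the description is shallow-fusion-style keyword boosting, sketched below under that assumption; the rescore function, the boost value, and the candidate scores are illustrative inventions, not the patented method.

```python
# A hedged illustration of how a bias keyword list might influence decoding.
# Shallow-fusion-style keyword boosting is assumed here; the patent does not
# disclose the actual mechanism. All scores below are made up.


def rescore(hypotheses: list[tuple[str, float]],
            bias_keywords: set[str],
            boost: float = 2.0) -> list[tuple[str, float]]:
    """Add a log-probability bonus to hypotheses containing bias keywords."""
    rescored = []
    for text, log_prob in hypotheses:
        words = set(text.lower().split())
        bonus = boost * len(words & bias_keywords)
        rescored.append((text, log_prob + bonus))
    return sorted(rescored, key=lambda h: h[1], reverse=True)


if __name__ == "__main__":
    # Candidate transcriptions with fabricated acoustic/LM log-probabilities.
    candidates = [
        ("please enjoy the guitar solo", -12.0),
        ("please enjoy the guitars olo", -11.5),
    ]
    biased = rescore(candidates, bias_keywords={"guitar", "solo"})
    print(biased[0][0])  # "please enjoy the guitar solo"
```

In this toy run, the keyword bonus for "guitar" and "solo" lifts the correctly segmented hypothesis above an acoustically similar competitor, which is the kind of correction image-derived keywords could provide.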