18383765. DISPLAY APPARATUS THAT PROVIDES ANSWER TO QUESTION BASED ON IMAGE AND CONTROLLING METHOD THEREOF simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)
DISPLAY APPARATUS THAT PROVIDES ANSWER TO QUESTION BASED ON IMAGE AND CONTROLLING METHOD THEREOF
Organization Name
SAMSUNG ELECTRONICS CO., LTD.
Inventor(s)
Jakub Hoschiowics of Warszawa (PL)
Adrian Wisniewski of Warszawa (PL)
DISPLAY APPARATUS THAT PROVIDES ANSWER TO QUESTION BASED ON IMAGE AND CONTROLLING METHOD THEREOF - A simplified explanation of the abstract
This abstract first appeared for US patent application 18383765 titled 'DISPLAY APPARATUS THAT PROVIDES ANSWER TO QUESTION BASED ON IMAGE AND CONTROLLING METHOD THEREOF'.
Simplified Explanation
The patent application describes a display apparatus that responds to a user input, such as a question, received while a video is playing. It generates the response text from the video, the user input, and at least one passage obtained for them.
- The display apparatus includes a display and one or more processors.
- Users can input information while watching a video on the display.
- The apparatus obtains at least one passage by inputting information on the user input and the video to a first encoder.
- It then inputs the video, the information on the user input, and the obtained passage into a neural network model, which produces a text using a part of the tokens included in the passage.
- The generated text is output to the user.
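The steps above can be sketched in a few lines of Python. This is only an illustrative toy, not the method claimed in the application. The abstract does not say how the first encoder or the neural network model work, so both are replaced here with simple word-overlap stand-ins. The video is reduced to a text caption, and the passage corpus and all function names (`encode`, `retrieve_passages`, `generate_text`, `answer`) are hypothetical.

```python
import math
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def encode(text: str) -> dict[str, float]:
    """Hypothetical stand-in for the first encoder: an L2-normalised bag of words."""
    counts: dict[str, float] = {}
    for token in tokenize(text):
        counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {token: v / norm for token, v in counts.items()}


def similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two normalised sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def retrieve_passages(user_input: str, video_caption: str,
                      corpus: list[str], top_k: int = 1) -> list[str]:
    """Obtain passages by encoding the user input together with the video.

    Assumption: the video is represented by a text caption.
    """
    query = encode(f"{user_input} {video_caption}")
    ranked = sorted(corpus, key=lambda p: similarity(query, encode(p)), reverse=True)
    return ranked[:top_k]


def generate_text(user_input: str, video_caption: str, passages: list[str]) -> str:
    """Stand-in for the neural network model.

    Returns the single passage sentence that best matches the query, so the
    output is built from only a part of the tokens in the passages.
    """
    query = encode(f"{user_input} {video_caption}")
    sentences = [
        s.strip()
        for passage in passages
        for s in re.split(r"(?<=[.!?])\s+", passage)
        if s.strip()
    ]
    if not sentences:
        return ""
    return max(sentences, key=lambda s: similarity(query, encode(s)))


def answer(user_input: str, video_caption: str, corpus: list[str]) -> str:
    """Full toy pipeline: retrieve passages, then produce the output text."""
    passages = retrieve_passages(user_input, video_caption, corpus)
    return generate_text(user_input, video_caption, passages)


if __name__ == "__main__":
    corpus = [
        "The Eiffel Tower is in Paris. It was completed in 1889.",
        "Mount Everest is the highest mountain on Earth. It lies in the Himalayas.",
    ]
    print(answer("When was it completed?", "A view of the Eiffel Tower at night", corpus))
```

In this toy, the caption is what lets the ambiguous question "When was it completed?" resolve to the Eiffel Tower passage. A real system would presumably use learned encoders over video frames and a generative model in place of these stand-ins.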
---
Potential Applications
- Educational tools for generating text summaries of video content.
- Assistive technology for individuals with hearing impairments to understand video content.
Problems Solved
- Difficulty in summarizing video content quickly and accurately.
- Lack of tools for generating text from video content in real-time.
Benefits
- Improved accessibility to video content for a wider range of users.
- Enhanced learning experience through text summaries of video content.
Original Abstract Submitted
Disclosed is a display apparatus. The display apparatus includes a display; and one or more processors configured to, based on a user input being received while a video is provided through the display, obtain at least one passage by inputting information on the user input and the video to a first encoder, obtain a text by using a part of a plurality of tokens included in the at least one passage by inputting the video, the information on the user input, and the at least one passage into a neural network model, and output the obtained text.