17973314. SYSTEM AND METHOD FOR MULTI MODAL INPUT AND EDITING ON A HUMAN MACHINE INTERFACE simplified abstract (Robert Bosch GmbH)


SYSTEM AND METHOD FOR MULTI MODAL INPUT AND EDITING ON A HUMAN MACHINE INTERFACE

Organization Name

Robert Bosch GmbH

Inventor(s)

Zhengyu Zhou of Fremont, CA (US)

Jiajing Guo of Mountain View, CA (US)

Nan Tian of Foster City, CA (US)

Nicholas Feffer of Stanford, CA (US)

William Ma of Lagrangeville, NY (US)

SYSTEM AND METHOD FOR MULTI MODAL INPUT AND EDITING ON A HUMAN MACHINE INTERFACE - A simplified explanation of the abstract

This abstract first appeared for US patent application 17973314, titled 'SYSTEM AND METHOD FOR MULTI MODAL INPUT AND EDITING ON A HUMAN MACHINE INTERFACE'.

Simplified Explanation

The patent application describes a virtual reality apparatus that combines a display, a microphone for voice commands, and an eye gaze sensor with a processor that coordinates text input and editing (a sketch of the interaction loop follows the list below).

  • The display outputs information related to the user interface.
  • The microphone receives spoken word commands once a voice recognition session is activated.
  • The eye gaze sensor tracks the user's eye movement.
  • The processor outputs the words of a text field, emphasizes a group of words when the user's gaze exceeds a threshold time, toggles through the words of that group only, highlights and edits a selected word, and outputs suggested words from a language model using contextual information associated with the group.
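
A minimal sketch of how such a gaze-plus-voice editing loop might be structured, assuming a dwell-time gaze trigger; every class, method, and parameter name below (GazeEditSession, dwell_threshold_s, and so on) is illustrative and not taken from the application:

```python
import time
from dataclasses import dataclass, field


@dataclass
class GazeEditSession:
    """Illustrative gaze-plus-voice editing state, not from the application."""

    words: list[str]                     # words currently shown in the text field
    dwell_threshold_s: float = 0.8       # assumed dwell time before a group is emphasized
    emphasized: list[int] = field(default_factory=list)  # indices of the emphasized group
    cursor: int = 0                      # highlight position within the group
    _dwell_target: tuple[int, ...] = ()  # group the gaze is currently resting on
    _dwell_start: float = 0.0

    def on_gaze(self, word_indices: list[int], now: float | None = None) -> None:
        """Emphasize a group once the gaze has rested on it past the threshold."""
        now = time.monotonic() if now is None else now
        target = tuple(word_indices)
        if target != self._dwell_target:
            self._dwell_target, self._dwell_start = target, now
        elif now - self._dwell_start >= self.dwell_threshold_s:
            self.emphasized, self.cursor = list(target), 0

    def toggle(self) -> str:
        """Cycle the highlight through the words of the emphasized group only."""
        if not self.emphasized:
            raise RuntimeError("no group has been emphasized yet")
        self.cursor = (self.cursor + 1) % len(self.emphasized)
        return self.words[self.emphasized[self.cursor]]

    def edit(self, replacement: str) -> None:
        """Replace the highlighted word, e.g. with a recognized spoken word."""
        self.words[self.emphasized[self.cursor]] = replacement


# Example: the gaze rests on the words "teh cat" long enough to emphasize them,
# then a voice command corrects the highlighted word.
session = GazeEditSession(words=["teh", "cat", "sat"])
session.on_gaze([0, 1], now=0.0)
session.on_gaze([0, 1], now=1.0)   # past the 0.8 s threshold, group emphasized
session.edit("the")                # cursor starts on the first word of the group
print(session.words)               # -> ['the', 'cat', 'sat']
```

Tracking the dwell target separately from the emphasized group means a passing glance resets the timer rather than changing the selection.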

Potential Applications

This technology could be applied in virtual reality gaming, virtual meetings and collaboration, language learning applications, and accessibility tools for individuals with disabilities.

Problems Solved

This technology addresses the difficulty of efficient text input and editing in virtual reality environments: it combines voice commands with eye tracking to streamline interaction and provides context-based word suggestions during text input.

Benefits

The benefits of this technology include improved user interaction in virtual reality, increased productivity in text input tasks, enhanced accessibility for users with disabilities, and a more immersive and intuitive virtual reality experience.

Potential Commercial Applications

Potential commercial applications of this technology include virtual reality software development, virtual reality hardware manufacturing, language learning platforms, accessibility software for individuals with disabilities, and virtual reality training simulations.

Possible Prior Art

Possible prior art includes existing virtual reality text input systems that use voice commands and eye tracking but lack the group-level word toggling and contextual word suggestions described in this patent application.

Unanswered Questions

How does the processor determine the threshold time that a user's gaze must exceed before words are emphasized?

The patent application does not specify how the processor determines this threshold time.

What types of contextual information does the language model utilize to suggest words?

The patent application does not elaborate on the specific types of contextual information used by the language model to suggest words for text input.
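
The application leaves this open. As a purely hypothetical illustration, a suggester could rank replacement candidates from the words adjacent to the emphasized group; the bigram table below stands in for whatever language model and contextual signals the application actually contemplates:

```python
from collections import Counter


def suggest_words(words: list[str], group: list[int],
                  bigram_counts: dict[tuple[str, str], int],
                  top_k: int = 3) -> list[str]:
    """Hypothetical contextual suggester (not from the application):
    rank candidates for the emphasized group by how often they follow
    the word immediately to the group's left."""
    left = words[group[0] - 1] if group[0] > 0 else "<s>"  # "<s>" marks start of text
    candidates = Counter()
    for (prev, cur), count in bigram_counts.items():
        if prev == left:
            candidates[cur] += count
    return [word for word, _ in candidates.most_common(top_k)]


# Example: suggest replacements for the garbled word after "turn".
counts = {("turn", "left"): 9, ("turn", "right"): 7, ("turn", "off"): 2}
print(suggest_words(["please", "turn", "lft"], group=[2], bigram_counts=counts))
# -> ['left', 'right', 'off']
```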


Original Abstract Submitted

A virtual reality apparatus that includes a display configured to output information related to a user interface of the virtual reality device, a microphone configured to receive one or more spoken word commands from a user upon activation of a voice recognition session, an eye gaze sensor configured to track eye movement of the user, and a processor programmed to, in response to a first input, output one or more words of a text field, in response to an eye gaze of the user exceeding a threshold time, emphasize a group of one or more words of the text field, toggle through a plurality of words of only the group utilizing the input interface, in response to a second input, highlight and edit an edited word from the group, and in response to utilizing contextual information associated with the group a language model, outputting one or more suggested words.