US Patent Application 17737535. Compositional Action Machine Learning Mechanisms simplified abstract

From WikiPatents

Compositional Action Machine Learning Mechanisms

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION


Inventor(s)

Bo Wu of Cambridge MA (US)

Chuang Gan of Cambridge MA (US)

Pin-Yu Chen of White Plains NY (US)

Xin Zhang of Chappaqua NY (US)

Compositional Action Machine Learning Mechanisms - A simplified explanation of the abstract

This abstract first appeared for US patent application 17737535, titled 'Compositional Action Machine Learning Mechanisms'.

Simplified Explanation

The patent application describes mechanisms for training a machine learning action recognition model. Here are the key points:

  • The invention focuses on training a machine learning model for action recognition using an original input dataset.
  • The original input dataset is processed to generate a bank of object features for different objects.
  • When given an input video, the system generates a verb data structure and an original object data structure.
  • From the object feature bank, a candidate object feature data structure is selected for generating pseudo composition (PC) training data.
  • The PC training data combines the verb data structure with the selected candidate object feature data structure.
  • Each PC training sample represents a combination of an action and an object that was not represented in the original input dataset.
  • The machine learning model is then trained on these unseen combinations.
  • Because the object feature bank is built from the original dataset, the objects themselves are not new; what the model gains is exposure to action-object pairings that never appeared together in the original data.
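
The steps listed above can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the abstract does not say how features are extracted, how a candidate is selected, or how the verb and object structures are combined. The function names (`build_object_bank`, `make_pseudo_composition`), the dictionary keys, the toy feature vectors, random candidate selection, and the use of simple concatenation are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def build_object_bank(dataset):
    """Group object feature vectors by object label (hypothetical layout)."""
    bank = defaultdict(list)
    for sample in dataset:
        bank[sample["object"]].append(sample["object_feat"])
    return dict(bank)


def make_pseudo_composition(sample, bank, seen_pairs, rng):
    """Pair the sample's verb features with an object that this verb
    was never observed with in the original dataset."""
    verb = sample["verb"]
    candidates = sorted(
        obj for obj in bank
        if obj != sample["object"] and (verb, obj) not in seen_pairs
    )
    if not candidates:
        return None  # no unseen pairing is available for this verb
    obj = rng.choice(candidates)      # assumption: random selection
    obj_feat = rng.choice(bank[obj])
    return {
        "verb": verb,
        "object": obj,
        # Assumption: "combination" is modelled as list concatenation.
        "features": sample["verb_feat"] + obj_feat,
    }


# Toy stand-in for features extracted from videos (values are made up).
dataset = [
    {"verb": "open", "object": "door",
     "verb_feat": [0.9, 0.1], "object_feat": [1.0, 0.0]},
    {"verb": "close", "object": "box",
     "verb_feat": [0.2, 0.8], "object_feat": [0.0, 1.0]},
]

bank = build_object_bank(dataset)
seen_pairs = {(s["verb"], s["object"]) for s in dataset}
rng = random.Random(0)

pc_data = []
for s in dataset:
    pc = make_pseudo_composition(s, bank, seen_pairs, rng)
    if pc is not None:
        pc_data.append(pc)

for pc in pc_data:
    print(pc["verb"], pc["object"], pc["features"])
```

In this toy data, "open" was only seen with "door" and "close" only with "box", so the sketch produces the pairings "open box" and "close door". A real system would then feed such samples, alongside the original data, into training of the action recognition model.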


Original Abstract Submitted

Mechanisms are provided for performing machine learning (ML) training of a ML action recognition computer model which involves processing an original input dataset to generate an object feature bank comprising object feature data structures for a plurality of different objects. For an input video, a verb data structure and an original object data structure are generated and a candidate object feature data structure is selected from the object feature bank for generation of pseudo composition (PC) training data. The PC training data is generated based on the selected candidate object feature data structure and comprises a combination of the verb data structure and the candidate object feature data structure. The PC training data represents a combination of an action and an object not represented in the original input dataset. ML training of the ML action recognition computer model is performed based on an unseen combination comprising the PC training data.