18479108. METHOD OF GENERATING LANGUAGE FEATURE EXTRACTION MODEL, INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM simplified abstract (FUJIFILM CORPORATION)

METHOD OF GENERATING LANGUAGE FEATURE EXTRACTION MODEL, INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM

Organization Name

FUJIFILM CORPORATION

Inventor(s)

Akimichi Ichinose of Tokyo (JP)

METHOD OF GENERATING LANGUAGE FEATURE EXTRACTION MODEL, INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18479108, titled 'METHOD OF GENERATING LANGUAGE FEATURE EXTRACTION MODEL, INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM'.

Simplified Explanation

The abstract describes a method of generating a language feature extraction model, which lets a computer extract a feature from text related to an image. In simplified terms:

  • A system performs machine learning on training data that contains an image, position information for a region of interest in the image, and a text describing that region of interest.
  • The text is input into the language feature extraction model (the first model), which outputs a feature amount.
  • The image and the feature amount are then input into a second model, which estimates the region of interest.
  • The first and second models are trained so that the estimated region of interest matches the correct answer indicated by the position information.
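
The steps above can be illustrated with a minimal sketch in pure Python. Everything in it is an assumption made for illustration and does not come from the application: the toy vocabulary, the one-dimensional "images" made of four cells, the bilinear scoring, the cross-entropy loss, and all function names are hypothetical. The abstract does not say what architectures or loss the actual models use.

```python
import math
import random

VOCAB = ["bright", "dark", "spot"]  # hypothetical toy vocabulary
DIM = 4                             # length of the text feature amount
CELL = 3                            # per-cell descriptor: [bright, dark, bias]

def bag_of_words(text):
    words = text.lower().split()
    return [float(w in words) for w in VOCAB]

def make_models(seed=0):
    rng = random.Random(seed)
    first = [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in VOCAB]
    second = [[rng.uniform(-0.1, 0.1) for _ in range(CELL)] for _ in range(DIM)]
    return first, second

def extract_feature(first, text):
    """First model (assumed linear): text -> feature amount."""
    b = bag_of_words(text)
    return [sum(b[w] * first[w][j] for w in range(len(VOCAB))) for j in range(DIM)]

def project(second, cell, j):
    return sum(second[j][k] * cell[k] for k in range(CELL))

def estimate_region(second, image, feature):
    """Second model (assumed bilinear): image + feature -> per-cell ROI probability."""
    probs = []
    for cell in image:
        score = sum(feature[j] * project(second, cell, j) for j in range(DIM))
        probs.append(1.0 / (1.0 + math.exp(-score)))
    return probs

def train_step(first, second, image, mask, text, lr=0.1):
    """One gradient step that updates both models from a single example."""
    b = bag_of_words(text)
    feature = extract_feature(first, text)
    probs = estimate_region(second, image, feature)
    # Cross-entropy between the estimated region and the correct-answer mask.
    loss = -sum(math.log(max(p if y else 1.0 - p, 1e-12)) for p, y in zip(probs, mask))
    grad_feature = [0.0] * DIM
    grad_second = [[0.0] * CELL for _ in range(DIM)]
    for cell, p, y in zip(image, probs, mask):
        err = p - y  # derivative of the loss with respect to the cell score
        for j in range(DIM):
            grad_feature[j] += err * project(second, cell, j)
            for k in range(CELL):
                grad_second[j][k] += err * feature[j] * cell[k]
    for j in range(DIM):
        for k in range(CELL):
            second[j][k] -= lr * grad_second[j][k]
        for w in range(len(VOCAB)):
            first[w][j] -= lr * b[w] * grad_feature[j]
    return loss

BRIGHT, DARK, BG = [1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]
# Each example: (image, position information as a 0/1 mask, text).
DATA = [
    ([BRIGHT, DARK, BG, BG], [1, 0, 0, 0], "bright spot"),
    ([BRIGHT, DARK, BG, BG], [0, 1, 0, 0], "dark spot"),
    ([BG, DARK, BRIGHT, BG], [0, 0, 1, 0], "bright spot"),
    ([BG, DARK, BRIGHT, BG], [0, 1, 0, 0], "dark spot"),
]

first, second = make_models()
losses = []
for epoch in range(500):
    losses.append(sum(train_step(first, second, img, mask, txt) for img, mask, txt in DATA))
```

After training, `estimate_region(second, image, extract_feature(first, text))` returns one probability per cell, and the same image can yield different regions depending on the text. Because the only supervision is the region of interest, the text features end up shaped by how useful they are for localization, which is the idea the bullets describe.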

Potential Applications

This technology could be applied in image recognition systems, natural language processing, and content-based image retrieval.

Problems Solved

This technology helps extract relevant features from text related to images, which can improve the accuracy of image analysis and understanding.

Benefits

The method enhances the efficiency of extracting features from text and improves the overall performance of image processing systems.

Potential Commercial Applications

This technology could be utilized in industries such as e-commerce, healthcare, security, and social media for image analysis and content recommendation.

Possible Prior Art

Possible prior art includes the separate use of machine learning models for image recognition and for text analysis, without the integration described in this method.

Unanswered Questions

How does this method handle complex images with multiple regions of interest?

The abstract does not specify how the system deals with images containing multiple regions of interest. Further details on how the models are trained to handle such scenarios would be beneficial.

What is the computational cost of training and using the language feature extraction model?

The abstract does not mention the computational resources required for training and utilizing the language feature extraction model. Understanding the computational cost would be essential for practical implementation and scalability.


Original Abstract Submitted

A method of generating a language feature extraction model that causes a computer to extract a feature from a text related to an image, includes that a system performs machine learning using training data including a first image, first position information related to a region of interest in the first image, and a first text that describes the region of interest to input the first text into a first model, which is the language feature extraction model, to cause the first model to output a first feature amount, input the first image and the first feature amount into a second model to cause the second model to estimate the region of interest, and train the first model and the second model such that an estimated region of interest output from the second model matches the region of interest of a correct answer indicated by the first position information.
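
Read as an optimization problem, the training procedure in the abstract could be written roughly as follows. The notation is an assumption introduced here, not taken from the application: $x$ is the first image, $t$ the first text, $y$ the region of interest given by the first position information, $E_\theta$ the first model (the language feature extraction model), $G_\phi$ the second model, and $\mathcal{L}$ an unspecified loss that measures the mismatch between the estimated and correct regions.

```latex
\begin{aligned}
f &= E_\theta(t) && \text{(first feature amount)} \\
\hat{y} &= G_\phi(x, f) && \text{(estimated region of interest)} \\
(\theta^{*}, \phi^{*}) &= \arg\min_{\theta,\,\phi} \sum_{(x,\,y,\,t)} \mathcal{L}\bigl(G_\phi(x, E_\theta(t)),\, y\bigr)
\end{aligned}
```

Both parameter sets are updated against the same loss, so $E_\theta$ receives its training signal only through how well $G_\phi$ localizes the region.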