18179177. PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS simplified abstract (Adobe Inc.)

PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS

Organization Name

Adobe Inc.

Inventor(s)

Seunghyun Yoon of San Jose CA (US)

Trung Bui of San Jose CA (US)

PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18179177, titled 'PERTURBATION ROBUST METRIC FOR EVALUATING IMAGE CAPTIONS'.

Simplified Explanation: The patent application describes a method for training an image caption evaluation system. The system receives a training image, a ground truth caption for that image, and a perturbed caption, which is a modified version of the ground truth caption. A visual encoder generates an embedding of the image, and a perturbation-aware text encoder generates one embedding for each of the two captions. The system then computes losses between the three embeddings and trains the perturbation-aware text encoder based on those losses.

  • **Key Features and Innovation:**
   - Training system for image caption evaluation
   - Utilizes visual and text embeddings
   - Perturbation-aware text encoder
   - Loss computation for training
  • **Potential Applications:**
   - Image caption evaluation systems
   - Natural language processing
   - Computer vision applications
  • **Problems Solved:**
   - Improving image caption evaluation accuracy
   - Handling perturbed image captions
   - Enhancing training of text encoders
  • **Benefits:**
   - More accurate image caption evaluations
   - Better understanding of perturbed captions
   - Enhanced training methods for text encoders
  • **Commercial Applications:**
   - Enhanced image captioning systems for businesses
   - Improved natural language processing tools for companies
   - Potential applications in social media platforms for image analysis
  • **Prior Art:**
   Prior art related to this technology may include research on image caption evaluation systems, text encoders, and perturbation-aware models in natural language processing.
  • **Frequently Updated Research:**
   Ongoing research in the field of image caption evaluation, natural language processing, and computer vision may provide further insights into the development of similar systems.

Questions about the Image Caption Evaluation System:

1. *How does the system handle perturbed image captions in training?*

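  - During training, the perturbation-aware text encoder generates one embedding for the ground truth caption and another for the perturbed caption. Losses computed between these two text embeddings and the visual embedding of the training image are then used to train the text encoder.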

2. *What are the potential real-world applications of this technology?*

  - The technology can be applied in image caption evaluation systems, natural language processing tasks, and various computer vision applications.
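
The first answer above depends on perturbed captions, which the abstract defines only as modifications to the ground truth caption. As a rough illustration, the sketch below shows one way such captions could be produced. The three perturbation kinds (swap, delete, replace), the function names, and the `substitutes` table are all assumptions made for this example; the abstract does not say how perturbations are generated.

```python
import random


def swap_words(tokens, rng):
    """Swap one randomly chosen pair of adjacent tokens."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens) - 1)
    out = list(tokens)
    out[i], out[i + 1] = out[i + 1], out[i]
    return out


def delete_word(tokens, rng):
    """Drop one randomly chosen token (never empties the caption)."""
    if len(tokens) < 2:
        return list(tokens)
    i = rng.randrange(len(tokens))
    return tokens[:i] + tokens[i + 1:]


def replace_word(tokens, rng, substitutes):
    """Replace one token that has an entry in the substitutes table."""
    candidates = [i for i, tok in enumerate(tokens) if tok in substitutes]
    if not candidates:
        return list(tokens)
    i = rng.choice(candidates)
    out = list(tokens)
    out[i] = substitutes[out[i]]
    return out


def perturb_caption(caption, kind, seed=0, substitutes=None):
    """Return a modified copy of a ground truth caption.

    `kind` is one of "swap", "delete", or "replace". These categories are
    hypothetical; the patent abstract does not enumerate perturbation types.
    """
    rng = random.Random(seed)  # seeded so the example is reproducible
    tokens = caption.split()
    if kind == "swap":
        tokens = swap_words(tokens, rng)
    elif kind == "delete":
        tokens = delete_word(tokens, rng)
    elif kind == "replace":
        tokens = replace_word(tokens, rng, substitutes or {})
    else:
        raise ValueError(f"unknown perturbation kind: {kind}")
    return " ".join(tokens)


ground_truth = "a dog runs on grass"
perturbed = perturb_caption(ground_truth, "replace", substitutes={"dog": "cat"})
```

Here `perturbed` differs from `ground_truth` in a single word, which is the kind of small change a perturbation-robust metric is meant to notice.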


Original Abstract Submitted

Embodiments are disclosed for training an image caption evaluation system to perform evaluations of image captions. In particular, in one or more embodiments, the disclosed systems and methods comprise receiving a training image, a ground truth image caption for the training image, and a perturbed image caption for the training image, where the perturbed image caption includes modifications to the ground truth image caption. The disclosed systems and methods further comprise generating, by a visual encoder, a visual embedding representation of the training image and generating, by a perturbation-aware text encoder, a first text embedding for the ground truth image caption and a second text embedding for the perturbed image caption. The disclosed systems and methods further comprise computing losses between the visual embedding, the first text embedding, and the second text embedding and training the perturbation-aware text encoder based on the computed losses.
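
The loss computation described in the abstract can be sketched as follows. The abstract does not state which loss functions are used, so the two terms here (a cosine alignment term and a margin ranking term), the `margin` value, and all function names are assumptions chosen for illustration. The embeddings are plain lists of floats standing in for encoder outputs; a real system would obtain them from the visual encoder and the perturbation-aware text encoder and backpropagate the total loss into the text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def perturbation_losses(visual, text_gt, text_pert, margin=0.2):
    """Compute illustrative losses over the three embeddings.

    visual    : embedding of the training image
    text_gt   : embedding of the ground truth caption
    text_pert : embedding of the perturbed caption
    The loss forms are assumed, not taken from the patent application.
    """
    sim_gt = cosine(visual, text_gt)
    sim_pert = cosine(visual, text_pert)

    # Alignment term: pull the ground truth caption toward the image.
    align = 1.0 - sim_gt

    # Ranking term: the ground truth caption should match the image
    # better than the perturbed caption by at least `margin`.
    rank = max(0.0, margin - (sim_gt - sim_pert))

    return {"align": align, "rank": rank, "total": align + rank}


# Toy embeddings: the perturbed caption points away from the image.
losses_good = perturbation_losses([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])

# An encoder that ignores the perturbation maps both captions to the
# same embedding, so the ranking term is still active.
losses_blind = perturbation_losses([1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
```

In this toy setup the ranking term is zero only when the text encoder separates the perturbed caption from the ground truth caption, which is the behavior the training is intended to encourage.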