International Business Machines Corporation (20240135227). GENERATING IN-DISTRIBUTION SAMPLES OF TIME-SERIES OR IMAGE DATA FOR THE NEIGHBORHOOD DISTRIBUTION simplified abstract

From WikiPatents

GENERATING IN-DISTRIBUTION SAMPLES OF TIME-SERIES OR IMAGE DATA FOR THE NEIGHBORHOOD DISTRIBUTION

Organization Name

International Business Machines Corporation

Inventor(s)

Natalia Martinez Gil of Durham NC (US)

Kanthi Sarpatwar of Elmsford NY (US)

Roman Vaculin of Larchmont NY (US)

GENERATING IN-DISTRIBUTION SAMPLES OF TIME-SERIES OR IMAGE DATA FOR THE NEIGHBORHOOD DISTRIBUTION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240135227 titled 'GENERATING IN-DISTRIBUTION SAMPLES OF TIME-SERIES OR IMAGE DATA FOR THE NEIGHBORHOOD DISTRIBUTION'.

Simplified Explanation

The patent application describes a method, system, and computer program product for generating in-distribution samples of data for a neighborhood distribution used by post-hoc local explanation methods. An autoencoder is trained to produce these samples from input data. Training maps the input data into a latent dimension, forming a first and a second latent code. A mixed code is obtained by convexly combining the two latent codes with a random coefficient. The mixed code is then decoded together with the input data masked by interpretable features to obtain conditional mixed reconstructions. Finally, adversarial training is performed against a discriminator to promote in-distribution samples: the reconstruction losses of the conditional mixed reconstructions and the discriminator losses are computed and minimized. A minimal code sketch of these steps follows the list below.

  • Autoencoder trained to generate in-distribution samples of input data
  • Mapping input data into a latent dimension to form latent codes
  • Obtaining a mixed code by combining latent codes with a random coefficient
  • Decoding mixed code with input data masked with interpretable features
  • Adversarial training against a discriminator to promote in-distribution samples
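The sketch below illustrates these steps in simplified form, assuming PyTorch and small fully connected networks. All class and function names (ConditionalAutoencoder, generator_step, and so on) are illustrative choices, not taken from the patent, and the reconstruction target is an assumption, since the simplified abstract does not specify it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalAutoencoder(nn.Module):
    """Encoder/decoder pair; the decoder is conditioned on the masked input."""
    def __init__(self, input_dim: int, latent_dim: int):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + input_dim, 128), nn.ReLU(), nn.Linear(128, input_dim))

    def forward(self, x1, x2, mask):
        z1 = self.encoder(x1)                    # first latent code
        z2 = self.encoder(x2)                    # second latent code
        alpha = torch.rand(x1.size(0), 1)        # random coefficient in [0, 1]
        z_mix = alpha * z1 + (1.0 - alpha) * z2  # convex combination of the codes
        x_masked = x1 * mask                     # input masked with interpretable features
        return self.decoder(torch.cat([z_mix, x_masked], dim=1))


def generator_step(model, discriminator, x1, x2, mask):
    """Reconstruction loss plus an adversarial loss that pushes the
    conditional mixed reconstructions toward the data (in-distribution) region."""
    x_rec = model(x1, x2, mask)
    rec_loss = F.mse_loss(x_rec, x1)  # illustrative reconstruction target
    adv_loss = F.binary_cross_entropy_with_logits(
        discriminator(x_rec), torch.ones(x_rec.size(0), 1))  # try to fool the discriminator
    return rec_loss + adv_loss


# Example usage with random data (batch of 8 samples, 32 features, 16-dim latent space).
if __name__ == "__main__":
    model = ConditionalAutoencoder(32, 16)
    discriminator = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
    x1, x2 = torch.randn(8, 32), torch.randn(8, 32)
    mask = (torch.rand(8, 32) > 0.5).float()     # binary interpretable-feature mask
    loss = generator_step(model, discriminator, x1, x2, mask)
    loss.backward()
```

In an actual training loop the discriminator would also be updated on real versus reconstructed samples, which is what drives the reconstructions toward the in-distribution region.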

Potential Applications

This technology could be applied in fields such as anomaly detection, fraud detection, and predictive maintenance, where reliable local explanations of model predictions are valuable.

Problems Solved

This technology helps in generating in-distribution samples of data for neighborhood distributions, which can improve the performance of post-hoc local explanation methods.
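As an illustration only: if such a generator were available, it could replace random perturbations in a LIME-style post-hoc explainer along the following lines. The generate_neighborhood function is a hypothetical stand-in for the trained autoencoder and is not part of any published API.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(black_box_predict, x, generate_neighborhood, n_samples=500):
    """Fit an interpretable local surrogate on in-distribution neighbors of x."""
    neighbors = generate_neighborhood(x, n_samples)   # (n_samples, n_features) array
    preds = black_box_predict(neighbors)              # black-box model outputs
    # Weight neighbors by proximity to x (a simple Gaussian locality kernel).
    weights = np.exp(-np.linalg.norm(neighbors - x, axis=1) ** 2)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(neighbors, preds, sample_weight=weights)
    return surrogate.coef_                            # per-feature local attributions
```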

Benefits

The benefits of this technology include improved interpretability of machine learning models, enhanced accuracy in local explanations, and better understanding of model behavior.

Potential Commercial Applications

One potential commercial application of this technology could be in the financial sector for fraud detection systems.

Possible Prior Art

Prior art in the field of generative models and adversarial training techniques may exist, but specific examples are not provided in the patent application.

Unanswered Questions

How does this technology compare to existing methods for generating in-distribution samples of data?

This article does not provide a direct comparison with existing methods, leaving the reader to wonder about the specific advantages of this approach over others.

What are the limitations or potential drawbacks of using this technology in practice?

The article does not address any potential limitations or drawbacks of implementing this technology, leaving room for uncertainty about its practical implications.


Original Abstract Submitted

A computer-implemented method, system and computer program product for generating in-distribution samples of data for a neighborhood distribution to be used by post-hoc local explanation methods. An autoencoder is trained to generate in-distribution samples of input data for the neighborhood distribution to be used by a post-hoc local explanation method. Such training includes mapping the input data (e.g., time series data) into a latent dimension (or latent space) forming a first and a second latent code. A mixed code is then obtained by convexly combining the first and second latent codes with a random coefficient. The mixed code is then decoded with the input data masked with interpretable features to obtain conditional mixed reconstructions. Adversarial training is then performed against a discriminator in order to promote in-distribution samples by computing the reconstruction losses of the conditional mixed reconstructions as well as the discriminator losses and then minimizing such losses.