17951889. Generative Models for Discrete Datasets Constrained by a Marginal Distribution Specification simplified abstract (GOOGLE LLC)

Generative Models for Discrete Datasets Constrained by a Marginal Distribution Specification

Organization Name

GOOGLE LLC

Inventor(s)

Hanjun Dai of San Jose CA (US)

Bo Dai of San Jose CA (US)

Mengjiao Yang of Berkeley CA (US)

Yuan Xue of Palo Alto CA (US)

Dale Eric Schuurmans of Edmonton (CA)

Generative Models for Discrete Datasets Constrained by a Marginal Distribution Specification - A simplified explanation of the abstract

This abstract first appeared for US patent application 17951889, titled 'Generative Models for Discrete Datasets Constrained by a Marginal Distribution Specification'.

Simplified Explanation

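The disclosure describes generative models that produce datasets satisfying a marginal constraint. A first object occurs at some source frequency in a source dataset, and the marginal constraint specifies a target frequency for that object. The source dataset also encodes co-occurrence frequencies for object pairs. A source generative model trained on the source dataset is accessed and adapted so that the generated target dataset contains the first object at the target frequency while still encoding those co-occurrence frequencies. The method involves: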

  • Receiving a request to generate a target dataset based on a marginal constraint for a source dataset
  • Accessing a source generative model whose first module and second module were trained on the source dataset
  • Updating the second module based on the marginal constraint
  • Generating an adapted generative model that includes the first module and the updated second module
  • Generating the target dataset based on the adapted generative model
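
The steps above can be illustrated with a toy two-module model. This is only a minimal sketch under stated assumptions, not the method claimed in the application: it assumes the first module is a conditional distribution p(b | a) over partner objects and the second module is the marginal p(a) over first objects, a factorization the abstract does not specify. The function names (fit_source_model, adapt_marginal, generate) and the example objects are hypothetical.

```python
import random
from collections import Counter


def fit_source_model(pairs):
    """Fit a toy two-module model on (a, b) object pairs.

    Assumption: module 1 is the conditional p(b | a), module 2 is the marginal p(a).
    """
    first_counts = Counter(a for a, _ in pairs)
    total = sum(first_counts.values())
    marginal = {a: c / total for a, c in first_counts.items()}

    partner_counts = {}
    for a, b in pairs:
        partner_counts.setdefault(a, Counter())[b] += 1
    conditional = {
        a: {b: c / sum(cnt.values()) for b, c in cnt.items()}
        for a, cnt in partner_counts.items()
    }
    return conditional, marginal


def adapt_marginal(marginal, obj, target_freq):
    """Update only the marginal module so that p(obj) equals target_freq.

    The remaining probability mass is rescaled proportionally.
    """
    if not 0.0 <= target_freq <= 1.0:
        raise ValueError("target_freq must lie in [0, 1]")
    rest = 1.0 - marginal[obj]
    if rest == 0.0:
        raise ValueError("no other objects to absorb the remaining mass")
    scale = (1.0 - target_freq) / rest
    return {a: (target_freq if a == obj else p * scale) for a, p in marginal.items()}


def generate(conditional, marginal, n, seed=0):
    """Sample n pairs: draw a from the marginal, then b from p(b | a)."""
    rng = random.Random(seed)
    firsts = rng.choices(list(marginal), weights=list(marginal.values()), k=n)
    pairs = []
    for a in firsts:
        partners = conditional[a]
        b = rng.choices(list(partners), weights=list(partners.values()), k=1)[0]
        pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical source dataset: "cat" occurs as first object in 20% of pairs.
    source = [("cat", "pet")] * 2 + [("dog", "pet")] * 6 + [("dog", "bark")] * 2
    conditional, marginal = fit_source_model(source)
    adapted = adapt_marginal(marginal, "cat", 0.5)  # marginal constraint: 50%
    target = generate(conditional, adapted, n=1000)
    print(Counter(a for a, _ in target))
```

In this sketch the conditional module is reused unchanged, so each object's partner distribution carries over from the source data while the frequency of the constrained object moves toward the target. Whether this matches how the application preserves co-occurrence frequencies is an assumption.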

Potential Applications

This technology could be applied in various fields such as natural language processing, image generation, and data synthesis for machine learning models.

Problems Solved

This technology addresses the problem of generating a target dataset in which a given object occurs at a specified frequency while the co-occurrence frequencies encoded by the source dataset are retained.

Benefits

The benefits of this technology include improved data generation accuracy, enhanced model training, and the ability to generate datasets with specific constraints for research and development purposes.

Potential Commercial Applications

Potential commercial applications of this technology include data augmentation services, synthetic data generation tools for machine learning companies, and research tools for academics and scientists.

Possible Prior Art

One possible area of prior art is the use of generative adversarial networks (GANs) for data generation under constraints. However, the specific method described in this disclosure may offer advantages over existing techniques.

Unanswered Questions

How does this technology compare to other data generation methods in terms of efficiency and accuracy?

This article does not provide a direct comparison with other data generation methods, leaving the reader to wonder about the relative performance of this technology.

What are the potential limitations or challenges of implementing this technology in real-world applications?

The article does not address potential limitations or challenges that may arise when implementing this technology in practical settings, leaving room for further exploration and discussion on this topic.


Original Abstract Submitted

The present disclosure is directed to generative models for datasets constrained by marginal constraints. One method includes receiving a request to generate a target dataset based on a marginal constraint for a source dataset. A first object occurs at a source frequency in the source dataset. The marginal constraint indicates a target frequency for the first object. The source dataset encodes a set of co-occurrence frequencies for a plurality of object pairs. A source generative model is accessed. The source generative model includes a first module and a second module that are trained on the source dataset. The second module is updated based on the marginal constraint. An adapted generative model is generated that includes the first module and the updated second module. The target dataset is generated based on the adapted generative model. The first object occurs at the target frequency in the target dataset. The target dataset encodes the set of co-occurrence frequencies for the plurality of object pairs.
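
The two properties the abstract states for the target dataset (the first object occurs at the target frequency, and the co-occurrence frequencies are retained) can be checked on concrete data. The following is a hedged sketch of such a check, not part of the application: it assumes a dataset is a list of (a, b) object pairs and reads "co-occurrence frequency" as the conditional frequency of b given a, which is one possible interpretation. All names and example values are hypothetical.

```python
from collections import Counter


def object_frequency(pairs, obj):
    """Fraction of pairs whose first element is obj."""
    if not pairs:
        raise ValueError("dataset is empty")
    return sum(1 for a, _ in pairs if a == obj) / len(pairs)


def conditional_cooccurrence(pairs):
    """Map each (a, b) pair to count(a, b) / count(a)."""
    first_counts = Counter(a for a, _ in pairs)
    pair_counts = Counter(pairs)
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


def max_cooccurrence_drift(source, target):
    """Largest absolute change in conditional co-occurrence between datasets."""
    src = conditional_cooccurrence(source)
    tgt = conditional_cooccurrence(target)
    keys = set(src) | set(tgt)
    return max(abs(src.get(k, 0.0) - tgt.get(k, 0.0)) for k in keys)


if __name__ == "__main__":
    # Hypothetical datasets: "cat" moves from 20% to 50% of first objects.
    source = [("cat", "pet")] * 2 + [("dog", "pet")] * 6 + [("dog", "bark")] * 2
    target = [("cat", "pet")] * 4 + [("dog", "pet")] * 3 + [("dog", "bark")] * 1
    print(object_frequency(source, "cat"), object_frequency(target, "cat"))
    print(max_cooccurrence_drift(source, target))
```

A drift near zero together with the requested object frequency would indicate that a generated dataset meets both conditions under this interpretation.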