17955055. CONTRASTIVE LEARNING BY DYNAMICALLY SELECTING DROPOUT RATIOS AND LOCATIONS BASED ON REINFORCEMENT LEARNING simplified abstract (International Business Machines Corporation)

CONTRASTIVE LEARNING BY DYNAMICALLY SELECTING DROPOUT RATIOS AND LOCATIONS BASED ON REINFORCEMENT LEARNING

Organization Name

International Business Machines Corporation

Inventor(s)

Zhong Fang Yuan of Xi'an (CN)

Si Tong Zhao of Beijing (CN)

Tong Liu of Xi'an (CN)

Yi Chen Zhong of Shanghai (CN)

Yuan Yuan Ding of Shanghai (CN)

Hai Bo Zou of Beijing (CN)

CONTRASTIVE LEARNING BY DYNAMICALLY SELECTING DROPOUT RATIOS AND LOCATIONS BASED ON REINFORCEMENT LEARNING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17955055, titled 'CONTRASTIVE LEARNING BY DYNAMICALLY SELECTING DROPOUT RATIOS AND LOCATIONS BASED ON REINFORCEMENT LEARNING'.

Simplified Explanation

The method described in the patent application uses reinforcement learning to select dropout ratios and locations for contrastive learning in a neural network. Connections between neurons are dropped out according to a policy derived from the training data, which improves the network's ability to distinguish between positive and negative samples.

  • The method involves receiving training data with positive and negative samples.
  • A dropout policy is created from the training data, identifying which connections in the neural network to drop out.
  • The training data is encoded, under the dropout policy, to form embeddings.
  • The embeddings include positive sample embeddings for the target and negative sample embeddings for non-target samples, as illustrated in the sketch after this list.
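
To make these steps concrete, here is a minimal sketch of how a policy-selected dropout ratio and location could be threaded through an encoder to produce the embeddings described above. All names are hypothetical, and the two stochastic passes over the positive sample (a SimCSE-style trick for generating multiple positive embeddings) are an assumption, not something the filing specifies:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PolicyControlledEncoder(nn.Module):
    """Toy encoder whose dropout ratio and location are chosen externally,
    standing in for the policy-driven dropout the abstract describes."""

    def __init__(self, dim=128, depth=3):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(depth))

    def forward(self, x, drop_layer, drop_ratio):
        for i, layer in enumerate(self.layers):
            x = F.relu(layer(x))
            if i == drop_layer:
                # Drop connections only at the policy-selected layer,
                # with the policy-selected ratio.
                x = F.dropout(x, p=drop_ratio, training=True)
        return x

encoder = PolicyControlledEncoder()
target = torch.randn(1, 128)      # positive sample (corresponds to the target)
negatives = torch.randn(4, 128)   # negative samples (do not correspond to it)

# Hypothetical policy decision: drop 20% of activations after layer 1.
drop_layer, drop_ratio = 1, 0.2

# Two stochastic passes over the same positive sample yield two distinct
# positive embeddings (dropout acts as the augmentation); the negatives
# are encoded once each.
pos_emb_a = encoder(target, drop_layer, drop_ratio)
pos_emb_b = encoder(target, drop_layer, drop_ratio)
neg_embs = encoder(negatives, drop_layer, drop_ratio)
```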

Potential Applications

The technology could be applied in various fields such as image recognition, natural language processing, and recommendation systems to enhance the performance of neural networks in distinguishing between different classes of data.

Problems Solved

This technology addresses the challenge of improving the contrastive learning process in neural networks by dynamically selecting dropout ratios and locations based on reinforcement learning, leading to better feature representation and classification accuracy.

Benefits

The benefits of this technology include enhanced model performance, improved generalization ability, and increased efficiency in training neural networks by optimizing dropout strategies based on the specific data distribution.

Potential Commercial Applications

Potential commercial applications of this technology include developing advanced AI systems for image and speech recognition, personalized recommendation engines, and fraud detection systems that require accurate classification of data.

Possible Prior Art

One possible prior art in this field could be the use of static dropout strategies in neural networks, where connections between neurons are randomly dropped out during training without considering the specific characteristics of the data.
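
For contrast, such a static strategy amounts to ordinary dropout with a ratio and location fixed once as hyperparameters, regardless of the data (a minimal PyTorch illustration, not code from the filing):

```python
import torch.nn as nn

# Static baseline: a 0.5 ratio after the first layer, chosen once and
# never adapted to the data or the training state.
static_model = nn.Sequential(
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # fixed ratio, fixed location
    nn.Linear(128, 128),
)
```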

What is the specific reinforcement learning algorithm used to select dropout ratios and locations in this method?

The specific reinforcement learning algorithm used in this method is not explicitly mentioned in the abstract. Further details on the algorithm and its implementation would be necessary to understand the exact approach taken in selecting dropout ratios and locations.
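
Since no algorithm is named, one plausible instantiation (purely an assumption here) is a REINFORCE-style policy gradient over discrete (layer, ratio) actions, rewarded by how well the resulting embeddings separate positives from negatives:

```python
import torch
import torch.nn as nn

# Hypothetical: a REINFORCE policy over discrete (layer, ratio) actions.
NUM_LAYERS = 3
RATIOS = [0.1, 0.2, 0.3, 0.5]

policy = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, NUM_LAYERS * len(RATIOS)),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def select_action(state):
    """Sample a (layer, ratio) pair from the policy's categorical output."""
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()
    layer_idx, ratio_idx = divmod(action.item(), len(RATIOS))
    return layer_idx, RATIOS[ratio_idx], dist.log_prob(action)

state = torch.randn(128)          # e.g. summary statistics of the current batch
layer, ratio, log_prob = select_action(state)

# The reward could be the drop in contrastive loss observed after training
# with this (layer, ratio) choice; a placeholder value is used here.
reward = torch.tensor(0.7)
loss = -log_prob * reward         # REINFORCE: reinforce high-reward actions
optimizer.zero_grad()
loss.backward()
optimizer.step()
```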

How does the encoding of training data into embeddings improve the performance of the neural network in contrastive learning?

The abstract states that the training data is encoded based on the dropout policy to form embeddings, but it does not explain how this encoding enhances the network's ability to distinguish between positive and negative samples. More detail on the encoding technique and its effect on performance would be needed to fully assess this aspect of the technology.
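
One common way such embeddings are turned into a training signal (an assumption here, since the abstract stops at embedding formation) is an InfoNCE-style contrastive loss that pulls the two embeddings of the positive sample together while pushing the negative embeddings away:

```python
import torch
import torch.nn.functional as F

def info_nce(pos_a, pos_b, negatives, temperature=0.07):
    """Pulls the two embeddings of the positive sample together while
    pushing the negative-sample embeddings away.

    pos_a, pos_b: (d,) two embeddings of the same positive sample
    negatives:    (n, d) embeddings of the negative samples
    """
    pos_sim = F.cosine_similarity(pos_a, pos_b, dim=0) / temperature
    neg_sim = F.cosine_similarity(pos_a.unsqueeze(0), negatives, dim=1) / temperature
    logits = torch.cat([pos_sim.unsqueeze(0), neg_sim]).unsqueeze(0)
    # The positive pair sits at index 0; cross-entropy maximizes its score.
    return F.cross_entropy(logits, torch.zeros(1, dtype=torch.long))

loss = info_nce(torch.randn(128), torch.randn(128), torch.randn(4, 128))
```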


Original Abstract Submitted

A method for contrastive learning by selecting dropout ratios and locations based on reinforcement learning includes receiving training data having a positive sample corresponding to a target and negative samples not corresponding to the target. A dropout policy for a neural network is produced based on the training data, where the dropout policy identifies at least one connection between neurons in the neural network to dropout. The training data is encoded, based on the dropout policy, to form embeddings, where the embeddings include multiple positive sample embeddings corresponding to the positive sample and multiple negative sample embeddings corresponding to the negative samples.