International Business Machines Corporation (20240135166). DNN TRAINING ALGORITHM WITH DYNAMICALLY COMPUTED ZERO-REFERENCE simplified abstract

From WikiPatents

DNN TRAINING ALGORITHM WITH DYNAMICALLY COMPUTED ZERO-REFERENCE

Organization Name

International Business Machines Corporation

Inventor(s)

Malte Johannes Rasch of Chappaqua, NY (US)

DNN TRAINING ALGORITHM WITH DYNAMICALLY COMPUTED ZERO-REFERENCE - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240135166, titled 'DNN TRAINING ALGORITHM WITH DYNAMICALLY COMPUTED ZERO-REFERENCE'.

Simplified Explanation

The abstract describes a computer-implemented method for updating the weights of a deep neural network (DNN) using a resistive processing unit (RPU) crossbar array together with a digital medium.

  • The method performs gradient updates for stochastic gradient descent using hidden weights stored in an RPU crossbar array and in a digital medium.
  • A set of reference values is computed during each transfer cycle, when weights move from the RPU crossbar array to the digital medium.
  • When the digitally stored weights reach a threshold, the DNN weights, held in a second RPU crossbar array, are updated from the digital medium.

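The steps above can be sketched as a toy training loop, with the analog RPU crossbar simulated as a plain NumPy matrix. All names, the learning rate, the transfer period, and the threshold rule here are illustrative assumptions for exposition, not the patent's terminology:

```python
import numpy as np

# Toy sketch of the transfer-based training loop. The analog RPU
# crossbar is simulated as a NumPy matrix; names and constants are
# assumptions, not the patent's implementation.

rng = np.random.default_rng(0)
n_out, n_in = 4, 8
lr = 0.1
transfer_threshold = 1.0

rpu_hidden = np.zeros((n_out, n_in))      # first matrix: analog hidden weights
digital_hidden = np.zeros((n_out, n_in))  # second matrix: digital hidden weights
reference = np.zeros((n_out, n_in))       # third matrix: zero-reference values
dnn_weights = np.zeros((n_out, n_in))     # fourth matrix: weights the DNN uses

def sgd_step(x, err, chopper_sign):
    """Accumulate a rank-one SGD update in the analog array.

    The chopper flips the sign of what is written, so fixed device
    offsets alternate in sign across transfer cycles.
    """
    global rpu_hidden
    rpu_hidden = rpu_hidden + lr * chopper_sign * np.outer(err, x)

def transfer_cycle(chopper_sign):
    """Read the analog array relative to the reference, then refresh both."""
    global digital_hidden, reference, dnn_weights
    # Undo the chopper sign on readout: genuine updates keep their sign,
    # while any constant offset alternates and averages out over cycles.
    digital_hidden = digital_hidden + chopper_sign * (rpu_hidden - reference)
    # Dynamically recompute the zero-reference from the current analog state.
    reference = rpu_hidden.copy()
    # Commit digital hidden weights to the DNN weights once they cross
    # the threshold (an assumed thresholded-transfer rule).
    over = np.abs(digital_hidden) >= transfer_threshold
    step = np.sign(digital_hidden[over]) * transfer_threshold
    dnn_weights[over] += step
    digital_hidden[over] -= step

for t in range(100):
    sign = (-1) ** (t // 10)              # chopper sign, flipped every 10 steps
    x = rng.standard_normal(n_in)
    err = rng.standard_normal(n_out)
    sgd_step(x, err, sign)
    if t % 10 == 9:                       # transfer cycle every 10 SGD steps
        transfer_cycle(sign)
```

The separation of concerns mirrors the abstract's four matrices: fast, noisy accumulation happens in the analog array, while the digital medium integrates read-outs and only commits them to the DNN weights after they cross a threshold.
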
Potential Applications

This technology could be applied to analog in-memory training of deep neural networks across fields such as machine learning, artificial intelligence, and data analysis.

Problems Solved

This technology enables efficient weight updates when training deep neural networks on analog RPU hardware, improving the performance and accuracy of the models.

Benefits

The benefits of this technology include faster training of deep neural networks, reduced energy consumption, and improved accuracy in model predictions.

Potential Commercial Applications

Potential commercial applications of this technology include industries such as healthcare, finance, autonomous vehicles, and robotics.

Possible Prior Art

Possible prior art includes earlier uses of resistive processing units for neural network training; however, the specific dynamically computed zero-reference and transfer scheme described in the abstract may be novel.

Unanswered Questions

How does this method compare to traditional weight updating techniques in deep neural networks?

This article does not provide a direct comparison between this method and traditional weight updating techniques. It would be interesting to see a performance comparison in terms of training speed, accuracy, and energy efficiency.

What are the limitations of using a resistive processing unit crossbar array for weight storage and updates in deep neural networks?

The article does not discuss any limitations of using an RPU crossbar array. It would be valuable to understand any potential drawbacks or challenges associated with this technology.


Original Abstract Submitted

A computer-implemented method includes performing a gradient update for a stochastic gradient descent (SGD) of a deep neural network (DNN) using a first set of hidden weights stored in a first matrix comprising a resistive processing unit (RPU) crossbar array. A second matrix comprising a second set of hidden weights is stored in a digital medium. A third matrix comprising a set of reference values is computed upon a transfer cycle of the first set of weights from the first matrix to the second matrix, accounting for a sign-change (chopper). The third matrix is stored in the digital medium. A third set of weights is updated for the DNN from the second matrix when a threshold is reached for the second set of weights, in a fourth matrix comprising an RPU crossbar array.
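
The sign-change (chopper) mentioned in the abstract can be illustrated numerically. The sketch below uses an assumed fixed-offset model of the analog readout (the numbers and variable names are illustrative, not from the patent): writes carry an alternating chopper sign, and multiplying by the same sign on readout recovers the true updates while the fixed offset cancels across cycles.

```python
# Toy illustration of chopper-based offset cancellation (assumed model,
# not the patent's implementation).
offset = 0.3                      # fixed per-device readout offset
true_updates = [0.1, -0.2, 0.05, 0.15]

acc_plain = 0.0                   # accumulate raw readouts
acc_chopped = 0.0                 # accumulate chopper-corrected readouts
for k, u in enumerate(true_updates):
    sign = (-1) ** k              # chopper sign flips every transfer cycle
    readout = sign * u + offset   # array holds sign * update; readout adds offset
    acc_plain += readout
    acc_chopped += sign * readout # undoing the sign leaves u + sign * offset

# acc_plain is contaminated by 4 * offset; in acc_chopped the offset
# terms alternate (+0.3, -0.3, +0.3, -0.3) and cancel.
print(acc_plain, acc_chopped)
```

This suggests why the reference matrix is recomputed "upon a transfer cycle" together with the chopper sign: each transfer reads only the change since the last reference, and the alternating sign turns residual offsets into terms that average to zero.
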