17862779. APPARATUS AND METHOD WITH NEURAL NETWORK TRAINING BASED ON KNOWLEDGE DISTILLATION simplified abstract (SAMSUNG ELECTRONICS CO., LTD.)

From WikiPatents

APPARATUS AND METHOD WITH NEURAL NETWORK TRAINING BASED ON KNOWLEDGE DISTILLATION

Organization Name

SAMSUNG ELECTRONICS CO., LTD.

Inventor(s)

Eunhee Kang of Yongin-si (KR)

Minsoo Kang of Seoul (KR)

Bohyung Han of Seoul (KR)

Sehwan Ki of Hwaseong-si (KR)

Hyong Euk Lee of Suwon-si (KR)

APPARATUS AND METHOD WITH NEURAL NETWORK TRAINING BASED ON KNOWLEDGE DISTILLATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 17862779 titled 'APPARATUS AND METHOD WITH NEURAL NETWORK TRAINING BASED ON KNOWLEDGE DISTILLATION'.

Simplified Explanation

The abstract describes a knowledge-distillation method that trains a student network with the help of a teacher network and an energy-based model. Both networks are given the same input. Starting from the student network's result, the method generates a sample that follows the distribution of the energy-based model, which is itself based on the student and teacher results. The parameters of the energy-based model are trained to decrease the model's value, using the teacher and student results. The student network is then trained to increase that same value, using the generated sample and the student's own result.

  • The method trains a student network using a teacher network and an energy-based model, with both networks given the same input.
  • A sample following the distribution of the energy-based model is generated from the student network's result; that distribution is based on both the student and teacher results.
  • The parameters of the energy-based model are trained to decrease the model's value, based on the teacher and student results.
  • The student network is trained to increase the same value, based on the generated sample and the student's own result.
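
The abstract does not say what the energy function is, how the sample is drawn, or how the "value" is computed, so the following is only a toy sketch under stated assumptions and not the patented method. It assumes a scalar teacher `teacher(x) = 2x`, a student with a single weight, a quadratic energy, unadjusted Langevin dynamics as the sampler, and a value defined as the energy of a reference result minus the energy of the student result. All function names and hyperparameters are hypothetical.

```python
import random

def teacher(x):
    # Hypothetical fixed teacher network: y = 2x.
    return 2.0 * x

def student(w, x):
    # Hypothetical student network with a single trainable weight w.
    return w * x

def energy(theta, y, x):
    # Assumed quadratic energy: low when y is close to theta * x.
    return 0.5 * (y - theta * x) ** 2

def value(theta, ref, s, x):
    # Assumed "value": energy of a reference result minus energy of the student result.
    return energy(theta, ref, x) - energy(theta, s, x)

def langevin_sample(theta, s, x, rng, steps=20, step_size=0.1):
    # Unadjusted Langevin dynamics, started at the student result s.
    y = s
    for _ in range(steps):
        grad_y = y - theta * x  # dE/dy for the quadratic energy
        y += -0.5 * step_size * grad_y + step_size ** 0.5 * rng.gauss(0.0, 1.0)
    return y

def train_step(theta, w, xs, rng, lr=0.1):
    n = len(xs)
    # 1) Generate one sample per input, starting from the student result.
    samples = [langevin_sample(theta, student(w, x), x, rng) for x in xs]
    # 2) Energy-model step: gradient descent on value(theta, teacher, student).
    #    d/dtheta [E(t) - E(s)] = (s - t) * x for this energy.
    grad_theta = sum((student(w, x) - teacher(x)) * x for x in xs) / n
    theta -= lr * grad_theta
    # 3) Student step: gradient ascent on value(theta, sample, student),
    #    with the sample held constant: d/dw [E(sample) - E(s)] = -(s - theta*x) * x.
    grad_w = sum(-(student(w, x) - theta * x) * x for x in xs) / n
    w += lr * grad_w
    # Mean value on (sample, student result) after the update, for monitoring only.
    mean_value = sum(value(theta, smp, student(w, x), x)
                     for smp, x in zip(samples, xs)) / n
    return theta, w, mean_value

rng = random.Random(0)   # fixed seed so the run is repeatable
xs = [0.5, 1.0, 1.5]     # toy inputs shared by teacher and student
theta, w = 0.0, 0.0
for _ in range(300):
    theta, w, mean_value = train_step(theta, w, xs, rng)
print(f"student weight: {w:.4f}, energy parameter: {theta:.4f}")
```

With these settings the student weight moves toward the teacher's slope of 2. Because the sketch holds the sample constant during the student step, the sample changes the reported value but not the gradient; that is a simplification of this example, not something the abstract states.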

Potential Applications

  • The method could apply wherever a student network is trained from a teacher network, for example in computer vision, natural language processing, and speech recognition.
  • It may improve the accuracy of student networks by drawing on the knowledge of a teacher network.

Problems Solved

  • The method addresses how to pass what a teacher network has learned to a student network during training.
  • It aims to go beyond conventional training by adding an energy-based model and using samples generated from that model's distribution.

Benefits

  • The student network may train more efficiently because it draws on the knowledge of a teacher network.
  • The energy-based model and the samples generated from its distribution give the student an additional training signal, which may improve its accuracy.
  • The approach is not tied to a single task in the abstract, so it could carry over to other areas of artificial intelligence that train neural networks.


Original Abstract Submitted

A method includes: generating, based on a student network result of an implemented student network provided with an input, a sample corresponding to a distribution of an energy-based model based on the student network result and a teacher network result of an implemented teacher network provided with the input; training model parameters of the energy-based model to decrease a value of the energy-based model, based on the teacher network result and the student network result; and training the implemented student network to increase the value of the energy-based model, based on the sample and the student network result.