
20250190800. Method and Apparatus for Training Deep Learning Model to Search Optimal Architecture of Neural Network for Knowledge Distillation (Korea Advanced Institute of Science and Technology)


METHOD AND APPARATUS FOR TRAINING DEEP LEARNING MODEL TO SEARCH OPTIMAL ARCHITECTURE OF NEURAL NETWORK FOR KNOWLEDGE DISTILLATION

Abstract: The present disclosure relates to a method for training a deep learning model to search for an optimal architecture of a neural network for knowledge distillation. Information about a metadata set related to a first domain is obtained. Using the metadata-set information, information about at least one candidate architecture of a neural network included in a search space is obtained. Performance evaluation information about the candidate architecture is output by inputting the metadata-set information and the candidate-architecture information into the deep learning model. The deep learning model is then trained through backpropagation to minimize a loss function determined on the basis of the performance evaluation information and label information included in the metadata set.
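The abstract describes a performance-predictor setup: a model that takes an encoding of the metadata set and an encoding of a candidate architecture, outputs a predicted performance score, and is trained by backpropagation against labels in the metadata set. Below is a minimal sketch of that training step, assuming PyTorch; the class name `PerformancePredictor`, the encoding dimensions, and the toy batch are hypothetical stand-ins, since the patent text does not specify encoders or model internals.

```python
import torch
import torch.nn as nn

class PerformancePredictor(nn.Module):
    """Predicts the performance of a candidate architecture on a given dataset.

    Both inputs are assumed to be fixed-size vector encodings (hypothetical;
    the abstract does not say how the metadata set or architecture is encoded).
    """
    def __init__(self, meta_dim: int, arch_dim: int, hidden_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(meta_dim + arch_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # scalar performance estimate
        )

    def forward(self, meta_enc: torch.Tensor, arch_enc: torch.Tensor) -> torch.Tensor:
        # Concatenate the two encodings, as both are inputs to the model.
        return self.net(torch.cat([meta_enc, arch_enc], dim=-1)).squeeze(-1)

model = PerformancePredictor(meta_dim=64, arch_dim=32)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # one plausible choice; the loss is unspecified

# Toy batch standing in for (metadata-set encoding, candidate-architecture
# encoding, performance label) triples drawn from the metadata set.
meta_enc = torch.randn(16, 64)
arch_enc = torch.randn(16, 32)
labels = torch.rand(16)  # e.g. measured distillation accuracy in [0, 1]

for step in range(100):
    pred = model(meta_enc, arch_enc)       # performance evaluation information
    loss = loss_fn(pred, labels)           # loss vs. labels in the metadata set
    optimizer.zero_grad()
    loss.backward()                        # backpropagation, per the claim
    optimizer.step()
```

Once trained, such a predictor could rank candidate architectures in the search space without fully training each one, which is the usual motivation for predictor-based neural architecture search.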

Inventor(s): Sung Ju HWANG, Ha Yeon LEE, So Hyun AN, Min Seon KIM

CPC Classification: G06N3/084 (Backpropagation, e.g. using gradient descent)

