20250190800. Method and Apparatus for Training Deep Learning Model to Search Optimal Architecture of Neural Network for Knowledge Distillation (Korea Advanced Institute of Science and Technology)
METHOD AND APPARATUS FOR TRAINING DEEP LEARNING MODEL TO SEARCH OPTIMAL ARCHITECTURE OF NEURAL NETWORK FOR KNOWLEDGE DISTILLATION
Abstract: The present disclosure relates to a method for training a deep learning model to search for an optimal architecture of a neural network for knowledge distillation. Information about a metadata set related to a first domain is obtained. Information about at least one candidate architecture of a neural network included in a search space is obtained using the information about the metadata set. Performance evaluation information about the at least one candidate architecture is output by inputting the information about the metadata set and the information about the at least one candidate architecture into the deep learning model. The deep learning model is trained through backpropagation to minimize a loss function determined on the basis of the performance evaluation information and label information included in the information about the metadata set.
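The training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patent's actual method: the metadata-set features, the architecture encoding, and the labels here are all synthetic stand-ins, and a simple linear predictor trained by gradient descent takes the place of the patent's deep learning model so the backpropagation step stays easy to follow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins (not from the patent text):
# - meta: feature vector summarizing a metadata set for a domain
# - arch: a candidate architecture encoded as a fixed-length vector
n_samples, meta_dim, arch_dim = 64, 8, 6
meta = rng.normal(size=(n_samples, meta_dim))
arch = rng.normal(size=(n_samples, arch_dim))

# Label information: measured performance of each (metadata, architecture)
# pair, simulated here from a random ground-truth weight vector.
x = np.concatenate([meta, arch], axis=1)
true_w = rng.normal(size=meta_dim + arch_dim)
labels = x @ true_w + 0.01 * rng.normal(size=n_samples)

# Performance predictor: a linear model for illustration; the patent's
# "deep learning model" would be a neural network trained the same way.
w = np.zeros(meta_dim + arch_dim)
lr = 0.05
for _ in range(500):
    pred = x @ w                                   # performance evaluation output
    grad = 2 * x.T @ (pred - labels) / n_samples   # gradient of the MSE loss
    w -= lr * grad                                 # backpropagation update step

mse = np.mean((x @ w - labels) ** 2)
```

After training, the predictor can score unseen candidate architectures for a given metadata set, which is what makes an architecture search over the space feasible without fully training every candidate.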
Inventor(s): Sung Ju HWANG, Ha Yeon LEE, So Hyun AN, Min Seon KIM
CPC Classification: G06N3/084 (Backpropagation, e.g. using gradient descent)