Intel Corporation (20240289612). METHOD AND APPARATUS FOR OPTIMIZING INFERENCE OF DEEP NEURAL NETWORKS simplified abstract


METHOD AND APPARATUS FOR OPTIMIZING INFERENCE OF DEEP NEURAL NETWORKS

Organization Name

Intel Corporation

Inventor(s)

Haihao Shen of Shanghai (CN)

Hengyu Meng of Shanghai (CN)

Feng Tian of Shanghai (CN)

METHOD AND APPARATUS FOR OPTIMIZING INFERENCE OF DEEP NEURAL NETWORKS - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240289612, titled 'METHOD AND APPARATUS FOR OPTIMIZING INFERENCE OF DEEP NEURAL NETWORKS'.

Simplified Explanation: The patent application describes a hardware-aware cost model that optimizes inference of a deep neural network (DNN) by estimating computation and memory/cache costs from the specifications of the target hardware.
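The abstract does not spell out how the two estimators work internally; the minimal Python sketch below shows one plausible roofline-style reading. All names here (HardwareSpec, matmul_flops, estimated_layer_time) and the hardware numbers are hypothetical and invented for illustration; the patent only states that computation cost is derived from the input, weight, and output tensors and that memory/cache cost follows a strategy based on hardware specifications.

```python
from dataclasses import dataclass

@dataclass
class HardwareSpec:
    # Illustrative fields; the patent only says "hardware specifications".
    peak_flops: float      # peak compute throughput (FLOP/s)
    mem_bandwidth: float   # memory bandwidth (bytes/s)

def matmul_flops(m: int, k: int, n: int) -> float:
    # An (m x k) @ (k x n) matmul performs one multiply-add per output
    # element per reduction step: 2 * m * k * n floating-point operations.
    return 2.0 * m * k * n

def tensor_bytes(shape, dtype_bytes: int = 4) -> int:
    # Bytes occupied by a tensor of the given shape and element width.
    total = 1
    for dim in shape:
        total *= dim
    return total * dtype_bytes

def estimated_layer_time(m: int, k: int, n: int,
                         hw: HardwareSpec, dtype_bytes: int = 4) -> float:
    # Roofline-style estimate: the layer is bound by whichever is slower,
    # raw compute or moving the input, weight, and output tensors.
    compute_s = matmul_flops(m, k, n) / hw.peak_flops
    traffic = (tensor_bytes((m, k), dtype_bytes)     # input tensor
               + tensor_bytes((k, n), dtype_bytes)   # weight tensor
               + tensor_bytes((m, n), dtype_bytes))  # output tensor
    memory_s = traffic / hw.mem_bandwidth
    return max(compute_s, memory_s)

# Hypothetical target: 10 TFLOP/s of compute, 100 GB/s of memory bandwidth.
hw = HardwareSpec(peak_flops=10e12, mem_bandwidth=100e9)
print(estimated_layer_time(1024, 1024, 1024, hw))  # estimated seconds at fp32
```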

Key Features and Innovation:

  • A computation cost estimator computes the estimated computation cost from the input, weight, and output tensors of the DNN.
  • A memory/cache cost estimator applies a cost estimation strategy based on hardware specifications.
  • Performance simulation on the target hardware yields dynamic quantization knobs for converting a conventional-precision inference model into an optimized one (see the sketch after this list).
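The abstract likewise leaves the knob-selection rule unspecified. Continuing the hypothetical sketch above (it reuses estimated_layer_time and hw), one plausible reading is to simulate each layer at each candidate precision and keep the cheaper setting; the 4x int8 compute speedup and the flat quantize/dequantize overhead below are assumptions for illustration, not figures from the patent.

```python
from dataclasses import replace

def pick_quantization_knobs(layers, hw, int8_speedup=4.0,
                            quant_overhead_s=1e-6):
    # For each (m, k, n) layer, compare the simulated fp32 time against the
    # simulated int8 time plus a flat quantize/dequantize penalty, and record
    # the cheaper precision as that layer's dynamic quantization knob.
    knobs = []
    for (m, k, n) in layers:
        t_fp32 = estimated_layer_time(m, k, n, hw, dtype_bytes=4)
        hw_int8 = replace(hw, peak_flops=hw.peak_flops * int8_speedup)
        t_int8 = quant_overhead_s + estimated_layer_time(m, k, n, hw_int8,
                                                         dtype_bytes=1)
        knobs.append("int8" if t_int8 < t_fp32 else "fp32")
    return knobs

# The large matmul pays for quantization; the tiny layer's fixed overhead
# does not, so it keeps conventional precision: prints ['int8', 'fp32'].
print(pick_quantization_knobs([(1024, 1024, 1024), (1, 16, 16)], hw))
```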

Potential Applications: The technology can be used to optimize deep neural network inference for applications such as image recognition, natural language processing, and autonomous driving systems.

Problems Solved: The technology addresses the need for efficient optimization of deep neural network inference models to improve performance on specific hardware configurations.

Benefits:

  • Improved performance and efficiency of deep neural network inference models.
  • Dynamic quantization knobs for optimizing inference models based on hardware specifications.
  • Enhanced accuracy and speed of computations in DNN applications.

Commercial Applications: The technology can be applied in industries such as healthcare (medical image analysis), finance (fraud detection), and retail (customer behavior analysis) to enhance the efficiency and accuracy of deep learning models.

Prior Art: Readers can explore prior research on hardware-aware optimization techniques for deep neural networks to understand the evolution of this technology.

Frequently Updated Research: Stay current on advances in hardware-aware optimization strategies for deep neural networks to leverage cutting-edge techniques for inference model optimization.

Questions about the Hardware-Aware Cost Model for Optimizing Inference:

  1. How does the hardware-aware cost model improve the efficiency of deep neural network inference?
  2. What are the key components of the performance simulation process in optimizing inference models?


Original Abstract Submitted

The application provides a hardware-aware cost model for optimizing inference of a deep neural network (DNN) comprising: a computation cost estimator configured to compute estimated computation cost based on input tensor, weight tensor and output tensor from the DNN; and a memory/cache cost estimator configured to perform memory/cache cost estimation strategy based on hardware specifications, wherein the hardware-aware cost model is used to perform performance simulation on target hardware to provide dynamic quantization knobs to quantization as required for converting a conventional precision inference model to an optimized inference model based on the result of the performance simulation.