18323908. STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM simplified abstract (MEDIATEK Inc.)
STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM
Organization Name
MEDIATEK Inc.
Inventor(s)
Chieh-Fang Teng of Hsinchu (TW)
Chih Chung Cheng of Hsinchu (TW)
STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM - A simplified explanation of the abstract
This abstract first appeared for US patent application 18323908 titled 'STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM'.
Simplified Explanation
The apparatus described in the abstract is a system designed to optimize the execution of neural network models by determining whether operations/threads are compute bound or memory bound and allocating resources accordingly.
- The apparatus includes a compiler that compiles a neural network model to generate operations/threads and analyzes whether each operation/thread is compute bound or memory bound.
- A memory component stores the operations/threads of the neural network model.
- A thread scheduler schedules the operations/threads of the neural network model.
- A multi-engine processing unit with multiple compute units (CUs) is used to execute the operations/threads.
- An executor allocates the operations/threads and activates a number of CUs based on whether the operation/thread is compute bound or memory bound.
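The components above can be illustrated with a small sketch. The patent does not disclose how the compiler decides that an operation is compute bound or memory bound, so the code below uses a common roofline-style heuristic (arithmetic intensity compared against a machine balance point) as a stand-in; all hardware numbers, operation names, and the quarter-of-the-CUs fallback are hypothetical assumptions, not details from the application.

```python
# Illustrative sketch only: a roofline-style classifier for compute-bound vs
# memory-bound operations, plus an executor policy that activates fewer
# compute units (CUs) for memory-bound work. All parameters are assumed.
from dataclasses import dataclass

# Hypothetical hardware parameters (assumptions for illustration).
PEAK_FLOPS_PER_CU = 1e12   # FLOP/s delivered by one compute unit
MEM_BANDWIDTH = 100e9      # bytes/s of memory bandwidth, shared by all CUs
TOTAL_CUS = 8              # CUs in the multi-engine processing unit

@dataclass
class Operation:
    name: str
    flops: float        # total floating-point work for the operation
    bytes_moved: float  # total memory traffic for the operation

def classify(op: Operation) -> str:
    """Compare arithmetic intensity (FLOPs per byte) to the machine balance
    point; above it the op is limited by compute, below it by memory."""
    intensity = op.flops / op.bytes_moved
    balance = (PEAK_FLOPS_PER_CU * TOTAL_CUS) / MEM_BANDWIDTH
    return "compute_bound" if intensity >= balance else "memory_bound"

def allocate_cus(op: Operation) -> int:
    """Compute-bound ops get every CU; memory-bound ops get fewer, since
    extra CUs would only idle waiting on the same memory bandwidth."""
    if classify(op) == "compute_bound":
        return TOTAL_CUS
    return max(1, TOTAL_CUS // 4)  # illustrative fallback, not from the patent

# Example operations with made-up workload figures.
matmul = Operation("matmul", flops=2e9, bytes_moved=12e6)           # high intensity
embed = Operation("embedding_lookup", flops=1e6, bytes_moved=64e6)  # low intensity
print(classify(matmul), allocate_cus(matmul))  # compute_bound 8
print(classify(embed), allocate_cus(embed))    # memory_bound 2
```

In this sketch the classification would be computed once by the compiler (the "compiler hint"), while `allocate_cus` stands in for the executor's per-operation CU activation decision.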
Potential Applications
This technology can be applied in various fields such as artificial intelligence, machine learning, computer vision, and natural language processing.
Problems Solved
This technology addresses the challenge of efficiently executing neural network models by dynamically allocating resources based on whether operations/threads are compute bound or memory bound.
Benefits
The benefits of this technology include improved performance, optimized resource utilization, and faster execution of neural network models.
Potential Commercial Applications
One potential commercial application of this technology is in cloud computing services for accelerating deep learning tasks.
Possible Prior Art
Prior art in this field may include research papers or patents related to optimizing neural network execution through resource allocation based on workload characteristics.
What is the impact of this technology on energy efficiency in neural network processing?
This technology can potentially improve energy efficiency in neural network processing by allocating resources more intelligently based on whether operations/threads are compute bound or memory bound. By optimizing resource usage, unnecessary energy consumption can be reduced.
How does this technology compare to existing methods of optimizing neural network execution?
This technology stands out from existing methods by dynamically determining whether operations/threads are compute bound or memory bound and allocating resources accordingly. This adaptive approach can lead to more efficient execution of neural network models compared to static resource allocation methods.
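The static-versus-dynamic contrast can be made concrete with a toy example. The sketch below, which is an illustration and not the patent's implementation, compares a static round-robin plan fixed ahead of time against a dynamic policy where each thread goes to whichever engine frees up first; the thread names, costs, and two-engine setup are all assumed.

```python
# Toy comparison of static vs dynamic scheduling across engines.
# Thread costs and engine count are hypothetical, chosen to show how a
# runtime-adaptive policy can finish earlier than a fixed plan.

def static_schedule(names, num_engines):
    """Round-robin assignment decided entirely at compile time."""
    plan = {e: [] for e in range(num_engines)}
    for i, name in enumerate(names):
        plan[i % num_engines].append(name)
    return plan

def dynamic_schedule(threads, num_engines):
    """Greedy self-scheduling: each thread is pulled by the engine that
    becomes idle first. Returns the plan and its makespan."""
    finish = [0.0] * num_engines
    plan = {e: [] for e in range(num_engines)}
    for name, cost in threads:
        e = finish.index(min(finish))  # next engine to go idle
        plan[e].append(name)
        finish[e] += cost
    return plan, max(finish)

def makespan(plan, costs):
    """Completion time of the busiest engine under a given plan."""
    return max(sum(costs[t] for t in ts) for ts in plan.values())

# Made-up threads: one long operation and three short ones.
threads = [("conv", 4.0), ("add", 1.0), ("relu", 1.0), ("pool", 1.0)]
costs = dict(threads)

static_plan = static_schedule([n for n, _ in threads], 2)
dyn_plan, dyn_makespan = dynamic_schedule(threads, 2)
print(makespan(static_plan, costs))  # 5.0: conv and relu pile onto engine 0
print(dyn_makespan)                  # 4.0: short ops pack behind each other
```

The gap arises because the static plan cannot react to the long `conv` thread, whereas the dynamic policy routes subsequent work to the idle engine, which is the kind of adaptivity the comparison above describes.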
Original Abstract Submitted
Aspects of the present disclosure provide an apparatus. For example, the apparatus can include a compiler configured to compile a neural network (NN) model to generate a plurality of operations/threads and determine whether each of the operations/threads is compute bound or memory bound, and a memory coupled to the compiler and configured to store the operations/threads. The apparatus can also include a thread scheduler coupled to the memory and configured to schedule the operations/threads of the NN model. The apparatus can also include a multi-engine processing unit that includes a plurality of compute units (CUs), and an executor coupled between the thread scheduler and the multi-engine processing unit. The executor can be configured to allocate the operations/threads of the NN model and activate a number of the CUs of the multi-engine processing unit for each of the operations/threads based on whether the operation/thread is compute bound or memory bound.