18323908. STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM simplified abstract (MEDIATEK Inc.)

From WikiPatents


Organization Name

MEDIATEK Inc.

Inventor(s)

Chieh-Fang Teng of Hsinchu (TW)

En-Jui Chang of Hsinchu (TW)

Chih Chung Cheng of Hsinchu (TW)

STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM - A simplified explanation of the abstract

This abstract first appeared for US patent application 18323908, titled 'STATIC SCHEDULING AND DYNAMIC SCHEDULING FOR COMPILER-HINTED AND SELF-SCHEDULING MULTI-ENGINE ARTIFICIAL INTELLIGENCE (AI) PROCESSING UNIT SYSTEM'.

Simplified Explanation

The apparatus described in the abstract optimizes the execution of neural network models by determining whether each operation/thread is compute bound or memory bound and allocating compute resources accordingly.

  • The apparatus includes a compiler that compiles a neural network model to generate operations/threads and analyzes whether each operation/thread is compute bound or memory bound.
  • A memory component stores the operations/threads of the neural network model.
  • A thread scheduler schedules the operations/threads of the neural network model.
  • A multi-engine processing unit with multiple compute units (CUs) is used to execute the operations/threads.
  • An executor allocates the operations/threads and activates a number of CUs based on whether the operation/thread is compute bound or memory bound.
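The allocation idea in the bullets above can be sketched with a roofline-style test: an operation whose arithmetic intensity (FLOPs per byte of memory traffic) exceeds the machine's balance point is compute bound and benefits from all compute units, while a memory-bound operation is limited by bandwidth and can run on fewer CUs without losing throughput. The following is a minimal illustrative sketch, not the patent's actual method; all class names, hardware numbers, and thresholds are hypothetical.

```python
# Illustrative sketch of compute-bound vs. memory-bound classification
# and CU activation. All parameters below are assumed, not from the patent.
from dataclasses import dataclass

@dataclass
class Op:
    name: str
    flops: float        # floating-point operations required
    bytes_moved: float  # bytes read/written to memory

# Assumed hardware parameters (illustrative numbers only)
PEAK_FLOPS_PER_CU = 1e12   # FLOP/s per compute unit
MEM_BANDWIDTH = 100e9      # bytes/s, shared across all CUs
TOTAL_CUS = 8

def is_compute_bound(op: Op) -> bool:
    # Roofline-style test: arithmetic intensity above the machine
    # balance point means the op is limited by compute, not bandwidth.
    balance = (PEAK_FLOPS_PER_CU * TOTAL_CUS) / MEM_BANDWIDTH
    return (op.flops / op.bytes_moved) >= balance

def cus_to_activate(op: Op) -> int:
    # Compute-bound ops use every CU; memory-bound ops are throttled
    # by bandwidth, so activating fewer CUs saves power at the same
    # throughput.
    if is_compute_bound(op):
        return TOTAL_CUS
    needed = (op.flops / op.bytes_moved) * MEM_BANDWIDTH / PEAK_FLOPS_PER_CU
    return max(1, min(TOTAL_CUS, round(needed)))

matmul = Op("matmul", flops=2e12, bytes_moved=8e9)           # high intensity
embed = Op("embedding_lookup", flops=1e9, bytes_moved=4e9)   # low intensity
```

With these numbers the matrix multiply is classified compute bound and gets all eight CUs, while the embedding lookup is memory bound and is given a single CU.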

Potential Applications

This technology can be applied in various fields such as artificial intelligence, machine learning, computer vision, and natural language processing.

Problems Solved

This technology addresses the challenge of efficiently executing neural network models by dynamically allocating resources based on whether operations/threads are compute bound or memory bound.

Benefits

The benefits of this technology include improved performance, optimized resource utilization, and faster execution of neural network models.

Potential Commercial Applications

One potential commercial application of this technology is in cloud computing services for accelerating deep learning tasks.

Possible Prior Art

Prior art in this field may include research papers or patents related to optimizing neural network execution through resource allocation based on workload characteristics.

What is the impact of this technology on energy efficiency in neural network processing?

This technology can potentially improve energy efficiency in neural network processing by allocating resources more intelligently based on whether operations/threads are compute bound or memory bound. By optimizing resource usage, unnecessary energy consumption can be reduced.

How does this technology compare to existing methods of optimizing neural network execution?

This technology stands out from existing methods by dynamically determining whether operations/threads are compute bound or memory bound and allocating resources accordingly. This adaptive approach can lead to more efficient execution of neural network models compared to static resource allocation methods.
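The title's two scheduling modes can be contrasted in a small sketch: static (compiler-hinted) scheduling consumes a boundedness label attached at compile time, while dynamic (self-scheduling) falls back to a runtime signal such as the observed memory-stall ratio when no hint is available. This is a hypothetical illustration; the function name, the stall-ratio threshold, and the CU counts are all assumptions, not details from the application.

```python
# Illustrative contrast of compiler-hinted (static) vs. self-scheduling
# (dynamic) CU allocation. All names and thresholds are assumptions.
from typing import Optional

TOTAL_CUS = 8

def choose_cus(compile_hint: Optional[str],
               measured_mem_stall_ratio: float) -> int:
    """Return how many compute units to activate for one operation."""
    if compile_hint is not None:
        # Static path: trust the compiler's compile-time analysis.
        bound = compile_hint
    else:
        # Dynamic path: infer boundedness from observed memory stalls.
        bound = "memory" if measured_mem_stall_ratio > 0.5 else "compute"
    # Memory-bound work runs on half the CUs in this toy policy.
    return TOTAL_CUS if bound == "compute" else TOTAL_CUS // 2
```

For example, an operation hinted as compute bound is given all eight CUs regardless of runtime behavior, while an unhinted operation that stalls on memory most of the time is self-scheduled onto four.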


Original Abstract Submitted

Aspects of the present disclosure provide an apparatus. For example, the apparatus can include a compiler configured to compile a neural network (NN) model to generate a plurality of operations/threads and determine whether each of the operations/threads is compute bound or memory bound, and a memory coupled to the compiler and configured to store the operations/threads. The apparatus can also include a thread scheduler coupled to the memory and configured to schedule the operations/threads of the NN model. The apparatus can also include a multi-engine processing unit that includes a plurality of compute units (CUs), and an executor coupled between the thread scheduler and the multi-engine processing unit. The executor can be configured to allocate the operations/threads of the NN model and activate a number of the CUs of the multi-engine processing unit for each of the operations/threads based on whether the operation/thread is compute bound or memory bound.