Qualcomm Incorporated (20240177048). DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION simplified abstract
DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION
Organization Name
Qualcomm Incorporated
Inventor(s)
Weiliang Zeng of San Diego CA (US)
Christopher G. Lott of San Diego CA (US)
Edward H. Teague of San Diego CA (US)
Yang Yang of San Diego CA (US)
Joseph Binamira Soriaga of San Diego CA (US)
DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240177048, titled 'DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION'.
Simplified Explanation
The abstract describes a method for optimizing the compilation of a machine learning model so that it can be executed on target edge devices. The method involves allocating compute nodes to a compiler optimization process, scheduling rounds of optimization among those nodes, applying a sequencing and scheduling solution at each node to obtain a performance metric, and implementing the solution with the best performance metric for execution on the target edge devices.
- The method involves allocating compute nodes to a compiler optimization process for a machine learning model.
- The machine learning model has a compute graph representation with nodes as kernel operators and edges defining precedence constraints.
- Rounds of optimization are scheduled among the allocated compute nodes.
- A sequencing and scheduling solution is applied at each node to obtain a performance metric for the machine learning model.
- The solution with the best performance metric is identified and implemented for execution on the target edge devices.
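The compute graph described above can be illustrated with a minimal sketch. The class and attribute names here (`KernelOp`, `ComputeGraph`, `add_edge`) are hypothetical placeholders for illustration, not identifiers from the patent:

```python
from dataclasses import dataclass, field

# Minimal sketch of the compute-graph representation: nodes are kernel
# operators and edges encode precedence constraints between them.

@dataclass
class KernelOp:
    name: str

@dataclass
class ComputeGraph:
    ops: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (predecessor, successor) pairs

    def add_op(self, name):
        op = KernelOp(name)
        self.ops.append(op)
        return op

    def add_edge(self, pred, succ):
        # A precedence constraint: succ may not run until pred completes.
        self.edges.append((pred, succ))

g = ComputeGraph()
conv = g.add_op("conv2d")
relu = g.add_op("relu")
g.add_edge(conv, relu)  # relu depends on the output of conv2d
```

Any valid execution order of such a graph must respect the precedence edges, which is what the sequencing and scheduling solutions applied at each compute node search over.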
Potential Applications
This technology can be applied in various fields such as:
- Edge computing
- Internet of Things (IoT) devices
- Real-time machine learning applications
Problems Solved
This technology helps in:
- Optimizing the compilation of machine learning models for edge devices
- Improving the performance of machine learning models on target edge devices
Benefits
The benefits of this technology include:
- Enhanced efficiency in executing machine learning models on edge devices
- Improved performance metrics for machine learning applications
- Cost-effective optimization process for edge computing environments
Potential Commercial Applications
This technology can be utilized in:
- Smart home devices
- Industrial automation systems
- Autonomous vehicles
Possible Prior Art
One possible prior art in this field is the use of distributed computing techniques for optimizing machine learning models for edge devices.
What are the key components of the optimization process described in the abstract?
The key components of the optimization process described in the abstract include:
- Allocation of compute nodes to a compiler optimization process
- Scheduling rounds of optimization among the allocated compute nodes
- Applying sequencing and scheduling solutions at each node
- Identifying and implementing the solution with the best performance metric for execution on the target edge devices
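The round-based search over allocated compute nodes could be sketched as follows. This is an illustrative toy, assuming a random candidate generator and a stand-in cost function; the names (`random_schedule`, `evaluate_schedule`, `distributed_optimize`) are hypothetical and do not come from the patent, which does not specify the search strategy:

```python
import random

def random_schedule(ops, rng):
    # Candidate sequencing solution: a random permutation of the kernel ops.
    order = list(ops)
    rng.shuffle(order)
    return order

def evaluate_schedule(order):
    # Stand-in performance metric (lower is better); a real system would
    # measure latency of the compiled model on the target edge device.
    return sum(i * len(op) for i, op in enumerate(order))

def distributed_optimize(ops, num_workers=4, rounds=3, seed=0):
    best_order, best_metric = None, float("inf")
    for r in range(rounds):
        # Each allocated compute node tries one candidate solution per round;
        # here the "nodes" are simulated sequentially for clarity.
        for worker in range(num_workers):
            rng = random.Random(seed + r * num_workers + worker)
            order = random_schedule(ops, rng)
            metric = evaluate_schedule(order)
            # Collect metrics and keep the best-performing solution.
            if metric < best_metric:
                best_order, best_metric = order, metric
    return best_order, best_metric

order, metric = distributed_optimize(["conv2d", "relu", "matmul", "softmax"])
```

In a real deployment each round would be dispatched to physically distinct compute nodes in parallel, and the winning solution would then be compiled for the target edge devices.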
How does this technology contribute to the advancement of edge computing applications?
This technology contributes to the advancement of edge computing applications by:
- Enhancing the efficiency and performance of machine learning models on target edge devices
- Providing a cost-effective optimization process for edge computing environments
- Enabling real-time execution of machine learning applications on edge devices
Original Abstract Submitted
A method for optimizing the compilation of a machine learning model to be executed on target edge devices is provided. Compute nodes of a plurality of compute nodes are allocated to a compiler optimization process for a compiler of said machine learning model. The machine learning model has a compute graph representation having nodes that are kernel operators necessary to execute the machine learning model and edges that connect said kernel operators to define precedence constraints. A round of optimization is scheduled for the process amongst the allocated compute nodes. At each allocated compute node a sequencing and scheduling solution is applied per round to obtain a performance metric for the machine learning model. From each compute node the performance metric is received, and a solution that has the best performance metric is identified and implemented for execution of the machine learning model on the target edge devices.