Qualcomm Incorporated (20240177048). DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION simplified abstract
DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION
Organization Name
Qualcomm Incorporated
Inventor(s)
Weiliang Zeng of San Diego CA (US)
Christopher G. Lott of San Diego CA (US)
Edward H. Teague of San Diego CA (US)
Yang Yang of San Diego CA (US)
Joseph Binamira Soriaga of San Diego CA (US)
DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION - A simplified explanation of the abstract
This abstract first appeared for US patent application 20240177048, titled 'DISTRIBUTED MACHINE LEARNING COMPILER OPTIMIZATION'.
Simplified Explanation
The abstract describes a method for optimizing the compilation of a machine learning model so that it can be executed on target edge devices. The method involves allocating compute nodes to a compiler optimization process, scheduling rounds of optimization among those nodes, applying a sequencing and scheduling solution at each node to obtain a performance metric, and implementing the solution with the best performance metric for execution on the target edge devices.
- The method involves allocating compute nodes to a compiler optimization process for a machine learning model.
- The machine learning model has a compute graph representation with nodes as kernel operators and edges defining precedence constraints.
- Rounds of optimization are scheduled among the allocated compute nodes.
- A sequencing and scheduling solution is applied at each node to obtain a performance metric for the machine learning model.
- The solution with the best performance metric is identified and implemented for execution on the target edge devices.
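The compute graph described above can be illustrated with a minimal sketch. The class and attribute names here (`KernelOp`, `ComputeGraph`, `add_edge`) are hypothetical placeholders for illustration, not identifiers from the patent:

```python
from dataclasses import dataclass, field

# Minimal sketch of the compute-graph representation: nodes are kernel
# operators and edges encode precedence constraints between them.

@dataclass
class KernelOp:
    name: str

@dataclass
class ComputeGraph:
    ops: list = field(default_factory=list)
    edges: list = field(default_factory=list)  # (predecessor, successor) pairs

    def add_op(self, name):
        op = KernelOp(name)
        self.ops.append(op)
        return op

    def add_edge(self, pred, succ):
        # A precedence constraint: succ may not run until pred completes.
        self.edges.append((pred, succ))

g = ComputeGraph()
conv = g.add_op("conv2d")
relu = g.add_op("relu")
g.add_edge(conv, relu)  # relu depends on the output of conv2d
```

Any valid execution order of such a graph must respect the precedence edges, which is what the sequencing and scheduling solutions applied at each compute node search over.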
Potential Applications
This technology can be applied in various fields such as:
- Edge computing
- Internet of Things (IoT) devices
- Real-time machine learning applications
Problems Solved
This technology helps in:
- Optimizing the compilation of machine learning models for edge devices
- Improving the performance of machine learning models on target edge devices
Benefits
The benefits of this technology include:
- Enhanced efficiency in executing machine learning models on edge devices
- Improved performance metrics for machine learning applications
- Cost-effective optimization process for edge computing environments
Potential Commercial Applications
This technology can be utilized in:
- Smart home devices
- Industrial automation systems
- Autonomous vehicles
Possible Prior Art
One possible prior art in this field is the use of distributed computing techniques for optimizing machine learning models for edge devices.
What are the key components of the optimization process described in the abstract?
The key components of the optimization process described in the abstract include:
- Allocation of compute nodes to a compiler optimization process
- Scheduling rounds of optimization among the allocated compute nodes
- Applying sequencing and scheduling solutions at each node
- Identifying and implementing the solution with the best performance metric for execution on the target edge devices
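The round-based search over allocated compute nodes could be sketched as follows. This is an illustrative toy, assuming a random candidate generator and a stand-in cost function; the names (`random_schedule`, `evaluate_schedule`, `distributed_optimize`) are hypothetical and do not come from the patent, which does not specify the search strategy:

```python
import random

def random_schedule(ops, rng):
    # Candidate sequencing solution: a random permutation of the kernel ops.
    order = list(ops)
    rng.shuffle(order)
    return order

def evaluate_schedule(order):
    # Stand-in performance metric (lower is better); a real system would
    # measure latency of the compiled model on the target edge device.
    return sum(i * len(op) for i, op in enumerate(order))

def distributed_optimize(ops, num_workers=4, rounds=3, seed=0):
    best_order, best_metric = None, float("inf")
    for r in range(rounds):
        # Each allocated compute node tries one candidate solution per round;
        # here the "nodes" are simulated sequentially for clarity.
        for worker in range(num_workers):
            rng = random.Random(seed + r * num_workers + worker)
            order = random_schedule(ops, rng)
            metric = evaluate_schedule(order)
            # Collect metrics and keep the best-performing solution.
            if metric < best_metric:
                best_order, best_metric = order, metric
    return best_order, best_metric

order, metric = distributed_optimize(["conv2d", "relu", "matmul", "softmax"])
```

In a real deployment each round would be dispatched to physically distinct compute nodes in parallel, and the winning solution would then be compiled for the target edge devices.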
How does this technology contribute to the advancement of edge computing applications?
This technology contributes to the advancement of edge computing applications by:
- Enhancing the efficiency and performance of machine learning models on target edge devices
- Providing a cost-effective optimization process for edge computing environments
- Enabling real-time execution of machine learning applications on edge devices
Original Abstract Submitted
A method for optimizing the compilation of a machine learning model to be executed on target edge devices is provided. Compute nodes of a plurality of compute nodes are allocated to a compiler optimization process for a compiler of said machine learning model. The machine learning model has a compute graph representation having nodes that are kernel operators necessary to execute the machine learning model and edges that connect said kernel operators to define precedence constraints. A round of optimization is scheduled for the process amongst the allocated compute nodes. At each allocated compute node a sequencing and scheduling solution is applied per round to obtain a performance metric for the machine learning model. From each compute node the performance metric is received, and a solution that has the best performance metric is identified and implemented for execution of the machine learning model on the target edge devices.