17848679. SYSTEMS AND METHODS FOR DISTRIBUTING LAYERS OF SPECIAL MIXTURE-OF-EXPERTS MACHINE LEARNING MODELS simplified abstract (Microsoft Technology Licensing, LLC)
SYSTEMS AND METHODS FOR DISTRIBUTING LAYERS OF SPECIAL MIXTURE-OF-EXPERTS MACHINE LEARNING MODELS
Organization Name
Microsoft Technology Licensing, LLC
Inventor(s)
Devangkumar Rameshbhai Patel of Fremont, CA (US)
SYSTEMS AND METHODS FOR DISTRIBUTING LAYERS OF SPECIAL MIXTURE-OF-EXPERTS MACHINE LEARNING MODELS - A simplified explanation of the abstract
This abstract first appeared for US patent application 17848679, titled 'SYSTEMS AND METHODS FOR DISTRIBUTING LAYERS OF SPECIAL MIXTURE-OF-EXPERTS MACHINE LEARNING MODELS'.
Simplified Explanation
The patent application describes a computing system equipped with heterogeneous accelerators that differ in memory and processing capabilities. The system distributes a machine learning model composed of dense and sparse layers across these accelerators: dense layers are placed on accelerators with greater memory capability, while sparse layers are placed on accelerators with greater processing capability.
- The computing system has different accelerators with varying memory and processing capabilities.
- The machine learning model is divided into dense and sparse layers.
- Dense layers are distributed on accelerators with greater memory capability.
- Sparse layers are distributed on accelerators with greater processing capability.
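The placement scheme above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration, not the patented implementation: the `Accelerator` and `Layer` types, the two-way split of the accelerator pool, and the round-robin assignment are all assumptions introduced here to make the idea concrete.

```python
from dataclasses import dataclass

@dataclass
class Accelerator:
    name: str
    memory_gb: float  # memory capability (hypothetical metric)
    tflops: float     # processing capability (hypothetical metric)

@dataclass
class Layer:
    name: str
    kind: str  # "dense" or "sparse"

def place_layers(layers, accelerators):
    """Assign dense layers to the higher-memory set of accelerators
    and sparse layers to the higher-throughput set, mirroring the
    distribution scheme described above (illustrative only)."""
    # Split the pool into two sets by dominant capability.
    by_memory = sorted(accelerators, key=lambda a: a.memory_gb, reverse=True)
    by_compute = sorted(accelerators, key=lambda a: a.tflops, reverse=True)
    n = max(1, len(accelerators) // 2)
    memory_set = by_memory[:n]                                  # first set: greater memory
    compute_set = [a for a in by_compute if a not in memory_set][:n] or memory_set

    placement = {}
    for i, layer in enumerate(layers):
        pool = memory_set if layer.kind == "dense" else compute_set
        placement[layer.name] = pool[i % len(pool)].name        # round-robin within the set
    return placement
```

For example, given one high-memory device and one high-throughput device, a dense layer lands on the former and a sparse (expert) layer on the latter.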
Potential Applications
This technology can be applied in various fields where machine learning models are used, such as:
- Natural language processing
- Computer vision
- Speech recognition
- Recommendation systems
Problems Solved
This technology addresses the following problems:
- Imbalance between memory and processing capabilities in accelerators.
- Efficient distribution of machine learning models across different accelerators.
- Optimizing performance and resource utilization in computing systems.
Benefits
The benefits of this technology include:
- Improved performance by utilizing accelerators with different capabilities effectively.
- Enhanced memory capacity for processing dense layers.
- Increased processing power for handling sparse layers.
- Efficient utilization of computing resources.
Original Abstract Submitted
Some disclosed embodiments are directed to computing systems having different accelerators such that a first set of accelerators has a greater memory capability than a second set of accelerators, while the second set of accelerators has a greater processing capability than the first set of accelerators. A machine learning model having different dense layers and sparse layers is distributed on the different accelerators such that the dense layers are distributed on one or more accelerators selected from the first set of accelerators and the sparse layers are distributed on one or more accelerators in the second set of accelerators.