18521396. MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR simplified abstract (ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE)
Contents
- 1 MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Potential Applications
- 1.6 Problems Solved
- 1.7 Benefits
- 1.8 Potential Commercial Applications
- 1.9 Possible Prior Art
- 1.10 Unanswered Questions
- 1.11 Original Abstract Submitted
MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR
Organization Name
ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventor(s)
Myung-Hoon Cha of Daejeon (KR)
MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR - A simplified explanation of the abstract
This abstract first appeared for US patent application 18521396 titled 'MACHINE LEARNING PARALLELIZATION METHOD USING HOST CPU WITH MULTI-SOCKET STRUCTURE AND APPARATUS THEREFOR'.
Simplified Explanation
The abstract describes a method for machine-learning parallelization using the host CPUs of a multi-socket structure. The method splits a learning model at the layer level into pipeline stages, allocates the stages to Non-Uniform Memory Access (NUMA) nodes corresponding to the respective CPU sockets, initializes the parameters required for learning, generates multiple threads according to the policy of each parallelism algorithm, and executes the threads by allocating them to the cores within each NUMA node.
- Machine-learning parallelization method using host CPUs of a multi-socket structure:
  - Splitting the learning model at a layer level into pipeline stages
  - Allocating stages to NUMA nodes for respective CPU sockets
  - Initializing parameters for learning
  - Generating multiple threads based on each parallelism algorithm's policy
  - Executing threads by allocating them to respective cores in the NUMA node
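The compile-phase steps above (layer-level splitting and NUMA allocation) can be sketched in Python. This is a minimal illustration under assumptions, not the patented implementation: the function names `split_into_stages` and `assign_stages_to_numa`, and the contiguous split with round-robin stage-to-node mapping, are hypothetical choices for clarity.

```python
def split_into_stages(layers, num_stages):
    """Split an ordered list of layers into contiguous pipeline stages.

    A hypothetical even split; the patent does not specify the
    partitioning heuristic.
    """
    base, extra = divmod(len(layers), num_stages)
    stages, start = [], 0
    for i in range(num_stages):
        size = base + (1 if i < extra else 0)  # spread the remainder
        stages.append(layers[start:start + size])
        start += size
    return stages


def assign_stages_to_numa(stages, num_numa_nodes):
    """Map each pipeline stage to a NUMA node (one node per CPU socket).

    Round-robin assignment is an illustrative assumption, wrapping when
    there are more stages than sockets.
    """
    return {i: i % num_numa_nodes for i in range(len(stages))}


# Usage: an 8-layer model split into 4 stages on a 2-socket host.
stages = split_into_stages([f"layer{i}" for i in range(8)], num_stages=4)
placement = assign_stages_to_numa(stages, num_numa_nodes=2)
```

In this sketch, `placement` maps stage indices to NUMA node indices, so stages 0 and 2 land on socket 0 and stages 1 and 3 on socket 1.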
Potential Applications
This technology can be applied in various fields such as:
- Data analytics
- Image recognition
- Natural language processing
Problems Solved
- Efficient parallelization of machine learning tasks
- Utilization of host CPUs in a multi-socket structure
- Optimization of learning model allocation
Benefits
- Improved performance in machine learning tasks
- Enhanced scalability
- Reduced computational time and resources
Potential Commercial Applications
Optimized machine learning parallelization for:
- Cloud computing services
- Big data analytics platforms
- AI-driven applications
Possible Prior Art
Prior art may include:
- Parallel processing techniques in distributed systems
- Multi-threading algorithms for computational tasks
Unanswered Questions
How does this method compare to existing parallelization techniques in terms of performance and scalability?
This article does not provide a direct comparison with existing parallelization techniques. Further research or experimentation may be needed to evaluate the performance and scalability of this method in comparison to others.
What are the specific policies for each parallelism algorithm mentioned in the abstract, and how do they impact the execution of multiple threads?
The abstract mentions considering a policy for each parallelism algorithm when generating multiple threads. However, it does not elaborate on the specific policies or how they affect the execution of threads. Additional information or details would be required to understand the impact of these policies on thread execution.
Original Abstract Submitted
Disclosed herein are a method for machine-learning parallelization using host CPUs of a multi-socket structure and an apparatus therefor. The method, performed by the apparatus for machine-learning parallelization using host CPUs of a multi-socket structure, includes a compile phase in which a learning model is split at a layer level for respective pipeline stages and allocated to Non-Uniform Memory Access (NUMA) nodes for respective CPU sockets and a runtime phase in which parameters required for learning are initialized and multiple threads generated in consideration of a policy of each parallelism algorithm are executed by being allocated to respective cores included in the NUMA node.
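The runtime phase in the abstract (threads allocated to cores within a NUMA node, one per pipeline stage) can be sketched as a thread-per-stage pipeline connected by queues. This is an assumed illustration, not the disclosed apparatus: `run_pipeline` is a hypothetical helper, and core pinning uses Linux's `os.sched_setaffinity` only where available, since the patent does not specify the pinning mechanism.

```python
import os
import queue
import threading


def run_pipeline(stage_fns, stage_cores, inputs):
    """Run one worker thread per pipeline stage, passing items via queues.

    stage_fns: one callable per stage (the stage's layer computation).
    stage_cores: per-stage sets of core IDs; a thread is pinned to its
    stage's NUMA-local cores where the platform supports it.
    """
    queues = [queue.Queue() for _ in range(len(stage_fns) + 1)]

    def worker(idx, fn, cores):
        # Best-effort affinity: pin the calling thread to its stage's
        # cores on Linux; silently skip elsewhere (illustrative only).
        if cores and hasattr(os, "sched_setaffinity"):
            try:
                os.sched_setaffinity(0, cores)
            except OSError:
                pass  # requested cores unavailable; run unpinned
        while True:
            item = queues[idx].get()
            if item is None:  # sentinel: propagate shutdown downstream
                queues[idx + 1].put(None)
                break
            queues[idx + 1].put(fn(item))

    threads = [
        threading.Thread(target=worker, args=(i, fn, stage_cores[i]))
        for i, fn in enumerate(stage_fns)
    ]
    for t in threads:
        t.start()
    for x in inputs:  # feed micro-batches into the first stage
        queues[0].put(x)
    queues[0].put(None)
    outputs = []
    while True:
        item = queues[-1].get()
        if item is None:
            break
        outputs.append(item)
    for t in threads:
        t.join()
    return outputs


# Usage: a two-stage pipeline over three inputs, unpinned for portability.
results = run_pipeline(
    [lambda x: x + 1, lambda x: x * 2],  # stand-ins for stage computations
    [set(), set()],                      # empty sets: no pinning requested
    [1, 2, 3],
)
```

Because each stage has a single worker reading from a FIFO queue, item order is preserved end to end; in the usage above, `results` is `[4, 6, 8]`.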