17943256. FUNCTION-BASED ACTIVATION OF MEMORY TIERS simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)
FUNCTION-BASED ACTIVATION OF MEMORY TIERS
Organization Name
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor(s)
Julian Roettger Buechel of Zurich (CH)
Manuel Le Gallo-Bourdeau of Horgen (CH)
Irem Boybat Kara of Adliswil (CH)
Abbas Rahimi of Rueschlikon (CH)
Abu Sebastian of Adliswil (CH)
This abstract first appeared for US patent application 17943256, titled 'FUNCTION-BASED ACTIVATION OF MEMORY TIERS'.
Simplified Explanation
The patent application describes a 3D compute-in-memory accelerator system for efficient inference of Mixture of Expert (MoE) neural network models.
- The system includes multiple tiers of in-memory compute cells in each compute-in-memory core.
- Expert sub-models of the MoE model are represented by one or more tiers of in-memory compute cells.
- Expert sub-models are selected for activation propagation using a function-based routing method.
- A hash-based tier selection function is used for dynamic routing of inputs and output activations.
- The routing function can select a single expert or multiple experts, using either input-data-based or layer-activation-based MoE routing for single-tier activation.
- The system can be configured as a multi-model system with single-expert selection or with multi-expert selection.
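The hash-based tier selection above can be sketched in software as follows. This is a minimal illustration only: the choice of SHA-256 as the hash, the expert count, and the top-k parameter are all assumptions, since the abstract does not disclose the actual selection function or its parameters.

```python
import hashlib

NUM_EXPERTS = 8  # assumed number of expert sub-models (one or more tiers each)
TOP_K = 2        # assumed number of experts to activate per input

def hash_tier_selection(input_vector, num_experts=NUM_EXPERTS, top_k=TOP_K):
    """Select which expert tiers to activate for a given input.

    A deterministic hash of the input bytes stands in for the patent's
    hash-based tier selection function (not specified in the abstract).
    """
    digest = hashlib.sha256(bytes(input_vector)).digest()
    selected = []
    # Derive up to top_k distinct tier indices from successive digest bytes.
    for byte in digest:
        tier = byte % num_experts
        if tier not in selected:
            selected.append(tier)
        if len(selected) == top_k:
            break
    return selected

# Only the selected tiers would be powered on for the in-memory
# matrix-vector multiply; all other tiers stay inactive.
tiers = hash_tier_selection([3, 1, 4, 1, 5])
```

Because the hash is computed from the input itself, the same input always routes to the same experts, which allows routing without a learned gating network.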
Potential Applications
This technology can be applied in various fields such as artificial intelligence, machine learning, data analytics, and pattern recognition.
Problems Solved
This technology addresses the challenges of efficient inference in complex neural network models like Mixture of Expert (MoE) models.
Benefits
The system offers improved performance, energy efficiency, and scalability for inference tasks in neural network models.
Potential Commercial Applications
AI accelerator hardware optimized for inference of Mixture of Expert (MoE) neural network models.
Original Abstract Submitted
A 3D compute-in-memory accelerator system and method for efficient inference of Mixture of Expert (MoE) neural network models. The system includes a plurality of compute-in-memory cores, each in-memory core including multiple tiers of in-memory compute cells. One or more tiers of in-memory compute cells correspond to an expert sub-model of the MoE model. One or more expert sub-models are selected for activation propagation based on a function-based routing, the tiers of the corresponding experts being activated based on this function. In one embodiment, this function is a hash-based tier selection function used for dynamic routing of inputs and output activations. In embodiments, the function is applied to select a single expert or multiple experts with input data-based or with layer-activation-based MoEs for single tier activation. Further, the system is configured as a multi-model system with single expert model selection or with a multi-model system with multi-expert selection.
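To illustrate the single-expert routing described in the abstract, here is a minimal software sketch. The tier count, activation width, routing function, and weights are all assumptions for illustration; in the actual system, each expert's weights are stored in a tier of in-memory compute cells and the matrix-vector multiply happens inside the memory, not in software.

```python
import random

random.seed(0)

NUM_TIERS = 4  # assumed number of tiers (experts) per compute-in-memory core
DIM = 8        # assumed activation width

# Each tier holds one expert's weight matrix (mapped onto in-memory
# compute cells in the hardware).
tier_weights = [
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
    for _ in range(NUM_TIERS)
]

def route(x, num_tiers=NUM_TIERS):
    # Toy input-data-based routing function (a stand-in; the abstract does
    # not disclose the real function): derive a tier index from the input.
    return int(abs(sum(x)) * 997) % num_tiers

def mvm(weights, x):
    # Matrix-vector multiply: the operation a tier performs in memory.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def moe_forward(x):
    tier = route(x)                    # select and activate a single tier
    return mvm(tier_weights[tier], x)  # only that tier computes

x = [0.1 * i for i in range(DIM)]
y = moe_forward(x)
```

Activating only the routed tier is what yields the energy savings the abstract claims: the inactive experts' cells perform no computation for that input.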