17943256. FUNCTION-BASED ACTIVATION OF MEMORY TIERS simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)

FUNCTION-BASED ACTIVATION OF MEMORY TIERS

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Julian Roettger Buechel of Zurich (CH)

Manuel Le Gallo-Bourdeau of Horgen (CH)

Irem Boybat Kara of Adliswil (CH)

Abbas Rahimi of Rueschlikon (CH)

Abu Sebastian of Adliswil (CH)

FUNCTION-BASED ACTIVATION OF MEMORY TIERS - A simplified explanation of the abstract

This abstract first appeared for US patent application 17943256, titled 'FUNCTION-BASED ACTIVATION OF MEMORY TIERS'.

Simplified Explanation

The patent application describes a 3D compute-in-memory accelerator system for efficient inference of Mixture of Expert (MoE) neural network models.

  • Each compute-in-memory core contains multiple tiers of in-memory compute cells.
  • Each expert sub-model of the MoE model is mapped to one or more tiers of in-memory compute cells.
  • Expert sub-models are selected for activation propagation using a function-based routing method.
  • A hash-based tier selection function provides dynamic routing of inputs and output activations (see the sketch after this list).
  • The routing function can select a single expert or multiple experts, driven either by the input data or by layer activations, for single-tier activation.
  • The system can be configured as a multi-model system with single-expert selection or with multi-expert selection.
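
The abstract names a hash-based tier selection function but does not disclose its form, so the following is a minimal Python sketch of one plausible reading: a deterministic hash of the input (or of intermediate activations) picks which expert tier(s) to activate, and only the selected tiers' weights take part in the matrix-vector multiply. All names here (hash_route, MoETierAccelerator, top_k) are illustrative, not from the patent.

```python
import zlib

import numpy as np

def hash_route(x: np.ndarray, num_experts: int, top_k: int = 1) -> list[int]:
    """Illustrative hash-based tier selection: derive expert indices
    deterministically from the activation vector."""
    # Quantize to signs so small numerical noise does not flip the bucket.
    key = np.sign(x).astype(np.int8).tobytes()
    h = zlib.crc32(key)
    # top_k consecutive buckets yield top_k distinct expert indices.
    return [(h + i) % num_experts for i in range(top_k)]

class MoETierAccelerator:
    """Toy model of one compute-in-memory core: one weight matrix per
    tier, one tier per expert sub-model (the simplest mapping the
    abstract allows; it also permits several tiers per expert)."""

    def __init__(self, num_experts: int, d_in: int, d_out: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # Each "tier" stands in for a physical tier of in-memory compute
        # cells holding one expert's weights.
        self.tiers = [rng.standard_normal((d_out, d_in))
                      for _ in range(num_experts)]

    def forward(self, x: np.ndarray, top_k: int = 1) -> np.ndarray:
        active = hash_route(x, len(self.tiers), top_k)
        # Only the selected tiers are activated; idle tiers do no work,
        # which is where the claimed efficiency would come from.
        return sum(self.tiers[e] @ x for e in active) / len(active)

acc = MoETierAccelerator(num_experts=8, d_in=16, d_out=16)
y = acc.forward(np.ones(16), top_k=2)
print(y.shape)  # (16,)
```

One consequence of a fixed routing function, as opposed to a learned gating network, is that no gating weights need to be stored or evaluated on-chip; routing reduces to a cheap hash over the activations, which is consistent with the abstract's emphasis on activating only the tiers of the selected experts.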

Potential Applications

This technology can be applied in fields such as artificial intelligence, machine learning, data analytics, and pattern recognition.

Problems Solved

This technology addresses the challenges of efficient inference in complex neural network models like Mixture of Expert (MoE) models.

Benefits

The system offers improved performance, energy efficiency, and scalability for inference tasks in neural network models.

Potential Commercial Applications

Accelerator hardware that optimizes inference of Mixture of Expert (MoE) neural network models.


Original Abstract Submitted

A 3D compute-in-memory accelerator system and method for efficient inference of Mixture of Expert (MoE) neural network models. The system includes a plurality of compute-in-memory cores, each in-memory core including multiple tiers of in-memory compute cells. One or more tiers of in-memory compute cells correspond to an expert sub-model of the MoE model. One or more expert sub-models are selected for activation propagation based on a function-based routing, the tiers of the corresponding experts being activated based on this function. In one embodiment, this function is a hash-based tier selection function used for dynamic routing of inputs and output activations. In embodiments, the function is applied to select a single expert or multiple experts with input data-based or with layer-activation-based MoEs for single tier activation. Further, the system is configured as a multi-model system with single expert model selection or with a multi-model system with multi-expert selection.
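
The abstract distinguishes input-data-based MoEs, where the expert choice is made once from the raw input, from layer-activation-based MoEs, where each layer re-routes on its own activations. Below is a hypothetical sketch of the two modes, reusing the hash_route helper and MoETierAccelerator toy class from the earlier example (square layers are assumed so activations can be chained):

```python
def infer_input_routed(layers, x, top_k=1):
    # Input-data-based MoE: route once on the raw input, then keep the
    # same expert tiers active for every layer.
    experts = hash_route(x, len(layers[0].tiers), top_k)
    for layer in layers:
        x = sum(layer.tiers[e] @ x for e in experts) / len(experts)
    return x

def infer_activation_routed(layers, x, top_k=1):
    # Layer-activation-based MoE: re-route at every layer from the
    # current activations, so different tiers may fire per layer.
    for layer in layers:
        experts = hash_route(x, len(layer.tiers), top_k)
        x = sum(layer.tiers[e] @ x for e in experts) / len(experts)
    return x

layers = [MoETierAccelerator(8, 16, 16, seed=s) for s in range(3)]
print(infer_activation_routed(layers, np.ones(16), top_k=2).shape)  # (16,)
```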