20230168899. GENERATIVE AI ACCELERATOR APPARATUS USING IN-MEMORY COMPUTE CHIPLET DEVICES FOR TRANSFORMER WORKLOADS simplified abstract (d-MATRIX CORPORATION)
Organization Name
d-MATRIX CORPORATION
Inventor(s)
Sudeep Bhoja of Cupertino CA (US)
Siddharth Sheth of Cupertino CA (US)
This abstract first appeared for US patent application 20230168899, titled 'GENERATIVE AI ACCELERATOR APPARATUS USING IN-MEMORY COMPUTE CHIPLET DEVICES FOR TRANSFORMER WORKLOADS'.
Simplified Explanation
The patent application describes an AI accelerator apparatus built from in-memory compute chiplet devices. Each chiplet contains multiple tiles, and each tile contains a set of slices, a CPU, and a hardware dispatch device. The slices include a digital in-memory compute (DIMC) device for high-throughput computation, specifically for accelerating the attention functions of transformer-based models used in machine learning applications.
- The chiplet devices have multiple tiles, each containing slices, a CPU, and a hardware dispatch device.
- The slices include a digital in-memory compute (DIMC) device for high throughput computations.
- The DIMC device is specifically designed to accelerate attention functions in transformer-based models used in machine learning applications.
- The chiplet devices also include a single instruction, multiple data (SIMD) device for further processing the DIMC output and computing the softmax functions used in attention.
- The chiplet devices include die-to-die (D2D) interconnects, a PCIe bus, a DRAM interface, and a global CPU interface for communication between chiplets, memory, and a server or host system.
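The division of labor described above can be illustrated with a plain reference computation of scaled dot-product attention. This is a minimal NumPy sketch of the math being accelerated, not the patent's implementation: the matrix multiplies correspond to the kind of work the DIMC device would handle, and the softmax to the post-processing assigned to the SIMD device. All function names here are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # matrix multiply: the high-throughput DIMC workload
    weights = softmax(scores)      # softmax: the post-processing SIMD workload
    return weights @ V             # second matrix multiply, again DIMC-style work
```

Each attention layer of a transformer evaluates this pattern many times, which is why the abstract pairs an in-memory matrix-multiply engine with a dedicated softmax path rather than routing intermediate results back to a host.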
Potential applications of this technology:
- Accelerating attention functions in transformer-based models used in machine learning applications.
- Enhancing the performance of generative AI models.
- Improving the efficiency of AI accelerators in server or host systems.
Problems solved by this technology:
- Addressing the computational demands of attention functions in transformer-based models.
- Increasing the throughput and efficiency of AI accelerators.
- Facilitating communication between chiplets, memory, and server or host systems.
Benefits of this technology:
- Faster and more efficient computation of attention functions.
- Improved performance and throughput of AI accelerators.
- Enhanced capabilities for generative AI models.
- Better communication and integration between chiplets, memory, and server or host systems.
Original Abstract Submitted
an ai accelerator apparatus using in-memory compute chiplet devices. the apparatus includes one or more chiplets, each of which includes a plurality of tiles. each tile includes a plurality of slices, a central processing unit (cpu), and a hardware dispatch device. each slice can include a digital in-memory compute (dimc) device configured to perform high throughput computations. in particular, the dimc device can be configured to accelerate the computations of attention functions for transformer-based models (a.k.a. transformers) applied to machine learning applications, including generative ai. a single input multiple data (simd) device configured to further process the dimc output and compute softmax functions for the attention functions. the chiplet can also include die-to-die (d2d) interconnects, a peripheral component interconnect express (pcie) bus, a dynamic random access memory (dram) interface, and a global cpu interface to facilitate communication between the chiplets, memory and a server or host system.