18626775. Multi-tile Memory Management for Detecting Cross Tile Access Providing Multi-Tile Inference Scaling and Providing Page Migration simplified abstract (Intel Corporation)

From WikiPatents

Multi-tile Memory Management for Detecting Cross Tile Access Providing Multi-Tile Inference Scaling and Providing Page Migration

Organization Name

Intel Corporation

Inventor(s)

Lakshminarayanan Striramassarma of Folsom CA (US)

Prasoonkumar Surti of Folsom CA (US)

Varghese George of Folsom CA (US)

Ben Ashbaugh of Folsom CA (US)

Aravindh Anantaraman of Folsom CA (US)

Valentin Andrei of San Jose CA (US)

Abhishek Appu of El Dorado Hills CA (US)

Nicolas Galoppo Von Borries of Portland OR (US)

Altug Koker of El Dorado Hills CA (US)

Mike Macpherson of Portland OR (US)

Subramaniam Maiyuran of Gold River CA (US)

Nilay Mistry of Bangalore (IN)

Elmoustapha Ould-ahmed-vall of Chandler AZ (US)

Selvakumar Panneer of Portland OR (US)

Vasanth Ranganathan of El Dorado Hills CA (US)

Joydeep Ray of Folsom CA (US)

Ankur Shah of Folsom CA (US)

Saurabh Tangri of Folsom CA (US)

Multi-tile Memory Management for Detecting Cross Tile Access Providing Multi-Tile Inference Scaling and Providing Page Migration - A simplified explanation of the abstract

This abstract first appeared for US patent application 18626775, titled 'Multi-tile Memory Management for Detecting Cross Tile Access Providing Multi-Tile Inference Scaling and Providing Page Migration'.

The patent application describes memory management for a graphics processor built on a multi-tile architecture.

  • The application covers three things: detecting cross-tile memory accesses, scaling inference across tiles by multicasting data via a copy operation, and migrating pages.
  • In one embodiment, the graphics processor includes a first GPU with a memory and a memory controller, a second GPU with its own memory, and a cross-GPU fabric that communicatively couples the two.
  • The memory controller determines whether the first GPU frequently accesses the second GPU's memory and, when it does, sends a message to initiate a data transfer mechanism.
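The detection step in the bullets above can be illustrated with a small sketch. The abstract does not say how "frequent" is measured or what the message contains, so the per-page counter, the fixed threshold, and every name below (`MemoryController`, `record_access`, `REMOTE_ACCESS_THRESHOLD`, the message fields) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field

# Assumption: "frequent" means a per-page count of remote accesses
# reaching a fixed cutoff. The abstract does not define this.
REMOTE_ACCESS_THRESHOLD = 4


@dataclass
class MemoryController:
    """Hypothetical controller attached to one tile (GPU)."""
    local_tile: int
    remote_hits: Counter = field(default_factory=Counter)
    outbox: list = field(default_factory=list)  # messages sent over the fabric

    def record_access(self, page: int, owner_tile: int) -> None:
        """Count one access and emit a message when a remote page turns hot."""
        if owner_tile == self.local_tile:
            return  # local access, nothing to track
        key = (page, owner_tile)
        self.remote_hits[key] += 1
        # Fire exactly once, at the moment the count reaches the threshold.
        if self.remote_hits[key] == REMOTE_ACCESS_THRESHOLD:
            # Stand-in for the "message to initiate a data transfer mechanism".
            self.outbox.append({
                "type": "initiate_transfer",
                "page": page,
                "from_tile": owner_tile,
                "to_tile": self.local_tile,
            })


if __name__ == "__main__":
    mc = MemoryController(local_tile=0)
    for _ in range(5):
        mc.record_access(page=42, owner_tile=1)  # remote accesses to tile 1
    mc.record_access(page=7, owner_tile=0)       # local access, ignored
    print(mc.outbox)
```

A real controller would likely age or reset its counters over time; that detail is left out here because the abstract gives no basis for it.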

Potential Applications:

  • This technology can be used in high-performance computing systems, data centers, and AI applications that require efficient memory management across multiple tiles.

Problems Solved:

  • Efficiently managing memory access across multiple tiles in a graphics processor.
  • Improving performance and scalability of multi-tile architectures.

Benefits:

  • Enhanced performance and scalability in multi-tile architectures.
  • Reduced latency in cross-tile memory accesses.
  • Improved overall efficiency in memory management.

Commercial Applications:

  • This technology can be applied in graphics processing units, data centers, AI accelerators, and other high-performance computing systems to optimize memory access and improve overall performance.

Questions about Multi-tile Memory Management:

1. How does multi-tile memory management improve the efficiency of memory access in graphics processors?

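  - The memory controller detects when one GPU frequently accesses memory attached to another GPU and then sends a message to initiate a data transfer mechanism, instead of leaving every such access to cross the fabric.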

2. What are the potential implications of using multi-tile memory management in AI applications?

  - The application targets multi-tile inference scaling, in which data is multicast to tiles via a copy operation. The abstract does not quantify the resulting performance or scalability gain.


Original Abstract Submitted

Multi-tile Memory Management for Detecting Cross Tile Access, Providing Multi-Tile Inference Scaling with multicasting of data via copy operation, and Providing Page Migration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a memory and a memory controller, a second graphics processing unit (GPU) having a memory and a cross-GPU fabric to communicatively couple the first and second GPUs. The memory controller is configured to determine whether frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU in the multi-GPU configuration and to send a message to initiate a data transfer mechanism when frequent cross tile memory accesses occur from the first GPU to the memory of the second GPU.
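The abstract names two data-movement ideas, page migration and multicasting of data via a copy operation, without describing either. The sketch below only illustrates the general difference between them: a migration moves a page so that it has one new home, while a multicast copy replicates it to several tiles. The `Tile` class, the dictionary used as tile memory, and both function names are hypothetical, and nothing here should be read as the method the application claims.

```python
from dataclasses import dataclass, field


@dataclass
class Tile:
    """Hypothetical tile (GPU) whose local memory maps page number -> bytes."""
    tile_id: int
    memory: dict = field(default_factory=dict)


def migrate_page(page: int, src: Tile, dst: Tile) -> None:
    """Move a page from src to dst; afterwards only dst holds it."""
    dst.memory[page] = src.memory.pop(page)


def multicast_copy(page: int, src: Tile, targets: list) -> None:
    """Copy one page to every target tile, leaving the source copy in place."""
    data = src.memory[page]
    for tile in targets:
        tile.memory[page] = bytes(data)


if __name__ == "__main__":
    t0, t1, t2 = Tile(0), Tile(1), Tile(2)
    t1.memory[42] = b"shared data"
    t1.memory[7] = b"hot page"

    multicast_copy(42, t1, [t0, t2])  # page 42 now lives on all three tiles
    migrate_page(7, t1, t0)           # page 7 now lives only on tile 0

    print(sorted(t0.memory), sorted(t1.memory), sorted(t2.memory))
```

Replication suits data that many tiles read, while migration suits a page that a single remote tile keeps accessing. Which one the described controller selects, and when, is not stated in the abstract.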