17946409. COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR simplified abstract (Apple Inc.)

COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR

Organization Name

Apple Inc.

Inventor(s)

Sayyed Karen Khatamifard of Bellevue WA (US)

Thomas G. Anderl of Seattle WA (US)

Alexander J. Kirchhoff of Seattle WA (US)

Keith Wyss of Seattle WA (US)

Dylan H. Rush of Mountlake Terrace WA (US)

Chenfan Sun of Shoreline WA (US)

Jeffrey D. Marker of Pleasant View UT (US)

COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR - A simplified explanation of the abstract

This abstract first appeared for US patent application 17946409 titled 'COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR'.

Simplified Explanation

The patent application relates to compiling neural network operations into tasks that a neural processor can execute in a streaming manner. Tasks associated with multiple layers of the neural network are performed simultaneously in an overlapping fashion to improve efficiency. Memory usage during streaming operation is kept low by assigning tasks with similar completion times to the same portion of memory, which can be flushed after those tasks complete to make space for new ones. Additionally, multiple tasks can be coalesced into a single task to further reduce overhead. The key ideas (illustrated by a sketch after the list below) are:

  • Spatially partitioned tensor for streaming operation
  • Simultaneous execution of tasks from multiple layers
  • Efficient memory usage through task assignment
  • Coalescing multiple tasks into a single task
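
As a rough illustration (not taken from the patent; all names and the wavefront ordering here are assumptions), the following Python sketch compiles a spatially partitioned tensor into tasks and orders them so that tasks from several layers overlap in time:

  # Hypothetical sketch of spatially partitioned, streaming task compilation.
  # None of these names come from the patent; they only illustrate splitting
  # a tensor into tiles and interleaving tasks across layers.
  from dataclasses import dataclass

  @dataclass
  class Task:
      layer: int  # neural-network layer this task belongs to
      tile: int   # spatial slice of the tensor it consumes

      def __repr__(self):
          return f"L{self.layer}T{self.tile}"

  def compile_streaming_tasks(num_layers, num_tiles):
      """Emit tasks in a wavefront order: layer k starts on tile t as soon
      as layer k-1 has produced tile t, instead of waiting for the whole
      previous layer to finish."""
      schedule = []
      for step in range(num_layers + num_tiles - 1):
          for layer in range(num_layers):
              tile = step - layer
              if 0 <= tile < num_tiles:
                  schedule.append(Task(layer, tile))
      return schedule

  print(compile_streaming_tasks(3, 4))
  # [L0T0, L0T1, L1T0, L0T2, L1T1, L2T0, ...] -- tasks from several
  # layers are interleaved, i.e. the layers overlap in time.

In this toy schedule, layer k can begin processing tile t as soon as layer k-1 has produced it, which is the essence of the streaming overlap described above.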

Potential Applications

This technology can be applied in various fields such as:

  • Real-time image and video processing
  • Natural language processing
  • Autonomous vehicles
  • Robotics

Problems Solved

The technology addresses the following issues:

  • Memory inefficiency during neural network operations
  • Slow processing speeds in traditional neural processors
  • Inefficient task execution in neural networks

Benefits

The benefits of this technology include:

  • Improved efficiency in neural network operations
  • Faster processing speeds
  • Optimal memory usage
  • Enhanced performance in real-time applications

Potential Commercial Applications

The technology can be commercially applied in:

  • Edge computing devices
  • Cloud computing servers
  • AI accelerators
  • Smart cameras

Possible Prior Art

Possible prior art includes the use of parallel processing techniques in neural networks to improve performance and efficiency, as well as optimizations of memory usage in neural processors for faster task execution.

Unanswered Questions

How does this technology compare to existing parallel processing techniques in neural networks?

The article does not provide a direct comparison to existing parallel processing techniques in neural networks. It would be interesting to see a performance comparison in terms of speed and efficiency.

What impact does coalescing tasks have on the overall performance of the neural processor?

The article mentions coalescing tasks to reduce the overall task count, but it does not quantify the impact this optimization has on the neural processor's performance. It would be helpful to understand how it affects processing speed and efficiency.
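
For intuition only, a coalescing pass might look like the hypothetical sketch below, which merges runs of adjacent tiles from the same layer into one task and thereby reduces the task count; the patent does not disclose this particular scheme:

  # Hypothetical illustration of task coalescing: runs of adjacent tiles
  # from the same layer are merged into one task covering a tile range,
  # shrinking the task count (and per-task dispatch overhead).
  def coalesce(tasks):
      """tasks: (layer, tile) pairs; merges runs of consecutive tiles."""
      merged = []
      for layer, tile in sorted(tasks):
          if merged and merged[-1][0] == layer and merged[-1][1].stop == tile:
              merged[-1] = (layer, range(merged[-1][1].start, tile + 1))
          else:
              merged.append((layer, range(tile, tile + 1)))
      return merged

  print(coalesce([(0, 0), (0, 1), (0, 2), (1, 0), (1, 2)]))
  # [(0, range(0, 3)), (1, range(0, 1)), (1, range(2, 3))]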


Original Abstract Submitted

Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated with two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during streaming operation, a subset of the tasks having completion times close in time are assigned to a same portion of memory in the neural processor during a compilation process. After the tasks assigned to the same portion of the memory are finished, the portion of the memory may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and more efficiently perform the operations at the neural processor.
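
To make the memory scheme concrete, here is a minimal, hypothetical Python sketch (the actual compiler policy is not disclosed) that groups tasks whose estimated completion times fall within a window and assigns each group to one memory region, on the assumption that a region is flushed once all of its tasks finish:

  # Minimal, hypothetical sketch of the memory idea in the abstract:
  # tasks whose estimated completion times are close share one region of
  # on-chip memory; a region is flushed once its whole group finishes,
  # freeing space for subsequent tasks.
  def assign_memory_regions(completion_times, window):
      """Tasks finishing within `window` of each other share a region."""
      regions = {}
      region_id, region_start = -1, None
      for task, t in sorted(completion_times.items(), key=lambda kv: kv[1]):
          if region_start is None or t - region_start > window:
              region_id += 1       # open a new region; the previous one
              region_start = t     # can be flushed once it drains
          regions[task] = region_id
      return regions

  times = {"L0T0": 1.0, "L1T0": 1.2, "L0T1": 2.5, "L1T1": 2.7}
  print(assign_memory_regions(times, window=0.5))
  # {'L0T0': 0, 'L1T0': 0, 'L0T1': 1, 'L1T1': 1}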