17946409. COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR simplified abstract (Apple Inc.)
Contents
- 1 COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR
- 1.1 Organization Name
- 1.2 Inventor(s)
- 1.3 COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR - A simplified explanation of the abstract
- 1.4 Simplified Explanation
- 1.5 Potential Applications
- 1.6 Problems Solved
- 1.7 Benefits
- 1.8 Potential Commercial Applications
- 1.9 Possible Prior Art
- 1.10 Unanswered Questions
- 1.11 Original Abstract Submitted
COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR
Organization Name
Apple Inc.
Inventor(s)
Sayyed Karen Khatamifard of Bellevue WA (US)
Thomas G. Anderl of Seattle WA (US)
Alexander J. Kirchhoff of Seattle WA (US)
Dylan H. Rush of Mountlake Terrace WA (US)
Chenfan Sun of Shoreline WA (US)
Jeffrey D Marker of Pleasant View UT (US)
COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR - A simplified explanation of the abstract
This abstract first appeared for US patent application 17946409 titled 'COMPILING OF TASKS FOR STREAMING OPERATIONS AT NEURAL PROCESSOR'.
Simplified Explanation
The patent application relates to compiling neural network operations into tasks that a neural processor can execute in a streaming manner. Tasks associated with multiple layers of the neural network run simultaneously, in an overlapping fashion, to improve throughput. To keep memory usage efficient during streaming, tasks with similar completion times are assigned to the same portion of memory, which is flushed once those tasks finish to make room for subsequent ones. Multiple tasks can also be coalesced into a single task to reduce task-dispatch overhead.
- Spatially partitioned tensor for streaming operation
- Simultaneous execution of tasks from multiple layers
- Efficient memory usage through task assignment
- Coalescing multiple tasks into a single task
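The patent text includes no code, but the memory-assignment idea above can be sketched in Python. Everything here is illustrative: the `Task` fields, the completion-time window heuristic, and all names are assumptions, not the application's actual compiler data structures.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    completion_time: int  # estimated cycle at which the task finishes (hypothetical unit)
    size: int             # bytes of intermediate output

def assign_memory_regions(tasks, window=5):
    """Group tasks whose estimated completion times fall within the same
    window into one memory region, so the whole region can be flushed
    once every task assigned to it has finished."""
    regions = {}
    for t in sorted(tasks, key=lambda t: t.completion_time):
        regions.setdefault(t.completion_time // window, []).append(t)
    return regions

tasks = [
    Task("conv1_tile0", completion_time=3, size=1024),
    Task("conv1_tile1", completion_time=4, size=1024),
    Task("conv2_tile0", completion_time=9, size=2048),
]
regions = assign_memory_regions(tasks)
# conv1_tile0 and conv1_tile1 finish close together, so they share a
# region that can be flushed as a unit; conv2_tile0 lands in its own.
```

The window size stands in for whatever similarity criterion the real compiler uses; the point is only that grouping by completion time lets a whole region be reclaimed at once.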
Potential Applications
This technology can be applied in various fields such as:
- Real-time image and video processing
- Natural language processing
- Autonomous vehicles
- Robotics
Problems Solved
The technology addresses the following issues:
- Memory inefficiency during neural network operations
- Slow processing speeds in traditional neural processors
- Inefficient task execution in neural networks
Benefits
The benefits of this technology include:
- Improved efficiency in neural network operations
- Faster processing speeds
- Optimal memory usage
- Enhanced performance in real-time applications
Potential Commercial Applications
The technology can be commercially applied in:
- Edge computing devices
- Cloud computing servers
- AI accelerators
- Smart cameras
Possible Prior Art
One possible prior art could be the use of parallel processing techniques in neural networks to improve performance and efficiency. Another could be the optimization of memory usage in neural processors for faster task execution.
Unanswered Questions
How does this technology compare to existing parallel processing techniques in neural networks?
The article does not provide a direct comparison to existing parallel processing techniques in neural networks. It would be interesting to see a performance comparison in terms of speed and efficiency.
What impact does coalescing tasks have on the overall performance of the neural processor?
The article mentions the coalescing of tasks to reduce the number of tasks, but it does not delve into the specific impact this has on the performance of the neural processor. It would be beneficial to understand how this optimization affects processing speeds and efficiency.
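While the article leaves the performance impact open, the mechanics of coalescing can be sketched: merge consecutive tasks into one until some resource limit is hit, so the processor dispatches fewer, larger tasks. The greedy strategy and the working-set-size constraint below are assumptions for illustration, not the patented method.

```python
def coalesce(tasks, max_size):
    """Greedily merge consecutive (name, size) tasks into one task as long
    as the combined working-set size stays under max_size (a hypothetical
    constraint standing in for real hardware limits)."""
    merged = []
    for name, size in tasks:
        if merged and merged[-1][1] + size <= max_size:
            prev_name, prev_size = merged[-1]
            merged[-1] = (prev_name + "+" + name, prev_size + size)
        else:
            merged.append((name, size))
    return merged

# "a" and "b" fit together under the limit; "c" would exceed it.
fused = coalesce([("a", 2), ("b", 3), ("c", 4)], max_size=6)
```

Fewer tasks means fewer per-task setup and teardown costs, which is presumably where the efficiency gain the abstract mentions would come from.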
Original Abstract Submitted
Embodiments relate to compiling neural network operations into tasks that may be performed in a streaming manner by a neural processor. In a streaming operation, a tensor is spatially partitioned, and tasks associated two or more layers of the neural network are performed simultaneously in an overlapping manner. To enable efficient memory usage during streaming operation, a subset of the tasks having completion times close in time are assigned to a same portion of memory in the neural processor during a compilation process. After the tasks assigned to the same portion of the memory is finished, the portion of the memory may be flushed to make space for subsequent tasks. Multiple tasks may also be coalesced into a single task to reduce the number of tasks and more efficiently perform the operations at the neural processor.
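The spatial partitioning and overlapped layer execution described in the abstract can be approximated in a few lines of NumPy. The layers here are trivial stand-ins, and real streaming hardware would run layer N+1 on tile i while layer N processes tile i+1; this sequential sketch only shows that each tile flows through both layers without materializing the full intermediate tensor.

```python
import numpy as np

def spatial_partition(tensor, num_tiles):
    # Split along the height axis so each tile can stream independently.
    return np.array_split(tensor, num_tiles, axis=0)

def layer1(tile):
    return tile * 2        # stand-in for a real neural network layer

def layer2(tile):
    return tile + 1        # stand-in for the next layer

def stream(tensor, num_tiles=4):
    out_tiles = []
    for tile in spatial_partition(tensor, num_tiles):
        # Per-tile pipelining: only one tile's intermediate result is
        # live at a time, which is what keeps memory usage small.
        out_tiles.append(layer2(layer1(tile)))
    return np.concatenate(out_tiles, axis=0)

x = np.arange(8.0).reshape(8, 1)
y = stream(x)  # same result as applying both layers to the whole tensor
```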