20240037404. Tensor Decomposition Rank Exploration for Neural Network Compression simplified abstract (Deeplite Inc.)

Tensor Decomposition Rank Exploration for Neural Network Compression

Organization Name

Deeplite Inc.

Inventor(s)

Olivier Mastropietro of Montreal (CA)

Ehsan Saboori of Richmond Hill (CA)

Tensor Decomposition Rank Exploration for Neural Network Compression - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240037404 titled 'Tensor Decomposition Rank Exploration for Neural Network Compression'.

Simplified Explanation

The patent application describes a system, device, and method for reducing machine learning models so they can run on target hardware. The method starts from a model, a set of training data, and a training threshold. A search space for reducing the model is determined by a pruning function and a pruning factor: the pruning function is bounded by constraints, and those constraints fix boundaries for the pruning factor, which in turn define the search space. The pruning function increases compression along the depth of the model, with the size of each increase governed by the pruning factor. The model is trained into a reduced model by iteratively updating its parameters according to the pruning function and pruning factor within the search space, and evaluating each updated model against the training data and threshold. The reduced model is then provided to the target hardware.

  • The patent application proposes a method for reducing machine learning models for target hardware.
  • The method involves determining a search space for reducing the model using a pruning function and a pruning factor.
  • The pruning function is bounded by constraints, and boundaries for the pruning factor are determined based on these constraints.
  • The pruning function increases compression along the depth of the model, with the size of each increase set by the pruning factor (see the sketch after this list).
  • The model is trained into a reduced model by iteratively updating model parameters based on the pruning function and pruning factor within the search space.
  • The updated model is evaluated against the training data and training threshold.
  • The reduced model is then provided to target hardware.
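
As a concrete sketch of the points above, the following Python snippet implements one plausible depth-wise pruning schedule and derives boundaries for the pruning factor from a constraint. The linear schedule and the minimum-keep-ratio constraint are illustrative assumptions; the patent states only that compression increases along the model's depth, scaled by the pruning factor, and that constraints on the pruning function bound the factor.

  def pruning_function(layer_index: int, num_layers: int, pruning_factor: float) -> float:
      """Fraction of a layer's rank to remove at a given depth.

      The linear-in-depth form is an assumption for illustration; the patent
      only requires that compression increase along the depth of the model.
      """
      depth = (layer_index + 1) / num_layers  # normalized depth in (0, 1]
      return pruning_factor * depth           # deeper layers are pruned harder

  def pruning_factor_bounds(num_layers: int, min_keep_ratio: float = 0.1):
      """Boundaries for the pruning factor, derived from a constraint.

      Assumed constraint: every layer keeps at least min_keep_ratio of its
      original rank. The deepest layer (depth 1) is pruned hardest, so it
      sets the upper bound; the resulting interval defines the search space.
      """
      return 0.0, 1.0 - min_keep_ratio

  def reduced_rank(full_rank: int, layer_index: int, num_layers: int,
                   pruning_factor: float) -> int:
      """Target decomposition rank for one layer under the schedule."""
      keep = 1.0 - pruning_function(layer_index, num_layers, pruning_factor)
      return max(1, round(full_rank * keep))

For example, a 4-layer model with full rank 256 and a pruning factor of 0.5 yields target ranks of 224, 192, 160, and 128, so compression grows with depth.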

Potential Applications

  • This technology can be applied in various fields where machine learning models need to be optimized for specific hardware platforms.
  • It can be used in edge computing devices, IoT devices, and embedded systems to reduce the size and computational requirements of machine learning models.
  • It can be beneficial in applications such as image and speech recognition, natural language processing, and autonomous systems.

Problems Solved

  • Reducing machine learning models for target hardware addresses the limited compute, memory, and power budgets of constrained devices.
  • It solves the problem of optimizing machine learning models to run efficiently on specific hardware platforms.
  • It helps overcome the challenges of deploying complex machine learning models on resource-constrained devices.

Benefits

  • The technology enables the deployment of machine learning models on resource-constrained hardware platforms.
  • It reduces the size of machine learning models, making them more suitable for edge computing and IoT devices.
  • It improves the efficiency and performance of machine learning models by optimizing them for specific hardware architectures.
  • It allows for faster inference and reduced power consumption in devices running the reduced models.


Original Abstract Submitted

A system, device and method are provided for reducing machine learning models for target hardware. Illustratively, the method includes providing a model, a set of training data, and a training threshold. A search space for reducing the model is determined with a pruning function and a pruning factor. The pruning function is bounded with constraints. Based on the constraints, boundaries for the pruning factor are determined, which boundaries define at least in part the search space. The pruning function increases compression along a depth of the model, and the compression increases are based on the pruning factor. A model is trained into a reduced model by iteratively updating model parameters based on the pruning function and the pruning factor and within the search space, and evaluating the updated model with the training parameters. The method includes providing the reduced model to target hardware.
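
Tying the abstract together, here is a toy end-to-end version of the rank-exploration loop, reusing the pruning_factor_bounds and reduced_rank helpers from the sketch above. The "model" is just a list of weight matrices, compression is truncated SVD, and evaluation is reconstruction quality rather than accuracy on training data; the grid search over the bounded factor and the compress/evaluate/explore helpers are all illustrative assumptions, not the claimed implementation. In particular, the patent's iterative parameter updates (fine-tuning) are omitted here.

  import numpy as np

  def compress(weights, pruning_factor):
      """Factor each weight matrix with truncated SVD at the rank given by
      the depth-wise schedule, so W is approximated by A @ B. Truncated SVD
      stands in for the tensor decomposition; the patent does not fix one."""
      n = len(weights)
      out = []
      for i, W in enumerate(weights):
          r = reduced_rank(min(W.shape), i, n, pruning_factor)
          U, s, Vt = np.linalg.svd(W, full_matrices=False)
          out.append((U[:, :r] * s[:r], Vt[:r, :]))
      return out

  def evaluate(weights, factored):
      """Stand-in for evaluating the updated model on training data:
      1 minus the mean relative reconstruction error (higher is better)."""
      errs = [np.linalg.norm(W - A @ B) / np.linalg.norm(W)
              for W, (A, B) in zip(weights, factored)]
      return 1.0 - float(np.mean(errs))

  def explore(weights, training_threshold=0.9, num_candidates=8):
      """Grid-search the bounded pruning-factor space and keep the smallest
      factored model whose score still meets the training threshold. A grid
      search is one simple way to explore the space; the patent instead
      iteratively updates parameters within the search space."""
      low, high = pruning_factor_bounds(len(weights))
      best, best_size = None, float("inf")
      for factor in np.linspace(low, high, num_candidates):
          candidate = compress(weights, factor)
          if evaluate(weights, candidate) >= training_threshold:
              size = sum(A.size + B.size for A, B in candidate)
              if size < best_size:
                  best, best_size = candidate, size
      return best  # reduced model, ready to hand off to target hardware

  # Toy usage: three random 64x64 "layers".
  rng = np.random.default_rng(0)
  model = [rng.standard_normal((64, 64)) for _ in range(3)]
  reduced = explore(model)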