Huawei Technologies Co., Ltd. (20240104346). METHOD AND DEVICE FOR COMPRESSING GENERATIVE PRE-TRAINED LANGUAGE MODELS VIA QUANTIZATION simplified abstract

From WikiPatents

METHOD AND DEVICE FOR COMPRESSING GENERATIVE PRE-TRAINED LANGUAGE MODELS VIA QUANTIZATION

Organization Name

Huawei Technologies Co., Ltd.

Inventor(s)

Lu Hou of Shenzhen (CN)

Chaofan Tao of Shenzhen (CN)

Wei Zhang of Shenzhen (CN)

Lifeng Shang of Hong Kong (CN)

Xin Jiang of Hong Kong (CN)

Qun Liu of Hong Kong (CN)

Li Qian of Shenzhen (CN)

METHOD AND DEVICE FOR COMPRESSING GENERATIVE PRE-TRAINED LANGUAGE MODELS VIA QUANTIZATION - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240104346, titled 'METHOD AND DEVICE FOR COMPRESSING GENERATIVE PRE-TRAINED LANGUAGE MODELS VIA QUANTIZATION'.

Simplified Explanation

The abstract describes a method for quantizing a neural network model: a scaling factor is determined from the distribution of the model's weights, the weights are quantized using that scaling factor, a training loss is computed with the quantized weights, and the scaling factor is then updated using the gradient of that loss.

  • Determining a scaling factor based on weight distribution
  • Quantizing weights based on the scaling factor
  • Calculating training loss using quantized weights
  • Updating the scaling factor based on the gradient of the training loss
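The four steps above can be sketched as a minimal NumPy toy. This is an illustrative reconstruction only, not the patent's implementation: the symmetric 8-bit scheme, the max-abs initialization of the scale, the mean-squared-error training loss, and the numerical gradient are all assumptions made for the sketch.

```python
import numpy as np

def initial_scale(weights, num_bits=8):
    """Step 1: scaling factor from the weight distribution (max-abs heuristic, an assumption)."""
    qmax = 2 ** (num_bits - 1) - 1
    return np.max(np.abs(weights)) / qmax

def quantize(weights, scale, num_bits=8):
    """Step 2: symmetric uniform quantization -- scale, round to the integer grid, dequantize."""
    qmax = 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)
    return q * scale  # "fake-quantized" weights used during training

def loss_fn(w_q, x, y):
    """Step 3: training loss computed with the quantized weights (toy MSE regression)."""
    return np.mean((x @ w_q - y) ** 2)

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))       # toy weight matrix
x = rng.normal(size=(8, 4))       # toy inputs
y = x @ w                         # toy regression targets

scale = initial_scale(w)
lr = 1e-4
for _ in range(100):
    w_q = quantize(w, scale)
    # Step 4: update the scaling factor from the gradient of the training loss.
    # A central-difference numerical gradient stands in for backpropagation here.
    eps = 1e-5
    g = (loss_fn(quantize(w, scale + eps), x, y)
         - loss_fn(quantize(w, scale - eps), x, y)) / (2 * eps)
    scale -= lr * g
```

In a real quantization-aware training setup the rounding step is non-differentiable, so frameworks typically use a straight-through estimator to propagate gradients to both the weights and the learnable scale; the numerical gradient above is only a stand-in for that machinery.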

Potential Applications

The technology could be applied in various fields such as:

  • Machine learning
  • Artificial intelligence
  • Data analysis

Problems Solved

This technology helps in:

  • Reducing memory usage
  • Improving computational efficiency
  • Enhancing model performance

Benefits

The benefits of this technology include:

  • Faster processing speeds
  • Lower resource requirements
  • Improved accuracy of neural network models

Potential Commercial Applications

This technology could be used in:

  • Image recognition systems
  • Natural language processing applications
  • Autonomous vehicles

Possible Prior Art

Possible prior art includes:

  • Research papers on weight quantization in neural networks

Unanswered Questions

How does this method compare to existing quantization techniques in terms of accuracy and efficiency?

This article does not provide a direct comparison with existing quantization techniques. Further research or experimentation may be needed to determine the performance differences.

What impact could this method have on the deployment of neural network models in resource-constrained environments?

The article does not discuss the specific implications for resource-constrained environments. Future studies could explore the potential benefits of this method in such settings.


Original Abstract Submitted

A method is provided for quantizing a neural network model performed by a processing system. The method comprises determining a scaling factor based on a distribution of weights associated with the neural network model, determining quantized weights based on the scaling factor and the weights associated with the distribution, determining a training loss of the neural network model based on the quantized weights during training of the neural network model, and determining an updated scaling factor for the neural network model based on a gradient of the training loss.