Google LLC (20240220863). Compression of Machine-Learned Models via Entropy Penalized Weight Reparameterization: simplified abstract

Compression of Machine-Learned Models via Entropy Penalized Weight Reparameterization

Organization Name

Google LLC

Inventor(s)

Deniz Oktay of Mountain View CA (US)

Saurabh Singh of Mountain View CA (US)

Johannes Balle of San Francisco CA (US)

Abhinav Shrivastava of Silver Springs MD (US)

Compression of Machine-Learned Models via Entropy Penalized Weight Reparameterization - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240220863, titled 'Compression of Machine-Learned Models via Entropy Penalized Weight Reparameterization'.

The present disclosure pertains to systems and methods that learn a compressed representation of a machine-learned model, such as a neural network, by representing the model parameters within a reparameterization space during training of the model.

  • The innovation is an end-to-end model weight compression approach that uses a latent-variable data compression method.
  • Model parameters, such as weights and biases, are represented in a "latent" or "reparameterization" space, which amounts to a reparameterization of the model.
  • The reparameterization space can be equipped with a learned probability model, used first to impose an entropy penalty on the parameter representation during training, and second to compress the representation with arithmetic coding after training.
  • The approach thus jointly maximizes accuracy and model compressibility in an end-to-end fashion, with the rate-error trade-off controlled by a single hyperparameter (see the sketch after this list).
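To make the mechanism concrete, here is a minimal, hypothetical PyTorch sketch of entropy-penalized reparameterization. The layer structure, the Gaussian prior, and the trade-off constant lam are illustrative assumptions rather than the patent's implementation: weights are decoded from latent codes, and training adds a rate term (the negative log-likelihood of the latents under a learned prior) weighted by a rate-error hyperparameter.

  # Illustrative sketch (PyTorch). The layer structure, the Gaussian prior, and
  # the trade-off constant `lam` are assumptions for exposition, not the
  # patent's implementation.
  import torch
  import torch.nn as nn
  import torch.nn.functional as F

  class ReparameterizedLinear(nn.Module):
      """Linear layer whose weights live in a latent 'reparameterization' space."""
      def __init__(self, in_features, out_features, latent_dim=16):
          super().__init__()
          # One latent code per output row; a shared decoder maps latents to weights.
          self.latent = nn.Parameter(0.1 * torch.randn(out_features, latent_dim))
          self.decoder = nn.Linear(latent_dim, in_features)
          self.bias = nn.Parameter(torch.zeros(out_features))
          # Learned probability model over the latents (per-dimension scale).
          self.log_scale = nn.Parameter(torch.zeros(latent_dim))

      def forward(self, x):
          weight = self.decoder(self.latent)  # decode weights from the latent space
          return F.linear(x, weight, self.bias)

      def entropy_penalty(self):
          # Differentiable proxy for the code length of the latents under the
          # learned prior (negative log-likelihood, in nats).
          prior = torch.distributions.Normal(0.0, self.log_scale.exp())
          return -prior.log_prob(self.latent).sum()

  def loss_fn(model, x, y, lam=1e-4):
      # Task error plus the entropy (rate) penalty; `lam` plays the role of the
      # rate-error trade-off hyperparameter described above.
      task_loss = F.cross_entropy(model(x), y)
      rate = sum(m.entropy_penalty() for m in model.modules()
                 if isinstance(m, ReparameterizedLinear))
      return task_loss + lam * rate

In this sketch, larger values of lam push the latents toward values that are cheap to code under the learned prior, at some cost in task accuracy; that is the rate-error trade-off referred to above.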

Potential Applications:

  • Efficient model compression in machine learning applications.
  • Improved performance and reduced memory requirements in neural networks.
  • Enhanced scalability and deployment of complex models in resource-constrained environments.

Problems Solved:

  • Balancing model accuracy and compressibility in machine learning.
  • Providing a systematic approach to optimizing model performance and memory usage.

Benefits:

  • Enhanced model efficiency and speed.
  • Reduced memory footprint and storage requirements.
  • Improved scalability and deployment flexibility.

Commercial Applications:

This efficient model compression technique for neural networks can be applied in industries such as:

  • Artificial intelligence and machine learning software development.
  • IoT devices and edge computing applications.
  • Cloud computing services for optimized resource utilization.

Questions about Efficient Model Compression for Neural Networks:

1. How does the reparameterization space impact model training and compression?

  Training in the reparameterization space lets the entropy penalty shape the parameter representation while the task loss preserves accuracy, so compressibility and accuracy are balanced during training; after training, the same learned probability model drives arithmetic coding of the representation (a brief sketch follows).
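As a hedged illustration of the compression half of that answer, the sketch below reuses the hypothetical ReparameterizedLinear layer from earlier: after training, the latents are rounded and their coded size is estimated under the learned prior. A real system would feed the rounded symbols to an arithmetic coder rather than only reporting the theoretical bit count.

  # Hypothetical post-training step, continuing the earlier sketch.
  import torch

  @torch.no_grad()
  def estimate_compressed_bits(layer):
      q = layer.latent.round()  # quantize the latents to integers
      prior = torch.distributions.Normal(0.0, layer.log_scale.exp())
      # Probability mass of each integer bin [q - 0.5, q + 0.5] under the prior.
      p = (prior.cdf(q + 0.5) - prior.cdf(q - 0.5)).clamp_min(1e-9)
      return float((-p.log2()).sum())  # ideal arithmetic-code length in bits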

2. What are the potential implications of this innovation on the future of machine learning?

  This innovation could lead to more efficient and scalable machine learning models, enabling broader deployment in various applications.


Original Abstract Submitted

example aspects of the present disclosure are directed to systems and methods that learn a compressed representation of a machine-learned model (e.g., neural network) via representation of the model parameters within a reparameterization space during training of the model. in particular, the present disclosure describes an end-to-end model weight compression approach that employs a latent-variable data compression method. the model parameters (e.g., weights and biases) are represented in a “latent” or “reparameterization” space, amounting to a reparameterization. in some implementations, this space can be equipped with a learned probability model, which is used first to impose an entropy penalty on the parameter representation during training, and second to compress the representation using arithmetic coding after training. the proposed approach can thus maximize accuracy and model compressibility jointly, in an end-to-end fashion, with the rate-error trade-off specified by a hyperparameter.