US Patent Application 17848947. SYSTEM(S) AND METHOD(S) FOR JOINTLY LEARNING MACHINE LEARNING MODEL(S) BASED ON SERVER DATA AND CLIENT DATA simplified abstract


SYSTEM(S) AND METHOD(S) FOR JOINTLY LEARNING MACHINE LEARNING MODEL(S) BASED ON SERVER DATA AND CLIENT DATA

Organization Name

Google LLC


Inventor(s)

Sean Augenstein of San Mateo CA (US)

Andrew Hard of Menlo Park CA (US)

Kurt Partridge of San Francisco CA (US)

Rajiv Mathews of Sunnyvale CA (US)

Lin Ning of San Jose CA (US)

Karan Singhal of Roslyn NY (US)

SYSTEM(S) AND METHOD(S) FOR JOINTLY LEARNING MACHINE LEARNING MODEL(S) BASED ON SERVER DATA AND CLIENT DATA - A simplified explanation of the abstract

This abstract first appeared for US patent application 17848947, titled 'SYSTEM(S) AND METHOD(S) FOR JOINTLY LEARNING MACHINE LEARNING MODEL(S) BASED ON SERVER DATA AND CLIENT DATA'.

Simplified Explanation

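The patent application describes techniques for mitigating or preventing catastrophic forgetting in federated learning of global machine learning (ML) models. Catastrophic forgetting occurs when a model loses what it learned from earlier training data after it is trained on new data.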

  • A global ML model is identified that was initially trained at a remote server on a server data set.
  • Server-based data is determined for the global weights of that model. It may include elastic weight consolidation (EWC) loss terms, client augmenting gradients, and/or server augmenting gradients.
  • The global ML model and the server-based data are transmitted to a plurality of client devices.
  • Each client device generates a corresponding client gradient by processing predicted output produced with the global ML model, taking the server-based data into account.
  • The client gradients are then transmitted back to the remote server.
  • An updated global ML model is generated based on the client gradients.
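
The steps listed above can be illustrated with a minimal sketch of one training round. This is not the method claimed in the application: it assumes a linear model with squared-error loss, a diagonal importance vector (`fisher`) standing in for the EWC (elastic weight consolidation) loss terms, and plain gradient averaging at the server. All function names, parameters, and numeric values are hypothetical.

```python
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (features, target)


def client_gradient(
    weights: Sequence[float],
    anchor_weights: Sequence[float],
    fisher: Sequence[float],
    examples: Sequence[Example],
    ewc_lambda: float,
) -> List[float]:
    """Gradient of local mean squared error plus an EWC-style penalty.

    Assumption: the server-based data consists of `anchor_weights` (the
    server-trained weights) and `fisher` (per-weight importance values).
    """
    n = len(examples)
    grad = [0.0] * len(weights)
    for features, target in examples:
        # Predicted output of the (assumed linear) global model.
        prediction = sum(w * x for w, x in zip(weights, features))
        error = prediction - target
        for i, x in enumerate(features):
            grad[i] += 2.0 * error * x / n
    # Penalty (lambda / 2) * sum_i F_i * (w_i - anchor_i)^2 has gradient
    # lambda * F_i * (w_i - anchor_i); it pulls important weights back
    # toward their server-trained values.
    for i in range(len(weights)):
        grad[i] += ewc_lambda * fisher[i] * (weights[i] - anchor_weights[i])
    return grad


def federated_round(
    global_weights: Sequence[float],
    anchor_weights: Sequence[float],
    fisher: Sequence[float],
    client_datasets: Sequence[Sequence[Example]],
    learning_rate: float = 0.1,
    ewc_lambda: float = 1.0,
) -> List[float]:
    """One round: clients compute gradients, the server averages and applies them."""
    grads = [
        client_gradient(global_weights, anchor_weights, fisher, data, ewc_lambda)
        for data in client_datasets
    ]
    avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(global_weights))]
    return [w - learning_rate * g for w, g in zip(global_weights, avg)]


if __name__ == "__main__":
    server_weights = [1.0, 0.0]   # hypothetical weights after server training
    importance = [10.0, 0.0]      # hypothetical: weight 0 mattered on server data
    clients = [[([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0)]]  # client data pulls weight 0 toward 0
    weights = list(server_weights)
    for _ in range(50):
        weights = federated_round(weights, server_weights, importance, clients)
    print(weights)
```

In this toy setup the client data alone would drive the first weight toward zero, while the penalty keeps it near its server-trained value. Setting `ewc_lambda=0.0` removes the penalty and lets that weight drift away.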


Original Abstract Submitted

Implementations disclosed herein are directed to various techniques for mitigating and/or preventing catastrophic forgetting in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that is initially trained at a remote server based on a server data set, determine server-based data for global weight(s) of the global ML model, and transmit the global ML model and the server-based data to a plurality of client devices. The server-based data may include, for example, EWC loss term(s), client augmenting gradients, server augmenting gradients, and/or server-based data. Further, the plurality client devices may generate, based on processing corresponding predicted output and using the global ML model, and based on the server-based data, a corresponding client gradient, and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
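
The abstract does not spell out the form of the EWC loss term(s). For orientation only, the standard elastic weight consolidation objective from the continual-learning literature is reproduced below. Mapping it onto this application is an assumption: here \(\theta^{*}\) is taken to be the server-trained global weights, \(F_i\) a per-weight importance estimate (commonly the diagonal of the Fisher information), \(\lambda\) a penalty strength, and \(\mathcal{L}_{\text{client}}\) the loss on client data.

```latex
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\text{client}}(\theta)
  \;+\; \frac{\lambda}{2} \sum_{i} F_i \left(\theta_i - \theta^{*}_i\right)^{2}
```

Under this reading, weights that were important for the server data set (large \(F_i\)) are penalized more strongly for moving away from \(\theta^{*}_i\) during client training, which is how the penalty would counteract forgetting.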