US Patent Application 17734766: UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN FEDERATED LEARNING OF MACHINE LEARNING MODEL(S) (simplified abstract)

UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN FEDERATED LEARNING OF MACHINE LEARNING MODEL(S)

Organization Name

Google LLC


Inventor(s)

Andrew Hard of Menlo Park, CA (US)

Kurt Partridge of San Francisco, CA (US)

Rajiv Mathews of Sunnyvale, CA (US)

Sean Augenstein of San Mateo, CA (US)

UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN FEDERATED LEARNING OF MACHINE LEARNING MODEL(S) - A simplified explanation of the abstract

This abstract first appeared for US patent application 17734766 titled 'UTILIZING ELASTIC WEIGHT CONSOLIDATION (EWC) LOSS TERM(S) TO MITIGATE CATASTROPHIC FORGETTING IN FEDERATED LEARNING OF MACHINE LEARNING MODEL(S)'.

Simplified Explanation

This patent application describes using elastic weight consolidation (EWC) loss term(s) during federated learning of global machine learning (ML) models to mitigate catastrophic forgetting. Here are the key points:

  • The technique applies EWC loss term(s) during federated learning so that client updates do not overwrite what the global model learned on the server.
  • A global ML model is initially trained on a remote server using a server data set.
  • EWC loss term(s) for the global weights of the ML model are determined from a Fisher information matrix computed over the server data set (see the sketch after this list).
  • The global ML model and the EWC loss term(s) are transmitted to multiple client devices.
  • Each client device generates a client gradient by processing predicted output from the global ML model, with the EWC loss term(s) included in the training objective.
  • The client gradients are transmitted back to the remote server.
  • An updated global ML model is generated from the client gradients received from the client devices.
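
The EWC loss term(s) referenced above follow the standard form introduced by Kirkpatrick et al. (2017): L(θ) = L_task(θ) + (λ/2) Σ_i F_i (θ_i − θ*_i)², where θ* are the server-trained global weights and F_i is the i-th diagonal entry of the Fisher information matrix for the server data set. The sketch below, in PyTorch, estimates a diagonal (empirical) Fisher and evaluates the penalty; the function names, the classifier-with-logits assumption, and the λ hyperparameter are illustrative assumptions, not details taken from the patent.

```python
import torch
import torch.nn.functional as F


def fisher_diagonal(model, server_loader, device="cpu"):
    """Estimate the diagonal of the Fisher information matrix over the
    server data set as the expected squared gradient of the log-likelihood
    with respect to each global weight (batch-level empirical Fisher)."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    n_examples = 0
    for inputs, targets in server_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        model.zero_grad()
        log_probs = F.log_softmax(model(inputs), dim=-1)  # assumes logit outputs
        F.nll_loss(log_probs, targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 * inputs.size(0)
        n_examples += inputs.size(0)
    return {n: f / n_examples for n, f in fisher.items()}


def ewc_penalty(model, fisher, anchor_weights, lam=1.0):
    """EWC loss term: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2, where
    theta* (anchor_weights) are the server-trained global weights."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for n, p in model.named_parameters():
        penalty = penalty + (fisher[n] * (p - anchor_weights[n]) ** 2).sum()
    return 0.5 * lam * penalty
```

In this formulation, weights with high Fisher values (those the server data set constrains tightly) are penalized most for drifting, which is how EWC mitigates catastrophic forgetting of the server-side training.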


Original Abstract Submitted

Implementations disclosed herein are directed to utilizing elastic weight consolidation (EWC) loss term(s) in federated learning of global machine learning (ML) models. Implementations may identify a global ML model that was initially trained at a remote server based on a server data set, determine the EWC loss term(s) for global weight(s) of the global ML model, and transmit the global ML model and the EWC loss term(s) to a plurality of client devices. The EWC loss term(s) may be determined based on a Fisher information matrix for the server data set. Further, the plurality of client devices may generate, based on processing corresponding predicted output and using the global ML model, and based on the EWC loss term(s), a corresponding client gradient, and transmit the corresponding client gradient to the remote server. Implementations may further generate an updated global ML model based on at least the corresponding client gradients.
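
For concreteness, here is a hypothetical sketch of one federated round matching the flow in the abstract: each client computes a gradient of its task loss plus the EWC penalty, and the server averages the client gradients into an updated global model. The FedSGD-style aggregation, the learning rate, and the helper names are assumptions; the abstract above does not pin down the exact aggregation rule.

```python
import copy

import torch
import torch.nn.functional as F


def client_gradient(global_model, fisher, anchor_weights, batch, lam=1.0):
    """Client side: gradient of task loss + EWC penalty, so local updates
    stay close to the server-trained weights along high-Fisher directions."""
    model = copy.deepcopy(global_model)
    model.zero_grad()
    inputs, targets = batch
    loss = F.cross_entropy(model(inputs), targets)  # assumes a classifier
    loss = loss + 0.5 * lam * sum(  # same penalty as ewc_penalty() above
        (fisher[n] * (p - anchor_weights[n]) ** 2).sum()
        for n, p in model.named_parameters()
    )
    loss.backward()
    return {n: p.grad.detach().clone() for n, p in model.named_parameters()}


def server_update(global_model, client_grads, lr=0.1):
    """Server side: average the client gradients and take one SGD step on
    the global weights (a FedSGD-style update; an assumption, not a detail
    from the patent)."""
    with torch.no_grad():
        for n, p in global_model.named_parameters():
            p -= lr * torch.stack([g[n] for g in client_grads]).mean(dim=0)
    return global_model
```

Because the EWC penalty is part of each client's loss, the transmitted client gradients already encode the pull back toward the server-trained weights, so the server's aggregation step needs no special handling beyond averaging.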