18276972. ENERGY-EFFICIENT DEEP NEURAL NETWORK TRAINING ON DISTRIBUTED SPLIT ATTRIBUTES simplified abstract (Telefonaktiebolaget LM Ericsson (publ))


ENERGY-EFFICIENT DEEP NEURAL NETWORK TRAINING ON DISTRIBUTED SPLIT ATTRIBUTES

Organization Name

Telefonaktiebolaget LM Ericsson (publ)

Inventor(s)

Selim Ickin of Stocksund (SE)

Konstantinos Vandikas of Solna (SE)

ENERGY-EFFICIENT DEEP NEURAL NETWORK TRAINING ON DISTRIBUTED SPLIT ATTRIBUTES - A simplified explanation of the abstract

This abstract first appeared for US patent application 18276972 titled 'ENERGY-EFFICIENT DEEP NEURAL NETWORK TRAINING ON DISTRIBUTED SPLIT ATTRIBUTES'.

Simplified Explanation

The abstract describes a method for operating the master node in a vertical federated learning (vFL) system in which multiple workers jointly train a split neural network. The master node receives cut-layer outputs from the workers, determines whether any worker's outputs are missing, generates imputed values for the missing outputs, calculates cut-layer gradients from the received and imputed outputs, splits the gradients into per-worker groups, and transmits each group back to its worker.

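To make the flow concrete, here is a minimal, hedged sketch of one master-node step. It assumes a linear top model with an MSE loss and zero-imputation for the missing worker; all names (CUT_DIM, WORKERS, and so on) are illustrative and not from the patent.

```python
import numpy as np

# Minimal sketch of one master-node step in vertical federated learning.
# Assumptions (not from the patent): linear top model, MSE loss, zero
# imputation for the missing worker.

rng = np.random.default_rng(0)
CUT_DIM = 4                       # neurons each worker feeds into the cut layer
WORKERS = ["w1", "w2", "w3"]
BATCH = 8

W = rng.normal(size=(CUT_DIM * len(WORKERS), 1))   # master's top-model weights
y = rng.normal(size=(BATCH, 1))                    # labels held by the master

# 1) Receive cut-layer outputs; worker "w2" is missing this sample period.
received = {
    "w1": rng.normal(size=(BATCH, CUT_DIM)),
    "w3": rng.normal(size=(BATCH, CUT_DIM)),
}

# 2) Detect missing outputs and impute them (zeros here for simplicity).
outputs = [received.get(w, np.zeros((BATCH, CUT_DIM))) for w in WORKERS]
h = np.concatenate(outputs, axis=1)                # full cut-layer activation

# 3) Forward pass, then gradient of the loss w.r.t. cut-layer activations.
pred = h @ W
grad_pred = 2.0 * (pred - y) / BATCH               # d(MSE)/d(pred)
grad_h = grad_pred @ W.T                           # d(MSE)/dh

# 4) Split the gradients into per-worker groups and "transmit" them.
for i, w in enumerate(WORKERS):
    group = grad_h[:, i * CUT_DIM:(i + 1) * CUT_DIM]
    print(f"send to {w}: gradient block of shape {group.shape}")
```
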
Potential Applications

- This technology can be applied in industries where data privacy is crucial, such as healthcare, finance, and telecommunications.
- It can also be used where data is distributed across multiple locations or devices, such as IoT networks.

Problems Solved

- Addresses the challenge of training neural networks on decentralized data without compromising data privacy.
- Solves the issue of missing layer outputs from workers in a federated learning system, so training of the split neural network can continue uninterrupted (one possible imputation scheme is sketched below).
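The abstract leaves the imputation scheme open. One plausible choice, sketched below purely as an assumption, is to keep an exponential moving average (EMA) of each worker's past cut-layer outputs and substitute it when that worker's outputs are missing; the class name EmaImputer and its parameters are hypothetical.

```python
import numpy as np

# Hypothetical imputation scheme (the patent abstract does not fix one):
# substitute an exponential moving average of a worker's past cut-layer
# outputs when its outputs are missing for a sample period.

class EmaImputer:
    def __init__(self, alpha: float = 0.1):
        self.alpha = alpha
        self.ema = {}                       # worker id -> running mean output

    def update(self, worker: str, output: np.ndarray) -> None:
        prev = self.ema.get(worker)
        self.ema[worker] = (output if prev is None
                            else (1 - self.alpha) * prev + self.alpha * output)

    def impute(self, worker: str, shape: tuple) -> np.ndarray:
        # Fall back to zeros until the worker has reported at least once.
        return self.ema.get(worker, np.zeros(shape))

imputer = EmaImputer()
imputer.update("w1", np.ones((8, 4)))       # w1 reported this period
h_w2 = imputer.impute("w2", (8, 4))         # w2 missing: zeros, no history yet
```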

Benefits

- Enhances data privacy by allowing training on decentralized data without sharing raw information.
- Improves training efficiency by handling missing data and optimizing gradient calculations in a federated learning setup.

Potential Commercial Applications

Optimizing Federated Learning in Decentralized Systems

- This technology can be utilized by companies offering federated learning solutions to improve model training on distributed data.
- It can be integrated into platforms for secure and efficient collaborative machine learning across multiple entities.

Possible Prior Art

- One potential prior art in this field is federated learning itself, in which models are trained across decentralized devices without centralizing data.
- Google's Federated Learning of Cohorts (FLoC) proposal applied a related privacy-preserving, on-device approach to interest-based advertising.

Unanswered Questions

How does this method handle communication latency between the master node and workers in a federated learning system?

- The abstract does not provide details on how communication delays are managed, which could impact the training efficiency and synchronization of the neural network.

What mechanisms are in place to ensure the security and integrity of imputed values generated for missing layer outputs?

- The abstract mentions generating imputed values for missing data but does not elaborate on security measures or validation processes for those values, leaving open the possibility of bias or inaccuracy in the training process.


Original Abstract Submitted

A method of operating a master node in a vertical federated learning, vFL, system including a plurality of workers for training a split neural network includes receiving layer outputs for a sample period from one or more of the workers for a cut-layer at which the neural network is split between the workers and the master node, and determining whether layer outputs for the cut-layer were not received from one of the workers. In response to determining that layer outputs for the cut-layer were not received from one of the workers, the method includes generating imputed values of the layer outputs that were not received, calculating gradients for neurons in the cut-layer based on the received layer outputs and the imputed layer outputs, splitting the gradients into groups associated with respective ones of the workers, and transmitting the groups of gradients to respective ones of the workers.
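The abstract specifies only the master's side; what each worker does with its gradient group is implied by standard split learning. The hedged sketch below assumes a linear "bottom" model per worker, with illustrative names throughout: the worker backpropagates the received cut-layer gradients through its local layers, so raw features never leave the worker.

```python
import numpy as np

# Assumed worker-side step (standard split learning; not spelled out in
# the abstract): apply the gradient group received from the master to the
# worker's local "bottom" model over its private feature split.

rng = np.random.default_rng(1)
BATCH, FEAT, CUT_DIM, LR = 8, 5, 4, 0.05

x = rng.normal(size=(BATCH, FEAT))          # this worker's private features
Wb = rng.normal(size=(FEAT, CUT_DIM))       # bottom-model weights (linear)

h = x @ Wb                                  # cut-layer output sent to master
grad_h = rng.normal(size=(BATCH, CUT_DIM))  # stand-in for the master's reply

# Chain rule through the local linear layer: dL/dWb = x^T (dL/dh).
grad_Wb = x.T @ grad_h
Wb -= LR * grad_Wb                          # local SGD update; x never leaves
```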