18457002. TRAINING NEURAL NETWORK WITH BUDDING ENSEMBLE ARCHITECTURE BASED ON DIVERSITY LOSS simplified abstract (Intel Corporation)


TRAINING NEURAL NETWORK WITH BUDDING ENSEMBLE ARCHITECTURE BASED ON DIVERSITY LOSS

Organization Name

Intel Corporation

Inventor(s)

Qutub Syed Sha of Munich (DE)

Neslihan Kose Cihangir of Munich (DE)

Rafael Rosales of Unterhaching (DE)

TRAINING NEURAL NETWORK WITH BUDDING ENSEMBLE ARCHITECTURE BASED ON DIVERSITY LOSS - A simplified explanation of the abstract

This abstract first appeared for US patent application 18457002, titled 'TRAINING NEURAL NETWORK WITH BUDDING ENSEMBLE ARCHITECTURE BASED ON DIVERSITY LOSS'.

Simplified Explanation

The patent application describes a method for training deep neural networks (DNNs) with budding ensemble architectures using diversity loss. Here are the key points:

  • The DNN consists of a backbone and multiple heads.
  • The backbone has one or more layers that generate intermediate tensors.
  • The heads are organized in pairs, with each pair consisting of a first head and a second head duplicated from the first head.
  • The second head has the same tensor operations as the first head but different internal parameters.
  • The intermediate tensor from a backbone layer is input into both the first and second heads.
  • The first head computes a first detection tensor, and the second head computes a second detection tensor.
  • The similarity between the first and second detection tensors is used as a diversity loss for training the DNN.
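The forward pass described in the key points above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `backbone`, `Head`, and `diversity_loss` names, the single-layer backbone, and the choice of cosine similarity as the similarity measure are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class Head:
    """A detection head: each duplicate applies the same tensor
    operations but holds its own (differently initialized) parameters."""
    def __init__(self, in_dim, out_dim):
        self.w = rng.normal(scale=0.1, size=(in_dim, out_dim))
        self.b = np.zeros(out_dim)
    def __call__(self, x):
        return relu(x @ self.w + self.b)  # detection tensor

# Backbone: one layer that produces an intermediate tensor.
w_backbone = rng.normal(scale=0.1, size=(8, 16))
def backbone(x):
    return relu(x @ w_backbone)

# A "budding" pair: the second head duplicates the first head's
# structure but carries different internal parameters.
head_a = Head(16, 4)
head_b = Head(16, 4)

def diversity_loss(t1, t2):
    """Similarity between the two detection tensors (cosine similarity
    here, as an illustrative choice); minimizing it during training
    pushes the pair toward different representations."""
    num = np.sum(t1 * t2)
    den = np.linalg.norm(t1) * np.linalg.norm(t2) + 1e-8
    return float(num / den)

x = rng.normal(size=(2, 8))       # a batch of inputs
z = backbone(x)                   # shared intermediate tensor
d1, d2 = head_a(z), head_b(z)     # first and second detection tensors
loss = diversity_loss(d1, d2)
```

Both heads consume the same intermediate tensor `z`, so the extra cost of a pair is only the head parameters, not a second backbone.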

Potential applications of this technology:

  • Image recognition: The method can be used to train DNNs for tasks like object detection and classification in images.
  • Natural language processing: DNNs trained using this method can be applied to tasks like sentiment analysis, language translation, and text generation.
  • Speech recognition: The method can be used to train DNNs for speech recognition and voice command systems.

Problems solved by this technology:

  • Overfitting: The use of diversity loss helps prevent overfitting by encouraging the heads to learn different representations of the data.
  • Lack of diversity in ensembles: By duplicating heads and penalizing similarity between their outputs, the method increases the diversity within the ensemble, leading to improved performance.
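One way the diversity loss could enter the training objective is as a weighted penalty added to the ordinary task loss. The sketch below assumes a mean-squared-error task loss and a cosine-similarity diversity term; the function names and the weighting scheme (`lam`) are hypothetical, chosen only to illustrate the idea.

```python
import numpy as np

def mse(pred, target):
    # Mean-squared-error task loss for one head.
    return float(np.mean((pred - target) ** 2))

def cosine(a, b):
    # Similarity between the two detection tensors.
    return float(np.sum(a * b) /
                 (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def pair_objective(d1, d2, target, lam=0.1):
    """Hypothetical combined objective for one pair of heads: each head
    is fit to the target, and the similarity between their outputs is
    added as a diversity loss. Minimizing the total rewards heads that
    are accurate *and* mutually different."""
    task = mse(d1, target) + mse(d2, target)
    return task + lam * cosine(d1, d2)
```

With this objective, two heads that produce identical outputs pay a higher penalty than two heads that fit the target equally well but disagree, which is the mechanism that counteracts overfitting and ensemble collapse.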

Benefits of this technology:

  • Improved accuracy: The use of diversity loss and budding ensemble architectures can lead to higher accuracy in DNN models.
  • Robustness: The diversity within the ensemble makes the DNN more robust to variations in the input data.
  • Efficient training: Because the paired heads share a single backbone, the method trains an ensemble-like model with fewer computational resources than training multiple fully independent networks.


Original Abstract Submitted

Deep neural networks (DNNs) with budding ensemble architectures may be trained using diversity loss. A DNN may include a backbone and a plurality of heads. The backbone includes one or more layers. A layer in the backbone may generate an intermediate tensor. The plurality of heads may include one or more pairs of heads. A pair of heads includes a first head and a second head duplicated from the first head. The second head may include the same tensor operations as the first head but different internal parameters. The intermediate tensor generated by a backbone layer may be input into both the first head and the second head. The first head may compute a first detection tensor, and the second head may compute a second detection tensor. A similarity between the first detection tensor and the second detection tensor may be used as a diversity loss for training the DNN.