17952848. DATA PROCESSING METHOD AND DEVICE simplified abstract (Huawei Technologies Co., Ltd.)

DATA PROCESSING METHOD AND DEVICE

Organization Name

Huawei Technologies Co., Ltd.

Inventor(s)

Teng Su of Hangzhou (CN)

Tingting Chen of Shenzhen (CN)

Zhenzhang Yang of Hangzhou (CN)

Xiaoda Zhang of Hangzhou (CN)

DATA PROCESSING METHOD AND DEVICE - A simplified explanation of the abstract

This abstract first appeared for US patent application 17952848 titled 'DATA PROCESSING METHOD AND DEVICE'.

Simplified Explanation

The patent application describes a data processing method for distributed parallel model training in the field of artificial intelligence. The method enables hybrid parallelism in a distributed cluster, that is, combining parallelization strategies such as data parallelism and model (tensor) parallelism within a single training job, and can be applied to models for tasks such as text translation, speech recognition, facial recognition, 3D reconstruction, and virtual reality.
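
To make "hybrid parallelism" and "tensor layout" concrete, here is a minimal Python sketch. It is not the patent's API: the Layout class and its mesh_shape/split_axes fields are hypothetical stand-ins for how a per-operator sharding specification might mix data parallelism and model parallelism over one device mesh.

    from dataclasses import dataclass

    # Hypothetical illustration (not the patent's types): a "tensor layout"
    # records, for each tensor dimension, which device-mesh axis it is
    # split along (None = replicated along that dimension).
    @dataclass(frozen=True)
    class Layout:
        mesh_shape: tuple  # e.g. (2, 4): 2-way data parallel x 4-way model parallel
        split_axes: tuple  # one entry per tensor dim: mesh axis index or None

    # Hybrid parallelism in one job: the activation's batch dimension is
    # split along mesh axis 0 (data parallelism), while the weight's output
    # dimension is split along mesh axis 1 (model/tensor parallelism).
    activation_layout = Layout(mesh_shape=(2, 4), split_axes=(0, None))  # [batch, hidden]
    weight_layout = Layout(mesh_shape=(2, 4), split_axes=(None, 1))      # [hidden, out]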

  • The method involves inserting a redistribution operator between operators in a deep neural network model that have an input-output dependency relationship. This allows conversion between different tensor layouts, which describe how a tensor's data is partitioned and arranged across the devices of the cluster.
  • The redistribution operator is inserted into a sliced computational graph, which is a representation of the neural network model divided into smaller parts for parallel processing.
  • An updated sliced computational graph is determined, enabling parallel model training of the deep neural network (a minimal sketch of this insertion step follows below).
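
The core step, walking producer-consumer edges of the sliced computational graph and splicing in a redistribution operator wherever the producer's output layout differs from the layout the consumer expects, can be sketched as follows. The Node class and insert_redistribution function are assumed names used for illustration; the patent does not disclose this code.

    class Node:
        """Toy graph node: one operator in the sliced computational graph."""
        def __init__(self, name, out_layout=None, in_layout=None):
            self.name = name
            self.out_layout = out_layout  # layout of the tensor this operator produces
            self.in_layout = in_layout    # layout this operator expects on its inputs
            self.inputs = []              # producer nodes (input-output dependencies)

    def insert_redistribution(graph_nodes):
        """Splice a redistribution operator onto every producer-consumer
        edge whose tensor layouts disagree, yielding the updated graph."""
        for consumer in graph_nodes:
            for i, producer in enumerate(consumer.inputs):
                if producer.out_layout != consumer.in_layout:
                    redist = Node("redistribute_%s_to_%s" % (producer.name, consumer.name),
                                  in_layout=producer.out_layout,
                                  out_layout=consumer.in_layout)
                    redist.inputs = [producer]
                    consumer.inputs[i] = redist  # rewire the edge through the new operator
        return graph_nodes

After this pass, every edge in the updated graph connects matching layouts, which is what lets the sliced sub-models run in parallel without layout mismatches.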

Potential applications of this technology:

  • Distributed training of text translation models
  • Distributed training of speech recognition models
  • Distributed training of facial recognition models
  • Distributed training of 3D reconstruction models
  • Distributed training of virtual reality models

Problems solved by this technology:

  • Enables hybrid parallelism in distributed clusters, improving the efficiency of model training.
  • Facilitates conversion between different tensor layouts, so operators whose inputs and outputs are sharded differently can still be connected correctly (see the worked example after this list).
  • Enables parallel model training of deep neural networks, reducing the time required for training large models.
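
As a toy illustration of what "conversion between tensor layouts" means, the snippet below reshards a 4x4 matrix held by two simulated devices from a row-split layout to a column-split layout using plain NumPy. On a real cluster, the redistribution operator would perform this repacking with collective communication (for example, an all-to-all); that detail is an assumption here, not quoted from the patent.

    import numpy as np

    full = np.arange(16).reshape(4, 4)

    # Source layout: split along rows (device 0 holds rows 0-1, device 1 rows 2-3).
    row_shards = [full[:2, :], full[2:, :]]

    # Target layout: split along columns. The redistribution exchanges the
    # off-diagonal blocks between the two simulated devices.
    col_shards = [
        np.concatenate([shard[:, :2] for shard in row_shards], axis=0),  # device 0: cols 0-1
        np.concatenate([shard[:, 2:] for shard in row_shards], axis=0),  # device 1: cols 2-3
    ]

    # Both layouts represent the same logical tensor.
    assert np.array_equal(np.concatenate(col_shards, axis=1), full)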

Benefits of this technology:

  • Faster and more efficient training of deep neural network models.
  • Improved scalability and utilization of distributed clusters.
  • Enhanced flexibility in integrating different parts of the neural network model.
  • Enables training of large-scale models for various AI applications.


Original Abstract Submitted

Embodiments of this application disclose a data processing method, and relate to the field of artificial intelligence. The method is applied to distributed parallel model training, for example, distributed training of a text translation model, a speech recognition model, a facial recognition model, a three-dimensional reconstruction model, and a virtual reality model. The method can implement hybrid parallelism in a distributed cluster. The method includes: inserting, based on tensor layouts of tensors of at least one operator in a deep neural network model, a redistribution operator between operators that have an input-output dependency relationship, to implement conversion between different tensor layouts; inserting the redistribution operator into a sliced computational graph; and determining an updated sliced computational graph to implement parallel model training of the deep neural network.