Huawei Technologies Co., Ltd. (20240249133). SYSTEMS, APPARATUSES, METHODS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE DEVICES FOR TRAINING ARTIFICIAL-INTELLIGENCE MODELS USING ADAPTIVE DATA-SAMPLING simplified abstract

SYSTEMS, APPARATUSES, METHODS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE DEVICES FOR TRAINING ARTIFICIAL-INTELLIGENCE MODELS USING ADAPTIVE DATA-SAMPLING

Organization Name

Huawei Technologies Co., Ltd.

Inventor(s)

Habib Hajimolahoseini of Toronto (CA)

Ali Saheb Pasand of Waterloo (CA)

Ehsan Kamalloo of Waterloo (CA)

Mehdi Rezagholi Zadeh of Vaughan (CA)

Yang Liu of Toronto (CA)

SYSTEMS, APPARATUSES, METHODS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE DEVICES FOR TRAINING ARTIFICIAL-INTELLIGENCE MODELS USING ADAPTIVE DATA-SAMPLING - A simplified explanation of the abstract

This abstract first appeared for US patent application 20240249133, titled 'SYSTEMS, APPARATUSES, METHODS, AND NON-TRANSITORY COMPUTER-READABLE STORAGE DEVICES FOR TRAINING ARTIFICIAL-INTELLIGENCE MODELS USING ADAPTIVE DATA-SAMPLING'.

Simplified Explanation:

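The method described in the patent application calculates an importance metric for each data sample from the predictions an artificial intelligence (AI) model made on that sample in previous training epochs, converts those metrics into sampling probabilities, and then selects a subset of samples according to those probabilities to train the AI model.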

Key Features and Innovation:
  • Importance metrics of data samples are calculated from the AI model's predictions in previous training epochs, without using the samples' labels or the model's learning rate.
  • Sampling probabilities are determined based on these importance metrics.
  • A subset of data samples is selected according to those probabilities, and the AI model is trained on that subset for one or more epochs.
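
The first two features above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the method claimed in the application: the abstract does not say how the importance metric is computed, so the variance of predictions across epochs is used here as a stand-in, and the function names and the `uniform_mix` parameter are hypothetical.

```python
import numpy as np

def importance_from_predictions(pred_history):
    """Score each sample from the model's past predictions only.

    pred_history has shape (epochs, samples, classes) and holds the
    predicted class probabilities recorded in previous epochs.
    No labels and no learning rate are used.
    """
    pred_history = np.asarray(pred_history, dtype=float)
    # ASSUMPTION: variance across epochs, averaged over classes.
    # Samples whose predictions keep changing get a higher score.
    return pred_history.var(axis=0).mean(axis=-1)

def sampling_probabilities(importance, uniform_mix=0.1):
    """Turn scores into probabilities that sum to 1.

    ASSUMPTION: a small uniform component (uniform_mix) is blended in
    so that every sample keeps a non-zero chance of being selected.
    """
    importance = np.asarray(importance, dtype=float)
    n = importance.size
    total = importance.sum()
    if total <= 0:
        # All scores are zero: fall back to uniform sampling.
        return np.full(n, 1.0 / n)
    return (1.0 - uniform_mix) * (importance / total) + uniform_mix / n

# Three samples, two classes, predictions recorded over three epochs.
# Only the middle sample's predictions change between epochs.
history = np.array([
    [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]],
    [[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]],
    [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]],
])
probs = sampling_probabilities(importance_from_predictions(history))
print(probs)
```

In this toy input the middle sample receives by far the largest sampling probability, because it is the only one whose predictions vary from epoch to epoch.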

Potential Applications:

This technology could be used in various fields where training AI models with limited labeled data is a challenge, such as healthcare, finance, and manufacturing.

Problems Solved:

This technology addresses the issue of training AI models efficiently, including in settings with limited labeled data, by selecting training samples according to importance metrics that are computed from the model's own predictions rather than from labels.

Benefits:
  • Improved efficiency in training AI models with limited labeled data.
  • Enhanced performance of AI models by focusing on important data samples.
  • Potential cost savings by reducing the need for extensive labeling of data.

Commercial Applications:

Potential commercial applications include AI-driven decision-making systems, predictive analytics tools, and automated data processing solutions for industries with limited labeled data resources.

Questions about the Technology:

1. How does this method improve the training process of AI models compared to traditional approaches?
2. What are the potential limitations or challenges of implementing this technology in real-world applications?


Original Abstract Submitted

A method has the steps of: calculating importance metrics of a plurality of data samples based on predictions of an artificial-intelligence (AI) model obtained from the plurality of data samples in a plurality of previous training epochs without using labels of the plurality of data samples and without using a learning rate of the AI model; calculating sampling probabilities of the plurality of data samples based on the importance metrics thereof; selecting a subset of the plurality of data samples based on the sampling probabilities of the plurality of data samples; and training the AI model using the selected subset of the plurality of data samples for one or more epochs.
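
The four steps of the abstract can be arranged into a training driver, sketched below in Python. This is a rough sketch under stated assumptions rather than the applicants' implementation: `predict` and `train_epochs` are hypothetical callables supplied by the caller, the warm-up phase that fills the prediction history is assumed, and the variance-based importance score is an illustrative choice that the abstract does not specify.

```python
import numpy as np

def adaptive_sampling_loop(predict, train_epochs, n_samples, subset_size,
                           warmup_epochs=3, rounds=2, epochs_per_round=1,
                           seed=0):
    """Alternate between scoring samples and training on a sampled subset.

    predict()            -> array (n_samples, n_classes) of predictions
                            on all samples (hypothetical callable).
    train_epochs(idx, n) -> trains the model on samples `idx` for `n`
                            epochs (hypothetical callable).
    Returns the list of index arrays selected in each round.
    """
    rng = np.random.default_rng(seed)
    all_idx = np.arange(n_samples)
    history = []

    # ASSUMPTION: a warm-up on the full dataset provides the
    # "previous training epochs" whose predictions are scored later.
    for _ in range(warmup_epochs):
        train_epochs(all_idx, 1)
        history.append(predict())

    selected = []
    for _ in range(rounds):
        preds = np.stack(history)  # (epochs, n_samples, n_classes)
        # Step 1: importance from past predictions only (no labels,
        # no learning rate). ASSUMPTION: variance across epochs.
        importance = preds.var(axis=0).mean(axis=-1) + 1e-8
        # Step 2: sampling probabilities.
        probs = importance / importance.sum()
        # Step 3: draw a subset without replacement.
        subset = rng.choice(n_samples, size=subset_size,
                            replace=False, p=probs)
        # Step 4: train on the subset for one or more epochs.
        train_epochs(subset, epochs_per_round)
        history.append(predict())
        selected.append(subset)
    return selected
```

The small constant added to the importance scores keeps every probability above zero, which `rng.choice` needs when drawing without replacement. Labels may still be used inside `train_epochs` for the training step itself; only the scoring step is label-free in this sketch.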