17943101. SYSTEM FOR LABELING A DATA SET simplified abstract (TOYOTA JIDOSHA KABUSHIKI KAISHA)

From WikiPatents

SYSTEM FOR LABELING A DATA SET

Organization Name

TOYOTA JIDOSHA KABUSHIKI KAISHA

Inventor(s)

Monica PhuongThao Van of Palo Alto CA (US)

Yin-Ying Chen of San Jose CA (US)

Kenton Michael Lyons of Los Altos CA (US)

Francine Chen of Los Altos CA (US)

SYSTEM FOR LABELING A DATA SET - A simplified explanation of the abstract

This abstract first appeared for US patent application 17943101, titled 'SYSTEM FOR LABELING A DATA SET'.

Simplified Explanation

The abstract describes a method for labeling a data set using a coding model. The method starts from a group of initial labels, generates multiple sets of related labels by processing the data set with those initial labels, counts how often each initial and related label occurs in the data set, computes a breadth score for each initial label from the occurrence counts of its related labels, updates any initial label whose breadth score satisfies a label updating condition, and finally labels the data set.

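  • Generating multiple sets of related initial labels by processing the data set with a group of initial labels
  • Determining the quantity of occurrences of each initial label and each related label in the data set
  • Calculating a breadth score for each initial label from the occurrence counts of its related labels
  • Updating initial labels whose breadth scores satisfy a label updating condition
  • Labeling the data set based on the initial labels and the related labels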
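
The steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not define how related labels are generated, how the breadth score is computed, or what the label updating condition is. The sketch assumes that the data set is a list of short texts, that the related labels are supplied by hand in a hypothetical `related` mapping (standing in for the coding model), that the breadth score is the number of related labels that occur at least once, and that a label meeting a hypothetical `threshold` is replaced by its related labels. All function and variable names are illustrative.

```python
import re
from collections import Counter


def matches(label, text):
    """Case-insensitive whole-word match of a label inside a text."""
    return re.search(r"\b" + re.escape(label) + r"\b", text, re.IGNORECASE) is not None


def count_occurrences(documents, labels):
    """Count the number of documents in which each label occurs."""
    counts = Counter()
    for label in labels:
        counts[label] = sum(1 for doc in documents if matches(label, doc))
    return counts


def breadth_score(label, related, counts, min_count=1):
    """ASSUMED definition: number of related labels seen at least min_count times."""
    return sum(1 for r in related.get(label, []) if counts[r] >= min_count)


def update_labels(initial_labels, related, counts, threshold=2):
    """ASSUMED update rule: a label whose breadth score reaches the threshold
    is treated as too broad and replaced by its related labels that occur."""
    updated = []
    for label in initial_labels:
        if breadth_score(label, related, counts) >= threshold:
            updated.extend(r for r in related[label] if counts[r] >= 1)
        else:
            updated.append(label)
    return updated


def label_documents(documents, labels):
    """Attach to each document every label that occurs in it."""
    return [[label for label in labels if matches(label, doc)] for doc in documents]


# Toy data set and hand-written related labels (both hypothetical).
documents = [
    "The brake pedal felt soft after the service.",
    "Engine noise increased and the brake light stayed on.",
    "Seat comfort is great on long trips.",
    "The engine stalls when cold.",
]
initial_labels = ["vehicle issue", "comfort"]
related = {
    "vehicle issue": ["brake", "engine"],
    "comfort": ["seat"],
}

all_labels = initial_labels + [r for group in related.values() for r in group]
counts = count_occurrences(documents, all_labels)
updated = update_labels(initial_labels, related, counts)
labeled = label_documents(documents, updated)

print(updated)  # ['brake', 'engine', 'comfort']
print(labeled)  # [['brake'], ['brake', 'engine'], ['comfort'], ['engine']]
```

In this toy run, "vehicle issue" has two related labels that occur in the data, so under the assumed rule it is replaced by "brake" and "engine", while "comfort" has only one occurring related label and is kept as is. A different breadth definition or updating condition would change which labels are updated.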

Potential Applications

This technology could be applied in various fields such as data analysis, machine learning, natural language processing, and information retrieval.

Problems Solved

The method aims to label large data sets efficiently, improve the accuracy of data classification, and improve the performance of machine learning algorithms that rely on the labeled data.

Benefits

  • Streamlined data labeling process
  • Increased accuracy in data classification
  • Enhanced performance of machine learning models

Potential Commercial Applications

  • Optimizing data labeling processes for businesses
  • Improving the efficiency of machine learning algorithms in various industries

Unanswered Questions

How does this method compare to existing data labeling techniques?

The abstract does not compare the method with existing labeling techniques. It describes a systematic approach that labels data sets using related initial labels and breadth scores, which could improve accuracy and efficiency compared with traditional methods.

What are the computational requirements for implementing this labeling method?

The abstract does not specify the computational resources or infrastructure needed to implement this labeling method, which could be crucial for practical applications.


Original Abstract Submitted

A method for labeling a data set by a coding model includes generating multiple sets of related initial labels based on processing a data set with a group of initial labels. The method also includes determining a quantity of occurrences, within the data set, of each one of the group of initial labels and each related initial label of the multiple sets of related initial labels. The method further includes determining, for each initial label of the group of initial labels, a breadth score based on the number of occurrences of each related initial label. The method still further includes updating one or more of the group of initial labels based on respective breadth scores satisfying a label updating condition. The method also includes labeling the data set based on the group of initial labels and the multiple sets of related initial labels.