17457922. UNLABELED LOG ANOMALY CONTINUOUS LEARNING simplified abstract (INTERNATIONAL BUSINESS MACHINES CORPORATION)

From WikiPatents

UNLABELED LOG ANOMALY CONTINUOUS LEARNING

Organization Name

INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor(s)

Sahil Bansal of Kurukshetra (IN)

Harshit Kumar of Delhi (IN)

Lu An of Raleigh NC (US)

Xiaotong Liu of San Jose CA (US)

Anbang Xu of San Jose CA (US)

UNLABELED LOG ANOMALY CONTINUOUS LEARNING - A simplified explanation of the abstract

This abstract first appeared for US patent application 17457922 titled 'UNLABELED LOG ANOMALY CONTINUOUS LEARNING'.

Simplified Explanation

The patent application describes a method for automatically classifying log lines as erroneous or non-erroneous and using these classifications to train a log anomaly model. Here are the key points:

  • The method uses computer processors to classify each log line as either erroneous or non-erroneous.
  • The classified log lines are then templatized, meaning they are converted into standardized templates.
  • The erroneous log templates and non-erroneous log templates are clustered separately.
  • Clusters that exceed a certain frequency threshold are eliminated.
  • The remaining clusters are used to train a log anomaly model.
  • The trained model can then be used to identify subsequent log lines as anomalous or non-anomalous.
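
The preprocessing steps listed above (classify, templatize, cluster, eliminate) can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not say how lines are classified, templatized, or clustered, so the keyword list `ERROR_KEYWORDS`, the regex masking in `templatize`, the grouping of identical templates as a stand-in for clustering, and the absolute count threshold `max_count` are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical keyword list; the abstract does not specify the classifier.
ERROR_KEYWORDS = ("error", "exception", "fail", "fatal", "timeout")


def classify(line: str) -> str:
    """Label a raw log line as 'erroneous' or 'non-erroneous' (keyword heuristic)."""
    lowered = line.lower()
    if any(keyword in lowered for keyword in ERROR_KEYWORDS):
        return "erroneous"
    return "non-erroneous"


def templatize(line: str) -> str:
    """Replace variable fields with placeholders (assumed masking rules)."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", line)  # IPv4 addresses
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)          # hex literals
    line = re.sub(r"\d+", "<NUM>", line)                         # remaining numbers
    return line


def cluster_templates(lines):
    """Group templates per class; identical templates form one cluster (simplification)."""
    clusters = {"erroneous": Counter(), "non-erroneous": Counter()}
    for line in lines:
        clusters[classify(line)][templatize(line)] += 1
    return clusters


def eliminate_frequent(clusters, max_count: int):
    """Drop clusters whose line count exceeds the frequency threshold."""
    return {
        label: {template: n for template, n in counts.items() if n <= max_count}
        for label, counts in clusters.items()
    }


logs = [
    "Connection from 10.0.0.1 port 22",
    "Connection from 10.0.0.2 port 2222",
    "Connection from 10.0.0.3 port 80",
    "Disk read error at block 512",
    "User 42 logged in",
]
remaining = eliminate_frequent(cluster_templates(logs), max_count=2)
```

In this toy input, the three connection lines collapse into one template that exceeds the threshold and is dropped, while the rarer templates in each class remain available for training.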

Potential applications of this technology:

  • Log analysis: This method can be used to automatically analyze large volumes of log data and identify anomalous log lines, which can be helpful for troubleshooting and detecting system issues.
  • Cybersecurity: By identifying anomalous log lines, this method can assist in detecting potential security breaches or suspicious activities in computer systems.

Problems solved by this technology:

  • Manual log analysis: This method automates the process of log analysis, saving time and effort compared to manual inspection of log lines.
  • Error detection: By classifying log lines as erroneous or non-erroneous, this method helps in identifying errors or abnormalities in system logs.

Benefits of this technology:

  • Efficiency: The automated classification and templatization process speeds up log analysis and reduces the need for manual inspection.
  • Accuracy: By training a log anomaly model using classified log templates, this method can improve the accuracy of identifying anomalous log lines.
  • Scalability: This method can handle large volumes of log data, making it suitable for analyzing logs from complex systems.


Original Abstract Submitted

One or more computer processors classify each log line in a plurality of unlabeled log lines as an erroneous log line or a non-erroneous log line. The one or more computer processors templatize each classified erroneous log line and non-erroneous log line in the plurality of unlabeled log lines. The one or more computer processors cluster erroneous log templates into erroneous log template clusters and the non-erroneous log templates into non-erroneous log template clusters. The one or more computer processors eliminate the erroneous log template clusters and the non-erroneous log template clusters that exceed a frequency threshold. The one or more computer processors train a log anomaly model utilizing remaining erroneous log template clusters and remaining non-erroneous log template clusters. The one or more computer processors identify a subsequent log line as anomalous or non-anomalous utilizing the trained log anomaly model.
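
The last two sentences of the abstract (training on the remaining clusters, then identifying a subsequent log line) might look something like the sketch below. The abstract does not name the model type, so a nearest-template lookup stands in for the log anomaly model here. The class `NearestTemplateModel`, the helper `mask`, the `min_similarity` cutoff, and the rule that a match against an erroneous template counts as anomalous are all hypothetical choices for illustration.

```python
import re
from difflib import SequenceMatcher


def mask(line: str) -> str:
    """Reduce a raw line to a template by masking numbers (assumed rule)."""
    return re.sub(r"\d+", "<NUM>", line)


class NearestTemplateModel:
    """Hypothetical stand-in for the log anomaly model."""

    def __init__(self, min_similarity: float = 0.8):
        self.min_similarity = min_similarity
        self.templates = []  # list of (template, label) pairs

    def train(self, erroneous, non_erroneous):
        """Store the templates from the clusters that survived elimination."""
        self.templates = [(t, "erroneous") for t in erroneous]
        self.templates += [(t, "non-erroneous") for t in non_erroneous]

    def identify(self, line: str) -> str:
        """Return 'anomalous' or 'non-anomalous' for a subsequent log line."""
        template = mask(line)
        best_label, best_score = None, 0.0
        for known, label in self.templates:
            score = SequenceMatcher(None, template, known).ratio()
            if score > best_score:
                best_label, best_score = label, score
        # Assumption: unfamiliar lines and lines resembling erroneous
        # templates are both reported as anomalous.
        if best_score < self.min_similarity or best_label == "erroneous":
            return "anomalous"
        return "non-anomalous"


model = NearestTemplateModel()
model.train(
    erroneous=["Disk read error at block <NUM>"],
    non_erroneous=["User <NUM> logged in"],
)
```

Calling `model.identify("User 7 logged in")` matches a known non-erroneous template exactly, whereas a line resembling the erroneous template, or one unlike anything seen in training, is reported as anomalous under these assumptions.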