17531158. TRAINING AND USING A MEMORY FAILURE PREDICTION MODEL simplified abstract (Microsoft Technology Licensing, LLC)

From WikiPatents

TRAINING AND USING A MEMORY FAILURE PREDICTION MODEL

Organization Name

Microsoft Technology Licensing, LLC

Inventor(s)

Jasmine Grace Schlichting of Seattle WA (US)

Bhuvan Malladihalli Shashidhara of Bellevue WA (US)

Ramakoti R. Bhimanadhuni of Bothell WA (US)

Emily Nicole Wilson of Seattle WA (US)

Farah Farzana of Redmond WA (US)

Michael Wayne Stephenson of Woodinville WA (US)

Pallavi Baral of Redmond WA (US)

Josh Charles Moore of Lynnwood WA (US)

Christina Margaret Tobias of Seattle WA (US)

John A. Strange of Everett WA (US)

Peter Hanpeng Jiang of Kirkland WA (US)

Sebastien Nathan R Levy of Seattle WA (US)

Brett Kenneth Dodds of Boise ID (US)

Arhatha Bramhanand of Redmond WA (US)

Juan Arturo Herrera Ortiz of Seattle WA (US)

Ahu Oral of Seattle WA (US)

Charlotte Gauchet of Redmond WA (US)

Daniel Sebastian Berger of Seattle WA (US)

TRAINING AND USING A MEMORY FAILURE PREDICTION MODEL - A simplified explanation of the abstract

This abstract first appeared for US patent application 17531158, titled 'TRAINING AND USING A MEMORY FAILURE PREDICTION MODEL'.

Simplified Explanation

The patent application describes a method for training a prediction model on telemetry error data and using it to predict uncorrectable error (UE) events in memory before they occur.

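  • Sets of UE state labels and non-UE state labels are generated from a first set of collected telemetry data. Each UE state label references a UE and the telemetry data from an interval before that UE.
  • Statistical features are extracted from the telemetry data associated with the UE and non-UE state labels.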
  • The extracted statistical features are used to train a UE state prediction model.
  • A second set of telemetry data is obtained and used to predict a UE event using the trained prediction model.
  • A preventative operation is performed on a memory page based on the predicted UE event, so that the predicted event does not occur.
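
The steps above can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the abstract does not say which statistical features or which model type are used, so the features (mean, standard deviation, maximum of per-interval error counts), the nearest-centroid classifier, the sample data, and every name in the code (extract_features, CentroidModel, preventative_operation) are assumptions chosen to keep the example self-contained.

```python
import statistics
from typing import Dict, List, Sequence, Set, Tuple

Features = Tuple[float, float, float]

def extract_features(error_counts: Sequence[int]) -> Features:
    """Assumed features: mean, population std dev, and max of per-interval error counts."""
    return (
        statistics.fmean(error_counts),
        statistics.pstdev(error_counts),
        float(max(error_counts)),
    )

class CentroidModel:
    """Stand-in for the UE state prediction model (the abstract names no model type).

    It stores the mean feature vector of each class and predicts the class
    whose centroid is closer to the input.
    """

    def fit(self, X: List[Features], y: List[bool]) -> "CentroidModel":
        self.centroids: Dict[bool, Tuple[float, ...]] = {}
        for cls in (True, False):
            rows = [x for x, label in zip(X, y) if label == cls]
            self.centroids[cls] = tuple(statistics.fmean(col) for col in zip(*rows))
        return self

    def predict(self, x: Features) -> bool:
        def dist2(c: Tuple[float, ...]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return dist2(self.centroids[True]) < dist2(self.centroids[False])

def preventative_operation(page_id: str, handled: Set[str]) -> None:
    """Placeholder: a real system might retire the page or migrate its contents."""
    handled.add(page_id)

# First telemetry set (made-up numbers): error counts per interval.
ue_windows = [[0, 2, 5, 9], [1, 3, 6, 12]]       # windows that preceded a UE
non_ue_windows = [[0, 0, 1, 0], [0, 1, 0, 0]]    # windows with no following UE

X = [extract_features(w) for w in ue_windows + non_ue_windows]
y = [True] * len(ue_windows) + [False] * len(non_ue_windows)
model = CentroidModel().fit(X, y)

# Second telemetry set: recent counts per memory page (hypothetical page ids).
recent = {"page_7": [1, 2, 6, 10], "page_8": [0, 0, 0, 1]}
handled: Set[str] = set()
for page_id, counts in recent.items():
    if model.predict(extract_features(counts)):
        preventative_operation(page_id, handled)
```

In this toy data only page_7 resembles the windows that preceded a UE, so it is the only page passed to the placeholder operation. A production model would need far more data and a properly evaluated classifier.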

Potential Applications

This technology can be applied in various industries and systems where telemetry data is collected and analyzed. Some potential applications include:

  • Computer systems: Predicting and preventing uncorrectable errors in computer memory to improve system reliability.
  • Manufacturing: Identifying potential faults in production equipment based on telemetry data to prevent breakdowns and optimize maintenance schedules.
  • Energy systems: Predicting and preventing failures in power generation equipment based on telemetry data to minimize downtime and improve efficiency.

Problems Solved

This technology addresses the following problems:

  • Uncorrectable errors: Predicting uncorrectable memory errors from telemetry data helps prevent system failures and improve overall reliability.
  • Reactive maintenance: By predicting potential errors, preventative operations can be performed to avoid costly breakdowns and reduce the need for reactive maintenance.
  • Data analysis efficiency: The use of statistical features and prediction models streamlines the analysis process, allowing for quicker identification and prevention of errors.

Benefits

The use of this technology offers several benefits:

  • Improved system reliability: By predicting and preventing uncorrectable errors, system downtime and failures can be minimized, leading to increased reliability.
  • Cost savings: Preventative operations based on predicted errors help avoid costly breakdowns and reduce maintenance expenses.
  • Efficient data analysis: The use of statistical features and prediction models allows for faster and more accurate analysis of telemetry data, saving time and resources.


Original Abstract Submitted

The disclosure herein describes training and using an uncorrectable error (UE) state prediction model based on telemetry error data. Sets of UE state labels and non-UE state labels are generated from a first set of collected telemetry data, wherein the UE state labels each reference a UE and telemetry data of an interval prior to the referenced UE. Statistical features are extracted from telemetry data of the sets of UE state labels and non-UE state labels, and the extracted statistical features are used to train a UE state prediction model. A second set of collected telemetry data is obtained, and a UE event is predicted based on the second set of collected telemetry data using the trained UE state prediction model. A preventative operation is performed on a memory page of the system based on the predicted UE event, whereby the predicted UE event is prevented from occurring.
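
The labeling step in the abstract, in which each UE state label references a UE together with the telemetry from an interval before it, could look roughly like the Python sketch below. The event schema (timestamp, node identifier, "CE" or "UE" kind), the fixed interval length, and the way non-UE labels pick a reference time are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Event:
    ts: float   # timestamp in seconds (assumed unit)
    node: str   # hypothetical identifier of the machine or memory module
    kind: str   # assumed kinds: "CE" (correctable) or "UE" (uncorrectable)

@dataclass
class Label:
    node: str
    ref_ts: float         # time of the UE, or a chosen reference time for non-UE labels
    is_ue: bool
    window: List[Event]   # telemetry from the interval before ref_ts

def window_before(events: List[Event], node: str, ref_ts: float, interval: float) -> List[Event]:
    """Events for one node in the half-open interval [ref_ts - interval, ref_ts)."""
    return [e for e in events if e.node == node and ref_ts - interval <= e.ts < ref_ts]

def make_labels(events: List[Event], interval: float, non_ue_ref_ts: float) -> List[Label]:
    labels: List[Label] = []
    ue_nodes = set()
    # One UE state label per UE, holding the telemetry that preceded it.
    for e in events:
        if e.kind == "UE":
            ue_nodes.add(e.node)
            labels.append(Label(e.node, e.ts, True, window_before(events, e.node, e.ts, interval)))
    # One non-UE state label per node that never reported a UE (an assumed policy).
    for node in sorted({e.node for e in events} - ue_nodes):
        labels.append(
            Label(node, non_ue_ref_ts, False, window_before(events, node, non_ue_ref_ts, interval))
        )
    return labels

events = [
    Event(10, "a", "CE"), Event(50, "a", "CE"), Event(90, "a", "CE"),
    Event(100, "a", "UE"), Event(95, "b", "CE"),
]
labels = make_labels(events, interval=60, non_ue_ref_ts=100)
```

The window excludes the UE itself, so features computed from it reflect only what was observable before the failure.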