18539996. Fault Isolation and Recovery of CPU Cores for Failed Secondary Asymmetric Multiprocessing Instance simplified abstract (Cisco Technology, Inc.)

From WikiPatents

Fault Isolation and Recovery of CPU Cores for Failed Secondary Asymmetric Multiprocessing Instance

Organization Name

Cisco Technology, Inc.

Inventor(s)

Amit Chandra of San Jose CA (US)

Nivin Lawrence of Fremont CA (US)

Etienne Martineau of Gatineau (CA)

Fault Isolation and Recovery of CPU Cores for Failed Secondary Asymmetric Multiprocessing Instance - A simplified explanation of the abstract

This abstract first appeared for US patent application 18539996, titled 'Fault Isolation and Recovery of CPU Cores for Failed Secondary Asymmetric Multiprocessing Instance'.

Simplified Explanation

The patent application describes a system that runs a secondary software instance in parallel with a primary instance and, if the secondary instance faults, halts its cores and recovers its bootstrap core without impacting the primary instance.

  • The system includes one or more processors and storage media holding instructions that execute a software process of a secondary instance, which runs in parallel with a primary instance.
  • The secondary instance is associated with multiple cores, including a bootstrap core, and a non-maskable interrupt (NMI) is registered for the bootstrap core in the secondary instance.
  • If the secondary instance is in a fault state, the system halts the cores associated with the secondary instance without affecting the primary instance, then recovers the bootstrap core by switching its context back to the primary instance via the NMI.
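The sequence in the bullets above can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the mechanism claimed in the application: real NMI registration and context switching happen in firmware or kernel code, and every name here (`Core`, `SecondaryInstance`, `isolate_and_recover`, `demo`) is hypothetical. The abstract also does not say what happens to the non-bootstrap cores after they are halted, so the sketch simply leaves them halted.

```python
from dataclasses import dataclass
from enum import Enum


class Owner(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


class CoreState(Enum):
    RUNNING = "running"
    HALTED = "halted"


@dataclass
class Core:
    core_id: int
    owner: Owner
    state: CoreState = CoreState.RUNNING
    is_bootstrap: bool = False
    nmi_registered: bool = False  # stands in for a real NMI handler


@dataclass
class SecondaryInstance:
    cores: list
    faulted: bool = False

    def bootstrap_core(self) -> Core:
        return next(c for c in self.cores if c.is_bootstrap)

    def register_nmi(self) -> None:
        # Assumption: registration is modeled as a flag on the bootstrap core.
        self.bootstrap_core().nmi_registered = True


def isolate_and_recover(secondary: SecondaryInstance) -> bool:
    """Halt a faulted secondary instance and hand its bootstrap core back."""
    if not secondary.faulted:
        return False  # nothing to do; the secondary instance is healthy
    # Halt only the cores owned by the secondary instance.
    for core in secondary.cores:
        core.state = CoreState.HALTED
    bsp = secondary.bootstrap_core()
    if not bsp.nmi_registered:
        raise RuntimeError("no NMI registered for the bootstrap core")
    # Simulated NMI: switch the bootstrap core's context to the primary.
    bsp.owner = Owner.PRIMARY
    bsp.state = CoreState.RUNNING
    return True


def demo():
    primary_cores = [Core(0, Owner.PRIMARY), Core(1, Owner.PRIMARY)]
    secondary = SecondaryInstance(
        cores=[Core(2, Owner.SECONDARY, is_bootstrap=True),
               Core(3, Owner.SECONDARY)]
    )
    secondary.register_nmi()
    secondary.faulted = True  # pretend a fault was detected
    isolate_and_recover(secondary)
    return primary_cores, secondary
```

After `demo()` runs, the primary cores are untouched, the bootstrap core (core 2) is running under the primary instance, and core 3 remains halted. Modeling ownership as a field on each core keeps the "without impact to the primary instance" property easy to check: the recovery routine never iterates over primary cores at all.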

Potential Applications

This technology could be applied in high availability systems where continuous operation is critical, such as servers, data centers, or industrial control systems.

Problems Solved

This technology addresses fault isolation in systems that run multiple instances in parallel: a fault in the secondary instance is contained so that it does not impact the primary instance, and the secondary instance's bootstrap core can be reclaimed afterward.

Benefits

Because a faulted secondary instance can be halted and its bootstrap core returned to the primary instance without disturbing the primary instance, the system as a whole can keep operating through a secondary-instance fault, which increases its reliability.

Potential Commercial Applications

Commercial applications for this technology could include server systems, cloud computing platforms, and other high-performance computing environments where fault tolerance is essential.

Possible Prior Art

Prior art in the field of fault tolerance and parallel processing systems may include techniques for error detection and recovery, as well as methods for running multiple instances of software in parallel.

Unanswered Questions

How does this technology compare to traditional fault tolerance methods in parallel processing systems?

This article does not provide a direct comparison to traditional fault tolerance methods in parallel processing systems.

What impact could this technology have on overall system performance and efficiency?

The article does not address the potential impact of this technology on overall system performance and efficiency.


Original Abstract Submitted

According to certain embodiments, a system includes one or more processors and one or more computer-readable non-transitory storage media comprising instructions that, when executed by the one or more processors, cause one or more components to perform operations including executing a software process of a secondary instance, the secondary instance running in parallel with a primary instance and associated with a plurality of cores including a bootstrap core, registering a non-maskable interrupt for the bootstrap core in the secondary instance, determining whether the secondary instance is in a fault state, wherein, if the secondary instance is in the fault state, halting the plurality of cores associated with the secondary instance, without impact to the primary instance, and recovering the bootstrap core by switching a context of the bootstrap core from the secondary instance to the primary instance via the non-maskable interrupt.