This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these messages)
|
Checkpointing is a technique that provides fault tolerance for computing systems. It basically consists of saving a snapshot of the application's state, so that applications can restart from that point in case of failure. This is particularly important for long running applications that are executed in failure-prone computing systems.