Self-stabilization

Self-stabilization is a concept of fault-tolerance in distributed systems. Given any initial state, a self-stabilizing distributed system will end up in a correct state in a finite number of execution steps.

At first glance, the guarantee of self stabilization may seem less promising than that of the more traditional fault-tolerance of algorithms, that aim to guarantee that the system always remains in a correct state under certain kinds of state transitions. However, that traditional fault tolerance cannot always be achieved. For example, it cannot be achieved when the system is started in an incorrect state or is corrupted by an intruder. Moreover, because of their complexity, it is very hard to debug and to analyze distributed systems. Hence, it is very hard to prevent a distributed system from reaching an incorrect state. Indeed, some forms of self-stabilization are incorporated into many modern computer and telecommunications networks, since it gives them the ability to cope with faults that were not foreseen in the design of the algorithm.

Many years after the seminal paper of Edsger Dijkstra in 1974, this concept remains important as it presents an important foundation for self-managing computer systems and fault-tolerant systems. As a result, Dijkstra's paper received the 2002 ACM PODC Influential-Paper Award, one of the highest recognitions in the distributed computing community.[1] Moreover, after Dijkstra's death, the award was renamed and is now called the Dijkstra Award.

  1. ^ "PODC Influential Paper Award: 2002", ACM Symposium on Principles of Distributed Computing, retrieved 2009-09-01