Heartbeat (computing)

In computer science, a heartbeat is a periodic signal generated by hardware or software to indicate normal operation or to synchronize other parts of a computer system.[1][2] Heartbeat mechanism is one of the common techniques in mission critical systems for providing high availability and fault tolerance of network services by detecting the network or systems failures of nodes or daemons which belongs to a network cluster—administered by a master server—for the purpose of automatic adaptation and rebalancing of the system by using the remaining redundant nodes on the cluster to take over the load of failed nodes for providing constant services.[3][1] Usually a heartbeat is sent between machines at a regular interval in the order of seconds; a heartbeat message.[4] If the endpoint does not receive a heartbeat for a time—usually a few heartbeat intervals—the machine that should have sent the heartbeat is assumed to have failed.[5] Heartbeat messages are typically sent non-stop on a periodic or recurring basis from the originator's start-up until the originator's shutdown. When the destination identifies a lack of heartbeat messages during an anticipated arrival period, the destination may determine that the originator has failed, shutdown, or is generally no longer available.

  1. ^ a b Hou & Huang 2003, p. 1.
  2. ^ "Definition of Heartbeat". pcmag.com Encyclopedia. Retrieved 7 October 2020.
  3. ^ Robertson 2000, p. 1.
  4. ^ US 4710926, Donald W. Brown, James W. Leth, James E. Vandendorpe, "Fault recovery in a distributed processing system", published 1987-12-01 
  5. ^ Kawazoe Aguilera, Marcos; Chen, Wei; Toueg, Sam (1997). "Heartbeat: A timeout-free failure detector for quiescent reliable communication" (PDF). Distributed Algorithms. Berlin, Heidelberg: Springer Berlin Heidelberg. pp. 126–140. doi:10.1007/bfb0030680. hdl:1813/7286. ISBN 978-3-540-63575-8. ISSN 0302-9743.