User:Dingo1729/History of commitment ordering

Commitment ordering (or Commit ordering, CO; Raz 1990a, 1990b), publicly introduced in May 1991, has solved the years old open fundamental Global serializability problem (e.g., Silberschatz et al. 1991, page 120; see second quotation) through a generalization of the de-facto standard for Global serializability, SS2PL, and provided insight into it. Though published in prestigious academic journal (Raz 1994 - IPL) and refereed conferences ((Raz 1992 - VLDB), (Raz 1993a - ACM PODS), (Raz 1993b - IEEE RIDE-IMS)), as well as in three patents (Raz 1991a), the CO solution has been largely misunderstood and misrepresented (e.g., Weikum and Vossen 2001, pages 102, 700, and Breitbart and Silberschatz 1992), and to a great extent ignored in relevant database research texts (e.g., the text-books Liu and Özsu 2009, and Silberschatz et al. 2010), until recently: In Bernstein and Newcomer 2009 it is related to as the fourth major concurrency control method in addition to the earlier known three major methods (see last quotation). On the other hand, in some research articles the CO solution has been utilized without using the name CO, without referencing the CO work, and without explaining properly how and why CO achieves Global serializability (e.g., Haller et al. 2005, Voicu and Schuldt 2009b).

In concurrency control of databases and transaction processing (transaction management), CO is a class of interoperable Serializability techniques, both centralized and distributed. It is also the name of the resulting transaction schedule property. In a CO compliant schedule the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO provides an effective, general solution to the Global serializability problem, i.e., achieving serializability in a distributed environment of multiple autonomous database systems and other transactional objects, that possibly utilize a different concurrency control mechanism each (e.g., as in Grid computing and Cloud computing). CO also provides an effective solution for Distributed serializability in general. The concept of CO has evolved in three threads of development, seemingly initially independent:

  1. Dynamic atomicity (DA) at the MIT (Weihl 1984); its definition was changed to be equivalent to CO in (Fekete et al. 1988; prior to CO).
  2. Analogous execution and serialization order (AESO) at the University of Houston (Georgakopoulus and Rusinkiewics 1989).
  3. Commitment ordering (CO) at Digital Equipment Corporation (Raz 1990a, 1990b).

Similarity between the initial concepts above and their final merging in equivalent or identical definitions caused researchers in the past to refer to them as "the same" (e.g., Weikum and Vossen 2001, page 720). However essential differences exist between their respective final research results and time-lines:

  1. The originally defined DA is close to CO, but is strictly contained in CO (which makes an essential difference in implementability; unlike CO, no general algorithm for DA is known). Only in (Fekete et al. 1988; precedes CO) DA is defined to be equivalent to CO (in a model that supports sub-transactions), but without a full-fledged generic local and distributed algorithms.
  2. The definition of the AESO schedule property evolved to the definition of Strong recoverability (STRC) in (Breitbart et al. 1991). The definition of STRC is identical to the CO definition. However, this should not be confused with the CO algorithmic techniques and theoretical results: No such techniques and results have been proposed for STRC, in particular the key result that local CO implies Global serializability with no concurrency control information distribution (preserving local autonomy: no need in external information locally). (Breitbart et al. 1991) appeared in September 1991. An already accepted for publication version of the article, circulated before its publication, does not have a word about Strong recoverability, and does not include the Abstract part which describes it and uses the term "commitment order".
  3. Atomic commitment protocol (ACP) and voting deadlocks (or an equivalent mechanism), which are crucial for the CO solution and CO's efficient distribution, are utilized neither for Dynamic atomicity (DA, which has a partial replacement though) nor for Strong recoverability (STRC).

Three patents filed for the CO techniques in 1991-3 were granted in 1996-7 (Raz 1991a). They have been extensively referenced in other patents since.

All three development threads have converged at definitions of schedule properties identical or equivalent to CO, and noticed that Strong strict two-phase locking (SS2PL or Rigorousness) possesses their respective properties. The DA work has provided additional examples of algorithms that generate DA compliant schedules, as well as implying that local DA (the original DA) implies global serializability while using only local concurrency control information. STRC is shown to imply Serializability but no proof is given that local STRC implies Global serializability with only local concurrency control information. General algorithms are given neither for DA nor for STRC. Only the CO articles have provided (patented) general algorithms for CO and methods to turn any concurrency control mechanism into a CO compliant one, for achieving global serializability across autonomous transactional objects (i.e., using only local concurrency control information; e.g., autonomous databases) with possibly different concurrency controls. The CO articles have also provided generalizations of CO that guarantee Global serializability with more concurrency and better performance by using additional local information (ECO in Raz 1993a, and MVCO in Raz 1993b).

A unique and novel element in the CO techniques and patents, besides ordering commit events, is the utilization of the atomic commitment protocol voting mechanism to break global cycles (also referred to as distributed cycles; span two or more transactional objects) in the conflict graph, for guaranteeing global serializability. It is achieved by applying a specific vote ordering strategy: voting in local precedence order. Also locking based global deadlocks are resolved automatically by the same mechanism as a side benefit. This allows effective implementations of distributed CO (and resulting distributed serializability), while allowing any, uninterrupted transaction operations scheduling without any conflict information distribution (e.g., local precedence relations, locks, timestamps, tickets). Furthermore, CO does not use any additional, artificial transaction access operations (e.g., "take timestamp" or "take ticket"), which typically result in additional, artificial conflicts that reduce concurrency.

The CO solution has been utilized extensively since 1997 in numerous works within the subject of Transactional processes (e.g., Schuldt et al. 2002, where CO is referenced). Some of them include descriptions of CO utilization in commercial distributed software products. Recently CO has been utilized as the basis for the concurrency control of Re:GRIDiT (e.g., Voicu et al. 2009a, Voicu and Schuldt 2009b), a proposed approach to transaction management in Grid computing and Cloud computing. The latter two articles and all other related articles by the same authors neither mention the name CO nor reference CO's articles.

CO has been utilized in a work on fault tolerance of transactional systems with replication (Vandiver et al. 2007).

With the proliferation of Multi-core processors CO also has been increasingly utilized in Concurrent programming (Parallel programming) and Transactional memory (and especially in Software transactional memory, STM) for achieving serializability optimistically by "commit order" (e.g., Ramadan et al. 2009, Zhang et al. 2006, von Parun et al. 2007).