Test and test-and-set

In computer architecture, the test-and-set CPU instruction (or instruction sequence) is designed to implement mutual exclusion in multiprocessor environments. Although a correct lock can be implemented with test-and-set, the test and test-and-set optimization lowers resource contention caused by bus locking, especially cache coherency protocol overhead on contended locks.

Given a lock:

boolean locked := false // shared lock variable

the entry protocol is:

procedure EnterCritical() {
  do {
    while ( locked == true )
         skip // spin using normal instructions until the lock is free
  } while ( TestAndSet(locked) == true ) // attempt actual atomic locking using the test-and-set instruction
}

and the exit protocol is:

procedure ExitCritical() {
  locked := false
}

The difference to the simple test-and-set protocol is the additional spin-loop (the test in test and test-and-set) at the start of the entry protocol, which utilizes ordinary load instructions. The load in this loop executes with less overhead compared to an atomic operation (resp. a load-exclusive instruction). E.g., on a system utilizing the MESI cache coherency protocol, the cache line being loaded is moved to the Shared state, whereas a test-and-set instruction or a load-exclusive instruction moves it into the Exclusive state.

This is particularly advantageous if multiple processors are contending for the same lock: whereas an atomic instruction or load-exclusive instruction requires a coherency-protocol transaction to give that processor exclusive access to the cache line (causing that line to ping-pong between the involved processors), ordinary loads on a line in Shared state require no protocol transactions at all: processors spinning in the inner loop operate purely locally.

Cache-coherency protocol transactions are used only in the outer loop, after the initial check has ascertained that they have a reasonable likelihood of success.

If the programming language used supports short-circuit evaluation, the entry protocol could be implemented as:

 procedure EnterCritical() {
   while ( locked == true or TestAndSet(locked) == true )
     skip // spin until locked
 }