Thread coordination using Boost.Atomic

Enforcing happens-before through mutual exclusion
happens-before through release and acquire
Fences
happens-before through release and consume
Sequential consistency

The most common use of Boost.Atomic is to realize custom thread synchronization protocols: The goal is to coordinate accesses of threads to shared variables in order to avoid "conflicts". The programmer must be aware of the fact that compilers, CPUs and the cache hierarchies may generally reorder memory references at will. As a consequence a program such as:

int x = 0, int y = 0;

thread1:
  x = 1;
  y = 1;

thread2
  if (y == 1) {
    assert(x == 1);
  }

might indeed fail as there is no guarantee that the read of x by thread2 "sees" the write by thread1.

Boost.Atomic uses a synchronisation concept based on the happens-before relation to describe the guarantees under which situations such as the above one cannot occur.

The remainder of this section will discuss happens-before in a "hands-on" way instead of giving a fully formalized definition. The reader is encouraged to additionally have a look at the discussion of the correctness of a few of the examples afterwards.

Enforcing happens-before through mutual exclusion

As an introductory example to understand how arguing using happens-before works, consider two threads synchronizing using a common mutex:

mutex m;

thread1:
  m.lock();
  ... /* A */
  m.unlock();

thread2:
  m.lock();
  ... /* B */
  m.unlock();

The "lockset-based intuition" would be to argue that A and B cannot be executed concurrently as the code paths require a common lock to be held.

One can however also arrive at the same conclusion using happens-before: Either thread1 or thread2 will succeed first at m.lock(). If this is be thread1, then as a consequence, thread2 cannot succeed at m.lock() before thread1 has executed m.unlock(), consequently A happens-before B in this case. By symmetry, if thread2 succeeds at m.lock() first, we can conclude B happens-before A.

Since this already exhausts all options, we can conclude that either A happens-before B or B happens-before A must always hold. Obviously cannot state which of the two relationships holds, but either one is sufficient to conclude that A and B cannot conflict.

Compare the spinlock implementation to see how the mutual exclusion concept can be mapped to Boost.Atomic.

happens-before through `release` and `acquire`

The most basic pattern for coordinating threads via Boost.Atomic uses release and acquire on an atomic variable for coordination: If ...

... thread1 performs an operation A,
... thread1 subsequently writes (or atomically modifies) an atomic variable with release semantic,
... thread2 reads (or atomically reads-and-modifies) the value this value from the same atomic variable with acquire semantic and
... thread2 subsequently performs an operation B,

... then A happens-before B.

Consider the following example

atomic<int> a(0);

thread1:
  ... /* A */
  a.fetch_add(1, memory_order_release);

thread2:
  int tmp = a.load(memory_order_acquire);
  if (tmp == 1) {
    ... /* B */
  } else {
    ... /* C */
  }

In this example, two avenues for execution are possible:

The store operation by thread1 precedes the load by thread2: In this case thread2 will execute B and "A happens-before B" holds as all of the criteria above are satisfied.
The load operation by thread2 precedes the store by thread1: In this case, thread2 will execute C, but "A happens-before C" does not hold: thread2 does not read the value written by thread1 through a.

Therefore, A and B cannot conflict, but A and C can conflict.

Fences

Ordering constraints are generally specified together with an access to an atomic variable. It is however also possible to issue "fence" operations in isolation, in this case the fence operates in conjunction with preceding (for acquire, consume or seq_cst operations) or succeeding (for release or seq_cst) atomic operations.

The example from the previous section could also be written in the following way:

atomic<int> a(0);

thread1:
  ... /* A */
  atomic_thread_fence(memory_order_release);
  a.fetch_add(1, memory_order_relaxed);

thread2:
  int tmp = a.load(memory_order_relaxed);
  if (tmp == 1) {
    atomic_thread_fence(memory_order_acquire);
    ... /* B */
  } else {
    ... /* C */
  }

This provides the same ordering guarantees as previously, but elides a (possibly expensive) memory ordering operation in the case C is executed.

happens-before through `release` and `consume`

The second pattern for coordinating threads via Boost.Atomic uses release and consume on an atomic variable for coordination: If ...

... thread1 performs an operation A,
... thread1 subsequently writes (or atomically modifies) an atomic variable with release semantic,
... thread2 reads (or atomically reads-and-modifies) the value this value from the same atomic variable with consume semantic and
... thread2 subsequently performs an operation B that is computationally dependent on the value of the atomic variable,

... then A happens-before B.

Consider the following example

atomic<int> a(0);
complex_data_structure data[2];

thread1:
  data[1] = ...; /* A */
  a.store(1, memory_order_release);

thread2:
  int index = a.load(memory_order_consume);
  complex_data_structure tmp = data[index]; /* B */

In this example, two avenues for execution are possible:

The store operation by thread1 precedes the load by thread2: In this case thread2 will read data[1] and "A happens-before B" holds as all of the criteria above are satisfied.
The load operation by thread2 precedes the store by thread1: In this case thread2 will read data[0] and "A happens-before B" does not hold: thread2 does not read the value written by thread1 through a.

Here, the happens-before relationship helps ensure that any accesses (presumable writes) to data[1] by thread1 happen before before the accesses (presumably reads) to data[1] by thread2: Lacking this relationship, thread2 might see stale/inconsistent data.

Note that in this example, the fact that operation B is computationally dependent on the atomic variable, therefore the following program would be erroneous:

atomic<int> a(0);
complex_data_structure data[2];

thread1:
  data[1] = ...; /* A */
  a.store(1, memory_order_release);

thread2:
  int index = a.load(memory_order_consume);
  complex_data_structure tmp;
  if (index == 0)
    tmp = data[0];
  else
    tmp = data[1];

consume is most commonly (and most safely! see limitations) used with pointers, compare for example the singleton with double-checked locking.

Sequential consistency

The third pattern for coordinating threads via Boost.Atomic uses seq_cst for coordination: If ...

... thread1 performs an operation A,
... thread1 subsequently performs any operation with seq_cst,
... thread1 subsequently performs an operation B,
... thread2 performs an operation C,
... thread2 subsequently performs any operation with seq_cst,
... thread2 subsequently performs an operation D,

then either "A happens-before D" or "C happens-before B" holds.

In this case it does not matter whether thread1 and thread2 operate on the same or different atomic variables, or use a "stand-alone" atomic_thread_fence operation.