Is A Concurrent Write And Read To A Non-atomic Variable Of Fundamental Type Without Using It Undefined Behavior?

In a lock-free queue.pop(), I read a trivially copyable variable (of integral type) inside a loop, after synchronizing with an atomic acquire load. Minimized pseudo code:

// somewhere else: writePosition.store(..., release)

bool pop(size_t & returnValue){
  writePos = writePosition.load(acquire)
  oldReadPosition = readPosition.load(relaxed)
  size_t value{};
  do{
    // plain, non-atomic read; a writer may be overwriting this slot right now
    value = data[oldReadPosition]
    newReadPosition = oldReadPosition+1
    // on failure, compare_exchange reloads oldReadPosition and we retry
  }while(!readPosition.compare_exchange(oldReadPosition, newReadPosition, relaxed))
  // here we are owner of the value
  returnValue = value;
  return true;
}

The memory of data[oldReadPosition] can only be changed if this value has already been read by another thread.

Read and write positions are ABA safe. The simple copy value = data[oldReadPosition] does not modify the memory of data[oldReadPosition].

But a writer thread in queue.push(...) can change data[oldReadPosition] while it is being read, iff another thread has already read that position and advanced readPosition.

It would be a race condition if the value were used, but is it also a race condition, and thus undefined behavior, when we leave the value untouched (the compare_exchange fails and the copy is discarded)? The standard is not specific enough, or I don't understand it. IMO this should be allowed, because it has no effect. I would be very happy to get a qualified answer and gain deeper insight.

thanks a lot



Yes, it's UB in ISO C++; value = data[oldReadPosition] in the C++ abstract machine involves reading the value of that object, whether or not the result is used later. (Usually that means lvalue-to-rvalue conversion, IIRC.)

But it's mostly harmless in practice, and probably only going to be a problem on machines with hardware race detection (not normal mainstream CPUs) or on C++ implementations that check for races, like clang with ThreadSanitizer.
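If you want to stay inside ISO C++ rather than rely on the race being harmless, one option is to make the slots themselves atomic and read them with a relaxed load. The following is only a sketch under assumptions that are not in the question: the names kCapacity and push, the ring-buffer indexing, the single-producer restriction, and the empty/full checks are all made up for illustration. It also uses release (not relaxed) on a successful compare-exchange, so that the slot load is ordered before a producer may overwrite the slot.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical minimal queue state; names loosely mirror the question's pseudo code.
constexpr std::size_t kCapacity = 8;
std::atomic<std::size_t> data[kCapacity];      // slots are atomic, unlike in the question
std::atomic<std::size_t> readPosition{0};
std::atomic<std::size_t> writePosition{0};

// Single-producer push, assumed here only so the sketch is complete.
bool push(std::size_t v) {
    std::size_t w = writePosition.load(std::memory_order_relaxed);
    // Acquire pairs with the release CAS in pop(): a slot is reused only after its reader is done.
    if (w - readPosition.load(std::memory_order_acquire) == kCapacity) return false; // full
    data[w % kCapacity].store(v, std::memory_order_relaxed);
    writePosition.store(w + 1, std::memory_order_release);   // publish the slot
    return true;
}

bool pop(std::size_t& returnValue) {
    std::size_t oldRead = readPosition.load(std::memory_order_relaxed);
    std::size_t value;
    do {
        if (oldRead == writePosition.load(std::memory_order_acquire)) return false; // empty
        // Relaxed atomic load: the result may be discarded if the CAS fails,
        // but a concurrent store to the slot is no longer a data race.
        value = data[oldRead % kCapacity].load(std::memory_order_relaxed);
    } while (!readPosition.compare_exchange_weak(oldRead, oldRead + 1,
                                                 std::memory_order_release,
                                                 std::memory_order_relaxed));
    returnValue = value;   // the CAS succeeded, so this value belongs to us
    return true;
}
```

For an integral type that is lock-free as std::atomic, the relaxed load typically compiles to the same plain load instruction on mainstream CPUs, so this usually costs nothing at run time.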

Another use-case for a non-atomic read followed by a check for possible tearing is the SeqLock, where readers can prove no tearing occurred by reading the same value from an atomic counter before and after the non-atomic read. It's UB in C++, even with volatile for the non-atomic data, although volatile may help make sure the compiler-generated asm is safe. (With memory barriers and the way existing compilers currently handle atomics, even non-volatile data produces working asm.) See Optimal way to pass a few variables between 2 threads pinning different CPUs

atomic_thread_fence is still necessary for a SeqLock to be safe, and some of the necessary ordering of atomic loads wrt. non-atomic may be an implementation detail if it can't sync with something and create a happens-before.
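To show where the fences go, here is a rough single-writer SeqLock sketch. It is not the non-atomic variant described above: the payload fields are relaxed std::atomic objects so that the example stays within ISO C++, and the struct name, the two-word payload, and the single-writer restriction are assumptions made for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical single-writer SeqLock guarding a two-word payload.
struct SeqLock {
    std::atomic<std::uint32_t> seq{0};      // even: stable, odd: write in progress
    std::atomic<std::uint64_t> a{0}, b{0};  // payload; relaxed atomics avoid the data-race UB

    void write(std::uint64_t x, std::uint64_t y) {   // must be called by one writer only
        std::uint32_t s = seq.load(std::memory_order_relaxed);
        seq.store(s + 1, std::memory_order_relaxed);          // mark as being written (odd)
        std::atomic_thread_fence(std::memory_order_release);  // odd mark is ordered before the payload stores
        a.store(x, std::memory_order_relaxed);
        b.store(y, std::memory_order_relaxed);
        seq.store(s + 2, std::memory_order_release);          // publish, back to even
    }

    void read(std::uint64_t& x, std::uint64_t& y) const {
        std::uint32_t s0, s1;
        do {
            s0 = seq.load(std::memory_order_acquire);
            x = a.load(std::memory_order_relaxed);   // may be torn across a and b
            y = b.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire); // payload loads are ordered before the re-check
            s1 = seq.load(std::memory_order_relaxed);
        } while (s0 != s1 || (s0 & 1));   // retry if a write overlapped the read
    }
};
```

The reader discards any copy taken while the counter was odd or changed, which is the same "read, then check whether the read was valid" idea as the pop() loop in the question.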

People do use SeqLocks in real life, depending on the fact that real-life compilers de-facto define a bit more behaviour than ISO C++. Another way to put it is that they happen to work for now; if you're careful about what code you put around the non-atomic read, it's unlikely that a compiler will be able to do anything problematic.

But you're definitely venturing out past the safe area of guaranteed behaviour, and probably need to understand how C++ compiles to asm, and how asm works on the target platforms you care about; see also Who's afraid of a big bad optimizing compiler? on LWN; it's aimed at Linux kernel code, which is the main user of hand-rolled atomics and stuff like that.