On Mon, Aug 15, 2022 at 10:36 PM Hector Martin marcan@marcan.st wrote:
These ops are documented in Documentation/atomic_bitops.txt as being unordered in the failure ("bit was already set" case), and that matches the generic implementation (which arm64 uses).
Yeah, that documentation is pure garbage. We need to fix it.
I think that "unordered on failure" was added at the same time that the generic implementation was rewritten.
IOW, the documentation was changed to match that broken implementation, but it's clearly completely broken.
I think I understand *why* it's broken - it looks like a "harmless" optimization. After all, if the bitop doesn't do anything, there's nothing to order it with.
It makes a certain amount of sense - as long as you don't think about it too hard.
The reason it is completely and utterly broken is that it's not actually just "the bitop doesn't do anything". Even when it doesn't change the bit value, just the ordering of the read of the old bit value can be meaningful, exactly for that case of "I added more work to the queue, I need to set the bit to tell the consumers, and if I'm the first person to set the bit I may need to wake the consumer up".
On the other hand, Twitter just pointed out that contradicting documentation exists (I believe this was the source of the third party doc I found that claimed it's always a barrier):
It's not just that other documentation exists - it's literally that the unordered semantics don't even make sense, and don't match reality and history.
And nobody thought about it or caught it at the time.
The Xen people seem to have noticed at some point, and tried to introduce a "sync_set_set()"
So either one doc and the implementation are wrong, or the other doc is wrong.
That doc and the generic implementation is clearly wrong.
Linus