On Tue, Apr 7, 2020 at 11:04 AM Chris Wilson <chris@chris-wilson.co.uk> wrote:
> That submission can run concurrently to the list iteration, but only _after_ the final list_del.
There is no such thing, Chris.
Not without locks and memory ordering.
The "list_del()" is already ordered on the CPU where it happens.
And _there_ it's already ordered with the list_for_each_entry_safe() by the compiler.
> > There may be something really subtle going on, but it really smells like "two threads are modifying the same list at the same time".
>
> In strict succession.
See above.
There is no such thing as strict succession across two CPUs unless you have memory barriers, locks, or things like release/acquire operations.
So a "strict succession from list_del()" only makes sense on the _local_ CPU. Not across threads.
You may be relying on some very subtle consistency guarantee that is true on x86. For example, x86 does guarantee "causality".
Not everybody else does that.
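
As an illustration only (userspace C11 with pthreads, not the kernel code under discussion), here is a minimal sketch of what an explicit release/acquire pairing looks like. All names (payload, ready, producer, consumer, run_handoff) are hypothetical. The point is that the consumer may rely on seeing the payload only because the store to the flag is a release and the load of the flag is an acquire; with plain accesses there would be no cross-thread ordering to rely on.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; a userspace sketch, not the driver code. */
static int payload;        /* plain data, written before publication */
static atomic_int ready;   /* publication flag */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release: the write to payload is ordered before this store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *seen = arg;

    /* Acquire: once we observe ready == 1, the payload write is visible. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;
    *seen = payload;
    return NULL;
}

/* Runs one handoff and returns what the consumer saw, or -1 on error. */
int run_handoff(void)
{
    pthread_t p, c;
    int seen = -1;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&p, NULL, producer, NULL))
        return -1;
    if (pthread_create(&c, NULL, consumer, &seen)) {
        pthread_join(p, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

Calling run_handoff() from a main() built with -pthread exercises the pairing. In kernel code the rough equivalents would be smp_store_release() and smp_load_acquire(), or simply taking a lock around both sides.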
> There's some more shutting up required for KCSAN to bring the noise down to usable levels which I hope has been done so I don't have to argue for it, such as
Stop using KCSAN if you can't deal with the problems it introduces.
I do NOT want to see bogus patches in the kernel that are introduced by bad infrastructure.
That READ_ONCE() is _wrong_.
File a bug with the KCSAN people, don't send garbage upstream.
Linus