From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 4c4e3731564c8945ac5ac90fc2a1e1f21cb79c92 ]
Notable cmpxchg() does not provide ordering when it fails, however wake_q_add() requires ordering in this specific case too. Without this it would be possible for the concurrent wakeup to not observe our prior state.
Andrea Parri provided:
C wake_up_q-wake_q_add
{ int next = 0; int y = 0; }
P0(int *next, int *y) { int r0;
/* in wake_up_q() */
WRITE_ONCE(*next, 1); /* node->next = NULL */ smp_mb(); /* implied by wake_up_process() */ r0 = READ_ONCE(*y); }
P1(int *next, int *y) { int r1;
/* in wake_q_add() */
WRITE_ONCE(*y, 1); /* wake_cond = true */ smp_mb__before_atomic(); r1 = cmpxchg_relaxed(next, 1, 2); }
exists (0:r0=0 /\ 1:r1=0)
This "exists" clause cannot be satisfied according to the LKMM:
Test wake_up_q-wake_q_add Allowed States 3 0:r0=0; 1:r1=1; 0:r0=1; 1:r1=0; 0:r0=1; 1:r1=1; No Witnesses Positive: 0 Negative: 3 Condition exists (0:r0=0 /\ 1:r1=0) Observation wake_up_q-wake_q_add Never 0 3
Reported-by: Yongji Xie elohimes@gmail.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Davidlohr Bueso dave@stgolabs.net Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: Waiman Long longman@redhat.com Cc: Will Deacon will.deacon@arm.com Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/core.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 13ddfa46d741..152a0b0c91bb 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -405,10 +405,11 @@ void wake_q_add(struct wake_q_head *head, struct task_struct *task) * its already queued (either by us or someone else) and will get the * wakeup due to that. * - * This cmpxchg() executes a full barrier, which pairs with the full - * barrier executed by the wakeup in wake_up_q(). + * In order to ensure that a pending wakeup will observe our pending + * state, even in the failed case, an explicit smp_mb() must be used. */ - if (cmpxchg(&node->next, NULL, WAKE_Q_TAIL)) + smp_mb__before_atomic(); + if (cmpxchg_relaxed(&node->next, NULL, WAKE_Q_TAIL)) return;
get_task_struct(task);