This is a backport of commit 7aa54be297655 ("locking/qspinlock, x86: Provide liveness guarantee") for the v4.14 stable tree. For the v4.4 tree the ARCH_USE_QUEUED_SPINLOCKS option was disabled on x86; for v4.14 it was decided to do a minimal backport of the final fix, including all of its dependencies. It was not as small and simple as assumed. The series is mostly what I have for v4.9, except that I kept cmpxchg_acquire() since it is already part of v4.14. Below is the range-diff against my v4.9 backport.
$ git range-diff v4.9.144..v4.9-stable-qspinlock v4.14.89..v4.14-stable-qspinlock

 1:  b14324c48428c =  1:  f1689a618de73 locking: Remove smp_read_barrier_depends() from queued_spin_lock_slowpath()
 2:  b1caa34ac4a5d =  2:  ffb9ca819eb61 locking/qspinlock: Ensure node is initialised before updating prev->next
 3:  2c35bd12f90f3 =  3:  fb592cc9d2562 locking/qspinlock: Bound spinning on pending->locked transition in slowpath
 4:  06efd7410eb22 !  4:  a965972feec95 locking/qspinlock: Merge 'struct __qspinlock' into 'struct qspinlock'
    @@ -191,8 +191,8 @@
     -	struct __qspinlock *l = (void *)lock;
     -
      	if (!(atomic_read(&lock->val) & _Q_LOCKED_PENDING_MASK) &&
    --	    (cmpxchg(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
    -+	    (cmpxchg(&lock->locked, 0, _Q_LOCKED_VAL) == 0)) {
    +-	    (cmpxchg_acquire(&l->locked, 0, _Q_LOCKED_VAL) == 0)) {
    ++	    (cmpxchg_acquire(&lock->locked, 0, _Q_LOCKED_VAL) == 0)) {
      		qstat_inc(qstat_pv_lock_stealing, true);
      		return true;
      	}
    @@ -222,10 +222,10 @@
     -	struct __qspinlock *l = (void *)lock;
     -
     -	return !READ_ONCE(l->locked) &&
    --	       (cmpxchg(&l->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL)
    +-	       (cmpxchg_acquire(&l->locked_pending, _Q_PENDING_VAL,
     +	return !READ_ONCE(lock->locked) &&
    -+	       (cmpxchg(&lock->locked_pending, _Q_PENDING_VAL, _Q_LOCKED_VAL)
    -		== _Q_PENDING_VAL);
    ++	       (cmpxchg_acquire(&lock->locked_pending, _Q_PENDING_VAL,
    +				_Q_LOCKED_VAL) == _Q_PENDING_VAL);
      }
      #else /* _Q_PENDING_BITS == 8 */
     @@
 5:  1c1971db6f166 !  5:  f418494654adb locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath
    @@ -219,4 +219,4 @@
    -
      /*
       * The pending bit check in pv_queued_spin_steal_lock() isn't a memory
    -  * barrier. Therefore, an atomic cmpxchg() is used to acquire the lock
    +  * barrier. Therefore, an atomic cmpxchg_acquire() is used to acquire the
 6:  a7b330da1b7e1 =  6:  b4ea20c1230ef locking/qspinlock: Remove duplicate clear_pending() function from PV code
 7:  be0280dd572b3 =  7:  5314d2c19c23b locking/qspinlock: Kill cmpxchg() loop when claiming lock from head of queue
 8:  8cfd5c4bd4919 =  8:  326422ea5c879 locking/qspinlock: Re-order code
 9:  b6a0b3ebcec0c =  9:  5fc0f95adbc41 locking/qspinlock/x86: Increase _Q_PENDING_LOOPS upper bound
10:  4fc477c008fc4 = 10:  b961404d8eee6 locking/qspinlock, x86: Provide liveness guarantee
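For reference, the only real difference the range-diff shows is that the v4.14 hunks keep cmpxchg_acquire() where the v4.9 ones use plain cmpxchg(). The user-space C11 sketch below is only an illustration of the pattern those hunks rely on, not kernel code: the names toy_qspinlock, steal_lock and Q_LOCKED_VAL are hypothetical, and the real union layout of struct qspinlock is left out. It shows the shape of the lock-stealing path: check the locked and pending bytes, then claim the lock with an acquire-ordered compare-and-exchange on the byte-sized locked field, the analogue of cmpxchg_acquire(&lock->locked, 0, _Q_LOCKED_VAL).

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define Q_LOCKED_VAL	1U	/* hypothetical stand-in for _Q_LOCKED_VAL */

/*
 * Hypothetical, simplified lock layout.  The real struct qspinlock overlays
 * the locked/pending bytes on a single 32-bit atomic word; that detail is
 * omitted here to keep the sketch plain standard C11.
 */
struct toy_qspinlock {
	_Atomic uint8_t locked;		/* set while an owner holds the lock */
	_Atomic uint8_t pending;	/* set by the first spinning waiter */
};

/*
 * Roughly mirrors the shape of the pv_queued_spin_steal_lock() hunk in the
 * range-diff: only try to steal when neither locked nor pending is set, then
 * claim the lock with an acquire-ordered compare-and-exchange on the locked
 * byte alone.
 */
static bool steal_lock(struct toy_qspinlock *lock)
{
	uint8_t expected = 0;

	if (atomic_load_explicit(&lock->locked, memory_order_relaxed) ||
	    atomic_load_explicit(&lock->pending, memory_order_relaxed))
		return false;

	/* acquire on success, relaxed on failure, like cmpxchg_acquire() */
	return atomic_compare_exchange_strong_explicit(&lock->locked,
						       &expected, Q_LOCKED_VAL,
						       memory_order_acquire,
						       memory_order_relaxed);
}

int main(void)
{
	static struct toy_qspinlock lock;	/* zero-initialised: unlocked */

	printf("first steal:  %d\n", steal_lock(&lock));	/* 1: lock was free */
	printf("second steal: %d\n", steal_lock(&lock));	/* 0: already held */
	return 0;
}

Acquire ordering on the successful exchange is enough here because the preceding pending/locked check is not relied upon as a barrier, which is what the comment touched by patch 5 of the series points out.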
Sebastian