On 4/9/18 8:58 AM, Bart Van Assche wrote:
On Mon, 2018-04-09 at 11:37 +0200, Christoph Hellwig wrote:
This looks sensible, but I'm worried about taking a whole spinlock for every request completion, including irq disabling. However it seems like your new updated pattern would fit use of cmpxchg() very nicely.
Hello Christoph,
Thanks for the review. I had a look at the spin lock implementation on x86 and apparently on x86 spin locks are implemented as qspinlocks (include/asm-generic/qspinlock.h). queued_spin_lock() already uses atomic_cmpxchg_acquire(). Are you sure that replacing the spin lock by cmpxchg() will yield a performance improvement?
It's definitely worth a shot, especially since this looks like a clear cut case for cmpxchg(). Additionally, it also kills the local irq disable.
Conversion should be trivial and we can do some perf testing around that.
Neither solution really makes me happy though, the prospect of either kind of synchronization cost isn't appealing at all. I'll have to ponder this a lot more, but it would be ideal if we could shift that cost to the extremely unlikely case of a timeout triggering rather than add the cost upfront to EVERY request.