On Fri, Sep 6, 2019 at 6:43 AM Eric Biggers ebiggers@kernel.org wrote:
On Wed, Sep 04, 2019 at 04:29:03PM +0200, Miklos Szeredi wrote:
TBH, I find the fix disgusting. It's confusing to sprinke code that has absolutely nothing to do with interrupts with spin_lock_irq() calls.
I think the lock/unlock calls should at least be done with a helper with a comment explaining why disabling interrupts is needed (though I have not managed to understand why aio needs to actually mess with the waitq lock...)
The aio code is doing a poll(), so it needs to use the wait queue.
Doesn't explain why the irq disabled nested locking is needed in aio_poll(). poll/select manage to do that without messing with waitq internals. How is aio poll different?
Probably a better fix would be to just use a separate spinlock to avoid the need to disable interrupts in cases where it's not necessary.
Well, the below is what a separate lock would look like. Note that it actually still disables IRQs in some places; it's just hidden away in the nested spin_lock_irqsave() in wake_up(). Likewise, adding something to the fuse_iqueue then requires taking 2 spin locks -- one explicit, and one hidden in wake_up().
Right, that's exactly why the waitq lock was used.
Is this the solution you'd prefer?
I'd actually prefer if aio was fixed. But I guess that's not realistic, so yes, the below patch looks okay. If fiq->lock is in the same cacheline as fiq->waitq then it shouldn't make a difference.
Thanks, Miklos