On Thu, Mar 02, 2023 at 02:28:00PM -0800, Nathan Huckleberry wrote:
Hey Eric,
On Sun, Feb 26, 2023 at 12:43 PM Eric Biggers ebiggers@kernel.org wrote:
From: Eric Biggers ebiggers@google.com
Once all I/O using a blk_crypto_key has completed, filesystems can call blk_crypto_evict_key(). However, the block layer doesn't call blk_crypto_put_keyslot() until the request is being cleaned up, which happens after upper layers have been told (via bio_endio()) the I/O has completed. This causes a race condition where blk_crypto_evict_key() can see 'slot_refs > 0' without there being an actual bug.
This makes __blk_crypto_evict_key() hit the 'WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)' and return without doing anything, eventually causing a use-after-free in blk_crypto_reprogram_all_keys(). (This is a very rare bug and has only been seen when per-file keys are being used with fscrypt.)
There are two options to fix this: either release the keyslot in blk_update_request() just before bio_endio() is called on the request's last bio, or just make __blk_crypto_evict_key() ignore slot_refs. Let's go with the latter solution for now, since it avoids adding overhead to the loop in blk_update_request(). (It does have the disadvantage that hypothetical bugs where a key is evicted while still in-use become harder to detect. But so far there haven't been any such bugs anyway.)
I disagree with the proposal to ignore the race condition in blk_crypto_evict_key(). As you said, ignoring the error could lead to undetected bugs in the future. Instead, I think we should focus on fixing the function ordering so that blk_crypto_put_keyslot() is called before blk_crypto_evict_key().
I think the overhead is a necessary trade-off to ensure correctness.
Thanks,
Sure, I'm concerned about pushback on adding something to blk_update_request() that's not critical, but I'll send out that patch for consideration too.
- Eric