On Fri, 02 Feb 2024, Chuck Lever wrote:
Passes pynfs, fstests, and the git regression suite. Please apply these to origin/linux-5.4.y.
I should have mentioned this a day or two ago but I hadn't quite made all the connections yet...
The RELEASE_LOCKOWNER bug was masking a double-free bug that was fixed by commit 47446d74f170 ("nfsd4: add refcount for nfsd4_blocked_lock"), which landed in v5.17 but wasn't marked as a bugfix and so has not gone to stable kernels.
Any kernel earlier than v5.17 that receives the RELEASE_LOCKOWNER fix also needs the nfsd4_blocked_lock fix. There is a minor follow-up fix for that nfsd4_blocked_lock fix which Chuck queued yesterday.
The problem scenario is that an nfsd4_lock() call finds a conflicting lock and so holds a reference to a particular nfsd4_blocked_lock. A concurrent nfsd4_release_lockowner() call frees all the nfsd4_blocked_locks, including the one held in nfsd4_lock(). nfsd4_lock() then tries to free the blocked_lock it has, which results in a double-free or a use-after-free.
Before either patch is applied, the extra reference on the lock-owner that nfsd4_lock() holds causes nfsd4_release_lockowner() to incorrectly return an error and NOT free the blocked_lock. With only the RELEASE_LOCKOWNER fix applied, the double-free happens. With both patches applied, the refcount on the nfsd4_blocked_lock prevents the double-free.
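To make the interleaving concrete, here is a minimal user-space sketch of why a reference count on the blocked lock avoids the double-free. It is not the kernel code: struct blocked_lock, bl_alloc(), bl_get(), bl_put() and simulate_race() are hypothetical names invented for this illustration, and a plain int stands in for whatever refcounting primitive the real patch uses. The two paths are replayed sequentially rather than run concurrently.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct nfsd4_blocked_lock (illustration only). */
struct blocked_lock {
	int refcount;	/* plain int: this sketch is single-threaded */
};

static int frees;	/* counts how many times the object was freed */

static struct blocked_lock *bl_alloc(void)
{
	struct blocked_lock *bl = malloc(sizeof(*bl));

	if (bl)
		bl->refcount = 1;	/* reference owned by the lock-owner's list */
	return bl;
}

static void bl_get(struct blocked_lock *bl)
{
	bl->refcount++;
}

/* Drop one reference; free only when the last holder lets go. */
static void bl_put(struct blocked_lock *bl)
{
	if (--bl->refcount == 0) {
		free(bl);
		frees++;
	}
}

/* Replay the problem interleaving; returns the number of frees (-1 on OOM). */
static int simulate_race(void)
{
	struct blocked_lock *bl = bl_alloc();

	if (!bl)
		return -1;
	frees = 0;
	bl_get(bl);	/* LOCK path: found a conflict, takes its own reference */
	bl_put(bl);	/* RELEASE_LOCKOWNER path: drops the list's reference */
	bl_put(bl);	/* LOCK path: drops its reference, the only real free */
	return frees;
}
```

Without the bl_get() in the LOCK path, the first bl_put() would free the object and the second would touch freed memory, which is the failure seen when only the RELEASE_LOCKOWNER fix is applied.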
Kernels before 4.9 are (probably) not affected as they didn't have find_or_allocate_block() which takes the second reference to a shared object. But that is ancient history - those kernels are well past EOL.
Thanks, NeilBrown
Chuck Lever (2):
  NFSD: Modernize nfsd4_release_lockowner()
  NFSD: Add documenting comment for nfsd4_release_lockowner()

NeilBrown (1):
  nfsd: fix RELEASE_LOCKOWNER

 fs/nfsd/nfs4state.c | 65 +++++++++++++++++++++++++--------------------
 1 file changed, 36 insertions(+), 29 deletions(-)
-- Chuck Lever