The recent patch to fix the afs_server struct leak didn't actually fix the bug, but rather fixed some of the symptoms. The problem is that an asynchronous call that holds a resource pointed to by call->reply[0] will find the pointer cleared in the call destructor, thereby preventing the resource from being cleaned up.
In the case of the server record leak, the afs_fs_get_capabilities() function in devel code sets up a call with reply[0] pointing at the server record that should be altered when the result is obtained, but this was being cleared before the destructor was called, so the put in the destructor does nothing and the record is leaked.
Commit f014ffb025c1 removed the additional ref obtained by afs_install_server(), but the removal of this ref is actually used by the garbage collector to mark a server record as being defunct after the record has expired through lack of use.
The offending clearance of call->reply[0] upon completion in afs_process_async_call() has been there from the origin of the code, but none of the asynchronous calls actually use that pointer currently, so it should be safe to remove (note that synchronous calls don't involve this function).
Fix this by the following means:
(1) Revert commit f014ffb025c1.
(2) Remove the clearance of reply[0] from afs_process_async_call().
Without this, afs_manage_servers() will suffer an assertion failure if it sees a server record that didn't get used because the usage count is not 1.
Fixes: f014ffb025c1 ("afs: Fix afs_server struct leak") Fixes: 08e0e7c82eea ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.") Signed-off-by: David Howells dhowells@redhat.com ---
fs/afs/rxrpc.c | 2 -- fs/afs/server.c | 2 -- 2 files changed, 4 deletions(-)
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index 748b37b130a2..9bbb8af000b4 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -620,8 +620,6 @@ static void afs_process_async_call(struct work_struct *work) }
if (call->state == AFS_CALL_COMPLETE) { - call->reply[0] = NULL; - /* We have two refs to release - one from the alloc and one * queued with the work item - and we can't just deallocate the * call because the work item may be queued again. diff --git a/fs/afs/server.c b/fs/afs/server.c index d5ef05e24e18..1a087eb8f2d7 100644 --- a/fs/afs/server.c +++ b/fs/afs/server.c @@ -199,11 +199,9 @@ static struct afs_server *afs_install_server(struct afs_net *net,
write_sequnlock(&net->fs_addr_lock); ret = 0; - goto out;
exists: afs_get_server(server); -out: write_sequnlock(&net->fs_lock); return server; }
On Mon, Oct 15, 2018 at 12:43:02PM +0100, David Howells wrote:
The recent patch to fix the afs_server struct leak didn't actually fix the bug, but rather fixed some of the symptoms. The problem is that an asynchronous call that holds a resource pointed to by call->reply[0] will find the pointer cleared in the call destructor, thereby preventing the resource from being cleaned up.
In the case of the server record leak, the afs_fs_get_capabilities() function in devel code sets up a call with reply[0] pointing at the server record that should be altered when the result is obtained, but this was being cleared before the destructor was called, so the put in the destructor does nothing and the record is leaked.
Commit f014ffb025c1 removed the additional ref obtained by afs_install_server(), but the removal of this ref is actually used by the garbage collector to mark a server record as being defunct after the record has expired through lack of use.
The offending clearance of call->reply[0] upon completion in afs_process_async_call() has been there from the origin of the code, but none of the asynchronous calls actually use that pointer currently, so it should be safe to remove (note that synchronous calls don't involve this function).
Fix this by the following means:
(1) Revert commit f014ffb025c1.
(2) Remove the clearance of reply[0] from afs_process_async_call().
Without this, afs_manage_servers() will suffer an assertion failure if it sees a server record that didn't get used because the usage count is not 1.
Fixes: f014ffb025c1 ("afs: Fix afs_server struct leak") Fixes: 08e0e7c82eea ("[AF_RXRPC]: Make the in-kernel AFS filesystem use AF_RXRPC.") Signed-off-by: David Howells dhowells@redhat.com
Now applied, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org