3.16.61-rc1 review patch. If anyone has any objections, please let me know.
------------------
From: Kiran Kumar Modukuri kiran.modukuri@gmail.com
commit 5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3 upstream.
In cachefiles_mark_object_active(), the new object is marked active and then we try to add it to the active object tree. If a conflicting object is already present, we want to wait for that to go away. After the wait, we go round again and try to re-mark the object as being active - but it's already marked active from the first time we went through and a BUG is issued.
Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.
Analysis from Kiran Kumar Modukuri:
[Impact] Oops during heavy NFS + FSCache + Cachefiles
CacheFiles: Error: Overlong wait for old active object to go away.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000002
CacheFiles: Error: Object already active kernel BUG at fs/cachefiles/namei.c:163!
[Cause] In a heavily loaded system with big files being read and truncated, an fscache object for a cookie is being dropped and a new object being looked. The new object being looked for has to wait for the old object to go away before the new object is moved to active state.
[Fix] Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when retrying the object lookup.
[Testcase] Have run ~100 hours of NFS stress tests and have not seen this bug recur.
[Regression Potential] - Limited to fscache/cachefiles.
Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem") Signed-off-by: Kiran Kumar Modukuri kiran.modukuri@gmail.com Signed-off-by: David Howells dhowells@redhat.com [bwh: Backported to 3.16: adjust context] Signed-off-by: Ben Hutchings ben@decadent.org.uk --- fs/cachefiles/namei.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -189,6 +189,7 @@ try_again: /* an old object from a previous incarnation is hogging the slot - we * need to wait for it to be destroyed */ wait_for_old_object: + clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags); if (fscache_object_is_live(&object->fscache)) { pr_err("\n"); pr_err("Error: Unexpected object collision\n"); @@ -250,7 +251,6 @@ wait_for_old_object: goto try_again;
requeue: - clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags); cache->cache.ops->put_object(&xobject->fscache); _leave(" = -ETIMEDOUT"); return -ETIMEDOUT;