(reposting in plain text, sorry for the previous HTML email, I should have not posted from the Phone)
On Thu, Feb 22, 2018 at 5:48 AM, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Fri, Feb 16, 2018 at 11:02:01AM -0800, Joel Fernandes wrote:
ashmem_mutex create a chain of dependencies like so:
(1) mmap syscall -> mmap_sem -> (acquired) ashmem_mmap ashmem_mutex (try to acquire) (block)
(2) llseek syscall -> ashmem_llseek -> ashmem_mutex -> (acquired) inode_lock -> inode->i_rwsem (try to acquire) (block)
(3) getdents -> iterate_dir -> inode_lock -> inode->i_rwsem (acquired) copy_to_user -> mmap_sem (try to acquire)
There is a lock ordering created between mmap_sem and inode->i_rwsem causing a lockdep splat [2] during a syzcaller test, this patch fixes the issue by unlocking the mutex earlier. Functionally that's Ok since we don't need to protect vfs_llseek.
[1] https://patchwork.kernel.org/patch/10185031/ [2] https://lkml.org/lkml/2018/1/10/48
Cc: Todd Kjos tkjos@google.com Cc: Arve Hjonnevag arve@android.com Cc: Greg Hackmann ghackmann@google.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org Reported-by: syzbot+8ec30bb7bf1a981a2012@syzkaller.appspotmail.com Signed-off-by: Joel Fernandes joelaf@google.com
Changes since first version: Don't relock after vfs call since its not needed. Only reason we lock is to protect races with asma->file. https://patchwork.kernel.org/patch/10185031/
I'd like some acks from others before I take this patch.
GregH, Todd, could you provide Acks?
Joel, did the original reporter say this patch solved their issue or not? For some reason I didn't think it worked properly...
That's a different but similar issue: https://patchwork.kernel.org/patch/10202127/. That was related to RECLAIM_FS lockdep recursion. That was posted as an RFC unlike this one, and its still being investigated since Miles reported that the lockdep inode annotation doesn't fix the issue.
This one on the other hand has a straightforward fix, and was verified by the syzbot.
I can see why its easy to confuse the two issues.
Thanks,
- Joel