The patch titled
Subject: mm/gup.c: fix invalid page pointer returned with FOLL_PIN gups
has been removed from the -mm tree. Its filename was
mm-fix-invalid-page-pointer-returned-with-foll_pin-gups.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/gup.c: fix invalid page pointer returned with FOLL_PIN gups
Alex reported invalid page pointer returned with pin_user_pages_remote()
from vfio after upstream commit 4b6c33b32296 ("vfio/type1: Prepare for
batched pinning with struct vfio_batch"). This problem breaks NVIDIA vfio
mdev.
It turns out that the vfio commit is not at fault; however, once vfio
switched to a full page buffer to store the page pointers, the problem
became much easier to hit.
The problem is that for VM_PFNMAP vmas we should normally fail with
-EFAULT, after which vfio carries on to handle the MMIO regions. However,
when the bug triggered, follow_page_mask() returned -EEXIST for such a
page, which makes __get_user_pages() skip the current page and leave that
entry in **pages untouched. The caller is not aware of this, so it
dereferences the entry as usual even though the pointer can hold anything.
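The skipped-entry behaviour described above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not the
kernel code: struct fake_page, lookup(), fill_pages() and the flag values
are all hypothetical stand-ins, and the "buggy" switch mimics the old check
that tested FOLL_GET alone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only pointer identity matters. */
struct fake_page { int id; };

/* Toy flag values for this sketch, not the kernel's definitions. */
enum { FOLL_GET = 1 << 0, FOLL_PIN = 1 << 1 };

/*
 * Model of the per-address lookup. For a PFN mapping there is no page to
 * take a reference on: return -EFAULT if the caller needs a reference,
 * -EEXIST otherwise. "buggy" mimics the old test of FOLL_GET alone.
 */
static int lookup(int is_pfnmap, unsigned int flags, int buggy,
                  struct fake_page *backing, struct fake_page **out)
{
    if (is_pfnmap) {
        unsigned int need_ref = buggy ? FOLL_GET : (FOLL_GET | FOLL_PIN);

        return (flags & need_ref) ? -EFAULT : -EEXIST;
    }
    *out = backing;
    return 0;
}

/*
 * Toy version of the fill loop. On -EEXIST the slot is skipped and keeps
 * whatever value it held before. On any other error the loop stops and
 * reports how many entries were processed (or the error if none were).
 */
static long fill_pages(const int *is_pfnmap, size_t n, unsigned int flags,
                       int buggy, struct fake_page *backing,
                       struct fake_page **pages)
{
    size_t i;

    assert(pages != NULL);
    for (i = 0; i < n; i++) {
        struct fake_page *page = NULL;
        int ret = lookup(is_pfnmap[i], flags, buggy, &backing[i], &page);

        if (ret == -EEXIST)
            continue;               /* slot i is left untouched */
        if (ret < 0)
            return i ? (long)i : ret;
        pages[i] = page;
    }
    return (long)n;
}
```

With a FOLL_PIN request over three addresses where the middle one is a PFN
mapping, the buggy variant reports all three entries as processed while the
middle slot still holds its stale value; the fixed variant stops after the
first entry, so the caller never looks at the unwritten slot.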
We have had that -EEXIST logic since commit 1027e4436b6a ("mm: make GUP
handle pfn mapping unless FOLL_GET is requested"), which seems very
reasonable. When GUP was reworked with FOLL_PIN in commit 3faa52c03f44
("mm/gup: track FOLL_PIN pages"), that special path was likely overlooked,
even though the commit correctly updated follow_devmap_pud() to check
FOLL_PIN when deciding whether to return -EEXIST.
While at it, add a WARN_ON_ONCE() at the -EEXIST handling to make sure
**pages is not set when we get there; otherwise the caller will read
garbage right after __get_user_pages() returns.
Attaching the Fixes to the FOLL_PIN rework commit, as it happened later
than 1027e4436b6a.
Link: https://lkml.kernel.org/r/20220125033700.69705-1-peterx@redhat.com
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
Reported-by: Alex Williamson <alex.williamson(a)redhat.com>
Debugged-by: Alex Williamson <alex.williamson(a)redhat.com>
Tested-by: Alex Williamson <alex.williamson(a)redhat.com>
Reviewed-by: John Hubbard <jhubbard(a)nvidia.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/mm/gup.c~mm-fix-invalid-page-pointer-returned-with-foll_pin-gups
+++ a/mm/gup.c
@@ -440,7 +440,7 @@ static int follow_pfn_pte(struct vm_area
pte_t *pte, unsigned int flags)
{
/* No page to get reference */
- if (flags & FOLL_GET)
+ if (flags & (FOLL_GET | FOLL_PIN))
return -EFAULT;
if (flags & FOLL_TOUCH) {
@@ -1181,7 +1181,13 @@ retry:
/*
* Proper page table entry exists, but no corresponding
* struct page.
+ *
+ * Warn if we jumped over even with a valid **pages.
+ * It shouldn't trigger in practise, but when there's
+ * buggy returns on -EEXIST we'll warn before returning
+ * an invalid page pointer in the array.
*/
+ WARN_ON_ONCE(pages);
goto next_page;
} else if (IS_ERR(page)) {
ret = PTR_ERR(page);
_
Patches currently in -mm which might be from peterx(a)redhat.com are
On Tue, Feb 1, 2022 at 10:00 AM Will McVicker <willmcvicker(a)google.com> wrote:
>
> On Tue, Feb 1, 2022 at 1:29 AM John Hubbard <jhubbard(a)nvidia.com> wrote:
> >
> > This reverts commit 54d516b1d62ff8f17cee2da06e5e4706a0d00b8a
> >
> > That commit did a refactoring that effectively combined fast and slow
> > gup paths (again). And that was again incorrect, for two reasons:
> >
> > a) Fast gup and slow gup get reference counts on pages in different ways
> > and with different goals: see Linus' writeup in commit cd1adf1b63a1
> > ("Revert "mm/gup: remove try_get_page(), call try_get_compound_head()
> > directly""), and
> >
> > b) try_grab_compound_head() also has a specific check for "FOLL_LONGTERM
> > && !is_pinned(page)", that assumes that the caller can fall back to slow
> > gup. This resulted in new failures, as recently reported by Will McVicker
> > [1].
> >
> > But (a) has problems too, even though they may not have been reported
> > yet. So just revert this.
> >
> > [1] https://lore.kernel.org/r/20220131203504.3458775-1-willmcvicker@google.com
> >
> > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
> > Cc: Christoph Hellwig <hch(a)lst.de>
> > Cc: Will McVicker <willmcvicker(a)google.com>
> > Cc: Minchan Kim <minchan(a)google.com>
> > Cc: Matthew Wilcox <willy(a)infradead.org>
> > Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
> > Cc: Heiko Carstens <hca(a)linux.ibm.com>
> > Cc: Vasily Gorbik <gor(a)linux.ibm.com>
> > Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
> > Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
> > ---
> > mm/gup.c | 35 ++++++++++++++++++++++++++++++-----
> > 1 file changed, 30 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index f0af462ac1e2..a9d4d724aef7 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -124,8 +124,8 @@ static inline struct page *try_get_compound_head(struct page *page, int refs)
> > * considered failure, and furthermore, a likely bug in the caller, so a warning
> > * is also emitted.
> > */
> > -struct page *try_grab_compound_head(struct page *page,
> > - int refs, unsigned int flags)
> > +__maybe_unused struct page *try_grab_compound_head(struct page *page,
> > + int refs, unsigned int flags)
> > {
> > if (flags & FOLL_GET)
> > return try_get_compound_head(page, refs);
> > @@ -208,10 +208,35 @@ static void put_compound_head(struct page *page, int refs, unsigned int flags)
> > */
> > bool __must_check try_grab_page(struct page *page, unsigned int flags)
> > {
> > - if (!(flags & (FOLL_GET | FOLL_PIN)))
> > - return true;
> > + WARN_ON_ONCE((flags & (FOLL_GET | FOLL_PIN)) == (FOLL_GET | FOLL_PIN));
> >
> > - return try_grab_compound_head(page, 1, flags);
> > + if (flags & FOLL_GET)
> > + return try_get_page(page);
> > + else if (flags & FOLL_PIN) {
> > + int refs = 1;
> > +
> > + page = compound_head(page);
> > +
> > + if (WARN_ON_ONCE(page_ref_count(page) <= 0))
> > + return false;
> > +
> > + if (hpage_pincount_available(page))
> > + hpage_pincount_add(page, 1);
> > + else
> > + refs = GUP_PIN_COUNTING_BIAS;
> > +
> > + /*
> > + * Similar to try_grab_compound_head(): even if using the
> > + * hpage_pincount_add/_sub() routines, be sure to
> > + * *also* increment the normal page refcount field at least
> > + * once, so that the page really is pinned.
> > + */
> > + page_ref_add(page, refs);
> > +
> > + mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
> > + }
> > +
> > + return true;
> > }
> >
> > /**
> >
> > base-commit: 26291c54e111ff6ba87a164d85d4a4e134b7315c
> > --
> > 2.35.1
> >
>
> Thanks John! I verified this works on the Pixel 6 with the 5.15 kernel
> for my camera use-case. Feel free to include:
>
> Tested-by: Will McVicker <willmcvicker(a)google.com>
>
> Thanks,
> Will
And just so we don't miss this, I'd also like to request this be
pulled into the 5.15 stable branch please.
Cc: stable(a)vger.kernel.org # 5.15
Thanks,
Will
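
For readers following the thread, the two reference-taking modes that the
restored try_grab_page() in the quoted diff distinguishes can be modelled in
userspace roughly as follows. This is a simplified sketch, not the kernel
implementation: struct toy_page, toy_try_grab_page() and the flag values are
hypothetical, the has_pincount field stands in for
hpage_pincount_available(), and the node statistics update is left out.
GUP_PIN_COUNTING_BIAS is assumed to be 1024, as in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the kernel's value (1U << 10). */
#define GUP_PIN_COUNTING_BIAS 1024

/* Toy flag values for this sketch, not the kernel's definitions. */
enum { FOLL_GET = 1 << 0, FOLL_PIN = 1 << 1 };

/* Hypothetical stand-in for the head page's counters. */
struct toy_page {
    int refcount;
    int pincount;       /* separate pin counter, if available */
    bool has_pincount;  /* models hpage_pincount_available() */
};

static bool toy_try_grab_page(struct toy_page *page, unsigned int flags)
{
    /* Requesting both modes at once is a caller bug in the real code. */
    assert((flags & (FOLL_GET | FOLL_PIN)) != (FOLL_GET | FOLL_PIN));

    if (flags & FOLL_GET) {
        /* Plain reference: refuse pages that are already free. */
        if (page->refcount <= 0)
            return false;
        page->refcount += 1;
        return true;
    }

    if (flags & FOLL_PIN) {
        int refs = 1;

        if (page->refcount <= 0)
            return false;

        /*
         * Either count the pin in the dedicated field, or encode it in
         * the refcount by adding the bias. In both cases the refcount
         * goes up by at least one so the page stays alive.
         */
        if (page->has_pincount)
            page->pincount += 1;
        else
            refs = GUP_PIN_COUNTING_BIAS;

        page->refcount += refs;
    }

    /* Neither flag set: nothing to take, which counts as success. */
    return true;
}
```

Pinning a page without a dedicated pin counter moves its refcount from 1 to
1025 in this model, whereas a page with the counter goes to refcount 2 and
pincount 1.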