Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we bail out using "goto out_release_unlock;" in the cases where idx >= size, or !huge_pte_none(), the code will detect that new_pagecache_page == false, and so call restore_reserve_on_error(). In this case I see restore_reserve_on_error() delete the reservation, and the following call to remove_inode_hugepages() will increment h->resv_hugepages causing a 100% reproducible leak.
We should treat the is_continue case similar to adding a page into the pagecache and set new_pagecache_page to true, to indicate that there is no reservation to restore on the error path, and we need not call restore_reserve_on_error(). Rename new_pagecache_page to page_in_pagecache to make that clear.
Cc: Wei Xu weixugc@google.com
Cc: stable@vger.kernel.org
Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") Signed-off-by: Mina Almasry almasrymina@google.com Reported-by: James Houghton jthoughton@google.com
---
Changes in v2: - Renamed new_pagecache_page to page_in_pagecache - Removed unnecessary comment after the name update. - Cc: stable --- mm/hugetlb.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e09159c957e3..e7ebc4b355cf 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5734,13 +5734,14 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, int ret = -ENOMEM; struct page *page; int writable; - bool new_pagecache_page = false; + bool page_in_pagecache = false;
if (is_continue) { ret = -EFAULT; page = find_lock_page(mapping, idx); if (!page) goto out; + page_in_pagecache = true; } else if (!*pagep) { /* If a page already exists, then it's UFFDIO_COPY for * a non-missing case. Return -EEXIST. @@ -5828,7 +5829,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, ret = huge_add_to_page_cache(page, mapping, idx); if (ret) goto out_release_nounlock; - new_pagecache_page = true; + page_in_pagecache = true; }
ptl = huge_pte_lockptr(h, dst_mm, dst_pte); @@ -5892,7 +5893,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, if (vm_shared || is_continue) unlock_page(page); out_release_nounlock: - if (!new_pagecache_page) + if (!page_in_pagecache) restore_reserve_on_error(h, dst_vma, dst_addr, page); put_page(page); goto out; -- 2.34.0.rc2.393.gf8c9666880-goog
On 11/17/21 11:38, Mina Almasry wrote:
Currently in the is_continue case in hugetlb_mcopy_atomic_pte(), if we bail out using "goto out_release_unlock;" in the cases where idx >= size, or !huge_pte_none(), the code will detect that new_pagecache_page == false, and so call restore_reserve_on_error(). In this case I see restore_reserve_on_error() delete the reservation, and the following call to remove_inode_hugepages() will increment h->resv_hugepages causing a 100% reproducible leak.
We should treat the is_continue case similar to adding a page into the pagecache and set new_pagecache_page to true, to indicate that there is no reservation to restore on the error path, and we need not call restore_reserve_on_error(). Rename new_pagecache_page to page_in_pagecache to make that clear.
Cc: Wei Xu weixugc@google.com
Cc: stable@vger.kernel.org
Fixes: c7b1850dfb41 ("hugetlb: don't pass page cache pages to restore_reserve_on_error") Signed-off-by: Mina Almasry almasrymina@google.com Reported-by: James Houghton jthoughton@google.com
Changes in v2:
- Renamed new_pagecache_page to page_in_pagecache
- Removed unnecessary comment after the name update.
- Cc: stable
Thanks for making the changes!
Reviewed-by: Mike Kravetz mike.kravetz@oracle.com
linux-stable-mirror@lists.linaro.org