On Tue, Apr 30, 2024 at 02:31:38PM +0100, Ryan Roberts wrote:
__split_huge_pmd_locked() can be called for a present THP, devmap or (non-present) migration entry. It calls pmdp_invalidate() unconditionally on the pmdp and only determines if it is present or not based on the returned old pmd.
But arm64's pmd_mkinvalid(), called by pmdp_invalidate(), unconditionally sets the PMD_PRESENT_INVALID flag, which causes future pmd_present() calls to return true - even for a swap pmd. Therefore any lockless pgtable walker could see the migration entry pmd in this state and start interpretting the fields (e.g. pmd_pfn()) as if it were present, leading to BadThings (TM). GUP-fast appears to be one such lockless pgtable walker.
While the obvious fix is for core-mm to avoid such calls for non-present pmds (pmdp_invalidate() will also issue TLBI which is not necessary for this case either), all other arches that implement pmd_mkinvalid() do it in such a way that it is robust to being called with a non-present pmd. So it is simpler and safer to make arm64 robust too. This approach means we can even add tests to debug_vm_pgtable.c to validate the required behaviour.
This is a theoretical bug found during code review. I don't have any test case to trigger it in practice.
Cc: stable@vger.kernel.org Fixes: 53fa117bb33c ("arm64/mm: Enable THP migration") Signed-off-by: Ryan Roberts ryan.roberts@arm.com
Hi all,
v1 of this fix [1] took the approach of fixing core-mm to never call pmdp_invalidate() on a non-present pmd. But Zi Yan highlighted that only arm64 suffers this problem; all other arches are robust. So his suggestion was to instead make arm64 robust in the same way and add tests to validate it. Despite my stated reservations in the context of the v1 discussion, having thought on it for a bit, I now agree with Zi Yan. Hence this post.
Andrew has v1 in mm-unstable at the moment, so probably the best thing to do is remove it from there and have this go in through the arm64 tree? Assuming there is agreement that this approach is right one.
This applies on top of v6.9-rc5. Passes all the mm selftests on arm64.
[1] https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@arm.c...
Thanks, Ryan
arch/arm64/include/asm/pgtable.h | 12 +++++-- mm/debug_vm_pgtable.c | 61 ++++++++++++++++++++++++++++++++ 2 files changed, 71 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index afdd56d26ad7..7d580271a46d 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -511,8 +511,16 @@ static inline int pmd_trans_huge(pmd_t pmd)
static inline pmd_t pmd_mkinvalid(pmd_t pmd) {
- pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
- pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
/*
* If not valid then either we are already present-invalid or we are
* not-present (i.e. none or swap entry). We must not convert
* not-present to present-invalid. Unbelievably, the core-mm may call
* pmd_mkinvalid() for a swap entry and all other arches can handle it.
*/
if (pmd_valid(pmd)) {
pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
}
return pmd;
}
Acked-by: Will Deacon will@kernel.org
But it might be worth splitting the tests from the fix to make backporting easier.
Catalin -- I assume you'll pick this up, but please shout if you want me to take it instead.
Will