Re: [PATCH 1/2] KVM: arm64: Disable interrupts while walking userspace PTs

13 Mar 2023

+David
On Mon, Mar 13, 2023, Marc Zyngier wrote:
...
We walk the userspace PTs to discover what mapping size was
used there. However, this can race against the userspace tables
being freed, and we end-up in the weeds.
Thankfully, the mm code is being generous and will IPI us when
doing so. So let's implement our part of the bargain and disable
interrupts around the walk. This ensures that nothing terrible
happens during that time.
We still need to handle the removal of the page tables before
the walk. For that, allow get_user_mapping_size() to return an
error, and make sure this error can be propagated all the way
to the the exit handler.
Signed-off-by: Marc Zyngier maz@kernel.org
Cc: stable@vger.kernel.org

arch/arm64/kvm/mmu.c | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 7113587222ff..d7b8b25942df 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -666,14 +666,23 @@ static int get_user_mapping_size(struct kvm *kvm, u64 addr)
   			   CONFIG_PGTABLE_LEVELS),
   	.mm_ops		= &kvm_user_mm_ops,
   };

unsigned long flags;
kvm_pte_t pte = 0;	/* Keep GCC quiet... */
u32 level = ~0;
int ret;


/*
* Disable IRQs so that we hazard against a concurrent


* teardown of the userspace page tables (which relies on


* IPI-ing threads).


*/


local_irq_save(flags);
ret = kvm_pgtable_get_leaf(&pgt, addr, &pte, &level);


VM_BUG_ON(ret);
VM_BUG_ON(level >= KVM_PGTABLE_MAX_LEVELS);
VM_BUG_ON(!(pte & PTE_VALID));


local_irq_restore(flags);

/* Oops, the userspace PTs are gone... */
if (ret || level >= KVM_PGTABLE_MAX_LEVELS || !(pte & PTE_VALID))
return -EFAULT;



I don't think this should return -EFAULT all the way out to userspace.  Unless
arm64 differs from x86 in terms of how the userspace page tables are managed, not
having a valid translation _right now_ doesn't mean that one can't be created in
the future, e.g. by way of a subsequent hva_to_pfn().
FWIW, the approach x86 takes is to install a 4KiB (smallest granuale) translation,
which is safe since there _was_ a valid translation when mmu_lock was acquired and
mmu_invalidate_retry() was checked.  It's the primary MMU's responsibility to ensure
all secondary MMUs are purged before freeing memory, i.e. worst case should be that
KVMs stage-2 translation will be immediately zapped via mmu_notifier.
KVM ARM also has a bug that might be related: the mmu_seq snapshot needs to be
taken _before_ mmap_read_unlock(), otherwise vma_shift may be stale by the time
it's consumed.  I believe David is going to submit a patch (I found and "reported"
the bug when doing an internal review of "common MMU" stuff).

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH 1/2] KVM: arm64: Disable interrupts while walking userspace PTs