From: Lu Baolu <baolu.lu@linux.intel.com>
Sent: Wednesday, July 9, 2025 2:28 PM
The vmalloc() and vfree() functions manage virtually contiguous, but not necessarily physically contiguous, kernel memory regions. When vfree() unmaps such a region, it tears down the associated kernel page table entries and frees the physical pages.
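For context, a minimal sketch of the allocation pattern being discussed (illustrative kernel-style code; the function name here is made up for this example):

#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/string.h>
#include <linux/vmalloc.h>

/* Illustrative only: vmalloc() maps physically scattered pages into a
 * virtually contiguous kernel VA range; vfree() unmaps that range,
 * tearing down the kernel PTEs, and returns the pages to the allocator.
 */
static int example_vmalloc_cycle(void)
{
	void *buf = vmalloc(16 * PAGE_SIZE);	/* 16 scattered pages */

	if (!buf)
		return -ENOMEM;

	memset(buf, 0, 16 * PAGE_SIZE);	/* mapping is live here */
	vfree(buf);	/* unmap + free; kernel PTEs for buf go away */
	return 0;
}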
In the IOMMU Shared Virtual Addressing (SVA) context, the IOMMU hardware shares and walks the CPU's page tables. Architectures like x86 share static kernel address mappings across all user page tables, allowing the IOMMU to access the kernel portion of these tables.

I'd remove 'static'
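For illustration, the sharing happens when a new PGD is set up; a simplified sketch, loosely based on arch/x86/mm/pgtable.c (details vary by configuration):

/* Simplified: the kernel half of every new PGD is cloned from the
 * reference kernel page table, so any process page table handed to
 * the IOMMU for SVA also covers kernel virtual addresses.
 */
static void sketch_share_kernel_half(pgd_t *new_pgd)
{
	clone_pgd_range(new_pgd + KERNEL_PGD_BOUNDARY,
			swapper_pg_dir + KERNEL_PGD_BOUNDARY,
			KERNEL_PGD_PTRS);
}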
Modern IOMMUs often cache page table entries to optimize walk performance, even for intermediate page table levels. If kernel page table mappings are changed (e.g., by vfree()) but the IOMMU's internal caches retain stale entries, a use-after-free (UAF) condition arises. If these freed page table pages are reallocated for a different purpose, potentially by an attacker, the IOMMU could misinterpret the new data as valid page table entries. This would allow the IOMMU to walk into attacker-controlled memory, leading to arbitrary physical memory DMA access or privilege escalation.
this lacks the background that, currently, the iommu driver is notified only of changes to user VA mappings, so the IOMMU's internal caches may retain stale entries for kernel VA.
To mitigate this, introduce a new iommu interface to flush IOMMU caches and fence pending page table walks when kernel page mappings are updated. This interface should be invoked from architecture-specific code that manages combined user and kernel page tables.
this also needs some words about the fact that the new flushes are triggered not just for freeing page tables.
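To illustrate, the hook could sit next to the CPU's kernel TLB flush, e.g. (hypothetical sketch; the helper name iommu_sva_invalidate_kva_range() and the exact call site are assumptions, not necessarily what the patch does):

/* Hypothetical: notify the IOMMU whenever the CPU flushes kernel TLBs,
 * which covers vfree() page-table teardown as well as other kernel
 * mapping updates.
 */
void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	/* ... existing CPU TLB flush ... */
	iommu_sva_invalidate_kva_range(start, end);
}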
 static DEFINE_MUTEX(iommu_sva_lock);
+static DEFINE_STATIC_KEY_FALSE(iommu_sva_present);
+static LIST_HEAD(iommu_sva_mms);
+static DEFINE_SPINLOCK(iommu_mms_lock);
s/iommu_mms_lock/iommu_mm_lock/
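For what it's worth, a minimal sketch of how these globals could back the flush path (illustrative only; it assumes struct iommu_mm_data grows an mm back-pointer and a list node, and uses the lock rename suggested above):

/* Illustrative sketch: walk every mm currently bound for SVA and have
 * the IOMMU driver invalidate its caches for the kernel VA range via
 * the existing mmu_notifier arch hook. The static key keeps the common
 * no-SVA case to a single patched branch.
 */
void iommu_sva_invalidate_kva_range(unsigned long start, unsigned long end)
{
	struct iommu_mm_data *iommu_mm;
	unsigned long flags;

	if (!static_branch_unlikely(&iommu_sva_present))
		return;

	spin_lock_irqsave(&iommu_mm_lock, flags);
	list_for_each_entry(iommu_mm, &iommu_sva_mms, mm_list_elm)
		mmu_notifier_arch_invalidate_secondary_tlbs(iommu_mm->mm,
							    start, end);
	spin_unlock_irqrestore(&iommu_mm_lock, flags);
}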
Reviewed-by: Kevin Tian <kevin.tian@intel.com>