On Wed, Jan 07, 2026 at 07:36:44PM -0800, Alex Mastro wrote:
This was inspired by QEMU's hw/vfio/region.c which also does this rounding up of size to the next power of two [1].
I'm now realizing that's only necessary for regions with VFIO_REGION_INFO_CAP_SPARSE_MMAP where there are multiple mmaps per region, and each mmap's size is less than the size of the BAR. Here, since we're mapping the entire BAR which must be pow2, it shouldn't be necessary.
You only need to do this dance if you care about having large PTEs under the VMAs, and it is probably worth testing both scenarios.
Isn't QEMU's mmap alignment code imperfect in the SPARSE_MMAP case? After a hole, the next mmap'able range could start at some arbitrary page-aligned offset into the region. It's not helpful to mmap a region offset which is at most 4K-aligned at a 1G-aligned vaddr.
I think to be optimal, QEMU should be attempting to align the vaddr for BAR mmaps such that
vaddr % {2M,1G} == region_offset % {2M,1G}
Would love someone to sanity check me on this. Kind of a diversion.
What you write is correct. Ankit recently discovered this bug in QEMU. It happens not just with SPARSE_MMAP but also when mmapping around the MSI-X hole.
I also advocated for what you write here, that QEMU should ensure:
vaddr % region_size == region_offset % region_size
Until VFIO learns to align its VMAs on its own via Peter's work.
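
To make the congruence concrete, here is a rough sketch of how userspace could pick such a vaddr inside a larger reservation. congruent_vaddr() is a hypothetical helper for illustration only, not existing QEMU or VFIO code, and it assumes the alignment (2M, 1G, or a pow2 region size) is a power of two:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the lowest address >= base such that
 *     addr % align == offset % align
 * align must be a non-zero power of two.
 */
static uint64_t congruent_vaddr(uint64_t base, uint64_t offset, uint64_t align)
{
	uint64_t mask = align - 1;

	assert(align && (align & mask) == 0);

	/*
	 * Distance from base to the next address with the wanted low
	 * bits; unsigned wraparound followed by the mask handles the
	 * case where the wanted low bits are below base's low bits.
	 */
	return base + (((offset & mask) - (base & mask)) & mask);
}
```

The caller would presumably reserve len + align bytes of address space first (for example with a PROT_NONE anonymous mmap), pass the start of that reservation as base, and then mmap the region offset with MAP_FIXED at the returned address, so the result always fits inside the reservation.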
Jason