On Thu, Jun 12, 2025 at 10:53:34AM -0700, Nicolin Chen wrote:
On Thu, Jun 12, 2025 at 12:42:42PM -0300, Jason Gunthorpe wrote:
On Thu, Jun 12, 2025 at 05:23:01PM +0200, Thomas Weißschuh wrote:
On Thu, Jun 12, 2025 at 11:58:01AM -0300, Jason Gunthorpe wrote:
On Thu, Jun 12, 2025 at 04:27:41PM +0200, Thomas Weißschuh wrote:
If the assumption is that this is most likely a kernel bug, shouldn't it be fixed properly rather than worked around? After all the job of a selftest is to detect bugs to be fixed.
I investigated the history for a bit and it seems likely we cannot change the kernel here. Call it an undocumented "feature".
I looked a bit and it seems to be mentioned in mmap(2):
For mmap(), offset must be a multiple of the underlying huge page size. The system automatically aligns length to be a multiple of the underlying huge page size.
Oh there you go then :) Horrible design. No way for userspace to know what the rounded up length actually was and thus no way for userspace to unmap it.
OK. I think we would have to skip those cases then.
Or.. maybe we could just allocate a huge page:
@@ -2022,7 +2023,19 @@ FIXTURE_SETUP(iommufd_dirty_tracking)
 	self->fd = open("/dev/iommu", O_RDWR);
 	ASSERT_NE(-1, self->fd);
 
-	rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, variant->buffer_size);
+	if (variant->hugepages) {
+		/*
+		 * Allocation must be aligned to the HUGEPAGE_SIZE, because the
+		 * following mmap() will automatically align the length to be a
+		 * multiple of the underlying huge page size. Failing to do the
+		 * same at this allocation will result in a memory overwrite by
+		 * the mmap().
+		 */
+		size = __ALIGN_KERNEL(variant->buffer_size, HUGEPAGE_SIZE);
+	} else {
+		size = variant->buffer_size;
+	}
+	rc = posix_memalign(&self->buffer, HUGEPAGE_SIZE, size);
 	if (rc || !self->buffer) {
 		SKIP(return, "Skipping buffer_size=%lu due to errno=%d",
 			   variant->buffer_size, rc);
It can just upsize the allocation, i.e. the test case will only use the first 64MB or 128MB of the reserved 512MB huge page.
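For illustration, here is a minimal standalone sketch of the rounding this relies on. It assumes 512MB huge pages and a power-of-two alignment. SKETCH_HUGEPAGE_SIZE, sketch_align_up() and sketch_alloc_buffer() are made-up names for this example, not helpers from the selftest, and the rounding is meant to mirror what __ALIGN_KERNEL() does rather than reuse it:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption for this sketch: 512MB huge pages, as in the example above. */
#define SKETCH_HUGEPAGE_SIZE (512UL << 20)

/* Round len up to a multiple of align; align must be a power of two. */
static size_t sketch_align_up(size_t len, size_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/*
 * Reserve a buffer that already covers the length a later hugetlb mmap()
 * over it would be rounded up to. Stores the reserved size in *reserved
 * and returns NULL if the allocation fails.
 */
static void *sketch_alloc_buffer(size_t buffer_size, int hugepages,
				 size_t *reserved)
{
	void *buf = NULL;
	size_t size = buffer_size;

	if (hugepages)
		size = sketch_align_up(buffer_size, SKETCH_HUGEPAGE_SIZE);

	/* posix_memalign() returns 0 on success, an error number otherwise. */
	if (posix_memalign(&buf, SKETCH_HUGEPAGE_SIZE, size))
		return NULL;

	*reserved = size;
	return buf;
}
```

With these assumptions, both a 64MB and a 128MB buffer_size reserve 512MB when hugepages is set, and a size that is already a multiple of 512MB is left unchanged.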
Thanks
Nicolin