From: Moti Haimovski <moti.haimovski@intel.com>
[ Upstream commit 513024d5a0e34fd34247043f1876b6138ca52847 ]
When IOMMU is enabled, dma_alloc_coherent() with GFP_USER may return addresses from the vmalloc range. If such an address is mapped without VM_MIXEDMAP, vm_insert_page() will trigger a BUG_ON due to the VM_PFNMAP restriction.
Fix this by checking for vmalloc addresses and setting VM_MIXEDMAP in the VMA before mapping. This ensures safe mapping and avoids kernel crashes. The memory is still driver-allocated and cannot be accessed directly by userspace.
Signed-off-by: Moti Haimovski <moti.haimovski@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES: the fix prevents a hard kernel BUG when user command buffers (CBs) are mmap’ed under an IOMMU.
- `drivers/accel/habanalabs/gaudi/gaudi.c:4173` now marks the VMA with `VM_MIXEDMAP` whenever the coherent buffer lives in the vmalloc space, which is exactly what `dma_alloc_coherent(..., GFP_USER, …)` can return on IOMMU-backed systems; without this flag the later `vm_insert_page()` path hits `BUG_ON(vma->vm_flags & VM_PFNMAP)` in `mm/memory.c:2475`, crashing the kernel.
- The same guard is added for Gaudi2 in `drivers/accel/habanalabs/gaudi2/gaudi2.c:6842`, covering both current ASIC generations whose command buffers are allocated this way.
- Behaviour is unchanged for the legacy fallback path (`#else` branch using `remap_pfn_range`) and for non-vmalloc allocations, so regression risk is limited to setting one extra VMA flag only when needed.
Given that the pre-existing bug is an immediate kernel crash reachable from userspace workloads and the fix is tightly scoped with no architectural side effects, this is a strong candidate for a stable backport. Suggested follow-up test: on affected hardware with the IOMMU enabled, mmap a user CB whose backing memory was allocated with `GFP_USER` and confirm that the BUG no longer triggers.
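The flag interaction described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_vma`, the `MODEL_VM_*` bit values and both helper functions are hypothetical stand-ins, and the model assumes that `vm_insert_page()` skips its `VM_PFNMAP` check once `VM_MIXEDMAP` is already set, which is what the commit message implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag bits for this model; NOT the kernel's real values. */
#define MODEL_VM_PFNMAP   (1u << 0)
#define MODEL_VM_MIXEDMAP (1u << 1)

/* Hypothetical stand-in for struct vm_area_struct. */
struct model_vma {
	unsigned int flags;
};

/*
 * Models the flag setup in the patched mmap path: VM_PFNMAP is always
 * set, and VM_MIXEDMAP is added only for a vmalloc-backed buffer.
 * `is_vmalloc` stands in for is_vmalloc_addr(cpu_addr).
 */
static void model_prepare_vma(struct model_vma *vma, bool is_vmalloc)
{
	assert(vma != NULL);
	vma->flags = MODEL_VM_PFNMAP;
	if (is_vmalloc)
		vma->flags |= MODEL_VM_MIXEDMAP;
}

/*
 * Models the guard attributed to vm_insert_page(): a VMA that already
 * carries VM_MIXEDMAP passes; otherwise VM_PFNMAP must not be set.
 * Returns false where the real kernel would hit the BUG_ON.
 */
static bool model_insert_page_ok(const struct model_vma *vma)
{
	if (vma->flags & MODEL_VM_MIXEDMAP)
		return true;
	return !(vma->flags & MODEL_VM_PFNMAP);
}
```

In this model, `model_prepare_vma(&vma, true)` yields a VMA that `model_insert_page_ok()` accepts, while a VMA carrying only `MODEL_VM_PFNMAP` (the pre-patch state for a vmalloc-backed buffer) is rejected. Per the analysis above, non-vmalloc buffers keep their previous flags, so the extra bit appears only where it is needed.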
 drivers/accel/habanalabs/gaudi/gaudi.c   | 19 +++++++++++++++++++
 drivers/accel/habanalabs/gaudi2/gaudi2.c |  7 +++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/accel/habanalabs/gaudi/gaudi.c b/drivers/accel/habanalabs/gaudi/gaudi.c
index fa893a9b826ec..34771d75da9d7 100644
--- a/drivers/accel/habanalabs/gaudi/gaudi.c
+++ b/drivers/accel/habanalabs/gaudi/gaudi.c
@@ -4168,10 +4168,29 @@ static int gaudi_mmap(struct hl_device *hdev, struct vm_area_struct *vma,
 	vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP |
 			VM_DONTCOPY | VM_NORESERVE);
 
+#ifdef _HAS_DMA_MMAP_COHERENT
+	/*
+	 * If dma_alloc_coherent() returns a vmalloc address, set VM_MIXEDMAP
+	 * so vm_insert_page() can handle it safely. Without this, the kernel
+	 * may BUG_ON due to VM_PFNMAP.
+	 */
+	if (is_vmalloc_addr(cpu_addr))
+		vm_flags_set(vma, VM_MIXEDMAP);
+
 	rc = dma_mmap_coherent(hdev->dev, vma, cpu_addr,
 				(dma_addr - HOST_PHYS_BASE), size);
 	if (rc)
 		dev_err(hdev->dev, "dma_mmap_coherent error %d", rc);
+#else
+
+	rc = remap_pfn_range(vma, vma->vm_start,
+				virt_to_phys(cpu_addr) >> PAGE_SHIFT,
+				size, vma->vm_page_prot);
+	if (rc)
+		dev_err(hdev->dev, "remap_pfn_range error %d", rc);
+
+ #endif
+
 
 	return rc;
 }
diff --git a/drivers/accel/habanalabs/gaudi2/gaudi2.c b/drivers/accel/habanalabs/gaudi2/gaudi2.c
index 3df72a5d024a6..b957957df3d3a 100644
--- a/drivers/accel/habanalabs/gaudi2/gaudi2.c
+++ b/drivers/accel/habanalabs/gaudi2/gaudi2.c
@@ -6490,6 +6490,13 @@ static int gaudi2_mmap(struct hl_device *hdev, struct vm_area_struct *vma,
 			VM_DONTCOPY | VM_NORESERVE);
 
 #ifdef _HAS_DMA_MMAP_COHERENT
+	/*
+	 * If dma_alloc_coherent() returns a vmalloc address, set VM_MIXEDMAP
+	 * so vm_insert_page() can handle it safely. Without this, the kernel
+	 * may BUG_ON due to VM_PFNMAP.
+	 */
+	if (is_vmalloc_addr(cpu_addr))
+		vm_flags_set(vma, VM_MIXEDMAP);
 
 	rc = dma_mmap_coherent(hdev->dev, vma, cpu_addr, dma_addr, size);
 	if (rc)