On 2024/12/12 15:24, Zhenzhong Duan wrote:
When setting up a mapping on a paging domain before the domain is attached to any device, a NULL pointer dereference is triggered as below:
BUG: kernel NULL pointer dereference, address: 0000000000000200
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
RIP: 0010:cache_tag_flush_range_np+0x114/0x1f0
...
Call Trace:
 <TASK>
 ? __die+0x20/0x70
 ? page_fault_oops+0x80/0x150
 ? do_user_addr_fault+0x5f/0x670
 ? pfn_to_dma_pte+0xca/0x280
 ? exc_page_fault+0x78/0x170
 ? asm_exc_page_fault+0x22/0x30
 ? cache_tag_flush_range_np+0x114/0x1f0
 intel_iommu_iotlb_sync_map+0x16/0x20
 iommu_map+0x59/0xd0
 batch_to_domain+0x154/0x1e0
 iopt_area_fill_domains+0x106/0x300
 iopt_map_pages+0x1bc/0x290
 iopt_map_user_pages+0xe8/0x1e0
 ? xas_load+0x9/0xb0
 iommufd_ioas_map+0xc9/0x1c0
 iommufd_fops_ioctl+0xff/0x1b0
 __x64_sys_ioctl+0x87/0xc0
 do_syscall_64+0x50/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
The qi_batch structure is allocated when the domain is attached to a device for the first time. When a mapping is set up before that, qi_batch is dereferenced to do a batched flush, which triggers the issue above.
Fix it by checking the qi_batch pointer and bypassing batched flushing if no device is attached.
Cc: stable@vger.kernel.org
Fixes: 705c1cdf1e73 ("iommu/vt-d: Introduce batched cache invalidation")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
 drivers/iommu/intel/cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index e5b89f728ad3..bb9dae9a7fba 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -264,7 +264,7 @@ static unsigned long calculate_psi_aligned_address(unsigned long start,
 static void qi_batch_flush_descs(struct intel_iommu *iommu, struct qi_batch *batch)
 {
- if (!iommu || !batch->index)
+ if (!iommu || !batch || !batch->index)
This is the same issue as in the link below. :) We should fix it by allocating the qi_batch for parent domains. A nested parent domain is not going to be attached to a device at all; it acts more as a background domain, so this fix will miss future cache flushes. That would cause bigger problems. :)
https://lore.kernel.org/linux-iommu/20241210130322.17175-1-yi.l.liu@intel.co...
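To illustrate the concern, here is a minimal userspace sketch (not kernel code). It assumes a toy model in which a domain queues invalidation descriptors into a batch and later flushes them. All names (toy_batch, toy_domain, toy_queue, toy_flush, toy_run, TOY_BATCH_SIZE) are hypothetical and only loosely mirror qi_batch and qi_batch_flush_descs(). It compares a domain whose batch is never allocated against one whose batch is allocated at creation time.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_BATCH_SIZE 16	/* hypothetical capacity, not the kernel's value */

/* Toy stand-ins for struct qi_batch and a domain; all names are hypothetical. */
struct toy_batch {
	unsigned int index;		/* number of queued descriptors */
	unsigned long descs[TOY_BATCH_SIZE];
};

struct toy_domain {
	struct toy_batch *batch;	/* NULL until someone allocates it */
	unsigned int submitted;		/* descriptors that were "submitted" */
};

/* Mirrors the patched check: a missing batch makes the flush a silent no-op. */
static void toy_flush(struct toy_domain *d)
{
	struct toy_batch *b = d->batch;

	if (!b || !b->index)
		return;
	d->submitted += b->index;
	b->index = 0;
}

/* Queue one invalidation; with no batch there is nowhere to put it. */
static int toy_queue(struct toy_domain *d, unsigned long desc)
{
	assert(d);
	if (!d->batch)
		return -1;
	if (d->batch->index == TOY_BATCH_SIZE)
		toy_flush(d);
	d->batch->descs[d->batch->index++] = desc;
	return 0;
}

/*
 * Create a domain, queue n invalidations, flush, and report how many were
 * submitted. alloc_at_create models allocating the batch together with the
 * domain instead of at first device attach.
 */
static unsigned int toy_run(int alloc_at_create, unsigned int n)
{
	struct toy_domain d = { 0 };
	unsigned int i, submitted;

	if (alloc_at_create) {
		d.batch = calloc(1, sizeof(*d.batch));
		if (!d.batch)
			return 0;
	}
	for (i = 0; i < n; i++)
		toy_queue(&d, i);
	toy_flush(&d);
	submitted = d.submitted;
	free(d.batch);
	return submitted;
}
```

In this model, toy_run(0, n) never crashes but also never submits anything, which is the "missed flushes" case, while toy_run(1, n) submits all n descriptors. Whether the real driver behaves the same way depends on code outside this thread, so treat the sketch only as a picture of the argument.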
return;
qi_submit_sync(iommu, batch->descs, batch->index, 0);