When a mapping is set up on a paging domain before the domain is attached to any device, a NULL pointer dereference triggers as below:
BUG: kernel NULL pointer dereference, address: 0000000000000200
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
RIP: 0010:cache_tag_flush_range_np+0x114/0x1f0
...
Call Trace:
 <TASK>
 ? __die+0x20/0x70
 ? page_fault_oops+0x80/0x150
 ? do_user_addr_fault+0x5f/0x670
 ? pfn_to_dma_pte+0xca/0x280
 ? exc_page_fault+0x78/0x170
 ? asm_exc_page_fault+0x22/0x30
 ? cache_tag_flush_range_np+0x114/0x1f0
 intel_iommu_iotlb_sync_map+0x16/0x20
 iommu_map+0x59/0xd0
 batch_to_domain+0x154/0x1e0
 iopt_area_fill_domains+0x106/0x300
 iopt_map_pages+0x1bc/0x290
 iopt_map_user_pages+0xe8/0x1e0
 ? xas_load+0x9/0xb0
 iommufd_ioas_map+0xc9/0x1c0
 iommufd_fops_ioctl+0xff/0x1b0
 __x64_sys_ioctl+0x87/0xc0
 do_syscall_64+0x50/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
The qi_batch structure is allocated when the domain is attached to a device for the first time. When a mapping is set up before that, qi_batch is dereferenced to do a batched flush, triggering the above issue.

Fix it by checking the qi_batch pointer and bypassing batched flushing if no device is attached.
Cc: stable@vger.kernel.org
Fixes: 705c1cdf1e73 ("iommu/vt-d: Introduce batched cache invalidation")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 drivers/iommu/intel/cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index e5b89f728ad3..bb9dae9a7fba 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -264,7 +264,7 @@ static unsigned long calculate_psi_aligned_address(unsigned long start,
 
 static void qi_batch_flush_descs(struct intel_iommu *iommu, struct qi_batch *batch)
 {
-	if (!iommu || !batch->index)
+	if (!iommu || !batch || !batch->index)
 		return;
 
 	qi_submit_sync(iommu, batch->descs, batch->index, 0);
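For context, a minimal sketch of the structure being dereferenced, with the field layout as introduced by the Fixes: commit (the sizes and offset arithmetic below are our reading of that commit, so treat them as assumptions). With a 32-byte qi_desc and a 16-entry array, batch->index sits at offset 512 (0x200), which matches the faulting address in the oops above when batch is NULL:

typedef unsigned long long u64;	/* stand-in for the kernel's u64 */

#define QI_MAX_BATCHED_DESC_SIZE	16

struct qi_desc {
	u64 qw0, qw1, qw2, qw3;	/* 4 x 8 = 32 bytes */
};

struct qi_batch {
	struct qi_desc descs[QI_MAX_BATCHED_DESC_SIZE];	/* 16 x 32 = 512 bytes */
	unsigned int index;	/* at offset 0x200: the NULL-deref address */
};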
On 2024/12/12 15:24, Zhenzhong Duan wrote:
> When a mapping is set up on a paging domain before the domain is
> attached to any device, a NULL pointer dereference triggers as below:
> [full patch quoted above, trimmed]
> -	if (!iommu || !batch->index)
> +	if (!iommu || !batch || !batch->index)
This is the same issue as in the link below. :) And we should fix it by allocating the qi_batch for parent domains. A nested parent domain is not going to be attached to a device at all; it acts more as a background domain, so this fix will miss future cache flushes. It would have bigger problems. :)
https://lore.kernel.org/linux-iommu/20241210130322.17175-1-yi.l.liu@intel.com/
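A minimal sketch of the direction suggested here, i.e. allocating the batch when a nested parent domain is created rather than at first device attach. The helper name and exact call site are hypothetical illustrations, not the actual patch from the link:

/* Hypothetical helper: give a parent domain its qi_batch up front,
 * since it may never be attached to a device yet still needs cache
 * flushes on behalf of its nested children.
 */
static int domain_alloc_qi_batch(struct dmar_domain *domain)
{
	domain->qi_batch = kzalloc(sizeof(*domain->qi_batch), GFP_KERNEL);
	if (!domain->qi_batch)
		return -ENOMEM;
	return 0;
}

/* Presumably called from the paging-domain allocation path when the
 * caller requests a nesting parent (e.g. IOMMU_HWPT_ALLOC_NEST_PARENT). */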
-----Original Message-----
From: Liu, Yi L <yi.l.liu@intel.com>
Sent: Thursday, December 12, 2024 3:45 PM
Subject: Re: [PATCH] iommu/vt-d: Fix kernel NULL pointer dereference in cache_tag_flush_range_np()
> On 2024/12/12 15:24, Zhenzhong Duan wrote:
> [full patch quoted above, trimmed]
>
> This is the same issue as in the link below. :) And we should fix it by
> allocating the qi_batch for parent domains. A nested parent domain is
> not going to be attached to a device at all; it acts more as a
> background domain, so this fix will miss future cache flushes. It would
> have bigger problems. :)
>
> https://lore.kernel.org/linux-iommu/20241210130322.17175-1-yi.l.liu@intel.com/
Ah, just saw this 😊 This patch tries to fix an issue where a mapping is set up on a paging domain before it's attached to a device; your patch fixed an issue where a nested parent domain's qi_batch is not allocated even if the nested domain is attached to a device. I think they are different issues?
Thanks
Zhenzhong
On 2024/12/12 16:19, Duan, Zhenzhong wrote:
> -----Original Message-----
> From: Liu, Yi L <yi.l.liu@intel.com>
> [earlier patch and discussion quoted above, trimmed]
>
> Ah, just saw this 😊 This patch tries to fix an issue where a mapping is
> set up on a paging domain before it's attached to a device; your patch
> fixed an issue where a nested parent domain's qi_batch is not allocated
> even if the nested domain is attached to a device. I think they are
> different issues?
Oops, I see. I think your case is allocating a hwpt based on an IOAS that already has mappings. When the hwpt is added to it, all the mappings of this IOAS would be pushed to the hwpt. But the hwpt has not been attached yet, so it hits this issue. I remember there is an immediate_attach bool to let iommufd_hwpt_paging_alloc() do an attach before calling iopt_table_add_domain(), which pushes the IOAS mappings to the domain.

One possible fix is to set immediate_attach to true. But I doubt it will be agreed to, since it was introduced due to some gap on the ARM side. If that gap has been resolved, this behavior would go away as well.

So, back to this issue: with this change, the flush would be skipped. It looks OK to me to skip the cache flush in the map path. And we should not expect any unmap on this domain since there is no device attached (a parent domain is an exception), hence there is nothing to be flushed even if there is an unmap in the domain's IOAS. So it appears to be an acceptable fix. @Baolu, your opinion?
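To make the ordering above concrete, here is a small self-contained userspace model of the map-before-attach sequence being described (all names here are illustrative stand-ins, not the real iommufd/vt-d code):

#include <stdio.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures under discussion. */
struct qi_batch { unsigned int index; };
struct domain   { struct qi_batch *qi_batch; };

/* Models the map-path flush: with the patch's NULL check, an
 * unattached domain (qi_batch == NULL) is simply skipped. */
static void flush_after_map(struct domain *d)
{
	if (!d->qi_batch || !d->qi_batch->index)
		return;
	/* ... would submit the batched invalidation descriptors ... */
}

/* Models first device attach, where qi_batch gets allocated. */
static void attach_device(struct domain *d)
{
	if (!d->qi_batch)
		d->qi_batch = calloc(1, sizeof(*d->qi_batch));
}

int main(void)
{
	struct domain d = { .qi_batch = NULL };

	/* hwpt allocated and IOAS mappings replayed before any attach: */
	flush_after_map(&d);	/* safe only because of the NULL check */
	attach_device(&d);	/* the batch exists from this point on */
	free(d.qi_batch);
	printf("map-before-attach ordering handled without a crash\n");
	return 0;
}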
Hi Yi,
-----Original Message-----
From: Liu, Yi L <yi.l.liu@intel.com>
Sent: Thursday, December 12, 2024 5:29 PM
Subject: Re: [PATCH] iommu/vt-d: Fix kernel NULL pointer dereference in cache_tag_flush_range_np()
> On 2024/12/12 16:19, Duan, Zhenzhong wrote:
> [earlier patch and discussion quoted above, trimmed]
>
> Oops, I see. I think your case is allocating a hwpt based on an IOAS
> that already has mappings. When the hwpt is added to it, all the
> mappings of this IOAS would be pushed to the hwpt. But the hwpt has not
> been attached yet, so it hits this issue. [...]
>
> So, back to this issue: with this change, the flush would be skipped.
> [...] So it appears to be an acceptable fix. @Baolu, your opinion?
Hold on, it looks like I was wrong in analyzing the related code, qi_batch_flush_descs(). The iommu pointer should always be NULL in my suspected case, so qi_batch_flush_descs() would return early without issue.

I reproduced the backtrace when playing with iommufd QEMU nesting. I think your previous comment is correct; I misunderstood the root cause. It's indeed due to a nested parent domain, not a plain paging domain. Please ignore this patch.
Thanks
Zhenzhong
On 2024/12/12 18:01, Duan, Zhenzhong wrote:
> Hi Yi,
> [earlier patch and discussion quoted above, trimmed]
>
> Hold on, it looks like I was wrong in analyzing the related code,
> qi_batch_flush_descs(). The iommu pointer should always be NULL in my
> suspected case, so qi_batch_flush_descs() would return early without
> issue.
>
> I reproduced the backtrace when playing with iommufd QEMU nesting. I
> think your previous comment is correct; I misunderstood the root cause.
> It's indeed due to a nested parent domain, not a plain paging domain.
> Please ignore this patch.
Great. I also tried allocating a hwpt with an IOAS that already has a bunch of mappings; it works, since the iommu pointer is NULL there.
@Baolu, you may ignore this patch as it's already fixed.
On 2024/12/12 19:00, Yi Liu wrote:
> On 2024/12/12 18:01, Duan, Zhenzhong wrote:
> [earlier patch and discussion quoted above, trimmed]
>
> Great. I also tried allocating a hwpt with an IOAS that already has a
> bunch of mappings; it works, since the iommu pointer is NULL there.
>
> @Baolu, you may ignore this patch as it's already fixed.
Okay, thank you!