From: Charan Teja Kalla quic_charante@quicinc.com
This fix is applicable for LTS kernel, 6.1.y. In latest kernels, this race issue is fixed by the patch series [1] and [2]. The right thing to do here would have been propagating these changes from latest kernel to the stable branch, 6.1.y. However, these changes seems too intrusive to be picked for stable branches. Hence, the fix proposed can be taken as an alternative instead of backporting the patch series. [1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia... [2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvi...
Issue: A race condition is observed when arm_smmu_device_probe and modprobe of client devices happens in parallel. This results in the allocation of a new default domain for the iommu group even though it was previously allocated and the respective iova domain(iovad) was initialized. However, for this newly allocated default domain, iovad will not be initialized. As a result, for devices requesting dma allocations, this uninitialized iovad will be used, thereby causing NULL pointer dereference issue.
Flow: - During arm_smmu_device_probe, bus_iommu_probe() will be called as part of iommu_device_register(). This results in the device probe, __iommu_probe_device().
- When the modprobe of the client device happens in parallel, it sets up the DMA configuration for the device using of_dma_configure_id(), which inturn calls iommu_probe_device(). Later, default domain is allocated and attached using iommu_alloc_default_domain() and __iommu_attach_device() respectively. It then ends up initializing a mapping domain(IOVA domain) and rcaches for the device via arch_setup_dma_ops()->iommu_setup_dma_ops().
- Now, in the bus_iommu_probe() path, it again tries to allocate a default domain via probe_alloc_default_domain(). This results in allocating a new default domain(along with IOVA domain) via __iommu_domain_alloc(). However, this newly allocated IOVA domain will not be initialized.
- Now, when the same client device tries dma allocations via iommu_dma_alloc(), it ends up accessing the rcaches of the newly allocated IOVA domain, which is not initialized. This results into NULL pointer dereferencing.
Fix this issue by adding a check in probe_alloc_default_domain() to see if the iommu_group already has a default domain allocated and initialized.
Signed-off-by: Charan Teja Kalla quic_charante@quicinc.com Co-developed-by: Nikhil V quic_nprakash@quicinc.com Signed-off-by: Nikhil V quic_nprakash@quicinc.com --- drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 8b3897239477..83736824f17d 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1741,6 +1741,9 @@ static void probe_alloc_default_domain(struct bus_type *bus, { struct __group_domain_type gtype;
+ if (group->default_domain) + return; + memset(>ype, 0, sizeof(gtype));
/* Ask for default domain requirements of all devices in the group */
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree. Subject: [PATCH] iommu: Avoid races around default domain allocations Link: https://lore.kernel.org/stable/cbf1295589bd90083ad6f75a7fbced01f327c047.1708...
On Mon, Mar 04, 2024 at 04:40:50PM +0530, Nikhil V wrote:
From: Charan Teja Kalla quic_charante@quicinc.com
This fix is applicable for LTS kernel, 6.1.y. In latest kernels, this race issue is fixed by the patch series [1] and [2]. The right thing to do here would have been propagating these changes from latest kernel to the stable branch, 6.1.y. However, these changes seems too intrusive to be picked for stable branches. Hence, the fix proposed can be taken as an alternative instead of backporting the patch series. [1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia... [2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvi...
Issue: A race condition is observed when arm_smmu_device_probe and modprobe of client devices happens in parallel. This results in the allocation of a new default domain for the iommu group even though it was previously allocated and the respective iova domain(iovad) was initialized. However, for this newly allocated default domain, iovad will not be initialized. As a result, for devices requesting dma allocations, this uninitialized iovad will be used, thereby causing NULL pointer dereference issue.
Flow:
- During arm_smmu_device_probe, bus_iommu_probe() will be called
as part of iommu_device_register(). This results in the device probe, __iommu_probe_device().
- When the modprobe of the client device happens in parallel, it
sets up the DMA configuration for the device using of_dma_configure_id(), which inturn calls iommu_probe_device(). Later, default domain is allocated and attached using iommu_alloc_default_domain() and __iommu_attach_device() respectively. It then ends up initializing a mapping domain(IOVA domain) and rcaches for the device via arch_setup_dma_ops()->iommu_setup_dma_ops().
- Now, in the bus_iommu_probe() path, it again tries to allocate
a default domain via probe_alloc_default_domain(). This results in allocating a new default domain(along with IOVA domain) via __iommu_domain_alloc(). However, this newly allocated IOVA domain will not be initialized.
- Now, when the same client device tries dma allocations via
iommu_dma_alloc(), it ends up accessing the rcaches of the newly allocated IOVA domain, which is not initialized. This results into NULL pointer dereferencing.
Fix this issue by adding a check in probe_alloc_default_domain() to see if the iommu_group already has a default domain allocated and initialized.
Cc: stable@vger.kernel.org # see patch description, fix applicable only for 6.1.y Signed-off-by: Charan Teja Kalla quic_charante@quicinc.com Co-developed-by: Nikhil V quic_nprakash@quicinc.com Signed-off-by: Nikhil V quic_nprakash@quicinc.com
drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+)
Why RESEND? What happened to the first send?
thanks,
greg k-h
On 3/4/2024 6:55 PM, Greg KH wrote:
On Mon, Mar 04, 2024 at 04:40:50PM +0530, Nikhil V wrote:
From: Charan Teja Kalla quic_charante@quicinc.com
This fix is applicable for LTS kernel, 6.1.y. In latest kernels, this race issue is fixed by the patch series [1] and [2]. The right thing to do here would have been propagating these changes from latest kernel to the stable branch, 6.1.y. However, these changes seems too intrusive to be picked for stable branches. Hence, the fix proposed can be taken as an alternative instead of backporting the patch series. [1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia... [2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvi...
Issue: A race condition is observed when arm_smmu_device_probe and modprobe of client devices happens in parallel. This results in the allocation of a new default domain for the iommu group even though it was previously allocated and the respective iova domain(iovad) was initialized. However, for this newly allocated default domain, iovad will not be initialized. As a result, for devices requesting dma allocations, this uninitialized iovad will be used, thereby causing NULL pointer dereference issue.
Flow:
- During arm_smmu_device_probe, bus_iommu_probe() will be called
as part of iommu_device_register(). This results in the device probe, __iommu_probe_device().
- When the modprobe of the client device happens in parallel, it
sets up the DMA configuration for the device using of_dma_configure_id(), which inturn calls iommu_probe_device(). Later, default domain is allocated and attached using iommu_alloc_default_domain() and __iommu_attach_device() respectively. It then ends up initializing a mapping domain(IOVA domain) and rcaches for the device via arch_setup_dma_ops()->iommu_setup_dma_ops().
- Now, in the bus_iommu_probe() path, it again tries to allocate
a default domain via probe_alloc_default_domain(). This results in allocating a new default domain(along with IOVA domain) via __iommu_domain_alloc(). However, this newly allocated IOVA domain will not be initialized.
- Now, when the same client device tries dma allocations via
iommu_dma_alloc(), it ends up accessing the rcaches of the newly allocated IOVA domain, which is not initialized. This results into NULL pointer dereferencing.
Fix this issue by adding a check in probe_alloc_default_domain() to see if the iommu_group already has a default domain allocated and initialized.
Cc: stable@vger.kernel.org # see patch description, fix applicable only for 6.1.y Signed-off-by: Charan Teja Kalla quic_charante@quicinc.com Co-developed-by: Nikhil V quic_nprakash@quicinc.com Signed-off-by: Nikhil V quic_nprakash@quicinc.com
drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+)
Why RESEND? What happened to the first send?
thanks,
greg k-h
Hi Greg,
There are no changes as such w.r.t first send, [1]. It is resent to gain attention on this patch. Also, we have added a proper Cc: stable tag with this patch.
[1] https://lore.kernel.org/all/cbf1295589bd90083ad6f75a7fbced01f327c047.1708680...
Thanks Nikhil V
On Tue, Mar 05, 2024 at 01:01:12PM +0530, Nikhil V wrote:
On 3/4/2024 6:55 PM, Greg KH wrote:
On Mon, Mar 04, 2024 at 04:40:50PM +0530, Nikhil V wrote:
From: Charan Teja Kalla quic_charante@quicinc.com
This fix is applicable for LTS kernel, 6.1.y. In latest kernels, this race issue is fixed by the patch series [1] and [2]. The right thing to do here would have been propagating these changes from latest kernel to the stable branch, 6.1.y. However, these changes seems too intrusive to be picked for stable branches. Hence, the fix proposed can be taken as an alternative instead of backporting the patch series. [1] https://lore.kernel.org/all/0-v8-81230027b2fa+9d-iommu_all_defdom_jgg@nvidia... [2] https://lore.kernel.org/all/0-v5-1b99ae392328+44574-iommu_err_unwind_jgg@nvi...
Issue: A race condition is observed when arm_smmu_device_probe and modprobe of client devices happens in parallel. This results in the allocation of a new default domain for the iommu group even though it was previously allocated and the respective iova domain(iovad) was initialized. However, for this newly allocated default domain, iovad will not be initialized. As a result, for devices requesting dma allocations, this uninitialized iovad will be used, thereby causing NULL pointer dereference issue.
Flow:
- During arm_smmu_device_probe, bus_iommu_probe() will be called
as part of iommu_device_register(). This results in the device probe, __iommu_probe_device().
- When the modprobe of the client device happens in parallel, it
sets up the DMA configuration for the device using of_dma_configure_id(), which inturn calls iommu_probe_device(). Later, default domain is allocated and attached using iommu_alloc_default_domain() and __iommu_attach_device() respectively. It then ends up initializing a mapping domain(IOVA domain) and rcaches for the device via arch_setup_dma_ops()->iommu_setup_dma_ops().
- Now, in the bus_iommu_probe() path, it again tries to allocate
a default domain via probe_alloc_default_domain(). This results in allocating a new default domain(along with IOVA domain) via __iommu_domain_alloc(). However, this newly allocated IOVA domain will not be initialized.
- Now, when the same client device tries dma allocations via
iommu_dma_alloc(), it ends up accessing the rcaches of the newly allocated IOVA domain, which is not initialized. This results into NULL pointer dereferencing.
Fix this issue by adding a check in probe_alloc_default_domain() to see if the iommu_group already has a default domain allocated and initialized.
Cc: stable@vger.kernel.org # see patch description, fix applicable only for 6.1.y Signed-off-by: Charan Teja Kalla quic_charante@quicinc.com Co-developed-by: Nikhil V quic_nprakash@quicinc.com Signed-off-by: Nikhil V quic_nprakash@quicinc.com
drivers/iommu/iommu.c | 3 +++ 1 file changed, 3 insertions(+)
Why RESEND? What happened to the first send?
thanks,
greg k-h
Hi Greg,
There are no changes as such w.r.t first send, [1]. It is resent to gain attention on this patch. Also, we have added a proper Cc: stable tag with this patch.
Thanks, now queued up.
greg k-h
linux-stable-mirror@lists.linaro.org