On Wed, 9 Jul 2025 at 05:55, Jason Gunthorpe <jgg@nvidia.com> wrote:
On Wed, Jul 09, 2025 at 02:26:20AM +0530, Naresh Kamboju wrote:
Regression identified while booting the Dragonboard 410c (Qualcomm APQ8016 SBC) using the Linux next-20250702 kernel tag. During device initialization, the kernel triggers a WARNING in the arm_lpae_map_pages() function, which is part of the IOMMU subsystem. The call trace also involves qcom_iommu_map().
Test environments:
- Dragonboard-410c
Regression Analysis:
- New regression? Yes
- Reproducibility? Yes
Boot regression: next-20250702 WARNING iommu io-pgtable-arm.c at arm_lpae_map_pages qcom_iommu_map
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
List of suspected patches with recent changes.
Can you test this fix please:
I tested this patch on top of the Linux next-20250702 tag and found the following kernel warning:
[ 1.510468] ------------[ cut here ]------------
[ 1.516302] WARNING: drivers/iommu/iommu.c:1142 at iommu_create_device_direct_mappings+0x240/0x258, CPU#1: swapper/0/1
[ 1.521001] Modules linked in:
[ 1.531485] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.16.0-rc4-next-20250702 #1 PREEMPT
[ 1.534538] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 1.543473] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 1.550241] pc : iommu_create_device_direct_mappings (drivers/iommu/iommu.c:1142 (discriminator 7))
[ 1.556924] lr : iommu_setup_default_domain (drivers/iommu/iommu.c:2992 (discriminator 1))
[ 1.563170] sp : ffff80008002b9c0
[ 1.568113] x29: ffff80008002b9e0 x28: 0000000000000000 x27: ffff80008174e2c0
[ 1.571596] x26: ffff000004c58030 x25: ffff800081d75228 x24: ffff80008221eba4
[ 1.578714] x23: ffff000003d99410 x22: ffff80008002b9c8 x21: ffff000003ce5900
[ 1.585833] x20: ffff000003ce5948 x19: ffff000002e6d5a0 x18: 0000000000000000
[ 1.592951] x17: ffff000003d4a000 x16: ffff000002c37e00 x15: 07690720076f0774
[ 1.600068] x14: 0000000000000000 x13: ffff800082237670 x12: 000000000003786e
[ 1.607185] x11: 0000000000000115 x10: 0000000000103758 x9 : 0000000000000000
[ 1.614303] x8 : ffff000003ce5d00 x7 : 0000000000000000 x6 : 000000000000003f
[ 1.621422] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000000000001
[ 1.628539] x2 : ffff000002ce0000 x1 : ffff000003d99410 x0 : 0000000000000003
[ 1.635660] Call trace:
[ 1.642766] iommu_create_device_direct_mappings (drivers/iommu/iommu.c:1142 (discriminator 7)) (P)
[ 1.645031] iommu_setup_default_domain (drivers/iommu/iommu.c:2992 (discriminator 1))
[ 1.651277] iommu_device_register (drivers/iommu/iommu.c:1905 drivers/iommu/iommu.c:277)
[ 1.655877] qcom_iommu_device_probe (drivers/iommu/arm/arm-smmu/qcom_iommu.c:860)
[ 1.660392] platform_probe (drivers/base/platform.c:1404)
[ 1.665163] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:657)
[ 1.668723] __driver_probe_device (drivers/base/dd.c:799)
[ 1.672371] driver_probe_device (drivers/base/dd.c:829)
[ 1.676623] __driver_attach (drivers/base/dd.c:1216 drivers/base/dd.c:1155)
[ 1.680615] bus_for_each_dev (drivers/base/bus.c:370)
[ 1.684434] driver_attach (drivers/base/dd.c:1234)
[ 1.688255] bus_add_driver (drivers/base/bus.c:678)
[ 1.692073] driver_register (drivers/base/driver.c:249)
[ 1.695633] __platform_driver_register (drivers/base/platform.c:868)
[ 1.699368] qcom_iommu_init (drivers/iommu/arm/arm-smmu/qcom_iommu.c:943)
[ 1.704226] do_one_initcall (init/main.c:1269)
[ 1.707873] kernel_init_freeable (init/main.c:1330 (discriminator 1) init/main.c:1347 (discriminator 1) init/main.c:1366 (discriminator 1) init/main.c:1579 (discriminator 1))
[ 1.711607] kernel_init (init/main.c:1473)
[ 1.716118] ret_from_fork (arch/arm64/kernel/entry.S:863)
[ 1.719419] ---[ end trace 0000000000000000 ]---
[ 1.723302] iommu 1ef0000.iommu: IOMMU driver was not able to establish FW requested direct mapping.
[ 1.734231] platform 1c00000.gpu: Adding to iommu group 2
[ 1.748218] loop: module loaded
Links:
- https://lkft.validation.linaro.org/scheduler/job/8350682#L2838
--- a/drivers/iommu/arm/arm-smmu/qcom_iommu.c
+++ b/drivers/iommu/arm/arm-smmu/qcom_iommu.c
@@ -229,7 +229,7 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		goto out_unlock;

 	pgtbl_cfg = (struct io_pgtable_cfg) {
-		.pgsize_bitmap	= domain->pgsize_bitmap,
+		.pgsize_bitmap	= SZ_4K | SZ_64K | SZ_1M | SZ_16M,
 		.ias		= 32,
 		.oas		= 40,
 		.tlb		= &qcom_flush_ops,
@@ -246,6 +246,8 @@ static int qcom_iommu_init_domain(struct iommu_domain *domain,
 		goto out_clear_iommu;
 	}

+	/* Update the domain's page sizes to reflect the page table format */
+	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
 	domain->geometry.force_aperture = true;

@@ -335,7 +337,6 @@ static struct iommu_domain *qcom_iommu_domain_alloc_paging(struct device *dev)

 	mutex_init(&qcom_domain->init_mutex);
 	spin_lock_init(&qcom_domain->pgtbl_lock);
-	qcom_domain->domain.pgsize_bitmap = SZ_4K | SZ_64K | SZ_1M | SZ_16M;

 	return &qcom_domain->domain;
 }
Of all the drivers, qcom is the only one that uses the 64-bit ARM page table with both 4K and 64K sizes and was using the ops global. The io_pgtable code removes one of the two sizes depending on PAGE_SIZE, which makes things inconsistent and hits that warning.
Jason
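
To illustrate the page-size mismatch described above, here is a minimal user-space sketch in C. It is not the kernel's code: the function names (granule_block_sizes, model_restrict_pgsizes, pgsize_model_selfcheck) are hypothetical, and the granule choice is simplified to "the granule equals PAGE_SIZE if the driver requested it". The per-granule block sizes are the usual ones for the 64-bit ARM long-descriptor format. The point is only that the bitmap requested by the driver and the bitmap the page table ends up with can differ, so the domain should report the latter.

```c
#include <assert.h>

/* Local copies of the usual size constants; values match the kernel's SZ_* macros. */
#define SZ_4K   0x00001000UL
#define SZ_16K  0x00004000UL
#define SZ_64K  0x00010000UL
#define SZ_1M   0x00100000UL
#define SZ_2M   0x00200000UL
#define SZ_16M  0x01000000UL
#define SZ_32M  0x02000000UL
#define SZ_512M 0x20000000UL
#define SZ_1G   0x40000000UL

/*
 * Hypothetical helper: page and block sizes that a 64-bit ARM
 * long-descriptor table can map for a given translation granule.
 */
unsigned long granule_block_sizes(unsigned long granule)
{
	switch (granule) {
	case SZ_4K:
		return SZ_4K | SZ_2M | SZ_1G;
	case SZ_16K:
		return SZ_16K | SZ_32M;
	case SZ_64K:
		return SZ_64K | SZ_512M;
	default:
		return 0;
	}
}

/*
 * Hypothetical model of the restriction step. Simplifying assumption:
 * the granule is the CPU page size, and only if the driver requested it.
 * Returns the subset of 'requested' that the resulting table can use.
 */
unsigned long model_restrict_pgsizes(unsigned long requested,
				     unsigned long page_size)
{
	if (!(requested & page_size))
		return 0;
	return requested & granule_block_sizes(page_size);
}

/* Self-check showing the mismatch; call it from a main() of your own. */
void pgsize_model_selfcheck(void)
{
	/* The bitmap the qcom driver passes in the patch above. */
	unsigned long requested = SZ_4K | SZ_64K | SZ_1M | SZ_16M;

	/* With 4K pages only 4K survives; with 64K pages only 64K does. */
	assert(model_restrict_pgsizes(requested, SZ_4K) == SZ_4K);
	assert(model_restrict_pgsizes(requested, SZ_64K) == SZ_64K);
}
```

In this model neither result equals the requested bitmap, which is why the patch copies pgtbl_cfg.pgsize_bitmap back into domain->pgsize_bitmap once the page table has been allocated.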