Hi,
The IOMMU code crashes when kernel 5.15.179 boots on a DELL R7715/AMD 9015, but kernels 6.1.133 and 6.6.86 boot fine.
This is the first time I have booted a 5.15.y kernel on the DELL R7715/AMD 9015, so I have no information yet about other 5.15.y versions.
It seems to be IOMMU related, but it does not seem to be related to these patches:
5.15.181 iommu-amd-return-an-error-if-vcpu-affinity-is-set-fo.patch
5.15.182 iommu-amd-fix-potential-buffer-overflow-in-parse_ivrs_acpihid.patch
dmesg output:
[ 4.658313] Trying to unpack rootfs image as initramfs...
[ 4.663349] BUG: kernel NULL pointer dereference, address: 0000000000000030
[ 4.664346] #PF: supervisor read access in kernel mode
[ 4.664346] #PF: error_code(0x0000) - not-present page
[ 4.664346] PGD 0
[ 4.664346] Oops: 0000 [#1] SMP NOPTI
[ 4.664346] CPU: 8 PID: 1 Comm: swapper/0 Not tainted 5.15.179-1.el9.x86_64 #1
[ 4.664346] Hardware name: Dell Inc. PowerEdge R7715/0KRFPX, BIOS 1.1.2 02/20/2025
[ 4.664346] RIP: 0010:sysfs_add_link_to_group+0x12/0x60
[ 4.664346] Code: cb ff ff 48 89 ef 5d 41 5c e9 ca b4 ff ff 5d 41 5c c3 cc cc cc cc 66 90 0f 1f 44 00 00 41 55 49 89 cd 41 54 49 89 d4 31 d2 55 <48> 8b 7f 30 e8 a5 b2 ff ff 48 85 c0 74 29 48 89 c5 4c 89 e6 48 89
[ 4.664346] RSP: 0018:ff3f20b800047c28 EFLAGS: 00010246
[ 4.664346] RAX: 0000000000000001 RBX: ff25a0fc800530a8 RCX: ff25a0fc82cdb410
[ 4.664346] RDX: 0000000000000000 RSI: ffffffff904726e7 RDI: 0000000000000000
[ 4.664346] RBP: ff25a0fc801320d0 R08: ff3f20b800047d00 R09: ff3f20b800047d00
[ 4.664346] R10: 0720072007200720 R11: 0720072007200720 R12: ff25a0fc801320d0
[ 4.664346] R13: ff25a0fc82cdb410 R14: ff3f20b800047d00 R15: 0000000000000000
[ 4.664346] FS: 0000000000000000(0000) GS:ff25a10c1d400000(0000) knlGS:0000000000000000
[ 4.664346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.664346] CR2: 0000000000000030 CR3: 0000001036a10001 CR4: 0000000000771ee0
[ 4.664346] PKRU: 55555554
[ 4.664346] Call Trace:
[ 4.664346] <TASK>
[ 4.664346] ? show_trace_log_lvl+0x1c1/0x2d9
[ 4.664346] ? show_trace_log_lvl+0x1c1/0x2d9
[ 4.664346] ? iommu_device_link+0x3f/0xb0
[ 4.664346] ? __die_body.cold+0x8/0xd
[ 4.664346] ? page_fault_oops+0xac/0x140
[ 4.664346] ? exc_page_fault+0x62/0x130
[ 4.664346] ? asm_exc_page_fault+0x22/0x30
[ 4.664346] ? sysfs_add_link_to_group+0x12/0x60
[ 4.664346] iommu_device_link+0x3f/0xb0
[ 4.664346] __iommu_probe_device+0x188/0x260
[ 4.664346] ? __iommu_probe_device+0x260/0x260
[ 4.664346] probe_iommu_group+0x31/0x40
[ 4.664346] bus_for_each_dev+0x75/0xc0
[ 4.664346] bus_iommu_probe+0x48/0x2c0
[ 4.664346] ? kmem_cache_alloc_trace+0x165/0x290
[ 4.664346] ? __cond_resched+0x16/0x50
[ 4.664346] bus_set_iommu+0x8c/0xe0
[ 4.664346] amd_iommu_init_api+0x18/0x34
[ 4.664346] amd_iommu_init_pci+0x56/0x21c
[ 4.664346] ? e820__memblock_setup+0x7d/0x7d
[ 4.664346] state_next+0x19a/0x2d4
[ 4.664346] ? blake2s_update+0x48/0xc0
[ 4.664346] ? e820__memblock_setup+0x7d/0x7d
[ 4.664346] iommu_go_to_state+0x24/0x2c
[ 4.664346] amd_iommu_init+0xf/0x29
[ 4.664346] pci_iommu_init+0x16/0x43
[ 4.664346] do_one_initcall+0x41/0x1d0
[ 4.664346] do_initcalls+0xc6/0xdf
[ 4.664346] kernel_init_freeable+0x14e/0x19d
[ 4.664346] ? rest_init+0xc0/0xc0
[ 4.664346] kernel_init+0x16/0x130
[ 4.664346] ret_from_fork+0x1f/0x30
[ 4.664346] </TASK>
[ 4.664346] Modules linked in:
[ 4.664346] CR2: 0000000000000030
[ 4.664346] ---[ end trace 9672514da279163d ]---
[ 4.664346] RIP: 0010:sysfs_add_link_to_group+0x12/0x60
[ 4.664346] Code: cb ff ff 48 89 ef 5d 41 5c e9 ca b4 ff ff 5d 41 5c c3 cc cc cc cc 66 90 0f 1f 44 00 00 41 55 49 89 cd 41 54 49 89 d4 31 d2 55 <48> 8b 7f 30 e8 a5 b2 ff ff 48 85 c0 74 29 48 89 c5 4c 89 e6 48 89
[ 4.664346] RSP: 0018:ff3f20b800047c28 EFLAGS: 00010246
[ 4.664346] RAX: 0000000000000001 RBX: ff25a0fc800530a8 RCX: ff25a0fc82cdb410
[ 4.664346] RDX: 0000000000000000 RSI: ffffffff904726e7 RDI: 0000000000000000
[ 4.664346] RBP: ff25a0fc801320d0 R08: ff3f20b800047d00 R09: ff3f20b800047d00
[ 4.664346] R10: 0720072007200720 R11: 0720072007200720 R12: ff25a0fc801320d0
[ 4.664346] R13: ff25a0fc82cdb410 R14: ff3f20b800047d00 R15: 0000000000000000
[ 4.664346] FS: 0000000000000000(0000) GS:ff25a10c1d400000(0000) knlGS:0000000000000000
[ 4.664346] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4.664346] CR2: 0000000000000030 CR3: 0000001036a10001 CR4: 0000000000771ee0
[ 4.664346] PKRU: 55555554
[ 4.664346] Kernel panic - not syncing: Fatal exception
[ 4.664346] Rebooting in 15 seconds..
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2025/05/18
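A note on reading the oops: the bytes marked `<48> 8b 7f 30` on the `Code:` line are the faulting instruction. The sketch below is only an illustration (the helper names `faulting_bytes`, `reg_value` and `decode_mov_load` are made up here, and the decoder handles just this one encoding). It decodes those bytes and checks that base register plus displacement equals the reported CR2. Under the x86-64 calling convention RDI carries the first argument, so the result suggests that `sysfs_add_link_to_group()` was probably called with a NULL first argument from `iommu_device_link()`; that reading is an inference from the dump, not something confirmed in this thread.

```python
import re

# Excerpt of the oops reported above (Code line shortened to its tail).
OOPS = """\
BUG: kernel NULL pointer dereference, address: 0000000000000030
RIP: 0010:sysfs_add_link_to_group+0x12/0x60
Code: 41 54 49 89 d4 31 d2 55 <48> 8b 7f 30 e8 a5 b2 ff ff
RDX: 0000000000000000 RSI: ffffffff904726e7 RDI: 0000000000000000
CR2: 0000000000000030 CR3: 0000001036a10001 CR4: 0000000000771ee0
"""

REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(oops: str) -> bytes:
    """Return the code bytes starting at the <xx> marker (the faulting RIP)."""
    tokens = re.search(r"Code: (.*)", oops).group(1).split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:])


def reg_value(oops: str, name: str) -> int:
    """Read a 64-bit register value such as RDI or CR2 from the dump."""
    return int(re.search(rf"\b{name}: ([0-9a-f]{{16}})", oops).group(1), 16)


def decode_mov_load(code: bytes):
    """Decode `REX.W 8B /r` with mod=01, i.e. mov disp8(%base),%dest."""
    rex, opcode, modrm, disp8 = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("only 64-bit MOV r64, r/m64 is handled")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only [base + disp8] without a SIB byte is handled")
    disp = disp8 - 256 if disp8 >= 128 else disp8  # disp8 is signed
    return REGS64[reg], REGS64[rm], disp


dest, base, disp = decode_mov_load(faulting_bytes(OOPS))
fault_addr = reg_value(OOPS, base.upper()) + disp
print(f"mov {disp:#x}(%{base}),%{dest} reads address {fault_addr:#x}")
```

Since RDI is zero, the load from offset 0x30 lands exactly on the address reported in CR2.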
On 5/18/2025 12:24 PM, Wang Yugui wrote:
> Hi,
> The IOMMU code crashes when kernel 5.15.179 boots on a DELL R7715/AMD 9015, but kernels 6.1.133 and 6.6.86 boot fine.
> This is the first time I have booted a 5.15.y kernel on the DELL R7715/AMD 9015, so I have no information yet about other 5.15.y versions.
> It seems to be IOMMU related, but it does not seem to be related to these patches:
> 5.15.181 iommu-amd-return-an-error-if-vcpu-affinity-is-set-fo.patch
> 5.15.182 iommu-amd-fix-potential-buffer-overflow-in-parse_ivrs_acpihid.patch
Hi,
Could you please provide more details, such as:
1. What distro you are using
2. Steps to reproduce
3. Kernel config
4. Hardware details about the machine
Also, have you tried bisecting the kernel (between 5.15.y and 6.1.y)? It can help find the commit that fixes the issue.
Thanks
Sairaj
Hi,
> On 5/18/2025 12:24 PM, Wang Yugui wrote:
>> Hi,
>> The IOMMU code crashes when kernel 5.15.179 boots on a DELL R7715/AMD 9015, but kernels 6.1.133 and 6.6.86 boot fine.
>> This is the first time I have booted a 5.15.y kernel on the DELL R7715/AMD 9015, so I have no information yet about other 5.15.y versions.
>> It seems to be IOMMU related, but it does not seem to be related to these patches:
>> 5.15.181 iommu-amd-return-an-error-if-vcpu-affinity-is-set-fo.patch
>> 5.15.182 iommu-amd-fix-potential-buffer-overflow-in-parse_ivrs_acpihid.patch
> Hi,
> Could you please provide more details, such as:
> - What distro you are using
> - Steps to reproduce
> - Kernel config
> - Hardware details about the machine
> Also, have you tried bisecting the kernel (between 5.15.y and 6.1.y)? It can help find the commit that fixes the issue.
More test results:
5.16.20: same boot panic
5.17.0: same boot panic
5.18.0: boots OK
So this AMD IOMMU problem was fixed by some patch that went into 5.18.0, i.e. between 5.17.0 and 5.18.0.
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2025/05/26
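The narrowing above is the same search a bisection performs, only run in the "find the fix" direction rather than the usual "find the regression" direction. As a minimal sketch of that logic (the function `first_fixed` and the `REPORTED` table are illustrative names, and the tested releases are treated as one linear history, which is a simplification because stable branches diverge from mainline):

```python
from typing import Callable, Sequence


def first_fixed(candidates: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Return the earliest candidate for which is_fixed() is true.

    Assumes candidates are ordered oldest to newest, the first one is
    broken, the last one is fixed, and the fix is never reverted.
    """
    lo, hi = 0, len(candidates) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(candidates[mid]):
            hi = mid  # fix is at mid or earlier
        else:
            lo = mid  # fix is after mid
    return candidates[hi]


# Boot results reported in this thread (True = boots, False = panics).
REPORTED = {
    "5.15.179": False,
    "5.16.20": False,
    "5.17.0": False,
    "5.18.0": True,
    "6.1.133": True,
    "6.6.86": True,
}

versions = list(REPORTED)
print(first_fixed(versions, REPORTED.__getitem__))
```

On a real kernel tree the candidates would be the commits between v5.17 and v5.18, and each `is_fixed` check would be a build and boot test. `git bisect start --term-old broken --term-new fixed` allows the two ends to be labelled that way, so the search converges on the first fixed commit instead of the first bad one.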
> Thanks
> Sairaj
linux-stable-mirror@lists.linaro.org