On Wed, 9 Jan 2019 16:13:42 +0100 Frederic Barrat <fbarrat@linux.ibm.com> wrote:
With a recent change around IOMMU group, a system with an opencapi adapter is no longer booting and we get a kernel oops:
BUG: Kernel NULL pointer dereference at 0x00000028
Faulting instruction address: 0xc0000000000aa38c
Oops: Kernel access of bad area, sig: 7 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 5 PID: 1 Comm: swapper/4 Not tainted 5.0.0-rc1-fxb-00001-g3bd6e94bec12
NIP: c0000000000aa38c LR: c0000000000a6608 CTR: c000000000097480
REGS: c000000005783700 TRAP: 0300 Not tainted (5.0.0-rc1-fxb-00001-g3bd6
MSR: 9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE> CR: 28000228 XER: 20
CFAR: c0000000000a6604 DAR: 0000000000000028 DSISR: 00080000 IRQMASK: 0
GPR00: c0000000000a6608 c000000005783990 c000000001036100 c0000007bf761860
GPR04: 0000000000000000 c000000005783834 0000000000000000 0000000000000000
GPR08: 69626d2c6e707500 0000000000000000 0000000000000000 9000000002001003
GPR12: 0000000000000000 c0000007bfff8300 c000000000010450 0000000000000000
GPR16: c000000000ced938 0000000000000100 c000000000ced948 00000000000a0000
GPR20: 00000000000bfffe c000000000ced9a8 0000000000000200 c000000000ced978
GPR24: 00000000006080c0 c000000716d09828 c00000002e6fd000 0000000000000000
GPR28: c0000007bf4aff68 c0000007bf8d0080 c000000000f23938 c0000007bf761860
NIP [c0000000000aa38c] pnv_try_setup_npu_table_group+0x1c/0x1a0
LR [c0000000000a6608] pnv_pci_ioda_fixup+0x1f8/0x660
Call Trace:
[c000000005783990] [c0000000000aa3d0] pnv_try_setup_npu_table_group+0x60/0x
[c0000000057839d0] [c0000000000a661c] pnv_pci_ioda_fixup+0x20c/0x660
[c000000005783ab0] [c000000000e1d4c0] pcibios_resource_survey+0x2c8/0x31c
[c000000005783b90] [c000000000e1caf4] pcibios_init+0xb0/0xe4
[c000000005783c10] [c000000000010054] do_one_initcall+0x64/0x264
[c000000005783ce0] [c000000000e1132c] kernel_init_freeable+0x36c/0x468
[c000000005783db0] [c000000000010474] kernel_init+0x2c/0x148
[c000000005783e20] [c00000000000b794] ret_from_kernel_thread+0x5c/0x68
An opencapi device is using a device PE, so the current code breaks because pe->pbus is not defined.
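As a side note on why the oops reports 0x28 rather than 0: when a structure pointer is NULL, reading one of its members faults at an address equal to that member's offset. The sketch below only illustrates that arithmetic. `struct fake_bus`, `struct fake_pe` and `faulting_address()` are hypothetical names with a made-up layout, not the kernel's `struct pci_bus` or `struct pnv_ioda_pe`, and the thread does not say which member was actually being read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the kernel's struct pci_bus. */
struct fake_bus {
	uint64_t pad[5];	/* 5 * 8 = 40 = 0x28 bytes of earlier members */
	uint64_t member;	/* stand-in for whatever field was read */
};

/* Hypothetical PE: a device PE has no bus, so pbus stays NULL. */
struct fake_pe {
	struct fake_bus *pbus;
};

/*
 * Address the CPU would try to load for pe->pbus->member.
 * With pbus == NULL this is just the member's offset.
 */
static uintptr_t faulting_address(const struct fake_pe *pe)
{
	assert(pe != NULL);
	return (uintptr_t)pe->pbus + offsetof(struct fake_bus, member);
}
```

Called with a PE whose `pbus` is NULL, `faulting_address()` gives 0x28 for this made-up layout, which is the kind of small, non-zero address seen in the DAR field of the trace.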
More generally, there's no need to define an IOMMU group for opencapi, as the device sends real addresses directly (admittedly, the virtualization story is yet to be written). So let's fix it by skipping the IOMMU group setup for opencapi PHBs.
Current plan is to go for mediated VFIO. The real HW stays under the control of the host ocxl driver, and we still don't need an IOMMU group.
Fixes: 0bd971676e68 ("powerpc/powernv/npu: Add compound IOMMU groups")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
and
Cc: stable@vger.kernel.org # v4.20
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 1d6406a051f1..7db3119f8a5b 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2681,7 +2681,8 @@ static void pnv_pci_ioda_setup_iommu_api(void)
 	list_for_each_entry(hose, &hose_list, list_node) {
 		phb = hose->private_data;
 
-		if (phb->type == PNV_PHB_NPU_NVLINK)
+		if (phb->type == PNV_PHB_NPU_NVLINK ||
+		    phb->type == PNV_PHB_NPU_OCAPI)
 			continue;
 
 		list_for_each_entry(pe, &phb->ioda.pe_list, list) {
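For readers without the kernel tree at hand, the effect of the hunk can be sketched in standalone C. This is a simplified model under stated assumptions, not the kernel code: the types `enum phb_type`, `struct bus`, `struct pe`, `struct phb` and the function `setup_groups()` are hypothetical stand-ins, and plain arrays replace the kernel's linked lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PHB types named in the patch. */
enum phb_type { PHB_IODA2, PHB_NPU_NVLINK, PHB_NPU_OCAPI };

struct bus {
	int number;
};

struct pe {
	struct bus *pbus;	/* NULL for a device PE, as with opencapi */
};

struct phb {
	enum phb_type type;
	struct pe *pes;
	size_t nr_pes;
};

/*
 * Walk every PHB and count the PEs that would get an IOMMU group.
 * NVLink and opencapi PHBs are skipped up front, so the bus-based
 * code below never sees a PE whose pbus is NULL.
 */
static size_t setup_groups(const struct phb *phbs, size_t nr_phbs)
{
	size_t groups = 0;

	for (size_t i = 0; i < nr_phbs; i++) {
		const struct phb *phb = &phbs[i];

		if (phb->type == PHB_NPU_NVLINK ||
		    phb->type == PHB_NPU_OCAPI)
			continue;

		for (size_t j = 0; j < phb->nr_pes; j++) {
			const struct pe *pe = &phb->pes[j];

			/* Bus-based setup: only valid when pbus is set. */
			assert(pe->pbus != NULL);
			(void)pe->pbus->number;
			groups++;
		}
	}
	return groups;
}
```

Without the `PHB_NPU_OCAPI` test in the `if`, a PHB holding a device PE would reach the inner loop and trip on the NULL `pbus`, which is the situation the patch avoids.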
On 09/01/2019 at 17:25, Greg Kurz wrote:
Reviewed-by: Greg Kurz <groug@kaod.org>
and
Cc: stable@vger.kernel.org # v4.20
Thanks for the review! But why did you add stable? That problem is only seen on 5.0-rc1, isn't it?
Fred
On Wed, 9 Jan 2019 17:45:53 +0100 Frederic Barrat <fbarrat@linux.ibm.com> wrote:
Thanks for the review! But why did you add stable? That problem is only seen on 5.0-rc1, isn't it?
Based on the fact that 0bd971676e68 was committed in 4.20... but I haven't tested :)
Greg Kurz <groug@kaod.org> writes:
Based on the fact that 0bd971676e68 was committed in 4.20... but I haven't tested :)
It was committed to a branch based off 4.20-rc2, but it wasn't merged into the 4.20 release.
$ git describe --match "v[0-9]*" --contains 0bd971676e68
v5.0-rc1~137^2~15
So it doesn't need to go to stable.
cheers
On Thu, 10 Jan 2019 23:25:11 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote:
So it doesn't need to go to stable.
Yeah I realized that afterwards, sorry for the noise and Happy New Year :)
On 10/01/2019 at 13:25, Michael Ellerman wrote:
So it doesn't need to go to stable.
Which makes me wonder if Greg (KH) was really talking about that original patch and whether something worthwhile was dropped from stable by mistake?
Fred
On Thu, Jan 10, 2019 at 01:58:31PM +0100, Frederic Barrat wrote:
Which makes me wonder if Greg (KH) was really talking about that original patch and whether something worthwhile was dropped from stable by mistake?
Totally different thread, sorry for the noise, my fault...
greg k-h
On Wed, Jan 09, 2019 at 05:45:53PM +0100, Frederic Barrat wrote:
Thanks for the review! But why did you add stable? That problem is only seen on 5.0-rc1, isn't it?
No, this is fixing a patch that got backported to stable.
Well, attempted to be backported, I dropped it because of the problem :)
thanks,
greg k-h
linux-stable-mirror@lists.linaro.org