March 2024 - Linux-stable-mirror

[PATCH v3 09/10] arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for NVMe and modem

by Johan Hovold

There are indications that ASPM L0s is not working very well on this machine so disable it also for the NVMe and modem controllers for now.

Note that this is done as a precaution based on problems with the Wi-Fi on the X13s as well as the NVMe, modem and Wi-Fi on the sc8280xp-crd reference design (the NVMe controller on my X13s does not support L0s and the machine lacks a modem).

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index 9567b82db9a5..057e4d9d3c0f 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -695,6 +695,8 @@ keyboard@68 {
 };
 
 &pcie2a {
+	aspm-no-l0s;
+
 	perst-gpios = <&tlmm 143 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 145 GPIO_ACTIVE_LOW>;
 
@@ -714,6 +716,8 @@ &pcie2a_phy {
 };
 
 &pcie3a {
+	aspm-no-l0s;
+
 	perst-gpios = <&tlmm 151 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 148 GPIO_ACTIVE_LOW>;
 
--
2.43.0


[PATCH v3 08/10] arm64: dts: qcom: sc8280xp-x13s: disable ASPM L0s for Wi-Fi

by Johan Hovold

Enabling ASPM L0s on the Lenovo ThinkPad X13s results in Correctable Errors (BadTLP, Timeout) when accessing the Wi-Fi controller so disable it for now.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
index eb8a16aa233e..9567b82db9a5 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-lenovo-thinkpad-x13s.dts
@@ -734,6 +734,7 @@ &pcie3a_phy {
 
 &pcie4 {
 	max-link-speed = <2>;
+	aspm-no-l0s;
 
 	perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;
--
2.43.0
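For readers unfamiliar with the error names in this message: BadTLP and Timeout are short labels for individual bits in the PCIe AER Correctable Error Status register. The standalone C sketch below decodes only those two bits. The mask values are the ones defined in include/uapi/linux/pci_regs.h; the helper cor_status_describe() and its lookup table are hypothetical names made up for this illustration and are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Correctable Error Status bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_ERR_COR_BAD_TLP	0x00000040	/* Bad TLP Status */
#define PCI_ERR_COR_REP_TIMER	0x00001000	/* Replay Timer Timeout */

/* Hypothetical lookup table: only the two errors named in the message */
struct cor_err_name {
	uint32_t mask;
	const char *name;
};

static const struct cor_err_name cor_err_names[] = {
	{ PCI_ERR_COR_BAD_TLP, "BadTLP" },
	{ PCI_ERR_COR_REP_TIMER, "Timeout" },
};

/*
 * Hypothetical helper: write a space-separated list of the known error
 * names that are set in @status into @buf and return how many names were
 * written. Names that do not fit into @buf are dropped.
 */
static int cor_status_describe(uint32_t status, char *buf, size_t len)
{
	size_t used = 0;
	int matched = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (size_t i = 0; i < sizeof(cor_err_names) / sizeof(cor_err_names[0]); i++) {
		size_t name_len = strlen(cor_err_names[i].name);
		size_t need = name_len + (matched ? 1 : 0);

		if (!(status & cor_err_names[i].mask))
			continue;
		if (used + need >= len)
			break;		/* out of space, stop here */
		if (matched)
			buf[used++] = ' ';
		memcpy(buf + used, cor_err_names[i].name, name_len);
		used += name_len;
		buf[used] = '\0';
		matched++;
	}

	return matched;
}
```

With a 32-byte buffer, a status value of 0x00001040 yields the string "BadTLP Timeout" and a return value of 2.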


[PATCH v3 07/10] arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for modem and Wi-Fi

by Johan Hovold

There are indications that ASPM L0s is not working very well on this machine so disable it also for the modem and Wi-Fi controllers for now.

This specifically avoids having the modem and Wi-Fi controllers bounce in and out of L0s when not used (the modem still bounces in and out of L1) as well as intermittent Correctable Errors on the Wi-Fi link when not used.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
index 7e94a68d5d9f..8fc0380f65a0 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
@@ -546,6 +546,8 @@ &pcie2a_phy {
 };
 
 &pcie3a {
+	aspm-no-l0s;
+
 	perst-gpios = <&tlmm 151 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 148 GPIO_ACTIVE_LOW>;
 
@@ -566,6 +568,7 @@ &pcie3a_phy {
 
 &pcie4 {
 	max-link-speed = <2>;
+	aspm-no-l0s;
 
 	perst-gpios = <&tlmm 141 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 139 GPIO_ACTIVE_LOW>;
--
2.43.0


[PATCH v3 06/10] arm64: dts: qcom: sc8280xp-crd: disable ASPM L0s for NVMe

by Johan Hovold

Enabling ASPM L0s on the CRD results in a large amount of Correctable Errors (Timeout) when accessing the NVMe controller so disable it for now.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Reviewed-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 arch/arm64/boot/dts/qcom/sc8280xp-crd.dts | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
index 41215567b3ae..7e94a68d5d9f 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
+++ b/arch/arm64/boot/dts/qcom/sc8280xp-crd.dts
@@ -525,6 +525,8 @@ keyboard@68 {
 };
 
 &pcie2a {
+	aspm-no-l0s;
+
 	perst-gpios = <&tlmm 143 GPIO_ACTIVE_LOW>;
 	wake-gpios = <&tlmm 145 GPIO_ACTIVE_LOW>;
 
--
2.43.0


[PATCH v3 05/10] arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP

by Johan Hovold

Add the missing PCIe CX performance level votes to avoid relying on other drivers (e.g. USB or UFS) to maintain the nominal performance level required for Gen3 speeds.

Fixes: 813e83157001 ("arm64: dts: qcom: sc8280xp/sa8540p: add PCIe2-4 nodes")
Cc: stable(a)vger.kernel.org # 6.2
Reviewed-by: Konrad Dybcio <konrad.dybcio(a)linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
index c8e84c53935c..424d143ee26a 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
@@ -1780,6 +1780,7 @@ pcie4: pcie@1c00000 {
 			reset-names = "pci";
 
 			power-domains = <&gcc PCIE_4_GDSC>;
+			required-opps = <&rpmhpd_opp_nom>;
 
 			phys = <&pcie4_phy>;
 			phy-names = "pciephy";
@@ -1878,6 +1879,7 @@ pcie3b: pcie@1c08000 {
 			reset-names = "pci";
 
 			power-domains = <&gcc PCIE_3B_GDSC>;
+			required-opps = <&rpmhpd_opp_nom>;
 
 			phys = <&pcie3b_phy>;
 			phy-names = "pciephy";
@@ -1976,6 +1978,7 @@ pcie3a: pcie@1c10000 {
 			reset-names = "pci";
 
 			power-domains = <&gcc PCIE_3A_GDSC>;
+			required-opps = <&rpmhpd_opp_nom>;
 
 			phys = <&pcie3a_phy>;
 			phy-names = "pciephy";
@@ -2077,6 +2080,7 @@ pcie2b: pcie@1c18000 {
 			reset-names = "pci";
 
 			power-domains = <&gcc PCIE_2B_GDSC>;
+			required-opps = <&rpmhpd_opp_nom>;
 
 			phys = <&pcie2b_phy>;
 			phy-names = "pciephy";
@@ -2175,6 +2179,7 @@ pcie2a: pcie@1c20000 {
 			reset-names = "pci";
 
 			power-domains = <&gcc PCIE_2A_GDSC>;
+			required-opps = <&rpmhpd_opp_nom>;
 
 			phys = <&pcie2a_phy>;
 			phy-names = "pciephy";
--
2.43.0


[PATCH v3 04/10] PCI: qcom: Add support for disabling ASPM L0s in devicetree

by Johan Hovold

Commit 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops") started enabling ASPM unconditionally when the hardware claims to support it. This triggers Correctable Errors for some PCIe devices on machines like the Lenovo ThinkPad X13s, which could indicate an incomplete driver ASPM implementation or that the hardware does in fact not support L0s.

Add support for disabling ASPM L0s in the devicetree when it is not supported on a particular machine and controller.

Note that only the 1.9.0 ops enable ASPM currently.

Fixes: 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops")
Cc: stable(a)vger.kernel.org # 6.7
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
 drivers/pci/controller/dwc/pcie-qcom.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 09d485df34b9..0fb5dc06d2ef 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -273,6 +273,25 @@ static int qcom_pcie_start_link(struct dw_pcie *pci)
 	return 0;
 }
 
+static void qcom_pcie_clear_aspm_l0s(struct dw_pcie *pci)
+{
+	u16 offset;
+	u32 val;
+
+	if (!of_property_read_bool(pci->dev->of_node, "aspm-no-l0s"))
+		return;
+
+	offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+	dw_pcie_dbi_ro_wr_en(pci);
+
+	val = readl(pci->dbi_base + offset + PCI_EXP_LNKCAP);
+	val &= ~PCI_EXP_LNKCAP_ASPM_L0S;
+	writel(val, pci->dbi_base + offset + PCI_EXP_LNKCAP);
+
+	dw_pcie_dbi_ro_wr_dis(pci);
+}
+
 static void qcom_pcie_clear_hpc(struct dw_pcie *pci)
 {
 	u16 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
@@ -962,6 +981,7 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
 
 static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
 {
+	qcom_pcie_clear_aspm_l0s(pcie->pci);
 	qcom_pcie_clear_hpc(pcie->pci);
 
 	return 0;
--
2.43.0
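The read-modify-write on the Link Capabilities register in this patch can be tried out in user space. The sketch below is a minimal standalone model, not driver code: it works on a plain 32-bit value instead of the DBI register, and the no_l0s flag stands in for the "aspm-no-l0s" devicetree property. The mask values are the ones from include/uapi/linux/pci_regs.h; the helper names lnkcap_clear_l0s(), lnkcap_has_l0s() and lnkcap_has_l1() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Capabilities ASPM bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_LNKCAP_ASPMS	0x00000c00	/* ASPM Support (both bits) */
#define PCI_EXP_LNKCAP_ASPM_L0S	0x00000400	/* ASPM L0s Support */
#define PCI_EXP_LNKCAP_ASPM_L1	0x00000800	/* ASPM L1 Support */

/* Hypothetical helper: does this LNKCAP value advertise L0s? */
static bool lnkcap_has_l0s(uint32_t lnkcap)
{
	return (lnkcap & PCI_EXP_LNKCAP_ASPM_L0S) != 0;
}

/* Hypothetical helper: does this LNKCAP value advertise L1? */
static bool lnkcap_has_l1(uint32_t lnkcap)
{
	return (lnkcap & PCI_EXP_LNKCAP_ASPM_L1) != 0;
}

/*
 * Hypothetical model of the masking step: when @no_l0s is set (standing in
 * for the "aspm-no-l0s" property), clear only the L0s support bit and leave
 * every other bit, including L1 support, untouched.
 */
static uint32_t lnkcap_clear_l0s(uint32_t lnkcap, bool no_l0s)
{
	uint32_t val = lnkcap;

	if (!no_l0s)
		return val;

	val &= ~(uint32_t)PCI_EXP_LNKCAP_ASPM_L0S;
	assert(!lnkcap_has_l0s(val));

	return val;
}
```

For a value that advertises both states (0x00000c00), the model returns 0x00000800 when the flag is set, so L1 stays advertised while L0s is hidden.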


[PATCH v2 0/3] Support intra-function call validation

by Rui Qi

Since kernel version 5.4.217 LTS, the kernel live patching feature has been unavailable. After building the sample code for kernel live patching and enabling it, the following message is displayed:

livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack

Reproduction steps:
1. git checkout v5.4.269 -b v5.4.269
2. make defconfig
3. Set CONFIG_LIVEPATCH=y, CONFIG_SAMPLE_LIVEPATCH=m
4. make -j bzImage
5. make samples/livepatch/livepatch-sample.ko
6. qemu-system-x86_64 -kernel arch/x86_64/boot/bzImage -nographic -append "console=ttyS0" -initrd initrd.img -m 1024M
7. insmod livepatch-sample.ko

The kernel live patch cannot complete successfully. After some debugging, the immediate cause of the patch failure is an error in stack checking. The logs are as follows:

[ 340.974853] livepatch: klp_check_stack: kworker/u256:0:23486 has an unreliable stack
[ 340.974858] livepatch: klp_check_stack: kworker/u256:1:23487 has an unreliable stack
[ 340.974863] livepatch: klp_check_stack: kworker/u256:2:23488 has an unreliable stack
[ 340.974868] livepatch: klp_check_stack: kworker/u256:5:23489 has an unreliable stack
[ 340.974872] livepatch: klp_check_stack: kworker/u256:6:23490 has an unreliable stack
......

BTW, if you use the v5.4.217 tag for testing, make sure to set CONFIG_RETPOLINE=y and CONFIG_LIVEPATCH=y; the other steps are the same as for v5.4.269.

After investigation, the problem is strongly related to commit 8afd1c7da2b0 ("x86/speculation: Change FILL_RETURN_BUFFER to work with objtool"), which causes incorrect ORC entries to be generated. Reverting this commit on v5.4.217 makes kernel live patching work normally. That commit is a back-ported upstream patch with some code adjustments; in the git log, the author also mentioned that there is no intra-function call validation support.

Based on commit 6e1f54a4985b63bc1b55a09e5e75a974c5d6719b (Linux 5.4.269), this patchset adds stack validation support for intra-function calls, allowing the kernel live patching feature to work correctly.

Alexandre Chartre (2):
  objtool: is_fentry_call() crashes if call has no destination
  objtool: Add support for intra-function calls

Rui Qi (1):
  x86/speculation: Support intra-function call validation

 arch/x86/include/asm/nospec-branch.h          |  7 ++
 include/linux/frame.h                         | 11 ++++
 .../Documentation/stack-validation.txt        |  8 +++
 tools/objtool/arch/x86/decode.c               |  6 ++
 tools/objtool/check.c                         | 64 +++++++++++++++++--
 5 files changed, 91 insertions(+), 5 deletions(-)

--
2.39.2 (Apple Git-143)


RE: [PATCH 3/5] virtio_blk: Fix device surprise removal

by Parav Pandit

> From: Parav Pandit <parav(a)nvidia.com>
> Sent: Tuesday, March 5, 2024 12:06 PM
>
> When the PCI device is surprise removed, requests won't complete from the
> device. These IOs are never completed and disk deletion hangs indefinitely.
>
> Fix it by aborting the IOs which the device will never complete when the VQ is
> broken.
>
> With this fix now fio completes swiftly.
> An alternative of IO timeout has been considered, however timeout can take
> very long time. For unresponsive device, quickly completing the request with
> error enables users and upper layer to react quickly.
>
> Verified with multiple device unplug cycles with pending IOs in virtio used ring
> and some pending with device.
>
> Also verified without surprise removal.
>
> Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device")
> Cc: stable(a)vger.kernel.org
> Reported-by: lirongqing(a)baidu.com
> Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baidu.com/
> Co-developed-by: Chaitanya Kulkarni <kch(a)nvidia.com>
> Signed-off-by: Chaitanya Kulkarni <kch(a)nvidia.com>
> Signed-off-by: Parav Pandit <parav(a)nvidia.com>
> ---

Please ignore this patch; I am still debugging and verifying it. It was incorrectly CCed to stable. I am sorry for the noise.

> changelog:
> v0->v1:
> - addressed comments from Ming and Michael
> - changed the flow to not depend on broken vq anymore to avoid
>   the race of missing the detection
> - flow updated to quiesce request queue and device, followed by
>   syncing with any ongoing interrupt handler or callbacks,
>   finally finishing the requests which are not completed by device
>   with error.
> ---
>  drivers/block/virtio_blk.c | 46 +++++++++++++++++++++++++++++++++++---
>  1 file changed, 43 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 2bf14a0e2815..1956172b4b1a 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -1562,9 +1562,52 @@ static int virtblk_probe(struct virtio_device *vdev)
>  	return err;
>  }
>
> +static bool virtblk_cancel_request(struct request *rq, void *data) {
> +	struct virtblk_req *vbr = blk_mq_rq_to_pdu(rq);
> +
> +	vbr->in_hdr.status = VIRTIO_BLK_S_IOERR;
> +	if (blk_mq_request_started(rq) && !blk_mq_request_completed(rq))
> +		blk_mq_complete_request(rq);
> +
> +	return true;
> +}
> +
>  static void virtblk_remove(struct virtio_device *vdev) {
>  	struct virtio_blk *vblk = vdev->priv;
> +	int i;
> +
> +	/* Block upper layer to not get any new requests */
> +	blk_mq_quiesce_queue(vblk->disk->queue);
> +
> +	mutex_lock(&vblk->vdev_mutex);
> +
> +	/* Stop all the virtqueues and configuration change notification and also
> +	 * synchronize with pending interrupt handlers.
> +	 */
> +	virtio_reset_device(vdev);
> +
> +	mutex_unlock(&vblk->vdev_mutex);
> +
> +	/* Synchronize with any callback handlers for request completion */
> +	for (i = 0; i < vblk->num_vqs; i++)
> +		virtblk_done(vblk->vqs[i].vq);
> +
> +	blk_sync_queue(vblk->disk->queue);
> +
> +	/* At this point block layer and device/transport are quiet;
> +	 * hence, safely complete all the pending requests with error.
> +	 */
> +	blk_mq_tagset_busy_iter(&vblk->tag_set, virtblk_cancel_request, vblk);
> +	blk_mq_tagset_wait_completed_request(&vblk->tag_set);
> +
> +	/*
> +	 * Unblock any pending dispatch I/Os before we destroy device. From
> +	 * del_gendisk() -> __blk_mark_disk_dead(disk) will set GD_DEAD flag,
> +	 * that will make sure any new I/O from bio_queue_enter() to fail.
> +	 */
> +	blk_mq_unquiesce_queue(vblk->disk->queue);
>
>  	/* Make sure no work handler is accessing the device. */
>  	flush_work(&vblk->config_work);
> @@ -1574,9 +1617,6 @@ static void virtblk_remove(struct virtio_device *vdev)
>
>  	mutex_lock(&vblk->vdev_mutex);
>
> -	/* Stop all the virtqueues. */
> -	virtio_reset_device(vdev);
> -
>  	/* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
>  	vblk->vdev = NULL;
>
> --
> 2.34.1
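The author asked for this patch to be ignored, so the following is only a rough user-space model of the idea behind the quoted virtblk_cancel_request(): once the queue and the device are quiet, fail the requests that were started but never completed, and leave the rest alone. Nothing here is block-layer API. The struct, the enum and fake_cancel_pending() are hypothetical, and the status values 0 and 1 are the VIRTIO_BLK_S_OK and VIRTIO_BLK_S_IOERR codes from the virtio block specification. As a simplification, the model sets the error status only on requests it actually completes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Request status codes as defined by the virtio block specification */
#define VIRTIO_BLK_S_OK		0
#define VIRTIO_BLK_S_IOERR	1

/* Hypothetical request lifecycle for this model */
enum fake_rq_state {
	FAKE_RQ_IDLE,		/* never handed to the device */
	FAKE_RQ_IN_FLIGHT,	/* started, no completion seen */
	FAKE_RQ_COMPLETE,	/* already completed */
};

struct fake_request {
	enum fake_rq_state state;
	uint8_t status;
};

/*
 * Hypothetical helper: complete every in-flight request with an I/O error
 * and return how many requests were cancelled. Idle and already completed
 * requests are not modified.
 */
static size_t fake_cancel_pending(struct fake_request *rqs, size_t n)
{
	size_t cancelled = 0;

	assert(rqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (rqs[i].state != FAKE_RQ_IN_FLIGHT)
			continue;
		rqs[i].status = VIRTIO_BLK_S_IOERR;
		rqs[i].state = FAKE_RQ_COMPLETE;
		cancelled++;
	}

	return cancelled;
}
```

Calling the helper a second time on the same array cancels nothing, which mirrors the intent that each request is completed exactly once.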


[PATCH] tee: optee: Fix kernel panic caused by incorrect error handling

by Sumit Garg

The error path while failing to register devices on the TEE bus has a bug leading to kernel panic as follows:

[ 15.398930] Unable to handle kernel paging request at virtual address ffff07ed00626d7c
[ 15.406913] Mem abort info:
[ 15.409722]   ESR = 0x0000000096000005
[ 15.413490]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 15.418814]   SET = 0, FnV = 0
[ 15.421878]   EA = 0, S1PTW = 0
[ 15.425031]   FSC = 0x05: level 1 translation fault
[ 15.429922] Data abort info:
[ 15.432813]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 15.438310]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 15.443372]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 15.448697] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000d9e3e000
[ 15.455413] [ffff07ed00626d7c] pgd=1800000bffdf9003, p4d=1800000bffdf9003, pud=0000000000000000
[ 15.464146] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP

Commit 7269cba53d90 ("tee: optee: Fix supplicant based device enumeration") led to the introduction of this bug. So fix it appropriately.

Reported-by: Mikko Rapeli <mikko.rapeli(a)linaro.org>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218542
Fixes: 7269cba53d90 ("tee: optee: Fix supplicant based device enumeration")
Cc: stable(a)vger.kernel.org
Signed-off-by: Sumit Garg <sumit.garg(a)linaro.org>
---
 drivers/tee/optee/device.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/tee/optee/device.c b/drivers/tee/optee/device.c
index 9d2afac96acc..d296c70ddfdc 100644
--- a/drivers/tee/optee/device.c
+++ b/drivers/tee/optee/device.c
@@ -90,13 +90,14 @@ static int optee_register_device(const uuid_t *device_uuid, u32 func)
 	if (rc) {
 		pr_err("device registration failed, err: %d\n", rc);
 		put_device(&optee_device->dev);
+		return rc;
 	}
 
 	if (func == PTA_CMD_GET_DEVICES_SUPP)
 		device_create_file(&optee_device->dev,
 				   &dev_attr_need_supplicant);
 
-	return rc;
+	return 0;
 }
 
 static int __optee_enumerate_devices(u32 func)
--
2.34.1
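The shape of this fix, returning immediately after dropping the last reference so that later code cannot touch the released object, can be shown with a small user-space model. The sketch below is an assumption-laden illustration and not the OP-TEE driver: struct fake_device, fake_put_device(), fake_register() and register_device_fixed() are all hypothetical stand-ins, and free() plays the role of the device release callback.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a refcounted device object */
struct fake_device {
	int refcount;
	int attr_created;
};

static int released;	/* number of objects freed so far, for inspection */

/* Drop one reference; the object is freed when the count reaches zero */
static void fake_put_device(struct fake_device *dev)
{
	if (--dev->refcount == 0) {
		released++;
		free(dev);
	}
}

/* Hypothetical registration step that can be forced to fail */
static int fake_register(struct fake_device *dev, int fail)
{
	(void)dev;
	return fail ? -ENODEV : 0;
}

/*
 * Model of the fixed flow: on registration failure, drop the reference and
 * return at once. Falling through would touch memory that may already have
 * been freed, which is the class of bug the patch addresses.
 */
static int register_device_fixed(int fail, int want_attr,
				 struct fake_device **out)
{
	struct fake_device *dev;
	int rc;

	assert(out != NULL);
	*out = NULL;

	dev = calloc(1, sizeof(*dev));
	if (!dev)
		return -ENOMEM;
	dev->refcount = 1;

	rc = fake_register(dev, fail);
	if (rc) {
		fake_put_device(dev);	/* dev may be gone after this call */
		return rc;		/* so do not use it again */
	}

	if (want_attr)
		dev->attr_created = 1;	/* only reached with a live object */

	*out = dev;
	return 0;
}
```

In the failing case the caller gets the error code and a NULL pointer back; in the success case it owns one reference and is expected to drop it with fake_put_device().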


Re: [PATCH 6.7 000/162] 6.7.9-rc1 review

by Ronald Warsow

Hi Greg

*no* regressions here on x86_64 (RKL, Intel 11th Gen. CPU)

Thanks

Tested-by: Ronald Warsow <rwarsow(a)gmx.de>

