ich7_lpc_probe() uses pci_read_config_dword() that returns PCIBIOS_*
codes. The error handling code assumes incorrectly it's a normal errno
and checks for < 0. The return code is returned from the probe function
as is but probe functions should return normal errnos.
Remove < 0 from the check and convert PCIBIOS_* returns code using
pcibios_err_to_errno() into normal errno before returning it.
Fixes: a328e95b82c1 ("leds: LED driver for Intel NAS SS4200 series (v5)")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
---
drivers/leds/leds-ss4200.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/leds/leds-ss4200.c b/drivers/leds/leds-ss4200.c
index fcaa34706b6c..2ef9fc7371bd 100644
--- a/drivers/leds/leds-ss4200.c
+++ b/drivers/leds/leds-ss4200.c
@@ -356,8 +356,10 @@ static int ich7_lpc_probe(struct pci_dev *dev,
nas_gpio_pci_dev = dev;
status = pci_read_config_dword(dev, PMBASE, &g_pm_io_base);
- if (status)
+ if (status) {
+ status = pcibios_err_to_errno(status);
goto out;
+ }
g_pm_io_base &= 0x00000ff80;
status = pci_read_config_dword(dev, GPIO_CTRL, &gc);
@@ -369,8 +371,9 @@ static int ich7_lpc_probe(struct pci_dev *dev,
}
status = pci_read_config_dword(dev, GPIO_BASE, &nas_gpio_io_base);
- if (0 > status) {
+ if (status) {
dev_info(&dev->dev, "Unable to read GPIOBASE.\n");
+ status = pcibios_err_to_errno(status);
goto out;
}
dev_dbg(&dev->dev, ": GPIOBASE = 0x%08x\n", nas_gpio_io_base);
--
2.39.2
This bug was found by syzkaller.
The reproducer and the detailed warning log can be viewed here [1]
[1] https://lore.kernel.org/bpf/20240129091746.260538-1-kovalev@altlinux.org/#t
Capability cap_sys_admin is required for reproduce and on kernels with panic_on_warn
enabled it will cause the system to crash.
v2:
Added an additional patch that fixes a build error by the clang compiler.
To solve the problem, it is proposed to backport the following commits:
[PATCH v2 5.10.y 1/2] bpf: Convert BPF_DISPATCHER to use static_call() (not
[PATCH v2 5.10.y 2/2] bpf: Add explicit cast to 'void *' for
This bug was found by syzkaller.
The reproducer and the detailed warning log can be viewed here [1]
[1] https://lore.kernel.org/bpf/20240129091746.260538-1-kovalev@altlinux.org/#t
Capability cap_sys_admin is required for reproduce and on kernels with panic_on_warn
enabled it will cause the system to crash.
v2:
Added an additional patch that fixes a build error by the clang compiler.
To solve the problem, it is proposed to backport the following commits:
[PATCH v2 5.15.y 1/2] bpf: Convert BPF_DISPATCHER to use static_call() (not
[PATCH v2 5.15.y 2/2] bpf: Add explicit cast to 'void *' for
Commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
dropped the board_ahci_low_power board type, and instead enables LPM if:
-The AHCI controller reports that it supports LPM (Partial/Slumber), and
-CONFIG_SATA_MOBILE_LPM_POLICY != 0, and
-The port is not defined as external in the per port PxCMD register, and
-The port is not defined as hotplug capable in the per port PxCMD
register.
Partial and Slumber LPM states can either be initiated by HIPM or DIPM.
For HIPM (host initiated power management) to get enabled, both the AHCI
controller and the drive have to report that they support HIPM.
For DIPM (device initiated power management) to get enabled, only the
drive has to report that it supports DIPM. However, the HBA will reject
device requests to enter LPM states which the HBA does not support.
The problem is that Apacer AS340 drives do not handle low power modes
correctly. The problem was most likely not seen before because no one
had used this drive with a AHCI controller with LPM enabled.
Add a quirk so that we do not enable LPM for this drive, since we see
command timeouts if we do (even though the drive claims to support DIPM).
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Cc: stable(a)vger.kernel.org
Reported-by: Tim Teichmann <teichmanntim(a)outlook.de>
Closes: https://lore.kernel.org/linux-ide/87bk4pbve8.ffs@tglx/
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
On the system reporting this issue, the HBA supports SALP (HIPM) and
LPM states Partial and Slumber.
This drive only supports DIPM but not HIPM, however, that should not
matter, as a DIPM request from the device still has to be acked by the
HBA, and according to AHCI 1.3.1, section 5.3.2.11 P:Idle, if the link
layer has negotiated to low power state based on device power management
request, the HBA will jump to state PM:LowPower.
In PM:LowPower, the HBA will automatically request to wake the link
(exit from Partial/Slumber) when a new command is queued (by writing to
PxCI). Thus, there should be no need for host software to request an
explicit wakeup (by writing PxCMD.ICC to 1).
Therefore, even with only DIPM supported/enabled, we shouldn't see command
timeouts with the current code. Also, only enabling only DIPM (by
modifying the AHCI driver) with another drive (which support both DIPM
and HIPM), shows no errors. Thus, it seems like the drive is the problem.
drivers/ata/libata-core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..25b400f1c3de 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4155,6 +4155,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
ATA_HORKAGE_ZERO_AFTER_TRIM |
ATA_HORKAGE_NOLPM },
+ /* Apacer models with LPM issues */
+ { "Apacer AS340*", NULL, ATA_HORKAGE_NOLPM },
+
/* These specific Samsung models/firmware-revs do not handle LPM well */
{ "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM },
{ "SAMSUNG SSD PM830 mSATA *", "CXM13D1Q", ATA_HORKAGE_NOLPM },
--
2.45.1
Commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
dropped the board_ahci_low_power board type, and instead enables LPM if:
-The AHCI controller reports that it supports LPM (Partial/Slumber), and
-CONFIG_SATA_MOBILE_LPM_POLICY != 0, and
-The port is not defined as external in the per port PxCMD register, and
-The port is not defined as hotplug capable in the per port PxCMD
register.
Partial and Slumber LPM states can either be initiated by HIPM or DIPM.
For HIPM (host initiated power management) to get enabled, both the AHCI
controller and the drive have to report that they support HIPM.
For DIPM (device initiated power management) to get enabled, only the
drive has to report that it supports DIPM. However, the HBA will reject
device requests to enter LPM states which the HBA does not support.
The problem is that AMD Radeon S3 SSD drives do not handle low power modes
correctly. The problem was most likely not seen before because no one
had used this drive with a AHCI controller with LPM enabled.
Add a quirk so that we do not enable LPM for this drive, since we see
command timeouts if we do (even though the drive claims to support DIPM).
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Cc: stable(a)vger.kernel.org
Reported-by: Doru Iorgulescu <doru.iorgulescu1(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218832
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
drivers/ata/libata-core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..6c4b69d34aa1 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4155,6 +4155,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
ATA_HORKAGE_ZERO_AFTER_TRIM |
ATA_HORKAGE_NOLPM },
+ /* AMD Radeon devices with broken LPM support */
+ { "R3SL240G", NULL, ATA_HORKAGE_NOLPM },
+
/* These specific Samsung models/firmware-revs do not handle LPM well */
{ "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM },
{ "SAMSUNG SSD PM830 mSATA *", "CXM13D1Q", ATA_HORKAGE_NOLPM },
--
2.45.1
Commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
dropped the board_ahci_low_power board type, and instead enables LPM if:
-The AHCI controller reports that it supports LPM (Partial/Slumber), and
-CONFIG_SATA_MOBILE_LPM_POLICY != 0, and
-The port is not defined as external in the per port PxCMD register, and
-The port is not defined as hotplug capable in the per port PxCMD
register.
Partial and Slumber LPM states can either be initiated by HIPM or DIPM.
For HIPM (host initiated power management) to get enabled, both the AHCI
controller and the drive have to report that they support HIPM.
For DIPM (device initiated power management) to get enabled, only the
drive has to report that it supports DIPM. However, the HBA will reject
device requests to enter LPM states which the HBA does not support.
The problem is that Crucial CT240BX500SSD1 drives do not handle low power
modes correctly. The problem was most likely not seen before because no
one had used this drive with a AHCI controller with LPM enabled.
Add a quirk so that we do not enable LPM for this drive, since we see
command timeouts if we do (even though the drive claims to support DIPM).
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Cc: stable(a)vger.kernel.org
Reported-by: Aarrayy <lp610mh(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218832
Signed-off-by: Niklas Cassel <cassel(a)kernel.org>
---
On the system reporting this issue, the HBA supports SALP (HIPM) and
LPM states Partial and Slumber.
This drive only supports DIPM but not HIPM, however, that should not
matter, as a DIPM request from the device still has to be acked by the
HBA, and according to AHCI 1.3.1, section 5.3.2.11 P:Idle, if the link
layer has negotiated to low power state based on device power management
request, the HBA will jump to state PM:LowPower.
In PM:LowPower, the HBA will automatically request to wake the link
(exit from Partial/Slumber) when a new command is queued (by writing to
PxCI). Thus, there should be no need for host software to request an
explicit wakeup (by writing PxCMD.ICC to 1).
Therefore, even with only DIPM supported/enabled, we shouldn't see command
timeouts with the current code. Also, only enabling only DIPM (by
modifying the AHCI driver) with another drive (which support both DIPM
and HIPM), shows no errors. Thus, it seems like the drive is the problem.
drivers/ata/libata-core.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..b0ce621fe2a1 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4136,8 +4136,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
{ "PIONEER BD-RW BDR-207M", NULL, ATA_HORKAGE_NOLPM },
{ "PIONEER BD-RW BDR-205", NULL, ATA_HORKAGE_NOLPM },
- /* Crucial BX100 SSD 500GB has broken LPM support */
+ /* Crucial devices with broken LPM support */
{ "CT500BX100SSD1", NULL, ATA_HORKAGE_NOLPM },
+ { "CT240BX500SSD1", NULL, ATA_HORKAGE_NOLPM },
/* 512GB MX100 with MU01 firmware has both queued TRIM and LPM issues */
{ "Crucial_CT512MX100*", "MU01", ATA_HORKAGE_NO_NCQ_TRIM |
--
2.45.1
The subordinate pointer can be null if we are out of bus number. The
function of_pci_prop_intr_map() deferences this pointer without checking
and may crashes the kernel.
This crash can be reproduced by starting a QEMU instance:
qemu-system-riscv64 -machine virt \
-kernel ../build-pci-riscv/arch/riscv/boot/Image \
-append "console=ttyS0 root=/dev/vda" \
-nographic \
-drive "file=root.img,format=raw,id=hd0" \
-device virtio-blk-device,drive=hd0 \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=0 \
-device pcie-pci-bridge,id=br1,bus=rp1
Then hot-add a bridge with
device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
Then the kernel crashes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
[snip]
[<ffffffff804dac82>] of_pci_prop_intr_map+0x104/0x362
[<ffffffff804db262>] of_pci_add_properties+0x382/0x3ca
[<ffffffff804c8228>] of_pci_make_dev_node+0xb6/0x116
[<ffffffff804a6b72>] pci_bus_add_device+0xa8/0xaa
[<ffffffff804a6ba4>] pci_bus_add_devices+0x30/0x6a
[<ffffffff804d3b5c>] shpchp_configure_device+0xa0/0x112
[<ffffffff804d2b3a>] board_added+0xce/0x354
[<ffffffff804d2e9a>] shpchp_enable_slot+0xda/0x30c
[<ffffffff804d336c>] shpchp_pushbutton_thread+0x84/0xa0
NULL check this pointer first before proceeding.
Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
drivers/pci/of_property.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/of_property.c b/drivers/pci/of_property.c
index 5fb516807ba2..c405978a0b7e 100644
--- a/drivers/pci/of_property.c
+++ b/drivers/pci/of_property.c
@@ -199,6 +199,9 @@ static int of_pci_prop_intr_map(struct pci_dev *pdev, struct of_changeset *ocs,
int ret;
u8 pin;
+ if (!pdev->subordinate)
+ return 0;
+
pnode = pci_device_to_OF_node(pdev->bus->self);
if (!pnode)
pnode = pci_bus_to_OF_node(pdev->bus);
--
2.39.2
The subordinate pointer can be null if we are out of bus number. The
function of_pci_prop_bus_range() deferences this pointer without checking
and may crashes the kernel.
This crash can be reproduced by starting a QEMU instance:
qemu-system-riscv64 -machine virt \
-kernel ../build-pci-riscv/arch/riscv/boot/Image \
-append "console=ttyS0 root=/dev/vda" \
-nographic \
-drive "file=root.img,format=raw,id=hd0" \
-device virtio-blk-device,drive=hd0 \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=0 \
-device pcie-pci-bridge,id=br1,bus=rp1
Then hot-add a bridge with
device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
Then the kernel crashes:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
[snip]
[<ffffffff804db234>] of_pci_add_properties+0x34c/0x3c6
[<ffffffff804c8228>] of_pci_make_dev_node+0xb6/0x116
[<ffffffff804a6b72>] pci_bus_add_device+0xa8/0xaa
[<ffffffff804a6ba4>] pci_bus_add_devices+0x30/0x6a
[<ffffffff804d3b5c>] shpchp_configure_device+0xa0/0x112
[<ffffffff804d2b3a>] board_added+0xce/0x354
[<ffffffff804d2e9a>] shpchp_enable_slot+0xda/0x30c
[<ffffffff804d336c>] shpchp_pushbutton_thread+0x84/0xa0
NULL check this pointer first before proceeding.
Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
drivers/pci/of_property.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/of_property.c b/drivers/pci/of_property.c
index c2c7334152bc..5fb516807ba2 100644
--- a/drivers/pci/of_property.c
+++ b/drivers/pci/of_property.c
@@ -91,6 +91,9 @@ static int of_pci_prop_bus_range(struct pci_dev *pdev,
struct of_changeset *ocs,
struct device_node *np)
{
+ if (!pdev->subordinate)
+ return 0;
+
u32 bus_range[] = { pdev->subordinate->busn_res.start,
pdev->subordinate->busn_res.end };
--
2.39.2
pci_dev->subordinate pointer can be NULL if we run out of bus number. The
driver deferences this pointer without checking, and the kernel crashes.
This crash can be reproduced by starting a QEMU instance:
qemu-system-x86_64 -machine pc-q35-2.10 \
-kernel bzImage \
-drive "file=img,format=raw" \
-m 2048 -smp 1 -enable-kvm \
-append "console=ttyS0 root=/dev/sda debug" \
-nographic \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1 \
-device pcie-pci-bridge,id=br1,bus=rp1
Then hot-add a bridge with the QEMU command:
device_add pci-bridge,id=br2,bus=br1,chassis_nr=1,addr=1
Then the kernel crashes:
shpchp 0000:02:01.0: enabling device (0000 -> 0002)
shpchp 0000:02:01.0: enabling bus mastering
BUG: kernel NULL pointer dereference, address: 00000000000000da
[snip]
Call Trace:
<TASK>
? show_regs+0x63/0x70
? __die+0x23/0x70
? page_fault_oops+0x17a/0x480
? shpc_init+0x3fb/0x9d0
? search_module_extables+0x4e/0x80
? shpc_init+0x3fb/0x9d0
? kernelmode_fixup_or_oops+0x9b/0x120
? __bad_area_nosemaphore+0x16e/0x240
? bad_area_nosemaphore+0x11/0x20
? do_user_addr_fault+0x2a3/0x610
? exc_page_fault+0x6d/0x160
? asm_exc_page_fault+0x2b/0x30
? shpc_init+0x3fb/0x9d0
shpc_probe+0x92/0x390
NULL check this pointer first before proceeding. If there is no
secondary bus number, there is no point in initializing this hot-plug
controller, so just bails out.
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org # all
---
This one exists since beginning of git history. So I didn't bother
with a Fixes: tag.
This patch is almost a copy-paste from pciehp
---
drivers/pci/hotplug/shpchp_core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/pci/hotplug/shpchp_core.c b/drivers/pci/hotplug/shpchp_core.c
index 56c7795ed890..14cf9e894201 100644
--- a/drivers/pci/hotplug/shpchp_core.c
+++ b/drivers/pci/hotplug/shpchp_core.c
@@ -262,6 +262,12 @@ static int shpc_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
if (acpi_get_hp_hw_control_from_firmware(pdev))
return -ENODEV;
+ if (!pdev->subordinate) {
+ /* Can happen if we run out of bus numbers during probe */
+ pci_err(pdev, "Hotplug bridge without secondary bus, ignoring\n");
+ return -ENODEV;
+ }
+
ctrl = kzalloc(sizeof(*ctrl), GFP_KERNEL);
if (!ctrl)
goto err_out_none;
--
2.39.2
In function pci_scan_bridge_extend(), if the variable next_busnr gets to
256, "child = pci_find_bus()" will return bus 0 (root bus). Consequently,
we have a circular PCI topology. The scan will then go in circle until the
kernel crashes due to stack overflow.
This can be reproduced with:
qemu-system-x86_64 -machine pc-q35-2.10 \
-kernel bzImage \
-m 2048 -smp 1 -enable-kvm \
-append "console=ttyS0 root=/dev/sda debug" \
-nographic \
-device pcie-root-port,bus=pcie.0,slot=1,id=rp1,bus-reserve=253 \
-device pcie-root-port,bus=pcie.0,slot=2,id=rp2,bus-reserve=0 \
-device pcie-root-port,bus=pcie.0,slot=3,id=rp3,bus-reserve=0
Check if next_busnr "overflow" and bail out if this is the case.
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org # all
---
This bug exists since the beginning of git history. So I didn't bother
tracing beyond git to see which patch introduced this.
---
drivers/pci/probe.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 1325fbae2f28..03caae76337c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1382,6 +1382,9 @@ static int pci_scan_bridge_extend(struct pci_bus *bus, struct pci_dev *dev,
else
next_busnr = max + 1;
+ if (next_busnr == 256)
+ goto out;
+
/*
* Prevent assigning a bus number that already exists.
* This can happen when a bridge is hot-plugged, so in this
--
2.39.2