The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 48991e4935078b05f80616c75d1ee2ea3ae18e58
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025101636-tartar-brethren-067c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 48991e4935078b05f80616c75d1ee2ea3ae18e58 Mon Sep 17 00:00:00 2001
From: Brian Norris briannorris@google.com
Date: Wed, 24 Sep 2025 09:57:11 -0700
Subject: [PATCH] PCI/sysfs: Ensure devices are powered for config reads
The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details.
Wrap these accesses in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes.
Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds").
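The get / read / put ordering described in this commit message can be illustrated outside the kernel. The following is only a userspace sketch under stated assumptions: struct mock_dev, mock_pm_get(), mock_pm_put(), mock_read_width() and mock_width_show() are hypothetical names invented for illustration, and the "power" model is far simpler than what pci_config_pm_runtime_{get,put}() actually does. It shows only why the read has to sit between the two calls and why the error check can happen after the put.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a device; NOT the kernel's struct pci_dev. */
struct mock_dev {
	int usage_count;		/* models a runtime PM usage counter */
	int powered;			/* 1 while the mock config space is reachable */
	unsigned int link_width;	/* value held in the mock config space */
};

/* Take a reference and make sure the mock device is powered. */
void mock_pm_get(struct mock_dev *d)
{
	d->usage_count++;
	d->powered = 1;
}

/* Drop the reference; the mock device powers down once it is unused. */
void mock_pm_put(struct mock_dev *d)
{
	assert(d->usage_count > 0);
	if (--d->usage_count == 0)
		d->powered = 0;
}

/* Mock config read: fails with -1 while the device is powered down. */
int mock_read_width(const struct mock_dev *d, unsigned int *val)
{
	if (!d->powered)
		return -1;
	*val = d->link_width;
	return 0;
}

/* "show" routine using the same get / read / put ordering as the patch. */
int mock_width_show(struct mock_dev *d, char *buf, size_t size)
{
	unsigned int width = 0;
	int err;

	mock_pm_get(d);
	err = mock_read_width(d, &width);
	mock_pm_put(d);

	/* The result was captured while powered, so check it after the put. */
	if (err)
		return -1;
	return snprintf(buf, size, "%u\n", width);
}
```

In this model a bare mock_read_width() on an idle device fails, while mock_width_show() succeeds and leaves the usage count balanced afterwards.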
Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc")
Signed-off-by: Brian Norris briannorris@google.com
Signed-off-by: Brian Norris briannorris@chromium.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a4...
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5eea14c1f7f5..2b231ef1dac9 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -201,8 +201,14 @@ static ssize_t max_link_width_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	ssize_t ret;

-	return sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
+	/* We read PCI_EXP_LNKCAP, so we need the device to be accessible. */
+	pci_config_pm_runtime_get(pdev);
+	ret = sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
+	pci_config_pm_runtime_put(pdev);
+
+	return ret;
 }
 static DEVICE_ATTR_RO(max_link_width);

@@ -214,7 +220,10 @@ static ssize_t current_link_speed_show(struct device *dev,
 	int err;
 	enum pci_bus_speed speed;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -231,7 +240,10 @@ static ssize_t current_link_width_show(struct device *dev,
 	u16 linkstat;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -247,7 +259,10 @@ static ssize_t secondary_bus_number_show(struct device *dev,
 	u8 sec_bus;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pci_read_config_byte(pci_dev, PCI_SECONDARY_BUS, &sec_bus);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -263,7 +278,10 @@ static ssize_t subordinate_bus_number_show(struct device *dev,
 	u8 sub_bus;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pci_read_config_byte(pci_dev, PCI_SUBORDINATE_BUS, &sub_bus);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;
From: Maximilian Luz luzmaximilian@gmail.com
[ Upstream commit 80a129afb75cba8434fc5071bd6919172442315c ]
While PCI power states D0-D3hot can be queried from user-space via lspci, D3cold cannot. lspci cannot provide an accurate value when the device is in D3cold as it has to restore the device to D0 before it can access its power state via the configuration space, leading to it reporting D0 or another on-state. Thus lspci cannot be used to diagnose power consumption issues for devices that can enter D3cold or to ensure that devices properly enter D3cold at all.
Add a new sysfs device attribute for the PCI power state, showing the current power state as seen by the kernel.
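As a rough illustration of how a diagnostic tool might consume the new attribute from user space, here is a minimal sketch. The helper names read_sysfs_attr() and is_documented_power_state() are hypothetical and not part of any kernel or library API; the list of accepted strings is taken from the ABI description added by this patch. The device path is left to the caller, since it depends on the system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a sysfs-style attribute file into 'out' and
 * strip the trailing newline. Returns 0 on success, -1 on any failure.
 * (Hypothetical helper, for illustration only.)
 */
int read_sysfs_attr(const char *path, char *out, size_t size)
{
	FILE *f;
	size_t len;

	assert(path != NULL && out != NULL);
	if (size == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	len = strlen(out);
	if (len > 0 && out[len - 1] == '\n')
		out[len - 1] = '\0';
	return 0;
}

/* Return 1 if 's' is one of the values listed in the ABI description. */
int is_documented_power_state(const char *s)
{
	static const char *const names[] = {
		"unknown", "error", "D0", "D1", "D2", "D3hot", "D3cold",
	};
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strcmp(s, names[i]) == 0)
			return 1;
	return 0;
}
```

A caller would pass the power_state file of the device of interest under /sys/bus/pci/devices/ to read_sysfs_attr(). Because the value is the state as tracked by the kernel, reading it does not require the device to be woken up first.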
[bhelgaas: drop READ_ONCE(), see discussion at the link]
Link: https://lore.kernel.org/r/20201102141520.831630-1-luzmaximilian@gmail.com
Signed-off-by: Maximilian Luz luzmaximilian@gmail.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Stable-dep-of: 48991e493507 ("PCI/sysfs: Ensure devices are powered for config reads")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 Documentation/ABI/testing/sysfs-bus-pci |  9 +++++++++
 drivers/pci/pci-sysfs.c                 | 10 ++++++++++
 2 files changed, 19 insertions(+)
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index da33ab66ddfe7..9d499a126e87f 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -377,3 +377,12 @@ Contact:	Heiner Kallweit hkallweit1@gmail.com
 Description:	If ASPM is supported for an endpoint, these files can be
 		used to disable or enable the individual power management
 		states. Write y/1/on to enable, n/0/off to disable.
+
+What:		/sys/bus/pci/devices/.../power_state
+Date:		November 2020
+Contact:	Linux PCI developers linux-pci@vger.kernel.org
+Description:
+		This file contains the current PCI power state of the device.
+		The value comes from the PCI kernel device state and can be one
+		of: "unknown", "error", "D0", D1", "D2", "D3hot", "D3cold".
+		The file is read only.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index d27bc5a5d2f86..5a9d942198586 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -124,6 +124,15 @@ static ssize_t cpulistaffinity_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(cpulistaffinity);

+static ssize_t power_state_show(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	return sprintf(buf, "%s\n", pci_power_name(pdev->current_state));
+}
+static DEVICE_ATTR_RO(power_state);
+
 /* show resources */
 static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
 			     char *buf)
@@ -603,6 +612,7 @@ static ssize_t driver_override_show(struct device *dev,
 static DEVICE_ATTR_RW(driver_override);

 static struct attribute *pci_dev_attrs[] = {
+	&dev_attr_power_state.attr,
 	&dev_attr_resource.attr,
 	&dev_attr_vendor.attr,
 	&dev_attr_device.attr,
From: Krzysztof Wilczyński kw@linux.com
[ Upstream commit ad025f8e46f3dbf09b1bf8d7a5b4ce858df74544 ]
The sysfs_emit() and sysfs_emit_at() functions were introduced to make it less ambiguous which function is preferred when writing to the output buffer in a device attribute's "show" callback [1].
Convert the PCI sysfs object "show" functions from sprintf(), snprintf() and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the latter is aware of the PAGE_SIZE buffer and correctly returns the number of bytes written into the buffer.
No functional change intended.
[1] Documentation/filesystems/sysfs.rst
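To make the difference between an unbounded sprintf() chain and an offset-based, page-bounded emit concrete, here is a userspace sketch. It is an approximation only: mock_emit_at(), mock_resource_show(), struct mock_res and MOCK_PAGE_SIZE are hypothetical names, the 4096-byte page is an assumption, and the kernel's sysfs_emit_at() performs additional sanity checks that are not modelled here. The point is the return value: the number of bytes actually stored, which is what lets a caller accumulate an offset safely.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MOCK_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Hypothetical resource triple; not the kernel's struct resource. */
struct mock_res {
	unsigned long long start, end, flags;
};

/*
 * Format into 'buf' at offset 'at', never writing past MOCK_PAGE_SIZE.
 * Returns the number of characters actually stored, excluding the NUL.
 */
int mock_emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int room, len;

	assert(buf != NULL);
	if (at < 0 || at >= MOCK_PAGE_SIZE)
		return 0;

	room = MOCK_PAGE_SIZE - at;
	va_start(args, fmt);
	len = vsnprintf(buf + at, (size_t)room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= room)	/* output was truncated to fit the page */
		len = room - 1;
	return len;
}

/* Accumulate one line per resource, as resource_show() does after the patch. */
int mock_resource_show(char *buf, const struct mock_res *res, int count)
{
	int i, len = 0;

	for (i = 0; i < count; i++)
		len += mock_emit_at(buf, len,
				    "0x%016llx 0x%016llx 0x%016llx\n",
				    res[i].start, res[i].end, res[i].flags);
	return len;
}
```

With the old pointer-advancing sprintf() pattern nothing stops the write position from running past the end of the page; with the offset-based form each call clamps itself, so the running total can never exceed the buffer.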
[bhelgaas: drop dsm_label_utf16s_to_utf8s(), link speed/width changes]
Link: https://lore.kernel.org/r/20210416205856.3234481-10-kw@linux.com
Signed-off-by: Krzysztof Wilczyński kw@linux.com
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Stable-dep-of: 48991e493507 ("PCI/sysfs: Ensure devices are powered for config reads")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/pci/pci-label.c | 10 +++---
 drivers/pci/pci-sysfs.c | 72 ++++++++++++++++++++---------------------
 2 files changed, 40 insertions(+), 42 deletions(-)
diff --git a/drivers/pci/pci-label.c b/drivers/pci/pci-label.c
index cd84cf52a92e1..6c406453b49f4 100644
--- a/drivers/pci/pci-label.c
+++ b/drivers/pci/pci-label.c
@@ -62,13 +62,11 @@ static size_t find_smbios_instance_string(struct pci_dev *pdev, char *buf,
 		    donboard->devfn == devfn) {
 			if (buf) {
 				if (attribute == SMBIOS_ATTR_INSTANCE_SHOW)
-					return scnprintf(buf, PAGE_SIZE,
-							 "%d\n",
-							 donboard->instance);
+					return sysfs_emit(buf, "%d\n",
+							  donboard->instance);
 				else if (attribute == SMBIOS_ATTR_LABEL_SHOW)
-					return scnprintf(buf, PAGE_SIZE,
-							 "%s\n",
-							 dmi->name);
+					return sysfs_emit(buf, "%s\n",
+							  dmi->name);
 			}
 			return strlen(dmi->name);
 		}
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5a9d942198586..3fddd421bbe66 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -39,7 +39,7 @@ field##_show(struct device *dev, struct device_attribute *attr, char *buf) \
 	struct pci_dev *pdev; \
 	\
 	pdev = to_pci_dev(dev); \
-	return sprintf(buf, format_string, pdev->field); \
+	return sysfs_emit(buf, format_string, pdev->field); \
 } \
 static DEVICE_ATTR_RO(field)

@@ -56,7 +56,7 @@ static ssize_t broken_parity_status_show(struct device *dev,
 					 char *buf)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
-	return sprintf(buf, "%u\n", pdev->broken_parity_status);
+	return sysfs_emit(buf, "%u\n", pdev->broken_parity_status);
 }

 static ssize_t broken_parity_status_store(struct device *dev,
@@ -129,7 +129,7 @@ static ssize_t power_state_show(struct device *dev,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);

-	return sprintf(buf, "%s\n", pci_power_name(pdev->current_state));
+	return sysfs_emit(buf, "%s\n", pci_power_name(pdev->current_state));
 }
 static DEVICE_ATTR_RO(power_state);

@@ -138,10 +138,10 @@ static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
 			     char *buf)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
-	char *str = buf;
 	int i;
 	int max;
 	resource_size_t start, end;
+	size_t len = 0;

 	if (pci_dev->subordinate)
 		max = DEVICE_COUNT_RESOURCE;
@@ -151,12 +151,12 @@ static ssize_t resource_show(struct device *dev, struct device_attribute *attr,
 	for (i = 0; i < max; i++) {
 		struct resource *res = &pci_dev->resource[i];
 		pci_resource_to_user(pci_dev, i, res, &start, &end);
-		str += sprintf(str, "0x%016llx 0x%016llx 0x%016llx\n",
-			       (unsigned long long)start,
-			       (unsigned long long)end,
-			       (unsigned long long)res->flags);
+		len += sysfs_emit_at(buf, len, "0x%016llx 0x%016llx 0x%016llx\n",
+				     (unsigned long long)start,
+				     (unsigned long long)end,
+				     (unsigned long long)res->flags);
 	}
-	return (str - buf);
+	return len;
 }
 static DEVICE_ATTR_RO(resource);

@@ -165,8 +165,8 @@ static ssize_t max_link_speed_show(struct device *dev,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);

-	return sprintf(buf, "%s\n",
-		       pci_speed_string(pcie_get_speed_cap(pdev)));
+	return sysfs_emit(buf, "%s\n",
+			  pci_speed_string(pcie_get_speed_cap(pdev)));
 }
 static DEVICE_ATTR_RO(max_link_speed);

@@ -175,7 +175,7 @@ static ssize_t max_link_width_show(struct device *dev,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);

-	return sprintf(buf, "%u\n", pcie_get_width_cap(pdev));
+	return sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
 }
 static DEVICE_ATTR_RO(max_link_width);

@@ -193,7 +193,7 @@ static ssize_t current_link_speed_show(struct device *dev,

 	speed = pcie_link_speed[linkstat & PCI_EXP_LNKSTA_CLS];

-	return sprintf(buf, "%s\n", pci_speed_string(speed));
+	return sysfs_emit(buf, "%s\n", pci_speed_string(speed));
 }
 static DEVICE_ATTR_RO(current_link_speed);

@@ -208,7 +208,7 @@ static ssize_t current_link_width_show(struct device *dev,
 	if (err)
 		return -EINVAL;

-	return sprintf(buf, "%u\n",
+	return sysfs_emit(buf, "%u\n",
 		(linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT);
 }
 static DEVICE_ATTR_RO(current_link_width);
@@ -225,7 +225,7 @@ static ssize_t secondary_bus_number_show(struct device *dev,
 	if (err)
 		return -EINVAL;

-	return sprintf(buf, "%u\n", sec_bus);
+	return sysfs_emit(buf, "%u\n", sec_bus);
 }
 static DEVICE_ATTR_RO(secondary_bus_number);

@@ -241,7 +241,7 @@ static ssize_t subordinate_bus_number_show(struct device *dev,
 	if (err)
 		return -EINVAL;

-	return sprintf(buf, "%u\n", sub_bus);
+	return sysfs_emit(buf, "%u\n", sub_bus);
 }
 static DEVICE_ATTR_RO(subordinate_bus_number);

@@ -251,7 +251,7 @@ static ssize_t ari_enabled_show(struct device *dev,
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);

-	return sprintf(buf, "%u\n", pci_ari_enabled(pci_dev->bus));
+	return sysfs_emit(buf, "%u\n", pci_ari_enabled(pci_dev->bus));
 }
 static DEVICE_ATTR_RO(ari_enabled);

@@ -260,11 +260,11 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *attr,
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);

-	return sprintf(buf, "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X\n",
-		       pci_dev->vendor, pci_dev->device,
-		       pci_dev->subsystem_vendor, pci_dev->subsystem_device,
-		       (u8)(pci_dev->class >> 16), (u8)(pci_dev->class >> 8),
-		       (u8)(pci_dev->class));
+	return sysfs_emit(buf, "pci:v%08Xd%08Xsv%08Xsd%08Xbc%02Xsc%02Xi%02X\n",
+			  pci_dev->vendor, pci_dev->device,
+			  pci_dev->subsystem_vendor, pci_dev->subsystem_device,
+			  (u8)(pci_dev->class >> 16), (u8)(pci_dev->class >> 8),
+			  (u8)(pci_dev->class));
 }
 static DEVICE_ATTR_RO(modalias);

@@ -302,7 +302,7 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 	struct pci_dev *pdev;

 	pdev = to_pci_dev(dev);
-	return sprintf(buf, "%u\n", atomic_read(&pdev->enable_cnt));
+	return sysfs_emit(buf, "%u\n", atomic_read(&pdev->enable_cnt));
 }
 static DEVICE_ATTR_RW(enable);

@@ -338,7 +338,7 @@ static ssize_t numa_node_store(struct device *dev,
 static ssize_t numa_node_show(struct device *dev, struct device_attribute *attr,
 			      char *buf)
 {
-	return sprintf(buf, "%d\n", dev->numa_node);
+	return sysfs_emit(buf, "%d\n", dev->numa_node);
 }
 static DEVICE_ATTR_RW(numa_node);
 #endif
@@ -348,7 +348,7 @@ static ssize_t dma_mask_bits_show(struct device *dev,
 {
 	struct pci_dev *pdev = to_pci_dev(dev);

-	return sprintf(buf, "%d\n", fls64(pdev->dma_mask));
+	return sysfs_emit(buf, "%d\n", fls64(pdev->dma_mask));
 }
 static DEVICE_ATTR_RO(dma_mask_bits);

@@ -356,7 +356,7 @@ static ssize_t consistent_dma_mask_bits_show(struct device *dev,
 						 struct device_attribute *attr,
 						 char *buf)
 {
-	return sprintf(buf, "%d\n", fls64(dev->coherent_dma_mask));
+	return sysfs_emit(buf, "%d\n", fls64(dev->coherent_dma_mask));
 }
 static DEVICE_ATTR_RO(consistent_dma_mask_bits);

@@ -366,9 +366,9 @@ static ssize_t msi_bus_show(struct device *dev, struct device_attribute *attr,
 	struct pci_dev *pdev = to_pci_dev(dev);
 	struct pci_bus *subordinate = pdev->subordinate;

-	return sprintf(buf, "%u\n", subordinate ?
-		       !(subordinate->bus_flags & PCI_BUS_FLAGS_NO_MSI)
-			   : !pdev->no_msi);
+	return sysfs_emit(buf, "%u\n", subordinate ?
+			  !(subordinate->bus_flags & PCI_BUS_FLAGS_NO_MSI)
+			    : !pdev->no_msi);
 }

 static ssize_t msi_bus_store(struct device *dev, struct device_attribute *attr,
@@ -545,7 +545,7 @@ static ssize_t d3cold_allowed_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
-	return sprintf(buf, "%u\n", pdev->d3cold_allowed);
+	return sysfs_emit(buf, "%u\n", pdev->d3cold_allowed);
 }
 static DEVICE_ATTR_RW(d3cold_allowed);
 #endif
@@ -559,7 +559,7 @@ static ssize_t devspec_show(struct device *dev,

 	if (np == NULL)
 		return 0;
-	return sprintf(buf, "%pOF", np);
+	return sysfs_emit(buf, "%pOF", np);
 }
 static DEVICE_ATTR_RO(devspec);
 #endif
@@ -605,7 +605,7 @@ static ssize_t driver_override_show(struct device *dev,
 	ssize_t len;

 	device_lock(dev);
-	len = scnprintf(buf, PAGE_SIZE, "%s\n", pdev->driver_override);
+	len = sysfs_emit(buf, "%s\n", pdev->driver_override);
 	device_unlock(dev);
 	return len;
 }
@@ -681,11 +681,11 @@ static ssize_t boot_vga_show(struct device *dev, struct device_attribute *attr,
 	struct pci_dev *vga_dev = vga_default_device();

 	if (vga_dev)
-		return sprintf(buf, "%u\n", (pdev == vga_dev));
+		return sysfs_emit(buf, "%u\n", (pdev == vga_dev));

-	return sprintf(buf, "%u\n",
-		!!(pdev->resource[PCI_ROM_RESOURCE].flags &
-		   IORESOURCE_ROM_SHADOW));
+	return sysfs_emit(buf, "%u\n",
+			  !!(pdev->resource[PCI_ROM_RESOURCE].flags &
+			     IORESOURCE_ROM_SHADOW));
 }
 static DEVICE_ATTR_RO(boot_vga);
From: Brian Norris briannorris@google.com
[ Upstream commit 48991e4935078b05f80616c75d1ee2ea3ae18e58 ]
The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details.
Wrap these accesses in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes.
Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds").
Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc")
Signed-off-by: Brian Norris briannorris@google.com
Signed-off-by: Brian Norris briannorris@chromium.org
Signed-off-by: Bjorn Helgaas bhelgaas@google.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a4...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 3fddd421bbe66..651887d36368c 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -174,8 +174,14 @@ static ssize_t max_link_width_show(struct device *dev,
 				   struct device_attribute *attr, char *buf)
 {
 	struct pci_dev *pdev = to_pci_dev(dev);
+	ssize_t ret;

-	return sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
+	/* We read PCI_EXP_LNKCAP, so we need the device to be accessible. */
+	pci_config_pm_runtime_get(pdev);
+	ret = sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev));
+	pci_config_pm_runtime_put(pdev);
+
+	return ret;
 }
 static DEVICE_ATTR_RO(max_link_width);

@@ -187,7 +193,10 @@ static ssize_t current_link_speed_show(struct device *dev,
 	int err;
 	enum pci_bus_speed speed;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -204,7 +213,10 @@ static ssize_t current_link_width_show(struct device *dev,
 	u16 linkstat;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -221,7 +233,10 @@ static ssize_t secondary_bus_number_show(struct device *dev,
 	u8 sec_bus;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pci_read_config_byte(pci_dev, PCI_SECONDARY_BUS, &sec_bus);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;

@@ -237,7 +252,10 @@ static ssize_t subordinate_bus_number_show(struct device *dev,
 	u8 sub_bus;
 	int err;

+	pci_config_pm_runtime_get(pci_dev);
 	err = pci_read_config_byte(pci_dev, PCI_SUBORDINATE_BUS, &sub_bus);
+	pci_config_pm_runtime_put(pci_dev);
+
 	if (err)
 		return -EINVAL;