The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x cca42bd8eb1b54a4c9bbf48c79d120e66619a3e4
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112259-concrete-detonator-b0ed@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
cca42bd8eb1b ("rcutorture: Fix stuttering races and other issues")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From cca42bd8eb1b54a4c9bbf48c79d120e66619a3e4 Mon Sep 17 00:00:00 2001
From: "Joel Fernandes (Google)" <joel(a)joelfernandes.org>
Date: Sat, 29 Jul 2023 14:27:31 +0000
Subject: [PATCH] rcutorture: Fix stuttering races and other issues
The stuttering code isn't functioning as expected. Ideally, it should
pause the torture threads for a designated period before resuming. Yet,
it fails to halt the test for the correct duration. Additionally, a race
condition exists, potentially causing the stuttering code to pause for
an extended period if the 'spt' variable is non-zero due to the stutter
orchestration thread's inadequate CPU time.
Moreover, over-stuttering can hinder RCU's progress on TREE07 kernels.
This happens as the stuttering code may run within a softirq due to RCU
callbacks. Consequently, ksoftirqd keeps a CPU busy for several seconds,
thus obstructing RCU's progress. This situation triggers a warning
message in the logs:
[ 2169.481783] rcu_torture_writer: rtort_pipe_count: 9
This warning suggests that an RCU torture object, although invisible to
RCU readers, couldn't make it past the pipe array and be freed -- a
strong indication that there weren't enough grace periods during the
stutter interval.
To address these issues, this patch sets the "stutter end" time to an
absolute point in the future set by the main stutter thread. This is
then used for waiting in stutter_wait(). While the stutter thread still
defines this absolute time, the waiters' waiting logic doesn't rely on
the stutter thread receiving sufficient CPU time to halt the stuttering
as the halting is now self-controlled.
Cc: stable(a)vger.kernel.org
Signed-off-by: Joel Fernandes (Google) <joel(a)joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck(a)kernel.org>
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
diff --git a/kernel/torture.c b/kernel/torture.c
index 6ba62e5993e7..fd353f98162f 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -720,7 +720,7 @@ static void torture_shutdown_cleanup(void)
* suddenly applied to or removed from the system.
*/
static struct task_struct *stutter_task;
-static int stutter_pause_test;
+static ktime_t stutter_till_abs_time;
static int stutter;
static int stutter_gap;
@@ -730,30 +730,16 @@ static int stutter_gap;
*/
bool stutter_wait(const char *title)
{
- unsigned int i = 0;
bool ret = false;
- int spt;
+ ktime_t till_ns;
cond_resched_tasks_rcu_qs();
- spt = READ_ONCE(stutter_pause_test);
- for (; spt; spt = READ_ONCE(stutter_pause_test)) {
- if (!ret && !rt_task(current)) {
- sched_set_normal(current, MAX_NICE);
- ret = true;
- }
- if (spt == 1) {
- torture_hrtimeout_jiffies(1, NULL);
- } else if (spt == 2) {
- while (READ_ONCE(stutter_pause_test)) {
- if (!(i++ & 0xffff))
- torture_hrtimeout_us(10, 0, NULL);
- cond_resched();
- }
- } else {
- torture_hrtimeout_jiffies(round_jiffies_relative(HZ), NULL);
- }
- torture_shutdown_absorb(title);
+ till_ns = READ_ONCE(stutter_till_abs_time);
+ if (till_ns && ktime_before(ktime_get(), till_ns)) {
+ torture_hrtimeout_ns(till_ns, 0, HRTIMER_MODE_ABS, NULL);
+ ret = true;
}
+ torture_shutdown_absorb(title);
return ret;
}
EXPORT_SYMBOL_GPL(stutter_wait);
@@ -764,23 +750,16 @@ EXPORT_SYMBOL_GPL(stutter_wait);
*/
static int torture_stutter(void *arg)
{
- DEFINE_TORTURE_RANDOM(rand);
- int wtime;
+ ktime_t till_ns;
VERBOSE_TOROUT_STRING("torture_stutter task started");
do {
if (!torture_must_stop() && stutter > 1) {
- wtime = stutter;
- if (stutter > 2) {
- WRITE_ONCE(stutter_pause_test, 1);
- wtime = stutter - 3;
- torture_hrtimeout_jiffies(wtime, &rand);
- wtime = 2;
- }
- WRITE_ONCE(stutter_pause_test, 2);
- torture_hrtimeout_jiffies(wtime, NULL);
+ till_ns = ktime_add_ns(ktime_get(),
+ jiffies_to_nsecs(stutter));
+ WRITE_ONCE(stutter_till_abs_time, till_ns);
+ torture_hrtimeout_jiffies(stutter - 1, NULL);
}
- WRITE_ONCE(stutter_pause_test, 0);
if (!torture_must_stop())
torture_hrtimeout_jiffies(stutter_gap, NULL);
torture_shutdown_absorb("torture_stutter");
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112207-dealmaker-frigidly-e080@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
83a939f0fdc2 ("PCI: exynos: Don't discard .remove() callback")
778f7c194b1d ("PCI: dwc: exynos: Rework the driver to support Exynos5433 variant")
b9ac0f9dc8ea ("PCI: dwc: Move dw_pcie_setup_rc() to DWC common code")
59fbab1ae40e ("PCI: dwc: Move dw_pcie_msi_init() into core")
886a9c134755 ("PCI: dwc: Move link handling into common code")
331e9bcead52 ("PCI: dwc: Drop the .set_num_vectors() host op")
a0fd361db8e5 ("PCI: dwc: Move "dbi", "dbi2", and "addr_space" resource setup into common code")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)pengutronix.de>
Date: Sun, 1 Oct 2023 19:02:51 +0200
Subject: [PATCH] PCI: exynos: Don't discard .remove() callback
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the
function is discarded from the driver. In this case a bound device can
still get unbound, e.g via sysfs. Then no cleanup code is run resulting in
resource leaks or worse.
The right thing to do is do always have the remove callback available.
This fixes the following warning by modpost:
WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos")
Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutro…
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Reviewed-by: Alim Akhtar <alim.akhtar(a)samsung.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/pci/controller/dwc/pci-exynos.c b/drivers/pci/controller/dwc/pci-exynos.c
index 6319082301d6..c6bede346932 100644
--- a/drivers/pci/controller/dwc/pci-exynos.c
+++ b/drivers/pci/controller/dwc/pci-exynos.c
@@ -375,7 +375,7 @@ static int exynos_pcie_probe(struct platform_device *pdev)
return ret;
}
-static int __exit exynos_pcie_remove(struct platform_device *pdev)
+static int exynos_pcie_remove(struct platform_device *pdev)
{
struct exynos_pcie *ep = platform_get_drvdata(pdev);
@@ -431,7 +431,7 @@ static const struct of_device_id exynos_pcie_of_match[] = {
static struct platform_driver exynos_pcie_driver = {
.probe = exynos_pcie_probe,
- .remove = __exit_p(exynos_pcie_remove),
+ .remove = exynos_pcie_remove,
.driver = {
.name = "exynos-pcie",
.of_match_table = exynos_pcie_of_match,
Hello,
This series contains the backports of two related changes to fix
removal of timed-out catchall elements.
As-is, removed element remains on the list and will be collected
again.
The adjustments are needed because of missing commit
0e1ea651c971 ("netfilter: nf_tables: shrink memory consumption of set elements"),
so we need to pass set_elem container struct instead of "elem_priv".
Pablo Neira Ayuso (2):
netfilter: nf_tables: remove catchall element in GC sync path
netfilter: nf_tables: split async and sync catchall in two functions
net/netfilter/nf_tables_api.c | 53 ++++++++++++++++++++++++-----------
1 file changed, 36 insertions(+), 17 deletions(-)
--
2.41.0
From: Mike Christie <michael.christie(a)oracle.com>
[ Upstream commit 3b83486399a6a9feb9c681b74c21a227d48d7020 ]
If scsi_execute_cmd() returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. sd_sync_cache() will
only access the sshdr if it's been setup because it calls
scsi_status_is_check_condition() before accessing it. However, the
sd_sync_cache() caller, sd_suspend_common(), does not check.
sd_suspend_common() is only checking for ILLEGAL_REQUEST which it's using
to determine if the command is supported. If it's not it just ignores the
error. So to fix its sshdr use this patch just moves that check to
sd_sync_cache() where it converts ILLEGAL_REQUEST to success/0.
sd_suspend_common() was ignoring that error and sd_shutdown() doesn't check
for errors so there will be no behavior changes.
Signed-off-by: Mike Christie <michael.christie(a)oracle.com>
Link: https://lore.kernel.org/r/20231106231304.5694-2-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Reviewed-by: Martin Wilck <mwilck(a)suse.com>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/scsi/sd.c | 53 ++++++++++++++++++++---------------------------
1 file changed, 23 insertions(+), 30 deletions(-)
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index c4babb16dac73..1b7de2daf64a2 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1681,24 +1681,21 @@ static unsigned int sd_check_events(struct gendisk *disk, unsigned int clearing)
return disk_changed ? DISK_EVENT_MEDIA_CHANGE : 0;
}
-static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr *sshdr)
+static int sd_sync_cache(struct scsi_disk *sdkp)
{
int retries, res;
struct scsi_device *sdp = sdkp->device;
const int timeout = sdp->request_queue->rq_timeout
* SD_FLUSH_TIMEOUT_MULTIPLIER;
- struct scsi_sense_hdr my_sshdr;
+ struct scsi_sense_hdr sshdr;
const struct scsi_exec_args exec_args = {
.req_flags = BLK_MQ_REQ_PM,
- /* caller might not be interested in sense, but we need it */
- .sshdr = sshdr ? : &my_sshdr,
+ .sshdr = &sshdr,
};
if (!scsi_device_online(sdp))
return -ENODEV;
- sshdr = exec_args.sshdr;
-
for (retries = 3; retries > 0; --retries) {
unsigned char cmd[16] = { 0 };
@@ -1723,15 +1720,23 @@ static int sd_sync_cache(struct scsi_disk *sdkp, struct scsi_sense_hdr *sshdr)
return res;
if (scsi_status_is_check_condition(res) &&
- scsi_sense_valid(sshdr)) {
- sd_print_sense_hdr(sdkp, sshdr);
+ scsi_sense_valid(&sshdr)) {
+ sd_print_sense_hdr(sdkp, &sshdr);
/* we need to evaluate the error return */
- if (sshdr->asc == 0x3a || /* medium not present */
- sshdr->asc == 0x20 || /* invalid command */
- (sshdr->asc == 0x74 && sshdr->ascq == 0x71)) /* drive is password locked */
+ if (sshdr.asc == 0x3a || /* medium not present */
+ sshdr.asc == 0x20 || /* invalid command */
+ (sshdr.asc == 0x74 && sshdr.ascq == 0x71)) /* drive is password locked */
/* this is no error here */
return 0;
+ /*
+ * This drive doesn't support sync and there's not much
+ * we can do because this is called during shutdown
+ * or suspend so just return success so those operations
+ * can proceed.
+ */
+ if (sshdr.sense_key == ILLEGAL_REQUEST)
+ return 0;
}
switch (host_byte(res)) {
@@ -3886,7 +3891,7 @@ static void sd_shutdown(struct device *dev)
if (sdkp->WCE && sdkp->media_present) {
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
- sd_sync_cache(sdkp, NULL);
+ sd_sync_cache(sdkp);
}
if ((system_state != SYSTEM_RESTART &&
@@ -3907,7 +3912,6 @@ static inline bool sd_do_start_stop(struct scsi_device *sdev, bool runtime)
static int sd_suspend_common(struct device *dev, bool runtime)
{
struct scsi_disk *sdkp = dev_get_drvdata(dev);
- struct scsi_sense_hdr sshdr;
int ret = 0;
if (!sdkp) /* E.g.: runtime suspend following sd_remove() */
@@ -3916,24 +3920,13 @@ static int sd_suspend_common(struct device *dev, bool runtime)
if (sdkp->WCE && sdkp->media_present) {
if (!sdkp->device->silence_suspend)
sd_printk(KERN_NOTICE, sdkp, "Synchronizing SCSI cache\n");
- ret = sd_sync_cache(sdkp, &sshdr);
-
- if (ret) {
- /* ignore OFFLINE device */
- if (ret == -ENODEV)
- return 0;
-
- if (!scsi_sense_valid(&sshdr) ||
- sshdr.sense_key != ILLEGAL_REQUEST)
- return ret;
+ ret = sd_sync_cache(sdkp);
+ /* ignore OFFLINE device */
+ if (ret == -ENODEV)
+ return 0;
- /*
- * sshdr.sense_key == ILLEGAL_REQUEST means this drive
- * doesn't support sync. There's not much to do and
- * suspend shouldn't fail.
- */
- ret = 0;
- }
+ if (ret)
+ return ret;
}
if (sd_do_start_stop(sdkp->device, runtime)) {
--
2.42.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x c9260693aa0c1e029ed23693cfd4d7814eee6624
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112200-barcode-wasp-98d7@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card")
26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
ae9813db1dc5 ("PCI: Add quirks to generate device tree node for Xilinx Alveo U50")
74df14cd301a ("of: unittest: add node lifecycle tests")
e87cacadebaf ("of: overlay: rename overlay source files from .dts to .dtso")
5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports")
3cc30140dbe2 ("Merge tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c9260693aa0c1e029ed23693cfd4d7814eee6624 Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Thu, 21 Sep 2023 16:23:34 +0200
Subject: [PATCH] PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e
card
Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume")
shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100
msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required
for Conventional PCI.
But it turns out that there are PCIe devices which require a longer delay
than prescribed before first config space access after reset recovery or
resume from D3cold:
Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator
"raises a PCI system error (PERR), as reported by the IPMI event log, and
the hardware itself would suffer a catastrophic event, cycling the server"
unless the longer delay is observed.
The card is specified to conform to PCIe r1.0 and indeed only supports Gen1
speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same
100 msec delay as PCIe r6.1 sec 6.6.1:
To allow components to perform internal initialization, system software
must wait for at least 100 ms from the end of a reset (cold/warm/hot)
before it is permitted to issue Configuration Requests
The behavior of the Torrent QN16e card thus appears to be a quirk. Treat
it as such and lengthen the reset delay for this specific device.
Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume")
Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.16953045…
Reported-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM…
Tested-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org # v5.4+
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index eeec1d6f9023..91a15d79c7c4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6188,3 +6188,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node);
+
+/*
+ * Devices known to require a longer delay before first config space access
+ * after reset recovery or resume from D3cold:
+ *
+ * VideoPropulsion (aka Genroco) Torrent QN16e MPEG QAM Modulator
+ */
+static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev)
+{
+ pdev->d3cold_delay = 1000;
+}
+DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x c9260693aa0c1e029ed23693cfd4d7814eee6624
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112258-saddlebag-vantage-48b3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card")
26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
ae9813db1dc5 ("PCI: Add quirks to generate device tree node for Xilinx Alveo U50")
74df14cd301a ("of: unittest: add node lifecycle tests")
e87cacadebaf ("of: overlay: rename overlay source files from .dts to .dtso")
5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports")
3cc30140dbe2 ("Merge tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c9260693aa0c1e029ed23693cfd4d7814eee6624 Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Thu, 21 Sep 2023 16:23:34 +0200
Subject: [PATCH] PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e
card
Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume")
shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100
msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required
for Conventional PCI.
But it turns out that there are PCIe devices which require a longer delay
than prescribed before first config space access after reset recovery or
resume from D3cold:
Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator
"raises a PCI system error (PERR), as reported by the IPMI event log, and
the hardware itself would suffer a catastrophic event, cycling the server"
unless the longer delay is observed.
The card is specified to conform to PCIe r1.0 and indeed only supports Gen1
speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same
100 msec delay as PCIe r6.1 sec 6.6.1:
To allow components to perform internal initialization, system software
must wait for at least 100 ms from the end of a reset (cold/warm/hot)
before it is permitted to issue Configuration Requests
The behavior of the Torrent QN16e card thus appears to be a quirk. Treat
it as such and lengthen the reset delay for this specific device.
Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume")
Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.16953045…
Reported-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM…
Tested-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org # v5.4+
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index eeec1d6f9023..91a15d79c7c4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6188,3 +6188,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node);
+
+/*
+ * Devices known to require a longer delay before first config space access
+ * after reset recovery or resume from D3cold:
+ *
+ * VideoPropulsion (aka Genroco) Torrent QN16e MPEG QAM Modulator
+ */
+static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev)
+{
+ pdev->d3cold_delay = 1000;
+}
+DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x c9260693aa0c1e029ed23693cfd4d7814eee6624
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023112256-fancy-quintet-036e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
c9260693aa0c ("PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card")
26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver")
ae9813db1dc5 ("PCI: Add quirks to generate device tree node for Xilinx Alveo U50")
74df14cd301a ("of: unittest: add node lifecycle tests")
e87cacadebaf ("of: overlay: rename overlay source files from .dts to .dtso")
5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports")
3cc30140dbe2 ("Merge tag 'pci-v5.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c9260693aa0c1e029ed23693cfd4d7814eee6624 Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Thu, 21 Sep 2023 16:23:34 +0200
Subject: [PATCH] PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e
card
Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume")
shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100
msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required
for Conventional PCI.
But it turns out that there are PCIe devices which require a longer delay
than prescribed before first config space access after reset recovery or
resume from D3cold:
Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator
"raises a PCI system error (PERR), as reported by the IPMI event log, and
the hardware itself would suffer a catastrophic event, cycling the server"
unless the longer delay is observed.
The card is specified to conform to PCIe r1.0 and indeed only supports Gen1
speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same
100 msec delay as PCIe r6.1 sec 6.6.1:
To allow components to perform internal initialization, system software
must wait for at least 100 ms from the end of a reset (cold/warm/hot)
before it is permitted to issue Configuration Requests
The behavior of the Torrent QN16e card thus appears to be a quirk. Treat
it as such and lengthen the reset delay for this specific device.
Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume")
Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.16953045…
Reported-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM…
Tested-by: Chad Schroeder <CSchroeder(a)sonifi.com>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org # v5.4+
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index eeec1d6f9023..91a15d79c7c4 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6188,3 +6188,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x9a31, dpc_log_size);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5020, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_XILINX, 0x5021, of_pci_make_dev_node);
DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_REDHAT, 0x0005, of_pci_make_dev_node);
+
+/*
+ * Devices known to require a longer delay before first config space access
+ * after reset recovery or resume from D3cold:
+ *
+ * VideoPropulsion (aka Genroco) Torrent QN16e MPEG QAM Modulator
+ */
+static void pci_fixup_d3cold_delay_1sec(struct pci_dev *pdev)
+{
+ pdev->d3cold_delay = 1000;
+}
+DECLARE_PCI_FIXUP_FINAL(0x5555, 0x0004, pci_fixup_d3cold_delay_1sec);