The patch titled
Subject: mm/damon/core: fix damos_commit_filter not changing allow
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-damon-core-fix-damos_commit_filter-not-changing-allow.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Subject: mm/damon/core: fix damos_commit_filter not changing allow
Date: Sat, 16 Aug 2025 10:51:16 +0900
Current damos_commit_filter() does not persist the `allow' value of the
filter. As a result, changing the `allow' value of a filter and
committing doesn't change the `allow' value.
Add the missing `allow' value update, so committing the filter
persistently changes the `allow' value well.
Link: https://lkml.kernel.org/r/20250816015116.194589-1-ekffu200098@gmail.com
Fixes: fe6d7fdd6249 ("mm/damon/core: add damos_filter->allow field")
Signed-off-by: Sang-Heon Jeon <ekffu200098(a)gmail.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [6.14.x]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/damon/core.c~mm-damon-core-fix-damos_commit_filter-not-changing-allow
+++ a/mm/damon/core.c
@@ -883,6 +883,7 @@ static void damos_commit_filter(
{
dst->type = src->type;
dst->matching = src->matching;
+ dst->allow = src->allow;
damos_commit_filter_arg(dst, src);
}
_
Patches currently in -mm which might be from ekffu200098(a)gmail.com are
mm-damon-core-fix-commit_ops_filters-by-using-correct-nth-function.patch
selftests-damon-fix-selftests-by-installing-drgn-related-script.patch
mm-damon-core-fix-damos_commit_filter-not-changing-allow.patch
mm-damon-update-expired-description-of-damos_action.patch
docs-mm-damon-design-fix-typo-s-sz_trtied-sz_tried.patch
selftests-damon-test-no-op-commit-broke-damon-status.patch
Since commit 2ca34b508774 ("staging: axis-fifo: Correct handling of
tx_fifo_depth for size validation"), write() operations with packets
larger than 'tx_fifo_depth - 4' words are no longer rejected with -EINVAL.
Fortunately, the packets are not actually getting transmitted to hardware,
otherwise they would be raising a 'Transmit Packet Overrun Error'
interrupt, which requires a reset of the TX circuit to recover from.
Instead, the request times out inside wait_event_interruptible_timeout()
and always returns -EAGAIN, since the wake up condition can never be true
for these packets. But still, they unnecessarily block other tasks from
writing to the FIFO and the EAGAIN return code signals userspace to retry
the write() call, even though it will always fail and time out.
According to the AXI4-Stream FIFO reference manual (PG080), the maximum
valid packet length is 'tx_fifo_depth - 4' words, so attempting to send
larger packets is invalid and should not be happening in the first place:
> The maximum packet that can be transmitted is limited by the size of
> the FIFO, which is (C_TX_FIFO_DEPTH–4)*(data interface width/8) bytes.
Therefore, bring back the old behavior and outright reject packets larger
than 'tx_fifo_depth - 4' with -EINVAL. Add a comment to explain why the
check is necessary. The dev_err() message was removed to avoid cluttering
the dmesg log if an invalid packet is received from userspace.
Fixes: 2ca34b508774 ("staging: axis-fifo: Correct handling of tx_fifo_depth for size validation")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ovidiu Panait <ovidiu.panait.oss(a)gmail.com>
---
Changes in v2:
- added "cc: stable" tag
drivers/staging/axis-fifo/axis-fifo.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/axis-fifo/axis-fifo.c b/drivers/staging/axis-fifo/axis-fifo.c
index e8aa632e0a31..271236ad023f 100644
--- a/drivers/staging/axis-fifo/axis-fifo.c
+++ b/drivers/staging/axis-fifo/axis-fifo.c
@@ -325,11 +325,17 @@ static ssize_t axis_fifo_write(struct file *f, const char __user *buf,
return -EINVAL;
}
- if (words_to_write > fifo->tx_fifo_depth) {
- dev_err(fifo->dt_device, "tried to write more words [%u] than slots in the fifo buffer [%u]\n",
- words_to_write, fifo->tx_fifo_depth);
+ /*
+ * In 'Store-and-Forward' mode, the maximum packet that can be
+ * transmitted is limited by the size of the FIFO, which is
+ * (C_TX_FIFO_DEPTH–4)*(data interface width/8) bytes.
+ *
+ * Do not attempt to send a packet larger than 'tx_fifo_depth - 4',
+ * otherwise a 'Transmit Packet Overrun Error' interrupt will be
+ * raised, which requires a reset of the TX circuit to recover.
+ */
+ if (words_to_write > (fifo->tx_fifo_depth - 4))
return -EINVAL;
- }
if (fifo->write_flags & O_NONBLOCK) {
/*
--
2.50.0
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 5349ae5e05fa37409fd48a1eb483b199c32c889b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081755-subdivide-astound-6aef@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5349ae5e05fa37409fd48a1eb483b199c32c889b Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze(a)samba.org>
Date: Mon, 4 Aug 2025 14:10:12 +0200
Subject: [PATCH] smb: client: let send_done() cleanup before calling
smbd_disconnect_rdma_connection()
We should call ib_dma_unmap_single() and mempool_free() before calling
smbd_disconnect_rdma_connection().
And smbd_disconnect_rdma_connection() needs to be the last function to
call as all other state might already be gone after it returns.
Cc: Steve French <smfrench(a)gmail.com>
Cc: Tom Talpey <tom(a)talpey.com>
Cc: Long Li <longli(a)microsoft.com>
Cc: linux-cifs(a)vger.kernel.org
Cc: samba-technical(a)lists.samba.org
Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection")
Signed-off-by: Stefan Metzmacher <metze(a)samba.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
index 754e94a0e07f..e99e783f1b0e 100644
--- a/fs/smb/client/smbdirect.c
+++ b/fs/smb/client/smbdirect.c
@@ -281,18 +281,20 @@ static void send_done(struct ib_cq *cq, struct ib_wc *wc)
log_rdma_send(INFO, "smbd_request 0x%p completed wc->status=%d\n",
request, wc->status);
- if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
- log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
- wc->status, wc->opcode);
- smbd_disconnect_rdma_connection(request->info);
- }
-
for (i = 0; i < request->num_sge; i++)
ib_dma_unmap_single(sc->ib.dev,
request->sge[i].addr,
request->sge[i].length,
DMA_TO_DEVICE);
+ if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
+ log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
+ wc->status, wc->opcode);
+ mempool_free(request, info->request_mempool);
+ smbd_disconnect_rdma_connection(info);
+ return;
+ }
+
if (atomic_dec_and_test(&request->info->send_pending))
wake_up(&request->info->wait_send_pending);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 5349ae5e05fa37409fd48a1eb483b199c32c889b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081757-moonwalk-backpedal-fe00@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5349ae5e05fa37409fd48a1eb483b199c32c889b Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze(a)samba.org>
Date: Mon, 4 Aug 2025 14:10:12 +0200
Subject: [PATCH] smb: client: let send_done() cleanup before calling
smbd_disconnect_rdma_connection()
We should call ib_dma_unmap_single() and mempool_free() before calling
smbd_disconnect_rdma_connection().
And smbd_disconnect_rdma_connection() needs to be the last function to
call as all other state might already be gone after it returns.
Cc: Steve French <smfrench(a)gmail.com>
Cc: Tom Talpey <tom(a)talpey.com>
Cc: Long Li <longli(a)microsoft.com>
Cc: linux-cifs(a)vger.kernel.org
Cc: samba-technical(a)lists.samba.org
Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection")
Signed-off-by: Stefan Metzmacher <metze(a)samba.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
index 754e94a0e07f..e99e783f1b0e 100644
--- a/fs/smb/client/smbdirect.c
+++ b/fs/smb/client/smbdirect.c
@@ -281,18 +281,20 @@ static void send_done(struct ib_cq *cq, struct ib_wc *wc)
log_rdma_send(INFO, "smbd_request 0x%p completed wc->status=%d\n",
request, wc->status);
- if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
- log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
- wc->status, wc->opcode);
- smbd_disconnect_rdma_connection(request->info);
- }
-
for (i = 0; i < request->num_sge; i++)
ib_dma_unmap_single(sc->ib.dev,
request->sge[i].addr,
request->sge[i].length,
DMA_TO_DEVICE);
+ if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
+ log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
+ wc->status, wc->opcode);
+ mempool_free(request, info->request_mempool);
+ smbd_disconnect_rdma_connection(info);
+ return;
+ }
+
if (atomic_dec_and_test(&request->info->send_pending))
wake_up(&request->info->wait_send_pending);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 5349ae5e05fa37409fd48a1eb483b199c32c889b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081753-revolver-radiated-4b75@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5349ae5e05fa37409fd48a1eb483b199c32c889b Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze(a)samba.org>
Date: Mon, 4 Aug 2025 14:10:12 +0200
Subject: [PATCH] smb: client: let send_done() cleanup before calling
smbd_disconnect_rdma_connection()
We should call ib_dma_unmap_single() and mempool_free() before calling
smbd_disconnect_rdma_connection().
And smbd_disconnect_rdma_connection() needs to be the last function to
call as all other state might already be gone after it returns.
Cc: Steve French <smfrench(a)gmail.com>
Cc: Tom Talpey <tom(a)talpey.com>
Cc: Long Li <longli(a)microsoft.com>
Cc: linux-cifs(a)vger.kernel.org
Cc: samba-technical(a)lists.samba.org
Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection")
Signed-off-by: Stefan Metzmacher <metze(a)samba.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
index 754e94a0e07f..e99e783f1b0e 100644
--- a/fs/smb/client/smbdirect.c
+++ b/fs/smb/client/smbdirect.c
@@ -281,18 +281,20 @@ static void send_done(struct ib_cq *cq, struct ib_wc *wc)
log_rdma_send(INFO, "smbd_request 0x%p completed wc->status=%d\n",
request, wc->status);
- if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
- log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
- wc->status, wc->opcode);
- smbd_disconnect_rdma_connection(request->info);
- }
-
for (i = 0; i < request->num_sge; i++)
ib_dma_unmap_single(sc->ib.dev,
request->sge[i].addr,
request->sge[i].length,
DMA_TO_DEVICE);
+ if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
+ log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
+ wc->status, wc->opcode);
+ mempool_free(request, info->request_mempool);
+ smbd_disconnect_rdma_connection(info);
+ return;
+ }
+
if (atomic_dec_and_test(&request->info->send_pending))
wake_up(&request->info->wait_send_pending);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 5349ae5e05fa37409fd48a1eb483b199c32c889b
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081751-drench-clammy-f8dd@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5349ae5e05fa37409fd48a1eb483b199c32c889b Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze(a)samba.org>
Date: Mon, 4 Aug 2025 14:10:12 +0200
Subject: [PATCH] smb: client: let send_done() cleanup before calling
smbd_disconnect_rdma_connection()
We should call ib_dma_unmap_single() and mempool_free() before calling
smbd_disconnect_rdma_connection().
And smbd_disconnect_rdma_connection() needs to be the last function to
call as all other state might already be gone after it returns.
Cc: Steve French <smfrench(a)gmail.com>
Cc: Tom Talpey <tom(a)talpey.com>
Cc: Long Li <longli(a)microsoft.com>
Cc: linux-cifs(a)vger.kernel.org
Cc: samba-technical(a)lists.samba.org
Fixes: f198186aa9bb ("CIFS: SMBD: Establish SMB Direct connection")
Signed-off-by: Stefan Metzmacher <metze(a)samba.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/client/smbdirect.c b/fs/smb/client/smbdirect.c
index 754e94a0e07f..e99e783f1b0e 100644
--- a/fs/smb/client/smbdirect.c
+++ b/fs/smb/client/smbdirect.c
@@ -281,18 +281,20 @@ static void send_done(struct ib_cq *cq, struct ib_wc *wc)
log_rdma_send(INFO, "smbd_request 0x%p completed wc->status=%d\n",
request, wc->status);
- if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
- log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
- wc->status, wc->opcode);
- smbd_disconnect_rdma_connection(request->info);
- }
-
for (i = 0; i < request->num_sge; i++)
ib_dma_unmap_single(sc->ib.dev,
request->sge[i].addr,
request->sge[i].length,
DMA_TO_DEVICE);
+ if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
+ log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
+ wc->status, wc->opcode);
+ mempool_free(request, info->request_mempool);
+ smbd_disconnect_rdma_connection(info);
+ return;
+ }
+
if (atomic_dec_and_test(&request->info->send_pending))
wake_up(&request->info->wait_send_pending);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081513-doorpost-truck-2a5e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081512-refresh-safely-eef5@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
Hello,
The following is the original thread, where a bug was reported to the
linux-wireless and ath10k mailing lists. The specific bug has been
detailed clearly here.
https://lore.kernel.org/linux-wireless/690B1DB2-C9DC-4FAD-8063-4CED659B1701…
There is also a Bugzilla report by me, which was opened later:
https://bugzilla.kernel.org/show_bug.cgi?id=220264
As stated, it is highly encouraged to check out all the logs,
especially the line of IRQ #16 in /proc/interrupts.
Here is where all the logs are:
https://gist.github.com/BandhanPramanik/ddb0cb23eca03ca2ea43a1d832a16180
(these logs are taken from an Arch liveboot)
On my daily driver, I found these on my IRQ #16:
16: 173210 0 0 0 IR-IO-APIC
16-fasteoi i2c_designware.0, idma64.0, i801_smbus
The fixes stated on the Reddit post for this Wi-Fi card didn't quite
work. (But git-cloning the firmware files did give me some more time
to have stable internet)
This time, I had to go for the GRUB kernel parameters.
Right now, I'm using "irqpoll" to curb the errors caused.
"intel_iommu=off" did not work, and the Wi-Fi was constantly crashing
even then. Did not try out "pci=noaer" this time.
If it's of any concern, there is a very weird error in Chromium-based
browsers which has only happened after I started using irqpoll. When I
Google something, the background of the individual result boxes shows
as pure black, while the surrounding space is the usual
greyish-blackish, like we see in Dark Mode. Here is a picture of the
exact thing I'm experiencing: https://files.catbox.moe/mjew6g.png
If you notice anything in my logs/bug reports, please let me know.
(Because it seems like Wi-Fi errors are just a red herring, there are
some ACPI or PCIe-related errors in the computers of this model - just
a naive speculation, though.)
Thanking you,
Bandhan Pramanik
Hi,
I just wanted to check in and see if you had the chance to review my previous email regarding International Broadcast Conference (IBC) 2025.
Looking forward to your reply.
Kind regards,
Jack Reacher
_____________________________________________________________________________________
From: Jack Reacher
Subject: International Broadcast Conference (IBC) 2025
Hi,
We are offering the visitors contact list International Broadcast Conference (IBC) 2025.
We currently have 50,656 verified visitor contacts.
Each contact contains: Contact name, Job Title, Business Name, Physical Address, Phone Numbers, Official Email address and many more…
Rest assured, our contacts are 100% opt-in and GDPR compliant, ensuring the utmost integrity in your outreach.
Let me know if you are interested so that I can share the pricing for the same.
Kind Regards,
Jack Reacher
Sr. Marketing Manager
If you do not wish to receive this newsletter reply as “Unfollow”
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081511-procreate-museum-7213@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 3f66ccbaaef3a0c5bd844eab04e3207b4061c546
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081544-coliseum-cylinder-43d2@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f66ccbaaef3a0c5bd844eab04e3207b4061c546 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Wed, 25 Jun 2025 18:33:23 +0900
Subject: [PATCH] block: Make REQ_OP_ZONE_FINISH a write operation
REQ_OP_ZONE_FINISH is defined as "12", which makes
op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a
zone finish operation is an operation that modifies a zone (transition
it to full) and so should be considered as a write operation (albeit
one that does not transfer any data to the device).
Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and
redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential
odd numbers from that new value.
Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3d1577f07c1c..930daff207df 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -350,11 +350,11 @@ enum req_op {
/* Close a zone */
REQ_OP_ZONE_CLOSE = (__force blk_opf_t)11,
/* Transition a zone to full */
- REQ_OP_ZONE_FINISH = (__force blk_opf_t)12,
+ REQ_OP_ZONE_FINISH = (__force blk_opf_t)13,
/* reset a zone write pointer */
- REQ_OP_ZONE_RESET = (__force blk_opf_t)13,
+ REQ_OP_ZONE_RESET = (__force blk_opf_t)15,
/* reset all the zone present on the device */
- REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)15,
+ REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)17,
/* Driver private requests */
REQ_OP_DRV_IN = (__force blk_opf_t)34,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081510-deodorize-kitten-ff60@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
More KVM backports, this time for 6.12.y (and with far fewer dependencies).
Same note/caveat about Sasha already posting[1][2] many/most of these:
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
I'm including them here to hopefully make life easier for y'all, and because
the order they are presented here is the preferred ordering, i.e. should be
the same ordering as the original upstream patches.
But, if you end up grabbing Sasha's patches first, it's not a big deal as the
only true dependencies is that the DEBUGCTL.RTM_DEBUG patch needs to land
before "Check vmcs12->guest_ia32_debugctl on nested VM-Enter".
[1] https://lore.kernel.org/all/20250813182455.2068642-1-sashal@kernel.org
[2] https://lore.kernel.org/all/20250814125614.2090890-1-sashal@kernel.org
Maxim Levitsky (3):
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while running the
guest
Sean Christopherson (4):
KVM: x86: Convert vcpu_run()'s immediate exit param into a generic
bitmap
KVM: x86: Drop kvm_x86_ops.set_dr6() in favor of a new KVM_RUN flag
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
arch/x86/include/asm/kvm-x86-ops.h | 1 -
arch/x86/include/asm/kvm_host.h | 15 ++++++--
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/svm/svm.c | 14 ++++----
arch/x86/kvm/vmx/main.c | 3 +-
arch/x86/kvm/vmx/nested.c | 21 ++++++++---
arch/x86/kvm/vmx/pmu_intel.c | 8 ++---
arch/x86/kvm/vmx/vmx.c | 57 ++++++++++++++++++------------
arch/x86/kvm/vmx/vmx.h | 26 ++++++++++++++
arch/x86/kvm/vmx/x86_ops.h | 2 +-
arch/x86/kvm/x86.c | 25 ++++++++++---
11 files changed, 125 insertions(+), 48 deletions(-)
base-commit: 8f5ff9784f3262e6e85c68d86f8b7931827f2983
--
2.51.0.rc1.163.g2494970778-goog
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 3f66ccbaaef3a0c5bd844eab04e3207b4061c546
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081544-slackness-vantage-3e99@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f66ccbaaef3a0c5bd844eab04e3207b4061c546 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Wed, 25 Jun 2025 18:33:23 +0900
Subject: [PATCH] block: Make REQ_OP_ZONE_FINISH a write operation
REQ_OP_ZONE_FINISH is defined as "12", which makes
op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a
zone finish operation is an operation that modifies a zone (transition
it to full) and so should be considered as a write operation (albeit
one that does not transfer any data to the device).
Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and
redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential
odd numbers from that new value.
Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3d1577f07c1c..930daff207df 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -350,11 +350,11 @@ enum req_op {
/* Close a zone */
REQ_OP_ZONE_CLOSE = (__force blk_opf_t)11,
/* Transition a zone to full */
- REQ_OP_ZONE_FINISH = (__force blk_opf_t)12,
+ REQ_OP_ZONE_FINISH = (__force blk_opf_t)13,
/* reset a zone write pointer */
- REQ_OP_ZONE_RESET = (__force blk_opf_t)13,
+ REQ_OP_ZONE_RESET = (__force blk_opf_t)15,
/* reset all the zone present on the device */
- REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)15,
+ REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)17,
/* Driver private requests */
REQ_OP_DRV_IN = (__force blk_opf_t)34,
In __iodyn_find_io_region(), pcmcia_make_resource() is assigned to
res and used in pci_bus_alloc_resource(). There is a dereference of res
in pci_bus_alloc_resource(), which could lead to a NULL pointer
dereference on failure of pcmcia_make_resource().
Fix this bug by adding a check of res.
Found by code review, complie tested only.
Cc: stable(a)vger.kernel.org
Fixes: 49b1153adfe1 ("pcmcia: move all pcmcia_resource_ops providers into one module")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/pcmcia/rsrc_iodyn.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/pcmcia/rsrc_iodyn.c b/drivers/pcmcia/rsrc_iodyn.c
index b04b16496b0c..2677b577c1f8 100644
--- a/drivers/pcmcia/rsrc_iodyn.c
+++ b/drivers/pcmcia/rsrc_iodyn.c
@@ -62,6 +62,9 @@ static struct resource *__iodyn_find_io_region(struct pcmcia_socket *s,
unsigned long min = base;
int ret;
+ if (!res)
+ return NULL;
+
data.mask = align - 1;
data.offset = base & data.mask;
--
2.25.1
Mount options (uid, gid, mode) are silently ignored when debugfs is
mounted. This is a regression introduced during the conversion to the
new mount API.
When the mount API conversion was done, the line that sets
sb->s_fs_info to the parsed options was removed. This causes
debugfs_apply_options() to operate on a NULL pointer.
Fix this by following the same pattern as the tracefs fix in commit
e4d32142d1de ("tracing: Fix tracefs mount options"). Call
debugfs_reconfigure() in debugfs_get_tree() to apply the mount options
to the superblock after it has been created or reused.
As an example, with the bug the "mode" mount option is ignored:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime)
$ ls -ld /tmp/debugfs_test
drwx------ 25 root root 0 Aug 4 14:16 /tmp/debugfs_test
With the fix applied, it works as expected:
$ mount -o mode=0666 -t debugfs debugfs /tmp/debugfs_test
$ mount | grep debugfs_test
debugfs on /tmp/debugfs_test type debugfs (rw,relatime,mode=666)
$ ls -ld /tmp/debugfs_test
drw-rw-rw- 37 root root 0 Aug 2 17:28 /tmp/debugfs_test
Fixes: a20971c18752 ("vfs: Convert debugfs to use the new mount API")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220406
Cc: stable(a)vger.kernel.org
Signed-off-by: Charalampos Mitrodimas <charmitro(a)posteo.net>
---
Changes in v2:
- Follow the same pattern as e4d32142d1de ("tracing: Fix tracefs mount options")
- Add Cc: stable tag
- Link to v1: https://lore.kernel.org/r/20250804-debugfs-mount-opts-v1-1-bc05947a80b5@pos…
---
fs/debugfs/inode.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/debugfs/inode.c b/fs/debugfs/inode.c
index a0357b0cf362d8ac47ff810e162402d6a8ae2cb9..c12d649df6a5435050f606c2828a9a7cc61922e4 100644
--- a/fs/debugfs/inode.c
+++ b/fs/debugfs/inode.c
@@ -183,6 +183,9 @@ static int debugfs_reconfigure(struct fs_context *fc)
struct debugfs_fs_info *sb_opts = sb->s_fs_info;
struct debugfs_fs_info *new_opts = fc->s_fs_info;
+ if (!new_opts)
+ return 0;
+
sync_filesystem(sb);
/* structure copy of new mount options to sb */
@@ -282,10 +285,16 @@ static int debugfs_fill_super(struct super_block *sb, struct fs_context *fc)
static int debugfs_get_tree(struct fs_context *fc)
{
+ int err;
+
if (!(debugfs_allow & DEBUGFS_ALLOW_API))
return -EPERM;
- return get_tree_single(fc, debugfs_fill_super);
+ err = get_tree_single(fc, debugfs_fill_super);
+ if (err)
+ return err;
+
+ return debugfs_reconfigure(fc);
}
static void debugfs_free_fc(struct fs_context *fc)
---
base-commit: 3c4a063b1f8ab71352df1421d9668521acb63cd9
change-id: 20250804-debugfs-mount-opts-2a68d7741f05
Best regards,
--
Charalampos Mitrodimas <charmitro(a)posteo.net>
From: Su Hui <suhui(a)nfschina.com>
[ Upstream commit 7919407eca2ef562fa6c98c41cfdf6f6cdd69d92 ]
When encounters some errors like these:
xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command
xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment
usb usb5-port6: couldn't allocate usb_device
It's hard to know whether xhc_state is dying or halted. So it's better
to print xhc_state's value which can help locate the resaon of the bug.
Signed-off-by: Su Hui <suhui(a)nfschina.com>
Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees for the
following reasons:
1. **Enhanced Debugging for Real-World Issues**: The commit improves
debugging of USB xHCI host controller failures by printing the actual
`xhc_state` value when `queue_command` fails. The commit message
shows real error messages users encounter ("xHCI dying or halted,
can't queue_command"), demonstrating this is a real-world debugging
problem.
2. **Minimal and Safe Change**: The change is extremely small and safe -
it only modifies a debug print statement from:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
```
to:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state:
0x%x\n", xhci->xhc_state);
```
3. **No Functional Changes**: This is a pure diagnostic improvement. It
doesn't change any logic, control flow, or data structures. It only
adds the state value (0x%x format) to an existing debug message.
4. **Important for Troubleshooting**: The xHCI driver is critical for
USB functionality, and when it fails with "dying or halted" states,
knowing the exact state helps diagnose whether:
- `XHCI_STATE_DYING` (0x1) - controller is dying
- `XHCI_STATE_HALTED` (0x2) - controller is halted
- Both states (0x3) - controller has both flags set
This distinction is valuable for debugging hardware issues, driver
bugs, or system problems.
5. **Zero Risk of Regression**: Adding a parameter to a debug print
statement has no risk of introducing regressions. The worst case is
the debug message prints the state value.
6. **Follows Stable Rules**: This meets stable kernel criteria as it:
- Fixes a real debugging limitation
- Is obviously correct
- Has been tested (signed-off and accepted by Greg KH)
- Is small (single line change)
- Doesn't add new features, just improves existing diagnostics
The commit helps system administrators and developers diagnose USB
issues more effectively by providing the actual state value rather than
just saying "dying or halted", making it a valuable debugging
enhancement for stable kernels.
drivers/usb/host/xhci-ring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 94c9c9271658..131e7530ec4a 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -4372,7 +4372,8 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
if ((xhci->xhc_state & XHCI_STATE_DYING) ||
(xhci->xhc_state & XHCI_STATE_HALTED)) {
- xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
+ xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n",
+ xhci->xhc_state);
return -ESHUTDOWN;
}
--
2.39.5
From: Su Hui <suhui(a)nfschina.com>
[ Upstream commit 7919407eca2ef562fa6c98c41cfdf6f6cdd69d92 ]
When encounters some errors like these:
xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command
xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment
usb usb5-port6: couldn't allocate usb_device
It's hard to know whether xhc_state is dying or halted. So it's better
to print xhc_state's value which can help locate the resaon of the bug.
Signed-off-by: Su Hui <suhui(a)nfschina.com>
Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit is suitable for backporting to stable kernel trees for the
following reasons:
1. **Enhanced Debugging for Real-World Issues**: The commit improves
debugging of USB xHCI host controller failures by printing the actual
`xhc_state` value when `queue_command` fails. The commit message
shows real error messages users encounter ("xHCI dying or halted,
can't queue_command"), demonstrating this is a real-world debugging
problem.
2. **Minimal and Safe Change**: The change is extremely small and safe -
it only modifies a debug print statement from:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
```
to:
```c
xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state:
0x%x\n", xhci->xhc_state);
```
3. **No Functional Changes**: This is a pure diagnostic improvement. It
doesn't change any logic, control flow, or data structures. It only
adds the state value (0x%x format) to an existing debug message.
4. **Important for Troubleshooting**: The xHCI driver is critical for
USB functionality, and when it fails with "dying or halted" states,
knowing the exact state helps diagnose whether:
- `XHCI_STATE_DYING` (0x1) - controller is dying
- `XHCI_STATE_HALTED` (0x2) - controller is halted
- Both states (0x3) - controller has both flags set
This distinction is valuable for debugging hardware issues, driver
bugs, or system problems.
5. **Zero Risk of Regression**: Adding a parameter to a debug print
statement has no risk of introducing regressions. The worst case is
the debug message prints the state value.
6. **Follows Stable Rules**: This meets stable kernel criteria as it:
- Fixes a real debugging limitation
- Is obviously correct
- Has been tested (signed-off and accepted by Greg KH)
- Is small (single line change)
- Doesn't add new features, just improves existing diagnostics
The commit helps system administrators and developers diagnose USB
issues more effectively by providing the actual state value rather than
just saying "dying or halted", making it a valuable debugging
enhancement for stable kernels.
drivers/usb/host/xhci-ring.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 0862fdd3e568..c4880b22f359 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -4421,7 +4421,8 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd,
if ((xhci->xhc_state & XHCI_STATE_DYING) ||
(xhci->xhc_state & XHCI_STATE_HALTED)) {
- xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n");
+ xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n",
+ xhci->xhc_state);
return -ESHUTDOWN;
}
--
2.39.5
From: Gabriel Totev <gabriel.totev(a)zetier.com>
[ Upstream commit c5bf96d20fd787e4909b755de4705d52f3458836 ]
When using AppArmor profiles inside an unprivileged container,
the link operation observes an unshifted ouid.
(tested with LXD and Incus)
For example, root inside container and uid 1000000 outside, with
`owner /root/link l,` profile entry for ln:
/root$ touch chain && ln chain link
==> dmesg
apparmor="DENIED" operation="link" class="file"
namespace="root//lxd-feet_<var-snap-lxd-common-lxd>" profile="linkit"
name="/root/link" pid=1655 comm="ln" requested_mask="l" denied_mask="l"
fsuid=1000000 ouid=0 [<== should be 1000000] target="/root/chain"
Fix by mapping inode uid of old_dentry in aa_path_link() rather than
using it directly, similarly to how it's mapped in __file_path_perm()
later in the file.
Signed-off-by: Gabriel Totev <gabriel.totev(a)zetier.com>
Signed-off-by: John Johansen <john.johansen(a)canonical.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**Backport Status: YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Clear Bug Fix**: This fixes a real bug where AppArmor incorrectly
reports the unshifted uid when mediating hard link operations inside
user namespaces/containers. The commit message provides a concrete
example showing ouid=0 instead of the expected ouid=1000000 in
container logs.
2. **Security Impact**: This is a security-relevant bug that causes
AppArmor policy enforcement to behave incorrectly in containerized
environments. The owner-based AppArmor rules (like `owner /root/link
l,`) won't work correctly because the uid comparison is done with the
wrong (unshifted) value.
3. **Minimal and Contained Fix**: The change is small and surgical - it
only modifies the aa_path_link() function to properly map the inode
uid through the mount's idmapping, following the exact same pattern
already used in __file_path_perm():
- Uses `i_uid_into_vfsuid(mnt_idmap(target.mnt), inode)` to get the
vfsuid
- Converts it back with `vfsuid_into_kuid(vfsuid)` for the path_cond
structure
4. **No New Features or Architecture Changes**: This is purely a bug fix
that makes aa_path_link() consistent with how __file_path_perm()
already handles uid mapping. No new functionality is added.
5. **Container/User Namespace Compatibility**: With the widespread use
of containers (LXD, Incus, Docker with userns), this bug affects many
production systems. The fix ensures AppArmor policies work correctly
in these environments.
6. **Low Risk**: The change follows an established pattern in the
codebase (from __file_path_perm) and only affects the specific case
of hard link permission checks in user namespaces. The risk of
regression is minimal.
7. **Clear Testing**: The commit message indicates this was tested with
LXD and Incus containers, showing the issue is reproducible and the
fix validated.
The code change is straightforward - replacing direct access to
`d_backing_inode(old_dentry)->i_uid` with proper idmapping-aware
conversion that respects user namespace uid shifts. This makes
aa_path_link() consistent with other AppArmor file operations that
already handle idmapped mounts correctly.
security/apparmor/file.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/security/apparmor/file.c b/security/apparmor/file.c
index d52a5b14dad4..62bc46e03758 100644
--- a/security/apparmor/file.c
+++ b/security/apparmor/file.c
@@ -423,9 +423,11 @@ int aa_path_link(const struct cred *subj_cred,
{
struct path link = { .mnt = new_dir->mnt, .dentry = new_dentry };
struct path target = { .mnt = new_dir->mnt, .dentry = old_dentry };
+ struct inode *inode = d_backing_inode(old_dentry);
+ vfsuid_t vfsuid = i_uid_into_vfsuid(mnt_idmap(target.mnt), inode);
struct path_cond cond = {
- d_backing_inode(old_dentry)->i_uid,
- d_backing_inode(old_dentry)->i_mode
+ .uid = vfsuid_into_kuid(vfsuid),
+ .mode = inode->i_mode,
};
char *buffer = NULL, *buffer2 = NULL;
struct aa_profile *profile;
--
2.39.5
The ISR calls dev_pm_opp_find_bw_ceil(), which can return EINVAL, ERANGE
or ENODEV, and if that one fails with ERANGE, then it tries again with
floor dev_pm_opp_find_bw_floor().
Code misses error checks for two cases:
1. First dev_pm_opp_find_bw_ceil() failed with error different than
ERANGE,
2. Any error from second dev_pm_opp_find_bw_floor().
In an unlikely case these error happened, the code would further
dereference the ERR pointer. Close that possibility and make the code
more obvious that all errors are correctly handled.
Reported by Smatch:
icc-bwmon.c:693 bwmon_intr_thread() error: 'target_opp' dereferencing possible ERR_PTR()
Fixes: b9c2ae6cac40 ("soc: qcom: icc-bwmon: Add bandwidth monitoring driver")
Cc: <stable(a)vger.kernel.org>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Closes: https://lore.kernel.org/r/aJTNEQsRFjrFknG9@stanley.mountain/
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
---
Some unreleased smatch, though, because I cannot reproduce the warning,
but I imagine Dan keeps the tastiests reports for later. :)
---
drivers/soc/qcom/icc-bwmon.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
index 3dfa448bf8cf..597f9025e422 100644
--- a/drivers/soc/qcom/icc-bwmon.c
+++ b/drivers/soc/qcom/icc-bwmon.c
@@ -656,6 +656,9 @@ static irqreturn_t bwmon_intr_thread(int irq, void *dev_id)
if (IS_ERR(target_opp) && PTR_ERR(target_opp) == -ERANGE)
target_opp = dev_pm_opp_find_bw_floor(bwmon->dev, &bw_kbps, 0);
+ if (IS_ERR(target_opp))
+ return IRQ_HANDLED;
+
bwmon->target_kbps = bw_kbps;
bw_kbps--;
--
2.48.1
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit 41532b785e9d79636b3815a64ddf6a096647d011 ]
If we're not doing async, the handling is much simpler. There's no
reference counting, we just need to wait for the completion to wake us
up and return its result.
We should preferably also use a separate crypto_wait. I'm not seeing a
UAF as I did in the past, I think aec7961 ("tls: fix race between
async notify and socket close") took care of it.
This will make the next fix easier.
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Link: https://lore.kernel.org/r/47bde5f649707610eaef9f0d679519966fc31061.17091326…
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[ William: The original patch did not apply cleanly due to deletions of
non-existent lines in 6.1.y. The UAF the author stopped seeing can still
be reproduced on systems without AVX in conjunction with cryptd.
Also removed an extraneous statement after a return statement that is
adjacent to diff. ]
Link: https://lore.kernel.org/netdev/he2K1yz_u7bZ-CnYcTSQ4OxuLuHZXN6xZRgp6_ICSWnq…
Signed-off-by: William Liu <will(a)willsroot.io>
---
Tested with original repro and all tests in selftests/net/tls.c
---
net/tls/tls_sw.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 6ac3dcbe87b5..e1e3207168e3 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -274,9 +274,15 @@ static int tls_do_decryption(struct sock *sk,
DEBUG_NET_WARN_ON_ONCE(atomic_read(&ctx->decrypt_pending) < 1);
atomic_inc(&ctx->decrypt_pending);
} else {
+ DECLARE_CRYPTO_WAIT(wait);
+
aead_request_set_callback(aead_req,
CRYPTO_TFM_REQ_MAY_BACKLOG,
- crypto_req_done, &ctx->async_wait);
+ crypto_req_done, &wait);
+ ret = crypto_aead_decrypt(aead_req);
+ if (ret == -EINPROGRESS || ret == -EBUSY)
+ ret = crypto_wait_req(ret, &wait);
+ return ret;
}
ret = crypto_aead_decrypt(aead_req);
@@ -289,7 +295,6 @@ static int tls_do_decryption(struct sock *sk,
/* all completions have run, we're not doing async anymore */
darg->async = false;
return ret;
- ret = ret ?: -EINPROGRESS;
}
atomic_dec(&ctx->decrypt_pending);
--
2.43.0
The patch titled
Subject: mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-memory-failure-fix-infinite-uce-for-vm_pfnmap-pfn.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Jinjiang Tu <tujinjiang(a)huawei.com>
Subject: mm/memory-failure: fix infinite UCE for VM_PFNMAP pfn
Date: Fri, 15 Aug 2025 15:32:09 +0800
When memory_failure() is called for a already hwpoisoned pfn,
kill_accessing_process() will be called to kill current task. However, if
the vma of the accessing vaddr is VM_PFNMAP, walk_page_range() will skip
the vma in walk_page_test() and return 0.
Before commit aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to processes
with recovered clean pages"), kill_accessing_process() will return EFAULT.
For x86, the current task will be killed in kill_me_maybe().
However, after this commit, kill_accessing_process() simplies return 0,
that means UCE is handled properly, but it doesn't actually. In such
case, the user task will trigger UCE infinitely.
To fix it, add .test_walk callback for hwpoison_walk_ops to scan all vmas.
Link: https://lkml.kernel.org/r/20250815073209.1984582-1-tujinjiang@huawei.com
Fixes: aaf99ac2ceb7 ("mm/hwpoison: do not send SIGBUS to processes with recovered clean pages")
Signed-off-by: Jinjiang Tu <tujinjiang(a)huawei.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Jane Chu <jane.chu(a)oracle.com>
Cc: Kefeng Wang <wangkefeng.wang(a)huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Oscar Salvador <osalvador(a)suse.de>
Cc: Shuai Xue <xueshuai(a)linux.alibaba.com>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory-failure.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/mm/memory-failure.c~mm-memory-failure-fix-infinite-uce-for-vm_pfnmap-pfn
+++ a/mm/memory-failure.c
@@ -853,9 +853,17 @@ static int hwpoison_hugetlb_range(pte_t
#define hwpoison_hugetlb_range NULL
#endif
+static int hwpoison_test_walk(unsigned long start, unsigned long end,
+ struct mm_walk *walk)
+{
+ /* We also want to consider pages mapped into VM_PFNMAP. */
+ return 0;
+}
+
static const struct mm_walk_ops hwpoison_walk_ops = {
.pmd_entry = hwpoison_pte_range,
.hugetlb_entry = hwpoison_hugetlb_range,
+ .test_walk = hwpoison_test_walk,
.walk_lock = PGWALK_RDLOCK,
};
_
Patches currently in -mm which might be from tujinjiang(a)huawei.com are
mm-memory-failure-fix-infinite-uce-for-vm_pfnmap-pfn.patch
mm-memory_hotplug-fix-hwpoisoned-large-folio-handling-in-do_migrate_range.patch
The patch titled
Subject: init-handle-bootloader-identifier-in-kernel-parameters-v4
has been added to the -mm mm-nonmm-unstable branch. Its filename is
init-handle-bootloader-identifier-in-kernel-parameters-v4.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-nonmm-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Huacai Chen <chenhuacai(a)loongson.cn>
Subject: init-handle-bootloader-identifier-in-kernel-parameters-v4
Date: Fri, 15 Aug 2025 17:01:20 +0800
use strstarts()
Link: https://lkml.kernel.org/r/20250815090120.1569947-1-chenhuacai@loongson.cn
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Christian Brauner <brauner(a)kernel.org>
Cc: Jan Kara <jack(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
init/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/init/main.c~init-handle-bootloader-identifier-in-kernel-parameters-v4
+++ a/init/main.c
@@ -559,7 +559,7 @@ static int __init unknown_bootoption(cha
/* Handle bootloader identifier */
for (int i = 0; bootloader[i]; i++) {
- if (!strncmp(param, bootloader[i], strlen(bootloader[i])))
+ if (strstarts(param, bootloader[i]))
return 0;
}
_
Patches currently in -mm which might be from chenhuacai(a)loongson.cn are
init-handle-bootloader-identifier-in-kernel-parameters.patch
init-handle-bootloader-identifier-in-kernel-parameters-v4.patch
Pentium 4's which are INTEL_P4_PRESCOTT (model 0x03) and later have
a constant TSC. This was correctly captured until commit fadb6f569b10
("x86/cpu/intel: Limit the non-architectural constant_tsc model checks").
In that commit, an error was introduced while selecting the last P4
model (0x06) as the upper bound. Model 0x06 was transposed to
INTEL_P4_WILLAMETTE, which is just plain wrong. That was presumably a
simple typo, probably just copying and pasting the wrong P4 model.
Fix the constant TSC logic to cover all later P4 models. End at
INTEL_P4_CEDARMILL which accurately corresponds to the last P4 model.
Fixes: fadb6f569b10 ("x86/cpu/intel: Limit the non-architectural constant_tsc model checks")
Cc: <stable(a)vger.kernel.org> # v6.15
Reviewed-by: Sohil Mehta <sohil.mehta(a)intel.com>
Signed-off-by: Suchit Karunakaran <suchitkarunakaran(a)gmail.com>
---
Changes since v4:
- Updated the patch based on review suggestions
Changes since v3:
- Refined changelog
Changes since v2:
- Improved commit message
Changes since v1:
- Fixed incorrect logic
arch/x86/kernel/cpu/intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 076eaa41b8c8..98ae4c37c93e 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -262,7 +262,7 @@ static void early_init_intel(struct cpuinfo_x86 *c)
if (c->x86_power & (1 << 8)) {
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
- } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_WILLAMETTE) ||
+ } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_CEDARMILL) ||
(c->x86_vfm >= INTEL_CORE_YONAH && c->x86_vfm <= INTEL_IVYBRIDGE)) {
set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
}
--
2.50.1
Dear stable team,
I'd like to have the following commit backported to v6.16 and v6.15:
e40fc1160d49 ("mfd: cros_ec: Separate charge-control probing from USB-PD")
It fixes probing issues for the cros-charge-control driver, for details
see the commit message.
As far as I can see, the commit also was marked for -stable backporting
from the beginning. Somehow this doesn't seem to have worked.
Thanks for your consideration,
Thomas
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081509-uncrown-janitor-adad@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 3f66ccbaaef3a0c5bd844eab04e3207b4061c546
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081543-myself-easiest-be00@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f66ccbaaef3a0c5bd844eab04e3207b4061c546 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Wed, 25 Jun 2025 18:33:23 +0900
Subject: [PATCH] block: Make REQ_OP_ZONE_FINISH a write operation
REQ_OP_ZONE_FINISH is defined as "12", which makes
op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a
zone finish operation is an operation that modifies a zone (transition
it to full) and so should be considered as a write operation (albeit
one that does not transfer any data to the device).
Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and
redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential
odd numbers from that new value.
Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3d1577f07c1c..930daff207df 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -350,11 +350,11 @@ enum req_op {
/* Close a zone */
REQ_OP_ZONE_CLOSE = (__force blk_opf_t)11,
/* Transition a zone to full */
- REQ_OP_ZONE_FINISH = (__force blk_opf_t)12,
+ REQ_OP_ZONE_FINISH = (__force blk_opf_t)13,
/* reset a zone write pointer */
- REQ_OP_ZONE_RESET = (__force blk_opf_t)13,
+ REQ_OP_ZONE_RESET = (__force blk_opf_t)15,
/* reset all the zone present on the device */
- REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)15,
+ REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)17,
/* Driver private requests */
REQ_OP_DRV_IN = (__force blk_opf_t)34,
Hello,
Status summary for stable/linux-6.6.y
Dashboard:
https://d.kernelci.org/c/stable/linux-6.6.y/bb9c90ab9c5a1a933a0dfd302a3fde7…
giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
branch: linux-6.6.y
commit hash: bb9c90ab9c5a1a933a0dfd302a3fde73642b2b06
origin: maestro
test start time: 2025-08-15 10:52:19.705000+00:00
Builds: 45 ✅ 0 ❌ 0 ⚠️
Boots: 60 ✅ 0 ❌ 42 ⚠️
Tests: 3948 ✅ 607 ❌ 1617 ⚠️
### POSSIBLE REGRESSIONS
No possible regressions observed.
### FIXED REGRESSIONS
No fixed regressions observed.
### UNSTABLE TESTS
Hardware: bcm2711-rpi-4-b
- boot (defconfig+lab-setup+kselftest)
last run: https://d.kernelci.org/test/maestro:689f184b233e484a3f986be3
history: > ✅ > ✅ > ⚠️
Hardware: dell-latitude-3445-7520c-skyrim
- boot (cros://chromeos-6.6/x86_64/chromeos-amd-stoneyridge.flavour.config+lab-setup+x86-board+CONFIG_MODULE_COMPRESS=n+CONFIG_MODULE_COMPRESS_NONE=y)
last run: https://d.kernelci.org/test/maestro:689f24dd233e484a3f98e2c2
history: > ⚠️ > ⚠️
Sent every day if there were changes in the past 24 hours.
Legend: ✅ PASS ❌ FAIL ⚠️ INCONCLUSIVE
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Hello,
Status summary for stable/linux-6.1.y
Dashboard:
https://d.kernelci.org/c/stable/linux-6.1.y/0bc96de781b4da716c8dbd248af4b26…
giturl: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
branch: linux-6.1.y
commit hash: 0bc96de781b4da716c8dbd248af4b26d8d8341f5
origin: maestro
test start time: 2025-08-15 10:52:19.284000+00:00
Builds: 44 ✅ 1 ❌ 0 ⚠️
Boots: 35 ✅ 0 ❌ 24 ⚠️
Tests: 1994 ✅ 166 ❌ 1054 ⚠️
### POSSIBLE REGRESSIONS
No possible regressions observed.
### FIXED REGRESSIONS
No fixed regressions observed.
### UNSTABLE TESTS
Hardware: bcm2711-rpi-4-b
- boot (defconfig+lab-setup+kselftest)
last run: https://d.kernelci.org/test/maestro:689f28a2233e484a3f98e9a8
history: > ✅ > ✅ > ⚠️
Sent every day if there were changes in the past 24 hours.
Legend: ✅ PASS ❌ FAIL ⚠️ INCONCLUSIVE
--
This is an experimental report format. Please send feedback in!
Talk to us at kernelci(a)lists.linux.dev
Made with love by the KernelCI team - https://kernelci.org
Using device_find_child() and of_find_device_by_node() to locate
devices could cause an imbalance in the device's reference count.
device_find_child() and of_find_device_by_node() both call
get_device() to increment the reference count of the found device
before returning the pointer. In mtk_drm_get_all_drm_priv(), these
references are never released through put_device(), resulting in
permanent reference count increments. Additionally, the
for_each_child_of_node() iterator fails to release node references in
all code paths. This leaks device node references when loop
termination occurs before reaching MAX_CRTC. These reference count
leaks may prevent device/node resources from being properly released
during driver unbind operations.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 1ef7ed48356c ("drm/mediatek: Modify mediatek-drm for mt8195 multi mmsys support")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v2:
- added goto labels as suggestions.
---
drivers/gpu/drm/mediatek/mtk_drm_drv.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index d5e6bab36414..f8a817689e16 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -387,19 +387,19 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
of_id = of_match_node(mtk_drm_of_ids, node);
if (!of_id)
- continue;
+ goto next_put_node;
pdev = of_find_device_by_node(node);
if (!pdev)
- continue;
+ goto next_put_node;
drm_dev = device_find_child(&pdev->dev, NULL, mtk_drm_match);
if (!drm_dev)
- continue;
+ goto next_put_device_pdev_dev;
temp_drm_priv = dev_get_drvdata(drm_dev);
if (!temp_drm_priv)
- continue;
+ goto next_put_device_drm_dev;
if (temp_drm_priv->data->main_len)
all_drm_priv[CRTC_MAIN] = temp_drm_priv;
@@ -411,10 +411,17 @@ static bool mtk_drm_get_all_drm_priv(struct device *dev)
if (temp_drm_priv->mtk_drm_bound)
cnt++;
- if (cnt == MAX_CRTC) {
- of_node_put(node);
+next_put_device_drm_dev:
+ put_device(drm_dev);
+
+next_put_device_pdev_dev:
+ put_device(&pdev->dev);
+
+next_put_node:
+ of_node_put(node);
+
+ if (cnt == MAX_CRTC)
break;
- }
}
if (drm_priv->data->mmsys_dev_num == cnt) {
--
2.25.1
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081508-ultra-derived-f72d@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a Mon Sep 17 00:00:00 2001
From: Lukas Wunner <lukas(a)wunner.de>
Date: Sun, 13 Jul 2025 16:31:01 +0200
Subject: [PATCH] PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable
ports
pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and
pcie_portdrv_remove() to determine whether runtime power management shall
be enabled (on probe) or disabled (on remove) on a PCIe port.
The underlying assumption is that pci_bridge_d3_possible() always returns
the same value, else a runtime PM reference imbalance would occur. That
assumption is not given if the PCIe port is inaccessible on remove due to
hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which
accesses Config Space to determine whether the port is Hot-Plug Capable.
An inaccessible port returns "all ones", which is converted to "all
zeroes" by pcie_capability_read_dword(). Hence the port no longer seems
Hot-Plug Capable on remove even though it was on probe.
The resulting runtime PM ref imbalance causes warning messages such as:
pcieport 0000:02:04.0: Runtime PM usage count underflow!
Avoid the Config Space access (and thus the runtime PM ref imbalance) by
caching the Hot-Plug Capable bit in struct pci_dev.
The struct already contains an "is_hotplug_bridge" flag, which however is
not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI
Hot-Plug bridges and ACPI slots. The flag identifies bridges which are
allocated additional MMIO and bus number resources to allow for hierarchy
expansion.
The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of
places to identify Hot-Plug Capable PCIe ports, even though the flag
encompasses other devices. Subsequent commits replace these occurrences
with the new flag to clearly delineate Hot-Plug Capable PCIe ports from
other kinds of hotplug bridges.
Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag
and document the (non-obvious) requirement that pci_bridge_d3_possible()
always returns the same value across the entire lifetime of a bridge,
including its hot-removal.
Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter")
Reported-by: Laurent Bigonville <bigon(a)bigon.be>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216
Reported-by: Mario Limonciello <mario.limonciello(a)amd.com>
Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/
Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas(a)google.com>
Acked-by: Rafael J. Wysocki <rafael(a)kernel.org>
Cc: stable(a)vger.kernel.org # v4.18+
Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.175239010…
diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index b78e0e417324..efe478e5073e 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -816,13 +816,11 @@ int pci_acpi_program_hp_params(struct pci_dev *dev)
bool pciehp_is_native(struct pci_dev *bridge)
{
const struct pci_host_bridge *host;
- u32 slot_cap;
if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))
return false;
- pcie_capability_read_dword(bridge, PCI_EXP_SLTCAP, &slot_cap);
- if (!(slot_cap & PCI_EXP_SLTCAP_HPC))
+ if (!bridge->is_pciehp)
return false;
if (pcie_ports_native)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e9448d55113b..23d8fe98ddf9 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3030,8 +3030,12 @@ static const struct dmi_system_id bridge_d3_blacklist[] = {
* pci_bridge_d3_possible - Is it possible to put the bridge into D3
* @bridge: Bridge to check
*
- * This function checks if it is possible to move the bridge to D3.
* Currently we only allow D3 for some PCIe ports and for Thunderbolt.
+ *
+ * Return: Whether it is possible to move the bridge to D3.
+ *
+ * The return value is guaranteed to be constant across the entire lifetime
+ * of the bridge, including its hot-removal.
*/
bool pci_bridge_d3_possible(struct pci_dev *bridge)
{
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..cf50be63bf5f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1678,7 +1678,7 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32);
if (reg32 & PCI_EXP_SLTCAP_HPC)
- pdev->is_hotplug_bridge = 1;
+ pdev->is_hotplug_bridge = pdev->is_pciehp = 1;
}
static void set_pcie_thunderbolt(struct pci_dev *dev)
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 05e68f35f392..d56d0dd80afb 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -328,6 +328,11 @@ struct rcec_ea;
* determined (e.g., for Root Complex Integrated
* Endpoints without the relevant Capability
* Registers).
+ * @is_hotplug_bridge: Hotplug bridge of any kind (e.g. PCIe Hot-Plug Capable,
+ * Conventional PCI Hot-Plug, ACPI slot).
+ * Such bridges are allocated additional MMIO and bus
+ * number resources to allow for hierarchy expansion.
+ * @is_pciehp: PCIe Hot-Plug Capable bridge.
*/
struct pci_dev {
struct list_head bus_list; /* Node in per-bus list */
@@ -451,6 +456,7 @@ struct pci_dev {
unsigned int is_physfn:1;
unsigned int is_virtfn:1;
unsigned int is_hotplug_bridge:1;
+ unsigned int is_pciehp:1;
unsigned int shpc_managed:1; /* SHPC owned by shpchp */
unsigned int is_thunderbolt:1; /* Thunderbolt controller */
/*
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 3f66ccbaaef3a0c5bd844eab04e3207b4061c546
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081543-graffiti-nastily-6e2b@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 3f66ccbaaef3a0c5bd844eab04e3207b4061c546 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <dlemoal(a)kernel.org>
Date: Wed, 25 Jun 2025 18:33:23 +0900
Subject: [PATCH] block: Make REQ_OP_ZONE_FINISH a write operation
REQ_OP_ZONE_FINISH is defined as "12", which makes
op_is_write(REQ_OP_ZONE_FINISH) return false, despite the fact that a
zone finish operation is an operation that modifies a zone (transition
it to full) and so should be considered as a write operation (albeit
one that does not transfer any data to the device).
Fix this by redefining REQ_OP_ZONE_FINISH to be an odd number (13), and
redefine REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL using sequential
odd numbers from that new value.
Fixes: 6c1b1da58f8c ("block: add zone open, close and finish operations")
Cc: stable(a)vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal(a)kernel.org>
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Link: https://lore.kernel.org/r/20250625093327.548866-2-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 3d1577f07c1c..930daff207df 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -350,11 +350,11 @@ enum req_op {
/* Close a zone */
REQ_OP_ZONE_CLOSE = (__force blk_opf_t)11,
/* Transition a zone to full */
- REQ_OP_ZONE_FINISH = (__force blk_opf_t)12,
+ REQ_OP_ZONE_FINISH = (__force blk_opf_t)13,
/* reset a zone write pointer */
- REQ_OP_ZONE_RESET = (__force blk_opf_t)13,
+ REQ_OP_ZONE_RESET = (__force blk_opf_t)15,
/* reset all the zone present on the device */
- REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)15,
+ REQ_OP_ZONE_RESET_ALL = (__force blk_opf_t)17,
/* Driver private requests */
REQ_OP_DRV_IN = (__force blk_opf_t)34,
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x b41c1d8d07906786c60893980d52688f31d114a6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081514-salad-crumpet-f125@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b41c1d8d07906786c60893980d52688f31d114a6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Fri, 4 Jul 2025 00:03:22 -0700
Subject: [PATCH] fscrypt: Don't use problematic non-inline crypto engines
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index f63791641c1d..696a5844bfa3 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -147,9 +147,8 @@ However, these ioctls have some limitations:
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
@@ -406,9 +405,12 @@ the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
@@ -1318,22 +1320,13 @@ this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -45,6 +45,23 @@
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 5c095c8aa3b5..b1ef506cd341 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -58,7 +58,7 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index a67e20d126c9..74d4a2e1ad23 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -104,7 +104,8 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -52,7 +52,8 @@ static int derive_key_aes(const u8 *master_key,
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x b41c1d8d07906786c60893980d52688f31d114a6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081512-filter-droop-370f@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b41c1d8d07906786c60893980d52688f31d114a6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Fri, 4 Jul 2025 00:03:22 -0700
Subject: [PATCH] fscrypt: Don't use problematic non-inline crypto engines
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index f63791641c1d..696a5844bfa3 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -147,9 +147,8 @@ However, these ioctls have some limitations:
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
@@ -406,9 +405,12 @@ the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
@@ -1318,22 +1320,13 @@ this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -45,6 +45,23 @@
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 5c095c8aa3b5..b1ef506cd341 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -58,7 +58,7 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index a67e20d126c9..74d4a2e1ad23 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -104,7 +104,8 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -52,7 +52,8 @@ static int derive_key_aes(const u8 *master_key,
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
So we've had this regression in 9p for.. almost a year, which is way too
long, but there was no "easy" reproducer until yesterday (thank you
again!!)
It turned out to be a bug with iov_iter on folios,
iov_iter_get_pages_alloc2() would advance the iov_iter correctly up to
the end edge of a folio and the later copy_to_iter() fails on the
iterate_folioq() bug.
Happy to consider alternative ways of fixing this, now there's a
reproducer it's all much clearer; for the bug to be visible we basically
need to make and IO with non-contiguous folios in the iov_iter which is
not obvious to test with synthetic VMs, with size that triggers a
zero-copy read followed by a non-zero-copy read.
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
---
Changes in v3:
- convert 'goto next' to a big "if there is valid data in current folio"
Future optimizations can remove it again after making sure this (iov iter
advanced to end of folio) can never happen.
- Link to v2: https://lore.kernel.org/r/20250812-iot_iter_folio-v2-0-f99423309478@codewre…
Changes in v2:
- Fixed 'remain' being used uninitialized in iterate_folioq when going
through the goto
- s/forwarded/advanced in commit message
- Link to v1: https://lore.kernel.org/r/20250811-iot_iter_folio-v1-0-d9c223adf93c@codewre…
---
Dominique Martinet (2):
iov_iter: iterate_folioq: fix handling of offset >= folio size
iov_iter: iov_folioq_get_pages: don't leave empty slot behind
include/linux/iov_iter.h | 20 +++++++++++---------
lib/iov_iter.c | 6 +++---
2 files changed, 14 insertions(+), 12 deletions(-)
---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250811-iot_iter_folio-1b7849f88fed
Best regards,
--
Dominique Martinet <asmadeus(a)codewreck.org>
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 908e4ead7f757504d8b345452730636e298cbf68
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081525-comprised-nastily-418e@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 908e4ead7f757504d8b345452730636e298cbf68 Mon Sep 17 00:00:00 2001
From: Jeff Layton <jlayton(a)kernel.org>
Date: Wed, 4 Jun 2025 12:01:10 -0400
Subject: [PATCH] nfsd: handle get_client_locked() failure in
nfsd4_setclientid_confirm()
Lei Lu recently reported that nfsd4_setclientid_confirm() did not check
the return value from get_client_locked(). a SETCLIENTID_CONFIRM could
race with a confirmed client expiring and fail to get a reference. That
could later lead to a UAF.
Fix this by getting a reference early in the case where there is an
extant confirmed client. If that fails then treat it as if there were no
confirmed client found at all.
In the case where the unconfirmed client is expiring, just fail and
return the result from get_client_locked().
Reported-by: lei lu <llfamsec(a)gmail.com>
Closes: https://lore.kernel.org/linux-nfs/CAEBF3_b=UvqzNKdnfD_52L05Mqrqui9vZ2eFamgA…
Fixes: d20c11d86d8f ("nfsd: Protect session creation and client confirm using client_lock")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)kernel.org>
Signed-off-by: Chuck Lever <chuck.lever(a)oracle.com>
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index a0e3fa2718c7..95d88555648d 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -4690,10 +4690,16 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
}
status = nfs_ok;
if (conf) {
- old = unconf;
- unhash_client_locked(old);
- nfsd4_change_callback(conf, &unconf->cl_cb_conn);
- } else {
+ if (get_client_locked(conf) == nfs_ok) {
+ old = unconf;
+ unhash_client_locked(old);
+ nfsd4_change_callback(conf, &unconf->cl_cb_conn);
+ } else {
+ conf = NULL;
+ }
+ }
+
+ if (!conf) {
old = find_confirmed_client_by_name(&unconf->cl_name, nn);
if (old) {
status = nfserr_clid_inuse;
@@ -4710,10 +4716,14 @@ nfsd4_setclientid_confirm(struct svc_rqst *rqstp,
}
trace_nfsd_clid_replaced(&old->cl_clientid);
}
+ status = get_client_locked(unconf);
+ if (status != nfs_ok) {
+ old = NULL;
+ goto out;
+ }
move_to_confirmed(unconf);
conf = unconf;
}
- get_client_locked(conf);
spin_unlock(&nn->client_lock);
if (conf == unconf)
fsnotify_dentry(conf->cl_nfsd_info_dentry, FS_MODIFY);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 70458f8a6b44daf3ad39f0d9b6d1097c8a7780ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081557-glimmer-distill-dc47@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 70458f8a6b44daf3ad39f0d9b6d1097c8a7780ed Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 25 Jul 2025 19:12:10 +0200
Subject: [PATCH] net: enetc: fix device and OF node leak at probe
Make sure to drop the references to the IERB OF node and platform device
taken by of_parse_phandle() and of_find_device_by_node() during probe.
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Cc: stable(a)vger.kernel.org # 5.13
Cc: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20250725171213.880-3-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index f63a29e2e031..de0fb272c847 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -829,19 +829,29 @@ static int enetc_pf_register_with_ierb(struct pci_dev *pdev)
{
struct platform_device *ierb_pdev;
struct device_node *ierb_node;
+ int ret;
ierb_node = of_find_compatible_node(NULL, NULL,
"fsl,ls1028a-enetc-ierb");
- if (!ierb_node || !of_device_is_available(ierb_node))
+ if (!ierb_node)
return -ENODEV;
+ if (!of_device_is_available(ierb_node)) {
+ of_node_put(ierb_node);
+ return -ENODEV;
+ }
+
ierb_pdev = of_find_device_by_node(ierb_node);
of_node_put(ierb_node);
if (!ierb_pdev)
return -EPROBE_DEFER;
- return enetc_ierb_register_pf(ierb_pdev, pdev);
+ ret = enetc_ierb_register_pf(ierb_pdev, pdev);
+
+ put_device(&ierb_pdev->dev);
+
+ return ret;
}
static const struct enetc_si_ops enetc_psi_ops = {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 70458f8a6b44daf3ad39f0d9b6d1097c8a7780ed
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081556-icon-nursing-ee42@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 70458f8a6b44daf3ad39f0d9b6d1097c8a7780ed Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 25 Jul 2025 19:12:10 +0200
Subject: [PATCH] net: enetc: fix device and OF node leak at probe
Make sure to drop the references to the IERB OF node and platform device
taken by of_parse_phandle() and of_find_device_by_node() during probe.
Fixes: e7d48e5fbf30 ("net: enetc: add a mini driver for the Integrated Endpoint Register Block")
Cc: stable(a)vger.kernel.org # 5.13
Cc: Vladimir Oltean <vladimir.oltean(a)nxp.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Reviewed-by: Simon Horman <horms(a)kernel.org>
Link: https://patch.msgid.link/20250725171213.880-3-johan@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index f63a29e2e031..de0fb272c847 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -829,19 +829,29 @@ static int enetc_pf_register_with_ierb(struct pci_dev *pdev)
{
struct platform_device *ierb_pdev;
struct device_node *ierb_node;
+ int ret;
ierb_node = of_find_compatible_node(NULL, NULL,
"fsl,ls1028a-enetc-ierb");
- if (!ierb_node || !of_device_is_available(ierb_node))
+ if (!ierb_node)
return -ENODEV;
+ if (!of_device_is_available(ierb_node)) {
+ of_node_put(ierb_node);
+ return -ENODEV;
+ }
+
ierb_pdev = of_find_device_by_node(ierb_node);
of_node_put(ierb_node);
if (!ierb_pdev)
return -EPROBE_DEFER;
- return enetc_ierb_register_pf(ierb_pdev, pdev);
+ ret = enetc_ierb_register_pf(ierb_pdev, pdev);
+
+ put_device(&ierb_pdev->dev);
+
+ return ret;
}
static const struct enetc_si_ops enetc_psi_ops = {
Hi,
Please apply the commit e37617c8e53a1f7fcba6d5e1041f4fd8a2425c27,
it fix the REGRESSION by ada8d7fa0ad49a2a078f97f7f6e02d24d3c357a3
("sched/cpufreq: Rework schedutil governor performance estimation") which
introduced in v6.6.89.
Best Regards
Wentao Guan
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 65ba2a6e77e9e5c843a591055789050e77b5c65e
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081521-dangle-drapery-9a9a@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 65ba2a6e77e9e5c843a591055789050e77b5c65e Mon Sep 17 00:00:00 2001
From: Siddharth Vadapalli <s-vadapalli(a)ti.com>
Date: Mon, 23 Jun 2025 15:36:57 +0530
Subject: [PATCH] arm64: dts: ti: k3-j722s-evm: Fix USB gpio-hog level for
Type-C
According to the "GPIO Expander Map / Table" section of the J722S EVM
Schematic within the Evaluation Module Design Files package [0], the
GPIO Pin P05 located on the GPIO Expander 1 (I2C0/0x23) has to be pulled
down to select the Type-C interface. Since commit under Fixes claims to
enable the Type-C interface, update the property within "p05-hog" from
"output-high" to "output-low", thereby switching from the Type-A
interface to the Type-C interface.
[0]: https://www.ti.com/lit/zip/sprr495
Cc: stable(a)vger.kernel.org
Fixes: 485705df5d5f ("arm64: dts: ti: k3-j722s: Enable PCIe and USB support on J722S-EVM")
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
Link: https://lore.kernel.org/r/20250623100657.4082031-1-s-vadapalli@ti.com
Signed-off-by: Vignesh Raghavendra <vigneshr(a)ti.com>
diff --git a/arch/arm64/boot/dts/ti/k3-j722s-evm.dts b/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
index a47852fdca70..d0533723412a 100644
--- a/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
+++ b/arch/arm64/boot/dts/ti/k3-j722s-evm.dts
@@ -634,7 +634,7 @@ p05-hog {
/* P05 - USB2.0_MUX_SEL */
gpio-hog;
gpios = <5 GPIO_ACTIVE_LOW>;
- output-high;
+ output-low;
};
p01_hog: p01-hog {
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x f2e467a48287c868818085aa35389a224d226732
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081535-enrage-raging-295b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f2e467a48287c868818085aa35389a224d226732 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Fri, 11 Jul 2025 18:33:36 +0200
Subject: [PATCH] eventpoll: Fix semi-unbounded recursion
Ensure that epoll instances can never form a graph deeper than
EP_MAX_NESTS+1 links.
Currently, ep_loop_check_proc() ensures that the graph is loop-free and
does some recursion depth checks, but those recursion depth checks don't
limit the depth of the resulting tree for two reasons:
- They don't look upwards in the tree.
- If there are multiple downwards paths of different lengths, only one of
the paths is actually considered for the depth check since commit
28d82dc1c4ed ("epoll: limit paths").
Essentially, the current recursion depth check in ep_loop_check_proc() just
serves to prevent it from recursing too deeply while checking for loops.
A more thorough check is done in reverse_path_check() after the new graph
edge has already been created; this checks, among other things, that no
paths going upwards from any non-epoll file with a length of more than 5
edges exist. However, this check does not apply to non-epoll files.
As a result, it is possible to recurse to a depth of at least roughly 500,
tested on v6.15. (I am unsure if deeper recursion is possible; and this may
have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
problem").)
To fix it:
1. In ep_loop_check_proc(), note the subtree depth of each visited node,
and use subtree depths for the total depth calculation even when a subtree
has already been visited.
2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
depth of an upwards walk.
3. In ep_loop_check(), use these values to limit the total path length
between epoll nodes to EP_MAX_NESTS edges.
Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Link: https://lore.kernel.org/20250711-epoll-recursion-fix-v1-1-fb2457c33292@goog…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d4dbffdedd08..44648cc09250 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -218,6 +218,7 @@ struct eventpoll {
/* used to optimize loop detection check */
u64 gen;
struct hlist_head refs;
+ u8 loop_check_depth;
/*
* usage count, used together with epitem->dying to
@@ -2142,23 +2143,24 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
}
/**
- * ep_loop_check_proc - verify that adding an epoll file inside another
- * epoll structure does not violate the constraints, in
- * terms of closed loops, or too deep chains (which can
- * result in excessive stack usage).
+ * ep_loop_check_proc - verify that adding an epoll file @ep inside another
+ * epoll file does not create closed loops, and
+ * determine the depth of the subtree starting at @ep
*
* @ep: the &struct eventpoll to be currently checked.
* @depth: Current depth of the path being checked.
*
- * Return: %zero if adding the epoll @file inside current epoll
- * structure @ep does not violate the constraints, or %-1 otherwise.
+ * Return: depth of the subtree, or INT_MAX if we found a loop or went too deep.
*/
static int ep_loop_check_proc(struct eventpoll *ep, int depth)
{
- int error = 0;
+ int result = 0;
struct rb_node *rbp;
struct epitem *epi;
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+
mutex_lock_nested(&ep->mtx, depth + 1);
ep->gen = loop_check_gen;
for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = rb_next(rbp)) {
@@ -2166,13 +2168,11 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
if (unlikely(is_file_epoll(epi->ffd.file))) {
struct eventpoll *ep_tovisit;
ep_tovisit = epi->ffd.file->private_data;
- if (ep_tovisit->gen == loop_check_gen)
- continue;
if (ep_tovisit == inserting_into || depth > EP_MAX_NESTS)
- error = -1;
+ result = INT_MAX;
else
- error = ep_loop_check_proc(ep_tovisit, depth + 1);
- if (error != 0)
+ result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1);
+ if (result > EP_MAX_NESTS)
break;
} else {
/*
@@ -2186,9 +2186,27 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
list_file(epi->ffd.file);
}
}
+ ep->loop_check_depth = result;
mutex_unlock(&ep->mtx);
- return error;
+ return result;
+}
+
+/**
+ * ep_get_upwards_depth_proc - determine depth of @ep when traversed upwards
+ */
+static int ep_get_upwards_depth_proc(struct eventpoll *ep, int depth)
+{
+ int result = 0;
+ struct epitem *epi;
+
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+ hlist_for_each_entry_rcu(epi, &ep->refs, fllink)
+ result = max(result, ep_get_upwards_depth_proc(epi->ep, depth + 1) + 1);
+ ep->gen = loop_check_gen;
+ ep->loop_check_depth = result;
+ return result;
}
/**
@@ -2204,8 +2222,22 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
*/
static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to)
{
+ int depth, upwards_depth;
+
inserting_into = ep;
- return ep_loop_check_proc(to, 0);
+ /*
+ * Check how deep down we can get from @to, and whether it is possible
+ * to loop up to @ep.
+ */
+ depth = ep_loop_check_proc(to, 0);
+ if (depth > EP_MAX_NESTS)
+ return -1;
+ /* Check how far up we can go from @ep. */
+ rcu_read_lock();
+ upwards_depth = ep_get_upwards_depth_proc(ep, 0);
+ rcu_read_unlock();
+
+ return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
}
static void clear_tfile_check_list(void)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x f2e467a48287c868818085aa35389a224d226732
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081534-circling-deepen-8561@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f2e467a48287c868818085aa35389a224d226732 Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Fri, 11 Jul 2025 18:33:36 +0200
Subject: [PATCH] eventpoll: Fix semi-unbounded recursion
Ensure that epoll instances can never form a graph deeper than
EP_MAX_NESTS+1 links.
Currently, ep_loop_check_proc() ensures that the graph is loop-free and
does some recursion depth checks, but those recursion depth checks don't
limit the depth of the resulting tree for two reasons:
- They don't look upwards in the tree.
- If there are multiple downwards paths of different lengths, only one of
the paths is actually considered for the depth check since commit
28d82dc1c4ed ("epoll: limit paths").
Essentially, the current recursion depth check in ep_loop_check_proc() just
serves to prevent it from recursing too deeply while checking for loops.
A more thorough check is done in reverse_path_check() after the new graph
edge has already been created; this checks, among other things, that no
paths going upwards from any non-epoll file with a length of more than 5
edges exist. However, this check does not apply to non-epoll files.
As a result, it is possible to recurse to a depth of at least roughly 500,
tested on v6.15. (I am unsure if deeper recursion is possible; and this may
have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion
problem").)
To fix it:
1. In ep_loop_check_proc(), note the subtree depth of each visited node,
and use subtree depths for the total depth calculation even when a subtree
has already been visited.
2. Add ep_get_upwards_depth_proc() for similarly determining the maximum
depth of an upwards walk.
3. In ep_loop_check(), use these values to limit the total path length
between epoll nodes to EP_MAX_NESTS edges.
Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jann Horn <jannh(a)google.com>
Link: https://lore.kernel.org/20250711-epoll-recursion-fix-v1-1-fb2457c33292@goog…
Signed-off-by: Christian Brauner <brauner(a)kernel.org>
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index d4dbffdedd08..44648cc09250 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -218,6 +218,7 @@ struct eventpoll {
/* used to optimize loop detection check */
u64 gen;
struct hlist_head refs;
+ u8 loop_check_depth;
/*
* usage count, used together with epitem->dying to
@@ -2142,23 +2143,24 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
}
/**
- * ep_loop_check_proc - verify that adding an epoll file inside another
- * epoll structure does not violate the constraints, in
- * terms of closed loops, or too deep chains (which can
- * result in excessive stack usage).
+ * ep_loop_check_proc - verify that adding an epoll file @ep inside another
+ * epoll file does not create closed loops, and
+ * determine the depth of the subtree starting at @ep
*
* @ep: the &struct eventpoll to be currently checked.
* @depth: Current depth of the path being checked.
*
- * Return: %zero if adding the epoll @file inside current epoll
- * structure @ep does not violate the constraints, or %-1 otherwise.
+ * Return: depth of the subtree, or INT_MAX if we found a loop or went too deep.
*/
static int ep_loop_check_proc(struct eventpoll *ep, int depth)
{
- int error = 0;
+ int result = 0;
struct rb_node *rbp;
struct epitem *epi;
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+
mutex_lock_nested(&ep->mtx, depth + 1);
ep->gen = loop_check_gen;
for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = rb_next(rbp)) {
@@ -2166,13 +2168,11 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
if (unlikely(is_file_epoll(epi->ffd.file))) {
struct eventpoll *ep_tovisit;
ep_tovisit = epi->ffd.file->private_data;
- if (ep_tovisit->gen == loop_check_gen)
- continue;
if (ep_tovisit == inserting_into || depth > EP_MAX_NESTS)
- error = -1;
+ result = INT_MAX;
else
- error = ep_loop_check_proc(ep_tovisit, depth + 1);
- if (error != 0)
+ result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1);
+ if (result > EP_MAX_NESTS)
break;
} else {
/*
@@ -2186,9 +2186,27 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
list_file(epi->ffd.file);
}
}
+ ep->loop_check_depth = result;
mutex_unlock(&ep->mtx);
- return error;
+ return result;
+}
+
+/**
+ * ep_get_upwards_depth_proc - determine depth of @ep when traversed upwards
+ */
+static int ep_get_upwards_depth_proc(struct eventpoll *ep, int depth)
+{
+ int result = 0;
+ struct epitem *epi;
+
+ if (ep->gen == loop_check_gen)
+ return ep->loop_check_depth;
+ hlist_for_each_entry_rcu(epi, &ep->refs, fllink)
+ result = max(result, ep_get_upwards_depth_proc(epi->ep, depth + 1) + 1);
+ ep->gen = loop_check_gen;
+ ep->loop_check_depth = result;
+ return result;
}
/**
@@ -2204,8 +2222,22 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth)
*/
static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to)
{
+ int depth, upwards_depth;
+
inserting_into = ep;
- return ep_loop_check_proc(to, 0);
+ /*
+ * Check how deep down we can get from @to, and whether it is possible
+ * to loop up to @ep.
+ */
+ depth = ep_loop_check_proc(to, 0);
+ if (depth > EP_MAX_NESTS)
+ return -1;
+ /* Check how far up we can go from @ep. */
+ rcu_read_lock();
+ upwards_depth = ep_get_upwards_depth_proc(ep, 0);
+ rcu_read_unlock();
+
+ return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0;
}
static void clear_tfile_check_list(void)
On Wed, Aug 13, 2025 at 06:25:32PM +0100, Jon Hunter wrote:
> On Wed, Aug 13, 2025 at 08:48:28AM -0700, Jon Hunter wrote:
> > On Tue, 12 Aug 2025 19:43:28 +0200, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 6.15.10 release.
> > > There are 480 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Thu, 14 Aug 2025 17:42:20 +0000.
> > > Anything received after that time might be too late.
> > >
> > > The whole patch series can be found in one patch at:
> > > https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.10-rc…
> > > or in the git tree and branch at:
> > > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> > > and the diffstat can be found below.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > Failures detected for Tegra ...
> >
> > Test results for stable-v6.15:
> > 10 builds: 10 pass, 0 fail
> > 28 boots: 28 pass, 0 fail
> > 120 tests: 119 pass, 1 fail
> >
> > Linux version: 6.15.10-rc1-g2510f67e2e34
> > Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000,
> > tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000,
> > tegra194-p3509-0000+p3668-0000, tegra20-ventana,
> > tegra210-p2371-2180, tegra210-p3450-0000,
> > tegra30-cardhu-a04
> >
> > Test failures: tegra194-p2972-0000: boot.py
>
> I am seeing the following kernel warning for both linux-6.15.y and linux-6.16.y …
>
> WARNING KERN sched: DL replenish lagged too much
>
> I believe that this is introduced by …
>
> Peter Zijlstra <peterz(a)infradead.org>
> sched/deadline: Less agressive dl_server handling
>
> This has been reported here: https://lore.kernel.org/all/CAMuHMdXn4z1pioTtBGMfQM0jsLviqS2jwysaWXpoLxWYoG…
I've now dropped this.
Is that causing the test failure for you?
thanks,
greg k-h
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081559-frighten-antivirus-2da1@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081558-designate-eatery-407a@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081558-kissable-seducing-8159@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081557-unnatural-unexpired-089f@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081557-sinless-cleat-22ee@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081556-relish-alongside-9c22@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x eec76ea5a7213c48529a46eed1b343e5cee3aaab
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081556-septic-skeleton-d360@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From eec76ea5a7213c48529a46eed1b343e5cee3aaab Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:58 -0700
Subject: [PATCH] lib/crypto: arm64/poly1305: Fix register corruption in
no-SIMD contexts
Restore the SIMD usability check that was removed by commit a59e5468a921
("crypto: arm64/poly1305 - Add block-only interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use may_use_simd() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: a59e5468a921 ("crypto: arm64/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/arm64/poly1305-glue.c b/lib/crypto/arm64/poly1305-glue.c
index c9a74766785b..31aea21ce42f 100644
--- a/lib/crypto/arm64/poly1305-glue.c
+++ b/lib/crypto/arm64/poly1305-glue.c
@@ -7,6 +7,7 @@
#include <asm/hwcap.h>
#include <asm/neon.h>
+#include <asm/simd.h>
#include <crypto/internal/poly1305.h>
#include <linux/cpufeature.h>
#include <linux/jump_label.h>
@@ -33,7 +34,7 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *src,
unsigned int len, u32 padbit)
{
len = round_down(len, POLY1305_BLOCK_SIZE);
- if (static_branch_likely(&have_neon)) {
+ if (static_branch_likely(&have_neon) && likely(may_use_simd())) {
do {
unsigned int todo = min_t(unsigned int, len, SZ_4K);
If all the subrequests in an unbuffered write stream fail, the subrequest
collector doesn't update the stream->transferred value and it retains its
initial LONG_MAX value. Unfortunately, if all active streams fail, then we
take the smallest value of { LONG_MAX, LONG_MAX, ... } as the value to set
in wreq->transferred - which is then returned from ->write_iter().
LONG_MAX was chosen as the initial value so that all the streams can be
quickly assessed by taking the smallest value of all stream->transferred -
but this only works if we've set any of them.
Fix this by adding a flag to indicate whether the value in
stream->transferred is valid and checking that when we integrate the
values. stream->transferred can then be initialised to zero.
This was found by running the generic/750 xfstest against cifs with
cache=none. It splices data to the target file. Once (if) it has used up
all the available scratch space, the writes start failing with ENOSPC.
This causes ->write_iter() to fail. However, it was returning
wreq->transferred, i.e. LONG_MAX, rather than an error (because it thought
the amount transferred was non-zero) and iter_file_splice_write() would
then try to clean up that amount of pipe bufferage - leading to an oops
when it overran. The kernel log showed:
CIFS: VFS: Send error in write = -28
followed by:
BUG: kernel NULL pointer dereference, address: 0000000000000008
with:
RIP: 0010:iter_file_splice_write+0x3a4/0x520
do_splice+0x197/0x4e0
or:
RIP: 0010:pipe_buf_release (include/linux/pipe_fs_i.h:282)
iter_file_splice_write (fs/splice.c:755)
Also put a warning check into splice to announce if ->write_iter() returned
that it had written more than it was asked to.
Fixes: 288ace2f57c9 ("netfs: New writeback implementation")
Reported-by: Xiaoli Feng <fengxiaoli0714(a)gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220445
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Paulo Alcantara <pc(a)manguebit.org>
cc: Steve French <sfrench(a)samba.org>
cc: Shyam Prasad N <sprasad(a)microsoft.com>
cc: netfs(a)lists.linux.dev
cc: linux-cifs(a)vger.kernel.org
cc: linux-fsdevel(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
fs/netfs/read_collect.c | 4 +++-
fs/netfs/write_collect.c | 10 ++++++++--
fs/netfs/write_issue.c | 4 ++--
fs/splice.c | 3 +++
include/linux/netfs.h | 1 +
5 files changed, 17 insertions(+), 5 deletions(-)
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 3e804da1e1eb..a95e7aadafd0 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -281,8 +281,10 @@ static void netfs_collect_read_results(struct netfs_io_request *rreq)
} else if (test_bit(NETFS_RREQ_SHORT_TRANSFER, &rreq->flags)) {
notes |= MADE_PROGRESS;
} else {
- if (!stream->failed)
+ if (!stream->failed) {
stream->transferred += transferred;
+ stream->transferred_valid = true;
+ }
if (front->transferred < front->len)
set_bit(NETFS_RREQ_SHORT_TRANSFER, &rreq->flags);
notes |= MADE_PROGRESS;
diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c
index 0f3a36852a4d..cbf3d9194c7b 100644
--- a/fs/netfs/write_collect.c
+++ b/fs/netfs/write_collect.c
@@ -254,6 +254,7 @@ static void netfs_collect_write_results(struct netfs_io_request *wreq)
if (front->start + front->transferred > stream->collected_to) {
stream->collected_to = front->start + front->transferred;
stream->transferred = stream->collected_to - wreq->start;
+ stream->transferred_valid = true;
notes |= MADE_PROGRESS;
}
if (test_bit(NETFS_SREQ_FAILED, &front->flags)) {
@@ -356,6 +357,7 @@ bool netfs_write_collection(struct netfs_io_request *wreq)
{
struct netfs_inode *ictx = netfs_inode(wreq->inode);
size_t transferred;
+ bool transferred_valid = false;
int s;
_enter("R=%x", wreq->debug_id);
@@ -376,12 +378,16 @@ bool netfs_write_collection(struct netfs_io_request *wreq)
continue;
if (!list_empty(&stream->subrequests))
return false;
- if (stream->transferred < transferred)
+ if (stream->transferred_valid &&
+ stream->transferred < transferred) {
transferred = stream->transferred;
+ transferred_valid = true;
+ }
}
/* Okay, declare that all I/O is complete. */
- wreq->transferred = transferred;
+ if (transferred_valid)
+ wreq->transferred = transferred;
trace_netfs_rreq(wreq, netfs_rreq_trace_write_done);
if (wreq->io_streams[1].active &&
diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c
index 50bee2c4130d..0584cba1a043 100644
--- a/fs/netfs/write_issue.c
+++ b/fs/netfs/write_issue.c
@@ -118,12 +118,12 @@ struct netfs_io_request *netfs_create_write_req(struct address_space *mapping,
wreq->io_streams[0].prepare_write = ictx->ops->prepare_write;
wreq->io_streams[0].issue_write = ictx->ops->issue_write;
wreq->io_streams[0].collected_to = start;
- wreq->io_streams[0].transferred = LONG_MAX;
+ wreq->io_streams[0].transferred = 0;
wreq->io_streams[1].stream_nr = 1;
wreq->io_streams[1].source = NETFS_WRITE_TO_CACHE;
wreq->io_streams[1].collected_to = start;
- wreq->io_streams[1].transferred = LONG_MAX;
+ wreq->io_streams[1].transferred = 0;
if (fscache_resources_valid(&wreq->cache_resources)) {
wreq->io_streams[1].avail = true;
wreq->io_streams[1].active = true;
diff --git a/fs/splice.c b/fs/splice.c
index 4d6df083e0c0..f5094b6d00a0 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -739,6 +739,9 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
sd.pos = kiocb.ki_pos;
if (ret <= 0)
break;
+ WARN_ONCE(ret > sd.total_len - left,
+ "Splice Exceeded! ret=%zd tot=%zu left=%zu\n",
+ ret, sd.total_len, left);
sd.num_spliced += ret;
sd.total_len -= ret;
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index 185bd8196503..98c96d649bf9 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -150,6 +150,7 @@ struct netfs_io_stream {
bool active; /* T if stream is active */
bool need_retry; /* T if this stream needs retrying */
bool failed; /* T if this stream failed */
+ bool transferred_valid; /* T is ->transferred is valid */
};
/*
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081509-octopus-proponent-0409@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081508-pelt-scouting-b79c@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081508-handcart-oversweet-f21e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081507-raking-viewless-4b7d@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081507-verse-prozac-3c44@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081506-outsource-shorter-c39a@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081506-washable-climatic-2c7b@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 16f2c30e290e04135b70ad374fb7e1d1ed9ff5e7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Sun, 6 Jul 2025 16:10:59 -0700
Subject: [PATCH] lib/crypto: x86/poly1305: Fix register corruption in no-SIMD
contexts
Restore the SIMD usability check and base conversion that were removed
by commit 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only
interface").
This safety check is cheap and is well worth eliminating a footgun.
While the Poly1305 functions should not be called when SIMD registers
are unusable, if they are anyway, they should just do the right thing
instead of corrupting random tasks' registers and/or computing incorrect
MACs. Fixing this is also needed for poly1305_kunit to pass.
Just use irq_fpu_usable() instead of the original crypto_simd_usable(),
since poly1305_kunit won't rely on crypto_simd_disabled_for_test.
Fixes: 318c53ae02f2 ("crypto: x86/poly1305 - Add block-only interface")
Cc: stable(a)vger.kernel.org
Reviewed-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250706231100.176113-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/lib/crypto/x86/poly1305_glue.c b/lib/crypto/x86/poly1305_glue.c
index b7e78a583e07..968d84677631 100644
--- a/lib/crypto/x86/poly1305_glue.c
+++ b/lib/crypto/x86/poly1305_glue.c
@@ -25,6 +25,42 @@ struct poly1305_arch_internal {
struct { u32 r2, r1, r4, r3; } rn[9];
};
+/*
+ * The AVX code uses base 2^26, while the scalar code uses base 2^64. If we hit
+ * the unfortunate situation of using AVX and then having to go back to scalar
+ * -- because the user is silly and has called the update function from two
+ * separate contexts -- then we need to convert back to the original base before
+ * proceeding. It is possible to reason that the initial reduction below is
+ * sufficient given the implementation invariants. However, for an avoidance of
+ * doubt and because this is not performance critical, we do the full reduction
+ * anyway. Z3 proof of below function: https://xn--4db.cc/ltPtHCKN/py
+ */
+static void convert_to_base2_64(void *ctx)
+{
+ struct poly1305_arch_internal *state = ctx;
+ u32 cy;
+
+ if (!state->is_base2_26)
+ return;
+
+ cy = state->h[0] >> 26; state->h[0] &= 0x3ffffff; state->h[1] += cy;
+ cy = state->h[1] >> 26; state->h[1] &= 0x3ffffff; state->h[2] += cy;
+ cy = state->h[2] >> 26; state->h[2] &= 0x3ffffff; state->h[3] += cy;
+ cy = state->h[3] >> 26; state->h[3] &= 0x3ffffff; state->h[4] += cy;
+ state->hs[0] = ((u64)state->h[2] << 52) | ((u64)state->h[1] << 26) | state->h[0];
+ state->hs[1] = ((u64)state->h[4] << 40) | ((u64)state->h[3] << 14) | (state->h[2] >> 12);
+ state->hs[2] = state->h[4] >> 24;
+ /* Unsigned Less Than: branchlessly produces 1 if a < b, else 0. */
+#define ULT(a, b) ((a ^ ((a ^ b) | ((a - b) ^ b))) >> (sizeof(a) * 8 - 1))
+ cy = (state->hs[2] >> 2) + (state->hs[2] & ~3ULL);
+ state->hs[2] &= 3;
+ state->hs[0] += cy;
+ state->hs[1] += (cy = ULT(state->hs[0], cy));
+ state->hs[2] += ULT(state->hs[1], cy);
+#undef ULT
+ state->is_base2_26 = 0;
+}
+
asmlinkage void poly1305_block_init_arch(
struct poly1305_block_state *state,
const u8 raw_key[POLY1305_BLOCK_SIZE]);
@@ -62,7 +98,9 @@ void poly1305_blocks_arch(struct poly1305_block_state *state, const u8 *inp,
BUILD_BUG_ON(SZ_4K < POLY1305_BLOCK_SIZE ||
SZ_4K % POLY1305_BLOCK_SIZE);
- if (!static_branch_likely(&poly1305_use_avx)) {
+ if (!static_branch_likely(&poly1305_use_avx) ||
+ unlikely(!irq_fpu_usable())) {
+ convert_to_base2_64(ctx);
poly1305_blocks_x86_64(ctx, inp, len, padbit);
return;
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x b41c1d8d07906786c60893980d52688f31d114a6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081520-around-unsliced-699b@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b41c1d8d07906786c60893980d52688f31d114a6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Fri, 4 Jul 2025 00:03:22 -0700
Subject: [PATCH] fscrypt: Don't use problematic non-inline crypto engines
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index f63791641c1d..696a5844bfa3 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -147,9 +147,8 @@ However, these ioctls have some limitations:
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
@@ -406,9 +405,12 @@ the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
@@ -1318,22 +1320,13 @@ this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -45,6 +45,23 @@
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 5c095c8aa3b5..b1ef506cd341 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -58,7 +58,7 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index a67e20d126c9..74d4a2e1ad23 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -104,7 +104,8 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -52,7 +52,8 @@ static int derive_key_aes(const u8 *master_key,
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x b41c1d8d07906786c60893980d52688f31d114a6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081518-cusp-plausibly-6afc@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b41c1d8d07906786c60893980d52688f31d114a6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Fri, 4 Jul 2025 00:03:22 -0700
Subject: [PATCH] fscrypt: Don't use problematic non-inline crypto engines
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index f63791641c1d..696a5844bfa3 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -147,9 +147,8 @@ However, these ioctls have some limitations:
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
@@ -406,9 +405,12 @@ the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
@@ -1318,22 +1320,13 @@ this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -45,6 +45,23 @@
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 5c095c8aa3b5..b1ef506cd341 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -58,7 +58,7 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index a67e20d126c9..74d4a2e1ad23 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -104,7 +104,8 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -52,7 +52,8 @@ static int derive_key_aes(const u8 *master_key,
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x b41c1d8d07906786c60893980d52688f31d114a6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081516-syndrome-awkward-ef0d@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b41c1d8d07906786c60893980d52688f31d114a6 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)kernel.org>
Date: Fri, 4 Jul 2025 00:03:22 -0700
Subject: [PATCH] fscrypt: Don't use problematic non-inline crypto engines
Make fscrypt no longer use Crypto API drivers for non-inline crypto
engines, even when the Crypto API prioritizes them over CPU-based code
(which unfortunately it often does). These drivers tend to be really
problematic, especially for fscrypt's workload. This commit has no
effect on inline crypto engines, which are different and do work well.
Specifically, exclude drivers that have CRYPTO_ALG_KERN_DRIVER_ONLY or
CRYPTO_ALG_ALLOCATES_MEMORY set. (Later, CRYPTO_ALG_ASYNC should be
excluded too. That's omitted for now to keep this commit backportable,
since until recently some CPU-based code had CRYPTO_ALG_ASYNC set.)
There are two major issues with these drivers: bugs and performance.
First, these drivers tend to be buggy. They're fundamentally much more
error-prone and harder to test than the CPU-based code. They often
don't get tested before kernel releases, and even if they do, the crypto
self-tests don't properly test these drivers. Released drivers have
en/decrypted or hashed data incorrectly. These bugs cause issues for
fscrypt users who often didn't even want to use these drivers, e.g.:
- https://github.com/google/fscryptctl/issues/32
- https://github.com/google/fscryptctl/issues/9
- https://lore.kernel.org/r/PH0PR02MB731916ECDB6C613665863B6CFFAA2@PH0PR02MB7…
These drivers have also similarly caused issues for dm-crypt users,
including data corruption and deadlocks. Since Linux v5.10, dm-crypt
has disabled most of them by excluding CRYPTO_ALG_ALLOCATES_MEMORY.
Second, these drivers tend to be *much* slower than the CPU-based code.
This may seem counterintuitive, but benchmarks clearly show it. There's
a *lot* of overhead associated with going to a hardware driver, off the
CPU, and back again. To prove this, I gathered as many systems with
this type of crypto engine as I could, and I measured synchronous
encryption of 4096-byte messages (which matches fscrypt's workload):
Intel Emerald Rapids server:
AES-256-XTS:
xts-aes-vaes-avx512 16171 MB/s [CPU-based, Vector AES]
qat_aes_xts 289 MB/s [Offload, Intel QuickAssist]
Qualcomm SM8650 HDK:
AES-256-XTS:
xts-aes-ce 4301 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts-aes-qce 73 MB/s [Offload, Qualcomm Crypto Engine]
i.MX 8M Nano LPDDR4 EVK:
AES-256-XTS:
xts-aes-ce 647 MB/s [CPU-based, ARMv8 Crypto Extensions]
xts(ecb-aes-caam) 20 MB/s [Offload, CAAM]
AES-128-CBC-ESSIV:
essiv(cbc-aes-caam,sha256-lib) 23 MB/s [Offload, CAAM]
STM32MP157F-DK2:
AES-256-XTS:
xts-aes-neonbs 13.2 MB/s [CPU-based, ARM NEON]
xts(stm32-ecb-aes) 3.1 MB/s [Offload, STM32 crypto engine]
AES-128-CBC-ESSIV:
essiv(cbc-aes-neonbs,sha256-lib)
14.7 MB/s [CPU-based, ARM NEON]
essiv(stm32-cbc-aes,sha256-lib)
3.2 MB/s [Offload, STM32 crypto engine]
Adiantum:
adiantum(xchacha12-arm,aes-arm,nhpoly1305-neon)
52.8 MB/s [CPU-based, ARM scalar + NEON]
So, there was no case in which the crypto engine was even *close* to
being faster. On the first three, which have AES instructions in the
CPU, the CPU was 30 to 55 times faster (!). Even on STM32MP157F-DK2
which has a Cortex-A7 CPU that doesn't have AES instructions, AES was
over 4 times faster on the CPU. And Adiantum encryption, which is what
actually should be used on CPUs like that, was over 17 times faster.
Other justifications that have been given for these non-inline crypto
engines (almost always coming from the hardware vendors, not actual
users) don't seem very plausible either:
- The crypto engine throughput could be improved by processing
multiple requests concurrently. Currently irrelevant to fscrypt,
since it doesn't do that. This would also be complex, and unhelpful
in many cases. 2 of the 4 engines I tested even had only one queue.
- Some of the engines, e.g. STM32, support hardware keys. Also
currently irrelevant to fscrypt, since it doesn't support these.
Interestingly, the STM32 driver itself doesn't support this either.
- Free up CPU for other tasks and/or reduce energy usage. Not very
plausible considering the "short" message length, driver overhead,
and scheduling overhead. There's just very little time for the CPU
to do something else like run another task or enter low-power state,
before the message finishes and it's time to process the next one.
- Some of these engines resist power analysis and electromagnetic
attacks, while the CPU-based crypto generally does not. In theory,
this sounds great. In practice, if this benefit requires the use of
an off-CPU offload that massively regresses performance and has a
low-quality, buggy driver, the price for this hardening (which is
not relevant to most fscrypt users, and tends to be incomplete) is
just too high. Inline crypto engines are much more promising here,
as are on-CPU solutions like RISC-V High Assurance Cryptography.
Fixes: b30ab0e03407 ("ext4 crypto: add ext4 encryption facilities")
Cc: stable(a)vger.kernel.org
Acked-by: Ard Biesheuvel <ardb(a)kernel.org>
Link: https://lore.kernel.org/r/20250704070322.20692-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers(a)kernel.org>
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index f63791641c1d..696a5844bfa3 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -147,9 +147,8 @@ However, these ioctls have some limitations:
were wiped. To partially solve this, you can add init_on_free=1 to
your kernel command line. However, this has a performance cost.
-- Secret keys might still exist in CPU registers, in crypto
- accelerator hardware (if used by the crypto API to implement any of
- the algorithms), or in other places not explicitly considered here.
+- Secret keys might still exist in CPU registers or in other places
+ not explicitly considered here.
Full system compromise
~~~~~~~~~~~~~~~~~~~~~~
@@ -406,9 +405,12 @@ the work is done by XChaCha12, which is much faster than AES when AES
acceleration is unavailable. For more information about Adiantum, see
`the Adiantum paper <https://eprint.iacr.org/2018/720.pdf>`_.
-The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair exists only to support
-systems whose only form of AES acceleration is an off-CPU crypto
-accelerator such as CAAM or CESA that does not support XTS.
+The (AES-128-CBC-ESSIV, AES-128-CBC-CTS) pair was added to try to
+provide a more efficient option for systems that lack AES instructions
+in the CPU but do have a non-inline crypto engine such as CAAM or CESA
+that supports AES-CBC (and not AES-XTS). This is deprecated. It has
+been shown that just doing AES on the CPU is actually faster.
+Moreover, Adiantum is faster still and is recommended on such systems.
The remaining mode pairs are the "national pride ciphers":
@@ -1318,22 +1320,13 @@ this by validating all top-level encryption policies prior to access.
Inline encryption support
=========================
-By default, fscrypt uses the kernel crypto API for all cryptographic
-operations (other than HKDF, which fscrypt partially implements
-itself). The kernel crypto API supports hardware crypto accelerators,
-but only ones that work in the traditional way where all inputs and
-outputs (e.g. plaintexts and ciphertexts) are in memory. fscrypt can
-take advantage of such hardware, but the traditional acceleration
-model isn't particularly efficient and fscrypt hasn't been optimized
-for it.
-
-Instead, many newer systems (especially mobile SoCs) have *inline
-encryption hardware* that can encrypt/decrypt data while it is on its
-way to/from the storage device. Linux supports inline encryption
-through a set of extensions to the block layer called *blk-crypto*.
-blk-crypto allows filesystems to attach encryption contexts to bios
-(I/O requests) to specify how the data will be encrypted or decrypted
-in-line. For more information about blk-crypto, see
+Many newer systems (especially mobile SoCs) have *inline encryption
+hardware* that can encrypt/decrypt data while it is on its way to/from
+the storage device. Linux supports inline encryption through a set of
+extensions to the block layer called *blk-crypto*. blk-crypto allows
+filesystems to attach encryption contexts to bios (I/O requests) to
+specify how the data will be encrypted or decrypted in-line. For more
+information about blk-crypto, see
:ref:`Documentation/block/inline-encryption.rst <inline_encryption>`.
On supported filesystems (currently ext4 and f2fs), fscrypt can use
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index c1d92074b65c..6e7164530a1e 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -45,6 +45,23 @@
*/
#undef FSCRYPT_MAX_KEY_SIZE
+/*
+ * This mask is passed as the third argument to the crypto_alloc_*() functions
+ * to prevent fscrypt from using the Crypto API drivers for non-inline crypto
+ * engines. Those drivers have been problematic for fscrypt. fscrypt users
+ * have reported hangs and even incorrect en/decryption with these drivers.
+ * Since going to the driver, off CPU, and back again is really slow, such
+ * drivers can be over 50 times slower than the CPU-based code for fscrypt's
+ * workload. Even on platforms that lack AES instructions on the CPU, using the
+ * offloads has been shown to be slower, even staying with AES. (Of course,
+ * Adiantum is faster still, and is the recommended option on such platforms...)
+ *
+ * Note that fscrypt also supports inline crypto engines. Those don't use the
+ * Crypto API and work much better than the old-style (non-inline) engines.
+ */
+#define FSCRYPT_CRYPTOAPI_MASK \
+ (CRYPTO_ALG_ALLOCATES_MEMORY | CRYPTO_ALG_KERN_DRIVER_ONLY)
+
#define FSCRYPT_CONTEXT_V1 1
#define FSCRYPT_CONTEXT_V2 2
diff --git a/fs/crypto/hkdf.c b/fs/crypto/hkdf.c
index 5c095c8aa3b5..b1ef506cd341 100644
--- a/fs/crypto/hkdf.c
+++ b/fs/crypto/hkdf.c
@@ -58,7 +58,7 @@ int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
u8 prk[HKDF_HASHLEN];
int err;
- hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, 0);
+ hmac_tfm = crypto_alloc_shash(HKDF_HMAC_ALG, 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(hmac_tfm)) {
fscrypt_err(NULL, "Error allocating " HKDF_HMAC_ALG ": %ld",
PTR_ERR(hmac_tfm));
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index a67e20d126c9..74d4a2e1ad23 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -104,7 +104,8 @@ fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
struct crypto_skcipher *tfm;
int err;
- tfm = crypto_alloc_skcipher(mode->cipher_str, 0, 0);
+ tfm = crypto_alloc_skcipher(mode->cipher_str, 0,
+ FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
if (PTR_ERR(tfm) == -ENOENT) {
fscrypt_warn(inode,
diff --git a/fs/crypto/keysetup_v1.c b/fs/crypto/keysetup_v1.c
index b70521c55132..158ceae8a5bc 100644
--- a/fs/crypto/keysetup_v1.c
+++ b/fs/crypto/keysetup_v1.c
@@ -52,7 +52,8 @@ static int derive_key_aes(const u8 *master_key,
struct skcipher_request *req = NULL;
DECLARE_CRYPTO_WAIT(wait);
struct scatterlist src_sg, dst_sg;
- struct crypto_skcipher *tfm = crypto_alloc_skcipher("ecb(aes)", 0, 0);
+ struct crypto_skcipher *tfm =
+ crypto_alloc_skcipher("ecb(aes)", 0, FSCRYPT_CRYPTOAPI_MASK);
if (IS_ERR(tfm)) {
res = PTR_ERR(tfm);
Hi,
I'm seeing a reproducible kernel oops on my home router updating from 5.15.181 to 5.15.189:
kernel BUG at lib/list_debug.c:50!
invalid opcode: 0000 [#1] SMP NOPTI
..
Call Trace:
<TASK>
drr_qlen_notify+0x11/0x50 [sch_drr]
qdisc_tree_reduce_backlog+0x93/0xf0
drr_graft_class+0x109/0x220 [sch_drr]
qdisc_graft+0xdd/0x510
? qdisc_create+0x335/0x510
tc_modify_qdisc+0x53f/0x9d0
rtnetlink_rcv_msg+0x134/0x370
? __getblk_gfp+0x22/0x230
? rtnl_calcit.isra.38+0x130/0x130
netlink_rcv_skb+0x4f/0x100
rtnetlink_rcv+0x10/0x20
netlink_unicast+0x1d2/0x2a0
netlink_sendmsg+0x22a/0x480
? netlink_broadcast+0x20/0x20
____sys_sendmsg+0x25f/0x280
? copy_msghdr_from_user+0x5b/0x90
___sys_sendmsg+0x77/0xc0
? __sys_recvmsg+0x5a/0xb0
? do_filp_open+0xc3/0x120
__sys_sendmsg+0x5d/0xb0
__x64_sys_sendmsg+0x1a/0x20
x64_sys_call+0x17f1/0x1c80
do_syscall_64+0x53/0x80
? exit_to_user_mode_prepare+0x2c/0x140
? irqentry_exit_to_user_mode+0xe/0x20
? irqentry_exit+0x1d/0x30
? exc_page_fault+0x1e7/0x610
? do_syscall_64+0x5f/0x80
entry_SYSCALL_64_after_hwframe+0x6c/0xd6
..
RIP: 0010:__list_del_entry_valid.cold.1+0xf/0x69
syzbot reported a similar looking thing here:
[v5.15] BUG: unable to handle kernel paging request in drr_qlen_notify
https://groups.google.com/g/syzkaller-lts-bugs/c/_QJHiMHwfRw/m/2j1nSU1hBgAJ
and here:
"[syzbot] [net?] general protection fault in drr_qlen_notify"
https://www.spinics.net/lists/netdev/msg1105420.html
syzboot bisected it to:
****************************************
commit e269f29e9395527bc00c213c6b15da04ebb35070
Refs: v5.15.186-114-ge269f29e9395
Author: Lion Ackermann <nnamrec(a)gmail.com>
AuthorDate: Mon Jun 30 15:27:30 2025 +0200
Commit: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
CommitDate: Thu Jul 10 15:57:46 2025 +0200
net/sched: Always pass notifications when child class becomes empty
[ Upstream commit 103406b38c600fec1fe375a77b27d87e314aea09 ]
****************************************
The last line of the commit message mentions:
"This is not a problem after the recent patch series
that made all the classful qdiscs qlen_notify() handlers idempotent."
It looks like the "idempotent" patches are missing from the 5.15 stable series.
Like this one:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/n…
I've tried Ubuntu's backport for 5.15:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/jammy/co…
It seems to be identical to:
https://lore.kernel.org/stable/bcf9c70e9cf750363782816c21c69792f6c81cd9.175…
While the kernel didn't oops anymore with the patch applied, the network traffic behaves erratic:
TCP traffic works, ICMP seems "stuck". tcpdump showed no icmp traffic on the ppp device.
Tomorrow I will try if I can reproduce the issue on a test VM.
Anything else I should try?
Thanks in advance,
Thomas
BootLoader (Grub, LILO, etc) may pass an identifier such as "BOOT_IMAGE=
/boot/vmlinuz-x.y.z" to kernel parameters. But these identifiers are not
recognized by the kernel itself so will be passed to user space. However
user space init program also doesn't recognized it.
KEXEC/KDUMP (kexec-tools) may also pass an identifier such as "kexec" on
some architectures.
We cannot change BootLoader's behavior, because this behavior exists for
many years, and there are already user space programs search BOOT_IMAGE=
in /proc/cmdline to obtain the kernel image locations:
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/util.go
(search getBootOptions)
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/main.go
(search getKernelReleaseWithBootOption)
So the the best way is handle (ignore) it by the kernel itself, which
can avoid such boot warnings (if we use something like init=/bin/bash,
bootloader identifier can even cause a crash):
Kernel command line: BOOT_IMAGE=(hd0,1)/vmlinuz-6.x root=/dev/sda3 ro console=tty
Unknown kernel command line parameters "BOOT_IMAGE=(hd0,1)/vmlinuz-6.x", will be passed to user space.
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
V2: Update comments and commit messages.
V3: Document bootloader identifiers.
V4: Use strstarts() instead of strncmp().
init/main.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/init/main.c b/init/main.c
index 0ee0ee7b7c2c..9b5150166bcf 100644
--- a/init/main.c
+++ b/init/main.c
@@ -544,6 +544,12 @@ static int __init unknown_bootoption(char *param, char *val,
const char *unused, void *arg)
{
size_t len = strlen(param);
+ /*
+ * Well-known bootloader identifiers:
+ * 1. LILO/Grub pass "BOOT_IMAGE=...";
+ * 2. kexec/kdump (kexec-tools) pass "kexec".
+ */
+ const char *bootloader[] = { "BOOT_IMAGE=", "kexec", NULL };
/* Handle params aliased to sysctls */
if (sysctl_is_alias(param))
@@ -551,6 +557,12 @@ static int __init unknown_bootoption(char *param, char *val,
repair_env_string(param, val);
+ /* Handle bootloader identifier */
+ for (int i = 0; bootloader[i]; i++) {
+ if (strstarts(param, bootloader[i]))
+ return 0;
+ }
+
/* Handle obsolete-style parameters */
if (obsolete_checksetup(param))
return 0;
--
2.47.3
BootLoader (Grub, LILO, etc) may pass an identifier such as "BOOT_IMAGE=
/boot/vmlinuz-x.y.z" to kernel parameters. But these identifiers are not
recognized by the kernel itself so will be passed to user space. However
user space init program also doesn't recognized it.
KEXEC/KDUMP (kexec-tools) may also pass an identifier such as "kexec" on
some architectures.
We cannot change BootLoader's behavior, because this behavior exists for
many years, and there are already user space programs search BOOT_IMAGE=
in /proc/cmdline to obtain the kernel image locations:
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/util.go
(search getBootOptions)
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/main.go
(search getKernelReleaseWithBootOption)
So the the best way is handle (ignore) it by the kernel itself, which
can avoid such boot warnings (if we use something like init=/bin/bash,
bootloader identifier can even cause a crash):
Kernel command line: BOOT_IMAGE=(hd0,1)/vmlinuz-6.x root=/dev/sda3 ro console=tty
Unknown kernel command line parameters "BOOT_IMAGE=(hd0,1)/vmlinuz-6.x", will be passed to user space.
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
V2: Update comments and commit messages.
V3: Document bootloader identifiers.
init/main.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/init/main.c b/init/main.c
index 225a58279acd..b25e7da5347a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -545,6 +545,12 @@ static int __init unknown_bootoption(char *param, char *val,
const char *unused, void *arg)
{
size_t len = strlen(param);
+ /*
+ * Well-known bootloader identifiers:
+ * 1. LILO/Grub pass "BOOT_IMAGE=...";
+ * 2. kexec/kdump (kexec-tools) pass "kexec".
+ */
+ const char *bootloader[] = { "BOOT_IMAGE=", "kexec", NULL };
/* Handle params aliased to sysctls */
if (sysctl_is_alias(param))
@@ -552,6 +558,12 @@ static int __init unknown_bootoption(char *param, char *val,
repair_env_string(param, val);
+ /* Handle bootloader identifier */
+ for (int i = 0; bootloader[i]; i++) {
+ if (!strncmp(param, bootloader[i], strlen(bootloader[i])))
+ return 0;
+ }
+
/* Handle obsolete-style parameters */
if (obsolete_checksetup(param))
return 0;
--
2.47.3
Add fixes to the CC contaminant/connection detection logic to improve
reliability and stability of the maxim tcpc driver.
---
Amit Sunil Dhamne (2):
usb: typec: maxim_contaminant: disable low power mode when reading comparator values
usb: typec: maxim_contaminant: re-enable cc toggle if cc is open and port is clean
drivers/usb/typec/tcpm/maxim_contaminant.c | 58 ++++++++++++++++++++++++++++++
drivers/usb/typec/tcpm/tcpci_maxim.h | 1 +
2 files changed, 59 insertions(+)
---
base-commit: 89be9a83ccf1f88522317ce02f854f30d6115c41
change-id: 20250802-fix-upstream-contaminant-16910e2762ca
Best regards,
--
Amit Sunil Dhamne <amitsd(a)google.com>
The adapter->chan_stats[] array is initialized in
mwifiex_init_channel_scan_gap() with vmalloc(), which doesn't zero out
memory. The array is filled in mwifiex_update_chan_statistics()
and then the user can query the data in mwifiex_cfg80211_dump_survey().
There are two potential issues here. What if the user calls
mwifiex_cfg80211_dump_survey() before the data has been filled in.
Also the mwifiex_update_chan_statistics() function doesn't necessarily
initialize the whole array. Since the array was not initialized at
the start that could result in an information leak.
Also this array is pretty small. It's a maximum of 900 bytes so it's
more appropriate to use kcalloc() instead vmalloc().
Cc: stable(a)vger.kernel.org
Fixes: bf35443314ac ("mwifiex: channel statistics support for mwifiex")
Suggested-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Signed-off-by: Qianfeng Rong <rongqianfeng(a)vivo.com>
---
v2: Change vmalloc_array/vfree to kcalloc/kfree.
v3: Improved commit message.
---
drivers/net/wireless/marvell/mwifiex/cfg80211.c | 5 +++--
drivers/net/wireless/marvell/mwifiex/main.c | 4 ++--
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index 3498743d5ec0..4c8c7a5fdf23 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -4673,8 +4673,9 @@ int mwifiex_init_channel_scan_gap(struct mwifiex_adapter *adapter)
* additional active scan request for hidden SSIDs on passive channels.
*/
adapter->num_in_chan_stats = 2 * (n_channels_bg + n_channels_a);
- adapter->chan_stats = vmalloc(array_size(sizeof(*adapter->chan_stats),
- adapter->num_in_chan_stats));
+ adapter->chan_stats = kcalloc(adapter->num_in_chan_stats,
+ sizeof(*adapter->chan_stats),
+ GFP_KERNEL);
if (!adapter->chan_stats)
return -ENOMEM;
diff --git a/drivers/net/wireless/marvell/mwifiex/main.c b/drivers/net/wireless/marvell/mwifiex/main.c
index 7b50a88a18e5..1ec069bc8ea1 100644
--- a/drivers/net/wireless/marvell/mwifiex/main.c
+++ b/drivers/net/wireless/marvell/mwifiex/main.c
@@ -642,7 +642,7 @@ static int _mwifiex_fw_dpc(const struct firmware *firmware, void *context)
goto done;
err_add_intf:
- vfree(adapter->chan_stats);
+ kfree(adapter->chan_stats);
err_init_chan_scan:
wiphy_unregister(adapter->wiphy);
wiphy_free(adapter->wiphy);
@@ -1485,7 +1485,7 @@ static void mwifiex_uninit_sw(struct mwifiex_adapter *adapter)
wiphy_free(adapter->wiphy);
adapter->wiphy = NULL;
- vfree(adapter->chan_stats);
+ kfree(adapter->chan_stats);
mwifiex_free_cmd_buffers(adapter);
}
--
2.34.1
VIRQs come in 3 flavors, per-VPU, per-domain, and global. The existing
tracking of VIRQs is handled by per-cpu variables virq_to_irq.
The issue is that bind_virq_to_irq() sets the per_cpu virq_to_irq at
registration time - typically CPU 0. Later, the interrupt can migrate,
and info->cpu is updated. When calling unbind_from_irq(), the per-cpu
virq_to_irq is cleared for a different cpu. If bind_virq_to_irq() is
called again with CPU 0, the stale irq is returned.
Change the virq_to_irq tracking to use CPU 0 for per-domain and global
VIRQs. As there can be at most one of each, there is no need for
per-vcpu tracking. Also, per-domain and global VIRQs need to be
registered on CPU 0 and can later move, so this matches the expectation.
Fixes: e46cdb66c8fc ("xen: event channels")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jason Andryuk <jason.andryuk(a)amd.com>
---
Fixes is the introduction of the virq_to_irq per-cpu array.
This was found with the out-of-tree argo driver during suspend/resume.
On suspend, the per-domain VIRQ_ARGO is unbound. On resume, the driver
attempts to bind VIRQ_ARGO. The stale irq is returned, but the
WARN_ON(info == NULL || info->type != IRQT_VIRQ) in bind_virq_to_irq()
triggers for NULL info. The bind fails and execution continues with the
driver trying to clean up by unbinding. This eventually faults over the
NULL info.
---
drivers/xen/events/events_base.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 41309d38f78c..a27e4d7f061e 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -159,7 +159,19 @@ static DEFINE_MUTEX(irq_mapping_update_lock);
static LIST_HEAD(xen_irq_list_head);
-/* IRQ <-> VIRQ mapping. */
+static bool is_per_vcpu_virq(int virq) {
+ switch (virq) {
+ case VIRQ_TIMER:
+ case VIRQ_DEBUG:
+ case VIRQ_XENOPROF:
+ case VIRQ_XENPMU:
+ return true;
+ default:
+ return false;
+ }
+}
+
+/* IRQ <-> VIRQ mapping. Global/Domain virqs are tracked in cpu 0. */
static DEFINE_PER_CPU(int [NR_VIRQS], virq_to_irq) = {[0 ... NR_VIRQS-1] = -1};
/* IRQ <-> IPI mapping */
@@ -974,6 +986,9 @@ static void __unbind_from_irq(struct irq_info *info, unsigned int irq)
switch (info->type) {
case IRQT_VIRQ:
+ if (!is_per_vcpu_virq(virq_from_irq(info)))
+ cpu = 0;
+
per_cpu(virq_to_irq, cpu)[virq_from_irq(info)] = -1;
break;
case IRQT_IPI:
--
2.50.1
Syzkaller reports a KASAN issue as below:
general protection fault, probably for non-canonical address 0xfbd59c0000000021: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: maybe wild-memory-access in range [0xdead000000000108-0xdead00000000010f]
CPU: 0 PID: 5083 Comm: syz-executor.2 Not tainted 6.1.134-syzkaller-00037-g855bd1d7d838 #0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
RIP: 0010:__list_del include/linux/list.h:114 [inline]
RIP: 0010:__list_del_entry include/linux/list.h:137 [inline]
RIP: 0010:list_del include/linux/list.h:148 [inline]
RIP: 0010:p9_fd_cancelled+0xe9/0x200 net/9p/trans_fd.c:734
Call Trace:
<TASK>
p9_client_flush+0x351/0x440 net/9p/client.c:614
p9_client_rpc+0xb6b/0xc70 net/9p/client.c:734
p9_client_version net/9p/client.c:920 [inline]
p9_client_create+0xb51/0x1240 net/9p/client.c:1027
v9fs_session_init+0x1f0/0x18f0 fs/9p/v9fs.c:408
v9fs_mount+0xba/0xcb0 fs/9p/vfs_super.c:126
legacy_get_tree+0x108/0x220 fs/fs_context.c:632
vfs_get_tree+0x8e/0x300 fs/super.c:1573
do_new_mount fs/namespace.c:3056 [inline]
path_mount+0x6a6/0x1e90 fs/namespace.c:3386
do_mount fs/namespace.c:3399 [inline]
__do_sys_mount fs/namespace.c:3607 [inline]
__se_sys_mount fs/namespace.c:3584 [inline]
__x64_sys_mount+0x283/0x300 fs/namespace.c:3584
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
This happens because of a race condition between:
- The 9p client sending an invalid flush request and later cleaning it up;
- The 9p client in p9_read_work() canceled all pending requests.
Thread 1 Thread 2
...
p9_client_create()
...
p9_fd_create()
...
p9_conn_create()
...
// start Thread 2
INIT_WORK(&m->rq, p9_read_work);
p9_read_work()
...
p9_client_rpc()
...
...
p9_conn_cancel()
...
spin_lock(&m->req_lock);
...
p9_fd_cancelled()
...
...
spin_unlock(&m->req_lock);
// status rewrite
p9_client_cb(m->client, req, REQ_STATUS_ERROR)
// first remove
list_del(&req->req_list);
...
spin_lock(&m->req_lock)
...
// second remove
list_del(&req->req_list);
spin_unlock(&m->req_lock)
...
Commit 74d6a5d56629 ("9p/trans_fd: Fix concurrency del of req_list in
p9_fd_cancelled/p9_read_work") fixes a concurrency issue in the 9p filesystem
client where the req_list could be deleted simultaneously by both
p9_read_work and p9_fd_cancelled functions, but for the case where req->status
equals REQ_STATUS_RCVD.
Add an explicit check for REQ_STATUS_ERROR in p9_fd_cancelled before
processing the request. Skip processing if the request is already in the error
state, as it has been removed and its resources cleaned up.
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: afd8d6541155 ("9P: Add cancelled() to the transport functions.")
Cc: stable(a)vger.kernel.org
Signed-off-by: Nalivayko Sergey <Sergey.Nalivayko(a)kaspersky.com>
---
net/9p/trans_fd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index a69422366a23..a6054a392a90 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -721,9 +721,9 @@ static int p9_fd_cancelled(struct p9_client *client, struct p9_req_t *req)
spin_lock(&m->req_lock);
/* Ignore cancelled request if message has been received
- * before lock.
- */
- if (req->status == REQ_STATUS_RCVD) {
+ * or cancelled with error before lock.
+ */
+ if (req->status == REQ_STATUS_RCVD || req->status == REQ_STATUS_ERROR) {
spin_unlock(&m->req_lock);
return 0;
}
--
2.30.2
From: Qasim Ijaz <qasdev00(a)gmail.com>
commit 1bb3363da862e0464ec050eea2fb5472a36ad86b upstream.
A malicious HID device with quirk APPLE_MAGIC_BACKLIGHT can trigger a NULL
pointer dereference whilst the power feature-report is toggled and sent to
the device in apple_magic_backlight_report_set(). The power feature-report
is expected to have two data fields, but if the descriptor declares one
field then accessing field[1] and dereferencing it in
apple_magic_backlight_report_set() becomes invalid
since field[1] will be NULL.
An example of a minimal descriptor which can cause the crash is something
like the following where the report with ID 3 (power report) only
references a single 1-byte field. When hid core parses the descriptor it
will encounter the final feature tag, allocate a hid_report (all members
of field[] will be zeroed out), create field structure and populate it,
increasing the maxfield to 1. The subsequent field[1] access and
dereference causes the crash.
Usage Page (Vendor Defined 0xFF00)
Usage (0x0F)
Collection (Application)
Report ID (1)
Usage (0x01)
Logical Minimum (0)
Logical Maximum (255)
Report Size (8)
Report Count (1)
Feature (Data,Var,Abs)
Usage (0x02)
Logical Maximum (32767)
Report Size (16)
Report Count (1)
Feature (Data,Var,Abs)
Report ID (3)
Usage (0x03)
Logical Minimum (0)
Logical Maximum (1)
Report Size (8)
Report Count (1)
Feature (Data,Var,Abs)
End Collection
Here we see the KASAN splat when the kernel dereferences the
NULL pointer and crashes:
[ 15.164723] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] SMP KASAN NOPTI
[ 15.165691] KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
[ 15.165691] CPU: 0 UID: 0 PID: 10 Comm: kworker/0:1 Not tainted 6.15.0 #31 PREEMPT(voluntary)
[ 15.165691] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 15.165691] RIP: 0010:apple_magic_backlight_report_set+0xbf/0x210
[ 15.165691] Call Trace:
[ 15.165691] <TASK>
[ 15.165691] apple_probe+0x571/0xa20
[ 15.165691] hid_device_probe+0x2e2/0x6f0
[ 15.165691] really_probe+0x1ca/0x5c0
[ 15.165691] __driver_probe_device+0x24f/0x310
[ 15.165691] driver_probe_device+0x4a/0xd0
[ 15.165691] __device_attach_driver+0x169/0x220
[ 15.165691] bus_for_each_drv+0x118/0x1b0
[ 15.165691] __device_attach+0x1d5/0x380
[ 15.165691] device_initial_probe+0x12/0x20
[ 15.165691] bus_probe_device+0x13d/0x180
[ 15.165691] device_add+0xd87/0x1510
[...]
To fix this issue we should validate the number of fields that the
backlight and power reports have and if they do not have the required
number of fields then bail.
Fixes: 394ba612f941 ("HID: apple: Add support for magic keyboard backlight on T2 Macs")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qasim Ijaz <qasdev00(a)gmail.com>
Reviewed-by: Orlando Chamberlain <orlandoch.dev(a)gmail.com>
Tested-by: Aditya Garg <gargaditya08(a)live.com>
Link: https://patch.msgid.link/20250713233008.15131-1-qasdev00@gmail.com
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/hid/hid-apple.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c
index d900dd05c335..c00ce5bfec4a 100644
--- a/drivers/hid/hid-apple.c
+++ b/drivers/hid/hid-apple.c
@@ -890,7 +890,8 @@ static int apple_magic_backlight_init(struct hid_device *hdev)
backlight->brightness = report_enum->report_id_hash[APPLE_MAGIC_REPORT_ID_BRIGHTNESS];
backlight->power = report_enum->report_id_hash[APPLE_MAGIC_REPORT_ID_POWER];
- if (!backlight->brightness || !backlight->power)
+ if (!backlight->brightness || backlight->brightness->maxfield < 2 ||
+ !backlight->power || backlight->power->maxfield < 2)
return -ENODEV;
backlight->cdev.name = ":white:" LED_FUNCTION_KBD_BACKLIGHT;
--
2.43.0
This commit addresses a rarely observed endpoint command timeout
which causes kernel panic due to warn when 'panic_on_warn' is enabled
and unnecessary call trace prints when 'panic_on_warn' is disabled.
It is seen during fast software-controlled connect/disconnect testcases.
The following is one such endpoint command timeout that we observed:
1. Connect
=======
->dwc3_thread_interrupt
->dwc3_ep0_interrupt
->configfs_composite_setup
->composite_setup
->usb_ep_queue
->dwc3_gadget_ep0_queue
->__dwc3_gadget_ep0_queue
->__dwc3_ep0_do_control_data
->dwc3_send_gadget_ep_cmd
2. Disconnect
==========
->dwc3_thread_interrupt
->dwc3_gadget_disconnect_interrupt
->dwc3_ep0_reset_state
->dwc3_ep0_end_control_data
->dwc3_send_gadget_ep_cmd
In the issue scenario, in Exynos platforms, we observed that control
transfers for the previous connect have not yet been completed and end
transfer command sent as a part of the disconnect sequence and
processing of USB_ENDPOINT_HALT feature request from the host timeout.
This maybe an expected scenario since the controller is processing EP
commands sent as a part of the previous connect. It maybe better to
remove WARN_ON in all places where device endpoint commands are sent to
avoid unnecessary kernel panic due to warn.
Cc: stable(a)vger.kernel.org
Co-developed-by: Akash M <akash.m5(a)samsung.com>
Signed-off-by: Akash M <akash.m5(a)samsung.com>
Signed-off-by: Selvarasu Ganesan <selvarasu.g(a)samsung.com>
Acked-by: Thinh Nguyen <Thinh.Nguyen(a)synopsys.com>
---
Changes in v3:
- Added Co-developed-by tags to reflect the correct authorship.
- And Added Acked-by tag as well.
Link to v2: https://lore.kernel.org/all/20250807014639.1596-1-selvarasu.g@samsung.com/
Changes in v2:
- Removed the 'Fixes' tag from the commit message, as this patch does
not contain a fix.
- And Retained the 'stable' tag, as these changes are intended to be
applied across all stable kernels.
- Additionally, replaced 'dev_warn*' with 'dev_err*'."
Link to v1: https://lore.kernel.org/all/20250807005638.thhsgjn73aaov2af@synopsys.com/
---
drivers/usb/dwc3/ep0.c | 20 ++++++++++++++++----
drivers/usb/dwc3/gadget.c | 10 ++++++++--
2 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c
index 666ac432f52d..b4229aa13f37 100644
--- a/drivers/usb/dwc3/ep0.c
+++ b/drivers/usb/dwc3/ep0.c
@@ -288,7 +288,9 @@ void dwc3_ep0_out_start(struct dwc3 *dwc)
dwc3_ep0_prepare_one_trb(dep, dwc->ep0_trb_addr, 8,
DWC3_TRBCTL_CONTROL_SETUP, false);
ret = dwc3_ep0_start_trans(dep);
- WARN_ON(ret < 0);
+ if (ret < 0)
+ dev_err(dwc->dev, "ep0 out start transfer failed: %d\n", ret);
+
for (i = 2; i < DWC3_ENDPOINTS_NUM; i++) {
struct dwc3_ep *dwc3_ep;
@@ -1061,7 +1063,9 @@ static void __dwc3_ep0_do_control_data(struct dwc3 *dwc,
ret = dwc3_ep0_start_trans(dep);
}
- WARN_ON(ret < 0);
+ if (ret < 0)
+ dev_err(dwc->dev,
+ "ep0 data phase start transfer failed: %d\n", ret);
}
static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
@@ -1078,7 +1082,12 @@ static int dwc3_ep0_start_control_status(struct dwc3_ep *dep)
static void __dwc3_ep0_do_control_status(struct dwc3 *dwc, struct dwc3_ep *dep)
{
- WARN_ON(dwc3_ep0_start_control_status(dep));
+ int ret;
+
+ ret = dwc3_ep0_start_control_status(dep);
+ if (ret)
+ dev_err(dwc->dev,
+ "ep0 status phase start transfer failed: %d\n", ret);
}
static void dwc3_ep0_do_control_status(struct dwc3 *dwc,
@@ -1121,7 +1130,10 @@ void dwc3_ep0_end_control_data(struct dwc3 *dwc, struct dwc3_ep *dep)
cmd |= DWC3_DEPCMD_PARAM(dep->resource_index);
memset(¶ms, 0, sizeof(params));
ret = dwc3_send_gadget_ep_cmd(dep, cmd, ¶ms);
- WARN_ON_ONCE(ret);
+ if (ret)
+ dev_err_ratelimited(dwc->dev,
+ "ep0 data phase end transfer failed: %d\n", ret);
+
dep->resource_index = 0;
}
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 4a3e97e606d1..4a3d076c1015 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -1772,7 +1772,11 @@ static int __dwc3_stop_active_transfer(struct dwc3_ep *dep, bool force, bool int
dep->flags |= DWC3_EP_DELAY_STOP;
return 0;
}
- WARN_ON_ONCE(ret);
+
+ if (ret)
+ dev_err_ratelimited(dep->dwc->dev,
+ "end transfer failed: %d\n", ret);
+
dep->resource_index = 0;
if (!interrupt)
@@ -4039,7 +4043,9 @@ static void dwc3_clear_stall_all_ep(struct dwc3 *dwc)
dep->flags &= ~DWC3_EP_STALL;
ret = dwc3_send_clear_stall_ep_cmd(dep);
- WARN_ON_ONCE(ret);
+ if (ret)
+ dev_err_ratelimited(dwc->dev,
+ "failed to clear STALL on %s\n", dep->name);
}
}
--
2.17.1
Same spiel as the 6.1.y collection...
This is a collection of backports for patches that were Cc'd to stable,
but failed to apply, along with their dependencies.
Note, Sasha already posted[1][2] these (and I acked them):
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: x86/pmu: Gate all "unimplemented MSR" prints on report_ignored_msrs
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
I'm including them here to hopefully make life easier for y'all, and because
the order they are presented here is the preferred ordering, i.e. should be
the same ordering as the original upstream patches.
But, if you end up grabbing Sasha's patches first, it's not a big deal as the
only true dependencies is that the DEBUGCTL.RTM_DEBUG patch needs to land
before "Check vmcs12->guest_ia32_debugctl on nested VM-Enter".
Many of the patches to get to the last patch (the DEBUGCTLMSR_FREEZE_IN_SMM
fix) are dependencies that arguably shouldn't be backported to LTS kernels.
I opted to do the backports because none of the patches are scary (if it was
1-3 dependency patches instead of 8 I wouldn't hesitate), and there's a decent
chance they'll be dependencies for future fixes.
[1] https://lore.kernel.org/all/20250813183728.2070321-1-sashal@kernel.org
[2] https://lore.kernel.org/all/20250814131146.2093579-1-sashal@kernel.org
Chao Gao (1):
KVM: nVMX: Defer SVI update to vmcs01 on EOI when L2 is active w/o VID
Manuel Andreas (1):
KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB flush
Maxim Levitsky (3):
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while running the
guest
Sean Christopherson (15):
KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI
shadow
KVM: x86: Plumb in the vCPU to kvm_x86_ops.hwapic_isr_update()
KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer
KVM: x86: Snapshot the host's DEBUGCTL in common x86
KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
KVM: x86: Plumb "force_immediate_exit" into kvm_entry() tracepoint
KVM: VMX: Re-enter guest in fastpath for "spurious" preemption timer
exits
KVM: VMX: Handle forced exit due to preemption timer in fastpath
KVM: x86: Move handling of is_guest_mode() into fastpath exit handlers
KVM: VMX: Handle KVM-induced preemption timer exits in fastpath for L2
KVM: x86: Fully defer to vendor code to decide how to force immediate
exit
KVM: x86: Convert vcpu_run()'s immediate exit param into a generic
bitmap
KVM: x86: Drop kvm_x86_ops.set_dr6() in favor of a new KVM_RUN flag
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
arch/x86/include/asm/kvm-x86-ops.h | 2 -
arch/x86/include/asm/kvm_host.h | 22 ++--
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/hyperv.c | 3 +
arch/x86/kvm/lapic.c | 19 +++-
arch/x86/kvm/lapic.h | 1 +
arch/x86/kvm/svm/svm.c | 42 +++++---
arch/x86/kvm/svm/vmenter.S | 9 +-
arch/x86/kvm/trace.h | 9 +-
arch/x86/kvm/vmx/nested.c | 26 ++++-
arch/x86/kvm/vmx/pmu_intel.c | 8 +-
arch/x86/kvm/vmx/vmx.c | 164 +++++++++++++++++++----------
arch/x86/kvm/vmx/vmx.h | 31 +++++-
arch/x86/kvm/x86.c | 46 +++++---
14 files changed, 265 insertions(+), 118 deletions(-)
base-commit: 3a8ababb8b6a0ced2be230b60b6e3ddbd8d67014
--
2.51.0.rc1.163.g2494970778-goog
This is a collection of backports for patches that were Cc'd to stable,
but failed to apply, along with their dependencies.
Note, Sasha already posted[1][2] these (and I acked them):
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: x86/pmu: Gate all "unimplemented MSR" prints on report_ignored_msrs
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
I'm including them here to hopefully make life easier for y'all, and because
the order they are presented here is the preferred ordering, i.e. should be
the same ordering as the original upstream patches.
But, if you end up grabbing Sasha's patches first, it's not a big deal as the
only true dependencies is that the DEBUGCTL.RTM_DEBUG patch needs to land
before "Check vmcs12->guest_ia32_debugctl on nested VM-Enter".
Many of the patches to get to the last patch (the DEBUGCTLMSR_FREEZE_IN_SMM
fix) are dependencies that arguably shouldn't be backported to LTS kernels.
I opted to do the backports because none of the patches are scary (if it was
1-3 dependency patches instead of 8 I wouldn't hesitate), and there's a decent
chance they'll be dependencies for future fixes.
[1] https://lore.kernel.org/all/20250813184918.2071296-1-sashal@kernel.org
[2] https://lore.kernel.org/all/20250814132434.2096873-1-sashal@kernel.org
Chao Gao (1):
KVM: nVMX: Defer SVI update to vmcs01 on EOI when L2 is active w/o VID
Maxim Levitsky (3):
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while running the
guest
Sean Christopherson (17):
KVM: SVM: Set RFLAGS.IF=1 in C code, to get VMRUN out of the STI
shadow
KVM: x86: Re-split x2APIC ICR into ICR+ICR2 for AMD (x2AVIC)
KVM: x86: Plumb in the vCPU to kvm_x86_ops.hwapic_isr_update()
KVM: x86: Take irqfds.lock when adding/deleting IRQ bypass producer
KVM: x86: Snapshot the host's DEBUGCTL in common x86
KVM: x86: Snapshot the host's DEBUGCTL after disabling IRQs
KVM: x86/pmu: Gate all "unimplemented MSR" prints on
report_ignored_msrs
KVM: x86: Plumb "force_immediate_exit" into kvm_entry() tracepoint
KVM: VMX: Re-enter guest in fastpath for "spurious" preemption timer
exits
KVM: VMX: Handle forced exit due to preemption timer in fastpath
KVM: x86: Move handling of is_guest_mode() into fastpath exit handlers
KVM: VMX: Handle KVM-induced preemption timer exits in fastpath for L2
KVM: x86: Fully defer to vendor code to decide how to force immediate
exit
KVM: x86: Convert vcpu_run()'s immediate exit param into a generic
bitmap
KVM: x86: Drop kvm_x86_ops.set_dr6() in favor of a new KVM_RUN flag
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
arch/x86/include/asm/kvm-x86-ops.h | 2 -
arch/x86/include/asm/kvm_host.h | 24 +++--
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/hyperv.c | 10 +-
arch/x86/kvm/lapic.c | 61 ++++++++---
arch/x86/kvm/lapic.h | 1 +
arch/x86/kvm/svm/svm.c | 49 ++++++---
arch/x86/kvm/svm/vmenter.S | 9 +-
arch/x86/kvm/trace.h | 9 +-
arch/x86/kvm/vmx/nested.c | 26 ++++-
arch/x86/kvm/vmx/pmu_intel.c | 8 +-
arch/x86/kvm/vmx/vmx.c | 168 ++++++++++++++++++-----------
arch/x86/kvm/vmx/vmx.h | 31 +++++-
arch/x86/kvm/x86.c | 65 ++++++-----
arch/x86/kvm/x86.h | 12 +++
15 files changed, 322 insertions(+), 154 deletions(-)
base-commit: 3594f306da129190de25938b823f353ef7f9e322
--
2.51.0.rc1.163.g2494970778-goog
The patch titled
Subject: riscv: use an atomic xchg in pudp_huge_get_and_clear()
has been added to the -mm mm-new branch. Its filename is
riscv-use-an-atomic-xchg-in-pudp_huge_get_and_clear.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Subject: riscv: use an atomic xchg in pudp_huge_get_and_clear()
Date: Thu, 14 Aug 2025 12:06:14 +0000
Make sure we return the right pud value and not a value that could have
been overwritten in between by a different core.
Link: https://lkml.kernel.org/r/20250814-dev-alex-thp_pud_xchg-v1-1-b4704dfae206@…
Fixes: c3cc2a4a3a23 ("riscv: Add support for PUD THP")
Signed-off-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Cc: Andrew Donnellan <ajd(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/riscv/include/asm/pgtable.h | 11 +++++++++++
1 file changed, 11 insertions(+)
--- a/arch/riscv/include/asm/pgtable.h~riscv-use-an-atomic-xchg-in-pudp_huge_get_and_clear
+++ a/arch/riscv/include/asm/pgtable.h
@@ -942,6 +942,17 @@ static inline int pudp_test_and_clear_yo
return ptep_test_and_clear_young(vma, address, (pte_t *)pudp);
}
+#define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR
+static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
+ unsigned long address, pud_t *pudp)
+{
+ pud_t pud = __pud(atomic_long_xchg((atomic_long_t *)pudp, 0));
+
+ page_table_check_pud_clear(mm, pud);
+
+ return pud;
+}
+
static inline int pud_young(pud_t pud)
{
return pte_young(pud_pte(pud));
_
Patches currently in -mm which might be from alexghiti(a)rivosinc.com are
selftests-damon-fix-damon-selftests-by-installing-_commonsh.patch
riscv-use-an-atomic-xchg-in-pudp_huge_get_and_clear.patch
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x c6e35dff58d348c1a9489e9b3b62b3721e62631d
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081248-omission-talisman-0619@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c6e35dff58d348c1a9489e9b3b62b3721e62631d Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sun, 20 Jul 2025 11:22:29 +0100
Subject: [PATCH] KVM: arm64: Check for SYSREGS_ON_CPU before accessing the CPU
state
Mark Brown reports that since we commit to making exceptions
visible without the vcpu being loaded, the external abort selftest
fails.
Upon investigation, it turns out that the code that makes registers
affected by an exception visible to the guest is completely broken
on VHE, as we don't check whether the system registers are loaded
on the CPU at this point. We managed to get away with this so far,
but that's obviously as bad as it gets,
Add the required checksm and document the absolute need to check
for the SYSREGS_ON_CPU flag before calling into any of the
__vcpu_write_sys_reg_to_cpu()__vcpu_read_sys_reg_from_cpu() helpers.
Reported-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/18535df8-e647-4643-af9a-bb780af03a70@sirena.org.uk
Link: https://lore.kernel.org/r/20250720102229.179114-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton(a)linux.dev>
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e54d29feb469..d373d555a69b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1169,6 +1169,8 @@ static inline bool __vcpu_read_sys_reg_from_cpu(int reg, u64 *val)
* System registers listed in the switch are not saved on every
* exit from the guest but are only saved on vcpu_put.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the guest cannot modify its
* own MPIDR_EL1 and MPIDR_EL1 is accessed for VCPU A from VCPU B's
@@ -1221,6 +1223,8 @@ static inline bool __vcpu_write_sys_reg_to_cpu(u64 val, int reg)
* System registers listed in the switch are not restored on every
* entry to the guest but are only restored on vcpu_load.
*
+ * SYSREGS_ON_CPU *MUST* be checked before using this helper.
+ *
* Note that MPIDR_EL1 for the guest is set by KVM via VMPIDR_EL2 but
* should never be listed below, because the MPIDR should only be set
* once, before running the VCPU, and never changed later.
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 7dafd10e52e8..95d186e0bf54 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -26,7 +26,8 @@ static inline u64 __vcpu_read_sys_reg(const struct kvm_vcpu *vcpu, int reg)
if (unlikely(vcpu_has_nv(vcpu)))
return vcpu_read_sys_reg(vcpu, reg);
- else if (__vcpu_read_sys_reg_from_cpu(reg, &val))
+ else if (vcpu_get_flag(vcpu, SYSREGS_ON_CPU) &&
+ __vcpu_read_sys_reg_from_cpu(reg, &val))
return val;
return __vcpu_sys_reg(vcpu, reg);
@@ -36,7 +37,8 @@ static inline void __vcpu_write_sys_reg(struct kvm_vcpu *vcpu, u64 val, int reg)
{
if (unlikely(vcpu_has_nv(vcpu)))
vcpu_write_sys_reg(vcpu, val, reg);
- else if (!__vcpu_write_sys_reg_to_cpu(val, reg))
+ else if (!vcpu_get_flag(vcpu, SYSREGS_ON_CPU) ||
+ !__vcpu_write_sys_reg_to_cpu(val, reg))
__vcpu_assign_sys_reg(vcpu, reg, val);
}
Lists should have fixed constraints, because binding must be specific in
respect to hardware, thus add missing constraints to number of clocks.
Cc: <stable(a)vger.kernel.org>
Fixes: 88a499cd70d4 ("dt-bindings: Add support for the Broadcom UART driver")
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
---
Greg, patch for serial tree.
---
Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml b/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml
index 89c462653e2d..8cc848ae11cb 100644
--- a/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml
+++ b/Documentation/devicetree/bindings/serial/brcm,bcm7271-uart.yaml
@@ -41,7 +41,7 @@ properties:
- const: dma_intr2
clocks:
- minItems: 1
+ maxItems: 1
clock-names:
const: sw_baud
--
2.48.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x a0ee1d5faff135e28810f29e0f06328c66f89852
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025062039-anger-volumes-9d75@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a0ee1d5faff135e28810f29e0f06328c66f89852 Mon Sep 17 00:00:00 2001
From: Chao Gao <chao.gao(a)intel.com>
Date: Mon, 24 Mar 2025 22:08:48 +0800
Subject: [PATCH] KVM: VMX: Flush shadow VMCS on emergency reboot
Ensure the shadow VMCS cache is evicted during an emergency reboot to
prevent potential memory corruption if the cache is evicted after reboot.
This issue was identified through code inspection, as __loaded_vmcs_clear()
flushes both the normal VMCS and the shadow VMCS.
Avoid checking the "launched" state during an emergency reboot, unlike the
behavior in __loaded_vmcs_clear(). This is important because reboot NMIs
can interfere with operations like copy_shadow_to_vmcs12(), where shadow
VMCSes are loaded directly using VMPTRLD. In such cases, if NMIs occur
right after the VMCS load, the shadow VMCSes will be active but the
"launched" state may not be set.
Fixes: 16f5b9034b69 ("KVM: nVMX: Copy processor-specific shadow-vmcs to VMCS12")
Cc: stable(a)vger.kernel.org
Signed-off-by: Chao Gao <chao.gao(a)intel.com>
Reviewed-by: Kai Huang <kai.huang(a)intel.com>
Link: https://lore.kernel.org/r/20250324140849.2099723-1-chao.gao@intel.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ef2d7208dd20..848c4963bdb8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -770,8 +770,11 @@ void vmx_emergency_disable_virtualization_cpu(void)
return;
list_for_each_entry(v, &per_cpu(loaded_vmcss_on_cpu, cpu),
- loaded_vmcss_on_cpu_link)
+ loaded_vmcss_on_cpu_link) {
vmcs_clear(v->vmcs);
+ if (v->shadow_vmcs)
+ vmcs_clear(v->shadow_vmcs);
+ }
kvm_cpu_vmxoff();
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 17ec2f965344ee3fd6620bef7ef68792f4ac3af0
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025081208-commerce-sports-ea01@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 17ec2f965344ee3fd6620bef7ef68792f4ac3af0 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <seanjc(a)google.com>
Date: Tue, 10 Jun 2025 16:20:06 -0700
Subject: [PATCH] KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is
supported
Let the guest set DEBUGCTL.RTM_DEBUG if RTM is supported according to the
guest CPUID model, as debug support is supposed to be available if RTM is
supported, and there are no known downsides to letting the guest debug RTM
aborts.
Note, there are no known bug reports related to RTM_DEBUG, the primary
motivation is to reduce the probability of breaking existing guests when a
future change adds a missing consistency check on vmcs12.GUEST_DEBUGCTL
(KVM currently lets L2 run with whatever hardware supports; whoops).
Note #2, KVM already emulates DR6.RTM, and doesn't restrict access to
DR7.RTM.
Fixes: 83c529151ab0 ("KVM: x86: expose Intel cpu new features (HLE, RTM) to guest")
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20250610232010.162191-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..fa878b136eba 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -419,6 +419,7 @@
#define DEBUGCTLMSR_FREEZE_PERFMON_ON_PMI (1UL << 12)
#define DEBUGCTLMSR_FREEZE_IN_SMM_BIT 14
#define DEBUGCTLMSR_FREEZE_IN_SMM (1UL << DEBUGCTLMSR_FREEZE_IN_SMM_BIT)
+#define DEBUGCTLMSR_RTM_DEBUG BIT(15)
#define MSR_PEBS_FRONTEND 0x000003f7
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4ee6cc796855..311f6fa53b67 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2186,6 +2186,10 @@ static u64 vmx_get_supported_debugctl(struct kvm_vcpu *vcpu, bool host_initiated
(host_initiated || intel_pmu_lbr_is_enabled(vcpu)))
debugctl |= DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI;
+ if (boot_cpu_has(X86_FEATURE_RTM) &&
+ (host_initiated || guest_cpu_cap_has(vcpu, X86_FEATURE_RTM)))
+ debugctl |= DEBUGCTLMSR_RTM_DEBUG;
+
return debugctl;
}