From: "Michael C. Pratt" <mcpratt(a)pm.me>
On 11 Oct 2022, it was reported that the crc32 verification
of the u-boot environment failed only on big-endian systems
for the u-boot-env nvmem layout driver with the following error.
Invalid calculated CRC32: 0x88cd6f09 (expected: 0x096fcd88)
This problem has been present since the driver was introduced,
and before it was made into a layout driver.
The suggested fix at the time was to use further endianness
conversion macros in order to have both the stored and calculated
crc32 values to compare always represented in the system's endianness.
This was not accepted due to sparse warnings
and some disagreement on how to handle the situation.
Later on in a newer revision of the patch, it was proposed to use
cpu_to_le32() for both values to compare instead of le32_to_cpu()
and store the values as __le32 type to remove compilation errors.
The necessity of this is based on the assumption that the use of crc32()
requires endianness conversion because the algorithm uses little-endian,
however, this does not prove to be the case and the issue is unrelated.
Upon inspecting the current kernel code,
there already is an existing use of le32_to_cpu() in this driver,
which suggests there already is special handling for big-endian systems,
however, it is big-endian systems that have the problem.
This, being the only functional difference between architectures
in the driver combined with the fact that the suggested fix
was to use the exact same endianness conversion for the values
brings up the possibility that it was not necessary to begin with,
as the same endianness conversion for two values expected to be the same
is expected to be equivalent to no conversion at all.
After inspecting the u-boot environment of devices of both endianness
and trying to remove the existing endianness conversion,
the problem is resolved in an equivalent way as the other suggested fixes.
Ultimately, it seems that u-boot is agnostic to endianness
at least for the purpose of environment variables.
In other words, u-boot reads and writes the stored crc32 value
with the same endianness that the crc32 value is calculated with
in whichever endianness a certain architecture runs on.
Therefore, the u-boot-env driver does not need to convert endianness.
Remove the usage of endianness macros in the u-boot-env driver,
and change the type of local variables to maintain the same return type.
If there is a special situation in the case of endianness,
it would be a corner case and should be handled by a unique "compatible".
Even though it is not necessary to use endianness conversion macros here,
it may be useful to use them in the future for consistent error printing.
Fixes: d5542923f200 ("nvmem: add driver handling U-Boot environment variables")
Reported-by: INAGAKI Hiroshi <musashino.open(a)gmail.com>
Link: https://lore.kernel.org/all/20221011024928.1807-1-musashino.open@gmail.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael C. Pratt <mcpratt(a)pm.me>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
---
Changes since v1:
- removed long list of short git ids as it was too much for
small patch.
drivers/nvmem/layouts/u-boot-env.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/nvmem/layouts/u-boot-env.c b/drivers/nvmem/layouts/u-boot-env.c
index 436426d4e8f9..8571aac56295 100644
--- a/drivers/nvmem/layouts/u-boot-env.c
+++ b/drivers/nvmem/layouts/u-boot-env.c
@@ -92,7 +92,7 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
size_t crc32_data_offset;
size_t crc32_data_len;
size_t crc32_offset;
- __le32 *crc32_addr;
+ uint32_t *crc32_addr;
size_t data_offset;
size_t data_len;
size_t dev_size;
@@ -143,8 +143,8 @@ int u_boot_env_parse(struct device *dev, struct nvmem_device *nvmem,
goto err_kfree;
}
- crc32_addr = (__le32 *)(buf + crc32_offset);
- crc32 = le32_to_cpu(*crc32_addr);
+ crc32_addr = (uint32_t *)(buf + crc32_offset);
+ crc32 = *crc32_addr;
crc32_data_len = dev_size - crc32_data_offset;
data_len = dev_size - data_offset;
--
2.43.0
From: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
[ Upstream commit 691b5b53dbcc30bb3572cbb255374990723af0d2 ]
The display connector family of bridges is used on a plenty of ARM64
platforms (including, but not being limited to several Qualcomm Robotics
and Dragonboard platforms). It doesn't make sense for the DRM drivers to
select the driver, so select it via the defconfig.
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov(a)linaro.org>
Reviewed-by: Neil Armstrong <neil.armstrong(a)linaro.org>
Link: https://lore.kernel.org/r/20250214-arm64-display-connector-v1-1-306bca76316…
Signed-off-by: Bjorn Andersson <andersson(a)kernel.org>
[ Backport to 6.12.y ]
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
---
arch/arm64/configs/defconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 7e475f38f3e1..219ef05ee5a7 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -911,6 +911,7 @@ CONFIG_DRM_PANEL_SAMSUNG_ATNA33XC20=m
CONFIG_DRM_PANEL_SITRONIX_ST7703=m
CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA=m
CONFIG_DRM_PANEL_VISIONOX_VTDR6130=m
+CONFIG_DRM_DISPLAY_CONNECTOR=m
CONFIG_DRM_FSL_LDB=m
CONFIG_DRM_LONTIUM_LT8912B=m
CONFIG_DRM_LONTIUM_LT9611=m
--
2.45.2
Hi,
Charles Bordet reported the following issue (full context in
https://bugs.debian.org/1108860)
> Dear Maintainer,
>
> What led up to the situation?
> We run a production environment using Debian 12 VMs, with a network
> topology involving VXLAN tunnels encapsulated inside Wireguard
> interfaces. This setup has worked reliably for over a year, with MTU set
> to 1500 on all interfaces except the Wireguard interface (set to 1420).
> Wireguard kernel fragmentation allowed this configuration to function
> without issues, even though the effective path MTU is lower than 1500.
>
> What exactly did you do (or not do) that was effective (or ineffective)?
> We performed a routine system upgrade, updating all packages include the
> kernel. After the upgrade, we observed severe network issues (timeouts,
> very slow HTTP/HTTPS, and apt update failures) on all VMs behind the
> router. SSH and small-packet traffic continued to work.
>
> To diagnose, we:
>
> * Restored a backup (with the previous kernel): the problem disappeared.
> * Repeated the upgrade, confirming the issue reappeared.
> * Systematically tested each kernel version from 6.1.124-1 up to
> 6.1.140-1. The problem first appears with kernel 6.1.135-1; all earlier
> versions work as expected.
> * Kernel version from the backports (6.12.32-1) did not resolve the
> problem.
>
> What was the outcome of this action?
>
> * With kernel 6.1.135-1 or later, network timeouts occur for
> large-packet protocols (HTTP, apt, etc.), while SSH and small-packet
> protocols work.
> * With kernel 6.1.133-1 or earlier, everything works as expected.
>
> What outcome did you expect instead?
> We expected the network to function as before, with Wireguard handling
> fragmentation transparently and no application-level timeouts,
> regardless of the kernel version.
While triaging the issue we found that the commit 8930424777e4
("tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu()." introduces
the issue and Charles confirmed that the issue was present as well in
6.12.35 and 6.15.4 (other version up could potentially still be
affected, but we wanted to check it is not a 6.1.y specific
regression).
Reverthing the commit fixes Charles' issue.
Does that ring a bell?
Regards,
Salvatore
Under some circumstances, such as when a server socket is closing, ABORT
packets will be generated in response to incoming packets. Unfortunately,
this also may include generating aborts in response to incoming aborts -
which may cause a cycle. It appears this may be made possible by giving
the client a multicast address.
Fix this such that rxrpc_reject_packet() will refuse to generate aborts in
response to aborts.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Linus Torvalds <torvalds(a)linux-foundation.org>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
net/rxrpc/output.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index ef7b3096c95e..17c33b5cf7dd 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -814,6 +814,9 @@ void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
__be32 code;
int ret, ioc;
+ if (sp->hdr.type == RXRPC_PACKET_TYPE_ABORT)
+ return; /* Never abort an abort. */
+
rxrpc_see_skb(skb, rxrpc_skb_see_reject);
iov[0].iov_base = &whdr;
If a call receives an event (such as incoming data), the call gets placed
on the socket's queue and a thread in recvmsg can be awakened to go and
process it. Once the thread has picked up the call off of the queue,
further events will cause it to be requeued, and once the socket lock is
dropped (recvmsg uses call->user_mutex to allow the socket to be used in
parallel), a second thread can come in and its recvmsg can pop the call off
the socket queue again.
In such a case, the first thread will be receiving stuff from the call and
the second thread will be blocked on call->user_mutex. The first thread
can, at this point, process both the event that it picked call for and the
event that the second thread picked the call for and may see the call
terminate - in which case the call will be "released", decoupling the call
from the user call ID assigned to it (RXRPC_USER_CALL_ID in the control
message).
The first thread will return okay, but then the second thread will wake up
holding the user_mutex and, if it sees that the call has been released by
the first thread, it will BUG thusly:
kernel BUG at net/rxrpc/recvmsg.c:474!
Fix this by just dequeuing the call and ignoring it if it is seen to be
already released. We can't tell userspace about it anyway as the user call
ID has become stale.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
include/trace/events/rxrpc.h | 3 +++
net/rxrpc/call_accept.c | 1 +
net/rxrpc/recvmsg.c | 19 +++++++++++++++++--
3 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/include/trace/events/rxrpc.h b/include/trace/events/rxrpc.h
index 378d2dfc7392..e7dcfb1369b6 100644
--- a/include/trace/events/rxrpc.h
+++ b/include/trace/events/rxrpc.h
@@ -330,12 +330,15 @@
EM(rxrpc_call_put_userid, "PUT user-id ") \
EM(rxrpc_call_see_accept, "SEE accept ") \
EM(rxrpc_call_see_activate_client, "SEE act-clnt") \
+ EM(rxrpc_call_see_already_released, "SEE alrdy-rl") \
EM(rxrpc_call_see_connect_failed, "SEE con-fail") \
EM(rxrpc_call_see_connected, "SEE connect ") \
EM(rxrpc_call_see_conn_abort, "SEE conn-abt") \
+ EM(rxrpc_call_see_discard, "SEE discard ") \
EM(rxrpc_call_see_disconnected, "SEE disconn ") \
EM(rxrpc_call_see_distribute_error, "SEE dist-err") \
EM(rxrpc_call_see_input, "SEE input ") \
+ EM(rxrpc_call_see_recvmsg, "SEE recvmsg ") \
EM(rxrpc_call_see_release, "SEE release ") \
EM(rxrpc_call_see_userid_exists, "SEE u-exists") \
EM(rxrpc_call_see_waiting_call, "SEE q-conn ") \
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 226b4bf82747..a4d76f2da684 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -219,6 +219,7 @@ void rxrpc_discard_prealloc(struct rxrpc_sock *rx)
tail = b->call_backlog_tail;
while (CIRC_CNT(head, tail, size) > 0) {
struct rxrpc_call *call = b->call_backlog[tail];
+ rxrpc_see_call(call, rxrpc_call_see_discard);
rcu_assign_pointer(call->socket, rx);
if (rx->app_ops &&
rx->app_ops->discard_new_call) {
diff --git a/net/rxrpc/recvmsg.c b/net/rxrpc/recvmsg.c
index 86a27fb55a1c..6990e37697de 100644
--- a/net/rxrpc/recvmsg.c
+++ b/net/rxrpc/recvmsg.c
@@ -447,6 +447,16 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
goto try_again;
}
+ rxrpc_see_call(call, rxrpc_call_see_recvmsg);
+ if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) {
+ rxrpc_see_call(call, rxrpc_call_see_already_released);
+ list_del_init(&call->recvmsg_link);
+ spin_unlock_irq(&rx->recvmsg_lock);
+ release_sock(&rx->sk);
+ trace_rxrpc_recvmsg(call->debug_id, rxrpc_recvmsg_unqueue, 0);
+ rxrpc_put_call(call, rxrpc_call_put_recvmsg);
+ goto try_again;
+ }
if (!(flags & MSG_PEEK))
list_del_init(&call->recvmsg_link);
else
@@ -470,8 +480,13 @@ int rxrpc_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
release_sock(&rx->sk);
- if (test_bit(RXRPC_CALL_RELEASED, &call->flags))
- BUG();
+ if (test_bit(RXRPC_CALL_RELEASED, &call->flags)) {
+ rxrpc_see_call(call, rxrpc_call_see_already_released);
+ mutex_unlock(&call->user_mutex);
+ if (!(flags & MSG_PEEK))
+ rxrpc_put_call(call, rxrpc_call_put_recvmsg);
+ goto try_again;
+ }
ret = rxrpc_recvmsg_user_id(call, msg, flags);
if (ret < 0)
The rxrpc_assess_MTU_size() function calls down into the IP layer to find
out the MTU size for a route. When accepting an incoming call, this is
called from rxrpc_new_incoming_call() which holds interrupts disabled
across the code that calls down to it. Unfortunately, the IP layer uses
local_bh_enable() which, config dependent, throws a warning if IRQs are
enabled:
WARNING: CPU: 1 PID: 5544 at kernel/softirq.c:387 __local_bh_enable_ip+0x43/0xd0
...
RIP: 0010:__local_bh_enable_ip+0x43/0xd0
...
Call Trace:
<TASK>
rt_cache_route+0x7e/0xa0
rt_set_nexthop.isra.0+0x3b3/0x3f0
__mkroute_output+0x43a/0x460
ip_route_output_key_hash+0xf7/0x140
ip_route_output_flow+0x1b/0x90
rxrpc_assess_MTU_size.isra.0+0x2a0/0x590
rxrpc_new_incoming_peer+0x46/0x120
rxrpc_alloc_incoming_call+0x1b1/0x400
rxrpc_new_incoming_call+0x1da/0x5e0
rxrpc_input_packet+0x827/0x900
rxrpc_io_thread+0x403/0xb60
kthread+0x2f7/0x310
ret_from_fork+0x2a/0x230
ret_from_fork_asm+0x1a/0x30
...
hardirqs last enabled at (23): _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (24): _raw_read_lock_irq+0x17/0x70
softirqs last enabled at (0): copy_process+0xc61/0x2730
softirqs last disabled at (25): rt_add_uncached_list+0x3c/0x90
Fix this by moving the call to rxrpc_assess_MTU_size() out of
rxrpc_init_peer() and further up the stack where it can be done without
interrupts disabled.
It shouldn't be a problem for rxrpc_new_incoming_call() to do it after the
locks are dropped as pmtud is going to be performed by the I/O thread - and
we're in the I/O thread at this point.
Fixes: a2ea9a907260 ("rxrpc: Use irq-disabling spinlocks between app and I/O thread")
Signed-off-by: David Howells <dhowells(a)redhat.com>
Reviewed-by: Jeffrey Altman <jaltman(a)auristor.com>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Jakub Kicinski <kuba(a)kernel.org>
cc: Paolo Abeni <pabeni(a)redhat.com>
cc: "David S. Miller" <davem(a)davemloft.net>
cc: Eric Dumazet <edumazet(a)google.com>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
cc: netdev(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
net/rxrpc/ar-internal.h | 1 +
net/rxrpc/call_accept.c | 1 +
net/rxrpc/peer_object.c | 6 ++----
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 376e33dce8c1..df1a618dbf7d 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1383,6 +1383,7 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *,
const struct sockaddr_rxrpc *);
struct rxrpc_peer *rxrpc_lookup_peer(struct rxrpc_local *local,
struct sockaddr_rxrpc *srx, gfp_t gfp);
+void rxrpc_assess_MTU_size(struct rxrpc_local *local, struct rxrpc_peer *peer);
struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *, gfp_t,
enum rxrpc_peer_trace);
void rxrpc_new_incoming_peer(struct rxrpc_local *local, struct rxrpc_peer *peer);
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 49fccee1a726..226b4bf82747 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -406,6 +406,7 @@ bool rxrpc_new_incoming_call(struct rxrpc_local *local,
spin_unlock(&rx->incoming_lock);
read_unlock_irq(&local->services_lock);
+ rxrpc_assess_MTU_size(local, call->peer);
if (hlist_unhashed(&call->error_link)) {
spin_lock_irq(&call->peer->lock);
diff --git a/net/rxrpc/peer_object.c b/net/rxrpc/peer_object.c
index e2f35e6c04d6..366431b0736c 100644
--- a/net/rxrpc/peer_object.c
+++ b/net/rxrpc/peer_object.c
@@ -149,8 +149,7 @@ struct rxrpc_peer *rxrpc_lookup_peer_rcu(struct rxrpc_local *local,
* assess the MTU size for the network interface through which this peer is
* reached
*/
-static void rxrpc_assess_MTU_size(struct rxrpc_local *local,
- struct rxrpc_peer *peer)
+void rxrpc_assess_MTU_size(struct rxrpc_local *local, struct rxrpc_peer *peer)
{
struct net *net = local->net;
struct dst_entry *dst;
@@ -277,8 +276,6 @@ static void rxrpc_init_peer(struct rxrpc_local *local, struct rxrpc_peer *peer,
peer->hdrsize += sizeof(struct rxrpc_wire_header);
peer->max_data = peer->if_mtu - peer->hdrsize;
-
- rxrpc_assess_MTU_size(local, peer);
}
/*
@@ -297,6 +294,7 @@ static struct rxrpc_peer *rxrpc_create_peer(struct rxrpc_local *local,
if (peer) {
memcpy(&peer->srx, srx, sizeof(*srx));
rxrpc_init_peer(local, peer, hash_key);
+ rxrpc_assess_MTU_size(local, peer);
}
_leave(" = %p", peer);
> This is the start of the stable review cycle for the 6.15.7 release.
> There are 192 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu, 17 Jul 2025 13:07:32 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.15.7-rc1…
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.15.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
Tested Linux kernel 6.15.7-rc1 on Fedora 37 (x86_64) with Intel i7-11800H.
All major tests including boot, Wi-Fi, Bluetooth, audio, video, and USB
mass storage detection passed successfully.
Kernel Version : 6.15.7-rc1
Fedora Version : 37 (Thirty Seven)
Processor : 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
Build Architecture: x86_64
Test Results:
- Boot Test : PASS
- Wi-Fi Test : PASS
- Bluetooth Test : PASS
- Audio Test : PASS
- Video Test : PASS
- USB Mass Storage Drive Detect : PASS
Tested-by: Dileep Malepu <dileep.debian(a)gmail.com>
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
A modern Linux system creates much more than 20 threads at bootup.
When I booted up OpenWrt in qemu the system sometimes failed to boot up
when it wanted to create the 419th thread. The VM had 128MB RAM and the
calculation in set_max_threads() calculated that max_threads should be
set to 419. When the system booted up it tried to notify the user space
about every device it created because CONFIG_UEVENT_HELPER was set and
used. I counted 1299 calles to call_usermodehelper_setup(), all of
them try to create a new thread and call the userspace hotplug script in
it.
This fixes bootup of Linux on systems with low memory.
I saw the problem with qemu 10.0.2 using these commands:
qemu-system-aarch64 -machine virt -cpu cortex-a57 -nographic
Cc: stable(a)vger.kernel.org
Signed-off-by: Hauke Mehrtens <hauke(a)hauke-m.de>
---
kernel/fork.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/fork.c b/kernel/fork.c
index 7966c9a1c163..388299525f3c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -115,7 +115,7 @@
/*
* Minimum number of threads to boot the kernel
*/
-#define MIN_THREADS 20
+#define MIN_THREADS 600
/*
* Maximum number of threads
--
2.50.1
The Cadence PCIe Controller integrated in the TI K3 SoCs supports both
Root-Complex and Endpoint modes of operation. The Glue Layer allows
"strapping" the mode of operation of the Controller, the Link Speed
and the Link Width. This is enabled by programming the "PCIEn_CTRL"
register (n corresponds to the PCIe instance) within the CTRL_MMR
memory-mapped register space.
In the PCIe Controller's register space, the same set of registers
that correspond to the Root-Port configuration space when the
Controller is configured for Root-Complex mode of operation, also
correspond to the Physical Function configuration space when the
Controller is configured for Endpoint mode of operation. As a result,
the "reset-value" of these set of registers _should_ vary depending
on the selected mode of operation. This is the expected behavior
according to the description of the registers and their reset values
in the Technical Reference Manual for the SoCs.
However, it is observed that the "reset-value" seen in practice
do not match the description. To be precise, when the Controller
is configured for Root-Complex mode of operation, the "reset-value"
of the Root-Port configuration space reflect the "reset-value"
corresponding to the Physical Function configuration space.
This can be attributed to the fact that the "strap" settings play
a role in "switching" the "reset-value" of the registers to match
the expected values as determined by the selected mode of operation.
Since the "strap" settings are sampled the moment the PCIe Controller
is powered ON, the "reset-value" of the registers are setup at that
point in time. As a result, if the "strap" settings are programmed
at a later point in time, it _will not_ update the "reset-value" of
the registers. This will cause the Physical Function configuration
space to be seen when the Root-Port configuration space is accessed
after programming the PCIe Controller for Root-Complex mode of
operation.
Fix this by powering off the PCIe Controller before programming the
"strap" settings and powering it on after that. This will ensure
that the "strap" settings that have been sampled convey the intended
mode of operation, thereby resulting in the "reset-value" of the
registers being accurate.
Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Siddharth Vadapalli <s-vadapalli(a)ti.com>
---
Hello,
This patch is based on commit
155a3c003e55 Merge tag 'for-6.16/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
of Mainline Linux.
Patch has been validated on the J7200 SoC which is affected by
the existing programming sequence of the "strap" settings.
Regards,
Siddharth.
drivers/pci/controller/cadence/pci-j721e.c | 82 ++++++++++++++--------
1 file changed, 53 insertions(+), 29 deletions(-)
diff --git a/drivers/pci/controller/cadence/pci-j721e.c b/drivers/pci/controller/cadence/pci-j721e.c
index 6c93f39d0288..d5e7cb7277dc 100644
--- a/drivers/pci/controller/cadence/pci-j721e.c
+++ b/drivers/pci/controller/cadence/pci-j721e.c
@@ -19,6 +19,7 @@
#include <linux/of.h>
#include <linux/pci.h>
#include <linux/platform_device.h>
+#include <linux/pm_domain.h>
#include <linux/pm_runtime.h>
#include <linux/regmap.h>
@@ -173,10 +174,9 @@ static const struct cdns_pcie_ops j721e_pcie_ops = {
.link_up = j721e_pcie_link_up,
};
-static int j721e_pcie_set_mode(struct j721e_pcie *pcie, struct regmap *syscon,
- unsigned int offset)
+static int j721e_pcie_set_mode(struct j721e_pcie *pcie, struct device *dev,
+ struct regmap *syscon, unsigned int offset)
{
- struct device *dev = pcie->cdns_pcie->dev;
u32 mask = J721E_MODE_RC;
u32 mode = pcie->mode;
u32 val = 0;
@@ -193,9 +193,9 @@ static int j721e_pcie_set_mode(struct j721e_pcie *pcie, struct regmap *syscon,
}
static int j721e_pcie_set_link_speed(struct j721e_pcie *pcie,
+ struct device *dev,
struct regmap *syscon, unsigned int offset)
{
- struct device *dev = pcie->cdns_pcie->dev;
struct device_node *np = dev->of_node;
int link_speed;
u32 val = 0;
@@ -214,9 +214,9 @@ static int j721e_pcie_set_link_speed(struct j721e_pcie *pcie,
}
static int j721e_pcie_set_lane_count(struct j721e_pcie *pcie,
+ struct device *dev,
struct regmap *syscon, unsigned int offset)
{
- struct device *dev = pcie->cdns_pcie->dev;
u32 lanes = pcie->num_lanes;
u32 mask = BIT(8);
u32 val = 0;
@@ -234,9 +234,9 @@ static int j721e_pcie_set_lane_count(struct j721e_pcie *pcie,
}
static int j721e_enable_acspcie_refclk(struct j721e_pcie *pcie,
+ struct device *dev,
struct regmap *syscon)
{
- struct device *dev = pcie->cdns_pcie->dev;
struct device_node *node = dev->of_node;
u32 mask = ACSPCIE_PAD_DISABLE_MASK;
struct of_phandle_args args;
@@ -263,9 +263,8 @@ static int j721e_enable_acspcie_refclk(struct j721e_pcie *pcie,
return 0;
}
-static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie)
+static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie, struct device *dev)
{
- struct device *dev = pcie->cdns_pcie->dev;
struct device_node *node = dev->of_node;
struct of_phandle_args args;
unsigned int offset = 0;
@@ -284,19 +283,19 @@ static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie)
if (!ret)
offset = args.args[0];
- ret = j721e_pcie_set_mode(pcie, syscon, offset);
+ ret = j721e_pcie_set_mode(pcie, dev, syscon, offset);
if (ret < 0) {
dev_err(dev, "Failed to set pci mode\n");
return ret;
}
- ret = j721e_pcie_set_link_speed(pcie, syscon, offset);
+ ret = j721e_pcie_set_link_speed(pcie, dev, syscon, offset);
if (ret < 0) {
dev_err(dev, "Failed to set link speed\n");
return ret;
}
- ret = j721e_pcie_set_lane_count(pcie, syscon, offset);
+ ret = j721e_pcie_set_lane_count(pcie, dev, syscon, offset);
if (ret < 0) {
dev_err(dev, "Failed to set num-lanes\n");
return ret;
@@ -308,7 +307,7 @@ static int j721e_pcie_ctrl_init(struct j721e_pcie *pcie)
if (!syscon)
return 0;
- return j721e_enable_acspcie_refclk(pcie, syscon);
+ return j721e_enable_acspcie_refclk(pcie, dev, syscon);
}
static int cdns_ti_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
@@ -469,6 +468,47 @@ static int j721e_pcie_probe(struct platform_device *pdev)
if (!pcie)
return -ENOMEM;
+ pcie->mode = mode;
+
+ ret = of_property_read_u32(node, "num-lanes", &num_lanes);
+ if (ret || num_lanes > data->max_lanes) {
+ dev_warn(dev, "num-lanes property not provided or invalid, setting num-lanes to 1\n");
+ num_lanes = 1;
+ }
+
+ pcie->num_lanes = num_lanes;
+ pcie->max_lanes = data->max_lanes;
+
+ /*
+ * The PCIe Controller's registers have different "reset-value" depending
+ * on the "strap" settings programmed into the Controller's Glue Layer.
+ * This is because the same set of registers are used for representing the
+ * Physical Function configuration space in Endpoint mode and for
+ * representing the Root-Port configuration space in Root-Complex mode.
+ *
+ * The registers latch onto a "reset-value" based on the "strap" settings
+ * sampled after the Controller is powered on. Therefore, for the
+ * "reset-value" to be accurate, it is necessary to program the "strap"
+ * settings when the Controller is powered off, and power on the Controller
+ * after the "strap" settings have been programmed.
+ *
+ * The "strap" settings are programmed by "j721e_pcie_ctrl_init()".
+ * Therefore, power off the Controller before invoking "j721e_pcie_ctrl_init()",
+ * program the "strap" settings, and then power on the Controller. This ensures
+ * that the reset values are accurate and reflect the "strap" settings.
+ */
+ dev_pm_domain_detach(dev, true);
+
+ ret = j721e_pcie_ctrl_init(pcie, dev);
+ if (ret < 0)
+ return ret;
+
+ ret = dev_pm_domain_attach(dev, true);
+ if (ret < 0) {
+ dev_err(dev, "failed to power on PCIe Controller\n");
+ return ret;
+ }
+
switch (mode) {
case PCI_MODE_RC:
if (!IS_ENABLED(CONFIG_PCI_J721E_HOST))
@@ -510,7 +550,6 @@ static int j721e_pcie_probe(struct platform_device *pdev)
return 0;
}
- pcie->mode = mode;
pcie->linkdown_irq_regfield = data->linkdown_irq_regfield;
base = devm_platform_ioremap_resource_byname(pdev, "intd_cfg");
@@ -523,15 +562,6 @@ static int j721e_pcie_probe(struct platform_device *pdev)
return PTR_ERR(base);
pcie->user_cfg_base = base;
- ret = of_property_read_u32(node, "num-lanes", &num_lanes);
- if (ret || num_lanes > data->max_lanes) {
- dev_warn(dev, "num-lanes property not provided or invalid, setting num-lanes to 1\n");
- num_lanes = 1;
- }
-
- pcie->num_lanes = num_lanes;
- pcie->max_lanes = data->max_lanes;
-
if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48)))
return -EINVAL;
@@ -547,12 +577,6 @@ static int j721e_pcie_probe(struct platform_device *pdev)
goto err_get_sync;
}
- ret = j721e_pcie_ctrl_init(pcie);
- if (ret < 0) {
- dev_err_probe(dev, ret, "pm_runtime_get_sync failed\n");
- goto err_get_sync;
- }
-
ret = devm_request_irq(dev, irq, j721e_pcie_link_irq_handler, 0,
"j721e-pcie-link-down-irq", pcie);
if (ret < 0) {
@@ -680,7 +704,7 @@ static int j721e_pcie_resume_noirq(struct device *dev)
struct cdns_pcie *cdns_pcie = pcie->cdns_pcie;
int ret;
- ret = j721e_pcie_ctrl_init(pcie);
+ ret = j721e_pcie_ctrl_init(pcie, dev);
if (ret < 0)
return ret;
--
2.34.1
A new warning in clang [1] points out that id_reg is uninitialized then
passed to memstick_init_req() as a const pointer:
drivers/memstick/core/memstick.c:330:59: error: variable 'id_reg' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer]
330 | memstick_init_req(&card->current_mrq, MS_TPC_READ_REG, &id_reg,
| ^~~~~~
Commit de182cc8e882 ("drivers/memstick/core/memstick.c: avoid -Wnonnull
warning") intentionally passed this variable uninitialized to avoid an
-Wnonnull warning from a NULL value that was previously there because
id_reg is never read from the call to memstick_init_req() in
h_memstick_read_dev_id(). Just zero initialize id_reg to avoid the
warning, which is likely happening in the majority of builds using
modern compilers that support '-ftrivial-auto-var-init=zero'.
Cc: stable(a)vger.kernel.org
Fixes: de182cc8e882 ("drivers/memstick/core/memstick.c: avoid -Wnonnull warning")
Link: https://github.com/llvm/llvm-project/commit/00dacf8c22f065cb52efb14cd091d44… [1]
Closes: https://github.com/ClangBuiltLinux/linux/issues/2105
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/memstick/core/memstick.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/memstick/core/memstick.c b/drivers/memstick/core/memstick.c
index 043b9ec756ff..7f3f47db4c98 100644
--- a/drivers/memstick/core/memstick.c
+++ b/drivers/memstick/core/memstick.c
@@ -324,7 +324,7 @@ EXPORT_SYMBOL(memstick_init_req);
static int h_memstick_read_dev_id(struct memstick_dev *card,
struct memstick_request **mrq)
{
- struct ms_id_register id_reg;
+ struct ms_id_register id_reg = {};
if (!(*mrq)) {
memstick_init_req(&card->current_mrq, MS_TPC_READ_REG, &id_reg,
---
base-commit: ff09b71bf9daeca4f21d6e5e449641c9fad75b53
change-id: 20250715-memstick-fix-uninit-const-pointer-ed6f138bf40d
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The ready event list of an epoll object is protected by read-write
semaphore:
- The consumer (waiter) acquires the write lock and takes items.
- the producer (waker) takes the read lock and adds items.
The point of this design is enabling epoll to scale well with large number
of producers, as multiple producers can hold the read lock at the same
time.
Unfortunately, this implementation may cause scheduling priority inversion
problem. Suppose the consumer has higher scheduling priority than the
producer. The consumer needs to acquire the write lock, but may be blocked
by the producer holding the read lock. Since read-write semaphore does not
support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y),
we have a case of priority inversion: a higher priority consumer is blocked
by a lower priority producer. This problem was reported in [1].
Furthermore, this could also cause stall problem, as described in [2].
Fix this problem by replacing rwlock with spinlock.
This reduces the event bandwidth, as the producers now have to contend with
each other for the spinlock. According to the benchmark from
https://github.com/rouming/test-tools/blob/master/stress-epoll.c:
On 12 x86 CPUs:
Before After Diff
threads events/ms events/ms
8 7162 4956 -31%
16 8733 5383 -38%
32 7968 5572 -30%
64 10652 5739 -46%
128 11236 5931 -47%
On 4 riscv CPUs:
Before After Diff
threads events/ms events/ms
8 2958 2833 -4%
16 3323 3097 -7%
32 3451 3240 -6%
64 3554 3178 -11%
128 3601 3235 -10%
Although the numbers look bad, it should be noted that this benchmark
creates multiple threads who do nothing except constantly generating new
epoll events, thus contention on the spinlock is high. For real workload,
the event rate is likely much lower, and the performance drop is not as
bad.
Using another benchmark (perf bench epoll wait) where spinlock contention
is lower, improvement is even observed on x86:
On 12 x86 CPUs:
Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8
After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8
On 4 riscv CPUs:
Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8
After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8
In conclusion, no one is likely to be upset over this change. After all,
spinlock was used originally for years, and the commit which converted to
rwlock didn't mention a real workload, just that the benchmark numbers are
nice.
This patch is not exactly the revert of commit a218cc491420 ("epoll: use
rwlock in order to reduce ep_poll_callback() contention"), because git
revert conflicts in some places which are not obvious on the resolution.
This patch is intended to be backported, therefore go with the obvious
approach:
- Replace rwlock_t with spinlock_t one to one
- Delete list_add_tail_lockless() and chain_epi_lockless(). These were
introduced to allow producers to concurrently add items to the list.
But now that spinlock no longer allows producers to touch the event
list concurrently, these two functions are not necessary anymore.
Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
fs/eventpoll.c | 139 +++++++++----------------------------------------
1 file changed, 26 insertions(+), 113 deletions(-)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 0fbf5dfedb24..a171f7e7dacc 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -46,10 +46,10 @@
*
* 1) epnested_mutex (mutex)
* 2) ep->mtx (mutex)
- * 3) ep->lock (rwlock)
+ * 3) ep->lock (spinlock)
*
* The acquire order is the one listed above, from 1 to 3.
- * We need a rwlock (ep->lock) because we manipulate objects
+ * We need a spinlock (ep->lock) because we manipulate objects
* from inside the poll callback, that might be triggered from
* a wake_up() that in turn might be called from IRQ context.
* So we can't sleep inside the poll callback and hence we need
@@ -195,7 +195,7 @@ struct eventpoll {
struct list_head rdllist;
/* Lock which protects rdllist and ovflist */
- rwlock_t lock;
+ spinlock_t lock;
/* RB tree root used to store monitored fd structs */
struct rb_root_cached rbr;
@@ -740,10 +740,10 @@ static void ep_start_scan(struct eventpoll *ep, struct list_head *txlist)
* in a lockless way.
*/
lockdep_assert_irqs_enabled();
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
list_splice_init(&ep->rdllist, txlist);
WRITE_ONCE(ep->ovflist, NULL);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_done_scan(struct eventpoll *ep,
@@ -751,7 +751,7 @@ static void ep_done_scan(struct eventpoll *ep,
{
struct epitem *epi, *nepi;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* During the time we spent inside the "sproc" callback, some
* other events might have been queued by the poll callback.
@@ -792,7 +792,7 @@ static void ep_done_scan(struct eventpoll *ep,
wake_up(&ep->wq);
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
static void ep_get(struct eventpoll *ep)
@@ -867,10 +867,10 @@ static bool __ep_remove(struct eventpoll *ep, struct epitem *epi, bool force)
rb_erase_cached(&epi->rbn, &ep->rbr);
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (ep_is_linked(epi))
list_del_init(&epi->rdllink);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
wakeup_source_unregister(ep_wakeup_source(epi));
/*
@@ -1151,7 +1151,7 @@ static int ep_alloc(struct eventpoll **pep)
return -ENOMEM;
mutex_init(&ep->mtx);
- rwlock_init(&ep->lock);
+ spin_lock_init(&ep->lock);
init_waitqueue_head(&ep->wq);
init_waitqueue_head(&ep->poll_wait);
INIT_LIST_HEAD(&ep->rdllist);
@@ -1238,100 +1238,10 @@ struct file *get_epoll_tfile_raw_ptr(struct file *file, int tfd,
}
#endif /* CONFIG_KCMP */
-/*
- * Adds a new entry to the tail of the list in a lockless way, i.e.
- * multiple CPUs are allowed to call this function concurrently.
- *
- * Beware: it is necessary to prevent any other modifications of the
- * existing list until all changes are completed, in other words
- * concurrent list_add_tail_lockless() calls should be protected
- * with a read lock, where write lock acts as a barrier which
- * makes sure all list_add_tail_lockless() calls are fully
- * completed.
- *
- * Also an element can be locklessly added to the list only in one
- * direction i.e. either to the tail or to the head, otherwise
- * concurrent access will corrupt the list.
- *
- * Return: %false if element has been already added to the list, %true
- * otherwise.
- */
-static inline bool list_add_tail_lockless(struct list_head *new,
- struct list_head *head)
-{
- struct list_head *prev;
-
- /*
- * This is simple 'new->next = head' operation, but cmpxchg()
- * is used in order to detect that same element has been just
- * added to the list from another CPU: the winner observes
- * new->next == new.
- */
- if (!try_cmpxchg(&new->next, &new, head))
- return false;
-
- /*
- * Initially ->next of a new element must be updated with the head
- * (we are inserting to the tail) and only then pointers are atomically
- * exchanged. XCHG guarantees memory ordering, thus ->next should be
- * updated before pointers are actually swapped and pointers are
- * swapped before prev->next is updated.
- */
-
- prev = xchg(&head->prev, new);
-
- /*
- * It is safe to modify prev->next and new->prev, because a new element
- * is added only to the tail and new->next is updated before XCHG.
- */
-
- prev->next = new;
- new->prev = prev;
-
- return true;
-}
-
-/*
- * Chains a new epi entry to the tail of the ep->ovflist in a lockless way,
- * i.e. multiple CPUs are allowed to call this function concurrently.
- *
- * Return: %false if epi element has been already chained, %true otherwise.
- */
-static inline bool chain_epi_lockless(struct epitem *epi)
-{
- struct eventpoll *ep = epi->ep;
-
- /* Fast preliminary check */
- if (epi->next != EP_UNACTIVE_PTR)
- return false;
-
- /* Check that the same epi has not been just chained from another CPU */
- if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR)
- return false;
-
- /* Atomically exchange tail */
- epi->next = xchg(&ep->ovflist, epi);
-
- return true;
-}
-
/*
* This is the callback that is passed to the wait queue wakeup
* mechanism. It is called by the stored file descriptors when they
* have events to report.
- *
- * This callback takes a read lock in order not to contend with concurrent
- * events from another file descriptor, thus all modifications to ->rdllist
- * or ->ovflist are lockless. Read lock is paired with the write lock from
- * ep_start/done_scan(), which stops all list modifications and guarantees
- * that lists state is seen correctly.
- *
- * Another thing worth to mention is that ep_poll_callback() can be called
- * concurrently for the same @epi from different CPUs if poll table was inited
- * with several wait queues entries. Plural wakeup from different CPUs of a
- * single wait queue is serialized by wq.lock, but the case when multiple wait
- * queues are used should be detected accordingly. This is detected using
- * cmpxchg() operation.
*/
static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
{
@@ -1342,7 +1252,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
unsigned long flags;
int ewake = 0;
- read_lock_irqsave(&ep->lock, flags);
+ spin_lock_irqsave(&ep->lock, flags);
ep_set_busy_poll_napi_id(epi);
@@ -1371,12 +1281,15 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
* chained in ep->ovflist and requeued later on.
*/
if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) {
- if (chain_epi_lockless(epi))
+ if (epi->next == EP_UNACTIVE_PTR) {
+ epi->next = READ_ONCE(ep->ovflist);
+ WRITE_ONCE(ep->ovflist, epi);
ep_pm_stay_awake_rcu(epi);
+ }
} else if (!ep_is_linked(epi)) {
/* In the usual case, add event to ready list. */
- if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist))
- ep_pm_stay_awake_rcu(epi);
+ list_add_tail(&epi->rdllink, &ep->rdllist);
+ ep_pm_stay_awake_rcu(epi);
}
/*
@@ -1409,7 +1322,7 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
pwake++;
out_unlock:
- read_unlock_irqrestore(&ep->lock, flags);
+ spin_unlock_irqrestore(&ep->lock, flags);
/* We have to call this outside the lock */
if (pwake)
@@ -1744,7 +1657,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
}
/* We have to drop the new item inside our item list to keep track of it */
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/* record NAPI ID of new item if present */
ep_set_busy_poll_napi_id(epi);
@@ -1761,7 +1674,7 @@ static int ep_insert(struct eventpoll *ep, const struct epoll_event *event,
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
/* We have to call this outside the lock */
if (pwake)
@@ -1825,7 +1738,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
* list, push it inside.
*/
if (ep_item_poll(epi, &pt, 1)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
if (!ep_is_linked(epi)) {
list_add_tail(&epi->rdllink, &ep->rdllist);
ep_pm_stay_awake(epi);
@@ -1836,7 +1749,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi,
if (waitqueue_active(&ep->poll_wait))
pwake++;
}
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
/* We have to call this outside the lock */
@@ -2088,7 +2001,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
init_wait(&wait);
wait.func = ep_autoremove_wake_function;
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* Barrierless variant, waitqueue_active() is called under
* the same lock on wakeup ep_poll_callback() side, so it
@@ -2107,7 +2020,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (!eavail)
__add_wait_queue_exclusive(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
if (!eavail)
timed_out = !ep_schedule_timeout(to) ||
@@ -2123,7 +2036,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
eavail = 1;
if (!list_empty_careful(&wait.entry)) {
- write_lock_irq(&ep->lock);
+ spin_lock_irq(&ep->lock);
/*
* If the thread timed out and is not on the wait queue,
* it means that the thread was woken up after its
@@ -2134,7 +2047,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
if (timed_out)
eavail = list_empty(&wait.entry);
__remove_wait_queue(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ spin_unlock_irq(&ep->lock);
}
}
}
--
2.39.5
Fix the order of the freq-table-hz property, then convert to OPP tables
and add interconnect support for UFS for the SM6350 SoC.
Signed-off-by: Luca Weiss <luca.weiss(a)fairphone.com>
---
Luca Weiss (3):
arm64: dts: qcom: sm6350: Fix wrong order of freq-table-hz for UFS
arm64: dts: qcom: sm6350: Add OPP table support to UFSHC
arm64: dts: qcom: sm6350: Add interconnect support to UFS
arch/arm64/boot/dts/qcom/sm6350.dtsi | 49 ++++++++++++++++++++++++++++--------
1 file changed, 39 insertions(+), 10 deletions(-)
---
base-commit: eea255893718268e1ab852fb52f70c613d109b99
change-id: 20250314-sm6350-ufs-things-53c5de9fec5e
Best regards,
--
Luca Weiss <luca.weiss(a)fairphone.com>
After a recent change in clang to strengthen uninitialized warnings [1],
it points out that in one of the error paths in parse_btf_arg(), params
is used uninitialized:
kernel/trace/trace_probe.c:660:19: warning: variable 'params' is uninitialized when used here [-Wuninitialized]
660 | return PTR_ERR(params);
| ^~~~~~
Match many other NO_BTF_ENTRY error cases and return -ENOENT, clearing
up the warning.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2110
Fixes: d157d7694460 ("tracing/probes: Support BTF field access from $retval")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cd… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
kernel/trace/trace_probe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c
index 424751cdf31f..40830a3ecd96 100644
--- a/kernel/trace/trace_probe.c
+++ b/kernel/trace/trace_probe.c
@@ -657,7 +657,7 @@ static int parse_btf_arg(char *varname,
ret = query_btf_context(ctx);
if (ret < 0 || ctx->nr_params == 0) {
trace_probe_log_err(ctx->offset, NO_BTF_ENTRY);
- return PTR_ERR(params);
+ return -ENOENT;
}
}
params = ctx->params;
---
base-commit: 6921d1e07cb5eddec830801087b419194fde0803
change-id: 20250715-trace_probe-fix-const-uninit-warning-7dc3accce903
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
This series contains 3 fixes somewhat related to various races we have
while handling fallback.
The root cause of the issues addressed here is that the check for
"we can fallback to tcp now" and the related action are not atomic. That
also applies to fallback due to MP_FAIL -- where the window race is even
wider.
Address the issue introducing an additional spinlock to bundle together
all the relevant events, as per patch 1 and 2. These fixes can be
backported up to v5.19 and v5.15.
Note that mptcp_disconnect() unconditionally clears the fallback status
(zeroing msk->flags) but don't touch the `allows_infinite_fallback`
flag. Such issue is addressed in patch 3, and can be backported up to
v5.17.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Paolo Abeni (3):
mptcp: make fallback action and fallback decision atomic
mptcp: plug races between subflow fail and subflow creation
mptcp: reset fallback status gracefully at disconnect() time
net/mptcp/options.c | 3 ++-
net/mptcp/pm.c | 8 +++++++-
net/mptcp/protocol.c | 56 ++++++++++++++++++++++++++++++++++++++++++++--------
net/mptcp/protocol.h | 29 ++++++++++++++++++++-------
net/mptcp/subflow.c | 30 +++++++++++++++++-----------
5 files changed, 98 insertions(+), 28 deletions(-)
---
base-commit: b640daa2822a39ff76e70200cb2b7b892b896dce
change-id: 20250714-net-mptcp-fallback-races-a99f171cf5ca
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
After a recent change in clang to expose uninitialized warnings from
const variables and pointers [1], there is a warning around crtc_state
in dpu_plane_virtual_atomic_check():
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:1145:6: error: variable 'crtc_state' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
1145 | if (plane_state->crtc)
| ^~~~~~~~~~~~~~~~~
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:1149:58: note: uninitialized use occurs here
1149 | ret = dpu_plane_atomic_check_nosspp(plane, plane_state, crtc_state);
| ^~~~~~~~~~
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:1145:2: note: remove the 'if' if its condition is always true
1145 | if (plane_state->crtc)
| ^~~~~~~~~~~~~~~~~~~~~~
1146 | crtc_state = drm_atomic_get_new_crtc_state(state,
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c:1139:35: note: initialize the variable 'crtc_state' to silence this warning
1139 | struct drm_crtc_state *crtc_state;
| ^
| = NULL
Initialize crtc_state to NULL like other places in the driver do, so
that it is consistently initialized.
Cc: stable(a)vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2106
Fixes: 774bcfb73176 ("drm/msm/dpu: add support for virtual planes")
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cd… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 421138bc3cb7..30ff21c01a36 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -1136,7 +1136,7 @@ static int dpu_plane_virtual_atomic_check(struct drm_plane *plane,
struct drm_plane_state *old_plane_state =
drm_atomic_get_old_plane_state(state, plane);
struct dpu_plane_state *pstate = to_dpu_plane_state(plane_state);
- struct drm_crtc_state *crtc_state;
+ struct drm_crtc_state *crtc_state = NULL;
int ret;
if (IS_ERR(plane_state))
---
base-commit: d3deabe4c619875714b9a844b1a3d9752dbae1dd
change-id: 20250715-drm-msm-fix-const-uninit-warning-2b93cef9f1c6
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
After a recent change in clang to expose uninitialized warnings from
const variables [1], there is a warning from the if statement in
advisor_mode_show().
mm/ksm.c:3687:11: error: variable 'output' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
3687 | else if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/ksm.c:3690:33: note: uninitialized use occurs here
3690 | return sysfs_emit(buf, "%s\n", output);
| ^~~~~~
Rewrite the if statement to implicitly make KSM_ADVISOR_NONE the else
branch so that it is obvious to the compiler that ksm_advisor can only
be KSM_ADVISOR_NONE or KSM_ADVISOR_SCAN_TIME due to the assignments in
advisor_mode_store().
Cc: stable(a)vger.kernel.org
Fixes: 66790e9a735b ("mm/ksm: add sysfs knobs for advisor")
Closes: https://github.com/ClangBuiltLinux/linux/issues/2100
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cd… [1]
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
mm/ksm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/ksm.c b/mm/ksm.c
index 2b0210d41c55..160787bb121c 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -3682,10 +3682,10 @@ static ssize_t advisor_mode_show(struct kobject *kobj,
{
const char *output;
- if (ksm_advisor == KSM_ADVISOR_NONE)
- output = "[none] scan-time";
- else if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
+ if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
output = "none [scan-time]";
+ else
+ output = "[none] scan-time";
return sysfs_emit(buf, "%s\n", output);
}
---
base-commit: fed48693bdfeca666f7536ba88a05e9a4e5523b6
change-id: 20250715-ksm-fix-clang-21-uninit-warning-bea5a6b886ce
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
The patch titled
Subject: mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-ksm-fix-wsometimes-uninitialized-from-clang-21-in-advisor_mode_show.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Nathan Chancellor <nathan(a)kernel.org>
Subject: mm/ksm: fix -Wsometimes-uninitialized from clang-21 in advisor_mode_show()
Date: Tue, 15 Jul 2025 12:56:16 -0700
After a recent change in clang to expose uninitialized warnings from const
variables [1], there is a false positive warning from the if statement in
advisor_mode_show().
mm/ksm.c:3687:11: error: variable 'output' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
3687 | else if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mm/ksm.c:3690:33: note: uninitialized use occurs here
3690 | return sysfs_emit(buf, "%s\n", output);
| ^~~~~~
Rewrite the if statement to implicitly make KSM_ADVISOR_NONE the else
branch so that it is obvious to the compiler that ksm_advisor can only be
KSM_ADVISOR_NONE or KSM_ADVISOR_SCAN_TIME due to the assignments in
advisor_mode_store().
Link: https://lkml.kernel.org/r/20250715-ksm-fix-clang-21-uninit-warning-v1-1-f44…
Fixes: 66790e9a735b ("mm/ksm: add sysfs knobs for advisor")
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/2100
Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cd… [1]
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Stefan Roesch <shr(a)devkernel.io>
Cc: xu xin <xu.xin16(a)zte.com.cn>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/ksm.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/ksm.c~mm-ksm-fix-wsometimes-uninitialized-from-clang-21-in-advisor_mode_show
+++ a/mm/ksm.c
@@ -3669,10 +3669,10 @@ static ssize_t advisor_mode_show(struct
{
const char *output;
- if (ksm_advisor == KSM_ADVISOR_NONE)
- output = "[none] scan-time";
- else if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
+ if (ksm_advisor == KSM_ADVISOR_SCAN_TIME)
output = "none [scan-time]";
+ else
+ output = "[none] scan-time";
return sysfs_emit(buf, "%s\n", output);
}
_
Patches currently in -mm which might be from nathan(a)kernel.org are
mm-ksm-fix-wsometimes-uninitialized-from-clang-21-in-advisor_mode_show.patch
panic-add-panic_sys_info-sysctl-to-take-human-readable-string-parameter-fix.patch
From: Ge Yang <yangge1116(a)126.com>
Since commit d228814b1913 ("efi/libstub: Add get_event_log() support
for CC platforms") reuses TPM2 support code for the CC platforms, when
launching a TDX virtual machine with coco measurement enabled, the
following error log is generated:
[Firmware Bug]: Failed to parse event in TPM Final Events Log
Call Trace:
efi_config_parse_tables()
efi_tpm_eventlog_init()
tpm2_calc_event_log_size()
__calc_tpm2_event_size()
The pcr_idx value in the Intel TDX log header is 1, causing the function
__calc_tpm2_event_size() to fail to recognize the log header, ultimately
leading to the "Failed to parse event in TPM Final Events Log" error.
Intel misread the spec and wrongly sets pcrIndex to 1 in the header and
since they did this, we fear others might, so we're relaxing the header
check. There's no danger of this causing problems because we check for
the TCG_SPECID_SIG signature as the next thing.
Fixes: d228814b1913 ("efi/libstub: Add get_event_log() support for CC platforms")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Cc: stable(a)vger.kernel.org
---
V6:
- improve commit message suggested by James
V5:
- remove the pcr_index check without adding any replacement checks suggested by James and Sathyanarayanan
V4:
- remove cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) suggested by Ard
V3:
- fix build error
V2:
- limit the fix for CC only suggested by Jarkko and Sathyanarayanan
include/linux/tpm_eventlog.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
index 891368e..05c0ae5 100644
--- a/include/linux/tpm_eventlog.h
+++ b/include/linux/tpm_eventlog.h
@@ -202,8 +202,7 @@ static __always_inline u32 __calc_tpm2_event_size(struct tcg_pcr_event2_head *ev
event_type = event->event_type;
/* Verify that it's the log header */
- if (event_header->pcr_idx != 0 ||
- event_header->event_type != NO_ACTION ||
+ if (event_header->event_type != NO_ACTION ||
memcmp(event_header->digest, zero_digest, sizeof(zero_digest))) {
size = 0;
goto out;
--
2.7.4
From: "Borislav Petkov (AMD)" <bp(a)alien8.de>
VERW_CLEAR is supposed to be set only by the hypervisor to denote TSA
mitigation support to a guest. SQ_NO and L1_NO are both synthesizable,
and are going to be set by hw CPUID on future machines.
So keep the kvm_cpu_cap_init_kvm_defined() invocation *and* set them
when synthesized.
This fix is stable-only.
Co-developed-by: Jinpu Wang <jinpu.wang(a)ionos.com>
Signed-off-by: Jinpu Wang <jinpu.wang(a)ionos.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
---
arch/x86/kvm/cpuid.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 02196db26a08..8f587c5bb6bc 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -822,6 +822,7 @@ void kvm_set_cpu_caps(void)
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
kvm_cpu_cap_check_and_set(X86_FEATURE_IBPB_BRTYPE);
kvm_cpu_cap_check_and_set(X86_FEATURE_SRSO_NO);
+ kvm_cpu_cap_check_and_set(X86_FEATURE_VERW_CLEAR);
kvm_cpu_cap_init_kvm_defined(CPUID_8000_0022_EAX,
F(PERFMON_V2)
@@ -831,6 +832,9 @@ void kvm_set_cpu_caps(void)
F(TSA_SQ_NO) | F(TSA_L1_NO)
);
+ kvm_cpu_cap_check_and_set(X86_FEATURE_TSA_SQ_NO);
+ kvm_cpu_cap_check_and_set(X86_FEATURE_TSA_L1_NO);
+
/*
* Synthesize "LFENCE is serializing" into the AMD-defined entry in
* KVM's supported CPUID if the feature is reported as supported by the
--
2.43.0
This reverts commit e8afa1557f4f963c9a511bd2c6074a941c308685.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Simona Vetter <simona.vetter(a)ffwll.ch>
Acked-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.15+
---
drivers/gpu/drm/drm_gem_dma_helper.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_gem_dma_helper.c b/drivers/gpu/drm/drm_gem_dma_helper.c
index b7f033d4352a..4f0320df858f 100644
--- a/drivers/gpu/drm/drm_gem_dma_helper.c
+++ b/drivers/gpu/drm/drm_gem_dma_helper.c
@@ -230,7 +230,7 @@ void drm_gem_dma_free(struct drm_gem_dma_object *dma_obj)
if (drm_gem_is_imported(gem_obj)) {
if (dma_obj->vaddr)
- dma_buf_vunmap_unlocked(gem_obj->dma_buf, &map);
+ dma_buf_vunmap_unlocked(gem_obj->import_attach->dmabuf, &map);
drm_prime_gem_destroy(gem_obj, dma_obj->sgt);
} else if (dma_obj->vaddr) {
if (dma_obj->map_noncoherent)
--
2.50.0
This reverts commit 1a148af06000e545e714fe3210af3d77ff903c11.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Simona Vetter <simona.vetter(a)ffwll.ch>
Acked-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.15+
---
drivers/gpu/drm/drm_gem_shmem_helper.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index 8ac0b1fa5287..5d1349c34afd 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -351,7 +351,7 @@ int drm_gem_shmem_vmap_locked(struct drm_gem_shmem_object *shmem,
dma_resv_assert_held(obj->resv);
if (drm_gem_is_imported(obj)) {
- ret = dma_buf_vmap(obj->dma_buf, map);
+ ret = dma_buf_vmap(obj->import_attach->dmabuf, map);
} else {
pgprot_t prot = PAGE_KERNEL;
@@ -413,7 +413,7 @@ void drm_gem_shmem_vunmap_locked(struct drm_gem_shmem_object *shmem,
dma_resv_assert_held(obj->resv);
if (drm_gem_is_imported(obj)) {
- dma_buf_vunmap(obj->dma_buf, map);
+ dma_buf_vunmap(obj->import_attach->dmabuf, map);
} else {
dma_resv_assert_held(shmem->base.resv);
--
2.50.0
This reverts commit cce16fcd7446dcff7480cd9d2b6417075ed81065.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Simona Vetter <simona.vetter(a)ffwll.ch>
Acked-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.15+
---
drivers/gpu/drm/drm_gem_framebuffer_helper.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
index 618ce725cd75..fefb2a0f6b40 100644
--- a/drivers/gpu/drm/drm_gem_framebuffer_helper.c
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -420,6 +420,7 @@ EXPORT_SYMBOL(drm_gem_fb_vunmap);
static void __drm_gem_fb_end_cpu_access(struct drm_framebuffer *fb, enum dma_data_direction dir,
unsigned int num_planes)
{
+ struct dma_buf_attachment *import_attach;
struct drm_gem_object *obj;
int ret;
@@ -428,9 +429,10 @@ static void __drm_gem_fb_end_cpu_access(struct drm_framebuffer *fb, enum dma_dat
obj = drm_gem_fb_get_obj(fb, num_planes);
if (!obj)
continue;
+ import_attach = obj->import_attach;
if (!drm_gem_is_imported(obj))
continue;
- ret = dma_buf_end_cpu_access(obj->dma_buf, dir);
+ ret = dma_buf_end_cpu_access(import_attach->dmabuf, dir);
if (ret)
drm_err(fb->dev, "dma_buf_end_cpu_access(%u, %d) failed: %d\n",
ret, num_planes, dir);
@@ -453,6 +455,7 @@ static void __drm_gem_fb_end_cpu_access(struct drm_framebuffer *fb, enum dma_dat
*/
int drm_gem_fb_begin_cpu_access(struct drm_framebuffer *fb, enum dma_data_direction dir)
{
+ struct dma_buf_attachment *import_attach;
struct drm_gem_object *obj;
unsigned int i;
int ret;
@@ -463,9 +466,10 @@ int drm_gem_fb_begin_cpu_access(struct drm_framebuffer *fb, enum dma_data_direct
ret = -EINVAL;
goto err___drm_gem_fb_end_cpu_access;
}
+ import_attach = obj->import_attach;
if (!drm_gem_is_imported(obj))
continue;
- ret = dma_buf_begin_cpu_access(obj->dma_buf, dir);
+ ret = dma_buf_begin_cpu_access(import_attach->dmabuf, dir);
if (ret)
goto err___drm_gem_fb_end_cpu_access;
}
--
2.50.0
This reverts commit f83a9b8c7fd0557b0c50784bfdc1bbe9140c9bf8.
The dma_buf field in struct drm_gem_object is not stable over the
object instance's lifetime. The field becomes NULL when user space
releases the final GEM handle on the buffer object. This resulted
in a NULL-pointer deref.
Workarounds in commit 5307dce878d4 ("drm/gem: Acquire references on
GEM handles for framebuffers") and commit f6bfc9afc751 ("drm/framebuffer:
Acquire internal references on GEM handles") only solved the problem
partially. They especially don't work for buffer objects without a DRM
framebuffer associated.
Hence, this revert to going back to using .import_attach->dmabuf.
v3:
- cc stable
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Simona Vetter <simona.vetter(a)ffwll.ch>
Acked-by: Christian König <christian.koenig(a)amd.com>
Cc: <stable(a)vger.kernel.org> # v6.15+
---
drivers/gpu/drm/drm_prime.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
index b703f83874e1..a23fc712a8b7 100644
--- a/drivers/gpu/drm/drm_prime.c
+++ b/drivers/gpu/drm/drm_prime.c
@@ -453,7 +453,13 @@ struct dma_buf *drm_gem_prime_handle_to_dmabuf(struct drm_device *dev,
}
mutex_lock(&dev->object_name_lock);
- /* re-export the original imported/exported object */
+ /* re-export the original imported object */
+ if (obj->import_attach) {
+ dmabuf = obj->import_attach->dmabuf;
+ get_dma_buf(dmabuf);
+ goto out_have_obj;
+ }
+
if (obj->dma_buf) {
get_dma_buf(obj->dma_buf);
dmabuf = obj->dma_buf;
--
2.50.0
The FRED specification v9.0 states that there is no need for FRED
event handlers to begin with ENDBR64, because in the presence of
supervisor indirect branch tracking, FRED event delivery does not
enter the WAIT_FOR_ENDBRANCH state.
As a result, remove ENDBR64 from FRED entry points.
Then add ANNOTATE_NOENDBR to indicate that FRED entry points will
never be used for indirect calls to suppress an objtool warning.
This change implies that any indirect CALL/JMP to FRED entry points
causes #CP in the presence of supervisor indirect branch tracking.
Credit goes to Jennifer Miller <jmill(a)asu.edu> and other contributors
from Arizona State University whose work led to this change.
Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
Link: https://lore.kernel.org/linux-hardening/Z60NwR4w%2F28Z7XUa@ubun/
Reviewed-by: H. Peter Anvin (Intel) <hpa(a)zytor.com>
Signed-off-by: Xin Li (Intel) <xin(a)zytor.com>
Cc: Jennifer Miller <jmill(a)asu.edu>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Andrew Cooper <andrew.cooper3(a)citrix.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: stable(a)vger.kernel.org # v6.9+
---
Change in v2:
*) CC stable and add a fixes tag (PeterZ).
---
arch/x86/entry/entry_64_fred.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index 29c5c32c16c3..907bd233c6c1 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -16,7 +16,7 @@
.macro FRED_ENTER
UNWIND_HINT_END_OF_STACK
- ENDBR
+ ANNOTATE_NOENDBR
PUSH_AND_CLEAR_REGS
movq %rsp, %rdi /* %rdi -> pt_regs */
.endm
--
2.50.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x fea18c686320a53fce7ad62a87a3e1d10ad02f31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071315-rural-exploring-3260@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fea18c686320a53fce7ad62a87a3e1d10ad02f31 Mon Sep 17 00:00:00 2001
From: Alexander Gordeev <agordeev(a)linux.ibm.com>
Date: Mon, 23 Jun 2025 09:57:21 +0200
Subject: [PATCH] mm/vmalloc: leave lazy MMU mode on PTE mapping error
vmap_pages_pte_range() enters the lazy MMU mode, but fails to leave it in
case an error is encountered.
Link: https://lkml.kernel.org/r/20250623075721.2817094-1-agordeev@linux.ibm.com
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Closes: https://lore.kernel.org/r/202506132017.T1l1l6ME-lkp@intel.com/
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ab986dd09b6a..6dbcdceecae1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -514,6 +514,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
pgtbl_mod_mask *mask)
{
+ int err = 0;
pte_t *pte;
/*
@@ -530,12 +531,18 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
do {
struct page *page = pages[*nr];
- if (WARN_ON(!pte_none(ptep_get(pte))))
- return -EBUSY;
- if (WARN_ON(!page))
- return -ENOMEM;
- if (WARN_ON(!pfn_valid(page_to_pfn(page))))
- return -EINVAL;
+ if (WARN_ON(!pte_none(ptep_get(pte)))) {
+ err = -EBUSY;
+ break;
+ }
+ if (WARN_ON(!page)) {
+ err = -ENOMEM;
+ break;
+ }
+ if (WARN_ON(!pfn_valid(page_to_pfn(page)))) {
+ err = -EINVAL;
+ break;
+ }
set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
(*nr)++;
@@ -543,7 +550,8 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
arch_leave_lazy_mmu_mode();
*mask |= PGTBL_PTE_MODIFIED;
- return 0;
+
+ return err;
}
static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x fea18c686320a53fce7ad62a87a3e1d10ad02f31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071316-jawed-backward-2063@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fea18c686320a53fce7ad62a87a3e1d10ad02f31 Mon Sep 17 00:00:00 2001
From: Alexander Gordeev <agordeev(a)linux.ibm.com>
Date: Mon, 23 Jun 2025 09:57:21 +0200
Subject: [PATCH] mm/vmalloc: leave lazy MMU mode on PTE mapping error
vmap_pages_pte_range() enters the lazy MMU mode, but fails to leave it in
case an error is encountered.
Link: https://lkml.kernel.org/r/20250623075721.2817094-1-agordeev@linux.ibm.com
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Closes: https://lore.kernel.org/r/202506132017.T1l1l6ME-lkp@intel.com/
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ab986dd09b6a..6dbcdceecae1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -514,6 +514,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
pgtbl_mod_mask *mask)
{
+ int err = 0;
pte_t *pte;
/*
@@ -530,12 +531,18 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
do {
struct page *page = pages[*nr];
- if (WARN_ON(!pte_none(ptep_get(pte))))
- return -EBUSY;
- if (WARN_ON(!page))
- return -ENOMEM;
- if (WARN_ON(!pfn_valid(page_to_pfn(page))))
- return -EINVAL;
+ if (WARN_ON(!pte_none(ptep_get(pte)))) {
+ err = -EBUSY;
+ break;
+ }
+ if (WARN_ON(!page)) {
+ err = -ENOMEM;
+ break;
+ }
+ if (WARN_ON(!pfn_valid(page_to_pfn(page)))) {
+ err = -EINVAL;
+ break;
+ }
set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
(*nr)++;
@@ -543,7 +550,8 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
arch_leave_lazy_mmu_mode();
*mask |= PGTBL_PTE_MODIFIED;
- return 0;
+
+ return err;
}
static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x fea18c686320a53fce7ad62a87a3e1d10ad02f31
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071323-settling-calculus-60b3@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fea18c686320a53fce7ad62a87a3e1d10ad02f31 Mon Sep 17 00:00:00 2001
From: Alexander Gordeev <agordeev(a)linux.ibm.com>
Date: Mon, 23 Jun 2025 09:57:21 +0200
Subject: [PATCH] mm/vmalloc: leave lazy MMU mode on PTE mapping error
vmap_pages_pte_range() enters the lazy MMU mode, but fails to leave it in
case an error is encountered.
Link: https://lkml.kernel.org/r/20250623075721.2817094-1-agordeev@linux.ibm.com
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Alexander Gordeev <agordeev(a)linux.ibm.com>
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)linaro.org>
Closes: https://lore.kernel.org/r/202506132017.T1l1l6ME-lkp@intel.com/
Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ab986dd09b6a..6dbcdceecae1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -514,6 +514,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
pgtbl_mod_mask *mask)
{
+ int err = 0;
pte_t *pte;
/*
@@ -530,12 +531,18 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
do {
struct page *page = pages[*nr];
- if (WARN_ON(!pte_none(ptep_get(pte))))
- return -EBUSY;
- if (WARN_ON(!page))
- return -ENOMEM;
- if (WARN_ON(!pfn_valid(page_to_pfn(page))))
- return -EINVAL;
+ if (WARN_ON(!pte_none(ptep_get(pte)))) {
+ err = -EBUSY;
+ break;
+ }
+ if (WARN_ON(!page)) {
+ err = -ENOMEM;
+ break;
+ }
+ if (WARN_ON(!pfn_valid(page_to_pfn(page)))) {
+ err = -EINVAL;
+ break;
+ }
set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
(*nr)++;
@@ -543,7 +550,8 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
arch_leave_lazy_mmu_mode();
*mask |= PGTBL_PTE_MODIFIED;
- return 0;
+
+ return err;
}
static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
Hi Sasha,
the changes in hci_request.c look quite different from what I submitted for mainline
(and also from the version for 6.6 LTS). Are you sure that everything went correctly?
regards,
Christian
On Monday, 14 July 2025, 19:37:07 CEST, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> Bluetooth: HCI: Set extended advertising data synchronously
>
> to the 5.10-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> bluetooth-hci-set-extended-advertising-data-synchron.patch
> and it can be found in the queue-5.10 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
>
>
> commit 8766e406adb9b3c4bd8c0506a71bbbf99a43906d
> Author: Christian Eggers <ceggers(a)arri.de>
> Date: Sat Jul 12 02:08:03 2025 -0400
>
> Bluetooth: HCI: Set extended advertising data synchronously
>
> [ Upstream commit 89fb8acc38852116d38d721ad394aad7f2871670 ]
>
> Currently, for controllers with extended advertising, the advertising
> data is set in the asynchronous response handler for extended
> adverstising params. As most advertising settings are performed in a
> synchronous context, the (asynchronous) setting of the advertising data
> is done too late (after enabling the advertising).
>
> Move setting of adverstising data from asynchronous response handler
> into synchronous context to fix ordering of HCI commands.
>
> Signed-off-by: Christian Eggers <ceggers(a)arri.de>
> Fixes: a0fb3726ba55 ("Bluetooth: Use Set ext adv/scan rsp data if controller supports")
> Cc: stable(a)vger.kernel.org
> v2: https://lore.kernel.org/linux-bluetooth/20250626115209.17839-1-ceggers@arri…
> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
> index 7f26c1aab9a06..2cc4aaba09abe 100644
> --- a/net/bluetooth/hci_event.c
> +++ b/net/bluetooth/hci_event.c
> @@ -1726,36 +1726,6 @@ static void hci_cc_set_adv_param(struct hci_dev *hdev, struct sk_buff *skb)
> hci_dev_unlock(hdev);
> }
>
> -static void hci_cc_set_ext_adv_param(struct hci_dev *hdev, struct sk_buff *skb)
> -{
> - struct hci_rp_le_set_ext_adv_params *rp = (void *) skb->data;
> - struct hci_cp_le_set_ext_adv_params *cp;
> - struct adv_info *adv_instance;
> -
> - BT_DBG("%s status 0x%2.2x", hdev->name, rp->status);
> -
> - if (rp->status)
> - return;
> -
> - cp = hci_sent_cmd_data(hdev, HCI_OP_LE_SET_EXT_ADV_PARAMS);
> - if (!cp)
> - return;
> -
> - hci_dev_lock(hdev);
> - hdev->adv_addr_type = cp->own_addr_type;
> - if (!hdev->cur_adv_instance) {
> - /* Store in hdev for instance 0 */
> - hdev->adv_tx_power = rp->tx_power;
> - } else {
> - adv_instance = hci_find_adv_instance(hdev,
> - hdev->cur_adv_instance);
> - if (adv_instance)
> - adv_instance->tx_power = rp->tx_power;
> - }
> - /* Update adv data as tx power is known now */
> - hci_req_update_adv_data(hdev, hdev->cur_adv_instance);
> - hci_dev_unlock(hdev);
> -}
>
> static void hci_cc_read_rssi(struct hci_dev *hdev, struct sk_buff *skb)
> {
> @@ -3601,9 +3571,6 @@ static void hci_cmd_complete_evt(struct hci_dev *hdev, struct sk_buff *skb,
> hci_cc_le_read_num_adv_sets(hdev, skb);
> break;
>
> - case HCI_OP_LE_SET_EXT_ADV_PARAMS:
> - hci_cc_set_ext_adv_param(hdev, skb);
> - break;
>
> case HCI_OP_LE_SET_EXT_ADV_ENABLE:
> hci_cc_le_set_ext_adv_enable(hdev, skb);
> diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
> index 7ce6db1ac558a..743ba58941f8b 100644
> --- a/net/bluetooth/hci_request.c
> +++ b/net/bluetooth/hci_request.c
> @@ -2179,6 +2179,9 @@ int __hci_req_setup_ext_adv_instance(struct hci_request *req, u8 instance)
>
> hci_req_add(req, HCI_OP_LE_SET_EXT_ADV_PARAMS, sizeof(cp), &cp);
>
> + /* Update adv data after setting ext adv params */
> + __hci_req_update_adv_data(req, instance);
> +
> if (own_addr_type == ADDR_LE_DEV_RANDOM &&
> bacmp(&random_addr, BDADDR_ANY)) {
> struct hci_cp_le_set_adv_set_rand_addr cp;
>
Hi,
in the last week, after updating to 6.6.92, we’ve encountered a number of VMs reporting temporarily hung tasks blocking the whole system for a few minutes. They unblock by themselves and have similar tracebacks.
The IO PSIs show 100% pressure for that time, but the underlying devices are still processing read and write IO (well within their capacity). I’ve eliminated the underlying storage (Ceph) as the source of problems as I couldn’t find any latency outliers or significant queuing during that time.
I’ve seen somewhat similar reports on 6.6.88 and 6.6.77, but those might have been different outliers.
I’m attaching 3 logs - my intuition and the data so far leads me to consider this might be a kernel bug. I haven’t found a way to reproduce this, yet.
Regards,
Christian
--
Christian Theune · ct(a)flyingcircus.io · +49 345 219401 0
Flying Circus Internet Operations GmbH · https://flyingcircus.io
Leipziger Str. 70/71 · 06108 Halle (Saale) · Deutschland
HR Stendal HRB 21169 · Geschäftsführer: Christian Theune, Christian Zagrodnick
vhost_vsock_alloc_skb() returns NULL for packets advertising a length
larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However,
this is only checked once the SKB has been allocated and, if the length
in the packet header is zero, the SKB may not be freed immediately.
Hoist the size check before the SKB allocation so that an iovec larger
than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected
outright. The subsequent check on the length field in the header can
then simply check that the allocated SKB is indeed large enough to hold
the packet.
Cc: <stable(a)vger.kernel.org>
Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff")
Signed-off-by: Will Deacon <will(a)kernel.org>
---
drivers/vhost/vsock.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 802153e23073..66a0f060770e 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -344,6 +344,9 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
len = iov_length(vq->iov, out);
+ if (len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + VIRTIO_VSOCK_SKB_HEADROOM)
+ return NULL;
+
/* len contains both payload and hdr */
skb = virtio_vsock_alloc_skb(len, GFP_KERNEL);
if (!skb)
@@ -367,8 +370,7 @@ vhost_vsock_alloc_skb(struct vhost_virtqueue *vq,
return skb;
/* The pkt is too big or the length in the header is invalid */
- if (payload_len > VIRTIO_VSOCK_MAX_PKT_BUF_SIZE ||
- payload_len + sizeof(*hdr) > len) {
+ if (payload_len + sizeof(*hdr) > len) {
kfree_skb(skb);
return NULL;
}
--
2.50.0.727.gbf7dc18ff4-goog
mptcp_connect.sh can be executed manually with "-m <MODE>" and "-C" to
make sure everything works as expected when using "mmap" and "sendfile"
modes instead of "poll", and with the MPTCP checksum support.
These modes should be validated, but they are not when the selftests are
executed via the kselftest helpers. It means that most CIs validating
these selftests, like NIPA for the net development trees and LKFT for
the stable ones, are not covering these modes.
To fix that, new test programs have been added, simply calling
mptcp_connect.sh with the right parameters.
The first patch can be backported up to v5.6, and the second one up to
v5.14.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Matthieu Baerts (NGI0) (2):
selftests: mptcp: connect: also cover alt modes
selftests: mptcp: connect: also cover checksum
tools/testing/selftests/net/mptcp/Makefile | 3 ++-
tools/testing/selftests/net/mptcp/mptcp_connect_checksum.sh | 4 ++++
tools/testing/selftests/net/mptcp/mptcp_connect_mmap.sh | 4 ++++
tools/testing/selftests/net/mptcp/mptcp_connect_sendfile.sh | 4 ++++
4 files changed, 14 insertions(+), 1 deletion(-)
---
base-commit: b640daa2822a39ff76e70200cb2b7b892b896dce
change-id: 20250714-net-mptcp-sft-connect-alt-c1aaf073ef4e
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
From: Georg Kohmann <geokohma(a)cisco.com>
[ Upstream commit 4a65dff81a04f874fa6915c7f069b4dc2c4010e4 ]
When a ICMPV6_PKT_TOOBIG report a next-hop MTU that is less than the IPv6
minimum link MTU, the estimated path MTU is reduced to the minimum link
MTU. This behaviour breaks TAHI IPv6 Core Conformance Test v6LC4.1.6:
Packet Too Big Less than IPv6 MTU.
Referring to RFC 8201 section 4: "If a node receives a Packet Too Big
message reporting a next-hop MTU that is less than the IPv6 minimum link
MTU, it must discard it. A node must not reduce its estimate of the Path
MTU below the IPv6 minimum link MTU on receipt of a Packet Too Big
message."
Drop the path MTU update if reported MTU is less than the minimum link MTU.
Signed-off-by: Georg Kohmann <geokohma(a)cisco.com>
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: WangYuli <wangyuli(a)uniontech.com>
---
net/ipv6/route.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b321cec71127..93bae134e730 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2765,7 +2765,8 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
if (confirm_neigh)
dst_confirm_neigh(dst, daddr);
- mtu = max_t(u32, mtu, IPV6_MIN_MTU);
+ if (mtu < IPV6_MIN_MTU)
+ return;
if (mtu >= dst_mtu(dst))
return;
--
2.50.0
From: Hans de Goede <hdegoede(a)redhat.com>
commit 9cf6e24c9fbf17e52de9fff07f12be7565ea6d61 upstream.
After commit 936e4d49ecbc ("Input: atkbd - skip ATKBD_CMD_GETID in
translated mode") not only the getid command is skipped, but also
the de-activating of the keyboard at the end of atkbd_probe(), potentially
re-introducing the problem fixed by commit be2d7e4233a4 ("Input: atkbd -
fix multi-byte scancode handling on reconnect").
Make sure multi-byte scancode handling on reconnect is still handled
correctly by not skipping the atkbd_deactivate() call.
Fixes: 936e4d49ecbc ("Input: atkbd - skip ATKBD_CMD_GETID in translated mode")
Tested-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/20240126160724.13278-3-hdegoede@redhat.com
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Wang Hai <wanghai38(a)huawei.com>
---
drivers/input/keyboard/atkbd.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/input/keyboard/atkbd.c b/drivers/input/keyboard/atkbd.c
index b3a856333d4e..de59fc1a24bc 100644
--- a/drivers/input/keyboard/atkbd.c
+++ b/drivers/input/keyboard/atkbd.c
@@ -805,11 +805,11 @@ static int atkbd_probe(struct atkbd *atkbd)
"keyboard reset failed on %s\n",
ps2dev->serio->phys);
if (atkbd_skip_getid(atkbd)) {
atkbd->id = 0xab83;
- return 0;
+ goto deactivate_kbd;
}
/*
* Then we check the keyboard ID. We should get 0xab83 under normal conditions.
* Some keyboards report different values, but the first byte is always 0xab or
@@ -842,10 +842,11 @@ static int atkbd_probe(struct atkbd *atkbd)
"NCD terminal keyboards are only supported on non-translating controllers. "
"Use i8042.direct=1 to disable translation.\n");
return -1;
}
+deactivate_kbd:
/*
* Make sure nothing is coming from the keyboard and disturbs our
* internal state.
*/
if (!atkbd_skip_deactivate)
--
2.17.1
Using ipu_bridge_get_ivsc_csi_dev() to locate the device could cause
an imbalance in the device's reference count.
ipu_bridge_get_ivsc_csi_dev() calls device_find_child_by_name() to
implement the localization, and device_find_child_by_name() calls an
implicit get_device() to increment the device's reference count before
returning the pointer. Throughout the entire implementation process,
no mechanism releases resources properly. This leads to a memory leak
because the reference count of the device is never decremented.
As the comment of device_find_child_by_name() says, 'NOTE: you will
need to drop the reference with put_device() after use'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: c66821f381ae ("media: pci: intel: Add IVSC support for IPU bridge driver")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/media/pci/intel/ipu-bridge.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/pci/intel/ipu-bridge.c b/drivers/media/pci/intel/ipu-bridge.c
index 83e682e1a4b7..f8b4672accab 100644
--- a/drivers/media/pci/intel/ipu-bridge.c
+++ b/drivers/media/pci/intel/ipu-bridge.c
@@ -192,6 +192,7 @@ static int ipu_bridge_check_ivsc_dev(struct ipu_sensor *sensor,
sensor->csi_dev = csi_dev;
sensor->ivsc_adev = adev;
+ put_device(csi_dev);
}
return 0;
--
2.25.1
Require a minimum GHCB version of 2 when starting SEV-SNP guests through
KVM_SEV_INIT2. When a VMM attempts to start an SEV-SNP guest with an
incompatible GHCB version (less than 2), reject the request early rather
than allowing the guest to start with an incorrect protocol version and
fail later.
Fixes: 4af663c2f64a ("KVM: SEV: Allow per-guest configuration of GHCB protocol version")
Cc: Thomas Lendacky <thomas.lendacky(a)amd.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: Michael Roth <michael.roth(a)amd.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Nikunj A Dadhania <nikunj(a)amd.com>
---
arch/x86/kvm/svm/sev.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index a12e78b67466..91d06fb91ba2 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -435,6 +435,9 @@ static int __sev_guest_init(struct kvm *kvm, struct kvm_sev_cmd *argp,
if (unlikely(sev->active))
return -EINVAL;
+ if (snp_active && data->ghcb_version && data->ghcb_version < 2)
+ return -EINVAL;
+
sev->active = true;
sev->es_active = es_active;
sev->vmsa_features = data->vmsa_features;
--
2.43.0
On Fri, Apr 11, 2025 at 10:14:44AM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello(a)amd.com>
>
> commit a5cfc9d65879c ("thunderbolt: Add wake on connect/disconnect
> on USB4 ports") introduced a sysfs file to control wake up policy
> for a given USB4 port that defaulted to disabled.
>
> However when testing commit 4bfeea6ec1c02 ("thunderbolt: Use wake
> on connect and disconnect over suspend") I found that it was working
> even without making changes to the power/wakeup file (which defaults
> to disabled). This is because of a logic error doing a bitwise or
> of the wake-on-connect flag with device_may_wakeup() which should
> have been a logical AND.
>
> Adjust the logic so that policy is only applied when wakeup is
> actually enabled.
>
> Fixes: a5cfc9d65879c ("thunderbolt: Add wake on connect/disconnect on USB4 ports")
> Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
Hi! There have been a couple of reports of a Thunderbolt regression in
recent stable kernels, and one reporter has now bisected it to this
change:
• https://bugzilla.kernel.org/show_bug.cgi?id=220284
• https://github.com/NixOS/nixpkgs/issues/420730
Both reporters are CCed, and say it starts working after the module is
reloaded.
Link: https://lore.kernel.org/r/bug-220284-208809@https.bugzilla.kernel.org%2F/
(for regzbot)
> ---
> drivers/thunderbolt/usb4.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c
> index e51d01671d8e7..3e96f1afd4268 100644
> --- a/drivers/thunderbolt/usb4.c
> +++ b/drivers/thunderbolt/usb4.c
> @@ -440,10 +440,10 @@ int usb4_switch_set_wake(struct tb_switch *sw, unsigned int flags)
> bool configured = val & PORT_CS_19_PC;
> usb4 = port->usb4;
>
> - if (((flags & TB_WAKE_ON_CONNECT) |
> + if (((flags & TB_WAKE_ON_CONNECT) &&
> device_may_wakeup(&usb4->dev)) && !configured)
> val |= PORT_CS_19_WOC;
> - if (((flags & TB_WAKE_ON_DISCONNECT) |
> + if (((flags & TB_WAKE_ON_DISCONNECT) &&
> device_may_wakeup(&usb4->dev)) && configured)
> val |= PORT_CS_19_WOD;
> if ((flags & TB_WAKE_ON_USB4) && configured)
> --
> 2.43.0
>
On 1970/1/1 8:00, wrote:
> 6.6-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From Hans de Goede hdegoede(a)redhat.com
>
> [ Upstream commit 936e4d49ecbc8c404790504386e1422b599dec39 ]
>
> There have been multiple reports of keyboard issues on recent laptop models
> which can be worked around by setting i8042.dumbkbd, with the downside
> being this breaks the capslock LED.
>
> It seems that these issues are caused by recent laptops getting confused by
> ATKBD_CMD_GETID. Rather then adding and endless growing list of quirks for
> this, just skip ATKBD_CMD_GETID alltogether on laptops in translated mode.
>
> The main goal of sending ATKBD_CMD_GETID is to skip binding to ps2
> micetouchpads and those are never used in translated mode.
>
> Examples of laptop models which benefit from skipping ATKBD_CMD_GETID
>
> HP Laptop 15s-fq2xxx, HP laptop 15s-fq4xxx and HP Laptop 15-dy2xxx
> models the kbd stops working for the first 2 - 5 minutes after boot
> (waiting for EC watchdog reset)
>
> On HP Spectre x360 13-aw2xxx atkbd fails to probe the keyboard
>
> At least 9 different Lenovo models have issues with ATKBD_CMD_GETID, see
> httpsgithub.comyescallopatkbd-nogetid
>
> This has been tested on
>
> 1. A MSI B550M PRO-VDH WIFI desktop, where the i8042 controller is not
> in translated mode when no keyboard is plugged in and with a ps2 kbd
> a AT Translated Set 2 keyboard devinputevent# node shows up
>
> 2. A Lenovo ThinkPad X1 Yoga gen 8 (always has a translated set 2 keyboard)
>
> Reported-by Shang Ye yesh25(a)mail2.sysu.edu.cn
> Closes httpslore.kernel.orglinux-input886D6167733841AE+20231017135318.11142-1-yesh25(a)mail2.sysu.edu.cn
> Closes httpsgithub.comyescallopatkbd-nogetid
> Reported-by gurevitch mail(a)gurevit.ch
> Closes httpslore.kernel.orglinux-input2iAJTwqZV6lQs26cTb38RNYqxvsink6SRmrZ5h0cBUSuf9NT0tZTsf9fEAbbto2maavHJEOP8GA1evlKa6xjKOsaskDhtJWxjcnrgPigzVo=(a)gurevit.ch
> Reported-by Egor Ignatov egori(a)altlinux.org
> Closes httpslore.kernel.orgall20210609073333.8425-1-egori(a)altlinux.org
> Reported-by Anton Zhilyaev anton(a)cpp.in
> Closes httpslore.kernel.orglinux-input20210201160336.16008-1-anton(a)cpp.in
> Closes httpsbugzilla.redhat.comshow_bug.cgiid=2086156
> Signed-off-by Hans de Goede hdegoede(a)redhat.com
> Link httpslore.kernel.orgr20231115174625.7462-1-hdegoede(a)redhat.com
> Signed-off-by Dmitry Torokhov dmitry.torokhov(a)gmail.com
> Signed-off-by Sasha Levin sashal(a)kernel.org
> ---
Hi, Hans
I noticed there's a subsequent bugfix [1] for this patch, but it hasn't
been merged into the stable-6.6 branch. Based on the bugfix description,
the issue should exist there as well. Would you like this patch to be
merged into the stable-6.6 branch?"
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?…
Currently, __mkroute_output overrules the MTU value configured for
broadcast routes.
This buggy behaviour can be reproduced with:
ip link set dev eth1 mtu 9000
ip route del broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2
ip route add broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2 mtu 1500
The maximum packet size should be 1500, but it is actually 8000:
ping -b 192.168.0.255 -s 8000
Fix __mkroute_output to allow MTU values to be configured for
for broadcast routes (to support a mixed-MTU local-area-network).
Signed-off-by: Oscar Maes <oscmaes92(a)gmail.com>
---
net/ipv4/route.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 64ba377cd..f639a2ae8 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2588,7 +2588,6 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
do_cache = true;
if (type == RTN_BROADCAST) {
flags |= RTCF_BROADCAST | RTCF_LOCAL;
- fi = NULL;
} else if (type == RTN_MULTICAST) {
flags |= RTCF_MULTICAST | RTCF_LOCAL;
if (!ip_check_mc_rcu(in_dev, fl4->daddr, fl4->saddr,
--
2.39.5
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 55fb3744552de..99f05057e4c90 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -281,6 +281,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 2a14d69b171f3..b96af42a1b041 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -271,6 +271,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 99e10eea37321..658b592e58e05 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -270,6 +270,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: Xiang Mei <xmei5(a)asu.edu>
[ Upstream commit dd831ac8221e691e9e918585b1003c7071df0379 ]
To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c)
when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return
value before using it, similar to the existing approach in sch_hfsc.c.
To avoid code duplication, the following changes are made:
1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static
inline function.
2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to
include/net/pkt_sched.h so that sch_qfq can reuse it.
3. Applied qdisc_peek_len in agg_dequeue to avoid crashing.
Signed-off-by: Xiang Mei <xmei5(a)asu.edu>
Reviewed-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and related code, here is my
assessment:
**YES** - This commit should be backported to stable kernel trees.
Here's my extensive explanation:
## Reason for Backporting
1. **Real Bug Fix**: This commit fixes a NULL pointer dereference bug in
`agg_dequeue()` function in `net/sched/sch_qfq.c`. The bug occurs
when `cl->qdisc->ops->peek(cl->qdisc)` returns NULL, but the code at
line 992 (before the fix) directly uses this potentially NULL value
in `qdisc_pkt_len()` without checking:
```c
else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc)))
```
2. **Crash Prevention**: This is not a theoretical issue - it causes an
actual kernel crash (NULL pointer dereference) that can affect system
stability. This meets the stable kernel criteria of fixing "a real
bug that bothers people" including "an oops, a hang, data
corruption."
3. **Similar Patterns in Other Schedulers**: The commit shows that this
pattern of NULL checking after peek operations is already implemented
in other packet schedulers:
- `sch_hfsc.c` already has the `qdisc_peek_len()` function with NULL
checking
- `sch_drr.c` checks for NULL after peek operations (lines 385-387)
- The similar commit #1 shows DRR had a similar issue fixed
4. **Minimal and Contained Fix**: The fix is:
- Small in size (well under 100 lines)
- Obviously correct - it adds proper NULL checking
- Moves existing code to be reusable
- Makes the code more consistent across schedulers
5. **Precedent from Similar Commits**: Looking at the historical
commits:
- Similar commit #2 (sch_codel NULL check) was backported (Status:
YES)
- Similar commit #3 (multiple schedulers NULL handling) was
backported (Status: YES)
- Both dealt with NULL pointer handling in packet scheduler dequeue
paths
6. **Code Consolidation**: The fix properly consolidates the NULL
checking logic:
- Converts `qdisc_warn_nonwc` from a regular function to static
inline (reducing overhead)
- Moves `qdisc_peek_len` from sch_hfsc.c to the common header so it
can be reused
- Uses the same pattern across multiple schedulers for consistency
7. **Tested Pattern**: The `qdisc_peek_len()` function being moved has
been in use in sch_hfsc.c, proving it's a tested and working
solution.
8. **Security Consideration**: While not explicitly a security
vulnerability, NULL pointer dereferences can potentially be exploited
for denial of service attacks, making this fix important for system
stability.
The commit follows all the stable kernel rules: it fixes a real bug
(NULL pointer dereference), is obviously correct (adds NULL check), is
small and contained, and improves consistency across the codebase. The
pattern of backporting similar NULL check fixes in packet schedulers (as
seen in similar commits #2 and #3) supports backporting this fix as
well.
include/net/pkt_sched.h | 25 ++++++++++++++++++++++++-
net/sched/sch_api.c | 10 ----------
net/sched/sch_hfsc.c | 16 ----------------
net/sched/sch_qfq.c | 2 +-
4 files changed, 25 insertions(+), 28 deletions(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index f99a513b40a92..76f9fb651faff 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -113,7 +113,6 @@ struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
struct netlink_ext_ack *extack);
void qdisc_put_rtab(struct qdisc_rate_table *tab);
void qdisc_put_stab(struct qdisc_size_table *tab);
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc);
bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
struct net_device *dev, struct netdev_queue *txq,
spinlock_t *root_lock, bool validate);
@@ -247,4 +246,28 @@ static inline bool tc_qdisc_stats_dump(struct Qdisc *sch,
return true;
}
+static inline void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
+{
+ if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
+ pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
+ txt, qdisc->ops->id, qdisc->handle >> 16);
+ qdisc->flags |= TCQ_F_WARN_NONWC;
+ }
+}
+
+static inline unsigned int qdisc_peek_len(struct Qdisc *sch)
+{
+ struct sk_buff *skb;
+ unsigned int len;
+
+ skb = sch->ops->peek(sch);
+ if (unlikely(skb == NULL)) {
+ qdisc_warn_nonwc("qdisc_peek_len", sch);
+ return 0;
+ }
+ len = qdisc_pkt_len(skb);
+
+ return len;
+}
+
#endif
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 7c5df62421bbd..6a2d258b01daa 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -593,16 +593,6 @@ void __qdisc_calculate_pkt_len(struct sk_buff *skb,
qdisc_skb_cb(skb)->pkt_len = pkt_len;
}
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
-{
- if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
- pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
- txt, qdisc->ops->id, qdisc->handle >> 16);
- qdisc->flags |= TCQ_F_WARN_NONWC;
- }
-}
-EXPORT_SYMBOL(qdisc_warn_nonwc);
-
static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
{
struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 61b91de8065f0..302413e0aceff 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -836,22 +836,6 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
}
}
-static unsigned int
-qdisc_peek_len(struct Qdisc *sch)
-{
- struct sk_buff *skb;
- unsigned int len;
-
- skb = sch->ops->peek(sch);
- if (unlikely(skb == NULL)) {
- qdisc_warn_nonwc("qdisc_peek_len", sch);
- return 0;
- }
- len = qdisc_pkt_len(skb);
-
- return len;
-}
-
static void
hfsc_adjust_levels(struct hfsc_class *cl)
{
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 6462468bf77c7..cde6c7026ab1d 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -991,7 +991,7 @@ static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg,
if (cl->qdisc->q.qlen == 0) /* no more packets, remove from list */
list_del_init(&cl->alist);
- else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc))) {
+ else if (cl->deficit < qdisc_peek_len(cl->qdisc)) {
cl->deficit += agg->lmax;
list_move_tail(&cl->alist, &agg->active);
}
--
2.39.5
From: Xiang Mei <xmei5(a)asu.edu>
[ Upstream commit dd831ac8221e691e9e918585b1003c7071df0379 ]
To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c)
when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return
value before using it, similar to the existing approach in sch_hfsc.c.
To avoid code duplication, the following changes are made:
1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static
inline function.
2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to
include/net/pkt_sched.h so that sch_qfq can reuse it.
3. Applied qdisc_peek_len in agg_dequeue to avoid crashing.
Signed-off-by: Xiang Mei <xmei5(a)asu.edu>
Reviewed-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and related code, here is my
assessment:
**YES** - This commit should be backported to stable kernel trees.
Here's my extensive explanation:
## Reason for Backporting
1. **Real Bug Fix**: This commit fixes a NULL pointer dereference bug in
`agg_dequeue()` function in `net/sched/sch_qfq.c`. The bug occurs
when `cl->qdisc->ops->peek(cl->qdisc)` returns NULL, but the code at
line 992 (before the fix) directly uses this potentially NULL value
in `qdisc_pkt_len()` without checking:
```c
else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc)))
```
2. **Crash Prevention**: This is not a theoretical issue - it causes an
actual kernel crash (NULL pointer dereference) that can affect system
stability. This meets the stable kernel criteria of fixing "a real
bug that bothers people" including "an oops, a hang, data
corruption."
3. **Similar Patterns in Other Schedulers**: The commit shows that this
pattern of NULL checking after peek operations is already implemented
in other packet schedulers:
- `sch_hfsc.c` already has the `qdisc_peek_len()` function with NULL
checking
- `sch_drr.c` checks for NULL after peek operations (lines 385-387)
- The similar commit #1 shows DRR had a similar issue fixed
4. **Minimal and Contained Fix**: The fix is:
- Small in size (well under 100 lines)
- Obviously correct - it adds proper NULL checking
- Moves existing code to be reusable
- Makes the code more consistent across schedulers
5. **Precedent from Similar Commits**: Looking at the historical
commits:
- Similar commit #2 (sch_codel NULL check) was backported (Status:
YES)
- Similar commit #3 (multiple schedulers NULL handling) was
backported (Status: YES)
- Both dealt with NULL pointer handling in packet scheduler dequeue
paths
6. **Code Consolidation**: The fix properly consolidates the NULL
checking logic:
- Converts `qdisc_warn_nonwc` from a regular function to static
inline (reducing overhead)
- Moves `qdisc_peek_len` from sch_hfsc.c to the common header so it
can be reused
- Uses the same pattern across multiple schedulers for consistency
7. **Tested Pattern**: The `qdisc_peek_len()` function being moved has
been in use in sch_hfsc.c, proving it's a tested and working
solution.
8. **Security Consideration**: While not explicitly a security
vulnerability, NULL pointer dereferences can potentially be exploited
for denial of service attacks, making this fix important for system
stability.
The commit follows all the stable kernel rules: it fixes a real bug
(NULL pointer dereference), is obviously correct (adds NULL check), is
small and contained, and improves consistency across the codebase. The
pattern of backporting similar NULL check fixes in packet schedulers (as
seen in similar commits #2 and #3) supports backporting this fix as
well.
include/net/pkt_sched.h | 25 ++++++++++++++++++++++++-
net/sched/sch_api.c | 10 ----------
net/sched/sch_hfsc.c | 16 ----------------
net/sched/sch_qfq.c | 2 +-
4 files changed, 25 insertions(+), 28 deletions(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 15960564e0c36..4d72d24b1f33e 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -112,7 +112,6 @@ struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
struct netlink_ext_ack *extack);
void qdisc_put_rtab(struct qdisc_rate_table *tab);
void qdisc_put_stab(struct qdisc_size_table *tab);
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc);
bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
struct net_device *dev, struct netdev_queue *txq,
spinlock_t *root_lock, bool validate);
@@ -306,4 +305,28 @@ static inline bool tc_qdisc_stats_dump(struct Qdisc *sch,
return true;
}
+static inline void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
+{
+ if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
+ pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
+ txt, qdisc->ops->id, qdisc->handle >> 16);
+ qdisc->flags |= TCQ_F_WARN_NONWC;
+ }
+}
+
+static inline unsigned int qdisc_peek_len(struct Qdisc *sch)
+{
+ struct sk_buff *skb;
+ unsigned int len;
+
+ skb = sch->ops->peek(sch);
+ if (unlikely(skb == NULL)) {
+ qdisc_warn_nonwc("qdisc_peek_len", sch);
+ return 0;
+ }
+ len = qdisc_pkt_len(skb);
+
+ return len;
+}
+
#endif
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 282423106f15d..ae63b3199b1d1 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -594,16 +594,6 @@ void __qdisc_calculate_pkt_len(struct sk_buff *skb,
qdisc_skb_cb(skb)->pkt_len = pkt_len;
}
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
-{
- if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
- pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
- txt, qdisc->ops->id, qdisc->handle >> 16);
- qdisc->flags |= TCQ_F_WARN_NONWC;
- }
-}
-EXPORT_SYMBOL(qdisc_warn_nonwc);
-
static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
{
struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index afcb83d469ff6..751b1e2c35b3f 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -835,22 +835,6 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
}
}
-static unsigned int
-qdisc_peek_len(struct Qdisc *sch)
-{
- struct sk_buff *skb;
- unsigned int len;
-
- skb = sch->ops->peek(sch);
- if (unlikely(skb == NULL)) {
- qdisc_warn_nonwc("qdisc_peek_len", sch);
- return 0;
- }
- len = qdisc_pkt_len(skb);
-
- return len;
-}
-
static void
hfsc_adjust_levels(struct hfsc_class *cl)
{
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 5e557b960bde3..e0cefa21ce21c 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -992,7 +992,7 @@ static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg,
if (cl->qdisc->q.qlen == 0) /* no more packets, remove from list */
list_del_init(&cl->alist);
- else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc))) {
+ else if (cl->deficit < qdisc_peek_len(cl->qdisc)) {
cl->deficit += agg->lmax;
list_move_tail(&cl->alist, &agg->active);
}
--
2.39.5
From: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
[ Upstream commit d7a54d02db41f72f0581a3c77c75b0993ed3f6e2 ]
This is currently not initialized for a virtual monitor, leading to a
NULL pointer dereference when - for example - iterating over all the
keys of all the vifs.
Reviewed-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Link: https://patch.msgid.link/20250709233400.8dcefe578497.I4c90a00ae3256520e0631…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code changes, here is my
assessment:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Critical NULL Pointer Dereference Fix**: The commit fixes a NULL
pointer dereference that occurs when iterating over the key_list of
virtual monitor interfaces. This is a crash-inducing bug that affects
system stability.
2. **Clear Bug with Simple Fix**: The issue is straightforward - the
`key_list` was not initialized for virtual monitor interfaces created
via `ieee80211_add_virtual_monitor()`. The fix is minimal and
contained - it simply moves the `INIT_LIST_HEAD(&sdata->key_list)`
initialization from `ieee80211_if_add()` into
`ieee80211_sdata_init()`, ensuring all sdata structures have their
key_list properly initialized.
3. **Real-World Impact**: The bug can be triggered when any code
iterates over all interfaces and their keys. Looking at the code,
functions like `ieee80211_iter_keys()` and
`ieee80211_iter_keys_rcu()` iterate through all interfaces when
called without a specific vif parameter:
```c
list_for_each_entry(sdata, &local->interfaces, list)
list_for_each_entry_safe(key, tmp, &sdata->key_list, list)
```
This would cause a NULL pointer dereference when it encounters a
virtual monitor interface.
4. **Minimal Risk**: The change is extremely low risk - it only adds
initialization of a list head that should have been initialized all
along. There are no architectural changes or feature additions.
5. **Follows Stable Rules**: This perfectly fits the stable kernel
criteria:
- Fixes a real bug (NULL pointer dereference/crash)
- Small and contained change (2 lines moved)
- Obviously correct fix
- No new features or behaviors introduced
The commit is similar in nature to commit #5 in the reference list which
was marked as suitable for backporting - both fix NULL pointer
dereferences in the wifi/mac80211 subsystem with minimal, targeted
changes.
net/mac80211/iface.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 209d6ffa8e426..adfdc14bd91ac 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1121,6 +1121,8 @@ static void ieee80211_sdata_init(struct ieee80211_local *local,
{
sdata->local = local;
+ INIT_LIST_HEAD(&sdata->key_list);
+
/*
* Initialize the default link, so we can use link_id 0 for non-MLD,
* and that continues to work for non-MLD-aware drivers that use just
@@ -2162,8 +2164,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,
ieee80211_init_frag_cache(&sdata->frags);
- INIT_LIST_HEAD(&sdata->key_list);
-
wiphy_delayed_work_init(&sdata->dec_tailroom_needed_wk,
ieee80211_delayed_tailroom_dec);
--
2.39.5
From: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
[ Upstream commit d7a54d02db41f72f0581a3c77c75b0993ed3f6e2 ]
This is currently not initialized for a virtual monitor, leading to a
NULL pointer dereference when - for example - iterating over all the
keys of all the vifs.
Reviewed-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Link: https://patch.msgid.link/20250709233400.8dcefe578497.I4c90a00ae3256520e0631…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code changes, here is my
assessment:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Critical NULL Pointer Dereference Fix**: The commit fixes a NULL
pointer dereference that occurs when iterating over the key_list of
virtual monitor interfaces. This is a crash-inducing bug that affects
system stability.
2. **Clear Bug with Simple Fix**: The issue is straightforward - the
`key_list` was not initialized for virtual monitor interfaces created
via `ieee80211_add_virtual_monitor()`. The fix is minimal and
contained - it simply moves the `INIT_LIST_HEAD(&sdata->key_list)`
initialization from `ieee80211_if_add()` into
`ieee80211_sdata_init()`, ensuring all sdata structures have their
key_list properly initialized.
3. **Real-World Impact**: The bug can be triggered when any code
iterates over all interfaces and their keys. Looking at the code,
functions like `ieee80211_iter_keys()` and
`ieee80211_iter_keys_rcu()` iterate through all interfaces when
called without a specific vif parameter:
```c
list_for_each_entry(sdata, &local->interfaces, list)
list_for_each_entry_safe(key, tmp, &sdata->key_list, list)
```
This would cause a NULL pointer dereference when it encounters a
virtual monitor interface.
4. **Minimal Risk**: The change is extremely low risk - it only adds
initialization of a list head that should have been initialized all
along. There are no architectural changes or feature additions.
5. **Follows Stable Rules**: This perfectly fits the stable kernel
criteria:
- Fixes a real bug (NULL pointer dereference/crash)
- Small and contained change (2 lines moved)
- Obviously correct fix
- No new features or behaviors introduced
The commit is similar in nature to commit #5 in the reference list which
was marked as suitable for backporting - both fix NULL pointer
dereferences in the wifi/mac80211 subsystem with minimal, targeted
changes.
net/mac80211/iface.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 7d93e5aa595b2..0485a78eda366 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1117,6 +1117,8 @@ static void ieee80211_sdata_init(struct ieee80211_local *local,
{
sdata->local = local;
+ INIT_LIST_HEAD(&sdata->key_list);
+
/*
* Initialize the default link, so we can use link_id 0 for non-MLD,
* and that continues to work for non-MLD-aware drivers that use just
@@ -2177,8 +2179,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,
ieee80211_init_frag_cache(&sdata->frags);
- INIT_LIST_HEAD(&sdata->key_list);
-
wiphy_delayed_work_init(&sdata->dec_tailroom_needed_wk,
ieee80211_delayed_tailroom_dec);
--
2.39.5
Commit e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within
OVERLAY for DCE") accidentally broke the binutils version restriction
that was added in commit 0d437918fb64 ("ARM: 9414/1: Fix build issue
with LD_DEAD_CODE_DATA_ELIMINATION"), reintroducing the segmentation
fault addressed by that workaround.
Restore the binutils version dependency by using
CONFIG_LD_CAN_USE_KEEP_IN_OVERLAY as an additional condition to ensure
that CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION is only enabled with
binutils >= 2.36 and ld.lld >= 21.0.0.
Cc: stable(a)vger.kernel.org
Fixes: e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE")
Reported-by: Rob Landley <rob(a)landley.net>
Closes: https://lore.kernel.org/6739da7d-e555-407a-b5cb-e5681da71056@landley.net/
Tested-by: Rob Landley <rob(a)landley.net>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/arm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 3072731fe09c..962451e54fdd 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -121,7 +121,7 @@ config ARM
select HAVE_KERNEL_XZ
select HAVE_KPROBES if !XIP_KERNEL && !CPU_ENDIAN_BE32 && !CPU_V7M
select HAVE_KRETPROBES if HAVE_KPROBES
- select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if (LD_VERSION >= 23600 || LD_CAN_USE_KEEP_IN_OVERLAY)
+ select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if (LD_VERSION >= 23600 || LD_IS_LLD) && LD_CAN_USE_KEEP_IN_OVERLAY
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPTPROBES if !THUMB2_KERNEL
---
base-commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a
change-id: 20250707-arm-fix-dce-older-binutils-87a5a4b829d9
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hello,
with kernel v6.16-rc5-121-gbc9ff192a6c9 I see this:
cat /sys/devices/system/cpu/vulnerabilities/tsa
Mitigation: Clear CPU buffers
dmesg | grep micro
[ 1.479203] microcode: Current revision: 0x0a20102e
[ 1.479206] microcode: Updated early from: 0x0a201016
So, this works.
but same machine with 6.6.97:
dmesg | grep micro
[ 0.451496] Transient Scheduler Attacks: Vulnerable: Clear CPU buffers
attempted, no microcode
[ 1.077149] microcode: Current revision: 0x0a20102e
[ 1.077152] microcode: Updated early from: 0x0a201016
so:
cat /sys/devices/system/cpu/vulnerabilities/tsa
Vulnerable: Clear CPU buffers attempted, no microcode
but it is switched on:
zcat /proc/config.gz | grep TSA
CONFIG_MITIGATION_TSA=y
And other stuff which need microcode works:
cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
without microcode you wwould see:
Vulnerable: Safe RET, no microcode
6.12.37 broken too
6.15.6 works
v6.16-rc5-121-gbc9ff192a6c9 works
This is a:
processor : 11
vendor_id : AuthenticAMD
cpu family : 25
model : 33
model name : AMD Ryzen 5 5600X 6-Core Processor
stepping : 0
microcode : 0xa20102e
Is something missing in 6.6.y and 6.12.y?
Thomas
(already submitted @ Fedora; advised to post here to upstream as well)
-------- Forwarded Message --------
Subject: regression in 6.15.5: KVM guest launch FAILSs with missing CPU feature error (sbpb, ibpb-brtype)
Date: Sun, 13 Jul 2025 17:37:57 -0400
From: pgnd <pgnd(a)dev-mail.net>
Reply-To: pgnd(a)dev-mail.net
To: kernel(a)lists.fedoraproject.org
i'm seeing a regression on Fedora 42 kernel 6.15.4 -> 6.15.5 on AMD Ryzen 5 5600G host (x86_64)
kvm guests that launch ok under kernels 6.15.[3,4] fail with the following error when attempting to autostart under 6.15.5:
internal error: Failed to autostart VM: operation failed: guest CPU doesn't match specification: missing features: sbpb,ibpb-brtype
no changes made to libvirt, qemu, or VM defs between kernel versions.
re-booting to old kernel versions restores expected behavior.
maybe related (?) to recent changes in AMD CPU feature exposure / mitigation handling in kernel 6.15.5?
i've opened a bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2379784
I'm announcing the release of the 6.12.38 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.12.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.12.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.12.38
I'm announcing the release of the 6.6.98 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.6.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.6.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.6.98
I'm announcing the release of the 6.1.145 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.1.145
I'm announcing the release of the 5.15.188 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 5.15.188
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071423-imbecile-slander-d8e9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071422-target-professor-6f97@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071421-molar-theme-43d7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071420-occupy-vowel-a8a5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
After commit 725f31f300e3 ("thermal/of: support thermal zones w/o trips
subnode") was backported on 6.6 stable branch as commit d3304dbc2d5f
("thermal/of: support thermal zones w/o trips subnode"), thermal zones
w/o trips subnode still fail to register since `mask` argument is not
set correctly. When number of trips subnode is 0, `mask` must be 0 to
pass the check in `thermal_zone_device_register_with_trips()`.
Set `mask` to 0 when there's no trips subnode.
Signed-off-by: Hsin-Te Yuan <yuanhsinte(a)chromium.org>
---
drivers/thermal/thermal_of.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 0f520cf923a1e684411a3077ad283551395eec11..97aeb869abf5179dfa512dd744725121ec7fd0d9 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -514,7 +514,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
of_ops->bind = thermal_of_bind;
of_ops->unbind = thermal_of_unbind;
- mask = GENMASK_ULL((ntrips) - 1, 0);
+ mask = ntrips ? GENMASK_ULL((ntrips) - 1, 0) : 0;
tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
mask, data, of_ops, &tzp,
---
base-commit: a5df3a702b2cba82bcfb066fa9f5e4a349c33924
change-id: 20250707-trip-point-73dae9fd9c74
Best regards,
--
Hsin-Te Yuan <yuanhsinte(a)chromium.org>
The grp->bb_largest_free_order is updated regardless of whether
mb_optimize_scan is enabled. This can lead to inconsistencies between
grp->bb_largest_free_order and the actual s_mb_largest_free_orders list
index when mb_optimize_scan is repeatedly enabled and disabled via remount.
For example, if mb_optimize_scan is initially enabled, largest free
order is 3, and the group is in s_mb_largest_free_orders[3]. Then,
mb_optimize_scan is disabled via remount, block allocations occur,
updating largest free order to 2. Finally, mb_optimize_scan is re-enabled
via remount, more block allocations update largest free order to 1.
At this point, the group would be removed from s_mb_largest_free_orders[3]
under the protection of s_mb_largest_free_orders_locks[2]. This lock
mismatch can lead to list corruption.
To fix this, whenever grp->bb_largest_free_order changes, we now always
attempt to remove the group from its old order list. However, we only
insert the group into the new order list if `mb_optimize_scan` is enabled.
This approach helps prevent lock inconsistencies and ensures the data in
the order lists remains reliable.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/mballoc.c | 33 ++++++++++++++-------------------
1 file changed, 14 insertions(+), 19 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 72b20fc52bbf..fada0d1b3fdb 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1152,33 +1152,28 @@ static void
mb_set_largest_free_order(struct super_block *sb, struct ext4_group_info *grp)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
- int i;
+ int new, old = grp->bb_largest_free_order;
- for (i = MB_NUM_ORDERS(sb) - 1; i >= 0; i--)
- if (grp->bb_counters[i] > 0)
+ for (new = MB_NUM_ORDERS(sb) - 1; new >= 0; new--)
+ if (grp->bb_counters[new] > 0)
break;
+
/* No need to move between order lists? */
- if (!test_opt2(sb, MB_OPTIMIZE_SCAN) ||
- i == grp->bb_largest_free_order) {
- grp->bb_largest_free_order = i;
+ if (new == old)
return;
- }
- if (grp->bb_largest_free_order >= 0) {
- write_lock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ if (old >= 0 && !list_empty(&grp->bb_largest_free_order_node)) {
+ write_lock(&sbi->s_mb_largest_free_orders_locks[old]);
list_del_init(&grp->bb_largest_free_order_node);
- write_unlock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ write_unlock(&sbi->s_mb_largest_free_orders_locks[old]);
}
- grp->bb_largest_free_order = i;
- if (grp->bb_largest_free_order >= 0 && grp->bb_free) {
- write_lock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+
+ grp->bb_largest_free_order = new;
+ if (test_opt2(sb, MB_OPTIMIZE_SCAN) && new >= 0 && grp->bb_free) {
+ write_lock(&sbi->s_mb_largest_free_orders_locks[new]);
list_add_tail(&grp->bb_largest_free_order_node,
- &sbi->s_mb_largest_free_orders[grp->bb_largest_free_order]);
- write_unlock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ &sbi->s_mb_largest_free_orders[new]);
+ write_unlock(&sbi->s_mb_largest_free_orders_locks[new]);
}
}
--
2.46.1
Groups with no free blocks shouldn't be in any average fragment size list.
However, when all blocks in a group are allocated(i.e., bb_fragments or
bb_free is 0), we currently skip updating the average fragment size, which
means the group isn't removed from its previous s_mb_avg_fragment_size[old]
list.
This created "zombie" groups that were always skipped during traversal as
they couldn't satisfy any block allocation requests, negatively impacting
traversal efficiency.
Therefore, when a group becomes completely full, bb_avg_fragment_size_order
is now set to -1. If the old order was not -1, a removal operation is
performed; if the new order is not -1, an insertion is performed.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
---
fs/ext4/mballoc.c | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 6d98f2a5afc4..72b20fc52bbf 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -841,30 +841,30 @@ static void
mb_update_avg_fragment_size(struct super_block *sb, struct ext4_group_info *grp)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
- int new_order;
+ int new, old;
- if (!test_opt2(sb, MB_OPTIMIZE_SCAN) || grp->bb_fragments == 0)
+ if (!test_opt2(sb, MB_OPTIMIZE_SCAN))
return;
- new_order = mb_avg_fragment_size_order(sb,
- grp->bb_free / grp->bb_fragments);
- if (new_order == grp->bb_avg_fragment_size_order)
+ old = grp->bb_avg_fragment_size_order;
+ new = grp->bb_fragments == 0 ? -1 :
+ mb_avg_fragment_size_order(sb, grp->bb_free / grp->bb_fragments);
+ if (new == old)
return;
- if (grp->bb_avg_fragment_size_order != -1) {
- write_lock(&sbi->s_mb_avg_fragment_size_locks[
- grp->bb_avg_fragment_size_order]);
+ if (old >= 0) {
+ write_lock(&sbi->s_mb_avg_fragment_size_locks[old]);
list_del(&grp->bb_avg_fragment_size_node);
- write_unlock(&sbi->s_mb_avg_fragment_size_locks[
- grp->bb_avg_fragment_size_order]);
- }
- grp->bb_avg_fragment_size_order = new_order;
- write_lock(&sbi->s_mb_avg_fragment_size_locks[
- grp->bb_avg_fragment_size_order]);
- list_add_tail(&grp->bb_avg_fragment_size_node,
- &sbi->s_mb_avg_fragment_size[grp->bb_avg_fragment_size_order]);
- write_unlock(&sbi->s_mb_avg_fragment_size_locks[
- grp->bb_avg_fragment_size_order]);
+ write_unlock(&sbi->s_mb_avg_fragment_size_locks[old]);
+ }
+
+ grp->bb_avg_fragment_size_order = new;
+ if (new >= 0) {
+ write_lock(&sbi->s_mb_avg_fragment_size_locks[new]);
+ list_add_tail(&grp->bb_avg_fragment_size_node,
+ &sbi->s_mb_avg_fragment_size[new]);
+ write_unlock(&sbi->s_mb_avg_fragment_size_locks[new]);
+ }
}
/*
--
2.46.1
Bit 7 of the 'Device Type 2' (0Bh) register is reserved in the FSA9480
device, but is used by the FSA880 and TSU6111 devices.
From FSA9480 datasheet, Table 18. Device Type 2:
Reset Value: x0000000
===========================================================================
Bit # | Name | Size (Bits) | Description
---------------------------------------------------------------------------
7 | Reserved | 1 | NA
From FSA880 datasheet, Table 13. Device Type 2:
Reset Value: 0xxx0000
===========================================================================
Bit # | Name | Size (Bits) | Description
---------------------------------------------------------------------------
7 | Unknown | 1 | 1: Any accessory detected as unknown
| Accessory | | or an accessory that cannot be
| | | detected as being valid even
| | | though ID_CON is not floating
| | | 0: Unknown accessory not detected
From TSU6111 datasheet, Device Type 2:
Reset Value:x0000000
===========================================================================
Bit # | Name | Size (Bits) | Description
---------------------------------------------------------------------------
7 | Audio Type 3 | 1 | Audio device type 3
So the value obtained from the FSA9480_REG_DEV_T2 register in the
fsa9480_detect_dev() function may have the 7th bit set.
In this case, the 'dev' parameter in the fsa9480_handle_change() function
will be 15. And this will cause the 'cable_types' array to overflow when
accessed at this index.
Extend the 'cable_types' array with a new value 'DEV_RESERVED' as
specified in the FSA9480 datasheet. Do not use it as it serves for
various purposes in the listed devices.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: bad5b5e707a5 ("extcon: Add fsa9480 extcon driver")
Cc: stable(a)vger.kernel.org
Signed-off-by: Vladimir Moskovkin <Vladimir.Moskovkin(a)kaspersky.com>
---
drivers/extcon/extcon-fsa9480.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/extcon/extcon-fsa9480.c b/drivers/extcon/extcon-fsa9480.c
index b11b43171063..30972a7214f7 100644
--- a/drivers/extcon/extcon-fsa9480.c
+++ b/drivers/extcon/extcon-fsa9480.c
@@ -68,6 +68,7 @@
#define DEV_T1_CHARGER_MASK (DEV_DEDICATED_CHG | DEV_USB_CHG)
/* Device Type 2 */
+#define DEV_RESERVED 15
#define DEV_AV 14
#define DEV_TTY 13
#define DEV_PPD 12
@@ -133,6 +134,7 @@ static const u64 cable_types[] = {
[DEV_USB] = BIT_ULL(EXTCON_USB) | BIT_ULL(EXTCON_CHG_USB_SDP),
[DEV_AUDIO_2] = BIT_ULL(EXTCON_JACK_LINE_OUT),
[DEV_AUDIO_1] = BIT_ULL(EXTCON_JACK_LINE_OUT),
+ [DEV_RESERVED] = 0,
[DEV_AV] = BIT_ULL(EXTCON_JACK_LINE_OUT)
| BIT_ULL(EXTCON_JACK_VIDEO_OUT),
[DEV_TTY] = BIT_ULL(EXTCON_JIG),
@@ -228,7 +230,7 @@ static void fsa9480_detect_dev(struct fsa9480_usbsw *usbsw)
dev_err(usbsw->dev, "%s: failed to read registers", __func__);
return;
}
- val = val2 << 8 | val1;
+ val = val2 << 8 | (val1 & 0xFF);
dev_info(usbsw->dev, "dev1: 0x%x, dev2: 0x%x\n", val1, val2);
--
2.25.1
BootLoader may pass a head such as "BOOT_IMAGE=/boot/vmlinuz-x.y.z" to
kernel parameters. But this head is not recognized by the kernel so will
be passed to user space. However, user space init program also doesn't
recognized it.
KEXEC may also pass a head such as "kexec" on some architectures.
So the the best way is handle it by the kernel itself, which can avoid
such boot warnings:
Kernel command line: BOOT_IMAGE=(hd0,1)/vmlinuz-6.x root=/dev/sda3 ro console=tty
Unknown kernel command line parameters "BOOT_IMAGE=(hd0,1)/vmlinuz-6.x", will be passed to user space.
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
init/main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/init/main.c b/init/main.c
index 225a58279acd..9e0a7e8913c0 100644
--- a/init/main.c
+++ b/init/main.c
@@ -545,6 +545,7 @@ static int __init unknown_bootoption(char *param, char *val,
const char *unused, void *arg)
{
size_t len = strlen(param);
+ const char *bootloader[] = { "BOOT_IMAGE", "kexec", NULL };
/* Handle params aliased to sysctls */
if (sysctl_is_alias(param))
@@ -552,6 +553,12 @@ static int __init unknown_bootoption(char *param, char *val,
repair_env_string(param, val);
+ /* Handle bootloader head */
+ for (int i = 0; bootloader[i]; i++) {
+ if (!strncmp(param, bootloader[i], strlen(bootloader[i])))
+ return 0;
+ }
+
/* Handle obsolete-style parameters */
if (obsolete_checksetup(param))
return 0;
--
2.47.1
Using device_find_child() to lookup the proper device to destroy
causes an unbalance in device refcount, since device_find_child()
calls an implicit get_device() to increment the device's reference
count before returning the pointer. cxl_port_find_switch_decoder()
directly converts this pointer to a cxl_decoder and returns it without
releasing the reference properly. We should call put_device() to
decrement reference count.
As comment of device_find_child() says, 'NOTE: you will need to drop
the reference with put_device() after use'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: d6879d8cfb81 ("cxl/region: Add function to find a port's switch decoder by range")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
drivers/cxl/core/region.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6e5e1460068d..cae16d761261 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -3234,6 +3234,9 @@ cxl_port_find_switch_decoder(struct cxl_port *port, struct range *hpa)
struct device *cxld_dev = device_find_child(&port->dev, hpa,
match_decoder_by_range);
+ /* Drop the refcnt bumped implicitly by device_find_child */
+ put_device(cxld_dev);
+
return cxld_dev ? to_cxl_decoder(cxld_dev) : NULL;
}
--
2.25.1
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 277627b431a0a6401635c416a21b2a0f77a77347
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071424-payroll-posh-4e3e@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 277627b431a0a6401635c416a21b2a0f77a77347 Mon Sep 17 00:00:00 2001
From: Al Viro <viro(a)zeniv.linux.org.uk>
Date: Sun, 6 Jul 2025 02:26:45 +0100
Subject: [PATCH] ksmbd: fix a mount write count leak in
ksmbd_vfs_kern_path_locked()
If the call of ksmbd_vfs_lock_parent() fails, we drop the parent_path
references and return an error. We need to drop the write access we
just got on parent_path->mnt before we drop the mount reference - callers
assume that ksmbd_vfs_kern_path_locked() returns with mount write
access grabbed if and only if it has returned 0.
Fixes: 864fb5d37163 ("ksmbd: fix possible deadlock in smb2_open")
Signed-off-by: Al Viro <viro(a)zeniv.linux.org.uk>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/vfs.c b/fs/smb/server/vfs.c
index 0f3aad12e495..d3437f6644e3 100644
--- a/fs/smb/server/vfs.c
+++ b/fs/smb/server/vfs.c
@@ -1282,6 +1282,7 @@ int ksmbd_vfs_kern_path_locked(struct ksmbd_work *work, char *name,
err = ksmbd_vfs_lock_parent(parent_path->dentry, path->dentry);
if (err) {
+ mnt_drop_write(parent_path->mnt);
path_put(path);
path_put(parent_path);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 0c2b53997e8f5e2ec9e0fbd17ac0436466b65488
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071408-overarch-ashes-8fbd@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0c2b53997e8f5e2ec9e0fbd17ac0436466b65488 Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze(a)samba.org>
Date: Wed, 2 Jul 2025 09:18:05 +0200
Subject: [PATCH] smb: server: make use of rdma_destroy_qp()
The qp is created by rdma_create_qp() as t->cm_id->qp
and t->qp is just a shortcut.
rdma_destroy_qp() also calls ib_destroy_qp(cm_id->qp) internally,
but it is protected by a mutex, clears the cm_id and also calls
trace_cm_qp_destroy().
This should make the tracing more useful as both
rdma_create_qp() and rdma_destroy_qp() are traces and it makes
the code look more sane as functions from the same layer are used
for the specific qp object.
trace-cmd stream -e rdma_cma:cm_qp_create -e rdma_cma:cm_qp_destroy
shows this now while doing a mount and unmount from a client:
<...>-80 [002] 378.514182: cm_qp_create: cm.id=1 src=172.31.9.167:5445 dst=172.31.9.166:37113 tos=0 pd.id=0 qp_type=RC send_wr=867 recv_wr=255 qp_num=1 rc=0
<...>-6283 [001] 381.686172: cm_qp_destroy: cm.id=1 src=172.31.9.167:5445 dst=172.31.9.166:37113 tos=0 qp_num=1
Before we only saw the first line.
Cc: Namjae Jeon <linkinjeon(a)kernel.org>
Cc: Steve French <stfrench(a)microsoft.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky(a)gmail.com>
Cc: Hyunchul Lee <hyc.lee(a)gmail.com>
Cc: Tom Talpey <tom(a)talpey.com>
Cc: linux-cifs(a)vger.kernel.org
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers")
Signed-off-by: Stefan Metzmacher <metze(a)samba.org>
Reviewed-by: Tom Talpey <tom(a)talpey.com>
Acked-by: Namjae Jeon <linkinjeon(a)kernel.org>
Signed-off-by: Steve French <stfrench(a)microsoft.com>
diff --git a/fs/smb/server/transport_rdma.c b/fs/smb/server/transport_rdma.c
index 64a428a06ace..c6cbe0d56e32 100644
--- a/fs/smb/server/transport_rdma.c
+++ b/fs/smb/server/transport_rdma.c
@@ -433,7 +433,8 @@ static void free_transport(struct smb_direct_transport *t)
if (t->qp) {
ib_drain_qp(t->qp);
ib_mr_pool_destroy(t->qp, &t->qp->rdma_mrs);
- ib_destroy_qp(t->qp);
+ t->qp = NULL;
+ rdma_destroy_qp(t->cm_id);
}
ksmbd_debug(RDMA, "drain the reassembly queue\n");
@@ -1940,8 +1941,8 @@ static int smb_direct_create_qpair(struct smb_direct_transport *t,
return 0;
err:
if (t->qp) {
- ib_destroy_qp(t->qp);
t->qp = NULL;
+ rdma_destroy_qp(t->cm_id);
}
if (t->recv_cq) {
ib_destroy_cq(t->recv_cq);
Starting with Rust 1.89.0 (expected 2025-08-07), under
`CONFIG_RUST_DEBUG_ASSERTIONS=y`, `objtool` may report:
rust/kernel.o: warning: objtool: _R..._6kernel4pageNtB5_4Page8read_raw()
falls through to next function _R..._6kernel4pageNtB5_4Page9write_raw()
(and many others) due to calls to the `noreturn` symbol:
core::panicking::panic_nounwind_fmt
Thus add the mangled one to the list so that `objtool` knows it is
actually `noreturn`.
See commit 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions")
for more details.
Cc: stable(a)vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Cc: Josh Poimboeuf <jpoimboe(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
tools/objtool/check.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index f23bdda737aa..3257eefc41ed 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -224,6 +224,7 @@ static bool is_rust_noreturn(const struct symbol *func)
str_ends_with(func->name, "_4core9panicking14panic_explicit") ||
str_ends_with(func->name, "_4core9panicking14panic_nounwind") ||
str_ends_with(func->name, "_4core9panicking18panic_bounds_check") ||
+ str_ends_with(func->name, "_4core9panicking18panic_nounwind_fmt") ||
str_ends_with(func->name, "_4core9panicking19assert_failed_inner") ||
str_ends_with(func->name, "_4core9panicking30panic_null_pointer_dereference") ||
str_ends_with(func->name, "_4core9panicking36panic_misaligned_pointer_dereference") ||
--
2.50.1
Starting with Rust 1.89.0 (expected 2025-08-07), the Rust compiler fails
to build the `rusttest` target due to undefined references such as:
kernel...-cgu.0:(.text....+0x116): undefined reference to
`rust_helper_kunit_get_current_test'
Moreover, tooling like `modpost` gets confused:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/nova/nova.o
ERROR: modpost: missing MODULE_LICENSE() in drivers/gpu/nova-core/nova_core.o
The reason behind both issues is that the Rust compiler will now [1]
treat `#[used]` as `#[used(linker)]` instead of `#[used(compiler)]`
for our targets. This means that the retain section flag (`R`,
`SHF_GNU_RETAIN`) will be used and that they will be marked as `unique`
too, with different IDs. In turn, that means we end up with undefined
references that did not get discarded in `rusttest` and that multiple
`.modinfo` sections are generated, which confuse tooling like `modpost`
because they only expect one.
Thus start using `#[used(compiler)]` to keep the previous behavior
and to be explicit about what we want. Sadly, it is an unstable feature
(`used_with_arg`) [2] -- we will talk to upstream Rust about it. The good
news is that it has been available for a long time (Rust >= 1.60) [3].
The changes should also be fine for previous Rust versions, since they
behave the same way as before [4].
Alternatively, we could use `#[no_mangle]` or `#[export_name = ...]`
since those still behave like `#[used(compiler)]`, but of course it is
not really what we want to express, and it requires other changes to
avoid symbol conflicts.
Cc: Björn Roy Baron <bjorn3_gh(a)protonmail.com>
Cc: David Wood <david(a)davidtw.co>
Cc: Wesley Wiser <wwiser(a)gmail.com>
Cc: stable(a)vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs).
Link: https://github.com/rust-lang/rust/pull/140872 [1]
Link: https://github.com/rust-lang/rust/issues/93798 [2]
Link: https://github.com/rust-lang/rust/pull/91504 [3]
Link: https://godbolt.org/z/sxzWTMfzW [4]
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
rust/Makefile | 1 +
rust/kernel/firmware.rs | 2 +-
rust/kernel/kunit.rs | 2 +-
rust/kernel/lib.rs | 3 +++
rust/macros/module.rs | 10 +++++-----
scripts/Makefile.build | 3 ++-
6 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/rust/Makefile b/rust/Makefile
index 27dec7904c3a..115b63b7d1e3 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -194,6 +194,7 @@ quiet_cmd_rustdoc_test = RUSTDOC T $<
RUST_MODFILE=test.rs \
OBJTREE=$(abspath $(objtree)) \
$(RUSTDOC) --test $(rust_common_flags) \
+ -Zcrate-attr='feature(used_with_arg)' \
@$(objtree)/include/generated/rustc_cfg \
$(rustc_target_flags) $(rustdoc_test_target_flags) \
$(rustdoc_test_quiet) \
diff --git a/rust/kernel/firmware.rs b/rust/kernel/firmware.rs
index 2494c96e105f..4fe621f35716 100644
--- a/rust/kernel/firmware.rs
+++ b/rust/kernel/firmware.rs
@@ -202,7 +202,7 @@ macro_rules! module_firmware {
};
#[link_section = ".modinfo"]
- #[used]
+ #[used(compiler)]
static __MODULE_FIRMWARE: [u8; $($builder)*::create(__MODULE_FIRMWARE_PREFIX)
.build_length()] = $($builder)*::create(__MODULE_FIRMWARE_PREFIX).build();
};
diff --git a/rust/kernel/kunit.rs b/rust/kernel/kunit.rs
index 4b8cdcb21e77..b9e65905e121 100644
--- a/rust/kernel/kunit.rs
+++ b/rust/kernel/kunit.rs
@@ -302,7 +302,7 @@ macro_rules! kunit_unsafe_test_suite {
is_init: false,
};
- #[used]
+ #[used(compiler)]
#[allow(unused_unsafe)]
#[cfg_attr(not(target_os = "macos"), link_section = ".kunit_test_suites")]
static mut KUNIT_TEST_SUITE_ENTRY: *const ::kernel::bindings::kunit_suite =
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 6b4774b2b1c3..e13d6ed88fa6 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -34,6 +34,9 @@
// Expected to become stable.
#![feature(arbitrary_self_types)]
//
+// To be determined.
+#![feature(used_with_arg)]
+//
// `feature(derive_coerce_pointee)` is expected to become stable. Before Rust
// 1.84.0, it did not exist, so enable the predecessor features.
#![cfg_attr(CONFIG_RUSTC_HAS_COERCE_POINTEE, feature(derive_coerce_pointee))]
diff --git a/rust/macros/module.rs b/rust/macros/module.rs
index 2ddd2eeb2852..75efc6eeeafc 100644
--- a/rust/macros/module.rs
+++ b/rust/macros/module.rs
@@ -57,7 +57,7 @@ fn emit_base(&mut self, field: &str, content: &str, builtin: bool) {
{cfg}
#[doc(hidden)]
#[cfg_attr(not(target_os = \"macos\"), link_section = \".modinfo\")]
- #[used]
+ #[used(compiler)]
pub static __{module}_{counter}: [u8; {length}] = *{string};
",
cfg = if builtin {
@@ -249,7 +249,7 @@ mod __module_init {{
// key or a new section. For the moment, keep it simple.
#[cfg(MODULE)]
#[doc(hidden)]
- #[used]
+ #[used(compiler)]
static __IS_RUST_MODULE: () = ();
static mut __MOD: ::core::mem::MaybeUninit<{type_}> =
@@ -273,7 +273,7 @@ mod __module_init {{
#[cfg(MODULE)]
#[doc(hidden)]
- #[used]
+ #[used(compiler)]
#[link_section = \".init.data\"]
static __UNIQUE_ID___addressable_init_module: unsafe extern \"C\" fn() -> i32 = init_module;
@@ -293,7 +293,7 @@ mod __module_init {{
#[cfg(MODULE)]
#[doc(hidden)]
- #[used]
+ #[used(compiler)]
#[link_section = \".exit.data\"]
static __UNIQUE_ID___addressable_cleanup_module: extern \"C\" fn() = cleanup_module;
@@ -303,7 +303,7 @@ mod __module_init {{
#[cfg(not(CONFIG_HAVE_ARCH_PREL32_RELOCATIONS))]
#[doc(hidden)]
#[link_section = \"{initcall_section}\"]
- #[used]
+ #[used(compiler)]
pub static __{ident}_initcall: extern \"C\" fn() ->
::kernel::ffi::c_int = __{ident}_init;
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index a6461ea411f7..ba71b27aa363 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -312,10 +312,11 @@ $(obj)/%.lst: $(obj)/%.c FORCE
# - Stable since Rust 1.82.0: `feature(asm_const)`, `feature(raw_ref_op)`.
# - Stable since Rust 1.87.0: `feature(asm_goto)`.
# - Expected to become stable: `feature(arbitrary_self_types)`.
+# - To be determined: `feature(used_with_arg)`.
#
# Please see https://github.com/Rust-for-Linux/linux/issues/2 for details on
# the unstable features in use.
-rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,lint_reasons,raw_ref_op
+rust_allowed_features := asm_const,asm_goto,arbitrary_self_types,lint_reasons,raw_ref_op,used_with_arg
# `--out-dir` is required to avoid temporaries being created by `rustc` in the
# current working directory, which may be not accessible in the out-of-tree
--
2.50.1
Now relocate_new_kernel_size is a .long value, which means 32bit, so its
high 32bit is undefined. This causes memcpy((void *)reboot_code_buffer,
relocate_new_kernel, relocate_new_kernel_size) in machine_kexec_prepare()
access out of range memories in some cases, and then end up with an ADE
exception.
So make relocate_new_kernel_size be a .quad value, which means 64bit, to
avoid such errors.
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn>
---
arch/loongarch/kernel/relocate_kernel.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch/kernel/relocate_kernel.S b/arch/loongarch/kernel/relocate_kernel.S
index 84e6de2fd973..8b5140ac9ea1 100644
--- a/arch/loongarch/kernel/relocate_kernel.S
+++ b/arch/loongarch/kernel/relocate_kernel.S
@@ -109,4 +109,4 @@ SYM_CODE_END(kexec_smp_wait)
relocate_new_kernel_end:
.section ".data"
-SYM_DATA(relocate_new_kernel_size, .long relocate_new_kernel_end - relocate_new_kernel)
+SYM_DATA(relocate_new_kernel_size, .quad relocate_new_kernel_end - relocate_new_kernel)
--
2.47.1
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 6ee9b3d84775944fb8c8a447961cd01274ac671c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071348-swizzle-utter-2496@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6ee9b3d84775944fb8c8a447961cd01274ac671c Mon Sep 17 00:00:00 2001
From: Yeoreum Yun <yeoreum.yun(a)arm.com>
Date: Thu, 3 Jul 2025 19:10:18 +0100
Subject: [PATCH] kasan: remove kasan_find_vm_area() to prevent possible
deadlock
find_vm_area() couldn't be called in atomic_context. If find_vm_area() is
called to reports vm area information, kasan can trigger deadlock like:
CPU0 CPU1
vmalloc();
alloc_vmap_area();
spin_lock(&vn->busy.lock)
spin_lock_bh(&some_lock);
<interrupt occurs>
<in softirq>
spin_lock(&some_lock);
<access invalid address>
kasan_report();
print_report();
print_address_description();
kasan_find_vm_area();
find_vm_area();
spin_lock(&vn->busy.lock) // deadlock!
To prevent possible deadlock while kasan reports, remove kasan_find_vm_area().
Link: https://lkml.kernel.org/r/20250703181018.580833-1-yeoreum.yun@arm.com
Fixes: c056a364e954 ("kasan: print virtual mapping info in reports")
Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..b0877035491f 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -370,36 +370,6 @@ static inline bool init_task_stack_addr(const void *addr)
sizeof(init_thread_union.stack));
}
-/*
- * This function is invoked with report_lock (a raw_spinlock) held. A
- * PREEMPT_RT kernel cannot call find_vm_area() as it will acquire a sleeping
- * rt_spinlock.
- *
- * For !RT kernel, the PROVE_RAW_LOCK_NESTING config option will print a
- * lockdep warning for this raw_spinlock -> spinlock dependency. This config
- * option is enabled by default to ensure better test coverage to expose this
- * kind of RT kernel problem. This lockdep splat, however, can be suppressed
- * by using DEFINE_WAIT_OVERRIDE_MAP() if it serves a useful purpose and the
- * invalid PREEMPT_RT case has been taken care of.
- */
-static inline struct vm_struct *kasan_find_vm_area(void *addr)
-{
- static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
- struct vm_struct *va;
-
- if (IS_ENABLED(CONFIG_PREEMPT_RT))
- return NULL;
-
- /*
- * Suppress lockdep warning and fetch vmalloc area of the
- * offending address.
- */
- lock_map_acquire_try(&vmalloc_map);
- va = find_vm_area(addr);
- lock_map_release(&vmalloc_map);
- return va;
-}
-
static void print_address_description(void *addr, u8 tag,
struct kasan_report_info *info)
{
@@ -429,19 +399,8 @@ static void print_address_description(void *addr, u8 tag,
}
if (is_vmalloc_addr(addr)) {
- struct vm_struct *va = kasan_find_vm_area(addr);
-
- if (va) {
- pr_err("The buggy address belongs to the virtual mapping at\n"
- " [%px, %px) created by:\n"
- " %pS\n",
- va->addr, va->addr + va->size, va->caller);
- pr_err("\n");
-
- page = vmalloc_to_page(addr);
- } else {
- pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
- }
+ pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+ page = vmalloc_to_page(addr);
}
if (page) {
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 6ee9b3d84775944fb8c8a447961cd01274ac671c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071347-pregnant-dismount-40e9@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6ee9b3d84775944fb8c8a447961cd01274ac671c Mon Sep 17 00:00:00 2001
From: Yeoreum Yun <yeoreum.yun(a)arm.com>
Date: Thu, 3 Jul 2025 19:10:18 +0100
Subject: [PATCH] kasan: remove kasan_find_vm_area() to prevent possible
deadlock
find_vm_area() couldn't be called in atomic_context. If find_vm_area() is
called to reports vm area information, kasan can trigger deadlock like:
CPU0 CPU1
vmalloc();
alloc_vmap_area();
spin_lock(&vn->busy.lock)
spin_lock_bh(&some_lock);
<interrupt occurs>
<in softirq>
spin_lock(&some_lock);
<access invalid address>
kasan_report();
print_report();
print_address_description();
kasan_find_vm_area();
find_vm_area();
spin_lock(&vn->busy.lock) // deadlock!
To prevent possible deadlock while kasan reports, remove kasan_find_vm_area().
Link: https://lkml.kernel.org/r/20250703181018.580833-1-yeoreum.yun@arm.com
Fixes: c056a364e954 ("kasan: print virtual mapping info in reports")
Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..b0877035491f 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -370,36 +370,6 @@ static inline bool init_task_stack_addr(const void *addr)
sizeof(init_thread_union.stack));
}
-/*
- * This function is invoked with report_lock (a raw_spinlock) held. A
- * PREEMPT_RT kernel cannot call find_vm_area() as it will acquire a sleeping
- * rt_spinlock.
- *
- * For !RT kernel, the PROVE_RAW_LOCK_NESTING config option will print a
- * lockdep warning for this raw_spinlock -> spinlock dependency. This config
- * option is enabled by default to ensure better test coverage to expose this
- * kind of RT kernel problem. This lockdep splat, however, can be suppressed
- * by using DEFINE_WAIT_OVERRIDE_MAP() if it serves a useful purpose and the
- * invalid PREEMPT_RT case has been taken care of.
- */
-static inline struct vm_struct *kasan_find_vm_area(void *addr)
-{
- static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
- struct vm_struct *va;
-
- if (IS_ENABLED(CONFIG_PREEMPT_RT))
- return NULL;
-
- /*
- * Suppress lockdep warning and fetch vmalloc area of the
- * offending address.
- */
- lock_map_acquire_try(&vmalloc_map);
- va = find_vm_area(addr);
- lock_map_release(&vmalloc_map);
- return va;
-}
-
static void print_address_description(void *addr, u8 tag,
struct kasan_report_info *info)
{
@@ -429,19 +399,8 @@ static void print_address_description(void *addr, u8 tag,
}
if (is_vmalloc_addr(addr)) {
- struct vm_struct *va = kasan_find_vm_area(addr);
-
- if (va) {
- pr_err("The buggy address belongs to the virtual mapping at\n"
- " [%px, %px) created by:\n"
- " %pS\n",
- va->addr, va->addr + va->size, va->caller);
- pr_err("\n");
-
- page = vmalloc_to_page(addr);
- } else {
- pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
- }
+ pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+ page = vmalloc_to_page(addr);
}
if (page) {
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x 6ee9b3d84775944fb8c8a447961cd01274ac671c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071347-compel-crate-aeba@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6ee9b3d84775944fb8c8a447961cd01274ac671c Mon Sep 17 00:00:00 2001
From: Yeoreum Yun <yeoreum.yun(a)arm.com>
Date: Thu, 3 Jul 2025 19:10:18 +0100
Subject: [PATCH] kasan: remove kasan_find_vm_area() to prevent possible
deadlock
find_vm_area() couldn't be called in atomic_context. If find_vm_area() is
called to reports vm area information, kasan can trigger deadlock like:
CPU0 CPU1
vmalloc();
alloc_vmap_area();
spin_lock(&vn->busy.lock)
spin_lock_bh(&some_lock);
<interrupt occurs>
<in softirq>
spin_lock(&some_lock);
<access invalid address>
kasan_report();
print_report();
print_address_description();
kasan_find_vm_area();
find_vm_area();
spin_lock(&vn->busy.lock) // deadlock!
To prevent possible deadlock while kasan reports, remove kasan_find_vm_area().
Link: https://lkml.kernel.org/r/20250703181018.580833-1-yeoreum.yun@arm.com
Fixes: c056a364e954 ("kasan: print virtual mapping info in reports")
Signed-off-by: Yeoreum Yun <yeoreum.yun(a)arm.com>
Reported-by: Yunseong Kim <ysk(a)kzalloc.com>
Reviewed-by: Andrey Ryabinin <ryabinin.a.a(a)gmail.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Andrey Konovalov <andreyknvl(a)gmail.com>
Cc: Byungchul Park <byungchul(a)sk.com>
Cc: Dmitriy Vyukov <dvyukov(a)google.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Steven Rostedt <rostedt(a)goodmis.org>
Cc: Vincenzo Frascino <vincenzo.frascino(a)arm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..b0877035491f 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -370,36 +370,6 @@ static inline bool init_task_stack_addr(const void *addr)
sizeof(init_thread_union.stack));
}
-/*
- * This function is invoked with report_lock (a raw_spinlock) held. A
- * PREEMPT_RT kernel cannot call find_vm_area() as it will acquire a sleeping
- * rt_spinlock.
- *
- * For !RT kernel, the PROVE_RAW_LOCK_NESTING config option will print a
- * lockdep warning for this raw_spinlock -> spinlock dependency. This config
- * option is enabled by default to ensure better test coverage to expose this
- * kind of RT kernel problem. This lockdep splat, however, can be suppressed
- * by using DEFINE_WAIT_OVERRIDE_MAP() if it serves a useful purpose and the
- * invalid PREEMPT_RT case has been taken care of.
- */
-static inline struct vm_struct *kasan_find_vm_area(void *addr)
-{
- static DEFINE_WAIT_OVERRIDE_MAP(vmalloc_map, LD_WAIT_SLEEP);
- struct vm_struct *va;
-
- if (IS_ENABLED(CONFIG_PREEMPT_RT))
- return NULL;
-
- /*
- * Suppress lockdep warning and fetch vmalloc area of the
- * offending address.
- */
- lock_map_acquire_try(&vmalloc_map);
- va = find_vm_area(addr);
- lock_map_release(&vmalloc_map);
- return va;
-}
-
static void print_address_description(void *addr, u8 tag,
struct kasan_report_info *info)
{
@@ -429,19 +399,8 @@ static void print_address_description(void *addr, u8 tag,
}
if (is_vmalloc_addr(addr)) {
- struct vm_struct *va = kasan_find_vm_area(addr);
-
- if (va) {
- pr_err("The buggy address belongs to the virtual mapping at\n"
- " [%px, %px) created by:\n"
- " %pS\n",
- va->addr, va->addr + va->size, va->caller);
- pr_err("\n");
-
- page = vmalloc_to_page(addr);
- } else {
- pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
- }
+ pr_err("The buggy address %px belongs to a vmalloc virtual mapping\n", addr);
+ page = vmalloc_to_page(addr);
}
if (page) {
The patch below does not apply to the 6.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.15.y
git checkout FETCH_HEAD
git cherry-pick -x 52e1a03e6cf61ae165f59f41c44394a653a0a788
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071203-dime-overcook-61dc@gregkh' --subject-prefix 'PATCH 6.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 52e1a03e6cf61ae165f59f41c44394a653a0a788 Mon Sep 17 00:00:00 2001
From: Nikunj A Dadhania <nikunj(a)amd.com>
Date: Mon, 30 Jun 2025 13:48:58 +0530
Subject: [PATCH] x86/sev: Use TSC_FACTOR for Secure TSC frequency calculation
When using Secure TSC, the GUEST_TSC_FREQ MSR reports a frequency based on
the nominal P0 frequency, which deviates slightly (typically ~0.2%) from
the actual mean TSC frequency due to clocking parameters.
Over extended VM uptime, this discrepancy accumulates, causing clock skew
between the hypervisor and a SEV-SNP VM, leading to early timer interrupts as
perceived by the guest.
The guest kernel relies on the reported nominal frequency for TSC-based
timekeeping, while the actual frequency set during SNP_LAUNCH_START may
differ. This mismatch results in inaccurate time calculations, causing the
guest to perceive hrtimers as firing earlier than expected.
Utilize the TSC_FACTOR from the SEV firmware's secrets page (see "Secrets
Page Format" in the SNP Firmware ABI Specification) to calculate the mean
TSC frequency, ensuring accurate timekeeping and mitigating clock skew in
SEV-SNP VMs.
Use early_ioremap_encrypted() to map the secrets page as
ioremap_encrypted() uses kmalloc() which is not available during early TSC
initialization and causes a panic.
[ bp: Drop the silly dummy var:
https://lore.kernel.org/r/20250630192726.GBaGLlHl84xIopx4Pt@fat_crate.local ]
Fixes: 73bbf3b0fbba ("x86/tsc: Init the TSC for Secure TSC guests")
Signed-off-by: Nikunj A Dadhania <nikunj(a)amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/20250630081858.485187-1-nikunj@amd.com
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c
index b6db4e0b936b..7543a8b52c67 100644
--- a/arch/x86/coco/sev/core.c
+++ b/arch/x86/coco/sev/core.c
@@ -88,7 +88,7 @@ static const char * const sev_status_feat_names[] = {
*/
static u64 snp_tsc_scale __ro_after_init;
static u64 snp_tsc_offset __ro_after_init;
-static u64 snp_tsc_freq_khz __ro_after_init;
+static unsigned long snp_tsc_freq_khz __ro_after_init;
DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data);
DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa);
@@ -2167,15 +2167,31 @@ static unsigned long securetsc_get_tsc_khz(void)
void __init snp_secure_tsc_init(void)
{
- unsigned long long tsc_freq_mhz;
+ struct snp_secrets_page *secrets;
+ unsigned long tsc_freq_mhz;
+ void *mem;
if (!cc_platform_has(CC_ATTR_GUEST_SNP_SECURE_TSC))
return;
+ mem = early_memremap_encrypted(sev_secrets_pa, PAGE_SIZE);
+ if (!mem) {
+ pr_err("Unable to get TSC_FACTOR: failed to map the SNP secrets page.\n");
+ sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SECURE_TSC);
+ }
+
+ secrets = (__force struct snp_secrets_page *)mem;
+
setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
rdmsrq(MSR_AMD64_GUEST_TSC_FREQ, tsc_freq_mhz);
- snp_tsc_freq_khz = (unsigned long)(tsc_freq_mhz * 1000);
+
+ /* Extract the GUEST TSC MHZ from BIT[17:0], rest is reserved space */
+ tsc_freq_mhz &= GENMASK_ULL(17, 0);
+
+ snp_tsc_freq_khz = SNP_SCALE_TSC_FREQ(tsc_freq_mhz * 1000, secrets->tsc_factor);
x86_platform.calibrate_cpu = securetsc_get_tsc_khz;
x86_platform.calibrate_tsc = securetsc_get_tsc_khz;
+
+ early_memunmap(mem, PAGE_SIZE);
}
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 58e028d42e41..a631f7d7c0c0 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -223,6 +223,18 @@ struct snp_tsc_info_resp {
u8 rsvd2[100];
} __packed;
+/*
+ * Obtain the mean TSC frequency by decreasing the nominal TSC frequency with
+ * TSC_FACTOR as documented in the SNP Firmware ABI specification:
+ *
+ * GUEST_TSC_FREQ * (1 - (TSC_FACTOR * 0.00001))
+ *
+ * which is equivalent to:
+ *
+ * GUEST_TSC_FREQ -= (GUEST_TSC_FREQ * TSC_FACTOR) / 100000;
+ */
+#define SNP_SCALE_TSC_FREQ(freq, factor) ((freq) - (freq) * (factor) / 100000)
+
struct snp_guest_req {
void *req_buf;
size_t req_sz;
@@ -282,8 +294,11 @@ struct snp_secrets_page {
u8 svsm_guest_vmpl;
u8 rsvd3[3];
+ /* The percentage decrease from nominal to mean TSC frequency. */
+ u32 tsc_factor;
+
/* Remainder of page */
- u8 rsvd4[3744];
+ u8 rsvd4[3740];
} __packed;
struct snp_msg_desc {
The iotlb_sync_map iommu ops allows drivers to perform necessary cache
flushes when new mappings are established. For the Intel iommu driver,
this callback specifically serves two purposes:
- To flush caches when a second-stage page table is attached to a device
whose iommu is operating in caching mode (CAP_REG.CM==1).
- To explicitly flush internal write buffers to ensure updates to memory-
resident remapping structures are visible to hardware (CAP_REG.RWBF==1).
However, in scenarios where neither caching mode nor the RWBF flag is
active, the cache_tag_flush_range_np() helper, which is called in the
iotlb_sync_map path, effectively becomes a no-op.
Despite being a no-op, cache_tag_flush_range_np() involves iterating
through all cache tags of the iommu's attached to the domain, protected
by a spinlock. This unnecessary execution path introduces overhead,
leading to a measurable I/O performance regression. On systems with NVMes
under the same bridge, performance was observed to drop from approximately
~6150 MiB/s down to ~4985 MiB/s.
Introduce a flag in the dmar_domain structure. This flag will only be set
when iotlb_sync_map is required (i.e., when CM or RWBF is set). The
cache_tag_flush_range_np() is called only for domains where this flag is
set.
Reported-by: Ioanna Alifieraki <ioanna-maria.alifieraki(a)canonical.com>
Closes: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738
Link: https://lore.kernel.org/r/20250701171154.52435-1-ioanna-maria.alifieraki@ca…
Fixes: 129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map")
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
drivers/iommu/intel/iommu.c | 19 ++++++++++++++++++-
drivers/iommu/intel/iommu.h | 3 +++
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7aa3932251b2..f60201ee4be0 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1796,6 +1796,18 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
(pgd_t *)pgd, flags, old);
}
+static bool domain_need_iotlb_sync_map(struct dmar_domain *domain,
+ struct intel_iommu *iommu)
+{
+ if (cap_caching_mode(iommu->cap) && !domain->use_first_level)
+ return true;
+
+ if (rwbf_quirk || cap_rwbf(iommu->cap))
+ return true;
+
+ return false;
+}
+
static int dmar_domain_attach_device(struct dmar_domain *domain,
struct device *dev)
{
@@ -1833,6 +1845,8 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
if (ret)
goto out_block_translation;
+ domain->iotlb_sync_map |= domain_need_iotlb_sync_map(domain, iommu);
+
return 0;
out_block_translation:
@@ -3945,7 +3959,10 @@ static bool risky_device(struct pci_dev *pdev)
static int intel_iommu_iotlb_sync_map(struct iommu_domain *domain,
unsigned long iova, size_t size)
{
- cache_tag_flush_range_np(to_dmar_domain(domain), iova, iova + size - 1);
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+
+ if (dmar_domain->iotlb_sync_map)
+ cache_tag_flush_range_np(dmar_domain, iova, iova + size - 1);
return 0;
}
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 3ddbcc603de2..7ab2c34a5ecc 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -614,6 +614,9 @@ struct dmar_domain {
u8 has_mappings:1; /* Has mappings configured through
* iommu_map() interface.
*/
+ u8 iotlb_sync_map:1; /* Need to flush IOTLB cache or write
+ * buffer when creating mappings.
+ */
spinlock_t lock; /* Protect device tracking lists */
struct list_head devices; /* all devices' list */
--
2.43.0
The wx_rx_buffer structure contained two DMA address fields: 'dma' and
'page_dma'. However, only 'page_dma' was actually initialized and used
to program the Rx descriptor. But 'dma' was uninitialized and used in
some paths.
This could lead to undefined behavior, including DMA errors or
use-after-free, if the uninitialized 'dma' was used. Althrough such
error has not yet occurred, it is worth fixing in the code.
Fixes: 3c47e8ae113a ("net: libwx: Support to receive packets in NAPI")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiawen Wu <jiawenwu(a)trustnetic.com>
Reviewed-by: Simon Horman <horms(a)kernel.org>
---
drivers/net/ethernet/wangxun/libwx/wx_lib.c | 4 ++--
drivers/net/ethernet/wangxun/libwx/wx_type.h | 1 -
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_lib.c b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
index 7e3d7fb61a52..c91487909811 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_lib.c
+++ b/drivers/net/ethernet/wangxun/libwx/wx_lib.c
@@ -307,7 +307,7 @@ static bool wx_alloc_mapped_page(struct wx_ring *rx_ring,
return false;
dma = page_pool_get_dma_addr(page);
- bi->page_dma = dma;
+ bi->dma = dma;
bi->page = page;
bi->page_offset = 0;
@@ -344,7 +344,7 @@ void wx_alloc_rx_buffers(struct wx_ring *rx_ring, u16 cleaned_count)
DMA_FROM_DEVICE);
rx_desc->read.pkt_addr =
- cpu_to_le64(bi->page_dma + bi->page_offset);
+ cpu_to_le64(bi->dma + bi->page_offset);
rx_desc++;
bi++;
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 622f2bfe6cde..1a90fcede86b 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -997,7 +997,6 @@ struct wx_tx_buffer {
struct wx_rx_buffer {
struct sk_buff *skb;
dma_addr_t dma;
- dma_addr_t page_dma;
struct page *page;
unsigned int page_offset;
};
--
2.48.1
The quilt patch titled
Subject: samples/damon/mtier: support boot time enable setup
has been removed from the -mm tree. Its filename was
samples-damon-mtier-support-boot-time-enable-setup.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: samples/damon/mtier: support boot time enable setup
Date: Sun, 6 Jul 2025 12:32:04 -0700
If 'enable' parameter of the 'mtier' DAMON sample module is set at boot
time via the kernel command line, memory allocation is tried before the
slab is initialized. As a result kernel NULL pointer dereference BUG can
happen. Fix it by checking the initialization status.
Link: https://lkml.kernel.org/r/20250706193207.39810-4-sj@kernel.org
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
samples/damon/mtier.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/samples/damon/mtier.c~samples-damon-mtier-support-boot-time-enable-setup
+++ a/samples/damon/mtier.c
@@ -157,6 +157,8 @@ static void damon_sample_mtier_stop(void
damon_destroy_ctx(ctxs[1]);
}
+static bool init_called;
+
static int damon_sample_mtier_enable_store(
const char *val, const struct kernel_param *kp)
{
@@ -182,6 +184,14 @@ static int damon_sample_mtier_enable_sto
static int __init damon_sample_mtier_init(void)
{
+ int err = 0;
+
+ init_called = true;
+ if (enable) {
+ err = damon_sample_mtier_start();
+ if (err)
+ enable = false;
+ }
return 0;
}
_
Patches currently in -mm which might be from sj(a)kernel.org are
samples-damon-wsse-rename-to-have-damon_sample_-prefix.patch
samples-damon-prcl-rename-to-have-damon_sample_-prefix.patch
samples-damon-mtier-rename-to-have-damon_sample_-prefix.patch
mm-damon-sysfs-use-damon-core-api-damon_is_running.patch
mm-damon-sysfs-dont-hold-kdamond_lock-in-before_terminate.patch
docs-mm-damon-maintainer-profile-update-for-mm-new-tree.patch
mm-damon-add-struct-damos_migrate_dests.patch
mm-damon-core-add-damos-migrate_dests-field.patch
mm-damon-sysfs-schemes-implement-damos-action-destinations-directory.patch
mm-damon-sysfs-schemes-set-damos-migrate_dests.patch
docs-abi-damon-document-schemes-dests-directory.patch
docs-admin-guide-mm-damon-usage-document-dests-directory.patch
mm-damon-accept-parallel-damon_call-requests.patch
mm-damon-core-introduce-repeat-mode-damon_call.patch
mm-damon-stat-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-reclaim-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-lru_sort-use-damon_call-repeat-mode-instead-of-damon_callback.patch
samples-damon-prcl-use-damon_call-repeat-mode-instead-of-damon_callback.patch
samples-damon-wsse-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-core-do-not-call-opscleanup-when-destroying-targets.patch
mm-damon-core-add-cleanup_target-ops-callback.patch
mm-damon-vaddr-put-pid-in-cleanup_target.patch
mm-damon-sysfs-remove-damon_sysfs_destroy_targets.patch
mm-damon-core-destroy-targets-when-kdamond_fn-finish.patch
mm-damon-sysfs-remove-damon_sysfs_before_terminate.patch
mm-damon-core-remove-damon_callback.patch
The quilt patch titled
Subject: samples/damon/wsse: fix boot time enable handling
has been removed from the -mm tree. Its filename was
samples-damon-wsse-fix-boot-time-enable-handling.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: SeongJae Park <sj(a)kernel.org>
Subject: samples/damon/wsse: fix boot time enable handling
Date: Sun, 6 Jul 2025 12:32:02 -0700
Patch series "mm/damon: fix misc bugs in DAMON modules".
From manual code review, I found below bugs in DAMON modules.
DAMON sample modules crash if those are enabled at boot time, via kernel
command line. A similar issue was found and fixed on DAMON non-sample
modules in the past, but we didn't check that for sample modules.
DAMON non-sample modules are not setting 'enabled' parameters accordingly
when real enabling is failed. Honggyu found and fixed[1] this type of
bugs in DAMON sample modules, and my inspection was motivated by the great
work. Kudos to Honggyu.
Finally, DAMON_RECLIAM is mistakenly losing scheme internal status due to
misuse of damon_commit_ctx(). DAMON_LRU_SORT has a similar misuse, but
fortunately it is not causing real status loss.
Fix the bugs. Since these are similar patterns of bugs that were found in
the past, it would be better to add tests or refactor the code, in future.
This patch (of 6):
If 'enable' parameter of the 'wsse' DAMON sample module is set at boot
time via the kernel command line, memory allocation is tried before the
slab is initialized. As a result kernel NULL pointer dereference BUG can
happen. Fix it by checking the initialization status.
Link: https://lkml.kernel.org/r/20250706193207.39810-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20250706193207.39810-2-sj@kernel.org
Link: https://lore.kernel.org/20250702000205.1921-1-honggyu.kim@sk.com [1]
Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests")
Signed-off-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
samples/damon/wsse.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
--- a/samples/damon/wsse.c~samples-damon-wsse-fix-boot-time-enable-handling
+++ a/samples/damon/wsse.c
@@ -89,6 +89,8 @@ static void damon_sample_wsse_stop(void)
put_pid(target_pidp);
}
+static bool init_called;
+
static int damon_sample_wsse_enable_store(
const char *val, const struct kernel_param *kp)
{
@@ -114,7 +116,15 @@ static int damon_sample_wsse_enable_stor
static int __init damon_sample_wsse_init(void)
{
- return 0;
+ int err = 0;
+
+ init_called = true;
+ if (enable) {
+ err = damon_sample_wsse_start();
+ if (err)
+ enable = false;
+ }
+ return err;
}
module_init(damon_sample_wsse_init);
_
Patches currently in -mm which might be from sj(a)kernel.org are
samples-damon-wsse-rename-to-have-damon_sample_-prefix.patch
samples-damon-prcl-rename-to-have-damon_sample_-prefix.patch
samples-damon-mtier-rename-to-have-damon_sample_-prefix.patch
mm-damon-sysfs-use-damon-core-api-damon_is_running.patch
mm-damon-sysfs-dont-hold-kdamond_lock-in-before_terminate.patch
docs-mm-damon-maintainer-profile-update-for-mm-new-tree.patch
mm-damon-add-struct-damos_migrate_dests.patch
mm-damon-core-add-damos-migrate_dests-field.patch
mm-damon-sysfs-schemes-implement-damos-action-destinations-directory.patch
mm-damon-sysfs-schemes-set-damos-migrate_dests.patch
docs-abi-damon-document-schemes-dests-directory.patch
docs-admin-guide-mm-damon-usage-document-dests-directory.patch
mm-damon-accept-parallel-damon_call-requests.patch
mm-damon-core-introduce-repeat-mode-damon_call.patch
mm-damon-stat-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-reclaim-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-lru_sort-use-damon_call-repeat-mode-instead-of-damon_callback.patch
samples-damon-prcl-use-damon_call-repeat-mode-instead-of-damon_callback.patch
samples-damon-wsse-use-damon_call-repeat-mode-instead-of-damon_callback.patch
mm-damon-core-do-not-call-opscleanup-when-destroying-targets.patch
mm-damon-core-add-cleanup_target-ops-callback.patch
mm-damon-vaddr-put-pid-in-cleanup_target.patch
mm-damon-sysfs-remove-damon_sysfs_destroy_targets.patch
mm-damon-core-destroy-targets-when-kdamond_fn-finish.patch
mm-damon-sysfs-remove-damon_sysfs_before_terminate.patch
mm-damon-core-remove-damon_callback.patch
Hello,
Could you kindly respond to me?
I am Elena Victoria fernanda from Bradford and I'd like to share something very important with you right now.
Provide your names to help me verify I m addressing the correct contact.
I would be pleased to receive your swift response.
Best wishes
E1ena V!ct0ria F3rnanda
In the max3420_set_clear_feature() function, the endpoint index `id` can have a value from 0 to 15.
However, the udc->ep array is initialized with a maximum of 4 endpoints in max3420_eps_init().
If host sends a request with a wIndex greater than 3, the access to `udc->ep[id]` will go out-of-bounds,
leading to memory corruption or a potential kernel crash.
This bug was found by code inspection and has not been tested on hardware.
Fixes: 48ba02b2e2b1a ("usb: gadget: add udc driver for max3420")
Cc: stable(a)vger.kernel.org
Signed-off-by: Seungjin Bae <eeodqql09(a)gmail.com>
---
drivers/usb/gadget/udc/max3420_udc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/gadget/udc/max3420_udc.c b/drivers/usb/gadget/udc/max3420_udc.c
index 7349ea774adf..e4ecc7f7f3be 100644
--- a/drivers/usb/gadget/udc/max3420_udc.c
+++ b/drivers/usb/gadget/udc/max3420_udc.c
@@ -596,6 +596,8 @@ static void max3420_set_clear_feature(struct max3420_udc *udc)
break;
id = udc->setup.wIndex & USB_ENDPOINT_NUMBER_MASK;
+ if (id >= MAX3420_MAX_EPS)
+ break;
ep = &udc->ep[id];
spin_lock_irqsave(&ep->lock, flags);
--
2.43.0
Hi!
I bumped into a build regression when building Alpine Linux kernel 6.12.35 on x86_64:
In file included from ../arch/x86/tools/insn_decoder_test.c:13:
../tools/include/linux/kallsyms.h:21:10: fatal error: execinfo.h: No such file or directory
21 | #include <execinfo.h>
| ^~~~~~~~~~~~
compilation terminated.
The 6.12.34 kernel built just fine.
I bisected it to:
commit b8abcba6e4aec53868dfe44f97270fc4dee0df2a (HEAD)
Author: Sergio Gonz_lez Collado <sergio.collado(a)gmail.com>
Date: Sun Mar 2 23:15:18 2025 +0100
Kunit to check the longest symbol length
commit c104c16073b7fdb3e4eae18f66f4009f6b073d6f upstream.
which has this hunk:
diff --git a/arch/x86/tools/insn_decoder_test.c b/arch/x86/tools/insn_decoder_test.c
index 472540aeabc2..6c2986d2ad11 100644
--- a/arch/x86/tools/insn_decoder_test.c
+++ b/arch/x86/tools/insn_decoder_test.c
@@ -10,6 +10,7 @@
#include <assert.h>
#include <unistd.h>
#include <stdarg.h>
+#include <linux/kallsyms.h>
#define unlikely(cond) (cond)
@@ -106,7 +107,7 @@ static void parse_args(int argc, char **argv)
}
}
-#define BUFSIZE 256
+#define BUFSIZE (256 + KSYM_NAME_LEN)
int main(int argc, char **argv)
{
It looks like the linux/kallsyms.h was included to get KSYM_NAME_LEN.
Unfortunately it also introduced the include of execinfo.h, which does
not exist on musl libc.
This has previously been reported to and tried fixed:
https://lore.kernel.org/stable/DB0OSTC6N4TL.2NK75K2CWE9JV@pwned.life/T/#t
Would it be an idea to revert commit b8abcba6e4ae til we have a proper
solution for this?
Thanks!
-nc
Starting with Rust 1.89.0 (expected 2025-08-07), the Rust compiler
may warn:
error: trait `MustNotImplDrop` is never used
--> rust/kernel/init/macros.rs:927:15
|
927 | trait MustNotImplDrop {}
| ^^^^^^^^^^^^^^^
|
::: rust/kernel/sync/arc.rs:133:1
|
133 | #[pin_data]
| ----------- in this procedural macro expansion
|
= note: `-D dead-code` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(dead_code)]`
= note: this error originates in the macro `$crate::__pin_data`
which comes from the expansion of the attribute macro
`pin_data` (in Nightly builds, run with
-Z macro-backtrace for more info)
Thus `allow` it to clean it up.
This does not happen in mainline nor 6.15.y, because there the macro was
moved out of the `kernel` crate, and `dead_code` warnings are not
emitted if the macro is foreign to the crate. Thus this patch is
directly sent to stable and intended for 6.12.y only.
Similarly, it is not needed in previous LTSs, because there the Rust
version is pinned.
Cc: Benno Lossin <lossin(a)kernel.org>
Signed-off-by: Miguel Ojeda <ojeda(a)kernel.org>
---
Greg, Sasha: please note that an equivalent patch is _not_ in mainline.
We could put these `allow`s in mainline (they wouldn't hurt), but it
isn't a good idea to add things in mainline for the only reason of
backporting them, thus I am sending this directly to stable.
The patch is pretty safe -- there is no actual code change.
rust/kernel/init/macros.rs | 2 ++
1 file changed, 2 insertions(+)
diff --git a/rust/kernel/init/macros.rs b/rust/kernel/init/macros.rs
index b7213962a6a5..e530028bb9ed 100644
--- a/rust/kernel/init/macros.rs
+++ b/rust/kernel/init/macros.rs
@@ -924,6 +924,7 @@ impl<'__pin, $($impl_generics)*> ::core::marker::Unpin for $name<$($ty_generics)
// We prevent this by creating a trait that will be implemented for all types implementing
// `Drop`. Additionally we will implement this trait for the struct leading to a conflict,
// if it also implements `Drop`
+ #[allow(dead_code)]
trait MustNotImplDrop {}
#[expect(drop_bounds)]
impl<T: ::core::ops::Drop> MustNotImplDrop for T {}
@@ -932,6 +933,7 @@ impl<$($impl_generics)*> MustNotImplDrop for $name<$($ty_generics)*>
// We also take care to prevent users from writing a useless `PinnedDrop` implementation.
// They might implement `PinnedDrop` correctly for the struct, but forget to give
// `PinnedDrop` as the parameter to `#[pin_data]`.
+ #[allow(dead_code)]
#[expect(non_camel_case_types)]
trait UselessPinnedDropImpl_you_need_to_specify_PinnedDrop {}
impl<T: $crate::init::PinnedDrop>
base-commit: fbad404f04d758c52bae79ca20d0e7fe5fef91d3
--
2.50.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 505b730ede7f5c4083ff212aa955155b5b92e574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071236-generous-jazz-41e4@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 505b730ede7f5c4083ff212aa955155b5b92e574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)baylibre.com>
Date: Fri, 4 Jul 2025 19:27:27 +0200
Subject: [PATCH] pwm: mediatek: Ensure to disable clocks in error path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes: 7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.…
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek(a)kernel.org>
diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 7eaab5831499..33d3554b9197 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -130,8 +130,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
return ret;
clk_rate = clk_get_rate(pc->clk_pwms[pwm->hwpwm]);
- if (!clk_rate)
- return -EINVAL;
+ if (!clk_rate) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Make sure we use the bus clock and not the 26MHz clock */
if (pc->soc->has_ck_26m_sel)
@@ -150,9 +152,9 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
}
if (clkdiv > PWM_CLK_DIV_MAX) {
- pwm_mediatek_clk_disable(chip, pwm);
dev_err(pwmchip_parent(chip), "period of %d ns not supported\n", period_ns);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
@@ -169,9 +171,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_mediatek_writel(pc, pwm->hwpwm, reg_width, cnt_period);
pwm_mediatek_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);
+out:
pwm_mediatek_clk_disable(chip, pwm);
- return 0;
+ return ret;
}
static int pwm_mediatek_enable(struct pwm_chip *chip, struct pwm_device *pwm)
Built 6.12.37 with CONFIG_MITIGATION_TSA=y on a Ryzen 5700G system.
According to [1] the minimum BIOS revision should be
ComboAM4v2PI 1.2.0.E which is installed. According to [2] the minimum
microcode revision required for that CPU is 0x0A500012. Installed is
0x0A500014. So I think the mitigating should work. But this is not the
case:
~> dmesg | grep -Ei 'ryzen|microcode'
[ 0.171006] Transient Scheduler Attacks: Vulnerable: Clear CPU
buffers attempted, no microcode
[ 0.288006] smpboot: CPU0: AMD Ryzen 7 5700G with Radeon Graphics
(family: 0x19, model: 0x50, stepping: 0x0)
[ 0.480729] microcode: Current revision: 0x0a500014
I'm wondering why the mitigation isn't working. Thanks.
Rainer
[1]
https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7029.html
[2]
https://www.amd.com/content/dam/amd/en/documents/resources/bulletin/technic…
--
The truth always turns out to be simpler than you thought.
Richard Feynman
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 505b730ede7f5c4083ff212aa955155b5b92e574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071235-ebook-definite-e54a@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 505b730ede7f5c4083ff212aa955155b5b92e574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)baylibre.com>
Date: Fri, 4 Jul 2025 19:27:27 +0200
Subject: [PATCH] pwm: mediatek: Ensure to disable clocks in error path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes: 7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.…
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek(a)kernel.org>
diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 7eaab5831499..33d3554b9197 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -130,8 +130,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
return ret;
clk_rate = clk_get_rate(pc->clk_pwms[pwm->hwpwm]);
- if (!clk_rate)
- return -EINVAL;
+ if (!clk_rate) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Make sure we use the bus clock and not the 26MHz clock */
if (pc->soc->has_ck_26m_sel)
@@ -150,9 +152,9 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
}
if (clkdiv > PWM_CLK_DIV_MAX) {
- pwm_mediatek_clk_disable(chip, pwm);
dev_err(pwmchip_parent(chip), "period of %d ns not supported\n", period_ns);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
@@ -169,9 +171,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_mediatek_writel(pc, pwm->hwpwm, reg_width, cnt_period);
pwm_mediatek_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);
+out:
pwm_mediatek_clk_disable(chip, pwm);
- return 0;
+ return ret;
}
static int pwm_mediatek_enable(struct pwm_chip *chip, struct pwm_device *pwm)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 505b730ede7f5c4083ff212aa955155b5b92e574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071234-appealing-unless-4a54@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 505b730ede7f5c4083ff212aa955155b5b92e574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)baylibre.com>
Date: Fri, 4 Jul 2025 19:27:27 +0200
Subject: [PATCH] pwm: mediatek: Ensure to disable clocks in error path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes: 7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.…
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek(a)kernel.org>
diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 7eaab5831499..33d3554b9197 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -130,8 +130,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
return ret;
clk_rate = clk_get_rate(pc->clk_pwms[pwm->hwpwm]);
- if (!clk_rate)
- return -EINVAL;
+ if (!clk_rate) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Make sure we use the bus clock and not the 26MHz clock */
if (pc->soc->has_ck_26m_sel)
@@ -150,9 +152,9 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
}
if (clkdiv > PWM_CLK_DIV_MAX) {
- pwm_mediatek_clk_disable(chip, pwm);
dev_err(pwmchip_parent(chip), "period of %d ns not supported\n", period_ns);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
@@ -169,9 +171,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_mediatek_writel(pc, pwm->hwpwm, reg_width, cnt_period);
pwm_mediatek_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);
+out:
pwm_mediatek_clk_disable(chip, pwm);
- return 0;
+ return ret;
}
static int pwm_mediatek_enable(struct pwm_chip *chip, struct pwm_device *pwm)
LX2160 SoC uses a densely packed configuration area in memory for pin
muxing - configuring a variable number of IOs at a time.
Since pinctrl nodes were added for the i2c signals of LX2160 SoC, boot
errors have been observed on SolidRun LX2162A Clearfog board when rootfs
is located on SD-Card (esdhc0):
[ 1.961035] mmc0: new ultra high speed SDR104 SDHC card at address aaaa
...
[ 5.220655] i2c i2c-1: using pinctrl states for GPIO recovery
[ 5.226425] i2c i2c-1: using generic GPIOs for recovery
...
[ 5.440471] mmc0: card aaaa removed
The card-detect and write-protect signals of esdhc0 are an alternate
function of IIC2 (in dts i2c1 - on lx2162 clearfog status disabled).
By use of u-boot "md", and linux "devmem" command it was confirmed that
RCWSR12 (at 0x01e0012c) with IIC2_PMUX (at bits 0-2) changes from
0x08000006 to 0x0000000 after starting Linux.
This means that the card-detect pin function has changed to i2c function
- which will cause the controller to detect card removal.
The respective i2c1-scl-pins node is only linked to i2c1 node that has
status disabled in device-tree for the solidrun boards.
How the memory is changed has not been investigated.
As a workaround add a new pinctrl definition for the
card-detect/write-protect function of IIC2 pins.
It seems unwise to link this directly from the SoC dtsi as boards may
rely on other functions such as flextimer.
Instead add the pinctrl to each board's esdhc0 node if it is known to
rely on native card-detect function. These boards have esdhc0 node
enabled and do not define broken-cd property:
- fsl-lx2160a-bluebox3.dts
- fsl-lx2160a-clearfog-itx.dtsi
- fsl-lx2160a-qds.dts
- fsl-lx2160a-rdb.dts
- fsl-lx2160a-tqmlx2160a-mblx2160a.dts
- fsl-lx2162a-clearfog.dts
- fsl-lx2162a-qds.dts
This was tested on the SolidRun LX2162 Clearfog with Linux v6.12.33.
Fixes: 8a1365c7bbc1 ("arm64: dts: lx2160a: add pinmux and i2c gpio to
support bus recovery")
Cc: stable(a)vger.kernel.org
Signed-off-by: Josua Mayer <josua(a)solid-run.com>
---
arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dts | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2160a-clearfog-itx.dtsi | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2160a-rdb.dts | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2160a-tqmlx2160a-mblx2160a.dts | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi | 4
++++
arch/arm64/boot/dts/freescale/fsl-lx2162a-clearfog.dts | 2 ++
arch/arm64/boot/dts/freescale/fsl-lx2162a-qds.dts | 2 ++
8 files changed, 18 insertions(+)
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dts
index 042c486bdda2..298fcdce6c6b 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-bluebox3.dts
@@ -152,6 +152,8 @@ &esdhc0 {
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git
a/arch/arm64/boot/dts/freescale/fsl-lx2160a-clearfog-itx.dtsi
b/arch/arm64/boot/dts/freescale/fsl-lx2160a-clearfog-itx.dtsi
index a7dcbecc1f41..67ffe2e1b0bc 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-clearfog-itx.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-clearfog-itx.dtsi
@@ -93,6 +93,8 @@ &esdhc0 {
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
index 4d721197d837..17fe4a8fd35e 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
@@ -214,6 +214,8 @@ &emdio2 {
};
&esdhc0 {
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a-rdb.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2160a-rdb.dts
index 0c44b3cbef77..faa486d6a5b1 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-rdb.dts
@@ -131,6 +131,8 @@ &esdhc0 {
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git
a/arch/arm64/boot/dts/freescale/fsl-lx2160a-tqmlx2160a-mblx2160a.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2160a-tqmlx2160a-mblx2160a.dts
index f6a4f8d54301..4ba55feb18b2 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-tqmlx2160a-mblx2160a.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-tqmlx2160a-mblx2160a.dts
@@ -177,6 +177,8 @@ &esdhc0 {
no-sdio;
wp-gpios = <&gpio0 30 GPIO_ACTIVE_LOW>;
cd-gpios = <&gpio0 31 GPIO_ACTIVE_LOW>;
+ pinctrl-0 = <&i2c1_scl_gpio>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
index c9541403bcd8..555a191b0bb4 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
@@ -1717,6 +1717,10 @@ i2c1_scl_gpio: i2c1-scl-gpio-pins {
pinctrl-single,bits = <0x0 0x1 0x7>;
};
+ i2c1_sdhc_cdwp: i2c1-esdhc0-cd-wp-pins {
+ pinctrl-single,bits = <0x0 0x6 0x7>;
+ };
+
i2c2_scl: i2c2-scl-pins {
pinctrl-single,bits = <0x0 0 (0x7 << 3)>;
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2162a-clearfog.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2162a-clearfog.dts
index eafef8718a0f..4b58105c3ffa 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2162a-clearfog.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2162a-clearfog.dts
@@ -227,6 +227,8 @@ &esdhc0 {
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2162a-qds.dts
b/arch/arm64/boot/dts/freescale/fsl-lx2162a-qds.dts
index 9f5ff1ffe7d5..caa079df35f6 100644
--- a/arch/arm64/boot/dts/freescale/fsl-lx2162a-qds.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-lx2162a-qds.dts
@@ -238,6 +238,8 @@ &esdhc0 {
sd-uhs-sdr50;
sd-uhs-sdr25;
sd-uhs-sdr12;
+ pinctrl-0 = <&i2c1_sdhc_cdwp>;
+ pinctrl-names = "default";
status = "okay";
};
---
base-commit: 19272b37aa4f83ca52bdf9c16d5d81bdd1354494
change-id: 20250710-lx2160-sd-cd-00bf38ae169e
Best regards,
--
Josua Mayer <josua(a)solid-run.com>
New version (unchanged for patches 1-3), with a test added so we can
detect this.
Followup of https://lore.kernel.org/linux-input/c75433e0-9b47-4072-bbe8-b1d14ea97b13@ro…
This initial series attempt at fixing the various bugs discovered by
Alan regarding __hid_request().
Syzbot managed to create a report descriptor which presents a feature
request of size 0 (still trying to extract it) and this exposed the fact
that __hid_request() was incorrectly handling the case when the report
ID is not used.
Send a first batch of fixes now so we get the feedback from syzbot ASAP.
Note: in the series, I also mentioned that the report of size 0 should
be stripped out of the HID device, but I'm not entirely sure this would
be a good idea in the end.
Signed-off-by: Benjamin Tissoires <bentiss(a)kernel.org>
---
Changes in v2:
- added Tested-by from syzbot (https://lore.kernel.org/r/686e9113.050a0220.385921.0008.GAE@google.com)
- added a python test for it
- Link to v1: https://lore.kernel.org/r/20250709-report-size-null-v1-0-194912215cbc@kerne…
---
Benjamin Tissoires (4):
HID: core: ensure the allocated report buffer can contain the reserved report ID
HID: core: ensure __hid_request reserves the report ID as the first byte
HID: core: do not bypass hid_hw_raw_request
selftests/hid: add a test case for the recent syzbot underflow
drivers/hid/hid-core.c | 19 +++++--
tools/testing/selftests/hid/tests/test_mouse.py | 70 +++++++++++++++++++++++++
2 files changed, 84 insertions(+), 5 deletions(-)
---
base-commit: 1f988d0788f50d8464f957e793fab356e2937369
change-id: 20250709-report-size-null-37619ea20288
Best regards,
--
Benjamin Tissoires <bentiss(a)kernel.org>
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 505b730ede7f5c4083ff212aa955155b5b92e574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071235-disagree-dolly-ab52@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 505b730ede7f5c4083ff212aa955155b5b92e574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)baylibre.com>
Date: Fri, 4 Jul 2025 19:27:27 +0200
Subject: [PATCH] pwm: mediatek: Ensure to disable clocks in error path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes: 7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.…
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek(a)kernel.org>
diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 7eaab5831499..33d3554b9197 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -130,8 +130,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
return ret;
clk_rate = clk_get_rate(pc->clk_pwms[pwm->hwpwm]);
- if (!clk_rate)
- return -EINVAL;
+ if (!clk_rate) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Make sure we use the bus clock and not the 26MHz clock */
if (pc->soc->has_ck_26m_sel)
@@ -150,9 +152,9 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
}
if (clkdiv > PWM_CLK_DIV_MAX) {
- pwm_mediatek_clk_disable(chip, pwm);
dev_err(pwmchip_parent(chip), "period of %d ns not supported\n", period_ns);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
@@ -169,9 +171,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_mediatek_writel(pc, pwm->hwpwm, reg_width, cnt_period);
pwm_mediatek_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);
+out:
pwm_mediatek_clk_disable(chip, pwm);
- return 0;
+ return ret;
}
static int pwm_mediatek_enable(struct pwm_chip *chip, struct pwm_device *pwm)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 505b730ede7f5c4083ff212aa955155b5b92e574
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071236-propeller-quality-54b9@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 505b730ede7f5c4083ff212aa955155b5b92e574 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Uwe=20Kleine-K=C3=B6nig?= <u.kleine-koenig(a)baylibre.com>
Date: Fri, 4 Jul 2025 19:27:27 +0200
Subject: [PATCH] pwm: mediatek: Ensure to disable clocks in error path
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
After enabling the clocks each error path must disable the clocks again.
One of them failed to do so. Unify the error paths to use goto to make it
harder for future changes to add a similar bug.
Fixes: 7ca59947b5fc ("pwm: mediatek: Prevent divide-by-zero in pwm_mediatek_config()")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig(a)baylibre.com>
Link: https://lore.kernel.org/r/20250704172728.626815-2-u.kleine-koenig@baylibre.…
Cc: stable(a)vger.kernel.org
Signed-off-by: Uwe Kleine-König <ukleinek(a)kernel.org>
diff --git a/drivers/pwm/pwm-mediatek.c b/drivers/pwm/pwm-mediatek.c
index 7eaab5831499..33d3554b9197 100644
--- a/drivers/pwm/pwm-mediatek.c
+++ b/drivers/pwm/pwm-mediatek.c
@@ -130,8 +130,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
return ret;
clk_rate = clk_get_rate(pc->clk_pwms[pwm->hwpwm]);
- if (!clk_rate)
- return -EINVAL;
+ if (!clk_rate) {
+ ret = -EINVAL;
+ goto out;
+ }
/* Make sure we use the bus clock and not the 26MHz clock */
if (pc->soc->has_ck_26m_sel)
@@ -150,9 +152,9 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
}
if (clkdiv > PWM_CLK_DIV_MAX) {
- pwm_mediatek_clk_disable(chip, pwm);
dev_err(pwmchip_parent(chip), "period of %d ns not supported\n", period_ns);
- return -EINVAL;
+ ret = -EINVAL;
+ goto out;
}
if (pc->soc->pwm45_fixup && pwm->hwpwm > 2) {
@@ -169,9 +171,10 @@ static int pwm_mediatek_config(struct pwm_chip *chip, struct pwm_device *pwm,
pwm_mediatek_writel(pc, pwm->hwpwm, reg_width, cnt_period);
pwm_mediatek_writel(pc, pwm->hwpwm, reg_thres, cnt_duty);
+out:
pwm_mediatek_clk_disable(chip, pwm);
- return 0;
+ return ret;
}
static int pwm_mediatek_enable(struct pwm_chip *chip, struct pwm_device *pwm)
From: Steffen Bätz <steffen(a)innosonix.de>
The commit "13bcd440f2ff nvmem: core: verify cell's raw_len" caused an
extension of the "mac-address" cell from 6 to 8 bytes due to word_size
of 4 bytes. This led to a required byte swap of the full buffer length,
which caused truncation of the mac-address when read.
Previously, the mac-address was incorrectly truncated from
70:B3:D5:14:E9:0E to 00:00:70:B3:D5:14.
Fix the issue by swapping only the first 6 bytes to correctly pass the
mac-address to the upper layers.
Fixes: 13bcd440f2ff ("nvmem: core: verify cell's raw_len")
Cc: stable(a)vger.kernel.org
Signed-off-by: Steffen Bätz <steffen(a)innosonix.de>
Tested-by: Alexander Stein <alexander.stein(a)ew.tq-group.com>
Signed-off-by: Srinivas Kandagatla <srini(a)kernel.org>
---
drivers/nvmem/imx-ocotp-ele.c | 5 ++++-
drivers/nvmem/imx-ocotp.c | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/nvmem/imx-ocotp-ele.c b/drivers/nvmem/imx-ocotp-ele.c
index ca6dd71d8a2e..7807ec0e2d18 100644
--- a/drivers/nvmem/imx-ocotp-ele.c
+++ b/drivers/nvmem/imx-ocotp-ele.c
@@ -12,6 +12,7 @@
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
+#include <linux/if_ether.h> /* ETH_ALEN */
enum fuse_type {
FUSE_FSB = BIT(0),
@@ -118,9 +119,11 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
int i;
/* Deal with some post processing of nvmem cell data */
- if (id && !strcmp(id, "mac-address"))
+ if (id && !strcmp(id, "mac-address")) {
+ bytes = min(bytes, ETH_ALEN);
for (i = 0; i < bytes / 2; i++)
swap(buf[i], buf[bytes - i - 1]);
+ }
return 0;
}
diff --git a/drivers/nvmem/imx-ocotp.c b/drivers/nvmem/imx-ocotp.c
index 79dd4fda0329..7bf7656d4f96 100644
--- a/drivers/nvmem/imx-ocotp.c
+++ b/drivers/nvmem/imx-ocotp.c
@@ -23,6 +23,7 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/delay.h>
+#include <linux/if_ether.h> /* ETH_ALEN */
#define IMX_OCOTP_OFFSET_B0W0 0x400 /* Offset from base address of the
* OTP Bank0 Word0
@@ -227,9 +228,11 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
int i;
/* Deal with some post processing of nvmem cell data */
- if (id && !strcmp(id, "mac-address"))
+ if (id && !strcmp(id, "mac-address")) {
+ bytes = min(bytes, ETH_ALEN);
for (i = 0; i < bytes / 2; i++)
swap(buf[i], buf[bytes - i - 1]);
+ }
return 0;
}
--
2.43.0
Hi stable maintainers,
Please apply commit d1e420772cd1 ("x86/pkeys: Simplify PKRU update in
signal frame") to the stable branches for 6.12 and later.
This fixes a regression introduced in 6.13 by commit ae6012d72fa6
("x86/pkeys: Ensure updated PKRU value is XRSTOR'd"), which was also
backported in 6.12.5.
Ben.
--
Ben Hutchings
73.46% of all statistics are made up.
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x fa787ac07b3ceb56dd88a62d1866038498e96230
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071240-smugly-detergent-14e4@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From fa787ac07b3ceb56dd88a62d1866038498e96230 Mon Sep 17 00:00:00 2001
From: Manuel Andreas <manuel.andreas(a)tum.de>
Date: Wed, 25 Jun 2025 15:53:19 +0200
Subject: [PATCH] KVM: x86/hyper-v: Skip non-canonical addresses during PV TLB
flush
In KVM guests with Hyper-V hypercalls enabled, the hypercalls
HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST and HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX
allow a guest to request invalidation of portions of a virtual TLB.
For this, the hypercall parameter includes a list of GVAs that are supposed
to be invalidated.
However, when non-canonical GVAs are passed, there is currently no
filtering in place and they are eventually passed to checked invocations of
INVVPID on Intel / INVLPGA on AMD. While AMD's INVLPGA silently ignores
non-canonical addresses (effectively a no-op), Intel's INVVPID explicitly
signals VM-Fail and ultimately triggers the WARN_ONCE in invvpid_error():
invvpid failed: ext=0x0 vpid=1 gva=0xaaaaaaaaaaaaa000
WARNING: CPU: 6 PID: 326 at arch/x86/kvm/vmx/vmx.c:482
invvpid_error+0x91/0xa0 [kvm_intel]
Modules linked in: kvm_intel kvm 9pnet_virtio irqbypass fuse
CPU: 6 UID: 0 PID: 326 Comm: kvm-vm Not tainted 6.15.0 #14 PREEMPT(voluntary)
RIP: 0010:invvpid_error+0x91/0xa0 [kvm_intel]
Call Trace:
vmx_flush_tlb_gva+0x320/0x490 [kvm_intel]
kvm_hv_vcpu_flush_tlb+0x24f/0x4f0 [kvm]
kvm_arch_vcpu_ioctl_run+0x3013/0x5810 [kvm]
Hyper-V documents that invalid GVAs (those that are beyond a partition's
GVA space) are to be ignored. While not completely clear whether this
ruling also applies to non-canonical GVAs, it is likely fine to make that
assumption, and manual testing on Azure confirms "real" Hyper-V interprets
the specification in the same way.
Skip non-canonical GVAs when processing the list of address to avoid
tripping the INVVPID failure. Alternatively, KVM could filter out "bad"
GVAs before inserting into the FIFO, but practically speaking the only
downside of pushing validation to the final processing is that doing so
is suboptimal for the guest, and no well-behaved guest will request TLB
flushes for non-canonical addresses.
Fixes: 260970862c88 ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently")
Cc: stable(a)vger.kernel.org
Signed-off-by: Manuel Andreas <manuel.andreas(a)tum.de>
Suggested-by: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Link: https://lore.kernel.org/r/c090efb3-ef82-499f-a5e0-360fc8420fb7@tum.de
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 75221a11e15e..ee27064dd72f 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1979,6 +1979,9 @@ int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
if (entries[i] == KVM_HV_TLB_FLUSHALL_ENTRY)
goto out_flush_all;
+ if (is_noncanonical_invlpg_address(entries[i], vcpu))
+ continue;
+
/*
* Lower 12 bits of 'address' encode the number of additional
* pages to flush.
After 6.12.36, amdgpu on a picasso apu prints an error:
"get invalid ip discovery binary signature".
Both with and without picasso_ip_discovery.bin from
linux-firmware 20250708.
The error is avoided by backporting more ip discovery patches.
25f602fbbcc8 drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asics
2f6dd741cdcd drm/amdgpu/ip_discovery: add missing ip_discovery fw
second is a fix for the first
The netfs copy-to-cache that is used by Ceph with local caching sets up a
new request to write data just read to the cache. The request is started
and then left to look after itself whilst the app continues. The request
gets notified by the backing fs upon completion of the async DIO write, but
then tries to wake up the app because NETFS_RREQ_OFFLOAD_COLLECTION isn't
set - but the app isn't waiting there, and so the request just hangs.
Fix this by setting NETFS_RREQ_OFFLOAD_COLLECTION which causes the
notification from the backing filesystem to put the collection onto a work
queue instead.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Max Kellermann <max.kellermann(a)ionos.com>
Link: https://lore.kernel.org/r/CAKPOu+8z_ijTLHdiCYGU_Uk7yYD=shxyGLwfe-L7AV3DhebS…
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Paulo Alcantara <pc(a)manguebit.org>
cc: Viacheslav Dubeyko <slava(a)dubeyko.com>
cc: Alex Markuze <amarkuze(a)redhat.com>
cc: Ilya Dryomov <idryomov(a)gmail.com>
cc: netfs(a)lists.linux.dev
cc: ceph-devel(a)vger.kernel.org
cc: linux-fsdevel(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
fs/netfs/read_pgpriv2.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 5bbe906a551d..080d2a6a51d9 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -110,6 +110,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
if (!creq->io_streams[1].avail)
goto cancel_put;
+ __set_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &creq->flags);
trace_netfs_write(creq, netfs_write_trace_copy_to_cache);
netfs_stat(&netfs_n_wh_copy_to_cache);
rreq->copy_to_cache = creq;
From: Ge Yang <yangge1116(a)126.com>
Since commit d228814b1913 ("efi/libstub: Add get_event_log() support
for CC platforms") reuses TPM2 support code for the CC platforms, when
launching a TDX virtual machine with coco measurement enabled, the
following error log is generated:
[Firmware Bug]: Failed to parse event in TPM Final Events Log
Call Trace:
efi_config_parse_tables()
efi_tpm_eventlog_init()
tpm2_calc_event_log_size()
__calc_tpm2_event_size()
The pcr_idx value in the Intel TDX log header is 1, causing the function
__calc_tpm2_event_size() to fail to recognize the log header, ultimately
leading to the "Failed to parse event in TPM Final Events Log" error.
According to UEFI Specification 2.10, Section 38.4.1: For TDX, TPM PCR
0 maps to MRTD, so the log header uses TPM PCR 1 instead. To successfully
parse the TDX event log header, the check for a pcr_idx value of 0 must
be skipped. There's no danger of this causing problems because we check
for the TCG_SPECID_SIG signature as the next thing.
Link: https://uefi.org/specs/UEFI/2.10/38_Confidential_Computing.html#intel-trust…
Fixes: d228814b1913 ("efi/libstub: Add get_event_log() support for CC platforms")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Cc: stable(a)vger.kernel.org
---
V5:
- remove the pcr_index check without adding any replacement checks suggested by James and Sathyanarayanan
V4:
- remove cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT) suggested by Ard
V3:
- fix build error
V2:
- limit the fix for CC only suggested by Jarkko and Sathyanarayanan
include/linux/tpm_eventlog.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
index 891368e..05c0ae5 100644
--- a/include/linux/tpm_eventlog.h
+++ b/include/linux/tpm_eventlog.h
@@ -202,8 +202,7 @@ static __always_inline u32 __calc_tpm2_event_size(struct tcg_pcr_event2_head *ev
event_type = event->event_type;
/* Verify that it's the log header */
- if (event_header->pcr_idx != 0 ||
- event_header->event_type != NO_ACTION ||
+ if (event_header->event_type != NO_ACTION ||
memcmp(event_header->digest, zero_digest, sizeof(zero_digest))) {
size = 0;
goto out;
--
2.7.4
The commit "13bcd440f2ff nvmem: core: verify cell's raw_len" caused an
extension of the "mac-address" cell from 6 to 8 bytes due to word_size
of 4 bytes. This led to a required byte swap of the full buffer length,
which caused truncation of the mac-address when read.
Previously, the mac-address was incorrectly truncated from
70:B3:D5:14:E9:0E to 00:00:70:B3:D5:14.
Fix the issue by swapping only the first 6 bytes to correctly pass the
mac-address to the upper layers.
Fixes: 13bcd440f2ff ("nvmem: core: verify cell's raw_len")
Cc: stable(a)vger.kernel.org
Signed-off-by: Steffen Bätz <steffen(a)innosonix.de>
Tested-by: Alexander Stein <alexander.stein(a)ew.tq-group.com>
---
v4:
- Adopted the commit message wording recommended by
Frank.li(a)nxp.com to improve clarity.
- Simplified byte count determination by using min() instead of an
explicit conditional statement.
- Kept the Tested-by tag from the previous patch as the change
remains trivial but tested.
v3:
- replace magic number 6 with ETH_ALEN
- Fix misleading indentation and properly group 'mac-address' statements
v2:
- Add Cc: stable(a)vger.kernel.org as requested by Greg KH's patch bot
drivers/nvmem/imx-ocotp-ele.c | 5 ++++-
drivers/nvmem/imx-ocotp.c | 5 ++++-
2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/nvmem/imx-ocotp-ele.c b/drivers/nvmem/imx-ocotp-ele.c
index ca6dd71d8a2e..7807ec0e2d18 100644
--- a/drivers/nvmem/imx-ocotp-ele.c
+++ b/drivers/nvmem/imx-ocotp-ele.c
@@ -12,6 +12,7 @@
#include <linux/of.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
+#include <linux/if_ether.h> /* ETH_ALEN */
enum fuse_type {
FUSE_FSB = BIT(0),
@@ -118,9 +119,11 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
int i;
/* Deal with some post processing of nvmem cell data */
- if (id && !strcmp(id, "mac-address"))
+ if (id && !strcmp(id, "mac-address")) {
+ bytes = min(bytes, ETH_ALEN);
for (i = 0; i < bytes / 2; i++)
swap(buf[i], buf[bytes - i - 1]);
+ }
return 0;
}
diff --git a/drivers/nvmem/imx-ocotp.c b/drivers/nvmem/imx-ocotp.c
index 79dd4fda0329..7bf7656d4f96 100644
--- a/drivers/nvmem/imx-ocotp.c
+++ b/drivers/nvmem/imx-ocotp.c
@@ -23,6 +23,7 @@
#include <linux/platform_device.h>
#include <linux/slab.h>
#include <linux/delay.h>
+#include <linux/if_ether.h> /* ETH_ALEN */
#define IMX_OCOTP_OFFSET_B0W0 0x400 /* Offset from base address of the
* OTP Bank0 Word0
@@ -227,9 +228,11 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
int i;
/* Deal with some post processing of nvmem cell data */
- if (id && !strcmp(id, "mac-address"))
+ if (id && !strcmp(id, "mac-address")) {
+ bytes = min(bytes, ETH_ALEN);
for (i = 0; i < bytes / 2; i++)
swap(buf[i], buf[bytes - i - 1]);
+ }
return 0;
}
--
2.43.0
--
*innosonix GmbH*
Hauptstr. 35
96482 Ahorn
central: +49 9561 7459980
www.innosonix.de <http://www.innosonix.de>
innosonix GmbH
Geschäftsführer:
Markus Bätz, Steffen Bätz
USt.-IdNr / VAT-Nr.: DE266020313
EORI-Nr.:
DE240121536680271
HRB 5192 Coburg
WEEE-Reg.-Nr. DE88021242
When netfslib is issuing subrequests, the subrequests start processing
immediately and may complete before we reach the end of the issuing
function. At the end of the issuing function we set NETFS_RREQ_ALL_QUEUED
to indicate to the collector that we aren't going to issue any more subreqs
and that it can do the final notifications and cleanup.
Now, this isn't a problem if the request is synchronous
(NETFS_RREQ_OFFLOAD_COLLECTION is unset) as the result collection will be
done in-thread and we're guaranteed an opportunity to run the collector.
However, if the request is asynchronous, collection is primarily triggered
by the termination of subrequests queuing it on a workqueue. Now, a race
can occur here if the app thread sets ALL_QUEUED after the last subrequest
terminates.
This can happen most easily with the copy2cache code (as used by Ceph)
where, in the collection routine of a read request, an asynchronous write
request is spawned to copy data to the cache. Folios are added to the
write request as they're unlocked, but there may be a delay before
ALL_QUEUED is set as the write subrequests may complete before we get
there.
If all the write subreqs have finished by the ALL_QUEUED point, no further
events happen and the collection never happens, leaving the request
hanging.
Fix this by queuing the collector after setting ALL_QUEUED. This is a bit
heavy-handed and it may be sufficient to do it only if there are no extant
subreqs.
Also add a tracepoint to cross-reference both requests in a copy-to-request
operation and add a trace to the netfs_rreq tracepoint to indicate the
setting of ALL_QUEUED.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Max Kellermann <max.kellermann(a)ionos.com>
Link: https://lore.kernel.org/r/CAKPOu+8z_ijTLHdiCYGU_Uk7yYD=shxyGLwfe-L7AV3DhebS…
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: Paulo Alcantara <pc(a)manguebit.org>
cc: Viacheslav Dubeyko <slava(a)dubeyko.com>
cc: Alex Markuze <amarkuze(a)redhat.com>
cc: Ilya Dryomov <idryomov(a)gmail.com>
cc: netfs(a)lists.linux.dev
cc: ceph-devel(a)vger.kernel.org
cc: linux-fsdevel(a)vger.kernel.org
cc: stable(a)vger.kernel.org
---
fs/netfs/read_pgpriv2.c | 4 ++++
include/trace/events/netfs.h | 30 ++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+)
diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 080d2a6a51d9..8097bc069c1d 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -111,6 +111,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
goto cancel_put;
__set_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &creq->flags);
+ trace_netfs_copy2cache(rreq, creq);
trace_netfs_write(creq, netfs_write_trace_copy_to_cache);
netfs_stat(&netfs_n_wh_copy_to_cache);
rreq->copy_to_cache = creq;
@@ -155,6 +156,9 @@ void netfs_pgpriv2_end_copy_to_cache(struct netfs_io_request *rreq)
netfs_issue_write(creq, &creq->io_streams[1]);
smp_wmb(); /* Write lists before ALL_QUEUED. */
set_bit(NETFS_RREQ_ALL_QUEUED, &creq->flags);
+ trace_netfs_rreq(rreq, netfs_rreq_trace_end_copy_to_cache);
+ if (list_empty_careful(&creq->io_streams[1].subrequests))
+ netfs_wake_collector(creq);
netfs_put_request(creq, netfs_rreq_trace_put_return);
creq->copy_to_cache = NULL;
diff --git a/include/trace/events/netfs.h b/include/trace/events/netfs.h
index 73e96ccbe830..64a382fbc31a 100644
--- a/include/trace/events/netfs.h
+++ b/include/trace/events/netfs.h
@@ -55,6 +55,7 @@
EM(netfs_rreq_trace_copy, "COPY ") \
EM(netfs_rreq_trace_dirty, "DIRTY ") \
EM(netfs_rreq_trace_done, "DONE ") \
+ EM(netfs_rreq_trace_end_copy_to_cache, "END-C2C") \
EM(netfs_rreq_trace_free, "FREE ") \
EM(netfs_rreq_trace_ki_complete, "KI-CMPL") \
EM(netfs_rreq_trace_recollect, "RECLLCT") \
@@ -559,6 +560,35 @@ TRACE_EVENT(netfs_write,
__entry->start, __entry->start + __entry->len - 1)
);
+TRACE_EVENT(netfs_copy2cache,
+ TP_PROTO(const struct netfs_io_request *rreq,
+ const struct netfs_io_request *creq),
+
+ TP_ARGS(rreq, creq),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, rreq)
+ __field(unsigned int, creq)
+ __field(unsigned int, cookie)
+ __field(unsigned int, ino)
+ ),
+
+ TP_fast_assign(
+ struct netfs_inode *__ctx = netfs_inode(rreq->inode);
+ struct fscache_cookie *__cookie = netfs_i_cookie(__ctx);
+ __entry->rreq = rreq->debug_id;
+ __entry->creq = creq->debug_id;
+ __entry->cookie = __cookie ? __cookie->debug_id : 0;
+ __entry->ino = rreq->inode->i_ino;
+ ),
+
+ TP_printk("R=%08x CR=%08x c=%08x i=%x ",
+ __entry->rreq,
+ __entry->creq,
+ __entry->cookie,
+ __entry->ino)
+ );
+
TRACE_EVENT(netfs_collect,
TP_PROTO(const struct netfs_io_request *wreq),
From: Ge Yang <yangge1116(a)126.com>
Since commit d228814b1913 ("efi/libstub: Add get_event_log() support
for CC platforms") reuses TPM2 support code for the CC platforms, when
launching a TDX virtual machine with coco measurement enabled, the
following error log is generated:
[Firmware Bug]: Failed to parse event in TPM Final Events Log
Call Trace:
efi_config_parse_tables()
efi_tpm_eventlog_init()
tpm2_calc_event_log_size()
__calc_tpm2_event_size()
The pcr_idx value in the Intel TDX log header is 1, causing the function
__calc_tpm2_event_size() to fail to recognize the log header, ultimately
leading to the "Failed to parse event in TPM Final Events Log" error.
According to UEFI Specification 2.10, Section 38.4.1: For TDX, TPM PCR
0 maps to MRTD, so the log header uses TPM PCR 1 instead. To successfully
parse the TDX event log header, the check for a pcr_idx value of 0
must be skipped.
According to Table 6 in Section 10.2.1 of the TCG PC Client
Specification, the index field does not require the PCR index to be
fixed at zero. Therefore, skipping the check for a pcr_idx value of
0 for CC platforms is safe.
Link: https://uefi.org/specs/UEFI/2.10/38_Confidential_Computing.html#intel-trust…
Link: https://trustedcomputinggroup.org/wp-content/uploads/TCG_PCClient_PFP_r1p05…
Fixes: d228814b1913 ("efi/libstub: Add get_event_log() support for CC platforms")
Signed-off-by: Ge Yang <yangge1116(a)126.com>
Cc: stable(a)vger.kernel.org
---
V3:
- fix build error
V2:
- limit the fix for CC only suggested by Jarkko and Sathyanarayanan
drivers/char/tpm/eventlog/tpm2.c | 4 +++-
drivers/firmware/efi/libstub/tpm.c | 13 +++++++++----
drivers/firmware/efi/tpm.c | 4 +++-
include/linux/tpm_eventlog.h | 14 +++++++++++---
4 files changed, 26 insertions(+), 9 deletions(-)
diff --git a/drivers/char/tpm/eventlog/tpm2.c b/drivers/char/tpm/eventlog/tpm2.c
index 37a0580..30ef47c 100644
--- a/drivers/char/tpm/eventlog/tpm2.c
+++ b/drivers/char/tpm/eventlog/tpm2.c
@@ -18,6 +18,7 @@
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/tpm_eventlog.h>
+#include <linux/cc_platform.h>
#include "../tpm.h"
#include "common.h"
@@ -36,7 +37,8 @@
static size_t calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
struct tcg_pcr_event *event_header)
{
- return __calc_tpm2_event_size(event, event_header, false);
+ return __calc_tpm2_event_size(event, event_header, false,
+ cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT));
}
static void *tpm2_bios_measurements_start(struct seq_file *m, loff_t *pos)
diff --git a/drivers/firmware/efi/libstub/tpm.c b/drivers/firmware/efi/libstub/tpm.c
index a5c6c4f..9728060 100644
--- a/drivers/firmware/efi/libstub/tpm.c
+++ b/drivers/firmware/efi/libstub/tpm.c
@@ -50,7 +50,8 @@ void efi_enable_reset_attack_mitigation(void)
static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_location,
efi_physical_addr_t log_last_entry,
efi_bool_t truncated,
- struct efi_tcg2_final_events_table *final_events_table)
+ struct efi_tcg2_final_events_table *final_events_table,
+ bool is_cc_event)
{
efi_guid_t linux_eventlog_guid = LINUX_EFI_TPM_EVENT_LOG_GUID;
efi_status_t status;
@@ -87,7 +88,8 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
last_entry_size =
__calc_tpm2_event_size((void *)last_entry_addr,
(void *)(long)log_location,
- false);
+ false,
+ is_cc_event);
} else {
last_entry_size = sizeof(struct tcpa_event) +
((struct tcpa_event *) last_entry_addr)->event_size;
@@ -123,7 +125,8 @@ static void efi_retrieve_tcg2_eventlog(int version, efi_physical_addr_t log_loca
header = data + offset + final_events_size;
event_size = __calc_tpm2_event_size(header,
(void *)(long)log_location,
- false);
+ false,
+ is_cc_event);
/* If calc fails this is a malformed log */
if (!event_size)
break;
@@ -157,6 +160,7 @@ void efi_retrieve_eventlog(void)
efi_tcg2_protocol_t *tpm2 = NULL;
efi_bool_t truncated;
efi_status_t status;
+ bool is_cc_event = false;
status = efi_bs_call(locate_protocol, &tpm2_guid, NULL, (void **)&tpm2);
if (status == EFI_SUCCESS) {
@@ -186,11 +190,12 @@ void efi_retrieve_eventlog(void)
final_events_table =
get_efi_config_table(EFI_CC_FINAL_EVENTS_TABLE_GUID);
+ is_cc_event = true;
}
if (status != EFI_SUCCESS || !log_location)
return;
efi_retrieve_tcg2_eventlog(version, log_location, log_last_entry,
- truncated, final_events_table);
+ truncated, final_events_table, is_cc_event);
}
diff --git a/drivers/firmware/efi/tpm.c b/drivers/firmware/efi/tpm.c
index cdd4310..ca8535d 100644
--- a/drivers/firmware/efi/tpm.c
+++ b/drivers/firmware/efi/tpm.c
@@ -12,6 +12,7 @@
#include <linux/init.h>
#include <linux/memblock.h>
#include <linux/tpm_eventlog.h>
+#include <linux/cc_platform.h>
int efi_tpm_final_log_size;
EXPORT_SYMBOL(efi_tpm_final_log_size);
@@ -23,7 +24,8 @@ static int __init tpm2_calc_event_log_size(void *data, int count, void *size_inf
while (count > 0) {
header = data + size;
- event_size = __calc_tpm2_event_size(header, size_info, true);
+ event_size = __calc_tpm2_event_size(header, size_info, true,
+ cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT));
if (event_size == 0)
return -1;
size += event_size;
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
index 891368e..b3380c9 100644
--- a/include/linux/tpm_eventlog.h
+++ b/include/linux/tpm_eventlog.h
@@ -143,6 +143,7 @@ struct tcg_algorithm_info {
* @event: Pointer to the event whose size should be calculated
* @event_header: Pointer to the initial event containing the digest lengths
* @do_mapping: Whether or not the event needs to be mapped
+ * @is_cc_event: Whether or not the event is from a CC platform
*
* The TPM2 event log format can contain multiple digests corresponding to
* separate PCR banks, and also contains a variable length of the data that
@@ -159,7 +160,8 @@ struct tcg_algorithm_info {
static __always_inline u32 __calc_tpm2_event_size(struct tcg_pcr_event2_head *event,
struct tcg_pcr_event *event_header,
- bool do_mapping)
+ bool do_mapping,
+ bool is_cc_event)
{
struct tcg_efi_specid_event_head *efispecid;
struct tcg_event_field *event_field;
@@ -201,8 +203,14 @@ static __always_inline u32 __calc_tpm2_event_size(struct tcg_pcr_event2_head *ev
count = event->count;
event_type = event->event_type;
- /* Verify that it's the log header */
- if (event_header->pcr_idx != 0 ||
+ /*
+ * Verify that it's the log header. According to the TCG PC Client
+ * Specification, when identifying a log header, the check for a
+ * pcr_idx value of 0 is not required. For CC platforms, skipping
+ * this check during log header is necessary; otherwise, the CC
+ * platform's log header may fail to be recognized.
+ */
+ if ((!is_cc_event && event_header->pcr_idx != 0) ||
event_header->event_type != NO_ACTION ||
memcmp(event_header->digest, zero_digest, sizeof(zero_digest))) {
size = 0;
--
2.7.4
Since commit 55d4980ce55b ("nvmem: core: support specifying both: cell
raw data & post read lengths"), the aligned length (e.g. '8' instead of
'6') is passed to the read_post_process callback. This causes that the 2
bytes following the MAC address in the ocotp are swapped to the
beginning of the address. As a result, an invalid MAC address is
returned and to make it even worse, this address can be equal on boards
with the same OUI vendor prefix.
Fixes: 55d4980ce55b ("nvmem: core: support specifying both: cell raw data & post read lengths")
Signed-off-by: Christian Eggers <ceggers(a)arri.de>
Cc: stable(a)vger.kernel.org
---
Tested on i.MX6ULL, but I assume that this is also required for.
imx-ocotp-ele.c (i.MX93).
drivers/nvmem/imx-ocotp-ele.c | 5 +++--
drivers/nvmem/imx-ocotp.c | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/nvmem/imx-ocotp-ele.c b/drivers/nvmem/imx-ocotp-ele.c
index 83617665c8d7..07830785ebf1 100644
--- a/drivers/nvmem/imx-ocotp-ele.c
+++ b/drivers/nvmem/imx-ocotp-ele.c
@@ -6,6 +6,7 @@
*/
#include <linux/device.h>
+#include <linux/if_ether.h>
#include <linux/io.h>
#include <linux/module.h>
#include <linux/nvmem-provider.h>
@@ -119,8 +120,8 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
/* Deal with some post processing of nvmem cell data */
if (id && !strcmp(id, "mac-address"))
- for (i = 0; i < bytes / 2; i++)
- swap(buf[i], buf[bytes - i - 1]);
+ for (i = 0; i < ETH_ALEN / 2; i++)
+ swap(buf[i], buf[ETH_ALEN - i - 1]);
return 0;
}
diff --git a/drivers/nvmem/imx-ocotp.c b/drivers/nvmem/imx-ocotp.c
index 22cc77908018..4dd3b0f94de2 100644
--- a/drivers/nvmem/imx-ocotp.c
+++ b/drivers/nvmem/imx-ocotp.c
@@ -16,6 +16,7 @@
#include <linux/clk.h>
#include <linux/device.h>
+#include <linux/if_ether.h>
#include <linux/io.h>
#include <linux/module.h>
#include <linux/nvmem-provider.h>
@@ -228,8 +229,8 @@ static int imx_ocotp_cell_pp(void *context, const char *id, int index,
/* Deal with some post processing of nvmem cell data */
if (id && !strcmp(id, "mac-address"))
- for (i = 0; i < bytes / 2; i++)
- swap(buf[i], buf[bytes - i - 1]);
+ for (i = 0; i < ETH_ALEN / 2; i++)
+ swap(buf[i], buf[ETH_ALEN - i - 1]);
return 0;
}
--
2.43.0
Verify that the inode mode is sane when loading it from the disk to
avoid complaints from VFS about setting up invalid inodes.
Reported-by: syzbot+895c23f6917da440ed0d(a)syzkaller.appspotmail.com
CC: stable(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
---
fs/isofs/inode.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
I plan to merge this fix through my tree.
diff --git a/fs/isofs/inode.c b/fs/isofs/inode.c
index d5da9817df9b..33e6a620c103 100644
--- a/fs/isofs/inode.c
+++ b/fs/isofs/inode.c
@@ -1440,9 +1440,16 @@ static int isofs_read_inode(struct inode *inode, int relocated)
inode->i_op = &page_symlink_inode_operations;
inode_nohighmem(inode);
inode->i_data.a_ops = &isofs_symlink_aops;
- } else
+ } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
+ S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
/* XXX - parse_rock_ridge_inode() had already set i_rdev. */
init_special_inode(inode, inode->i_mode, inode->i_rdev);
+ } else {
+ printk(KERN_DEBUG "ISOFS: Invalid file type 0%04o for inode %lu.\n",
+ inode->i_mode, inode->i_ino);
+ ret = -EIO;
+ goto fail;
+ }
ret = 0;
out:
--
2.43.0
The _CRS resources in many cases want to have ResourceSource field
to be a type of ACPI String. This means that to compile properly
we need to enclosure the name path into double quotes. This will
in practice defer the interpretation to a run-time stage, However,
this may be interpreted differently on different OSes and ACPI
interpreter implementations. In particular ACPICA might not correctly
recognize the leading '^' (caret) character and will not resolve
the relative name path properly. On top of that, this piece may be
used in SSDTs which are loaded after the DSDT and on itself may also
not resolve relative name paths outside of their own scopes.
With this all said, fix documentation to use fully-qualified name
paths always to avoid any misinterpretations, which is proven to
work.
Fixes: 8eb5c87a92c0 ("i2c: add ACPI support for I2C mux ports")
Reported-by: Yevhen Kondrashyn <e.kondrashyn(a)gmail.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
---
Rafael, I prefer, if no objections, to push this as v6.16-rc6 material since
the reported issue was detected on old (v5.10.y) and still LTS kernel. Would be
nice for people to not trap to it in older kernels.
Documentation/firmware-guide/acpi/i2c-muxes.rst | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/Documentation/firmware-guide/acpi/i2c-muxes.rst b/Documentation/firmware-guide/acpi/i2c-muxes.rst
index 3a8997ccd7c4..f366539acd79 100644
--- a/Documentation/firmware-guide/acpi/i2c-muxes.rst
+++ b/Documentation/firmware-guide/acpi/i2c-muxes.rst
@@ -14,7 +14,7 @@ Consider this topology::
| | | 0x70 |--CH01--> i2c client B (0x50)
+------+ +------+
-which corresponds to the following ASL::
+which corresponds to the following ASL (in the scope of \_SB)::
Device (SMB1)
{
@@ -24,7 +24,7 @@ which corresponds to the following ASL::
Name (_HID, ...)
Name (_CRS, ResourceTemplate () {
I2cSerialBus (0x70, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^SMB1", 0x00,
+ AddressingMode7Bit, "\\_SB.SMB1", 0x00,
ResourceConsumer,,)
}
@@ -37,7 +37,7 @@ which corresponds to the following ASL::
Name (_HID, ...)
Name (_CRS, ResourceTemplate () {
I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^CH00", 0x00,
+ AddressingMode7Bit, "\\_SB.SMB1.CH00", 0x00,
ResourceConsumer,,)
}
}
@@ -52,7 +52,7 @@ which corresponds to the following ASL::
Name (_HID, ...)
Name (_CRS, ResourceTemplate () {
I2cSerialBus (0x50, ControllerInitiated, I2C_SPEED,
- AddressingMode7Bit, "^CH01", 0x00,
+ AddressingMode7Bit, "\\_SB.SMB1.CH01", 0x00,
ResourceConsumer,,)
}
}
--
2.47.2
The IMX8M reference manuals indicate in the USDHC Clock generator section
that the clock rate for DDR is 1/2 the input clock therefore HS400 rates
clocked at 200Mhz require a 400Mhz SDHC clock.
This showed about a 1.5x improvement in read performance for the eMMC's
used on the various imx8m{m,n,p}-venice boards.
Fixes: 6f30b27c5ef5 ("arm64: dts: imx8mm: Add Gateworks i.MX 8M Mini Development Kits")
Signed-off-by: Tim Harvey <tharvey(a)gateworks.com>
---
arch/arm64/boot/dts/freescale/imx8mm-venice-gw700x.dtsi | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw700x.dtsi b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw700x.dtsi
index 5a3b1142ddf4..37db4f0dd505 100644
--- a/arch/arm64/boot/dts/freescale/imx8mm-venice-gw700x.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx8mm-venice-gw700x.dtsi
@@ -418,6 +418,8 @@ &usdhc3 {
pinctrl-0 = <&pinctrl_usdhc3>;
pinctrl-1 = <&pinctrl_usdhc3_100mhz>;
pinctrl-2 = <&pinctrl_usdhc3_200mhz>;
+ assigned-clocks = <&clk IMX8MM_CLK_USDHC3>;
+ assigned-clock-rates = <400000000>;
bus-width = <8>;
non-removable;
status = "okay";
--
2.25.1
inline data handling has a race between writing and writing to a memory
map.
When ext4_page_mkwrite is called, it calls ext4_convert_inline_data, which
destroys the inline data, but if block allocation fails, restores the
inline data. In that process, we could have:
CPU1 CPU2
destroy_inline_data
write_begin (does not see inline data)
restory_inline_data
write_end (sees inline data)
This leads to bugs like the one below, as write_begin did not prepare for
the case of inline data, which is expected by the write_end side of it.
------------[ cut here ]------------
kernel BUG at fs/ext4/inline.c:235!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 5838 Comm: syz-executor110 Not tainted 6.13.0-rc3-syzkaller-00209-g499551201b5f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:ext4_write_inline_data fs/ext4/inline.c:235 [inline]
RIP: 0010:ext4_write_inline_data_end+0xdc7/0xdd0 fs/ext4/inline.c:774
Code: 47 1d 8c e8 4b 3a 91 ff 90 0f 0b e8 63 7a 47 ff 48 8b 7c 24 10 48 c7 c6 e0 47 1d 8c e8 32 3a 91 ff 90 0f 0b e8 4a 7a 47 ff 90 <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90
RSP: 0018:ffffc900031c7320 EFLAGS: 00010293
RAX: ffffffff8257f9a6 RBX: 000000000000005a RCX: ffff888012968000
RDX: 0000000000000000 RSI: 000000000000005a RDI: 000000000000005b
RBP: ffffc900031c7448 R08: ffffffff8257ef87 R09: 1ffff11006806070
R10: dffffc0000000000 R11: ffffed1006806071 R12: 000000000000005a
R13: dffffc0000000000 R14: ffff888076b65bd8 R15: 000000000000005b
FS: 00007f5c6bacf6c0(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000a00 CR3: 0000000073fb6000 CR4: 0000000000350ef0
Call Trace:
<TASK>
generic_perform_write+0x6f8/0x990 mm/filemap.c:4070
ext4_buffered_write_iter+0xc5/0x350 fs/ext4/file.c:299
ext4_file_write_iter+0x892/0x1c50
iter_file_splice_write+0xbfc/0x1510 fs/splice.c:743
do_splice_from fs/splice.c:941 [inline]
direct_splice_actor+0x11d/0x220 fs/splice.c:1164
splice_direct_to_actor+0x588/0xc80 fs/splice.c:1108
do_splice_direct_actor fs/splice.c:1207 [inline]
do_splice_direct+0x289/0x3e0 fs/splice.c:1233
do_sendfile+0x564/0x8a0 fs/read_write.c:1363
__do_sys_sendfile64 fs/read_write.c:1424 [inline]
__se_sys_sendfile64+0x17c/0x1e0 fs/read_write.c:1410
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5c6bb18d09
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 b1 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f5c6bacf218 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
RAX: ffffffffffffffda RBX: 00007f5c6bba0708 RCX: 00007f5c6bb18d09
RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000004
RBP: 00007f5c6bba0700 R08: 0000000000000000 R09: 0000000000000000
R10: 000080001d00c0d0 R11: 0000000000000246 R12: 00007f5c6bb6d620
R13: 00007f5c6bb6d0c0 R14: 0031656c69662f2e R15: 8088e3ad122bc192
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
This happens because ext4_page_mkwrite is not protected by the inode_lock.
The xattr semaphore is not sufficient to protect inline data handling in a
sane way, so we need to rely on the inode_lock. Adding the inode_lock to
ext4_page_mkwrite is not an option, otherwise lock-ordering problems with
mmap_lock may arise.
The conversion inside ext4_page_mkwrite was introduced at commit
7b4cc9787fe3 ("ext4: evict inline data when writing to memory map"). This
fixes a documented bug in the commit message, which suggests some
alternative fixes.
Convert inline data when mmap is called, instead of doing it only when the
mmapped page is written to. Using the inode_lock there does not lead to
lock-ordering issues.
The drawback is that inline conversion will happen when the file is
mmapped, even though the page will not be written to.
Fixes: 7b4cc9787fe3 ("ext4: evict inline data when writing to memory map")
Reported-by: syzbot+0c89d865531d053abb2d(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=0c89d865531d053abb2d
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
Cc: stable(a)vger.kernel.org
---
Changes in v2:
- Convert inline data at mmap time, avoiding data loss.
- Link to v1: https://lore.kernel.org/r/20250519-ext4_inline_page_mkwrite-v1-1-865d9a62b5…
---
fs/ext4/file.c | 6 ++++++
fs/ext4/inode.c | 4 ----
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index beb078ee4811d6092e362e37307e7d87e5276cbc..f2380471df5d99500e49fdc639fa3e56143c328f 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -819,6 +819,12 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma)
if (!daxdev_mapping_supported(vma, dax_dev))
return -EOPNOTSUPP;
+ inode_lock(inode);
+ ret = ext4_convert_inline_data(inode);
+ inode_unlock(inode);
+ if (ret)
+ return ret;
+
file_accessed(file);
if (IS_DAX(file_inode(file))) {
vma->vm_ops = &ext4_dax_vm_ops;
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 94c7d2d828a64e42ded09c82497ed7617071aa19..895ecda786194b29d32c9c49785d56a1a84e2096 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -6222,10 +6222,6 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
filemap_invalidate_lock_shared(mapping);
- err = ext4_convert_inline_data(inode);
- if (err)
- goto out_ret;
-
/*
* On data journalling we skip straight to the transaction handle:
* there's no delalloc; page truncated will be checked later; the
---
base-commit: 4a95bc121ccdaee04c4d72f84dbfa6b880a514b6
change-id: 20250519-ext4_inline_page_mkwrite-c42ca1f02295
Best regards,
--
Thadeu Lima de Souza Cascardo <cascardo(a)igalia.com>
During our internal testing, we started observing intermittent boot
failures when the machine uses 4-level paging and has a large amount
of persistent memory:
BUG: unable to handle page fault for address: ffffe70000000034
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#1] SMP NOPTI
RIP: 0010:__init_single_page+0x9/0x6d
Call Trace:
<TASK>
__init_zone_device_page+0x17/0x5d
memmap_init_zone_device+0x154/0x1bb
pagemap_range+0x2e0/0x40f
memremap_pages+0x10b/0x2f0
devm_memremap_pages+0x1e/0x60
dev_dax_probe+0xce/0x2ec [device_dax]
dax_bus_probe+0x6d/0xc9
[... snip ...]
</TASK>
It turns out that the kernel panics while initializing vmemmap
(struct page array) when the vmemmap region spans two PGD entries,
because the new PGD entry is only installed in init_mm.pgd,
but not in the page tables of other tasks.
And looking at __populate_section_memmap():
if (vmemmap_can_optimize(altmap, pgmap))
// does not sync top level page tables
r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
else
// sync top level page tables in x86
r = vmemmap_populate(start, end, nid, altmap);
In the normal path, vmemmap_populate() in arch/x86/mm/init_64.c
synchronizes the top level page table (See commit 9b861528a801
("x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping
changes")) so that all tasks in the system can see the new vmemmap area.
However, when vmemmap_can_optimize() returns true, the optimized path
skips synchronization of top-level page tables. This is because
vmemmap_populate_compound_pages() is implemented in core MM code, which
does not handle synchronization of the top-level page tables. Instead,
the core MM has historically relied on each architecture to perform this
synchronization manually.
It turns out that current approach of relying on each arch to handle
the page table sync manually is fragile because 1) it's easy to forget
to sync the top level page table, and 2) it's also easy to overlook that
the kernel should not access vmemmap / direct mapping area before the sync.
As suggested by Dave Hansen, define x86_64 versions of
{pgd,p4d}_populate_kernel() and arch_sync_kernel_pagetables(), and
explicitly perform top-level page table synchronization in
{pgd,p4d}_populate_kernel(). Top level page tables are synchronized in
pgd_pouplate_kernel() for 5-level paging and in p4d_populate_kernel()
for 4-level paging.
arch_sync_kernel_pagetables(addr) synchronizes the top level page table
entry for address. It calls sync_kernel_pagetables_{l4,l5} depending on
the page table levels and installs the page entry in all page tables
in the system to make it visible to all tasks.
Note that sync_kernel_pagetables_{l4,l5} are simply versions of
sync_global_pgds_{l4,l5} that synchronizes only a single page table entry
for specified address, instead of for all page table entries corresponding
to a range. No functional difference intended between sync_global_pgds_*
and sync_kernel_pagetables_* other than that.
This also fixes a crash in vmemmap_set_pmd() caused by accessing vmemmap
before sync_global_pgds() [1]:
BUG: unable to handle page fault for address: ffffeb3ff1200000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
Tainted: [W]=WARN
RIP: 0010:vmemmap_set_pmd+0xff/0x230
<TASK>
vmemmap_populate_hugepages+0x176/0x180
vmemmap_populate+0x34/0x80
__populate_section_memmap+0x41/0x90
sparse_add_section+0x121/0x3e0
__add_pages+0xba/0x150
add_pages+0x1d/0x70
memremap_pages+0x3dc/0x810
devm_memremap_pages+0x1c/0x60
xe_devm_add+0x8b/0x100 [xe]
xe_tile_init_noalloc+0x6a/0x70 [xe]
xe_device_probe+0x48c/0x740 [xe]
[... snip ...]
Cc: <stable(a)vger.kernel.org>
Fixes: 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound devmaps")
Fixes: faf1c0008a33 ("x86/vmemmap: optimize for consecutive sections in partial populated PMDs")
Closes: https://lore.kernel.org/linux-mm/20250311114420.240341-1-gwan-gyeong.mun@in… [1]
Suggested-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Signed-off-by: Harry Yoo <harry.yoo(a)oracle.com>
---
arch/x86/include/asm/pgalloc.h | 22 ++++++++++
arch/x86/mm/init_64.c | 80 ++++++++++++++++++++++++++++++++++
2 files changed, 102 insertions(+)
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index c88691b15f3c..d66f2db54b16 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -10,6 +10,7 @@
#define __HAVE_ARCH_PTE_ALLOC_ONE
#define __HAVE_ARCH_PGD_FREE
+#define __HAVE_ARCH_SYNC_KERNEL_PGTABLE
#include <asm-generic/pgalloc.h>
static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
@@ -114,6 +115,17 @@ static inline void p4d_populate(struct mm_struct *mm, p4d_t *p4d, pud_t *pud)
set_p4d(p4d, __p4d(_PAGE_TABLE | __pa(pud)));
}
+void arch_sync_kernel_pagetables(unsigned long addr);
+
+static inline void p4d_populate_kernel(unsigned long addr,
+ p4d_t *p4d, pud_t *pud)
+{
+ paravirt_alloc_pud(&init_mm, __pa(pud) >> PAGE_SHIFT);
+ set_p4d(p4d, __p4d(_PAGE_TABLE | __pa(pud)));
+ if (!pgtable_l5_enabled())
+ arch_sync_kernel_pagetables(addr);
+}
+
static inline void p4d_populate_safe(struct mm_struct *mm, p4d_t *p4d, pud_t *pud)
{
paravirt_alloc_pud(mm, __pa(pud) >> PAGE_SHIFT);
@@ -137,6 +149,16 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(p4d)));
}
+static inline void pgd_populate_kernel(unsigned long addr,
+ pgd_t *pgd, p4d_t *p4d)
+{
+ if (!pgtable_l5_enabled())
+ return;
+ paravirt_alloc_p4d(&init_mm, __pa(p4d) >> PAGE_SHIFT);
+ set_pgd(pgd, __pgd(_PAGE_TABLE | __pa(p4d)));
+ arch_sync_kernel_pagetables(addr);
+}
+
static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4d)
{
if (!pgtable_l5_enabled())
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index fdb6cab524f0..cbddbef434d5 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -223,6 +223,86 @@ static void sync_global_pgds(unsigned long start, unsigned long end)
sync_global_pgds_l4(start, end);
}
+static void sync_kernel_pagetables_l4(unsigned long addr)
+{
+ pgd_t *pgd_ref = pgd_offset_k(addr);
+ const p4d_t *p4d_ref;
+ struct page *page;
+
+ VM_WARN_ON_ONCE(pgtable_l5_enabled());
+ /*
+ * With folded p4d, pgd_none() is always false, we need to
+ * handle synchronization on p4d level.
+ */
+ MAYBE_BUILD_BUG_ON(pgd_none(*pgd_ref));
+ p4d_ref = p4d_offset(pgd_ref, addr);
+
+ if (p4d_none(*p4d_ref))
+ return;
+
+ spin_lock(&pgd_lock);
+ list_for_each_entry(page, &pgd_list, lru) {
+ pgd_t *pgd;
+ p4d_t *p4d;
+ spinlock_t *pgt_lock;
+
+ pgd = (pgd_t *)page_address(page) + pgd_index(addr);
+ p4d = p4d_offset(pgd, addr);
+ /* the pgt_lock only for Xen */
+ pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
+ spin_lock(pgt_lock);
+
+ if (!p4d_none(*p4d_ref) && !p4d_none(*p4d))
+ BUG_ON(p4d_pgtable(*p4d)
+ != p4d_pgtable(*p4d_ref));
+
+ if (p4d_none(*p4d))
+ set_p4d(p4d, *p4d_ref);
+
+ spin_unlock(pgt_lock);
+ }
+ spin_unlock(&pgd_lock);
+}
+
+static void sync_kernel_pagetables_l5(unsigned long addr)
+{
+ const pgd_t *pgd_ref = pgd_offset_k(addr);
+ struct page *page;
+
+ VM_WARN_ON_ONCE(!pgtable_l5_enabled());
+
+ if (pgd_none(*pgd_ref))
+ return;
+
+ spin_lock(&pgd_lock);
+ list_for_each_entry(page, &pgd_list, lru) {
+ pgd_t *pgd;
+ spinlock_t *pgt_lock;
+
+ pgd = (pgd_t *)page_address(page) + pgd_index(addr);
+ /* the pgt_lock only for Xen */
+ pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
+ spin_lock(pgt_lock);
+
+ if (!pgd_none(*pgd_ref) && !pgd_none(*pgd))
+ BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+
+ if (pgd_none(*pgd))
+ set_pgd(pgd, *pgd_ref);
+
+ spin_unlock(pgt_lock);
+ }
+ spin_unlock(&pgd_lock);
+}
+
+void arch_sync_kernel_pagetables(unsigned long addr)
+{
+ if (pgtable_l5_enabled())
+ sync_kernel_pagetables_l5(addr);
+ else
+ sync_kernel_pagetables_l4(addr);
+}
+
/*
* NOTE: This function is marked __ref because it calls __init function
* (alloc_bootmem_pages). It's safe to do it ONLY when after_bootmem == 0.
--
2.43.0
The patch titled
Subject: mm/hmm: move pmd_to_hmm_pfn_flags() to the respective #ifdeffery
has been added to the -mm mm-new branch. Its filename is
mm-hmm-move-pmd_to_hmm_pfn_flags-to-the-respective-ifdeffery.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-new branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Note, mm-new is a provisional staging ground for work-in-progress
patches, and acceptance into mm-new is a notification for others take
notice and to finish up reviews. Please do not hesitate to respond to
review feedback and post updated versions to replace or incrementally
fixup patches in mm-new.
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Subject: mm/hmm: move pmd_to_hmm_pfn_flags() to the respective #ifdeffery
Date: Thu, 10 Jul 2025 11:23:53 +0300
When pmd_to_hmm_pfn_flags() is unused, it prevents kernel builds with
clang, `make W=1` and CONFIG_TRANSPARENT_HUGEPAGE=n:
mm/hmm.c:186:29: warning: unused function 'pmd_to_hmm_pfn_flags' [-Wunused-function]
Fix this by moving the function to the respective existing ifdeffery
for its the only user.
See also:
6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build")
Link: https://lkml.kernel.org/r/20250710082403.664093-1-andriy.shevchenko@linux.i…
Fixes: 9d3973d60f0a ("mm/hmm: cleanup the hmm_vma_handle_pmd stub")
Signed-off-by: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Reviewed-by: Leon Romanovsky <leonro(a)nvidia.com>
Cc: Andriy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Bill Wendling <morbo(a)google.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Justin Stitt <justinstitt(a)google.com>
Cc: Nathan Chancellor <nathan(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hmm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/hmm.c~mm-hmm-move-pmd_to_hmm_pfn_flags-to-the-respective-ifdeffery
+++ a/mm/hmm.c
@@ -183,6 +183,7 @@ static inline unsigned long hmm_pfn_flag
return order << HMM_PFN_ORDER_SHIFT;
}
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static inline unsigned long pmd_to_hmm_pfn_flags(struct hmm_range *range,
pmd_t pmd)
{
@@ -193,7 +194,6 @@ static inline unsigned long pmd_to_hmm_p
hmm_pfn_flags_order(PMD_SHIFT - PAGE_SHIFT);
}
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
unsigned long end, unsigned long hmm_pfns[],
pmd_t pmd)
_
Patches currently in -mm which might be from andriy.shevchenko(a)linux.intel.com are
mm-hmm-move-pmd_to_hmm_pfn_flags-to-the-respective-ifdeffery.patch
panic-add-panic_sys_info-sysctl-to-take-human-readable-string-parameter-fix.patch
The patch titled
Subject: nilfs2: reject invalid file types when reading inodes
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
nilfs2-reject-invalid-file-types-when-reading-inodes.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Subject: nilfs2: reject invalid file types when reading inodes
Date: Thu, 10 Jul 2025 22:49:08 +0900
To prevent inodes with invalid file types from tripping through the vfs
and causing malfunctions or assertion failures, add a missing sanity check
when reading an inode from a block device. If the file type is not valid,
treat it as a filesystem error.
Link: https://lkml.kernel.org/r/20250710134952.29862-1-konishi.ryusuke@gmail.com
Fixes: 05fe58fdc10d ("nilfs2: inode operations")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke(a)gmail.com>
Reported-by: syzbot+895c23f6917da440ed0d(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/nilfs2/inode.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
--- a/fs/nilfs2/inode.c~nilfs2-reject-invalid-file-types-when-reading-inodes
+++ a/fs/nilfs2/inode.c
@@ -472,11 +472,18 @@ static int __nilfs_read_inode(struct sup
inode->i_op = &nilfs_symlink_inode_operations;
inode_nohighmem(inode);
inode->i_mapping->a_ops = &nilfs_aops;
- } else {
+ } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) ||
+ S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
inode->i_op = &nilfs_special_inode_operations;
init_special_inode(
inode, inode->i_mode,
huge_decode_dev(le64_to_cpu(raw_inode->i_device_code)));
+ } else {
+ nilfs_error(sb,
+ "invalid file type bits in mode 0%o for inode %lu",
+ inode->i_mode, ino);
+ err = -EIO;
+ goto failed_unmap;
}
nilfs_ifile_unmap_inode(raw_inode);
brelse(bh);
_
Patches currently in -mm which might be from konishi.ryusuke(a)gmail.com are
nilfs2-reject-invalid-file-types-when-reading-inodes.patch
David Howells <dhowells(a)redhat.com> wrote:
> Here are some miscellaneous fixes and changes for netfslib and cifs, if you
> could consider pulling them. All the bugs fixed were observed in cifs, so
> they should probably go through the cifs tree unless Christian would much
> prefer for them to go through the VFS tree.
Hi David,
your commit 2b1424cd131c ("netfs: Fix wait/wake to be consistent about
the waitqueue used") has given me serious headaches; it has caused
outages in our web hosting clusters (yet again - all Linux versions
since 6.9 had serious netfs regressions). Your patch was backported to
6.15 as commit 329ba1cb402a in 6.15.3 (why oh why??), and therefore
the bugs it has caused will be "available" to all Linux stable users.
The problem we had is that writing to certain files never finishes. It
looks like it has to do with the cachefiles subrequest never reporting
completion. (We use Ceph with cachefiles)
I have tried applying the fixes in this pull request, which sounded
promising, but the problem is still there. The only thing that helps
is reverting 2b1424cd131c completely - everything is fine with 6.15.5
plus the revert.
What do you need from me in order to analyze the bug?
Max
Set the dirty bit when the memory resource is not cleared
during BO release.
v2(Christian):
- Drop the cleared flag set to false.
- Improve the amdgpu_vram_mgr_set_clear_state() function.
v3:
- Add back the resource clear flag set function call after
being wiped during eviction (Christian).
- Modified the patch subject name.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam(a)amd.com>
Suggested-by: Christian König <christian.koenig(a)amd.com>
Cc: stable(a)vger.kernel.org
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
index b256cbc2bc27..2c88d5fd87da 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
@@ -66,7 +66,10 @@ to_amdgpu_vram_mgr_resource(struct ttm_resource *res)
static inline void amdgpu_vram_mgr_set_cleared(struct ttm_resource *res)
{
- to_amdgpu_vram_mgr_resource(res)->flags |= DRM_BUDDY_CLEARED;
+ struct amdgpu_vram_mgr_resource *ares = to_amdgpu_vram_mgr_resource(res);
+
+ WARN_ON(ares->flags & DRM_BUDDY_CLEARED);
+ ares->flags |= DRM_BUDDY_CLEARED;
}
#endif
--
2.43.0
From: Edip Hazuri <edip(a)medip.dev>
The mute led on this laptop is using ALC245 but requires a quirk to work
This patch enables the existing quirk for the device.
Tested on Victus 16-r0xxx Laptop. The LED behaviour works
as intended.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Edip Hazuri <edip(a)medip.dev>
---
sound/pci/hda/patch_realtek.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 060db37ea..132cef8fa 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -10814,6 +10814,7 @@ static const struct hda_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x8b97, "HP", ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF),
SND_PCI_QUIRK(0x103c, 0x8bb3, "HP Slim OMEN", ALC287_FIXUP_CS35L41_I2C_2),
SND_PCI_QUIRK(0x103c, 0x8bb4, "HP Slim OMEN", ALC287_FIXUP_CS35L41_I2C_2),
+ SND_PCI_QUIRK(0x103c, 0x8bbe, "HP Victus 16-r0xxx (MB 8BBE)", ALC245_FIXUP_HP_MUTE_LED_COEFBIT),
SND_PCI_QUIRK(0x103c, 0x8bc8, "HP Victus 15-fa1xxx", ALC245_FIXUP_HP_MUTE_LED_COEFBIT),
SND_PCI_QUIRK(0x103c, 0x8bcd, "HP Omen 16-xd0xxx", ALC245_FIXUP_HP_MUTE_LED_V1_COEFBIT),
SND_PCI_QUIRK(0x103c, 0x8bdd, "HP Envy 17", ALC287_FIXUP_CS35L41_I2C_2),
--
2.50.1
Currently, __mkroute_output overrules the MTU value configured for
broadcast routes.
This buggy behaviour can be reproduced with:
ip link set dev eth1 mtu 9000
ip route del broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2
ip route add broadcast 192.168.0.255 dev eth1 proto kernel scope link src 192.168.0.2 mtu 1500
The maximum packet size should be 1500, but it is actually 8000:
ping -b 192.168.0.255 -s 8000
Fix __mkroute_output to allow MTU values to be configured for
for broadcast routes (to support a mixed-MTU local-area-network).
Signed-off-by: Oscar Maes <oscmaes92(a)gmail.com>
---
net/ipv4/route.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index fccb05fb3..a2a3b6482 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2585,7 +2585,6 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
do_cache = true;
if (type == RTN_BROADCAST) {
flags |= RTCF_BROADCAST | RTCF_LOCAL;
- fi = NULL;
} else if (type == RTN_MULTICAST) {
flags |= RTCF_MULTICAST | RTCF_LOCAL;
if (!ip_check_mc_rcu(in_dev, fl4->daddr, fl4->saddr,
--
2.39.5
When the GPU scheduler was ported to using a struct for its
initialization parameters, it was overlooked that panfrost creates a
distinct workqueue for timeout handling.
The pointer to this new workqueue is not initialized to the struct,
resulting in NULL being passed to the scheduler, which then uses the
system_wq for timeout handling.
Set the correct workqueue to the init args struct.
Cc: stable(a)vger.kernel.org # 6.15+
Fixes: 796a9f55a8d1 ("drm/sched: Use struct for drm_sched_init() params")
Reported-by: Tvrtko Ursulin <tvrtko.ursulin(a)igalia.com>
Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igal…
Signed-off-by: Philipp Stanner <phasta(a)kernel.org>
---
drivers/gpu/drm/panfrost/panfrost_job.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 5657106c2f7d..15e2d505550f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -841,7 +841,6 @@ int panfrost_job_init(struct panfrost_device *pfdev)
.num_rqs = DRM_SCHED_PRIORITY_COUNT,
.credit_limit = 2,
.timeout = msecs_to_jiffies(JOB_TIMEOUT_MS),
- .timeout_wq = pfdev->reset.wq,
.name = "pan_js",
.dev = pfdev->dev,
};
@@ -879,6 +878,7 @@ int panfrost_job_init(struct panfrost_device *pfdev)
pfdev->reset.wq = alloc_ordered_workqueue("panfrost-reset", 0);
if (!pfdev->reset.wq)
return -ENOMEM;
+ args.timeout_wq = pfdev->reset.wq;
for (j = 0; j < NUM_JOB_SLOTS; j++) {
js->queue[j].fence_context = dma_fence_context_alloc(1);
--
2.49.0
The SD spec says: "In UHS-I mode, after selecting one of SDR50, SDR104,
or DDR50 mode by Function Group 1, host needs to change the Power Limit
to enable the card to operate in higher performance".
The driver previously determined SD card current limits incorrectly by
checking capability bits before bus speed was established, and by using
support bits in function group 4 (bytes 6 & 7) rather than the actual
current requirement (bytes 0 & 1). This is wrong because the card
responds for a given bus speed.
This patch queries the card's current requirement after setting the bus
speed, and uses the reported value to select the appropriate current
limit.
while at it, remove some unused constants and the misleading comment in
the code.
Fixes: d9812780a020 ("mmc: sd: limit SD card power limit according to cards capabilities")
Signed-off-by: Avri Altman <avri.altman(a)sandisk.com>
Cc: stable(a)vger.kernel.org
---
drivers/mmc/core/sd.c | 36 +++++++++++++-----------------------
include/linux/mmc/card.h | 6 ------
2 files changed, 13 insertions(+), 29 deletions(-)
diff --git a/drivers/mmc/core/sd.c b/drivers/mmc/core/sd.c
index cf92c5b2059a..357edfb910df 100644
--- a/drivers/mmc/core/sd.c
+++ b/drivers/mmc/core/sd.c
@@ -365,7 +365,6 @@ static int mmc_read_switch(struct mmc_card *card)
card->sw_caps.sd3_bus_mode = status[13];
/* Driver Strengths supported by the card */
card->sw_caps.sd3_drv_type = status[9];
- card->sw_caps.sd3_curr_limit = status[7] | status[6] << 8;
}
out:
@@ -556,7 +555,7 @@ static int sd_set_current_limit(struct mmc_card *card, u8 *status)
{
int current_limit = SD_SET_CURRENT_LIMIT_200;
int err;
- u32 max_current;
+ u32 max_current, card_needs;
/*
* Current limit switch is only defined for SDR50, SDR104, and DDR50
@@ -575,33 +574,24 @@ static int sd_set_current_limit(struct mmc_card *card, u8 *status)
max_current = sd_get_host_max_current(card->host);
/*
- * We only check host's capability here, if we set a limit that is
- * higher than the card's maximum current, the card will be using its
- * maximum current, e.g. if the card's maximum current is 300ma, and
- * when we set current limit to 200ma, the card will draw 200ma, and
- * when we set current limit to 400/600/800ma, the card will draw its
- * maximum 300ma from the host.
- *
- * The above is incorrect: if we try to set a current limit that is
- * not supported by the card, the card can rightfully error out the
- * attempt, and remain at the default current limit. This results
- * in a 300mA card being limited to 200mA even though the host
- * supports 800mA. Failures seen with SanDisk 8GB UHS cards with
- * an iMX6 host. --rmk
+ * query the card of its maximun current/power consumption given the
+ * bus speed mode
*/
- if (max_current >= 800 &&
- card->sw_caps.sd3_curr_limit & SD_MAX_CURRENT_800)
+ err = mmc_sd_switch(card, 0, 0, card->sd_bus_speed, status);
+ if (err)
+ return err;
+
+ card_needs = status[1] | status[0] << 8;
+
+ if (max_current >= 800 && card_needs > 600)
current_limit = SD_SET_CURRENT_LIMIT_800;
- else if (max_current >= 600 &&
- card->sw_caps.sd3_curr_limit & SD_MAX_CURRENT_600)
+ else if (max_current >= 600 && card_needs > 400)
current_limit = SD_SET_CURRENT_LIMIT_600;
- else if (max_current >= 400 &&
- card->sw_caps.sd3_curr_limit & SD_MAX_CURRENT_400)
+ else if (max_current >= 400 && card_needs > 200)
current_limit = SD_SET_CURRENT_LIMIT_400;
if (current_limit != SD_SET_CURRENT_LIMIT_200) {
- err = mmc_sd_switch(card, SD_SWITCH_SET, 3,
- current_limit, status);
+ err = mmc_sd_switch(card, SD_SWITCH_SET, 3, current_limit, status);
if (err)
return err;
diff --git a/include/linux/mmc/card.h b/include/linux/mmc/card.h
index e9e964c20e53..67c1386ca574 100644
--- a/include/linux/mmc/card.h
+++ b/include/linux/mmc/card.h
@@ -177,17 +177,11 @@ struct sd_switch_caps {
#define SD_DRIVER_TYPE_A 0x02
#define SD_DRIVER_TYPE_C 0x04
#define SD_DRIVER_TYPE_D 0x08
- unsigned int sd3_curr_limit;
#define SD_SET_CURRENT_LIMIT_200 0
#define SD_SET_CURRENT_LIMIT_400 1
#define SD_SET_CURRENT_LIMIT_600 2
#define SD_SET_CURRENT_LIMIT_800 3
-#define SD_MAX_CURRENT_200 (1 << SD_SET_CURRENT_LIMIT_200)
-#define SD_MAX_CURRENT_400 (1 << SD_SET_CURRENT_LIMIT_400)
-#define SD_MAX_CURRENT_600 (1 << SD_SET_CURRENT_LIMIT_600)
-#define SD_MAX_CURRENT_800 (1 << SD_SET_CURRENT_LIMIT_800)
-
#define SD4_SET_POWER_LIMIT_0_72W 0
#define SD4_SET_POWER_LIMIT_1_44W 1
#define SD4_SET_POWER_LIMIT_2_16W 2
--
2.25.1
The quilt patch titled
Subject: ocfs2: reset folio to NULL when get folio fails
has been removed from the -mm tree. Its filename was
ocfs2-reset-folio-to-null-when-get-folio-fails.patch
This patch was dropped because it was merged into the mm-nonmm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Lizhi Xu <lizhi.xu(a)windriver.com>
Subject: ocfs2: reset folio to NULL when get folio fails
Date: Mon, 16 Jun 2025 09:31:40 +0800
The reproducer uses FAULT_INJECTION to make memory allocation fail, which
causes __filemap_get_folio() to fail, when initializing w_folios[i] in
ocfs2_grab_folios_for_write(), it only returns an error code and the value
of w_folios[i] is the error code, which causes
ocfs2_unlock_and_free_folios() to recycle the invalid w_folios[i] when
releasing folios.
Link: https://lkml.kernel.org/r/20250616013140.3602219-1-lizhi.xu@windriver.com
Reported-by: syzbot+c2ea94ae47cd7e3881ec(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c2ea94ae47cd7e3881ec
Signed-off-by: Lizhi Xu <lizhi.xu(a)windriver.com>
Reviewed-by: Joseph Qi <joseph.qi(a)linux.alibaba.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Changwei Ge <gechangwei(a)live.cn>
Cc: Jun Piao <piaojun(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/aops.c | 1 +
1 file changed, 1 insertion(+)
--- a/fs/ocfs2/aops.c~ocfs2-reset-folio-to-null-when-get-folio-fails
+++ a/fs/ocfs2/aops.c
@@ -1071,6 +1071,7 @@ static int ocfs2_grab_folios_for_write(s
if (IS_ERR(wc->w_folios[i])) {
ret = PTR_ERR(wc->w_folios[i]);
mlog_errno(ret);
+ wc->w_folios[i] = NULL;
goto out;
}
}
_
Patches currently in -mm which might be from lizhi.xu(a)windriver.com are
The quilt patch titled
Subject: mm/ptdump: take the memory hotplug lock inside ptdump_walk_pgd()
has been removed from the -mm tree. Its filename was
mm-ptdump-take-the-memory-hotplug-lock-inside-ptdump_walk_pgd.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Anshuman Khandual <anshuman.khandual(a)arm.com>
Subject: mm/ptdump: take the memory hotplug lock inside ptdump_walk_pgd()
Date: Fri, 20 Jun 2025 10:54:27 +0530
Memory hot remove unmaps and tears down various kernel page table regions
as required. The ptdump code can race with concurrent modifications of
the kernel page tables. When leaf entries are modified concurrently, the
dump code may log stale or inconsistent information for a VA range, but
this is otherwise not harmful.
But when intermediate levels of kernel page table are freed, the dump code
will continue to use memory that has been freed and potentially
reallocated for another purpose. In such cases, the ptdump code may
dereference bogus addresses, leading to a number of potential problems.
To avoid the above mentioned race condition, platforms such as arm64,
riscv and s390 take memory hotplug lock, while dumping kernel page table
via the sysfs interface /sys/kernel/debug/kernel_page_tables.
Similar race condition exists while checking for pages that might have
been marked W+X via /sys/kernel/debug/kernel_page_tables/check_wx_pages
which in turn calls ptdump_check_wx(). Instead of solving this race
condition again, let's just move the memory hotplug lock inside generic
ptdump_check_wx() which will benefit both the scenarios.
Drop get_online_mems() and put_online_mems() combination from all existing
platform ptdump code paths.
Link: https://lkml.kernel.org/r/20250620052427.2092093-1-anshuman.khandual@arm.com
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Signed-off-by: Anshuman Khandual <anshuman.khandual(a)arm.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Dev Jain <dev.jain(a)arm.com>
Acked-by: Alexander Gordeev <agordeev(a)linux.ibm.com> [s390]
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Paul Walmsley <paul.walmsley(a)sifive.com>
Cc: Palmer Dabbelt <palmer(a)dabbelt.com>
Cc: Alexander Gordeev <agordeev(a)linux.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer(a)linux.ibm.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)linux.ibm.com>
Cc: Sven Schnelle <svens(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/mm/ptdump_debugfs.c | 3 ---
arch/riscv/mm/ptdump.c | 3 ---
arch/s390/mm/dump_pagetables.c | 2 --
mm/ptdump.c | 2 ++
4 files changed, 2 insertions(+), 8 deletions(-)
--- a/arch/arm64/mm/ptdump_debugfs.c~mm-ptdump-take-the-memory-hotplug-lock-inside-ptdump_walk_pgd
+++ a/arch/arm64/mm/ptdump_debugfs.c
@@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/debugfs.h>
-#include <linux/memory_hotplug.h>
#include <linux/seq_file.h>
#include <asm/ptdump.h>
@@ -9,9 +8,7 @@ static int ptdump_show(struct seq_file *
{
struct ptdump_info *info = m->private;
- get_online_mems();
ptdump_walk(m, info);
- put_online_mems();
return 0;
}
DEFINE_SHOW_ATTRIBUTE(ptdump);
--- a/arch/riscv/mm/ptdump.c~mm-ptdump-take-the-memory-hotplug-lock-inside-ptdump_walk_pgd
+++ a/arch/riscv/mm/ptdump.c
@@ -6,7 +6,6 @@
#include <linux/efi.h>
#include <linux/init.h>
#include <linux/debugfs.h>
-#include <linux/memory_hotplug.h>
#include <linux/seq_file.h>
#include <linux/ptdump.h>
@@ -413,9 +412,7 @@ bool ptdump_check_wx(void)
static int ptdump_show(struct seq_file *m, void *v)
{
- get_online_mems();
ptdump_walk(m, m->private);
- put_online_mems();
return 0;
}
--- a/arch/s390/mm/dump_pagetables.c~mm-ptdump-take-the-memory-hotplug-lock-inside-ptdump_walk_pgd
+++ a/arch/s390/mm/dump_pagetables.c
@@ -247,11 +247,9 @@ static int ptdump_show(struct seq_file *
.marker = markers,
};
- get_online_mems();
mutex_lock(&cpa_mutex);
ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);
mutex_unlock(&cpa_mutex);
- put_online_mems();
return 0;
}
DEFINE_SHOW_ATTRIBUTE(ptdump);
--- a/mm/ptdump.c~mm-ptdump-take-the-memory-hotplug-lock-inside-ptdump_walk_pgd
+++ a/mm/ptdump.c
@@ -176,6 +176,7 @@ void ptdump_walk_pgd(struct ptdump_state
{
const struct ptdump_range *range = st->range;
+ get_online_mems();
mmap_write_lock(mm);
while (range->start != range->end) {
walk_page_range_debug(mm, range->start, range->end,
@@ -183,6 +184,7 @@ void ptdump_walk_pgd(struct ptdump_state
range++;
}
mmap_write_unlock(mm);
+ put_online_mems();
/* Flush out the last page */
st->note_page_flush(st);
_
Patches currently in -mm which might be from anshuman.khandual(a)arm.com are
The quilt patch titled
Subject: mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()
has been removed from the -mm tree. Its filename was
mm-huge_memory-dont-ignore-queried-cachemode-in-vmf_insert_pfn_pud.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: mm/huge_memory: don't ignore queried cachemode in vmf_insert_pfn_pud()
Date: Fri, 13 Jun 2025 11:27:00 +0200
Patch series "mm/huge_memory: vmf_insert_folio_*() and
vmf_insert_pfn_pud() fixes", v3.
While working on improving vm_normal_page() and friends, I stumbled over
this issues: refcounted "normal" folios must not be marked using
pmd_special() / pud_special(). Otherwise, we're effectively telling the
system that these folios are no "normal", violating the rules we
documented for vm_normal_page().
Fortunately, there are not many pmd_special()/pud_special() users yet. So
far there doesn't seem to be serious damage.
Tested using the ndctl tests ("ndctl:dax" suite).
This patch (of 3):
We set up the cache mode but ... don't forward the updated pgprot to
insert_pfn_pud().
Only a problem on x86-64 PAT when mapping PFNs using PUDs that require a
special cachemode.
Fix it by using the proper pgprot where the cachemode was setup.
It is unclear in which configurations we would get the cachemode wrong:
through vfio seems possible. Getting cachemodes wrong is usually ...
bad. As the fix is easy, let's backport it to stable.
Identified by code inspection.
Link: https://lkml.kernel.org/r/20250613092702.1943533-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250613092702.1943533-2-david@redhat.com
Fixes: 7b806d229ef1 ("mm: remove vmf_insert_pfn_xxx_prot() for huge page-table entries")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Reviewed-by: Jason Gunthorpe <jgg(a)nvidia.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Tested-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: Alistair Popple <apopple(a)nvidia.com>
Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Mariano Pache <npache(a)redhat.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Ryan Roberts <ryan.roberts(a)arm.com>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Zi Yan <ziy(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
--- a/mm/huge_memory.c~mm-huge_memory-dont-ignore-queried-cachemode-in-vmf_insert_pfn_pud
+++ a/mm/huge_memory.c
@@ -1516,10 +1516,9 @@ static pud_t maybe_pud_mkwrite(pud_t pud
}
static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
- pud_t *pud, pfn_t pfn, bool write)
+ pud_t *pud, pfn_t pfn, pgprot_t prot, bool write)
{
struct mm_struct *mm = vma->vm_mm;
- pgprot_t prot = vma->vm_page_prot;
pud_t entry;
if (!pud_none(*pud)) {
@@ -1581,7 +1580,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_
pfnmap_setup_cachemode_pfn(pfn_t_to_pfn(pfn), &pgprot);
ptl = pud_lock(vma->vm_mm, vmf->pud);
- insert_pfn_pud(vma, addr, vmf->pud, pfn, write);
+ insert_pfn_pud(vma, addr, vmf->pud, pfn, pgprot, write);
spin_unlock(ptl);
return VM_FAULT_NOPAGE;
@@ -1625,7 +1624,7 @@ vm_fault_t vmf_insert_folio_pud(struct v
add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
}
insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)),
- write);
+ vma->vm_page_prot, write);
spin_unlock(ptl);
return VM_FAULT_NOPAGE;
_
Patches currently in -mm which might be from david(a)redhat.com are
mm-balloon_compaction-we-cannot-have-isolated-pages-in-the-balloon-list.patch
mm-balloon_compaction-convert-balloon_page_delete-to-balloon_page_finalize.patch
mm-zsmalloc-drop-pageisolated-related-vm_bug_ons.patch
mm-page_alloc-let-page-freeing-clear-any-set-page-type.patch
mm-balloon_compaction-make-pageoffline-sticky-until-the-page-is-freed.patch
mm-zsmalloc-make-pagezsmalloc-sticky-until-the-page-is-freed.patch
mm-migrate-rename-isolate_movable_page-to-isolate_movable_ops_page.patch
mm-migrate-rename-putback_movable_folio-to-putback_movable_ops_page.patch
mm-migrate-factor-out-movable_ops-page-handling-into-migrate_movable_ops_page.patch
mm-migrate-remove-folio_test_movable-and-folio_movable_ops.patch
mm-migrate-move-movable_ops-page-handling-out-of-move_to_new_folio.patch
mm-zsmalloc-stop-using-__clearpagemovable.patch
mm-balloon_compaction-stop-using-__clearpagemovable.patch
mm-migrate-remove-__clearpagemovable.patch
mm-migration-remove-pagemovable.patch
mm-rename-__pagemovable-to-page_has_movable_ops.patch
mm-page_isolation-drop-__folio_test_movable-check-for-large-folios.patch
mm-remove-__folio_test_movable.patch
mm-stop-storing-migration_ops-in-page-mapping.patch
mm-convert-movable-flag-in-page-mapping-to-a-page-flag.patch
mm-rename-pg_isolated-to-pg_movable_ops_isolated.patch
mm-page-flags-rename-page_mapping_movable-to-page_mapping_anon_ksm.patch
mm-page-alloc-remove-pagemappingflags.patch
mm-page-flags-remove-folio_mapping_flags.patch
mm-simplify-folio_expected_ref_count.patch
mm-rename-page_mapping_-to-folio_mapping_.patch
docs-mm-convert-from-non-lru-page-migration-to-movable_ops-page-migration.patch
mm-balloon_compaction-movable_ops-doc-updates.patch
mm-balloon_compaction-provide-single-balloon_page_insert-and-balloon_mapping_gfp_mask.patch
mm-convert-fpb_ignore_-into-fpb_respect_.patch
mm-smaller-folio_pte_batch-improvements.patch
mm-split-folio_pte_batch-into-folio_pte_batch-and-folio_pte_batch_flags.patch
mm-remove-boolean-output-parameters-from-folio_pte_batch_ext.patch
mm-memory-introduce-is_huge_zero_pfn-and-use-it-in-vm_normal_page_pmd.patch
The quilt patch titled
Subject: mm/memory-tier: fix abstract distance calculation overflow
has been removed from the -mm tree. Its filename was
mm-memory-tier-fix-abstract-distance-calculation-overflow.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Li Zhijian <lizhijian(a)fujitsu.com>
Subject: mm/memory-tier: fix abstract distance calculation overflow
Date: Tue, 10 Jun 2025 14:27:51 +0800
In mt_perf_to_adistance(), the calculation of abstract distance (adist)
involves multiplying several int values including
MEMTIER_ADISTANCE_DRAM.
*adist = MEMTIER_ADISTANCE_DRAM *
(perf->read_latency + perf->write_latency) /
(default_dram_perf.read_latency + default_dram_perf.write_latency) *
(default_dram_perf.read_bandwidth + default_dram_perf.write_bandwidth) /
(perf->read_bandwidth + perf->write_bandwidth);
Since these values can be large, the multiplication may exceed the
maximum value of an int (INT_MAX) and overflow (Our platform did),
leading to an incorrect adist.
User-visible impact:
The memory tiering subsystem will misinterpret slow memory (like CXL)
as faster than DRAM, causing inappropriate demotion of pages from
CXL (slow memory) to DRAM (fast memory).
For example, we will see the following demotion chains from the dmesg, where
Node0,1 are DRAM, and Node2,3 are CXL node:
Demotion targets for Node 0: null
Demotion targets for Node 1: null
Demotion targets for Node 2: preferred: 0-1, fallback: 0-1
Demotion targets for Node 3: preferred: 0-1, fallback: 0-1
Change MEMTIER_ADISTANCE_DRAM to be a long constant by writing it with
the 'L' suffix. This prevents the overflow because the multiplication
will then be done in the long type which has a larger range.
Link: https://lkml.kernel.org/r/20250611023439.2845785-1-lizhijian@fujitsu.com
Link: https://lkml.kernel.org/r/20250610062751.2365436-1-lizhijian@fujitsu.com
Fixes: 3718c02dbd4c ("acpi, hmat: calculate abstract distance with HMAT")
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Reviewed-by: Huang Ying <ying.huang(a)linux.alibaba.com>
Acked-by: Balbir Singh <balbirs(a)nvidia.com>
Reviewed-by: Donet Tom <donettom(a)linux.ibm.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/memory-tiers.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/memory-tiers.h~mm-memory-tier-fix-abstract-distance-calculation-overflow
+++ a/include/linux/memory-tiers.h
@@ -18,7 +18,7 @@
* adistance value (slightly faster) than default DRAM adistance to be part of
* the same memory tier.
*/
-#define MEMTIER_ADISTANCE_DRAM ((4 * MEMTIER_CHUNK_SIZE) + (MEMTIER_CHUNK_SIZE >> 1))
+#define MEMTIER_ADISTANCE_DRAM ((4L * MEMTIER_CHUNK_SIZE) + (MEMTIER_CHUNK_SIZE >> 1))
struct memory_tier;
struct memory_dev_type {
_
Patches currently in -mm which might be from lizhijian(a)fujitsu.com are
The quilt patch titled
Subject: readahead: fix return value of page_cache_next_miss() when no hole is found
has been removed from the -mm tree. Its filename was
readahead-fix-return-value-of-page_cache_next_miss-when-no-hole-is-found.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Chi Zhiling <chizhiling(a)kylinos.cn>
Subject: readahead: fix return value of page_cache_next_miss() when no hole is found
Date: Thu, 5 Jun 2025 13:49:35 +0800
max_scan in page_cache_next_miss always decreases to zero when no hole is
found, causing the return value to be index + 0.
Fix this by preserving the max_scan value throughout the loop.
Jan said "From what I know and have seen in the past, wrong responses
from page_cache_next_miss() can lead to readahead window reduction and
thus reduced read speeds."
Link: https://lkml.kernel.org/r/20250605054935.2323451-1-chizhiling@163.com
Fixes: 901a269ff3d5 ("filemap: fix page_cache_next_miss() when no hole found")
Signed-off-by: Chi Zhiling <chizhiling(a)kylinos.cn>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Josef Bacik <josef(a)toxicpanda.com>
Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/filemap.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/filemap.c~readahead-fix-return-value-of-page_cache_next_miss-when-no-hole-is-found
+++ a/mm/filemap.c
@@ -1778,8 +1778,9 @@ pgoff_t page_cache_next_miss(struct addr
pgoff_t index, unsigned long max_scan)
{
XA_STATE(xas, &mapping->i_pages, index);
+ unsigned long nr = max_scan;
- while (max_scan--) {
+ while (nr--) {
void *entry = xas_next(&xas);
if (!entry || xa_is_value(entry))
return xas.xa_index;
_
Patches currently in -mm which might be from chizhiling(a)kylinos.cn are
Good morning
I'm hoping for a resolution to an issue I'm currently having with the latest kernel release and Workstation Pro (Free Edition). Every time I try to open Workstation, it's prompting me to reinstall modules that ultimately fail (see screenshots). Also see the attached log file. After doing some research, I see the issue is that several header files are missing. I've tried to manually compile and install original modules, but with no success. I'm still faced with an incompatibility issue and the latest kernel. This problem does not occur when I switch back to kernel 6.14.
I'm currently running the latest kernel (6.15.4) on Fedora 42. My hardware specs:
CPU: AMD Ryzen 7840U
Memory: Crucial 96 GB 5600 DDR5
[Screenshot From 2025-07-08 16-53-52.png][Screenshot From 2025-07-08 16-54-06.png],
The quilt patch titled
Subject: mm: fix the inaccurate memory statistics issue for users
has been removed from the -mm tree. Its filename was
mm-fix-the-inaccurate-memory-statistics-issue-for-users.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Subject: mm: fix the inaccurate memory statistics issue for users
Date: Thu, 5 Jun 2025 20:58:29 +0800
On some large machines with a high number of CPUs running a 64K pagesize
kernel, we found that the 'RES' field is always 0 displayed by the top
command for some processes, which will cause a lot of confusion for users.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
875525 root 20 0 12480 0 0 R 0.3 0.0 0:00.08 top
1 root 20 0 172800 0 0 S 0.0 0.0 0:04.52 systemd
The main reason is that the batch size of the percpu counter is quite
large on these machines, caching a significant percpu value, since
converting mm's rss stats into percpu_counter by commit f1a7941243c1 ("mm:
convert mm's rss stats into percpu_counter"). Intuitively, the batch
number should be optimized, but on some paths, performance may take
precedence over statistical accuracy. Therefore, introducing a new
interface to add the percpu statistical count and display it to users,
which can remove the confusion. In addition, this change is not expected
to be on a performance-critical path, so the modification should be
acceptable.
In addition, the 'mm->rss_stat' is updated by using add_mm_counter() and
dec/inc_mm_counter(), which are all wrappers around
percpu_counter_add_batch(). In percpu_counter_add_batch(), there is
percpu batch caching to avoid 'fbc->lock' contention. This patch changes
task_mem() and task_statm() to get the accurate mm counters under the
'fbc->lock', but this should not exacerbate kernel 'mm->rss_stat' lock
contention due to the percpu batch caching of the mm counters. The
following test also confirm the theoretical analysis.
I run the stress-ng that stresses anon page faults in 32 threads on my 32
cores machine, while simultaneously running a script that starts 32
threads to busy-loop pread each stress-ng thread's /proc/pid/status
interface. From the following data, I did not observe any obvious impact
of this patch on the stress-ng tests.
w/o patch:
stress-ng: info: [6848] 4,399,219,085,152 CPU Cycles 67.327 B/sec
stress-ng: info: [6848] 1,616,524,844,832 Instructions 24.740 B/sec (0.367 instr. per cycle)
stress-ng: info: [6848] 39,529,792 Page Faults Total 0.605 M/sec
stress-ng: info: [6848] 39,529,792 Page Faults Minor 0.605 M/sec
w/patch:
stress-ng: info: [2485] 4,462,440,381,856 CPU Cycles 68.382 B/sec
stress-ng: info: [2485] 1,615,101,503,296 Instructions 24.750 B/sec (0.362 instr. per cycle)
stress-ng: info: [2485] 39,439,232 Page Faults Total 0.604 M/sec
stress-ng: info: [2485] 39,439,232 Page Faults Minor 0.604 M/sec
On comparing a very simple app which just allocates & touches some
memory against v6.1 (which doesn't have f1a7941243c1) and latest Linus
tree (4c06e63b9203) I can see that on latest Linus tree the values for
VmRSS, RssAnon and RssFile from /proc/self/status are all zeroes while
they do report values on v6.1 and a Linus tree with this patch.
Link: https://lkml.kernel.org/r/f4586b17f66f97c174f7fd1f8647374fdb53de1c.17491190…
Fixes: f1a7941243c1 ("mm: convert mm's rss stats into percpu_counter")
Signed-off-by: Baolin Wang <baolin.wang(a)linux.alibaba.com>
Reviewed-by: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Tested-by: Aboorva Devarajan <aboorvad(a)linux.ibm.com>
Tested-by Donet Tom <donettom(a)linux.ibm.com>
Acked-by: Shakeel Butt <shakeel.butt(a)linux.dev>
Acked-by: SeongJae Park <sj(a)kernel.org>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Liam Howlett <liam.howlett(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: Mike Rapoport <rppt(a)kernel.org>
Cc: Suren Baghdasaryan <surenb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/proc/task_mmu.c | 14 +++++++-------
include/linux/mm.h | 5 +++++
2 files changed, 12 insertions(+), 7 deletions(-)
--- a/fs/proc/task_mmu.c~mm-fix-the-inaccurate-memory-statistics-issue-for-users
+++ a/fs/proc/task_mmu.c
@@ -36,9 +36,9 @@ void task_mem(struct seq_file *m, struct
unsigned long text, lib, swap, anon, file, shmem;
unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;
- anon = get_mm_counter(mm, MM_ANONPAGES);
- file = get_mm_counter(mm, MM_FILEPAGES);
- shmem = get_mm_counter(mm, MM_SHMEMPAGES);
+ anon = get_mm_counter_sum(mm, MM_ANONPAGES);
+ file = get_mm_counter_sum(mm, MM_FILEPAGES);
+ shmem = get_mm_counter_sum(mm, MM_SHMEMPAGES);
/*
* Note: to minimize their overhead, mm maintains hiwater_vm and
@@ -59,7 +59,7 @@ void task_mem(struct seq_file *m, struct
text = min(text, mm->exec_vm << PAGE_SHIFT);
lib = (mm->exec_vm << PAGE_SHIFT) - text;
- swap = get_mm_counter(mm, MM_SWAPENTS);
+ swap = get_mm_counter_sum(mm, MM_SWAPENTS);
SEQ_PUT_DEC("VmPeak:\t", hiwater_vm);
SEQ_PUT_DEC(" kB\nVmSize:\t", total_vm);
SEQ_PUT_DEC(" kB\nVmLck:\t", mm->locked_vm);
@@ -92,12 +92,12 @@ unsigned long task_statm(struct mm_struc
unsigned long *shared, unsigned long *text,
unsigned long *data, unsigned long *resident)
{
- *shared = get_mm_counter(mm, MM_FILEPAGES) +
- get_mm_counter(mm, MM_SHMEMPAGES);
+ *shared = get_mm_counter_sum(mm, MM_FILEPAGES) +
+ get_mm_counter_sum(mm, MM_SHMEMPAGES);
*text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
>> PAGE_SHIFT;
*data = mm->data_vm + mm->stack_vm;
- *resident = *shared + get_mm_counter(mm, MM_ANONPAGES);
+ *resident = *shared + get_mm_counter_sum(mm, MM_ANONPAGES);
return mm->total_vm;
}
--- a/include/linux/mm.h~mm-fix-the-inaccurate-memory-statistics-issue-for-users
+++ a/include/linux/mm.h
@@ -2568,6 +2568,11 @@ static inline unsigned long get_mm_count
return percpu_counter_read_positive(&mm->rss_stat[member]);
}
+static inline unsigned long get_mm_counter_sum(struct mm_struct *mm, int member)
+{
+ return percpu_counter_sum_positive(&mm->rss_stat[member]);
+}
+
void mm_trace_rss_stat(struct mm_struct *mm, int member);
static inline void add_mm_counter(struct mm_struct *mm, int member, long value)
_
Patches currently in -mm which might be from baolin.wang(a)linux.alibaba.com are
selftests-khugepaged-fix-the-shmem-collapse-failure.patch
selftests-mm-add-shmem-collapse-as-a-default-test-item.patch
mm-huge_memory-fix-the-check-for-allowed-huge-orders-in-shmem.patch
khugepaged-allow-khugepaged-to-check-all-anonymous-mthp-orders.patch
khugepaged-kick-khugepaged-for-enabling-none-pmd-sized-mthps.patch
mm-fault-in-complete-folios-instead-of-individual-pages-for-tmpfs.patch
The quilt patch titled
Subject: mm/damon: fix divide by zero in damon_get_intervals_score()
has been removed from the -mm tree. Its filename was
mm-damon-fix-divide-by-zero-in-damon_get_intervals_score.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Honggyu Kim <honggyu.kim(a)sk.com>
Subject: mm/damon: fix divide by zero in damon_get_intervals_score()
Date: Wed, 2 Jul 2025 09:02:04 +0900
The current implementation allows having zero size regions with no special
reasons, but damon_get_intervals_score() gets crashed by divide by zero
when the region size is zero.
[ 29.403950] Oops: divide error: 0000 [#1] SMP NOPTI
This patch fixes the bug, but does not disallow zero size regions to keep
the backward compatibility since disallowing zero size regions might be a
breaking change for some users.
In addition, the same crash can happen when intervals_goal.access_bp is
zero so this should be fixed in stable trees as well.
Link: https://lkml.kernel.org/r/20250702000205.1921-5-honggyu.kim@sk.com
Fixes: f04b0fedbe71 ("mm/damon/core: implement intervals auto-tuning")
Signed-off-by: Honggyu Kim <honggyu.kim(a)sk.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/damon/core.c | 1 +
1 file changed, 1 insertion(+)
--- a/mm/damon/core.c~mm-damon-fix-divide-by-zero-in-damon_get_intervals_score
+++ a/mm/damon/core.c
@@ -1449,6 +1449,7 @@ static unsigned long damon_get_intervals
}
}
target_access_events = max_access_events * goal_bp / 10000;
+ target_access_events = target_access_events ? : 1;
return access_events * 10000 / target_access_events;
}
_
Patches currently in -mm which might be from honggyu.kim(a)sk.com are
samples-damon-change-enable-parameters-to-enabled.patch
The quilt patch titled
Subject: samples/damon: fix damon sample mtier for start failure
has been removed from the -mm tree. Its filename was
samples-damon-fix-damon-sample-mtier-for-start-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Honggyu Kim <honggyu.kim(a)sk.com>
Subject: samples/damon: fix damon sample mtier for start failure
Date: Wed, 2 Jul 2025 09:02:03 +0900
The damon_sample_mtier_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.
In such cases, setting Y to "enable" then N triggers the similar crash
with mtier because damon sample start failed but the "enable" stays as Y.
Link: https://lkml.kernel.org/r/20250702000205.1921-4-honggyu.kim@sk.com
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Signed-off-by: Honggyu Kim <honggyu.kim(a)sk.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
samples/damon/mtier.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/samples/damon/mtier.c~samples-damon-fix-damon-sample-mtier-for-start-failure
+++ a/samples/damon/mtier.c
@@ -164,8 +164,12 @@ static int damon_sample_mtier_enable_sto
if (enable == enabled)
return 0;
- if (enable)
- return damon_sample_mtier_start();
+ if (enable) {
+ err = damon_sample_mtier_start();
+ if (err)
+ enable = false;
+ return err;
+ }
damon_sample_mtier_stop();
return 0;
}
_
Patches currently in -mm which might be from honggyu.kim(a)sk.com are
samples-damon-change-enable-parameters-to-enabled.patch
The quilt patch titled
Subject: samples/damon: fix damon sample wsse for start failure
has been removed from the -mm tree. Its filename was
samples-damon-fix-damon-sample-wsse-for-start-failure.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Honggyu Kim <honggyu.kim(a)sk.com>
Subject: samples/damon: fix damon sample wsse for start failure
Date: Wed, 2 Jul 2025 09:02:02 +0900
The damon_sample_wsse_start() can fail so we must reset the "enable"
parameter to "false" again for proper rollback.
In such cases, setting Y to "enable" then N triggers the similar crash
with wsse because damon sample start failed but the "enable" stays as Y.
Link: https://lkml.kernel.org/r/20250702000205.1921-3-honggyu.kim@sk.com
Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests")
Signed-off-by: Honggyu Kim <honggyu.kim(a)sk.com>
Reviewed-by: SeongJae Park <sj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
samples/damon/wsse.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
--- a/samples/damon/wsse.c~samples-damon-fix-damon-sample-wsse-for-start-failure
+++ a/samples/damon/wsse.c
@@ -102,8 +102,12 @@ static int damon_sample_wsse_enable_stor
if (enable == enabled)
return 0;
- if (enable)
- return damon_sample_wsse_start();
+ if (enable) {
+ err = damon_sample_wsse_start();
+ if (err)
+ enable = false;
+ return err;
+ }
damon_sample_wsse_stop();
return 0;
}
_
Patches currently in -mm which might be from honggyu.kim(a)sk.com are
samples-damon-change-enable-parameters-to-enabled.patch