The patch titled
Subject: kexec/arm64: initialize the random field of kbuf to zero in the image loader
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
kexec-arm64-initialize-the-random-field-of-kbuf-to-zero-in-the-image-loader.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Breno Leitao <leitao(a)debian.org>
Subject: kexec/arm64: initialize the random field of kbuf to zero in the image loader
Date: Thu Aug 21 04:11:21 2025 -0700
Add an explicit initialization for the random member of the kbuf structure
within the image_load function in arch/arm64/kernel/kexec_image.c.
Setting kbuf.random to zero ensures a deterministic and clean starting
state for the buffer used during kernel image loading, avoiding this UBSAN
issue later, when kbuf.random is read.
[ 32.362488] UBSAN: invalid-load in ./include/linux/kexec.h:210:10
[ 32.362649] load of value 252 is not a valid value for type '_Bool'
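The failure mode can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `struct kbuf_sketch` and `make_kbuf()` are hypothetical names, and only the member names mirror the fields visible in this patch. It shows that a designated initializer zero-fills every member that is not named, whereas an automatic struct filled member by member leaves unassigned members indeterminate, which is the kind of state UBSAN reports when a `_Bool` holds a value other than 0 or 1.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct kexec_buf. */
struct kbuf_sketch {
	unsigned long buf_min;
	unsigned long buf_max;
	bool top_down;
	bool random;
	const void *buffer;
	size_t bufsz;
};

/*
 * Build the descriptor with a designated initializer: members that are
 * not named (here: random) are implicitly zero-initialized by C.
 */
static struct kbuf_sketch make_kbuf(const void *kernel, size_t kernel_len)
{
	struct kbuf_sketch kbuf = {
		.buf_min = 0,
		.buf_max = ULONG_MAX,
		.top_down = false,
		.buffer = kernel,
		.bufsz = kernel_len,
	};

	/* Holds by construction; a bare "struct kbuf_sketch kbuf;" gives no such guarantee. */
	assert(kbuf.random == false);
	return kbuf;
}
```

The patch itself takes the other route and assigns kbuf.random explicitly, which matches the existing member-by-member style of image_load().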
Link: https://lkml.kernel.org/r/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnzad…
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao(a)debian.org>
Cc: Baoquan He <bhe(a)redhat.com>
Cc: Coiby Xu <coxu(a)redhat.com>
Cc: "Daniel P. Berrange" <berrange(a)redhat.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Dave Young <dyoung(a)redhat.com>
Cc: Kairui Song <ryncsn(a)gmail.com>
Cc: Liu Pingfan <kernelfans(a)gmail.com>
Cc: Milan Broz <gmazyland(a)gmail.com>
Cc: Ondrej Kozina <okozina(a)redhat.com>
Cc: Vitaly Kuznetsov <vkuznets(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/arm64/kernel/kexec_image.c | 1 +
1 file changed, 1 insertion(+)
--- a/arch/arm64/kernel/kexec_image.c~kexec-arm64-initialize-the-random-field-of-kbuf-to-zero-in-the-image-loader
+++ a/arch/arm64/kernel/kexec_image.c
@@ -76,6 +76,7 @@ static void *image_load(struct kimage *i
kbuf.buf_min = 0;
kbuf.buf_max = ULONG_MAX;
kbuf.top_down = false;
+ kbuf.random = 0;
kbuf.buffer = kernel;
kbuf.bufsz = kernel_len;
_
Patches currently in -mm which might be from leitao(a)debian.org are
kexec-arm64-initialize-the-random-field-of-kbuf-to-zero-in-the-image-loader.patch
DCCP sockets in DCCP_REQUESTING state do not check the sequence number
or acknowledgment number for incoming Reset, CloseReq, and Close packets.
As a result, an attacker can send a spoofed Reset packet while the client
is in the requesting state. The client will accept the packet without
verification and immediately close the connection, causing a denial of
service (DoS) attack.
This patch moves the processing of Reset, Close, and CloseReq packets
into dccp_rcv_request_sent_state_process() and validates the ack number
before accepting them.
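As an illustration of the ack-number validation, here is a user-space sketch of a window check on 48-bit circular sequence numbers. It is an assumption-laden model and not the kernel's between48() implementation: `seq48_between()` and `SEQ48_MASK` are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ48_MASK ((UINT64_C(1) << 48) - 1)

/*
 * Return true if seq lies in the window [low, high], where all three
 * values are 48-bit sequence numbers that may wrap around.
 */
static bool seq48_between(uint64_t seq, uint64_t low, uint64_t high)
{
	uint64_t span, dist;

	/* All inputs must already fit in 48 bits. */
	assert((seq | low | high) <= SEQ48_MASK);

	span = (high - low) & SEQ48_MASK;	/* window size, modulo 2^48 */
	dist = (seq - low) & SEQ48_MASK;	/* distance from the lower edge */
	return dist <= span;
}
```

In this model a Reset, CloseReq or Close packet whose ack number falls outside [S.AWL, S.AWH] would be rejected before any state change, which is the behavior the patch description asks for.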
This fix should apply to stable versions *only* in Linux 5.x and 6.x.
Note that DCCP was removed in Linux 6.16, so this patch is only relevant
for older versions. We tested it on Ubuntu 24.04 LTS (Linux 6.8) and
it worked as expected.
Signed-off-by: Yizhou Zhao <zhaoyz24(a)mails.tsinghua.edu.cn>
Cc: stable(a)vger.kernel.org
---
net/dccp/input.c | 54 ++++++++++++++++++++++++++++--------------------
1 file changed, 32 insertions(+), 22 deletions(-)
diff --git a/net/dccp/input.c b/net/dccp/input.c
index 2cbb757a8..0b1ffb044 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -397,21 +397,22 @@ static int dccp_rcv_request_sent_state_process(struct sock *sk,
* / * Response processing continues in Step 10; Reset
* processing continues in Step 9 * /
*/
+ struct dccp_sock *dp = dccp_sk(sk);
+
+ if (!between48(DCCP_SKB_CB(skb)->dccpd_ack_seq,
+ dp->dccps_awl, dp->dccps_awh)) {
+ dccp_pr_debug("invalid ackno: S.AWL=%llu, "
+ "P.ackno=%llu, S.AWH=%llu\n",
+ (unsigned long long)dp->dccps_awl,
+ (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
+ (unsigned long long)dp->dccps_awh);
+ goto out_invalid_packet;
+ }
+
if (dh->dccph_type == DCCP_PKT_RESPONSE) {
const struct inet_connection_sock *icsk = inet_csk(sk);
- struct dccp_sock *dp = dccp_sk(sk);
- long tstamp = dccp_timestamp();
-
- if (!between48(DCCP_SKB_CB(skb)->dccpd_ack_seq,
- dp->dccps_awl, dp->dccps_awh)) {
- dccp_pr_debug("invalid ackno: S.AWL=%llu, "
- "P.ackno=%llu, S.AWH=%llu\n",
- (unsigned long long)dp->dccps_awl,
- (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
- (unsigned long long)dp->dccps_awh);
- goto out_invalid_packet;
- }
+ long tstamp = dccp_timestamp();
/*
* If option processing (Step 8) failed, return 1 here so that
* dccp_v4_do_rcv() sends a Reset. The Reset code depends on
@@ -496,6 +497,13 @@ static int dccp_rcv_request_sent_state_process(struct sock *sk,
}
dccp_send_ack(sk);
return -1;
+ } else if (dh->dccph_type == DCCP_PKT_RESET) {
+ dccp_rcv_reset(sk, skb);
+ return 0;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) {
+ return dccp_rcv_closereq(sk, skb);
+ } else if (dh->dccph_type == DCCP_PKT_CLOSE) {
+ return dccp_rcv_close(sk, skb);
}
out_invalid_packet:
@@ -658,17 +666,19 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
* Set TIMEWAIT timer
* Drop packet and return
*/
- if (dh->dccph_type == DCCP_PKT_RESET) {
- dccp_rcv_reset(sk, skb);
- return 0;
- } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) { /* Step 13 */
- if (dccp_rcv_closereq(sk, skb))
- return 0;
- goto discard;
- } else if (dh->dccph_type == DCCP_PKT_CLOSE) { /* Step 14 */
- if (dccp_rcv_close(sk, skb))
+ if (sk->sk_state != DCCP_REQUESTING) {
+ if (dh->dccph_type == DCCP_PKT_RESET) {
+ dccp_rcv_reset(sk, skb);
return 0;
- goto discard;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) { /* Step 13 */
+ if (dccp_rcv_closereq(sk, skb))
+ return 0;
+ goto discard;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSE) { /* Step 14 */
+ if (dccp_rcv_close(sk, skb))
+ return 0;
+ goto discard;
+ }
}
switch (sk->sk_state) {
--
2.34.1
DCCP sockets in DCCP_REQUESTING state do not check the sequence number
or acknowledgment number for incoming Reset, CloseReq, and Close packets.
As a result, an attacker can send a spoofed Reset packet while the client
is in the requesting state. The client will accept the packet without
verification and immediately close the connection, causing a denial of
service (DoS) attack.
This patch moves the processing of Reset, Close, and CloseReq packets
into dccp_rcv_request_sent_state_process() and validates the ack number
before accepting them.
This fix should apply to Linux 5.x and 6.x, including stable versions.
Note that DCCP was removed in Linux 6.16, so this patch is only relevant
for older versions. We tested it on Ubuntu 24.04 LTS (Linux 6.8) and
it worked as expected.
Signed-off-by: Yizhou Zhao <zhaoyz24(a)mails.tsinghua.edu.cn>
Cc: stable(a)vger.kernel.org
---
net/dccp/input.c | 54 ++++++++++++++++++++++++++++--------------------
1 file changed, 32 insertions(+), 22 deletions(-)
diff --git a/net/dccp/input.c b/net/dccp/input.c
index 2cbb757a8..0b1ffb044 100644
--- a/net/dccp/input.c
+++ b/net/dccp/input.c
@@ -397,21 +397,22 @@ static int dccp_rcv_request_sent_state_process(struct sock *sk,
* / * Response processing continues in Step 10; Reset
* processing continues in Step 9 * /
*/
+ struct dccp_sock *dp = dccp_sk(sk);
+
+ if (!between48(DCCP_SKB_CB(skb)->dccpd_ack_seq,
+ dp->dccps_awl, dp->dccps_awh)) {
+ dccp_pr_debug("invalid ackno: S.AWL=%llu, "
+ "P.ackno=%llu, S.AWH=%llu\n",
+ (unsigned long long)dp->dccps_awl,
+ (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
+ (unsigned long long)dp->dccps_awh);
+ goto out_invalid_packet;
+ }
+
if (dh->dccph_type == DCCP_PKT_RESPONSE) {
const struct inet_connection_sock *icsk = inet_csk(sk);
- struct dccp_sock *dp = dccp_sk(sk);
- long tstamp = dccp_timestamp();
-
- if (!between48(DCCP_SKB_CB(skb)->dccpd_ack_seq,
- dp->dccps_awl, dp->dccps_awh)) {
- dccp_pr_debug("invalid ackno: S.AWL=%llu, "
- "P.ackno=%llu, S.AWH=%llu\n",
- (unsigned long long)dp->dccps_awl,
- (unsigned long long)DCCP_SKB_CB(skb)->dccpd_ack_seq,
- (unsigned long long)dp->dccps_awh);
- goto out_invalid_packet;
- }
+ long tstamp = dccp_timestamp();
/*
* If option processing (Step 8) failed, return 1 here so that
* dccp_v4_do_rcv() sends a Reset. The Reset code depends on
@@ -496,6 +497,13 @@ static int dccp_rcv_request_sent_state_process(struct sock *sk,
}
dccp_send_ack(sk);
return -1;
+ } else if (dh->dccph_type == DCCP_PKT_RESET) {
+ dccp_rcv_reset(sk, skb);
+ return 0;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) {
+ return dccp_rcv_closereq(sk, skb);
+ } else if (dh->dccph_type == DCCP_PKT_CLOSE) {
+ return dccp_rcv_close(sk, skb);
}
out_invalid_packet:
@@ -658,17 +666,19 @@ int dccp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
* Set TIMEWAIT timer
* Drop packet and return
*/
- if (dh->dccph_type == DCCP_PKT_RESET) {
- dccp_rcv_reset(sk, skb);
- return 0;
- } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) { /* Step 13 */
- if (dccp_rcv_closereq(sk, skb))
- return 0;
- goto discard;
- } else if (dh->dccph_type == DCCP_PKT_CLOSE) { /* Step 14 */
- if (dccp_rcv_close(sk, skb))
+ if (sk->sk_state != DCCP_REQUESTING) {
+ if (dh->dccph_type == DCCP_PKT_RESET) {
+ dccp_rcv_reset(sk, skb);
return 0;
- goto discard;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSEREQ) { /* Step 13 */
+ if (dccp_rcv_closereq(sk, skb))
+ return 0;
+ goto discard;
+ } else if (dh->dccph_type == DCCP_PKT_CLOSE) { /* Step 14 */
+ if (dccp_rcv_close(sk, skb))
+ return 0;
+ goto discard;
+ }
}
switch (sk->sk_state) {
--
2.34.1
Hi,
While testing Linux kernel 6.12.42 on OpenWrt, we observed a
regression in IPv6 Router Advertisement (RA) handling for the default
router.
Affected commits
The following commits appear related and may have introduced the issue:
ipv6: fix possible infinite loop in fib6_info_uses_dev():
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
ipv6: prevent infinite loop in rt6_nlmsg_size():
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
ipv6: annotate data-races around rt->fib6_nsiblings:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=…
Problem description:
In Linux kernel 6.12.42, IPv6 FIB multipath and concurrent access
handling was made stricter (READ_ONCE / WRITE_ONCE + RCU retry).
The RA “Automatic” mode relies on checking whether a local default route exists.
With the stricter FIB handling, this check can fail in multipath scenarios.
As a result, RA does not advertise a default route, and IPv6 clients
on LAN fail to receive the default gateway.
Steps to reproduce
Run OpenWrt with kernel 6.12.42 (or 6.12.43) on a router with br-lan bridge.
Configure IPv6 RA in Automatic default router mode.
Observe that no default route is advertised to clients (though
prefixes may still be delivered).
Expected behavior
Router Advertisement should continue to advertise the default route as
in kernel 6.12.41 and earlier.
Client IPv6 connectivity should not break.
Actual behavior
RA fails to advertise a default route in Automatic mode.
Clients do not install a default IPv6 route → connectivity fails.
Temporary workaround
Change RA default router mode from Automatic → Always / Use available
prefixes in OpenWrt.
This bypasses the dependency on local default route check and restores
correct RA behavior.
Additional notes
This appears to be an unintended side effect of the stricter FIB
handling changes introduced in 6.12.42. Please advise if this has
already been reported or if I should prepare a minimal reproducer
outside OpenWrt.
Thanks,
[GitHub: mgz0227]
[BUG]
When running test case generic/457, there is a chance to hit the
following error, with 64K page size and 4K btrfs block size, and
"compress=zstd" mount option:
FSTYP -- btrfs
PLATFORM -- Linux/aarch64 btrfs-aarch64 6.17.0-rc2-custom+ #129 SMP PREEMPT_DYNAMIC Wed Aug 20 18:52:51 ACST 2025
MKFS_OPTIONS -- -s 4k /dev/mapper/test-scratch1
MOUNT_OPTIONS -- -o compress=zstd /dev/mapper/test-scratch1 /mnt/scratch
generic/457 2s ... [failed, exit status 1]- output mismatch (see /home/adam/xfstests-dev/results//generic/457.out.bad)
--- tests/generic/457.out 2024-04-25 18:13:45.160550980 +0930
+++ /home/adam/xfstests-dev/results//generic/457.out.bad 2025-08-22 16:09:41.039352391 +0930
@@ -1,2 +1,3 @@
QA output created by 457
-Silence is golden
+testfile6 end md5sum mismatched
+(see /home/adam/xfstests-dev/results//generic/457.full for details)
...
(Run 'diff -u /home/adam/xfstests-dev/tests/generic/457.out /home/adam/xfstests-dev/results//generic/457.out.bad' to see the entire diff)
The root problem is, after certain fsx operations the file contents
change just after a mount cycle.
There is a much smaller reproducer based on that test case, which I
mainly used to debug the bug:
workload() {
	mkfs.btrfs -f $dev > /dev/null
	dmesg -C
	trace-cmd clear
	mount -o compress=zstd $dev $mnt
	xfs_io -f -c "pwrite -S 0xff 0 256K" -c "sync" $mnt/base > /dev/null
	cp --reflink=always -p -f $mnt/base $mnt/file
	$fsx -N 4 -d -k -S 3746842 $mnt/file
	if [ $? -ne 0 ]; then
		echo "!!! FSX FAILURE !!!"
		fail
	fi
	csum_before=$(_md5_checksum $mnt/file)
	stop_trace
	umount $mnt
	mount $dev $mnt
	csum_after=$(_md5_checksum $mnt/file)
	umount $mnt
	if [ "$csum_before" != "$csum_after" ]; then
		echo "!!! CSUM MISMATCH !!!"
		fail
	fi
}
This seed value will cause a 100% reproducible csum mismatch after a mount
cycle.
The seed value results in only 2 real operations:
Seed set to 3746842
main: filesystem does not support fallocate mode FALLOC_FL_UNSHARE_RANGE, disabling!
main: filesystem does not support fallocate mode FALLOC_FL_COLLAPSE_RANGE, disabling!
main: filesystem does not support fallocate mode FALLOC_FL_INSERT_RANGE, disabling!
main: filesystem does not support exchange range, disabling!
main: filesystem does not support dontcache IO, disabling!
2 clone from 0x3b000 to 0x3f000, (0x4000 bytes) at 0x1f000
3 write 0x2975b thru 0x2ba20 (0x22c6 bytes) dontcache=0
All 4 operations completed A-OK!
[CAUSE]
With extra debug trace_printk(), the following sequence can explain the
root cause:
fsx-3900290 [002] ..... 161696.160966: btrfs_submit_compressed_read: r/i=5/258 file_off=131072 em start=126976 len=16384
The "r/i" is showing the root id and the ino number.
In this case, my minimal reproducer is indeed using inode 258 of
subvolume 5, and that's the inode with changing contents.
The above trace is from the function btrfs_submit_compressed_read(),
triggered by fsx to read the folio at file offset 128K.
Notice that the extent map is at offset 124K, with a length of 16K.
This means the extent map only covers the first 12K (3 blocks) of the
folio at 128K.
fsx-3900290 [002] ..... 161696.160969: trace_dump_cb: btrfs_submit_compressed_read, r/i=5/258 file off start=131072 len=65536 bi_size=65536
This is the line I used to dump the basic info of a bbio, which shows the
bi_size is 64K, aka covering the whole 64K folio at file offset 128K.
But remember, the extent map only covers 3 blocks, definitely not enough
to cover the whole 64K folio at 128K file offset.
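The arithmetic behind this can be checked with a small user-space sketch. `em_bytes_in_folio()` is a hypothetical helper written for this explanation and is not part of btrfs; it computes how many bytes of a folio are covered by an extent map.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of the folio [folio_start, folio_start + folio_len) that are
 * covered by the extent map [em_start, em_start + em_len).
 */
static uint64_t em_bytes_in_folio(uint64_t em_start, uint64_t em_len,
				  uint64_t folio_start, uint64_t folio_len)
{
	uint64_t em_end = em_start + em_len;
	uint64_t folio_end = folio_start + folio_len;
	uint64_t lo = em_start > folio_start ? em_start : folio_start;
	uint64_t hi = em_end < folio_end ? em_end : folio_end;

	assert(em_len > 0 && folio_len > 0);
	return hi > lo ? hi - lo : 0;
}
```

With the values from the trace (em start=126976 len=16384, folio start=131072 len=65536) the helper returns 12288 bytes, that is 3 blocks of 4K, while the bio reports a bi_size of 65536.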
kworker/u19:1-3748349 [002] ..... 161696.161154: btrfs_decompress_buf2page: r/i=5/258 file_off=131072 copy_len=4096 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161155: btrfs_decompress_buf2page: r/i=5/258 file_off=135168 copy_len=4096 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161156: btrfs_decompress_buf2page: r/i=5/258 file_off=139264 copy_len=4096 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161157: btrfs_decompress_buf2page: r/i=5/258 file_off=143360 copy_len=4096 content=ffff
The above lines show that btrfs_decompress_buf2page() called by zstd
decompress code is copying the decompressed content into the filemap.
But notice that the last line is already beyond the extent map range.
Furthermore, no more compressed content is copied, as the
compressed bio only has the extent map to cover the first 3 blocks (the
4th block copy is already incorrect).
kworker/u19:1-3748349 [002] ..... 161696.161161: trace_dump_cb: r/i=5/258 file_pos=131072 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161161: trace_dump_cb: r/i=5/258 file_pos=135168 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161162: trace_dump_cb: r/i=5/258 file_pos=139264 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161162: trace_dump_cb: r/i=5/258 file_pos=143360 content=ffff
kworker/u19:1-3748349 [002] ..... 161696.161162: trace_dump_cb: r/i=5/258 file_pos=147456 content=0000
This is the extra dump of the compressed bio; after file offset
140K (143360), the content is all zeros, which is incorrect.
The zeros are there because we didn't copy anything into the folio.
The root cause of the corruption is, we are submitting a compressed read
for a whole folio, but the extent map we get only covers the first 3
blocks, meaning the compressed read path is merging reads that shouldn't
be merged.
The involved file extents are:
item 19 key (258 EXTENT_DATA 126976) itemoff 15143 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 13635584 nr 4096
extent data offset 110592 nr 16384 ram 131072
extent compression 3 (zstd)
item 20 key (258 EXTENT_DATA 143360) itemoff 15090 itemsize 53
generation 9 type 1 (regular)
extent data disk byte 13635584 nr 4096
extent data offset 12288 nr 24576 ram 131072
extent compression 3 (zstd)
Note that both extents, at 124K and 140K, point to the same
compressed extent, but with different offsets.
This means reads of the ranges [124K, 140K) and [140K, 164K) should not
be merged.
But the read merge check function, btrfs_bio_is_contig(), only checks
the disk_bytenr of two compressed reads, as there is not enough info
(such as the involved extent maps) to do more comprehensive checks, resulting
in the incorrect compressed read.
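The difference between the two kinds of check can be modelled in user space. This is a sketch under stated assumptions: `struct read_sketch`, `same_disk_extent()` and `same_extent_map()` are hypothetical names and do not reproduce the btrfs implementation; the numbers are taken from the two file extent items above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of one compressed read; not a btrfs structure. */
struct read_sketch {
	uint64_t disk_bytenr;	/* start of the compressed extent on disk */
	uint64_t em_start;	/* file offset where the extent map starts */
	uint64_t em_len;	/* length of the extent map */
};

/* Merge decision that looks at the on-disk location only. */
static bool same_disk_extent(const struct read_sketch *a,
			     const struct read_sketch *b)
{
	return a->disk_bytenr == b->disk_bytenr;
}

/* Stricter decision: additionally require the same extent map. */
static bool same_extent_map(const struct read_sketch *a,
			    const struct read_sketch *b)
{
	assert(a && b);
	return same_disk_extent(a, b) &&
	       a->em_start == b->em_start && a->em_len == b->em_len;
}
```

For item 19 (start 126976, len 16384) and item 20 (start 143360, len 24576), both at disk byte 13635584, the first check reports a match while the second does not, so only the second keeps the two reads apart.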
Unfortunately this is a long-standing bug, predating subpage block size
support.
But subpage block size support (and experimental large folio support)
makes it much easier to detect.
If block size equals page size, a regular page read will only read one
block each time, thus there is no extent map sharing or merging.
(This means for bs == ps cases, it's still possible to hit the bug with
readahead; we just don't have test coverage with content verification
for readahead.)
[FIX]
Save the last hit compressed extent map start/len into btrfs_bio_ctrl,
and check if the current extent map is the same as the saved one.
Here we only save em::start/len to save memory for btrfs_bio_ctrl, as
it's using the stack memory, which is a very limited resource inside the
kernel.
Since the compressed extent maps are never merged, their start/len are
unique inside the same inode, thus just checking start/len will be
enough to make sure they are the same extent map.
If the extent maps do not match, force submitting the current bio, so
that the read will never be merged.
CC: stable(a)vger.kernel.org
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
---
v2:
- Only save extent_map::start/len to save memory for btrfs_bio_ctrl
It's using on-stack memory which is very limited inside the kernel.
- Remove the commit message's mention of clearing the last saved em
Since we're using em::start/len, there is no need to clear them.
Either we hit the same em::start/len, meaning hitting the same extent
map, or we hit a different em, which will have a different start/len.
---
fs/btrfs/extent_io.c | 52 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0c12fd64a1f3..418e3bf40f94 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -131,6 +131,22 @@ struct btrfs_bio_ctrl {
*/
unsigned long submit_bitmap;
struct readahead_control *ractl;
+
+ /*
+ * The start/len of the last hit compressed extent map.
+ *
+ * The current btrfs_bio_is_contig() only uses disk_bytenr as
+ * the condition to check if the read can be merged with previous
+ * bio, which is not correct. E.g. two file extents pointing to the
+ * same extent.
+ *
+ * So here we need to do extra check to merge reads that are
+ * covered by the same extent map.
+ * Just extent_map::start/len will be enough, as they are unique
+ * inside the same inode.
+ */
+ u64 last_compress_em_start;
+ u64 last_compress_em_len;
};
/*
@@ -957,6 +973,32 @@ static void btrfs_readahead_expand(struct readahead_control *ractl,
readahead_expand(ractl, ra_pos, em_end - ra_pos);
}
+static void save_compressed_em(struct btrfs_bio_ctrl *bio_ctrl,
+ const struct extent_map *em)
+{
+ if (btrfs_extent_map_compression(em) == BTRFS_COMPRESS_NONE)
+ return;
+ bio_ctrl->last_compress_em_start = em->start;
+ bio_ctrl->last_compress_em_len = em->len;
+}
+
+static bool is_same_compressed_em(struct btrfs_bio_ctrl *bio_ctrl,
+ const struct extent_map *em)
+{
+ /*
+ * Only if the em is completely the same as the previous one we can merge
+ * the current folio in the read bio.
+ *
+ * Here we only need to compare the em->start/len against saved
+ * last_compress_em_start/len, as start/len inside an inode are unique,
+ * and compressed extent maps are never merged.
+ */
+ if (em->start != bio_ctrl->last_compress_em_start ||
+ em->len != bio_ctrl->last_compress_em_len)
+ return false;
+ return true;
+}
+
/*
* basic readpage implementation. Locked extent state structs are inserted
* into the tree that are removed when the IO is done (by the end_io
@@ -1080,9 +1122,19 @@ static int btrfs_do_readpage(struct folio *folio, struct extent_map **em_cached,
*prev_em_start != em->start)
force_bio_submit = true;
+ /*
+ * We must ensure we only merge compressed read when the current
+ * extent map matches the previous one exactly.
+ */
+ if (compress_type != BTRFS_COMPRESS_NONE) {
+ if (!is_same_compressed_em(bio_ctrl, em))
+ force_bio_submit = true;
+ }
+
if (prev_em_start)
*prev_em_start = em->start;
+ save_compressed_em(bio_ctrl, em);
em_gen = em->generation;
btrfs_free_extent_map(em);
em = NULL;
--
2.50.1
Dear Kernel Developers,
We attach a patch for the CVE-2025-21751 vulnerability, backported from kernel 6.13 to 6.12 (as proposed by Greg k-h on the full disclosure mailing list).
This patch was tested on bare metal and virtual machines and has been rolled out in production.
I hope the patch is suitable for cherry-picking. Please let us know if anything needs to be updated or modified.
Regards,
Sujana, Akendo
Function nvkm_gsp_fwsec_v2() sets 'ret' if the kmemdup() call fails, but
it never uses or returns 'ret' after that point. We always need to release
the firmware regardless, so do that and then check for error.
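The ordering described here (release the source unconditionally, then test the result of the copy) can be sketched in user space. All names below are hypothetical: `struct blob_sketch`, `blob_put()` and `dup_then_put()` only model the pattern and are not nouveau APIs, and the `simulate_oom` flag exists solely to exercise the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted source buffer. */
struct blob_sketch {
	const char *data;
	size_t len;
	int refs;
};

static void blob_put(struct blob_sketch *b)
{
	assert(b->refs > 0);
	b->refs--;
}

/* Copy the payload, drop the source reference, then report any failure. */
static int dup_then_put(struct blob_sketch *src, char **out, bool simulate_oom)
{
	char *copy = simulate_oom ? NULL : malloc(src->len);

	if (copy)
		memcpy(copy, src->data, src->len);

	/* The reference is dropped on both the success and the error path. */
	blob_put(src);

	if (!copy)
		return -ENOMEM;

	*out = copy;
	return 0;
}
```

Checking the allocation only after the put keeps a single release call and still propagates the error, instead of storing it in a variable that is never read.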
Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Cc: stable(a)vger.kernel.org # v6.7+
Signed-off-by: Timur Tabi <ttabi(a)nvidia.com>
---
drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c
index 52412965fac1..5b721bd9d799 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/gsp/fwsec.c
@@ -209,11 +209,12 @@ nvkm_gsp_fwsec_v2(struct nvkm_gsp *gsp, const char *name,
fw->boot_addr = bld->start_tag << 8;
fw->boot_size = bld->code_size;
fw->boot = kmemdup(bl->data + hdr->data_offset + bld->code_off, fw->boot_size, GFP_KERNEL);
- if (!fw->boot)
- ret = -ENOMEM;
nvkm_firmware_put(bl);
+ if (!fw->boot)
+ return -ENOMEM;
+
/* Patch in interface data. */
return nvkm_gsp_fwsec_patch(gsp, fw, desc->InterfaceOffset, init_cmd);
}
--
2.43.0
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 62708b9452f8eb77513115b17c4f8d1a22ebf843
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025082443-caliber-swung-4d8f@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 62708b9452f8eb77513115b17c4f8d1a22ebf843 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Tue, 19 Aug 2025 19:19:51 -0700
Subject: [PATCH] tls: fix handling of zero-length records on the rx_list
Each recvmsg() call must process either
- only contiguous DATA records (any number of them)
- one non-DATA record
If the next record has different type than what has already been
processed we break out of the main processing loop. If the record
has already been decrypted (which may be the case for TLS 1.3 where
we don't know type until decryption) we queue the pending record
to the rx_list. Next recvmsg() will pick it up from there.
Queuing the skb to rx_list after zero-copy decrypt is not possible,
since in that case we decrypted directly to the user space buffer,
and we don't have an skb to queue (darg.skb points to the ciphertext
skb for access to metadata like length).
Only data records are allowed zero-copy, and we break the processing
loop after each non-data record. So we should never zero-copy and
then find out that the record type has changed. The corner case
we missed is when the initial record comes from rx_list, and it's
zero length.
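The corner case can be made concrete with a user-space model of the early-exit condition before and after the change. This is only a sketch: the function names are hypothetical, `REC_TYPE_DATA` stands in for TLS_RECORD_TYPE_DATA (assumed to be 23, the TLS application_data content type), and the surrounding recvmsg() logic is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REC_TYPE_DATA 23	/* TLS application_data content type (0x17) */

/* Early-exit condition as it read before the fix. */
static bool stop_before_fix(size_t len, size_t copied, uint8_t control,
			    bool rx_more)
{
	return len <= copied || (copied && control != REC_TYPE_DATA) || rx_more;
}

/* Early-exit condition after the fix: keyed on control, not on copied. */
static bool stop_after_fix(size_t len, size_t copied, uint8_t control,
			   bool rx_more)
{
	return len <= copied || rx_more ||
	       (control && control != REC_TYPE_DATA);
}

/* A zero-length non-data record (type 22, handshake) taken from rx_list. */
static void check_zero_length_case(void)
{
	assert(!stop_before_fix(100, 0, 22, false));	/* keeps processing */
	assert(stop_after_fix(100, 0, 22, false));	/* stops as intended */
}
```

Because a zero-length record leaves copied at 0, the old condition could not tell that a non-data record had already been processed; testing control directly closes that gap, with 0 meaning that no type has been reported yet.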
Reported-by: Muhammad Alifa Ramdhan <ramdhan(a)starlabs.sg>
Reported-by: Billy Jheng Bing-Jhong <billy(a)starlabs.sg>
Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser")
Reviewed-by: Sabrina Dubroca <sd(a)queasysnail.net>
Link: https://patch.msgid.link/20250820021952.143068-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 51c98a007dda..bac65d0d4e3e 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -1808,6 +1808,9 @@ int decrypt_skb(struct sock *sk, struct scatterlist *sgout)
return tls_decrypt_sg(sk, NULL, sgout, &darg);
}
+/* All records returned from a recvmsg() call must have the same type.
+ * 0 is not a valid content type. Use it as "no type reported, yet".
+ */
static int tls_record_content_type(struct msghdr *msg, struct tls_msg *tlm,
u8 *control)
{
@@ -2051,8 +2054,10 @@ int tls_sw_recvmsg(struct sock *sk,
if (err < 0)
goto end;
+ /* process_rx_list() will set @control if it processed any records */
copied = err;
- if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA) || rx_more)
+ if (len <= copied || rx_more ||
+ (control && control != TLS_RECORD_TYPE_DATA))
goto end;
target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
Hi,
I noticed an ERR_PTR dereference issue in expand_files() on kernel 6.12.43
when allocating large file descriptor tables. The issue occurs when
alloc_fdtable() returns ERR_PTR(-EMFILE) for a large nr input, but
expand_fdtable() does not properly check for this error return. dup_fd()
also seems to have the issue, as it lacks proper ERR_PTR handling.
The ERR_PTR return was introduced by d4f9351243c1 ("fs: Prevent file
descriptor table allocations exceeding INT_MAX") which adds INT_MAX limit
check in alloc_fdtable().
I was able to trigger this with the unshare_test selftest:
[ 40.283906] BUG: unable to handle page fault for address: ffffffffffffffe8
...
[ 40.287436] RIP: 0010:expand_files+0x7e/0x1c0
...
[ 40.366211] Kernel panic - not syncing: Fatal exception
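The faulting address ffffffffffffffe8 is consistent with an error pointer being dereferenced: it equals -24 as an unsigned 64-bit value, and EMFILE is 24 on Linux. The following user-space sketch models the encoding; `err_ptr()`, `ptr_err()`, `is_err()` and `MAX_ERRNO_SKETCH` are hypothetical re-implementations for illustration and are not the kernel's ERR_PTR helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO_SKETCH 4095

/* Encode a negative errno value in a pointer. */
static void *err_ptr(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO_SKETCH);
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static long ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top MAX_ERRNO_SKETCH addresses. */
static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_SKETCH;
}
```

An error pointer is not NULL, so a caller that only tests for NULL would treat it as a valid table and fault on the first access; the caller has to test with an is_err()-style check instead.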
Looking at the upstream kernel, this can be addressed by Al Viro's
fdtable series [1], which added the ERR_PTR handling in this code path.
Perhaps backporting this series, especially 1d3b4be ("alloc_fdtable():
change calling conventions.") would help resolve the issue.
Thanks for all the work on stable tree.
Best,
Nathan Gao
[1] https://lore.kernel.org/all/20241007173912.GR4017910@ZenIV/
Signed-off-by: Nathan Gao <zcgao(a)amazon.com>
--
2.47.3