From: Darrick J. Wong <djwong(a)kernel.org>
I received a report from the release engineering side of the house that
xfs_scrub without the -n flag (aka fix it mode) would try to fix a
broken filesystem even on a kernel that doesn't have online repair built
into it:
# xfs_scrub -dTvn /mnt/test
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt/test: using 1 threads to scrub.
Phase 1: Memory used: 132k/0k (108k/25k), time: 0.00/ 0.00/ 0.00s
<snip>
Phase 4: Repair filesystem.
<snip>
Info: /mnt/test/some/victimdir directory entries: Attempting repair. (repair.c line 351)
Corruption: /mnt/test/some/victimdir directory entries: Repair unsuccessful; offline repair required. (repair.c line 204)
Source: https://blogs.oracle.com/linux/post/xfs-online-filesystem-repair
It is strange that xfs_scrub doesn't refuse to run, because the kernel
is supposed to return EOPNOTSUPP if we actually needed to run a repair,
and xfs_io's repair subcommand will perror that. And yet:
# xfs_io -x -c 'repair probe' /mnt/test
#
The first problem is commit dcb660f9222fd9 (4.15) which should have had
xchk_probe set the CORRUPT OFLAG so that any of the repair machinery
will get called at all.
It turns out that some refactoring that happened in the 6.6-6.8 era
broke the operation of this corner case. What we *really* want to
happen is that all the predicates that would steer xfs_scrub_metadata()
towards calling xrep_attempt() should function the same way that they do
when repair is compiled in; and then xrep_attempt gets to return the
fatal EOPNOTSUPP error code that causes the probe to fail.
Instead, commit 8336a64eb75cba (6.6) started the failwhale swimming by
hoisting OFLAG checking logic into a helper whose non-repair stub always
returns false, causing scrub to return "repair not needed" when in fact
the repair is not supported. Prior to that commit, the oflag checking
that was open-coded in scrub.c worked correctly.
Similarly, in commit 4bdfd7d15747b1 (6.8) we hoisted the IFLAG_REPAIR
and ALREADY_FIXED logic into a helper whose non-repair stub always
returns false, so we never enter the if test body that would have called
xrep_attempt, let alone fail to decode the OFLAGs correctly.
The final insult (yes, we're doing The Naked Gun now) is commit
48a72f60861f79 (6.8) in which we hoisted the "are we going to try a
repair?" predicate into yet another function with a non-repair stub
always returns false.
Fix xchk_probe to trigger xrep_probe if repair is enabled, or return
EOPNOTSUPP directly if it is not. For all the other scrub types, we
need to fix the header predicates so that the ->repair functions (which
are all xrep_notsupported) get called to return EOPNOTSUPP. Commit
48a72 is tagged here because the scrub code prior to LTS 6.12 are
incomplete and not worth patching.
Reported-by: David Flynn <david.flynn(a)oracle.com>
Cc: <stable(a)vger.kernel.org> # v6.8
Fixes: 48a72f60861f79 ("xfs: don't complain about unfixed metadata when repairs were injected")
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
---
fs/xfs/scrub/common.h | 5 -----
fs/xfs/scrub/repair.h | 11 ++++++++++-
fs/xfs/scrub/scrub.c | 12 ++++++++++++
3 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index 546be550b221b6..4722cb51fd1522 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -225,7 +225,6 @@ static inline bool xchk_skip_xref(struct xfs_scrub_metadata *sm)
bool xchk_dir_looks_zapped(struct xfs_inode *dp);
bool xchk_pptr_looks_zapped(struct xfs_inode *ip);
-#ifdef CONFIG_XFS_ONLINE_REPAIR
/* Decide if a repair is required. */
static inline bool xchk_needs_repair(const struct xfs_scrub_metadata *sm)
{
@@ -245,10 +244,6 @@ static inline bool xchk_could_repair(const struct xfs_scrub *sc)
return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
!(sc->flags & XREP_ALREADY_FIXED);
}
-#else
-# define xchk_needs_repair(sc) (false)
-# define xchk_could_repair(sc) (false)
-#endif /* CONFIG_XFS_ONLINE_REPAIR */
int xchk_metadata_inode_forks(struct xfs_scrub *sc);
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index 823c00d1a50262..af0a3a9e5ed978 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -191,7 +191,16 @@ int xrep_reset_metafile_resv(struct xfs_scrub *sc);
#else
#define xrep_ino_dqattach(sc) (0)
-#define xrep_will_attempt(sc) (false)
+
+/*
+ * When online repair is not built into the kernel, we still want to attempt
+ * the repair so that the stub xrep_attempt below will return EOPNOTSUPP.
+ */
+static inline bool xrep_will_attempt(const struct xfs_scrub *sc)
+{
+ return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_FORCE_REBUILD) ||
+ xchk_needs_repair(sc->sm);
+}
static inline int
xrep_attempt(
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index d3a4ddd918f621..01fdbbc7adf30e 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -149,6 +149,18 @@ xchk_probe(
if (xchk_should_terminate(sc, &error))
return error;
+ /*
+ * If the caller is probing to see if repair works but repair isn't
+ * built into the kernel, return EOPNOTSUPP because that's the signal
+ * that userspace expects. If online repair is built in, set the
+ * CORRUPT flag (without any of the usual tracing/logging) to force us
+ * into xrep_probe.
+ */
+ if (xchk_could_repair(sc)) {
+ if (!IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR))
+ return -EOPNOTSUPP;
+ sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
+ }
return 0;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 38724591364e1e3b278b4053f102b49ea06ee17c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011310-ruckus-ceramics-7ebc@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38724591364e1e3b278b4053f102b49ea06ee17c Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Mon, 25 Nov 2024 22:16:12 +0100
Subject: [PATCH] iio: adc: rockchip_saradc: fix information leak in triggered
buffer
The 'data' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable(a)vger.kernel.org
Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@g…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c
index 240cfa391674..dfd47a6e1f4a 100644
--- a/drivers/iio/adc/rockchip_saradc.c
+++ b/drivers/iio/adc/rockchip_saradc.c
@@ -368,6 +368,8 @@ static irqreturn_t rockchip_saradc_trigger_handler(int irq, void *p)
int ret;
int i, j = 0;
+ memset(&data, 0, sizeof(data));
+
mutex_lock(&info->lock);
iio_for_each_active_channel(i_dev, i) {
From: Dario Binacchi <dario.binacchi(a)amarulasolutions.com>
[ Upstream commit 9ad86d377ef4a19c75a9c639964879a5b25a433b ]
The commit a22bd630cfff ("can: hi311x: do not report txerr and rxerr
during bus-off") removed the reporting of rxerr and txerr even in case
of correct operation (i. e. not bus-off).
The error count information added to the CAN frame after netif_rx() is
a potential use after free, since there is no guarantee that the skb
is in the same state. It might be freed or reused.
Fix the issue by postponing the netif_rx() call in case of txerr and
rxerr reporting.
Fixes: a22bd630cfff ("can: hi311x: do not report txerr and rxerr during bus-off")
Signed-off-by: Dario Binacchi <dario.binacchi(a)amarulasolutions.com>
Link: https://patch.msgid.link/20241122221650.633981-5-dario.binacchi@amarulasolu…
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
[kovalev: changed the call order of netif_rx_ni()
according to netif_rx() of the upstream patch]
Signed-off-by: Vasiliy Kovalev <kovalev(a)altlinux.org>
---
Backport to fix CVE-2024-56651
Link: https://www.cve.org/CVERecord/?id=CVE-2024-56651
---
drivers/net/can/spi/hi311x.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/spi/hi311x.c b/drivers/net/can/spi/hi311x.c
index 28273e84171a25..1cdc05475cda61 100644
--- a/drivers/net/can/spi/hi311x.c
+++ b/drivers/net/can/spi/hi311x.c
@@ -673,9 +673,9 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
tx_state = txerr >= rxerr ? new_state : 0;
rx_state = txerr <= rxerr ? new_state : 0;
can_change_state(net, cf, tx_state, rx_state);
- netif_rx_ni(skb);
if (new_state == CAN_STATE_BUS_OFF) {
+ netif_rx_ni(skb);
can_bus_off(net);
if (priv->can.restart_ms == 0) {
priv->force_quit = 1;
@@ -685,6 +685,7 @@ static irqreturn_t hi3110_can_ist(int irq, void *dev_id)
} else {
cf->data[6] = txerr;
cf->data[7] = rxerr;
+ netif_rx_ni(skb);
}
}
--
2.33.8
From: Kuniyuki Iwashima <kuniyu(a)amazon.com>
commit e8c526f2bdf1845bedaf6a478816a3d06fa78b8f upstream.
Martin KaFai Lau reported use-after-free [0] in reqsk_timer_handler().
"""
We are seeing a use-after-free from a bpf prog attached to
trace_tcp_retransmit_synack. The program passes the req->sk to the
bpf_sk_storage_get_tracing kernel helper which does check for null
before using it.
"""
The commit 83fccfc3940c ("inet: fix potential deadlock in
reqsk_queue_unlink()") added timer_pending() in reqsk_queue_unlink() not
to call del_timer_sync() from reqsk_timer_handler(), but it introduced a
small race window.
Before the timer is called, expire_timers() calls detach_timer(timer, true)
to clear timer->entry.pprev and marks it as not pending.
If reqsk_queue_unlink() checks timer_pending() just after expire_timers()
calls detach_timer(), TCP will miss del_timer_sync(); the reqsk timer will
continue running and send multiple SYN+ACKs until it expires.
The reported UAF could happen if req->sk is close()d earlier than the timer
expiration, which is 63s by default.
The scenario would be
1. inet_csk_complete_hashdance() calls inet_csk_reqsk_queue_drop(),
but del_timer_sync() is missed
2. reqsk timer is executed and scheduled again
3. req->sk is accept()ed and reqsk_put() decrements rsk_refcnt, but
reqsk timer still has another one, and inet_csk_accept() does not
clear req->sk for non-TFO sockets
4. sk is close()d
5. reqsk timer is executed again, and BPF touches req->sk
Let's not use timer_pending() by passing the caller context to
__inet_csk_reqsk_queue_drop().
Note that reqsk timer is pinned, so the issue does not happen in most
use cases. [1]
[0]
BUG: KFENCE: use-after-free read in bpf_sk_storage_get_tracing+0x2e/0x1b0
Use-after-free read at 0x00000000a891fb3a (in kfence-#1):
bpf_sk_storage_get_tracing+0x2e/0x1b0
bpf_prog_5ea3e95db6da0438_tcp_retransmit_synack+0x1d20/0x1dda
bpf_trace_run2+0x4c/0xc0
tcp_rtx_synack+0xf9/0x100
reqsk_timer_handler+0xda/0x3d0
run_timer_softirq+0x292/0x8a0
irq_exit_rcu+0xf5/0x320
sysvec_apic_timer_interrupt+0x6d/0x80
asm_sysvec_apic_timer_interrupt+0x16/0x20
intel_idle_irq+0x5a/0xa0
cpuidle_enter_state+0x94/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb
kfence-#1: 0x00000000a72cc7b6-0x00000000d97616d9, size=2376, cache=TCPv6
allocated by task 0 on cpu 9 at 260507.901592s:
sk_prot_alloc+0x35/0x140
sk_clone_lock+0x1f/0x3f0
inet_csk_clone_lock+0x15/0x160
tcp_create_openreq_child+0x1f/0x410
tcp_v6_syn_recv_sock+0x1da/0x700
tcp_check_req+0x1fb/0x510
tcp_v6_rcv+0x98b/0x1420
ipv6_list_rcv+0x2258/0x26e0
napi_complete_done+0x5b1/0x2990
mlx5e_napi_poll+0x2ae/0x8d0
net_rx_action+0x13e/0x590
irq_exit_rcu+0xf5/0x320
common_interrupt+0x80/0x90
asm_common_interrupt+0x22/0x40
cpuidle_enter_state+0xfb/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb
freed by task 0 on cpu 9 at 260507.927527s:
rcu_core_si+0x4ff/0xf10
irq_exit_rcu+0xf5/0x320
sysvec_apic_timer_interrupt+0x6d/0x80
asm_sysvec_apic_timer_interrupt+0x16/0x20
cpuidle_enter_state+0xfb/0x273
cpu_startup_entry+0x15e/0x260
start_secondary+0x8a/0x90
secondary_startup_64_no_verify+0xfa/0xfb
Fixes: 83fccfc3940c ("inet: fix potential deadlock in reqsk_queue_unlink()")
Reported-by: Martin KaFai Lau <martin.lau(a)kernel.org>
Closes: https://lore.kernel.org/netdev/eb6684d0-ffd9-4bdc-9196-33f690c25824@linux.d…
Link: https://lore.kernel.org/netdev/b55e2ca0-42f2-4b7c-b445-6ffd87ca74a0@linux.d… [1]
Signed-off-by: Kuniyuki Iwashima <kuniyu(a)amazon.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Reviewed-by: Martin KaFai Lau <martin.lau(a)kernel.org>
Link: https://patch.msgid.link/20241014223312.4254-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
[d.privalov: adapt calling __inet_csk_reqsk_queue_drop()]
Signed-off-by: Dmitriy Privalov <d.privalov(a)omp.ru>
---
Backport fix for CVE-2024-50154
Link: https://nvd.nist.gov/vuln/detail/CVE-2024-50154
---
net/ipv4/inet_connection_sock.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 6ebe43b4d28f..332133aae3e1 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -722,21 +722,31 @@ static bool reqsk_queue_unlink(struct request_sock *req)
found = __sk_nulls_del_node_init_rcu(req_to_sk(req));
spin_unlock(lock);
}
- if (timer_pending(&req->rsk_timer) && del_timer_sync(&req->rsk_timer))
- reqsk_put(req);
+
return found;
}
-bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req)
+static bool __inet_csk_reqsk_queue_drop(struct sock *sk,
+ struct request_sock *req,
+ bool from_timer)
{
bool unlinked = reqsk_queue_unlink(req);
+ if (!from_timer && del_timer_sync(&req->rsk_timer))
+ reqsk_put(req);
+
if (unlinked) {
reqsk_queue_removed(&inet_csk(sk)->icsk_accept_queue, req);
reqsk_put(req);
}
+
return unlinked;
}
+
+bool inet_csk_reqsk_queue_drop(struct sock *sk, struct request_sock *req)
+{
+ return __inet_csk_reqsk_queue_drop(sk, req, false);
+}
EXPORT_SYMBOL(inet_csk_reqsk_queue_drop);
void inet_csk_reqsk_queue_drop_and_put(struct sock *sk, struct request_sock *req)
@@ -804,7 +814,8 @@ static void reqsk_timer_handler(struct timer_list *t)
return;
}
drop:
- inet_csk_reqsk_queue_drop_and_put(sk_listener, req);`
+ __inet_csk_reqsk_queue_drop(sk_listener, req, true);
+ reqsk_put(req);
}
static void reqsk_queue_hash_req(struct request_sock *req,
--
2.34.1
From: Tejun Heo <tj(a)kernel.org>
commit 86e6ca55b83c575ab0f2e105cf08f98e58d3d7af upstream.
blkcg_unpin_online() walks up the blkcg hierarchy putting the online pin. To
walk up, it uses blkcg_parent(blkcg) but it was calling that after
blkcg_destroy_blkgs(blkcg) which could free the blkcg, leading to the
following UAF:
==================================================================
BUG: KASAN: slab-use-after-free in blkcg_unpin_online+0x15a/0x270
Read of size 8 at addr ffff8881057678c0 by task kworker/9:1/117
CPU: 9 UID: 0 PID: 117 Comm: kworker/9:1 Not tainted 6.13.0-rc1-work-00182-gb8f52214c61a-dirty #48
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS unknown 02/02/2022
Workqueue: cgwb_release cgwb_release_workfn
Call Trace:
<TASK>
dump_stack_lvl+0x27/0x80
print_report+0x151/0x710
kasan_report+0xc0/0x100
blkcg_unpin_online+0x15a/0x270
cgwb_release_workfn+0x194/0x480
process_scheduled_works+0x71b/0xe20
worker_thread+0x82a/0xbd0
kthread+0x242/0x2c0
ret_from_fork+0x33/0x70
ret_from_fork_asm+0x1a/0x30
</TASK>
...
Freed by task 1944:
kasan_save_track+0x2b/0x70
kasan_save_free_info+0x3c/0x50
__kasan_slab_free+0x33/0x50
kfree+0x10c/0x330
css_free_rwork_fn+0xe6/0xb30
process_scheduled_works+0x71b/0xe20
worker_thread+0x82a/0xbd0
kthread+0x242/0x2c0
ret_from_fork+0x33/0x70
ret_from_fork_asm+0x1a/0x30
Note that the UAF is not easy to trigger as the free path is indirected
behind a couple RCU grace periods and a work item execution. I could only
trigger it with artifical msleep() injected in blkcg_unpin_online().
Fix it by reading the parent pointer before destroying the blkcg's blkg's.
Signed-off-by: Tejun Heo <tj(a)kernel.org>
Reported-by: Abagail ren <renzezhongucas(a)gmail.com>
Suggested-by: Linus Torvalds <torvalds(a)linuxfoundation.org>
Fixes: 4308a434e5e0 ("blkcg: don't offline parent blkcg first")
Cc: stable(a)vger.kernel.org # v5.7+
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
[kovalev: in versions 5.10 and 5.15, the blkcg_unpin_online()
function was not moved to the block/blk-cgroup.c file]
Signed-off-by: Vasiliy Kovalev <kovalev(a)altlinux.org>
---
Backport to fix CVE-2024-56672
Link: https://www.cve.org/CVERecord/?id=CVE-2024-56672
---
include/linux/blk-cgroup.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index 0e6e84db06f674..b89099360a8673 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -428,10 +428,14 @@ static inline void blkcg_pin_online(struct blkcg *blkcg)
static inline void blkcg_unpin_online(struct blkcg *blkcg)
{
do {
+ struct blkcg *parent;
+
if (!refcount_dec_and_test(&blkcg->online_pin))
break;
+
+ parent = blkcg_parent(blkcg);
blkcg_destroy_blkgs(blkcg);
- blkcg = blkcg_parent(blkcg);
+ blkcg = parent;
} while (blkcg);
}
--
2.33.8
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 38724591364e1e3b278b4053f102b49ea06ee17c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011310-spider-motor-0ef7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38724591364e1e3b278b4053f102b49ea06ee17c Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Mon, 25 Nov 2024 22:16:12 +0100
Subject: [PATCH] iio: adc: rockchip_saradc: fix information leak in triggered
buffer
The 'data' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable(a)vger.kernel.org
Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@g…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c
index 240cfa391674..dfd47a6e1f4a 100644
--- a/drivers/iio/adc/rockchip_saradc.c
+++ b/drivers/iio/adc/rockchip_saradc.c
@@ -368,6 +368,8 @@ static irqreturn_t rockchip_saradc_trigger_handler(int irq, void *p)
int ret;
int i, j = 0;
+ memset(&data, 0, sizeof(data));
+
mutex_lock(&info->lock);
iio_for_each_active_channel(i_dev, i) {
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 38724591364e1e3b278b4053f102b49ea06ee17c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011309-swab-bootie-b4f4@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 38724591364e1e3b278b4053f102b49ea06ee17c Mon Sep 17 00:00:00 2001
From: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Date: Mon, 25 Nov 2024 22:16:12 +0100
Subject: [PATCH] iio: adc: rockchip_saradc: fix information leak in triggered
buffer
The 'data' local struct is used to push data to user space from a
triggered buffer, but it does not set values for inactive channels, as
it only uses iio_for_each_active_channel() to assign new values.
Initialize the struct to zero before using it to avoid pushing
uninitialized information to userspace.
Cc: stable(a)vger.kernel.org
Fixes: 4e130dc7b413 ("iio: adc: rockchip_saradc: Add support iio buffers")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz(a)gmail.com>
Link: https://patch.msgid.link/20241125-iio_memset_scan_holes-v1-4-0cb6e98d895c@g…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/adc/rockchip_saradc.c b/drivers/iio/adc/rockchip_saradc.c
index 240cfa391674..dfd47a6e1f4a 100644
--- a/drivers/iio/adc/rockchip_saradc.c
+++ b/drivers/iio/adc/rockchip_saradc.c
@@ -368,6 +368,8 @@ static irqreturn_t rockchip_saradc_trigger_handler(int irq, void *p)
int ret;
int i, j = 0;
+ memset(&data, 0, sizeof(data));
+
mutex_lock(&info->lock);
iio_for_each_active_channel(i_dev, i) {
Currently, DWC3 driver attempts to acquire the USB power supply only
once during the probe. If the USB power supply is not ready at that
time, the driver simply ignores the failure and continues the probe,
leading to permanent non-functioning of the gadget vbus_draw callback.
Address this problem by delaying the dwc3 driver initialization until
the USB power supply is registered.
Fixes: 6f0764b5adea ("usb: dwc3: add a power supply for current control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kyle Tso <kyletso(a)google.com>
---
v1 -> v2:
- get the power supply in a dedicated function
---
drivers/usb/dwc3/core.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 7578c5133568..dfa1b5fe48dc 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1684,8 +1684,6 @@ static void dwc3_get_properties(struct dwc3 *dwc)
u8 tx_thr_num_pkt_prd = 0;
u8 tx_max_burst_prd = 0;
u8 tx_fifo_resize_max_num;
- const char *usb_psy_name;
- int ret;
/* default to highest possible threshold */
lpm_nyet_threshold = 0xf;
@@ -1720,13 +1718,6 @@ static void dwc3_get_properties(struct dwc3 *dwc)
dwc->sys_wakeup = device_may_wakeup(dwc->sysdev);
- ret = device_property_read_string(dev, "usb-psy-name", &usb_psy_name);
- if (ret >= 0) {
- dwc->usb_psy = power_supply_get_by_name(usb_psy_name);
- if (!dwc->usb_psy)
- dev_err(dev, "couldn't get usb power supply\n");
- }
-
dwc->has_lpm_erratum = device_property_read_bool(dev,
"snps,has-lpm-erratum");
device_property_read_u8(dev, "snps,lpm-nyet-threshold",
@@ -2129,6 +2120,23 @@ static int dwc3_get_num_ports(struct dwc3 *dwc)
return 0;
}
+static struct power_supply *dwc3_get_usb_power_supply(struct dwc3 *dwc)
+{
+ struct power_supply *usb_psy;
+ const char *usb_psy_name;
+ int ret;
+
+ ret = device_property_read_string(dwc->dev, "usb-psy-name", &usb_psy_name);
+ if (ret < 0)
+ return NULL;
+
+ usb_psy = power_supply_get_by_name(usb_psy_name);
+ if (!usb_psy)
+ return ERR_PTR(-EPROBE_DEFER);
+
+ return usb_psy;
+}
+
static int dwc3_probe(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
@@ -2185,6 +2193,10 @@ static int dwc3_probe(struct platform_device *pdev)
dwc3_get_software_properties(dwc);
+ dwc->usb_psy = dwc3_get_usb_power_supply(dwc);
+ if (IS_ERR(dwc->usb_psy))
+ return dev_err_probe(dev, PTR_ERR(dwc->usb_psy), "couldn't get usb power supply\n");
+
dwc->reset = devm_reset_control_array_get_optional_shared(dev);
if (IS_ERR(dwc->reset)) {
ret = PTR_ERR(dwc->reset);
--
2.48.0.rc2.279.g1de40edade-goog
The orig_a0 is missing in struct user_regs_struct of riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
Like NT_ARM_SYSTEM_CALL do, we add a new regset name NT_RISCV_ORIG_A0 to
access original a0 register from userspace via ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v6:
- Fix obsolute comment.
- Copy include/linux/stddef.h to tools/include to use offsetofend in
selftests.
- Link to v5: https://lore.kernel.org/r/20250115-riscv-new-regset-v5-0-d0e6ec031a23@coela…
Changes in v5:
- Fix wrong usage in selftests.
- Link to v4: https://lore.kernel.org/r/20241226-riscv-new-regset-v4-0-4496a29d0436@coela…
Changes in v4:
- Fix a copy paste error in selftest. (Forget to commit...)
- Link to v3: https://lore.kernel.org/r/20241226-riscv-new-regset-v3-0-f5b96465826b@coela…
Changes in v3:
- Use return 0 directly for readability.
- Fix test for modify a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (3):
riscv/ptrace: add new regset to access original a0 register
tools: copy include/linux/stddef.h to tools/include
riscv: selftests: Add a ptrace test to verify a0 and orig_a0 access
arch/riscv/kernel/ptrace.c | 32 +++++
include/uapi/linux/elf.h | 1 +
tools/include/linux/stddef.h | 85 ++++++++++++
tools/include/uapi/linux/stddef.h | 6 +-
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 6 +-
tools/testing/selftests/riscv/abi/ptrace.c | 193 +++++++++++++++++++++++++++
7 files changed, 319 insertions(+), 5 deletions(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
The quilt patch titled
Subject: mm/vmscan: fix pgdemote_* accounting with lru_gen_enabled
has been removed from the -mm tree. Its filename was
mm-vmscan-fix-pgdemote_-accounting-with-lru_gen_enabled.patch
This patch was dropped because it is obsolete
------------------------------------------------------
From: Li Zhijian <lizhijian(a)fujitsu.com>
Subject: mm/vmscan: fix pgdemote_* accounting with lru_gen_enabled
Date: Fri, 10 Jan 2025 20:21:33 +0800
Commit f77f0c751478 ("mm,memcg: provide per-cgroup counters for NUMA
balancing operations") moved the accounting of PGDEMOTE_* statistics to
shrink_inactive_list(). However, shrink_inactive_list() is not called
when lrugen_enabled is true, leading to incorrect demotion statistics
despite actual demotion events occurring.
Add the PGDEMOTE_* accounting in evict_folios(), ensuring that demotion
statistics are correctly updated regardless of the lru_gen_enabled state.
This fix is crucial for systems that rely on accurate NUMA balancing
metrics for performance tuning and resource management.
Link: https://lkml.kernel.org/r/20250110122133.423481-2-lizhijian@fujitsu.com
Fixes: f77f0c751478 ("mm,memcg: provide per-cgroup counters for NUMA balancing operations")
Signed-off-by: Li Zhijian <lizhijian(a)fujitsu.com>
Cc: Kaiyang Zhao <kaiyang2(a)cs.cmu.edu>
Cc: Yu Zhao <yuzhao(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/vmscan.c | 2 ++
1 file changed, 2 insertions(+)
--- a/mm/vmscan.c~mm-vmscan-fix-pgdemote_-accounting-with-lru_gen_enabled
+++ a/mm/vmscan.c
@@ -4650,6 +4650,8 @@ retry:
__mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
stat.nr_demoted);
+ __mod_lruvec_state(lruvec, PGDEMOTE_KSWAPD + reclaimer_offset(),
+ stat.nr_demoted);
item = PGSTEAL_KSWAPD + reclaimer_offset();
if (!cgroup_reclaim(sc))
__count_vm_events(item, reclaimed);
_
Patches currently in -mm which might be from lizhijian(a)fujitsu.com are
mm-vmscan-accumulate-nr_demoted-for-accurate-demotion-statistics.patch
mm-vmscan-accumulate-nr_demoted-for-accurate-demotion-statistics-v2.patch
When porting librseq commit:
commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection")
from librseq to the kernel selftests, the following line was missed
at the end of rseq_init():
rseq_size = get_rseq_kernel_feature_size();
which effectively leaves rseq_size initialized to -1U when glibc does not
have rseq support. glibc supports rseq from version 2.35 onwards.
In a following librseq commit
commit c67d198627c2 ("Only set 'rseq_size' on first thread registration")
to mimic the libc behavior, a new approach is taken: don't set the
feature size in 'rseq_size' until at least one thread has successfully
registered. This allows using 'rseq_size' in fast-paths to test for both
registration status and available features. The caveat is that on libc
either all threads are registered or none are, while with bare librseq
it is the responsability of the user to register all threads using rseq.
This combines the changes from the following librseq commits:
commit c7b45750fa85 ("Adapt to glibc __rseq_size feature detection")
commit c67d198627c2 ("Only set 'rseq_size' on first thread registration")
Fixes: 73a4f5a704a2 ("selftests/rseq: Fix mm_cid test failure")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: Raghavendra Rao Ananta <rananta(a)google.com>
Cc: Shuah Khan <skhan(a)linuxfoundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Boqun Feng <boqun.feng(a)gmail.com>
Cc: "Paul E. McKenney" <paulmck(a)kernel.org>
Cc: Carlos O'Donell <carlos(a)redhat.com>
Cc: Florian Weimer <fweimer(a)redhat.com>
Cc: Michael Jeanson <mjeanson(a)efficios.com>
Cc: linux-kselftest(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
tools/testing/selftests/rseq/rseq.c | 32 ++++++++++++++++++++++-------
tools/testing/selftests/rseq/rseq.h | 9 +++++++-
2 files changed, 33 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 5b9772cdf265..f6156790c3b4 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -61,7 +61,6 @@ unsigned int rseq_size = -1U;
unsigned int rseq_flags;
static int rseq_ownership;
-static int rseq_reg_success; /* At least one rseq registration has succeded. */
/* Allocate a large area for the TLS. */
#define RSEQ_THREAD_AREA_ALLOC_SIZE 1024
@@ -152,14 +151,27 @@ int rseq_register_current_thread(void)
}
rc = sys_rseq(&__rseq_abi, get_rseq_min_alloc_size(), 0, RSEQ_SIG);
if (rc) {
- if (RSEQ_READ_ONCE(rseq_reg_success)) {
+ /*
+ * After at least one thread has registered successfully
+ * (rseq_size > 0), the registration of other threads should
+ * never fail.
+ */
+ if (RSEQ_READ_ONCE(rseq_size) > 0) {
/* Incoherent success/failure within process. */
abort();
}
return -1;
}
assert(rseq_current_cpu_raw() >= 0);
- RSEQ_WRITE_ONCE(rseq_reg_success, 1);
+
+ /*
+ * The first thread to register sets the rseq_size to mimic the libc
+ * behavior.
+ */
+ if (RSEQ_READ_ONCE(rseq_size) == 0) {
+ RSEQ_WRITE_ONCE(rseq_size, get_rseq_kernel_feature_size());
+ }
+
return 0;
}
@@ -235,12 +247,18 @@ void rseq_init(void)
return;
}
rseq_ownership = 1;
- if (!rseq_available()) {
- rseq_size = 0;
- return;
- }
+
+ /* Calculate the offset of the rseq area from the thread pointer. */
rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
+
+ /* rseq flags are deprecated, always set to 0. */
rseq_flags = 0;
+
+ /*
+ * Set the size to 0 until at least one thread registers to mimic the
+ * libc behavior.
+ */
+ rseq_size = 0;
}
static __attribute__((destructor))
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 4e217b620e0c..062d10925a10 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -60,7 +60,14 @@
extern ptrdiff_t rseq_offset;
/*
- * Size of the registered rseq area. 0 if the registration was
+ * The rseq ABI is composed of extensible feature fields. The extensions
+ * are done by appending additional fields at the end of the structure.
+ * The rseq_size defines the size of the active feature set which can be
+ * used by the application for the current rseq registration. Features
+ * starting at offset >= rseq_size are inactive and should not be used.
+ *
+ * The rseq_size is the intersection between the available allocation
+ * size for the rseq area and the feature size supported by the kernel.
* unsuccessful.
*/
extern unsigned int rseq_size;
--
2.39.5
On removal of the device or unloading of the kernel module a potential
NULL pointer dereference occurs.
The following sequence deletes the interface:
brcmf_detach()
brcmf_remove_interface()
brcmf_del_if()
Inside the brcmf_del_if() function the drvr->if2bss[ifidx] is updated to
BRCMF_BSSIDX_INVALID (-1) if the bsscfgidx matches.
After brcmf_remove_interface() call the brcmf_proto_detach() function is
called providing the following sequence:
brcmf_detach()
brcmf_proto_detach()
brcmf_proto_msgbuf_detach()
brcmf_flowring_detach()
brcmf_msgbuf_delete_flowring()
brcmf_msgbuf_remove_flowring()
brcmf_flowring_delete()
brcmf_get_ifp()
brcmf_txfinalize()
Since brcmf_get_ip() can and actually will return NULL in this case the
call to brcmf_txfinalize() will result in a NULL pointer dereference
inside brcmf_txfinalize() when trying to update
ifp->ndev->stats.tx_errors.
This will only happen if a flowring still has an skb.
Although the NULL pointer dereference has only been seen when trying to update
the tx statistic, all other uses of the ifp pointer have been guarded as well.
Cc: stable(a)vger.kernel.org
Signed-off-by: Marcel Hamer <marcel.hamer(a)windriver.com>
Link: https://lore.kernel.org/all/b519e746-ddfd-421f-d897-7620d229e4b2@gmail.com/
---
v1 -> v2: guard all uses of the ifp pointer inside brcmf_txfinalize()
---
drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
index c3a57e30c855..791757a3ec13 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/core.c
@@ -543,13 +543,13 @@ void brcmf_txfinalize(struct brcmf_if *ifp, struct sk_buff *txp, bool success)
eh = (struct ethhdr *)(txp->data);
type = ntohs(eh->h_proto);
- if (type == ETH_P_PAE) {
+ if (type == ETH_P_PAE && ifp) {
atomic_dec(&ifp->pend_8021x_cnt);
if (waitqueue_active(&ifp->pend_8021x_wait))
wake_up(&ifp->pend_8021x_wait);
}
- if (!success)
+ if (!success && ifp)
ifp->ndev->stats.tx_errors++;
brcmu_pkt_buf_free_skb(txp);
--
2.34.1
The PE Reset State "0" returned by RTAS calls
"ibm_read_slot_reset_[state|state2]" indicates that the reset is
deactivated and the PE is in a state where MMIO and DMA are allowed.
However, the current implementation of "pseries_eeh_get_state()" does
not reflect this, causing drivers to incorrectly assume that MMIO and
DMA operations cannot be resumed.
The userspace drivers as a part of EEH recovery using VFIO ioctls fail
to detect when the recovery process is complete. The VFIO_EEH_PE_GET_STATE
ioctl does not report the expected EEH_PE_STATE_NORMAL state, preventing
userspace drivers from functioning properly on pseries systems.
The patch addresses this issue by updating 'pseries_eeh_get_state()'
to include "EEH_STATE_MMIO_ENABLED" and "EEH_STATE_DMA_ENABLED" in
the result mask for PE Reset State "0". This ensures correct state
reporting to the callers, aligning the behavior with the PAPR specification
and fixing the bug in EEH recovery for VFIO user workflows.
Fixes: 00ba05a12b3c ("powerpc/pseries: Cleanup on pseries_eeh_get_state()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Narayana Murty N <nnmlinux(a)linux.ibm.com>
---
Changelog:
V1:https://lore.kernel.org/all/20241107042027.338065-1-nnmlinux@linux.ibm.c…
--added Fixes tag for "powerpc/pseries: Cleanup on
pseries_eeh_get_state()".
V2:https://lore.kernel.org/stable/20241212075044.10563-1-nnmlinux%40linux.i…
--Updated the patch description to include it in the stable kernel tree.
V3:https://lore.kernel.org/all/87v7vm8pwz.fsf@gmail.com/
--Updated commit description.
---
arch/powerpc/platforms/pseries/eeh_pseries.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/eeh_pseries.c b/arch/powerpc/platforms/pseries/eeh_pseries.c
index 1893f66371fa..b12ef382fec7 100644
--- a/arch/powerpc/platforms/pseries/eeh_pseries.c
+++ b/arch/powerpc/platforms/pseries/eeh_pseries.c
@@ -580,8 +580,10 @@ static int pseries_eeh_get_state(struct eeh_pe *pe, int *delay)
switch(rets[0]) {
case 0:
- result = EEH_STATE_MMIO_ACTIVE |
- EEH_STATE_DMA_ACTIVE;
+ result = EEH_STATE_MMIO_ACTIVE |
+ EEH_STATE_DMA_ACTIVE |
+ EEH_STATE_MMIO_ENABLED |
+ EEH_STATE_DMA_ENABLED;
break;
case 1:
result = EEH_STATE_RESET_ACTIVE |
--
2.47.1
On 1/14/25 09:06, Tom Lendacky wrote:
> On 1/14/25 08:44, Kirill A. Shutemov wrote:
>> On Tue, Jan 14, 2025 at 08:33:39AM -0600, Tom Lendacky wrote:
>>> On 1/14/25 01:27, Kirill A. Shutemov wrote:
>>>> On Mon, Jan 13, 2025 at 02:47:56PM -0600, Tom Lendacky wrote:
>>>>> On 1/13/25 07:14, Kirill A. Shutemov wrote:
>>>>>> Currently memremap(MEMREMAP_WB) can produce decrypted/shared mapping:
>>>>>>
>>>>>> memremap(MEMREMAP_WB)
>>>>>> arch_memremap_wb()
>>>>>> ioremap_cache()
>>>>>> __ioremap_caller(.encrytped = false)
>>>>>>
>>>>>> In such cases, the IORES_MAP_ENCRYPTED flag on the memory will determine
>>>>>> if the resulting mapping is encrypted or decrypted.
>>>>>>
>>>>>> Creating a decrypted mapping without explicit request from the caller is
>>>>>> risky:
>>>>>>
>>>>>> - It can inadvertently expose the guest's data and compromise the
>>>>>> guest.
>>>>>>
>>>>>> - Accessing private memory via shared/decrypted mapping on TDX will
>>>>>> either trigger implicit conversion to shared or #VE (depending on
>>>>>> VMM implementation).
>>>>>>
>>>>>> Implicit conversion is destructive: subsequent access to the same
>>>>>> memory via private mapping will trigger a hard-to-debug #VE crash.
>>>>>>
>>>>>> The kernel already provides a way to request decrypted mapping
>>>>>> explicitly via the MEMREMAP_DEC flag.
>>>>>>
>>>>>> Modify memremap(MEMREMAP_WB) to produce encrypted/private mapping by
>>>>>> default unless MEMREMAP_DEC is specified.
>>>>>>
>>>>>> Fix the crash due to #VE on kexec in TDX guests if CONFIG_EISA is enabled.
>>>>>
>>>>> This patch causes my bare-metal system to crash during boot when using
>>>>> mem_encrypt=on:
>>>>>
>>>>> [ 2.392934] efi: memattr: Entry type should be RuntimeServiceCode/Data
>>>>> [ 2.393731] efi: memattr: ! 0x214c42f01f1162a-0xee70ac7bd1a9c629 [type=2028324321|attr=0x6590648fa4209879]
>>>>
>>>> Could you try if this helps?
>>>>
>>>> diff --git a/drivers/firmware/efi/memattr.c b/drivers/firmware/efi/memattr.c
>>>> index c38b1a335590..b5051dcb7c1d 100644
>>>> --- a/drivers/firmware/efi/memattr.c
>>>> +++ b/drivers/firmware/efi/memattr.c
>>>> @@ -160,7 +160,7 @@ int __init efi_memattr_apply_permissions(struct mm_struct *mm,
>>>> if (WARN_ON(!efi_enabled(EFI_MEMMAP)))
>>>> return 0;
>>>>
>>>> - tbl = memremap(efi_mem_attr_table, tbl_size, MEMREMAP_WB);
>>>> + tbl = memremap(efi_mem_attr_table, tbl_size, MEMREMAP_WB | MEMREMAP_DEC);
>>>
>>> Well that would work for SME where EFI tables/data are not encrypted,
>>> but will break for SEV where EFI tables/data are encrypted.
>>
>> Hm. Why would it break for SEV? It brings the situation back to what it
>> was before the patch.
>
> Ah, true. I can try it and see how much further SME gets. Hopefully it
> doesn't turn into a whack-a-mole thing.
Unfortunately, it is turning into a whack-a-mole thing.
But it looks the following works for SME:
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 3c36f3f5e688..ff3cd5fc8508 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -505,7 +505,7 @@ EXPORT_SYMBOL(iounmap);
void *arch_memremap_wb(phys_addr_t phys_addr, size_t size, unsigned long flags)
{
- if (flags & MEMREMAP_DEC)
+ if (flags & MEMREMAP_DEC || cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT))
return (void __force *)ioremap_cache(phys_addr, size);
return (void __force *)ioremap_encrypted(phys_addr, size);
I haven't had a chance to test the series on SEV, yet.
Thanks,
Tom
>
> Thanks,
> Tom
>
>>
>> Note that that __ioremap_caller() would still check io_desc.flags before
>> mapping it as decrypted.
>>
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 9322d1915f9d976ee48c09d800fbd5169bc2ddcc
Gitweb: https://git.kernel.org/tip/9322d1915f9d976ee48c09d800fbd5169bc2ddcc
Author: Joe Hattori <joe(a)pf.is.s.u-tokyo.ac.jp>
AuthorDate: Sun, 15 Dec 2024 12:39:45 +09:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 15 Jan 2025 10:38:43 +01:00
irqchip: Plug a OF node reference leak in platform_irqchip_probe()
platform_irqchip_probe() leaks a OF node when irq_init_cb() fails. Fix it
by declaring par_np with the __free(device_node) cleanup construct.
This bug was found by an experimental static analysis tool that I am
developing.
Fixes: f8410e626569 ("irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros")
Signed-off-by: Joe Hattori <joe(a)pf.is.s.u-tokyo.ac.jp>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20241215033945.3414223-1-joe@pf.is.s.u-tokyo.ac…
---
drivers/irqchip/irqchip.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/irqchip/irqchip.c b/drivers/irqchip/irqchip.c
index 1eeb0d0..0ee7b6b 100644
--- a/drivers/irqchip/irqchip.c
+++ b/drivers/irqchip/irqchip.c
@@ -35,11 +35,10 @@ void __init irqchip_init(void)
int platform_irqchip_probe(struct platform_device *pdev)
{
struct device_node *np = pdev->dev.of_node;
- struct device_node *par_np = of_irq_find_parent(np);
+ struct device_node *par_np __free(device_node) = of_irq_find_parent(np);
of_irq_init_cb_t irq_init_cb = of_device_get_match_data(&pdev->dev);
if (!irq_init_cb) {
- of_node_put(par_np);
return -EINVAL;
}
@@ -55,7 +54,6 @@ int platform_irqchip_probe(struct platform_device *pdev)
* interrupt controller can check for specific domains as necessary.
*/
if (par_np && !irq_find_matching_host(par_np, DOMAIN_BUS_ANY)) {
- of_node_put(par_np);
return -EPROBE_DEFER;
}
Fixed a possible UAF problem in driver_override_show() in
drivers/bus/fsl-mc/fsl-mc-bus.c
This function driver_override_show() is part of DEVICE_ATTR_RW, which
includes both driver_override_show() and driver_override_store().
These functions can be executed concurrently in sysfs.
The driver_override_store() function uses driver_set_override() to
update the driver_override value, and driver_set_override() internally
locks the device (device_lock(dev)). If driver_override_show() reads
cdx_dev->driver_override without locking, it could potentially access
a freed pointer if driver_override_store() frees the string
concurrently. This could lead to printing a kernel address, which is a
security risk since DEVICE_ATTR can be read by all users.
Additionally, a similar pattern is used in drivers/amba/bus.c, as well
as many other bus drivers, where device_lock() is taken in the show
function, and it has been working without issues.
This potential bug was detected by our experimental static analysis tool,
which analyzes locking APIs and paired functions to identify data races
and atomicity violations.
Fixes: 1f86a00c1159 ("bus/fsl-mc: add support for 'driver_override' in the mc-bus")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
---
V2:
Revised the description to reduce the confusion it caused.
---
drivers/bus/fsl-mc/fsl-mc-bus.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/fsl-mc/fsl-mc-bus.c b/drivers/bus/fsl-mc/fsl-mc-bus.c
index 930d8a3ba722..62a9da88b4c9 100644
--- a/drivers/bus/fsl-mc/fsl-mc-bus.c
+++ b/drivers/bus/fsl-mc/fsl-mc-bus.c
@@ -201,8 +201,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct fsl_mc_device *mc_dev = to_fsl_mc_device(dev);
+ ssize_t len;
- return snprintf(buf, PAGE_SIZE, "%s\n", mc_dev->driver_override);
+ device_lock(dev);
+ len = snprintf(buf, PAGE_SIZE, "%s\n", mc_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
--
2.34.1
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: de31b3cd706347044e1a57d68c3a683d58e8cca4
Gitweb: https://git.kernel.org/tip/de31b3cd706347044e1a57d68c3a683d58e8cca4
Author: Xin Li (Intel) <xin(a)zytor.com>
AuthorDate: Fri, 10 Jan 2025 09:46:39 -08:00
Committer: Dave Hansen <dave.hansen(a)linux.intel.com>
CommitterDate: Tue, 14 Jan 2025 14:16:36 -08:00
x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache
The FRED RSP0 MSR is only used for delivering events when running
userspace. Linux leverages this property to reduce expensive MSR
writes and optimize context switches. The kernel only writes the
MSR when about to run userspace *and* when the MSR has actually
changed since the last time userspace ran.
This optimization is implemented by maintaining a per-CPU cache of
FRED RSP0 and then checking that against the value for the top of
current task stack before running userspace.
However cpu_init_fred_exceptions() writes the MSR without updating
the per-CPU cache. This means that the kernel might return to
userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the
top of current task stack. This would induce a double fault (#DF),
which is bad.
A context switch after cpu_init_fred_exceptions() can paper over
the issue since it updates the cached value. That evidently
happens most of the time explaining how this bug got through.
Fix the bug through resynchronizing the FRED RSP0 MSR with its
per-CPU cache in cpu_init_fred_exceptions().
Fixes: fe85ee391966 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch")
Signed-off-by: Xin Li (Intel) <xin(a)zytor.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
---
arch/x86/kernel/fred.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/fred.c b/arch/x86/kernel/fred.c
index 8d32c3f..5e2cd10 100644
--- a/arch/x86/kernel/fred.c
+++ b/arch/x86/kernel/fred.c
@@ -50,7 +50,13 @@ void cpu_init_fred_exceptions(void)
FRED_CONFIG_ENTRYPOINT(asm_fred_entrypoint_user));
wrmsrl(MSR_IA32_FRED_STKLVLS, 0);
- wrmsrl(MSR_IA32_FRED_RSP0, 0);
+
+ /*
+ * Ater a CPU offline/online cycle, the FRED RSP0 MSR should be
+ * resynchronized with its per-CPU cache.
+ */
+ wrmsrl(MSR_IA32_FRED_RSP0, __this_cpu_read(fred_rsp0));
+
wrmsrl(MSR_IA32_FRED_RSP1, 0);
wrmsrl(MSR_IA32_FRED_RSP2, 0);
wrmsrl(MSR_IA32_FRED_RSP3, 0);
The following call-chain leads to misuse of spinlock_irq
when spinlock_irqsave was hold.
irq_set_vcpu_affinity
-> irq_get_desc_lock (spinlock_irqsave)
-> its_irq_set_vcpu_affinity
-> guard(raw_spin_lock_irq) <--- this enables interrupts
-> irq_put_desc_unlock // <--- WARN IRQs enabled
Fix the issue by using guard(raw_spinlock), since the function is
already called with irqsave and raw_spin_lock was used before the commit
b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
introducing the guard as well.
This was discovered through the lock debugging, and the corresponding
log is as follows:
raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 38 PID: 444 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2c/0x38
Call trace:
warn_bogus_irq_restore+0x2c/0x38
_raw_spin_unlock_irqrestore+0x68/0x88
__irq_put_desc_unlock+0x1c/0x48
irq_set_vcpu_affinity+0x74/0xc0
its_map_vlpi+0x44/0x88
kvm_vgic_v4_set_forwarding+0x148/0x230
kvm_arch_irq_bypass_add_producer+0x20/0x28
__connect+0x98/0xb8
irq_bypass_register_consumer+0x150/0x178
kvm_irqfd+0x6dc/0x744
kvm_vm_ioctl+0xe44/0x16b0
Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()")
Signed-off-by: Tomas Krcka <krckatom(a)amazon.de>
Reviewed-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
drivers/irqchip/irq-gic-v3-its.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 92244cfa0464..8c3ec5734f1e 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -2045,7 +2045,7 @@ static int its_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
if (!is_v4(its_dev->its))
return -EINVAL;
- guard(raw_spinlock_irq)(&its_dev->event_map.vlpi_lock);
+ guard(raw_spinlock)(&its_dev->event_map.vlpi_lock);
/* Unmap request? */
if (!info)
--
2.40.1
Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 0d62a49ab55c99e8deb4593b8d9f923de1ab5c18
Gitweb: https://git.kernel.org/tip/0d62a49ab55c99e8deb4593b8d9f923de1ab5c18
Author: Yogesh Lal <quic_ylal(a)quicinc.com>
AuthorDate: Fri, 20 Dec 2024 15:09:07 +05:30
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 15 Jan 2025 09:42:44 +01:00
irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly
When a CPU attempts to enter low power mode, it disables the redistributor
and Group 1 interrupts and reinitializes the system registers upon wakeup.
If the transition into low power mode fails, then the CPU_PM framework
invokes the PM notifier callback with CPU_PM_ENTER_FAILED to allow the
drivers to undo the state changes.
The GIC V3 driver ignores CPU_PM_ENTER_FAILED, which leaves the GIC in
disabled state.
Handle CPU_PM_ENTER_FAILED in the same way as CPU_PM_EXIT to restore normal
operation.
[ tglx: Massage change log, add Fixes tag ]
Fixes: 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier")
Signed-off-by: Yogesh Lal <quic_ylal(a)quicinc.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/all/20241220093907.2747601-1-quic_ylal@quicinc.com
---
drivers/irqchip/irq-gic-v3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 79d8cc8..76dce0a 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1522,7 +1522,7 @@ static int gic_retrigger(struct irq_data *data)
static int gic_cpu_pm_notifier(struct notifier_block *self,
unsigned long cmd, void *v)
{
- if (cmd == CPU_PM_EXIT) {
+ if (cmd == CPU_PM_EXIT || cmd == CPU_PM_ENTER_FAILED) {
if (gic_dist_security_disabled())
gic_enable_redist(true);
gic_cpu_sys_reg_enable();
There is a data race between the functions driver_override_show() and
driver_override_store(). In the driver_override_store() function, the
assignment to ret calls driver_set_override(), which frees the old value
while writing the new value to dev. If a race occurs, it may cause a
use-after-free (UAF) error in driver_override_show().
To fix this issue, we adopt a logic similar to the driver_override_show()
function in vmbus_drv.c, protecting dev within a lock to ensure its value
remains unchanged.
This possible bug is found by an experimental static analysis tool
developed by our team. This tool analyzes the locking APIs to extract
function pairs that can be concurrently executed, and then analyzes the
instructions in the paired functions to identify possible concurrency bugs
including data races and atomicity violations.
Fixes: 48a6c7bced2a ("cdx: add device attributes")
Cc: stable(a)vger.kernel.org
Signed-off-by: Qiu-ji Chen <chenqiuji666(a)gmail.com>
---
V2:
Modified the title and description.
Removed the changes to cdx_bus_match().
---
drivers/cdx/cdx.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/cdx/cdx.c b/drivers/cdx/cdx.c
index 07371cb653d3..4af1901c9d52 100644
--- a/drivers/cdx/cdx.c
+++ b/drivers/cdx/cdx.c
@@ -470,8 +470,12 @@ static ssize_t driver_override_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct cdx_device *cdx_dev = to_cdx_device(dev);
+ ssize_t len;
- return sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_lock(dev);
+ len = sysfs_emit(buf, "%s\n", cdx_dev->driver_override);
+ device_unlock(dev);
+ return len;
}
static DEVICE_ATTR_RW(driver_override);
--
2.34.1
From: Cheng Ming Lin <chengminglin(a)mxic.com.tw>
The default dummy cycle for Macronix SPI NOR flash in Octal Output
Read Mode(1-1-8) is 20.
Currently, the dummy buswidth is set according to the address bus width.
In the 1-1-8 mode, this means the dummy buswidth is 1. When converting
dummy cycles to bytes, this results in 20 x 1 / 8 = 2 bytes, causing the
host to read data 4 cycles too early.
Since the protocol data buswidth is always greater than or equal to the
address buswidth. Setting the dummy buswidth to match the data buswidth
increases the likelihood that the dummy cycle-to-byte conversion will be
divisible, preventing the host from reading data prematurely.
Fixes: 0e30f47232ab5 ("mtd: spi-nor: add support for DTR protocol")
Cc: stable(a)vger.kernel.org
Reviewd-by: Pratyush Yadav <pratyush(a)kernel.org>
Signed-off-by: Cheng Ming Lin <chengminglin(a)mxic.com.tw>
---
drivers/mtd/spi-nor/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index f9c189ed7353..c7aceaa8a43f 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -89,7 +89,7 @@ void spi_nor_spimem_setup_op(const struct spi_nor *nor,
op->addr.buswidth = spi_nor_get_protocol_addr_nbits(proto);
if (op->dummy.nbytes)
- op->dummy.buswidth = spi_nor_get_protocol_addr_nbits(proto);
+ op->dummy.buswidth = spi_nor_get_protocol_data_nbits(proto);
if (op->data.nbytes)
op->data.buswidth = spi_nor_get_protocol_data_nbits(proto);
--
2.25.1
From: Darrick J. Wong <djwong(a)kernel.org>
I received a report from the release engineering side of the house that
xfs_scrub without the -n flag (aka fix it mode) would try to fix a
broken filesystem even on a kernel that doesn't have online repair built
into it:
# xfs_scrub -dTvn /mnt/test
EXPERIMENTAL xfs_scrub program in use! Use at your own risk!
Phase 1: Find filesystem geometry.
/mnt/test: using 1 threads to scrub.
Phase 1: Memory used: 132k/0k (108k/25k), time: 0.00/ 0.00/ 0.00s
<snip>
Phase 4: Repair filesystem.
<snip>
Info: /mnt/test/some/victimdir directory entries: Attempting repair. (repair.c line 351)
Corruption: /mnt/test/some/victimdir directory entries: Repair unsuccessful; offline repair required. (repair.c line 204)
Source: https://blogs.oracle.com/linux/post/xfs-online-filesystem-repair
It is strange that xfs_scrub doesn't refuse to run, because the kernel
is supposed to return EOPNOTSUPP if we actually needed to run a repair,
and xfs_io's repair subcommand will perror that. And yet:
# xfs_io -x -c 'repair probe' /mnt/test
#
The first problem is commit dcb660f9222fd9 (4.15) which should have had
xchk_probe set the CORRUPT OFLAG so that any of the repair machinery
will get called at all.
It turns out that some refactoring that happened in the 6.6-6.8 era
broke the operation of this corner case. What we *really* want to
happen is that all the predicates that would steer xfs_scrub_metadata()
towards calling xrep_attempt() should function the same way that they do
when repair is compiled in; and then xrep_attempt gets to return the
fatal EOPNOTSUPP error code that causes the probe to fail.
Instead, commit 8336a64eb75cba (6.6) started the failwhale swimming by
hoisting OFLAG checking logic into a helper whose non-repair stub always
returns false, causing scrub to return "repair not needed" when in fact
the repair is not supported. Prior to that commit, the oflag checking
that was open-coded in scrub.c worked correctly.
Similarly, in commit 4bdfd7d15747b1 (6.8) we hoisted the IFLAG_REPAIR
and ALREADY_FIXED logic into a helper whose non-repair stub always
returns false, so we never enter the if test body that would have called
xrep_attempt, let alone fail to decode the OFLAGs correctly.
The final insult (yes, we're doing The Naked Gun now) is commit
48a72f60861f79 (6.8) in which we hoisted the "are we going to try a
repair?" predicate into yet another function with a non-repair stub
always returns false.
So. Fix xchk_probe to trigger xrep_probe, then fix all the other helper
predicates in scrub.h so that we always point to xrep_attempt, even if
online repair is disabled. Commit 48a72 is tagged here because the
scrub code prior to LTS 6.12 are incomplete and not worth patching.
Reported-by: David Flynn <david.flynn(a)oracle.com>
Cc: <stable(a)vger.kernel.org> # v6.8
Fixes: 48a72f60861f79 ("xfs: don't complain about unfixed metadata when repairs were injected")
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
---
fs/xfs/scrub/common.h | 5 -----
fs/xfs/scrub/repair.h | 11 ++++++++++-
fs/xfs/scrub/scrub.c | 9 +++++++++
3 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/scrub/common.h b/fs/xfs/scrub/common.h
index eecb6d54c0f953..53a6b34ce54271 100644
--- a/fs/xfs/scrub/common.h
+++ b/fs/xfs/scrub/common.h
@@ -212,7 +212,6 @@ static inline bool xchk_skip_xref(struct xfs_scrub_metadata *sm)
bool xchk_dir_looks_zapped(struct xfs_inode *dp);
bool xchk_pptr_looks_zapped(struct xfs_inode *ip);
-#ifdef CONFIG_XFS_ONLINE_REPAIR
/* Decide if a repair is required. */
static inline bool xchk_needs_repair(const struct xfs_scrub_metadata *sm)
{
@@ -232,10 +231,6 @@ static inline bool xchk_could_repair(const struct xfs_scrub *sc)
return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) &&
!(sc->flags & XREP_ALREADY_FIXED);
}
-#else
-# define xchk_needs_repair(sc) (false)
-# define xchk_could_repair(sc) (false)
-#endif /* CONFIG_XFS_ONLINE_REPAIR */
int xchk_metadata_inode_forks(struct xfs_scrub *sc);
diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
index b649da1a93eb8c..b3b1fe62814e7b 100644
--- a/fs/xfs/scrub/repair.h
+++ b/fs/xfs/scrub/repair.h
@@ -173,7 +173,16 @@ bool xrep_buf_verify_struct(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
#else
#define xrep_ino_dqattach(sc) (0)
-#define xrep_will_attempt(sc) (false)
+
+/*
+ * When online repair is not built into the kernel, we still want to attempt
+ * the repair so that the stub xrep_attempt below will return EOPNOTSUPP.
+ */
+static inline bool xrep_will_attempt(const struct xfs_scrub *sc)
+{
+ return (sc->sm->sm_flags & XFS_SCRUB_IFLAG_FORCE_REBUILD) ||
+ xchk_needs_repair(sc->sm);
+}
static inline int
xrep_attempt(
diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
index 950f5a58dcd967..09468f50781b24 100644
--- a/fs/xfs/scrub/scrub.c
+++ b/fs/xfs/scrub/scrub.c
@@ -149,6 +149,15 @@ xchk_probe(
if (xchk_should_terminate(sc, &error))
return error;
+ /*
+ * If the caller is probing to see if repair works, set the CORRUPT
+ * flag (without any of the usual tracing/logging) to force us into
+ * the repair codepaths. If repair is compiled into the kernel, we'll
+ * call xrep_probe and simulate a repair; otherwise, the repair
+ * codepaths return EOPNOTSUPP.
+ */
+ if (xchk_could_repair(sc))
+ sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT;
return 0;
}
Currently, DWC3 driver attempts to acquire the USB power supply only
once during the probe. If the USB power supply is not ready at that
time, the driver simply ignores the failure and continues the probe,
leading to permanent non-functioning of the gadget vbus_draw callback.
Address this problem by delaying the dwc3 driver initialization until
the USB power supply is registered.
Fixes: 6f0764b5adea ("usb: dwc3: add a power supply for current control")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kyle Tso <kyletso(a)google.com>
---
Note: This is a follow-up of https://lore.kernel.org/all/20240804084612.2561230-1-kyletso@google.com/
---
drivers/usb/dwc3/core.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index 7578c5133568..1550c39e792a 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -1669,7 +1669,7 @@ static void dwc3_get_software_properties(struct dwc3 *dwc)
}
}
-static void dwc3_get_properties(struct dwc3 *dwc)
+static int dwc3_get_properties(struct dwc3 *dwc)
{
struct device *dev = dwc->dev;
u8 lpm_nyet_threshold;
@@ -1724,7 +1724,7 @@ static void dwc3_get_properties(struct dwc3 *dwc)
if (ret >= 0) {
dwc->usb_psy = power_supply_get_by_name(usb_psy_name);
if (!dwc->usb_psy)
- dev_err(dev, "couldn't get usb power supply\n");
+ return dev_err_probe(dev, -EPROBE_DEFER, "couldn't get usb power supply\n");
}
dwc->has_lpm_erratum = device_property_read_bool(dev,
@@ -1847,6 +1847,8 @@ static void dwc3_get_properties(struct dwc3 *dwc)
dwc->imod_interval = 0;
dwc->tx_fifo_resize_max_num = tx_fifo_resize_max_num;
+
+ return 0;
}
/* check whether the core supports IMOD */
@@ -2181,7 +2183,9 @@ static int dwc3_probe(struct platform_device *pdev)
dwc->regs = regs;
dwc->regs_size = resource_size(&dwc_res);
- dwc3_get_properties(dwc);
+ ret = dwc3_get_properties(dwc);
+ if (ret)
+ return ret;
dwc3_get_software_properties(dwc);
--
2.47.1.688.g23fc6f90ad-goog
It is observed sometimes when tethering is used over NCM with Windows 11
as host, at some instances, the gadget_giveback has one byte appended at
the end of a proper NTB. When the NTB is parsed, unwrap call looks for
any leftover bytes in SKB provided by u_ether and if there are any pending
bytes, it treats them as a separate NTB and parses it. But in case the
second NTB (as per unwrap call) is faulty/corrupt, all the datagrams that
were parsed properly in the first NTB and saved in rx_list are dropped.
Adding a few custom traces showed the following:
[002] d..1 7828.532866: dwc3_gadget_giveback: ep1out:
req 000000003868811a length 1025/16384 zsI ==> 0
[002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb toprocess: 1025
[002] d..1 7828.532867: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb seq: 0xce67
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x400
[002] d..1 7828.532868: ncm_unwrap_ntb: K: ncm_unwrap_ntb ndp_len: 0x10
[002] d..1 7828.532869: ncm_unwrap_ntb: K: Parsed NTB with 1 frames
In this case, the giveback is of 1025 bytes and block length is 1024.
The rest 1 byte (which is 0x00) won't be parsed resulting in drop of
all datagrams in rx_list.
Same is case with packets of size 2048:
[002] d..1 7828.557948: dwc3_gadget_giveback: ep1out:
req 0000000011dfd96e length 2049/16384 zsI ==> 0
[002] d..1 7828.557949: ncm_unwrap_ntb: K: ncm_unwrap_ntb nth: 1751999342
[002] d..1 7828.557950: ncm_unwrap_ntb: K: ncm_unwrap_ntb blk_len: 0x800
Lecroy shows one byte coming in extra confirming that the byte is coming
in from PC:
Transfer 2959 - Bytes Transferred(1025) Timestamp((18.524 843 590)
- Transaction 8391 - Data(1025 bytes) Timestamp(18.524 843 590)
--- Packet 4063861
Data(1024 bytes)
Duration(2.117us) Idle(14.700ns) Timestamp(18.524 843 590)
--- Packet 4063863
Data(1 byte)
Duration(66.160ns) Time(282.000ns) Timestamp(18.524 845 722)
According to Windows driver, no ZLP is needed if wBlockLength is non-zero,
because the non-zero wBlockLength has already told the function side the
size of transfer to be expected. However, there are in-market NCM devices
that rely on ZLP as long as the wBlockLength is multiple of wMaxPacketSize.
To deal with such devices, it pads an extra 0 at end so the transfer is no
longer multiple of wMaxPacketSize.
Cc: <stable(a)vger.kernel.org>
Fixes: 9f6ce4240a2b ("usb: gadget: f_ncm.c added")
Signed-off-by: Krishna Kurapati <quic_kriskura(a)quicinc.com>
---
Link to v2:
https://lore.kernel.org/all/20240131150332.1326523-1-quic_kriskura@quicinc.…
Changes in v2:
Added check to see if the padded byte is 0x00.
Changes in v3:
Removed wMaxPacketSize check from v2.
drivers/usb/gadget/function/f_ncm.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index ca5d5f564998..e2a059cfda2c 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -1338,7 +1338,15 @@ static int ncm_unwrap_ntb(struct gether *port,
"Parsed NTB with %d frames\n", dgram_counter);
to_process -= block_len;
- if (to_process != 0) {
+
+ /*
+ * Windows NCM driver avoids USB ZLPs by adding a 1-byte
+ * zero pad as needed.
+ */
+ if (to_process == 1 &&
+ (*(unsigned char *)(ntb_ptr + block_len) == 0x00)) {
+ to_process--;
+ } else if (to_process > 0) {
ntb_ptr = (unsigned char *)(ntb_ptr + block_len);
goto parse_ntb;
}
--
2.34.1
Last week, Jakub reported [1] that the MPTCP Connect selftest was
unstable. It looked like it started after the introduction of some fixes
[2]. After analysis from Paolo, these patches revealed existing bugs,
that should be fixed by the following patches.
- Patch 1: Make sure ACK are sent when MPTCP-level window re-opens. In
some corner cases, the other peer was not notified when more data
could be sent. A fix for v5.11, but depending on a feature introduced
in v5.19.
- Patch 2: Fix spurious wake-up under memory pressure. In this
situation, the userspace could be invited to read data not being there
yet. A fix for v6.7.
- Patch 3: Fix a false positive error when running the MPTCP Connect
selftest with the "disconnect" cases. The userspace could disconnect
the socket too soon, which would reset (MP_FASTCLOSE) the connection,
interpreted as an error by the test. A fix for v5.17.
Link: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org [1]
Link: https://lore.kernel.org/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@ker… [2]
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Paolo Abeni (3):
mptcp: be sure to send ack when mptcp-level window re-opens
mptcp: fix spurious wake-up on under memory pressure
selftests: mptcp: avoid spurious errors on disconnect
net/mptcp/options.c | 6 ++--
net/mptcp/protocol.h | 9 +++--
tools/testing/selftests/net/mptcp/mptcp_connect.c | 43 +++++++++++++++++------
3 files changed, 43 insertions(+), 15 deletions(-)
---
base-commit: 76201b5979768500bca362871db66d77cb4c225e
change-id: 20250113-net-mptcp-connect-st-flakes-4af6389808de
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
The orig_a0 is missing in struct user_regs_struct of riscv, and there is
no way to add it without breaking UAPI. (See Link tag below)
Like NT_ARM_SYSTEM_CALL do, we add a new regset name NT_RISCV_ORIG_A0 to
access original a0 register from userspace via ptrace API.
Link: https://lore.kernel.org/all/59505464-c84a-403d-972f-d4b2055eeaac@gmail.com/
Signed-off-by: Celeste Liu <uwu(a)coelacanthus.name>
---
Changes in v5:
- Fix wrong usage in selftests.
- Link to v4: https://lore.kernel.org/r/20241226-riscv-new-regset-v4-0-4496a29d0436@coela…
Changes in v4:
- Fix a copy paste error in selftest. (Forget to commit...)
- Link to v3: https://lore.kernel.org/r/20241226-riscv-new-regset-v3-0-f5b96465826b@coela…
Changes in v3:
- Use return 0 directly for readability.
- Fix test for modify a0.
- Add Fixes: tag
- Remove useless Cc: stable.
- Selftest will check both a0 and orig_a0, but depends on the
correctness of PTRACE_GET_SYSCALL_INFO.
- Link to v2: https://lore.kernel.org/r/20241203-riscv-new-regset-v2-0-d37da8c0cba6@coela…
Changes in v2:
- Fix integer width.
- Add selftest.
- Link to v1: https://lore.kernel.org/r/20241201-riscv-new-regset-v1-1-c83c58abcc7b@coela…
---
Celeste Liu (2):
riscv/ptrace: add new regset to access original a0 register
riscv: selftests: Add a ptrace test to verify a0 and orig_a0 access
arch/riscv/kernel/ptrace.c | 32 +++++
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/riscv/abi/.gitignore | 1 +
tools/testing/selftests/riscv/abi/Makefile | 6 +-
tools/testing/selftests/riscv/abi/ptrace.c | 201 +++++++++++++++++++++++++++
5 files changed, 240 insertions(+), 1 deletion(-)
---
base-commit: 0e287d31b62bb53ad81d5e59778384a40f8b6f56
change-id: 20241201-riscv-new-regset-d529b952ad0d
Best regards,
--
Celeste Liu <uwu(a)coelacanthus.name>
Since commit 02fb4f008433 ("clk: clk-loongson2: Fix potential buffer
overflow in flexible-array member access"), the clk provider register is
failed.
The count of `clks_num` is shown below:
for (p = data; p->name; p++)
clks_num++;
In fact, `clks_num` represents the number of SoC clocks and should be
expressed as the maximum value of the clock binding id in use (p->id + 1).
Now we fix it to avoid the following error when trying to register a clk
provider:
[ 13.409595] of_clk_hw_onecell_get: invalid index 17
Cc: stable(a)vger.kernel.org
Cc: Gustavo A. R. Silva <gustavoars(a)kernel.org>
Fixes: 02fb4f008433 ("clk: clk-loongson2: Fix potential buffer overflow in flexible-array member access")
Signed-off-by: Binbin Zhou <zhoubinbin(a)loongson.cn>
---
drivers/clk/clk-loongson2.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/clk-loongson2.c b/drivers/clk/clk-loongson2.c
index 6bf51d5a49a1..27e632edd484 100644
--- a/drivers/clk/clk-loongson2.c
+++ b/drivers/clk/clk-loongson2.c
@@ -294,7 +294,7 @@ static int loongson2_clk_probe(struct platform_device *pdev)
return -EINVAL;
for (p = data; p->name; p++)
- clks_num++;
+ clks_num = max(clks_num, p->id + 1);
clp = devm_kzalloc(dev, struct_size(clp, clk_data.hws, clks_num),
GFP_KERNEL);
@@ -309,6 +309,9 @@ static int loongson2_clk_probe(struct platform_device *pdev)
clp->clk_data.num = clks_num;
clp->dev = dev;
+ /* Avoid returning NULL for unused id */
+ memset_p((void **)clp->clk_data.hws, ERR_PTR(-ENOENT), clks_num);
+
for (i = 0; i < clks_num; i++) {
p = &data[i];
switch (p->type) {
--
2.43.5
When IORING_REGISTER_CLONE_BUFFERS is used to clone buffers from uring
instance A to uring instance B, where A and B use different MMs for
accounting, the accounting can go wrong:
If uring instance A is closed before uring instance B, the pinned memory
counters for uring instance B will be decremented, even though the pinned
memory was originally accounted through uring instance A; so the MM of
uring instance B can end up with negative locked memory.
Cc: stable(a)vger.kernel.org
Closes: https://lore.kernel.org/r/CAG48ez1zez4bdhmeGLEFxtbFADY4Czn3CV0u9d_TMcbvRA01…
Fixes: 7cc2a6eadcd7 ("io_uring: add IORING_REGISTER_COPY_BUFFERS method")
Signed-off-by: Jann Horn <jannh(a)google.com>
---
To be clear, I think this is a very minor issue, feel free to take your
time landing it.
I put a stable marker on this, but I'm ambivalent about whether this
issue even warrants landing a fix in stable - feel free to remove the
Cc stable marker if you think it's unnecessary.
---
io_uring/rsrc.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 077f84684c18a0b3f5e622adb4978b6a00353b2f..caecc18dd5be03054ae46179bc0918887bf609a4 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -931,6 +931,13 @@ static int io_clone_buffers(struct io_ring_ctx *ctx, struct io_ring_ctx *src_ctx
int i, ret, off, nr;
unsigned int nbufs;
+ /*
+ * Accounting state is shared between the two rings; that only works if
+ * both rings are accounted towards the same counters.
+ */
+ if (ctx->user != src_ctx->user || ctx->mm_account != src_ctx->mm_account)
+ return -EINVAL;
+
/* if offsets are given, must have nr specified too */
if (!arg->nr && (arg->dst_off || arg->src_off))
return -EINVAL;
---
base-commit: c45323b7560ec87c37c729b703c86ee65f136d75
change-id: 20250114-uring-check-accounting-4356f8b91c37
--
Jann Horn <jannh(a)google.com>
From: Zijun Hu <quic_zijuhu(a)quicinc.com>
For WCN6855, board ID specific NVM needs to be downloaded once board ID
is available, but the default NVM is always downloaded currently.
The wrong NVM causes poor RF performance, and effects user experience
for several types of laptop with WCN6855 on the market.
Fix by downloading board ID specific NVM if board ID is available.
Fixes: 095327fede00 ("Bluetooth: hci_qca: Add support for QTI Bluetooth chip wcn6855")
Cc: stable(a)vger.kernel.org # 6.4
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Changes in v3:
- Rework over tip of bluetooth-next tree.
- Remove both Reviewed-by and Tested-by tags.
- Link to v2: https://lore.kernel.org/r/20241116-x13s_wcn6855_fix-v2-1-c08c298d5fbf@quici…
Changes in v2:
- Correct subject and commit message
- Temporarily add nvm fallback logic to speed up backport.
- Add fix/stable tags as suggested by Luiz and Johan
- Link to v1: https://lore.kernel.org/r/20241113-x13s_wcn6855_fix-v1-1-15af0aa2549c@quici…
---
drivers/bluetooth/btqca.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index a6b53d1f23dbd4666b93e10635f5f154f38d80a5..cdf09d9a9ad27c080f27c5fe8d61d76085e1fd2c 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -909,8 +909,9 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
"qca/msnv%02x.bin", rom_ver);
break;
case QCA_WCN6855:
- snprintf(config.fwname, sizeof(config.fwname),
- "qca/hpnv%02x.bin", rom_ver);
+ qca_read_fw_board_id(hdev, &boardid);
+ qca_get_nvm_name_by_board(config.fwname, sizeof(config.fwname),
+ "hpnv", soc_type, ver, rom_ver, boardid);
break;
case QCA_WCN7850:
qca_get_nvm_name_by_board(config.fwname, sizeof(config.fwname),
---
base-commit: a723753d039fd9a6c5998340ac65f4d9e2966ba8
change-id: 20250113-wcn6855_fix-036ca2fa5559
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
Commit 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
in v6.6 added new code to handle the Power On/Reset Unit Attention
(POR UA) sense data. This was in addition to the existing method. When
this Unit Attention is received, the driver blocks attempts to read,
write and some other operations because the reset may have rewinded
the tape. Because of the added code, also the initial POR UA resulted
in blocking operations, including those that are used to set the driver
options after the device is recognized. Also, reading and writing are
refused, whereas they succeeded before this commit.
This patch adds code to not set pos_unknown to block operations if the
POR UA is received from the first test_ready() call after the st device
has been created. This restores the behavior before v6.6.
Signed-off-by: Kai Mäkisara <Kai.Makisara(a)kolumbus.fi>
Fixes: 9604eea5bd3a ("scsi: st: Add third party poweron reset handling")
Closes: https://lore.kernel.org/linux-scsi/2201CF73-4795-4D3B-9A79-6EE5215CF58D@kol…
CC: stable(a)vger.kernel.org
---
drivers/scsi/st.c | 6 ++++++
drivers/scsi/st.h | 1 +
2 files changed, 7 insertions(+)
diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index e8ef27d7ef61..ebbd50ec0cda 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -1030,6 +1030,11 @@ static int test_ready(struct scsi_tape *STp, int do_wait)
retval = new_session ? CHKRES_NEW_SESSION : CHKRES_READY;
break;
}
+ if (STp->first_tur) {
+ /* Don't set pos_unknown right after device recognition */
+ STp->pos_unknown = 0;
+ STp->first_tur = 0;
+ }
if (SRpnt != NULL)
st_release_request(SRpnt);
@@ -4328,6 +4333,7 @@ static int st_probe(struct device *dev)
blk_queue_rq_timeout(tpnt->device->request_queue, ST_TIMEOUT);
tpnt->long_timeout = ST_LONG_TIMEOUT;
tpnt->try_dio = try_direct_io;
+ tpnt->first_tur = 1;
for (i = 0; i < ST_NBR_MODES; i++) {
STm = &(tpnt->modes[i]);
diff --git a/drivers/scsi/st.h b/drivers/scsi/st.h
index 7a68eaba7e81..1aaaf5369a40 100644
--- a/drivers/scsi/st.h
+++ b/drivers/scsi/st.h
@@ -170,6 +170,7 @@ struct scsi_tape {
unsigned char rew_at_close; /* rewind necessary at close */
unsigned char inited;
unsigned char cleaning_req; /* cleaning requested? */
+ unsigned char first_tur; /* first TEST UNIT READY */
int block_size;
int min_block;
int max_block;
--
2.43.0
Currently the rseq constructor, rseq_init(), assumes that glibc always
has the support for rseq symbols (__rseq_size for instance). However,
glibc supports rseq from version 2.35 onwards. As a result, for the
systems that run glibc less than 2.35, the global rseq_size remains
initialized to -1U. When a thread then tries to register for rseq,
get_rseq_min_alloc_size() would end up returning -1U, which is
incorrect. Hence, initialize rseq_size for the cases where glibc doesn't
have the support for rseq symbols.
Cc: stable(a)vger.kernel.org
Fixes: 73a4f5a704a2 ("selftests/rseq: Fix mm_cid test failure")
Signed-off-by: Raghavendra Rao Ananta <rananta(a)google.com>
---
tools/testing/selftests/rseq/rseq.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 5b9772cdf265..9eb5356f25fa 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -142,6 +142,16 @@ unsigned int get_rseq_kernel_feature_size(void)
return ORIG_RSEQ_FEATURE_SIZE;
}
+static void set_default_rseq_size(void)
+{
+ unsigned int rseq_kernel_feature_size = get_rseq_kernel_feature_size();
+
+ if (rseq_kernel_feature_size < ORIG_RSEQ_ALLOC_SIZE)
+ rseq_size = rseq_kernel_feature_size;
+ else
+ rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+}
+
int rseq_register_current_thread(void)
{
int rc;
@@ -219,12 +229,7 @@ void rseq_init(void)
fallthrough;
case ORIG_RSEQ_ALLOC_SIZE:
{
- unsigned int rseq_kernel_feature_size = get_rseq_kernel_feature_size();
-
- if (rseq_kernel_feature_size < ORIG_RSEQ_ALLOC_SIZE)
- rseq_size = rseq_kernel_feature_size;
- else
- rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+ set_default_rseq_size();
break;
}
default:
@@ -239,8 +244,10 @@ void rseq_init(void)
rseq_size = 0;
return;
}
+
rseq_offset = (void *)&__rseq_abi - rseq_thread_pointer();
rseq_flags = 0;
+ set_default_rseq_size();
}
static __attribute__((destructor))
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
--
2.47.0.338.g60cca15819-goog
[ upstream commit bd2703b42decebdcddf76e277ba76b4c4a142d73 ]
With IORING_SETUP_SQPOLL all requests are created by the SQPOLL task,
which means that req->task should always match sqd->thread. Since
accesses to sqd->thread should be separately protected, use req->task
in io_req_normal_work_add() instead.
Note, in the eyes of io_req_normal_work_add(), the SQPOLL task struct
is always pinned and alive, and sqd->thread can either be the task or
NULL. It's only problematic if the compiler decides to reload the value
after the null check, which is not so likely.
Cc: stable(a)vger.kernel.org
Cc: Bui Quang Minh <minhquangbui99(a)gmail.com>
Reported-by: lizetao <lizetao1(a)huawei.com>
Fixes: 78f9b61bd8e54 ("io_uring: wake SQPOLL task when task_work is added to an empty queue")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/1cbbe72cf32c45a8fee96026463024cd8564a7d7.17365413…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
---
io_uring/io_uring.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9849da128364..21f1bcba2f52 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -1244,10 +1244,7 @@ static void io_req_normal_work_add(struct io_kiocb *req)
/* SQPOLL doesn't need the task_work added, it'll run it itself */
if (ctx->flags & IORING_SETUP_SQPOLL) {
- struct io_sq_data *sqd = ctx->sq_data;
-
- if (sqd->thread)
- __set_notify_signal(sqd->thread);
+ __set_notify_signal(req->task);
return;
}
--
2.47.1
Currently memremap(MEMREMAP_WB) can produce decrypted/shared mapping:
memremap(MEMREMAP_WB)
arch_memremap_wb()
ioremap_cache()
__ioremap_caller(.encrytped = false)
In such cases, the IORES_MAP_ENCRYPTED flag on the memory will determine
if the resulting mapping is encrypted or decrypted.
Creating a decrypted mapping without explicit request from the caller is
risky:
- It can inadvertently expose the guest's data and compromise the
guest.
- Accessing private memory via shared/decrypted mapping on TDX will
either trigger implicit conversion to shared or #VE (depending on
VMM implementation).
Implicit conversion is destructive: subsequent access to the same
memory via private mapping will trigger a hard-to-debug #VE crash.
The kernel already provides a way to request decrypted mapping
explicitly via the MEMREMAP_DEC flag.
Modify memremap(MEMREMAP_WB) to produce encrypted/private mapping by
default unless MEMREMAP_DEC is specified.
Fix the crash due to #VE on kexec in TDX guests if CONFIG_EISA is enabled.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # 6.11+
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Ashish Kalra <ashish.kalra(a)amd.com>
Cc: "Maciej W. Rozycki" <macro(a)orcam.me.uk>
---
arch/x86/include/asm/io.h | 3 +++
arch/x86/mm/ioremap.c | 8 ++++++++
2 files changed, 11 insertions(+)
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index ed580c7f9d0a..1a0dc2b2bf5b 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -175,6 +175,9 @@ extern void __iomem *ioremap_prot(resource_size_t offset, unsigned long size, un
extern void __iomem *ioremap_encrypted(resource_size_t phys_addr, unsigned long size);
#define ioremap_encrypted ioremap_encrypted
+void *arch_memremap_wb(phys_addr_t phys_addr, size_t size, unsigned long flags);
+#define arch_memremap_wb arch_memremap_wb
+
/**
* ioremap - map bus memory into CPU space
* @offset: bus address of the memory
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 8d29163568a7..3c36f3f5e688 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -503,6 +503,14 @@ void iounmap(volatile void __iomem *addr)
}
EXPORT_SYMBOL(iounmap);
+void *arch_memremap_wb(phys_addr_t phys_addr, size_t size, unsigned long flags)
+{
+ if (flags & MEMREMAP_DEC)
+ return (void __force *)ioremap_cache(phys_addr, size);
+
+ return (void __force *)ioremap_encrypted(phys_addr, size);
+}
+
/*
* Convert a physical pointer to a virtual kernel pointer for /dev/mem
* access
--
2.45.2
Since commit 02fb4f008433 ("clk: clk-loongson2: Fix potential buffer
overflow in flexible-array member access"), the clk provider register is
failed.
The count of `clks_num` is shown below:
for (p = data; p->name; p++)
clks_num++;
In fact, `clks_num` represents the number of SoC clocks and should be
expressed as the maximum value of the clock binding id in use (p->id + 1).
Now we fix it to avoid the following error when trying to register a clk
provider:
[ 13.409595] of_clk_hw_onecell_get: invalid index 17
Cc: stable(a)vger.kernel.org
Cc: Gustavo A. R. Silva <gustavoars(a)kernel.org>
Fixes: 02fb4f008433 ("clk: clk-loongson2: Fix potential buffer overflow in flexible-array member access")
Signed-off-by: Binbin Zhou <zhoubinbin(a)loongson.cn>
---
V2:
- Add Gustavo A. R. Silva to cc list;
- Populate the onecell data with -ENOENT error pointers to avoid
returning NULL, for it is a valid clock.
Link to V1:
https://lore.kernel.org/all/20241225060600.3094154-1-zhoubinbin@loongson.cn/
drivers/clk/clk-loongson2.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/clk-loongson2.c b/drivers/clk/clk-loongson2.c
index 6bf51d5a49a1..9c240a2308f5 100644
--- a/drivers/clk/clk-loongson2.c
+++ b/drivers/clk/clk-loongson2.c
@@ -294,7 +294,7 @@ static int loongson2_clk_probe(struct platform_device *pdev)
return -EINVAL;
for (p = data; p->name; p++)
- clks_num++;
+ clks_num = max(clks_num, p->id + 1);
clp = devm_kzalloc(dev, struct_size(clp, clk_data.hws, clks_num),
GFP_KERNEL);
@@ -309,6 +309,9 @@ static int loongson2_clk_probe(struct platform_device *pdev)
clp->clk_data.num = clks_num;
clp->dev = dev;
+ /* Avoid returning NULL for unused id */
+ memset_p((void **)&clp->clk_data.hws, ERR_PTR(-ENOENT), clks_num);
+
for (i = 0; i < clks_num; i++) {
p = &data[i];
switch (p->type) {
--
2.43.5
Hi, I'm experiencing UBSAN array-index-out-of-bounds errors while using
my Framework 13" AMD laptop with its Mediatek MT7922 wifi adapter
(mt7921e).
It seems to happen only once on boot, and occurs with both kernel
versions 6.12.7 and 6.13-rc4, both compiled from vanilla upstream kernel
sources on Fedora 41 using the kernel.org LLVM toolchain (19.1.6).
I can try some other kernel series if necessary, and also a bisect if I
find a working version, but that may take me a while.
I wasn't sure if I should mark this as a regression, as I'm not sure
which/if there is a working kernel version at this point.
Thanks.
----
[ 17.754417] UBSAN: array-index-out-of-bounds in /data/linux/net/wireless/scan.c:766:2
[ 17.754423] index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')
[ 17.754427] CPU: 13 UID: 0 PID: 620 Comm: kworker/u64:10 Tainted: G T 6.13.0-rc4 #9
[ 17.754433] Tainted: [T]=RANDSTRUCT
[ 17.754435] Hardware name: Framework Laptop 13 (AMD Ryzen 7040Series)/FRANMDCP07, BIOS 03.05 03/29/2024
[ 17.754438] Workqueue: events_unbound cfg80211_wiphy_work
[ 17.754446] Call Trace:
[ 17.754449] <TASK>
[ 17.754452] dump_stack_lvl+0x82/0xc0
[ 17.754459] __ubsan_handle_out_of_bounds+0xe7/0x110
[ 17.754464] ? srso_alias_return_thunk+0x5/0xfbef5
[ 17.754470] ? __kmalloc_noprof+0x1a7/0x280
[ 17.754477] cfg80211_scan_6ghz+0x3bb/0xfd0
[ 17.754482] ? srso_alias_return_thunk+0x5/0xfbef5
[ 17.754486] ? try_to_wake_up+0x368/0x4c0
[ 17.754491] ? try_to_wake_up+0x1a9/0x4c0
[ 17.754496] ___cfg80211_scan_done+0xa9/0x1e0
[ 17.754500] cfg80211_wiphy_work+0xb7/0xe0
[ 17.754504] process_scheduled_works+0x205/0x3a0
[ 17.754509] worker_thread+0x24a/0x300
[ 17.754514] ? __cfi_worker_thread+0x10/0x10
[ 17.754519] kthread+0x158/0x180
[ 17.754524] ? __cfi_kthread+0x10/0x10
[ 17.754528] ret_from_fork+0x40/0x50
[ 17.754534] ? __cfi_kthread+0x10/0x10
[ 17.754538] ret_from_fork_asm+0x11/0x30
[ 17.754544] </TASK>
On Tue, Jan 14, 2025 at 10:47:33AM +0100, Johan Hovold wrote:
> On Mon, Jan 13, 2025 at 06:00:34PM +0000, Qasim Ijaz wrote:
> > This patch addresses a null-ptr-deref in qt2_process_read_urb() due to
> > an incorrect bounds check in the following:
> >
> > if (newport > serial->num_ports) {
> > dev_err(&port->dev,
> > "%s - port change to invalid port: %i\n",
> > __func__, newport);
> > break;
> > }
> >
> > The condition doesn't account for the valid range of the serial->port
> > buffer, which is from 0 to serial->num_ports - 1. When newport is equal
> > to serial->num_ports, the assignment of "port" in the
> > following code is out-of-bounds and NULL:
> >
> > serial_priv->current_port = newport;
> > port = serial->port[serial_priv->current_port];
> >
> > The fix checks if newport is greater than or equal to serial->num_ports
> > indicating it is out-of-bounds.
> >
> > Reported-by: syzbot <syzbot+506479ebf12fe435d01a(a)syzkaller.appspotmail.com>
> > Closes: https://syzkaller.appspot.com/bug?extid=506479ebf12fe435d01a
> > Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
> > Cc: <stable(a)vger.kernel.org> # 3.5
> > Signed-off-by: Qasim Ijaz <qasdev00(a)gmail.com>
> > ---
>
> Thanks for the update. I've applied the patch now after adding Greg's
> Reviewed-by tag (for v2).
>
> For your future contributions, try to remember to include any
> Reviewed-by or Tested-by tags when updating the patch unless the changes
> are non-trivial.
>
> There should typically also be a short change log here under the ---
> line to indicate what changes from previous versions.
>
> It is also encouraged to write the commit message in imperative mood
> (add, change, fix) and to avoid the phrase "this patch". There are some
> more details in
>
> Documentation/process/submitting-patches.rst
>
> Something to keep in mind for the future, but this patch already looks
> really good.
>
> Johan
Hi Johan,
Thanks for reviewing and applying the patch. I appreciate the guidance on patch style and process, and I'll incorporate your suggestions in future submissions.
Best regards,
Qasim
Hi Carlos,
Please pull this branch with changes for xfs.
As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts. Please let me know if you
encounter any problems.
--D
The following changes since commit 05290bd5c6236b8ad659157edb36bd2d38f46d3e:
xfs: allow inode-based btrees to reserve space in the data device (2024-12-23 13:06:03 -0800)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git tags/realtime-rmap_2024-12-23
for you to fetch changes up to c2358439af374cad47f771797875d0beb8256738:
xfs: enable realtime rmap btree (2024-12-23 13:06:09 -0800)
----------------------------------------------------------------
xfs: realtime reverse-mapping support [v6.2 04/14]
This is the latest revision of a patchset that adds to XFS kernel
support for reverse mapping for the realtime device. This time around
I've fixed some of the bitrot that I've noticed over the past few
months, and most notably have converted rtrmapbt to use the metadata
inode directory feature instead of burning more space in the superblock.
At the beginning of the set are patches to implement storing B+tree
leaves in an inode root, since the realtime rmapbt is rooted in an
inode, unlike the regular rmapbt which is rooted in an AG block.
Prior to this, the only btree that could be rooted in the inode fork
was the block mapping btree; if all the extent records fit in the
inode, format would be switched from 'btree' to 'extents'.
The next few patches enhance the reverse mapping routines to handle
the parts that are specific to rtgroups -- adding the new btree type,
adding a new log intent item type, and wiring up the metadata directory
tree entries.
Finally, implement GETFSMAP with the rtrmapbt and scrub functionality
for the rtrmapbt and rtbitmap and online fsck functionality.
This has been running on the djcloud for months with no problems. Enjoy!
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
----------------------------------------------------------------
Darrick J. Wong (37):
xfs: add some rtgroup inode helpers
xfs: prepare rmap btree cursor tracepoints for realtime
xfs: simplify the xfs_rmap_{alloc,free}_extent calling conventions
xfs: introduce realtime rmap btree ondisk definitions
xfs: realtime rmap btree transaction reservations
xfs: add realtime rmap btree operations
xfs: prepare rmap functions to deal with rtrmapbt
xfs: add a realtime flag to the rmap update log redo items
xfs: support recovering rmap intent items targetting realtime extents
xfs: pretty print metadata file types in error messages
xfs: support file data forks containing metadata btrees
xfs: add realtime reverse map inode to metadata directory
xfs: add metadata reservations for realtime rmap btrees
xfs: wire up a new metafile type for the realtime rmap
xfs: wire up rmap map and unmap to the realtime rmapbt
xfs: create routine to allocate and initialize a realtime rmap btree inode
xfs: wire up getfsmap to the realtime reverse mapping btree
xfs: check that the rtrmapbt maxlevels doesn't increase when growing fs
xfs: report realtime rmap btree corruption errors to the health system
xfs: allow queued realtime intents to drain before scrubbing
xfs: scrub the realtime rmapbt
xfs: cross-reference realtime bitmap to realtime rmapbt scrubber
xfs: cross-reference the realtime rmapbt
xfs: scan rt rmap when we're doing an intense rmap check of bmbt mappings
xfs: scrub the metadir path of rt rmap btree files
xfs: walk the rt reverse mapping tree when rebuilding rmap
xfs: online repair of realtime file bmaps
xfs: repair inodes that have realtime extents
xfs: repair rmap btree inodes
xfs: online repair of realtime bitmaps for a realtime group
xfs: support repairing metadata btrees rooted in metadir inodes
xfs: online repair of the realtime rmap btree
xfs: create a shadow rmap btree during realtime rmap repair
xfs: hook live realtime rmap operations during a repair operation
xfs: don't shut down the filesystem for media failures beyond end of log
xfs: react to fsdax failure notifications on the rt device
xfs: enable realtime rmap btree
fs/xfs/Makefile | 3 +
fs/xfs/libxfs/xfs_btree.c | 73 +++
fs/xfs/libxfs/xfs_btree.h | 8 +-
fs/xfs/libxfs/xfs_btree_mem.c | 1 +
fs/xfs/libxfs/xfs_btree_staging.c | 1 +
fs/xfs/libxfs/xfs_defer.h | 1 +
fs/xfs/libxfs/xfs_exchmaps.c | 4 +-
fs/xfs/libxfs/xfs_format.h | 28 +-
fs/xfs/libxfs/xfs_fs.h | 7 +-
fs/xfs/libxfs/xfs_health.h | 4 +-
fs/xfs/libxfs/xfs_inode_buf.c | 32 +-
fs/xfs/libxfs/xfs_inode_fork.c | 25 +
fs/xfs/libxfs/xfs_log_format.h | 6 +-
fs/xfs/libxfs/xfs_log_recover.h | 2 +
fs/xfs/libxfs/xfs_metafile.c | 18 +
fs/xfs/libxfs/xfs_metafile.h | 2 +
fs/xfs/libxfs/xfs_ondisk.h | 2 +
fs/xfs/libxfs/xfs_refcount.c | 6 +-
fs/xfs/libxfs/xfs_rmap.c | 171 +++++--
fs/xfs/libxfs/xfs_rmap.h | 12 +-
fs/xfs/libxfs/xfs_rtbitmap.c | 2 +-
fs/xfs/libxfs/xfs_rtbitmap.h | 9 +
fs/xfs/libxfs/xfs_rtgroup.c | 53 +-
fs/xfs/libxfs/xfs_rtgroup.h | 49 +-
fs/xfs/libxfs/xfs_rtrmap_btree.c | 1011 +++++++++++++++++++++++++++++++++++++
fs/xfs/libxfs/xfs_rtrmap_btree.h | 210 ++++++++
fs/xfs/libxfs/xfs_sb.c | 6 +
fs/xfs/libxfs/xfs_shared.h | 14 +
fs/xfs/libxfs/xfs_trans_resv.c | 12 +-
fs/xfs/libxfs/xfs_trans_space.h | 13 +
fs/xfs/scrub/alloc_repair.c | 5 +-
fs/xfs/scrub/bmap.c | 108 +++-
fs/xfs/scrub/bmap_repair.c | 129 ++++-
fs/xfs/scrub/common.c | 160 +++++-
fs/xfs/scrub/common.h | 23 +-
fs/xfs/scrub/health.c | 1 +
fs/xfs/scrub/inode.c | 10 +-
fs/xfs/scrub/inode_repair.c | 136 ++++-
fs/xfs/scrub/metapath.c | 3 +
fs/xfs/scrub/newbt.c | 42 ++
fs/xfs/scrub/newbt.h | 1 +
fs/xfs/scrub/reap.c | 41 ++
fs/xfs/scrub/reap.h | 2 +
fs/xfs/scrub/repair.c | 191 +++++++
fs/xfs/scrub/repair.h | 17 +
fs/xfs/scrub/rgsuper.c | 6 +-
fs/xfs/scrub/rmap_repair.c | 84 ++-
fs/xfs/scrub/rtbitmap.c | 75 ++-
fs/xfs/scrub/rtbitmap.h | 55 ++
fs/xfs/scrub/rtbitmap_repair.c | 429 +++++++++++++++-
fs/xfs/scrub/rtrmap.c | 271 ++++++++++
fs/xfs/scrub/rtrmap_repair.c | 903 +++++++++++++++++++++++++++++++++
fs/xfs/scrub/rtsummary.c | 17 +-
fs/xfs/scrub/rtsummary_repair.c | 3 +-
fs/xfs/scrub/scrub.c | 11 +-
fs/xfs/scrub/scrub.h | 14 +
fs/xfs/scrub/stats.c | 1 +
fs/xfs/scrub/tempexch.h | 2 +-
fs/xfs/scrub/tempfile.c | 20 +-
fs/xfs/scrub/trace.c | 1 +
fs/xfs/scrub/trace.h | 228 ++++++++-
fs/xfs/xfs_buf.c | 1 +
fs/xfs/xfs_buf_item_recover.c | 4 +
fs/xfs/xfs_drain.c | 20 +-
fs/xfs/xfs_drain.h | 7 +-
fs/xfs/xfs_fsmap.c | 174 ++++++-
fs/xfs/xfs_fsops.c | 11 +
fs/xfs/xfs_health.c | 1 +
fs/xfs/xfs_inode.c | 19 +-
fs/xfs/xfs_inode_item.c | 2 +
fs/xfs/xfs_inode_item_recover.c | 44 +-
fs/xfs/xfs_log_recover.c | 2 +
fs/xfs/xfs_mount.c | 5 +-
fs/xfs/xfs_mount.h | 9 +
fs/xfs/xfs_notify_failure.c | 230 +++++----
fs/xfs/xfs_notify_failure.h | 11 +
fs/xfs/xfs_qm.c | 8 +-
fs/xfs/xfs_rmap_item.c | 216 +++++++-
fs/xfs/xfs_rtalloc.c | 82 ++-
fs/xfs/xfs_rtalloc.h | 10 +
fs/xfs/xfs_stats.c | 4 +-
fs/xfs/xfs_stats.h | 2 +
fs/xfs/xfs_super.c | 6 -
fs/xfs/xfs_super.h | 1 -
fs/xfs/xfs_trace.h | 104 ++--
85 files changed, 5381 insertions(+), 366 deletions(-)
create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.c
create mode 100644 fs/xfs/libxfs/xfs_rtrmap_btree.h
create mode 100644 fs/xfs/scrub/rtrmap.c
create mode 100644 fs/xfs/scrub/rtrmap_repair.c
create mode 100644 fs/xfs/xfs_notify_failure.h
Hi Carlos,
Please pull this branch with changes for xfs.
As usual, I did a test-merge with the main upstream branch as of a few
minutes ago, and didn't see any conflicts. Please let me know if you
encounter any problems.
--D
The following changes since commit 4bbf9020becbfd8fc2c3da790855b7042fad455b:
Linux 6.13-rc4 (2024-12-22 13:22:21 -0800)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git tags/xfs-6.13-fixes_2024-12-23
for you to fetch changes up to 1aacd3fac248902ea1f7607f2d12b93929a4833b:
xfs: release the dquot buf outside of qli_lock (2024-12-23 13:06:01 -0800)
----------------------------------------------------------------
xfs: bug fixes for 6.13 [01/14]
Bug fixes for 6.13.
This has been running on the djcloud for months with no problems. Enjoy!
Signed-off-by: "Darrick J. Wong" <djwong(a)kernel.org>
----------------------------------------------------------------
Darrick J. Wong (2):
xfs: don't over-report free space or inodes in statvfs
xfs: release the dquot buf outside of qli_lock
fs/xfs/xfs_dquot.c | 12 ++++++++----
fs/xfs/xfs_qm_bhv.c | 27 +++++++++++++++++----------
2 files changed, 25 insertions(+), 14 deletions(-)
This patch addresses a null-ptr-deref in qt2_process_read_urb() due to
an incorrect bounds check in the following:
if (newport > serial->num_ports) {
dev_err(&port->dev,
"%s - port change to invalid port: %i\n",
__func__, newport);
break;
}
The condition doesn't account for the valid range of the serial->port
buffer, which is from 0 to serial->num_ports - 1. When newport is equal
to serial->num_ports, the assignment of "port" in the
following code is out-of-bounds and NULL:
serial_priv->current_port = newport;
port = serial->port[serial_priv->current_port];
The fix checks if newport is greater than or equal to serial->num_ports
indicating it is out-of-bounds.
Reported-by: syzbot <syzbot+506479ebf12fe435d01a(a)syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=506479ebf12fe435d01a
Fixes: f7a33e608d9a ("USB: serial: add quatech2 usb to serial driver")
Cc: <stable(a)vger.kernel.org> # 3.5
Signed-off-by: Qasim Ijaz <qasdev00(a)gmail.com>
---
drivers/usb/serial/quatech2.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/serial/quatech2.c b/drivers/usb/serial/quatech2.c
index a317bdbd00ad..72fe83a6c978 100644
--- a/drivers/usb/serial/quatech2.c
+++ b/drivers/usb/serial/quatech2.c
@@ -503,7 +503,7 @@ static void qt2_process_read_urb(struct urb *urb)
newport = *(ch + 3);
- if (newport > serial->num_ports) {
+ if (newport >= serial->num_ports) {
dev_err(&port->dev,
"%s - port change to invalid port: %i\n",
__func__, newport);
--
2.39.5
This is a resend of the patchset discussed here[1] for the 5.15 tree.
[1] https://lore.kernel.org/r/2025011052-backpedal-coat-2fec@gregkh
I've picked the "do not keep dangling zcomp pointer" patch from the
linux-rc tree at the time, so kept Sasha's SOB and added mine on top
-- please let me know if it wasn't appropriate.
I've also prepared the 5.10 patches, but I hadn't realized there were so
many stable deps (8 patchs total!); I'm honestly not sure the problem is
worth the churn but since it's done and tested I'll send the patches if
there is no problem with this 5.15 version.
Thanks!
Dominique Martinet (1):
zram: check comp is non-NULL before calling comp_destroy
Kairui Song (1):
zram: fix uninitialized ZRAM not releasing backing device
Sergey Senozhatsky (1):
drivers/block/zram/zram_drv.c: do not keep dangling zcomp pointer
after zram reset
drivers/block/zram/zram_drv.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--
2.39.5
The quilt patch titled
Subject: mm/hugetlb: fix avoid_reserve to allow taking folio from subpool
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-avoid_reserve-to-allow-taking-folio-from-subpool.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/hugetlb: fix avoid_reserve to allow taking folio from subpool
Date: Tue, 7 Jan 2025 15:39:56 -0500
Patch series "mm/hugetlb: Refactor hugetlb allocation resv accounting",
v2.
This is a follow up on Ackerley's series here as replacement:
https://lore.kernel.org/r/cover.1728684491.git.ackerleytng@google.com
The goal of this series is to cleanup hugetlb resv accounting, especially
during folio allocation, to decouple a few things:
- Hugetlb folios v.s. Hugetlbfs: IOW, the hope is in the future hugetlb
folios can be allocated completely without hugetlbfs.
- Decouple VMA v.s. hugetlb folio allocations: allocating a hugetlb folio
should not always require a hugetlbfs VMA. For example, either it got
allocated from the inode level (see hugetlbfs_fallocate() where it used
a pesudo VMA for allocation), or it can be allocated by other kernel
subsystems.
It paves way for other users to allocate hugetlb folios out of either
system reservations, or subpools (instead of hugetlbfs, as a file system).
For longer term, this prepares hugetlb as a separate concept versus
hugetlbfs, so that hugetlb folios can be allocated by not only hugetlbfs
and other things.
Tests I've done:
- I had a reproducer in patch 1 for the bug I found, this will start to
work after patch 1 or the whole set applied.
- Hugetlb regression tests (on x86_64 2MBs), includes:
- All vmtests on hugetlbfs
- libhugetlbfs test suite (which may fail some tests, but no new failures
will be introduced by this series, so all such failures happen before
this series so shouldn't be relevant).
This patch (of 7):
Since commit 04f2cbe35699 ("hugetlb: guarantee that COW faults for a
process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed"),
avoid_reserve was introduced for a special case of CoW on hugetlb private
mappings, and only if the owner VMA is trying to allocate yet another
hugetlb folio that is not reserved within the private vma reserved map.
Later on, in commit d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle
areas hole punched by fallocate"), alloc_huge_page() enforced to not
consume any global reservation as long as avoid_reserve=true. This
operation doesn't look correct, because even if it will enforce the
allocation to not use global reservation at all, it will still try to take
one reservation from the spool (if the subpool existed). Then since the
spool reserved pages take from global reservation, it'll also take one
reservation globally.
Logically it can cause global reservation to go wrong.
I wrote a reproducer below, trigger this special path, and every run of
such program will cause global reservation count to increment by one, until
it hits the number of free pages:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/mman.h>
#define MSIZE (2UL << 20)
int main(int argc, char *argv[])
{
const char *path;
int *buf;
int fd, ret;
pid_t child;
if (argc < 2) {
printf("usage: %s <hugetlb_file>\n", argv[0]);
return -1;
}
path = argv[1];
fd = open(path, O_RDWR | O_CREAT, 0666);
if (fd < 0) {
perror("open failed");
return -1;
}
ret = fallocate(fd, 0, 0, MSIZE);
if (ret != 0) {
perror("fallocate");
return -1;
}
buf = mmap(NULL, MSIZE, PROT_READ|PROT_WRITE,
MAP_PRIVATE, fd, 0);
if (buf == MAP_FAILED) {
perror("mmap() failed");
return -1;
}
/* Allocate a page */
*buf = 1;
child = fork();
if (child == 0) {
/* child doesn't need to do anything */
exit(0);
}
/* Trigger CoW from owner */
*buf = 2;
munmap(buf, MSIZE);
close(fd);
unlink(path);
return 0;
}
It can only reproduce with a sub-mount when there're reserved pages on the
spool, like:
# sysctl vm.nr_hugepages=128
# mkdir ./hugetlb-pool
# mount -t hugetlbfs -o min_size=8M,pagesize=2M none ./hugetlb-pool
Then run the reproducer on the mountpoint:
# ./reproducer ./hugetlb-pool/test
Fix it by taking the reservation from spool if available. In general,
avoid_reserve is IMHO more about "avoid vma resv map", not spool's.
I copied stable, however I have no intention for backporting if it's not a
clean cherry-pick, because private hugetlb mapping, and then fork() on top
is too rare to hit.
Link: https://lkml.kernel.org/r/20250107204002.2683356-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20250107204002.2683356-2-peterx@redhat.com
Fixes: d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle areas hole punched by fallocate")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reviewed-by: Ackerley Tng <ackerleytng(a)google.com>
Tested-by: Ackerley Tng <ackerleytng(a)google.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: Breno Leitao <leitao(a)debian.org>
Cc: Muchun Song <muchun.song(a)linux.dev>
Cc: Naoya Horiguchi <nao.horiguchi(a)gmail.com>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 22 +++-------------------
1 file changed, 3 insertions(+), 19 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-avoid_reserve-to-allow-taking-folio-from-subpool
+++ a/mm/hugetlb.c
@@ -1398,8 +1398,7 @@ static unsigned long available_huge_page
static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
struct vm_area_struct *vma,
- unsigned long address, int avoid_reserve,
- long chg)
+ unsigned long address, long chg)
{
struct folio *folio = NULL;
struct mempolicy *mpol;
@@ -1415,10 +1414,6 @@ static struct folio *dequeue_hugetlb_fol
if (!vma_has_reserves(vma, chg) && !available_huge_pages(h))
goto err;
- /* If reserves cannot be used, ensure enough pages are in the pool */
- if (avoid_reserve && !available_huge_pages(h))
- goto err;
-
gfp_mask = htlb_alloc_mask(h);
nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
@@ -1434,7 +1429,7 @@ static struct folio *dequeue_hugetlb_fol
folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask,
nid, nodemask);
- if (folio && !avoid_reserve && vma_has_reserves(vma, chg)) {
+ if (folio && vma_has_reserves(vma, chg)) {
folio_set_hugetlb_restore_reserve(folio);
h->resv_huge_pages--;
}
@@ -3051,17 +3046,6 @@ struct folio *alloc_hugetlb_folio(struct
gbl_chg = hugepage_subpool_get_pages(spool, 1);
if (gbl_chg < 0)
goto out_end_reservation;
-
- /*
- * Even though there was no reservation in the region/reserve
- * map, there could be reservations associated with the
- * subpool that can be used. This would be indicated if the
- * return value of hugepage_subpool_get_pages() is zero.
- * However, if avoid_reserve is specified we still avoid even
- * the subpool reservations.
- */
- if (avoid_reserve)
- gbl_chg = 1;
}
/* If this allocation is not consuming a reservation, charge it now.
@@ -3084,7 +3068,7 @@ struct folio *alloc_hugetlb_folio(struct
* from the global free pool (global change). gbl_chg == 0 indicates
* a reservation exists for the allocation.
*/
- folio = dequeue_hugetlb_folio_vma(h, vma, addr, avoid_reserve, gbl_chg);
+ folio = dequeue_hugetlb_folio_vma(h, vma, addr, gbl_chg);
if (!folio) {
spin_unlock_irq(&hugetlb_lock);
folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
_
Patches currently in -mm which might be from peterx(a)redhat.com are
The quilt patch titled
Subject: maple_tree: simplify split calculation
has been removed from the -mm tree. Its filename was
maple_tree-simplify-split-calculation.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Wei Yang <richard.weiyang(a)gmail.com>
Subject: maple_tree: simplify split calculation
Date: Wed, 13 Nov 2024 03:16:14 +0000
Patch series "simplify split calculation", v3.
This patch (of 3):
The current calculation for splitting nodes tries to enforce a minimum
span on the leaf nodes. This code is complex and never worked correctly
to begin with, due to the min value being passed as 0 for all leaves.
The calculation should just split the data as equally as possible
between the new nodes. Note that b_end will be one more than the data,
so the left side is still favoured in the calculation.
The current code may also lead to a deficient node by not leaving enough
data for the right side of the split. This issue is also addressed with
the split calculation change.
[Liam.Howlett(a)Oracle.com: rephrase the change log]
Link: https://lkml.kernel.org/r/20241113031616.10530-1-richard.weiyang@gmail.com
Link: https://lkml.kernel.org/r/20241113031616.10530-2-richard.weiyang@gmail.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Wei Yang <richard.weiyang(a)gmail.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett(a)Oracle.com>
Cc: Sidhartha Kumar <sidhartha.kumar(a)oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes(a)oracle.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/maple_tree.c | 23 ++++++-----------------
1 file changed, 6 insertions(+), 17 deletions(-)
--- a/lib/maple_tree.c~maple_tree-simplify-split-calculation
+++ a/lib/maple_tree.c
@@ -1863,11 +1863,11 @@ static inline int mab_no_null_split(stru
* Return: The first split location. The middle split is set in @mid_split.
*/
static inline int mab_calc_split(struct ma_state *mas,
- struct maple_big_node *bn, unsigned char *mid_split, unsigned long min)
+ struct maple_big_node *bn, unsigned char *mid_split)
{
unsigned char b_end = bn->b_end;
int split = b_end / 2; /* Assume equal split. */
- unsigned char slot_min, slot_count = mt_slots[bn->type];
+ unsigned char slot_count = mt_slots[bn->type];
/*
* To support gap tracking, all NULL entries are kept together and a node cannot
@@ -1900,18 +1900,7 @@ static inline int mab_calc_split(struct
split = b_end / 3;
*mid_split = split * 2;
} else {
- slot_min = mt_min_slots[bn->type];
-
*mid_split = 0;
- /*
- * Avoid having a range less than the slot count unless it
- * causes one node to be deficient.
- * NOTE: mt_min_slots is 1 based, b_end and split are zero.
- */
- while ((split < slot_count - 1) &&
- ((bn->pivot[split] - min) < slot_count - 1) &&
- (b_end - split > slot_min))
- split++;
}
/* Avoid ending a node on a NULL entry */
@@ -2377,7 +2366,7 @@ static inline struct maple_enode
static inline unsigned char mas_mab_to_node(struct ma_state *mas,
struct maple_big_node *b_node, struct maple_enode **left,
struct maple_enode **right, struct maple_enode **middle,
- unsigned char *mid_split, unsigned long min)
+ unsigned char *mid_split)
{
unsigned char split = 0;
unsigned char slot_count = mt_slots[b_node->type];
@@ -2390,7 +2379,7 @@ static inline unsigned char mas_mab_to_n
if (b_node->b_end < slot_count) {
split = b_node->b_end;
} else {
- split = mab_calc_split(mas, b_node, mid_split, min);
+ split = mab_calc_split(mas, b_node, mid_split);
*right = mas_new_ma_node(mas, b_node);
}
@@ -2877,7 +2866,7 @@ static void mas_spanning_rebalance(struc
mast->bn->b_end--;
mast->bn->type = mte_node_type(mast->orig_l->node);
split = mas_mab_to_node(mas, mast->bn, &left, &right, &middle,
- &mid_split, mast->orig_l->min);
+ &mid_split);
mast_set_split_parents(mast, left, middle, right, split,
mid_split);
mast_cp_to_nodes(mast, left, middle, right, split, mid_split);
@@ -3365,7 +3354,7 @@ static void mas_split(struct ma_state *m
if (mas_push_data(mas, height, &mast, false))
break;
- split = mab_calc_split(mas, b_node, &mid_split, prev_l_mas.min);
+ split = mab_calc_split(mas, b_node, &mid_split);
mast_split_data(&mast, mas, split);
/*
* Usually correct, mab_mas_cp in the above call overwrites
_
Patches currently in -mm which might be from richard.weiyang(a)gmail.com are
On Mon, 13 Jan 2025 06:03:53 -0800, Sasha Levin wrote:
>
Hello Sasha,
> This is a note to let you know that I've just added the patch titled
>
> drm/xe/oa: Separate batch submission from waiting for completion
>
> to the 6.12-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> drm-xe-oa-separate-batch-submission-from-waiting-for.patch
> and it can be found in the queue-6.12 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
I am writing about 3 emails I received (including this one) about 3 commits
being added to stable. These are the 3 commits I am referring to (all
commit SHA's refer to Linus' tree and first commit is at the bottom,
everywhere in this email):
2fb4350a283af drm/xe/oa: Add input fence dependencies
c8507a25cebd1 drm/xe/oa/uapi: Define and parse OA sync properties
dddcb19ad4d4b drm/xe/oa: Separate batch submission from waiting for completion
So none of these commits had any "Fixes:" tag or "Cc: stable" so not sure
why they are being added to stable. Also, they are part of a 7 commit
series so not sure why only 3 of the 7 commits are being added to stable?
In addition, for this commit which is also added to stable:
f0ed39830e606 xe/oa: Fix query mode of operation for OAR/OAC
We modified this commit for stable because it will otherwise with the
previous 3 commits mentioned above, which we were assuming would not be in
stable.
Now, if we want all these commits in stable (I actually prefer it), we
should just pick them straight from Linus' tree. This would be all these
commits:
f0ed39830e606 xe/oa: Fix query mode of operation for OAR/OAC
c0403e4ceecae drm/xe/oa: Fix "Missing outer runtime PM protection" warning
85d3f9e84e062 drm/xe/oa: Allow only certain property changes from config
9920c8b88c5cf drm/xe/oa: Add syncs support to OA config ioctl
cc4e6994d5a23 drm/xe/oa: Move functions up so they can be reused for config ioctl
343dd246fd9b5 drm/xe/oa: Signal output fences
2fb4350a283af drm/xe/oa: Add input fence dependencies
c8507a25cebd1 drm/xe/oa/uapi: Define and parse OA sync properties
dddcb19ad4d4b drm/xe/oa: Separate batch submission from waiting for completion
So: we should either drop the 3 patches at the top, or just pick all 9
patches above. Doing the latter will probably be the simplest and I don't
expect any conflicts, or if there are, I can help to resolve those.
The above list can be generated by running the following in Linus' tree:
git log --oneline -- drivers/gpu/drm/xe/xe_oa.c
Thanks.
--
Ashutosh
>
>
>
> commit 9aeced687e728b9de067a502a0780f8029e61763
> Author: Ashutosh Dixit <ashutosh.dixit(a)intel.com>
> Date: Tue Oct 22 13:03:46 2024 -0700
>
> drm/xe/oa: Separate batch submission from waiting for completion
>
> [ Upstream commit dddcb19ad4d4bbe943a72a1fb3266c6e8aa8d541 ]
>
> When we introduce xe_syncs, we don't wait for internal OA programming
> batches to complete. That is, xe_syncs are signaled asynchronously. In
> anticipation for this, separate out batch submission from waiting for
> completion of those batches.
>
> v2: Change return type of xe_oa_submit_bb to "struct dma_fence *" (Matt B)
> v3: Retain init "int err = 0;" in xe_oa_submit_bb (Jose)
>
> Reviewed-by: Jonathan Cavitt <jonathan.cavitt(a)intel.com>
> Signed-off-by: Ashutosh Dixit <ashutosh.dixit(a)intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-2-ashu…
> Stable-dep-of: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC")
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/drivers/gpu/drm/xe/xe_oa.c b/drivers/gpu/drm/xe/xe_oa.c
> index 78823f53d290..4962c9eb9a81 100644
> --- a/drivers/gpu/drm/xe/xe_oa.c
> +++ b/drivers/gpu/drm/xe/xe_oa.c
> @@ -567,11 +567,10 @@ static __poll_t xe_oa_poll(struct file *file, poll_table *wait)
> return ret;
> }
>
> -static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> +static struct dma_fence *xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> {
> struct xe_sched_job *job;
> struct dma_fence *fence;
> - long timeout;
> int err = 0;
>
> /* Kernel configuration is issued on stream->k_exec_q, not stream->exec_q */
> @@ -585,14 +584,9 @@ static int xe_oa_submit_bb(struct xe_oa_stream *stream, struct xe_bb *bb)
> fence = dma_fence_get(&job->drm.s_fence->finished);
> xe_sched_job_push(job);
>
> - timeout = dma_fence_wait_timeout(fence, false, HZ);
> - dma_fence_put(fence);
> - if (timeout < 0)
> - err = timeout;
> - else if (!timeout)
> - err = -ETIME;
> + return fence;
> exit:
> - return err;
> + return ERR_PTR(err);
> }
>
> static void write_cs_mi_lri(struct xe_bb *bb, const struct xe_oa_reg *reg_data, u32 n_regs)
> @@ -656,6 +650,7 @@ static void xe_oa_store_flex(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lrc,
> const struct flex *flex, u32 count)
> {
> + struct dma_fence *fence;
> struct xe_bb *bb;
> int err;
>
> @@ -667,7 +662,16 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr
>
> xe_oa_store_flex(stream, lrc, bb, flex, count);
>
> - err = xe_oa_submit_bb(stream, bb);
> + fence = xe_oa_submit_bb(stream, bb);
> + if (IS_ERR(fence)) {
> + err = PTR_ERR(fence);
> + goto free_bb;
> + }
> + xe_bb_free(bb, fence);
> + dma_fence_put(fence);
> +
> + return 0;
> +free_bb:
> xe_bb_free(bb, NULL);
> exit:
> return err;
> @@ -675,6 +679,7 @@ static int xe_oa_modify_ctx_image(struct xe_oa_stream *stream, struct xe_lrc *lr
>
> static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri)
> {
> + struct dma_fence *fence;
> struct xe_bb *bb;
> int err;
>
> @@ -686,7 +691,16 @@ static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *re
>
> write_cs_mi_lri(bb, reg_lri, 1);
>
> - err = xe_oa_submit_bb(stream, bb);
> + fence = xe_oa_submit_bb(stream, bb);
> + if (IS_ERR(fence)) {
> + err = PTR_ERR(fence);
> + goto free_bb;
> + }
> + xe_bb_free(bb, fence);
> + dma_fence_put(fence);
> +
> + return 0;
> +free_bb:
> xe_bb_free(bb, NULL);
> exit:
> return err;
> @@ -914,15 +928,32 @@ static int xe_oa_emit_oa_config(struct xe_oa_stream *stream, struct xe_oa_config
> {
> #define NOA_PROGRAM_ADDITIONAL_DELAY_US 500
> struct xe_oa_config_bo *oa_bo;
> - int err, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> + int err = 0, us = NOA_PROGRAM_ADDITIONAL_DELAY_US;
> + struct dma_fence *fence;
> + long timeout;
>
> + /* Emit OA configuration batch */
> oa_bo = xe_oa_alloc_config_buffer(stream, config);
> if (IS_ERR(oa_bo)) {
> err = PTR_ERR(oa_bo);
> goto exit;
> }
>
> - err = xe_oa_submit_bb(stream, oa_bo->bb);
> + fence = xe_oa_submit_bb(stream, oa_bo->bb);
> + if (IS_ERR(fence)) {
> + err = PTR_ERR(fence);
> + goto exit;
> + }
> +
> + /* Wait till all previous batches have executed */
> + timeout = dma_fence_wait_timeout(fence, false, 5 * HZ);
> + dma_fence_put(fence);
> + if (timeout < 0)
> + err = timeout;
> + else if (!timeout)
> + err = -ETIME;
> + if (err)
> + drm_dbg(&stream->oa->xe->drm, "dma_fence_wait_timeout err %d\n", err);
>
> /* Additional empirical delay needed for NOA programming after registers are written */
> usleep_range(us, 2 * us);
On Fri, 10 Jan 2025 06:02:04 +0000 Potin Lai (賴柏廷) wrote:
> > Neat!
> > Potin, please give this a test ASAP.
>
> Thanks for the new patch.
> I am currently tied up with other tasks, but I’ll make sure to test
> it as soon as possible and share the results with you.
Understood, would you be able to test it by January 13th?
Depending on how long we need to wait we may be better off
applying the patch already or waiting with committing..
The quilt patch titled
Subject: selftests/mm: virtual_address_range: avoid reading VVAR mappings
has been removed from the -mm tree. Its filename was
selftests-mm-virtual_address_range-avoid-reading-vvar-mappings.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Thomas Wei��schuh <thomas.weissschuh(a)linutronix.de>
Subject: selftests/mm: virtual_address_range: avoid reading VVAR mappings
Date: Tue, 07 Jan 2025 16:14:46 +0100
The virtual_address_range selftest reads from the start of each mapping
listed in /proc/self/maps.
However not all mappings are valid to be arbitrarily accessed. For
example the vvar data used for virtual clocks on x86 can only be accessed
if 1) the kernel configuration enables virtual clocks and 2) the
hypervisor provided the data for it, which can only determined by the VDSO
code itself.
Since commit e93d2521b27f ("x86/vdso: Split virtual clock pages into
dedicated mapping") the virtual clock data was split out into its own
mapping, triggering faulting accesses by virtual_address_range.
Skip the various vvar mappings in virtual_address_range to avoid errors.
Link: https://lkml.kernel.org/r/20250107-virtual_address_range-tests-v1-2-3834a2f…
Fixes: e93d2521b27f ("x86/vdso: Split virtual clock pages into dedicated mapping")
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Signed-off-by: Thomas Wei��schuh <thomas.weissschuh(a)linutronix.de>
Reported-by: kernel test robot <oliver.sang(a)intel.com>
Closes: https://lore.kernel.org/oe-lkp/202412271148.2656e485-lkp@intel.com
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/virtual_address_range.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/mm/virtual_address_range.c~selftests-mm-virtual_address_range-avoid-reading-vvar-mappings
+++ a/tools/testing/selftests/mm/virtual_address_range.c
@@ -116,10 +116,11 @@ static int validate_complete_va_space(vo
prev_end_addr = 0;
while (fgets(line, sizeof(line), file)) {
+ int path_offset = 0;
unsigned long hop;
- if (sscanf(line, "%lx-%lx %s[rwxp-]",
- &start_addr, &end_addr, prot) != 3)
+ if (sscanf(line, "%lx-%lx %4s %*s %*s %*s %n",
+ &start_addr, &end_addr, prot, &path_offset) != 3)
ksft_exit_fail_msg("cannot parse /proc/self/maps\n");
/* end of userspace mappings; ignore vsyscall mapping */
@@ -135,6 +136,10 @@ static int validate_complete_va_space(vo
if (prot[0] != 'r')
continue;
+ /* Only the VDSO can know if a VVAR mapping is really readable */
+ if (path_offset && !strncmp(line + path_offset, "[vvar", 5))
+ continue;
+
/*
* Confirm whether MAP_CHUNK_SIZE chunk can be found or not.
* If write succeeds, no need to check MAP_CHUNK_SIZE - 1
_
Patches currently in -mm which might be from thomas.weissschuh(a)linutronix.de are
The quilt patch titled
Subject: selftests/mm: virtual_address_range: fix error when CommitLimit < 1GiB
has been removed from the -mm tree. Its filename was
selftests-mm-virtual_address_range-fix-error-when-commitlimit-1gib.patch
This patch was dropped because an updated version will be issued
------------------------------------------------------
From: Thomas Wei��schuh <thomas.weissschuh(a)linutronix.de>
Subject: selftests/mm: virtual_address_range: fix error when CommitLimit < 1GiB
Date: Tue, 07 Jan 2025 16:14:45 +0100
If not enough physical memory is available the kernel may fail mmap(); see
__vm_enough_memory() and vm_commit_limit(). In that case the logic in
validate_complete_va_space() does not make sense and will even incorrectly
fail. Instead skip the test if no mmap() succeeded.
Link: https://lkml.kernel.org/r/20250107-virtual_address_range-tests-v1-1-3834a2f…
Fixes: 010409649885 ("selftests/mm: confirm VA exhaustion without reliance on correctness of mmap()")
Signed-off-by: Thomas Wei��schuh <thomas.weissschuh(a)linutronix.de>
Cc: <stable(a)vger.kernel.org>
Cc: Dev Jain <dev.jain(a)arm.com>
Cc: kernel test robot <oliver.sang(a)intel.com>
Cc: Shuah Khan <shuah(a)kernel.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
tools/testing/selftests/mm/virtual_address_range.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/tools/testing/selftests/mm/virtual_address_range.c~selftests-mm-virtual_address_range-fix-error-when-commitlimit-1gib
+++ a/tools/testing/selftests/mm/virtual_address_range.c
@@ -178,6 +178,12 @@ int main(int argc, char *argv[])
validate_addr(ptr[i], 0);
}
lchunks = i;
+
+ if (!lchunks) {
+ ksft_test_result_skip("Not enough memory for a single chunk\n");
+ ksft_finished();
+ }
+
hptr = (char **) calloc(NR_CHUNKS_HIGH, sizeof(char *));
if (hptr == NULL) {
ksft_test_result_skip("Memory constraint not fulfilled\n");
_
Patches currently in -mm which might be from thomas.weissschuh(a)linutronix.de are
selftests-mm-virtual_address_range-avoid-reading-vvar-mappings.patch
On Mon, Jan 13, 2025 at 6:04 AM Sasha Levin <sashal(a)kernel.org> wrote:
>
> This is a note to let you know that I've just added the patch titled
>
> mm: zswap: fix race between [de]compression and CPU hotunplug
>
> to the 6.12-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> mm-zswap-fix-race-between-de-compression-and-cpu-hot.patch
> and it can be found in the queue-6.12 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
Please drop this patch from v6.12 (and all stable trees) as it has a
bug, as I pointed out in [1] (was that the correct place to surface
this?).
Andrew's latest PR contains a revert of this patch (and alternative fix) [2].
Thanks!
[1]https://lore.kernel.org/stable/CAJD7tkaiUA6J1nb0=1ELpUD0OgjoD+tft-iPqbPdy…
[2]https://lore.kernel.org/lkml/20250113000543.862792e948c2646032a477e0@linu…
>
>
>
> commit c56e79d453ef5b5fc6ded252bc6ba461f12946ba
> Author: Yosry Ahmed <yosryahmed(a)google.com>
> Date: Thu Dec 19 21:24:37 2024 +0000
>
> mm: zswap: fix race between [de]compression and CPU hotunplug
>
> [ Upstream commit eaebeb93922ca6ab0dd92027b73d0112701706ef ]
>
> In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of the
> current CPU at the beginning of the operation is retrieved and used
> throughout. However, since neither preemption nor migration are disabled,
> it is possible that the operation continues on a different CPU.
>
> If the original CPU is hotunplugged while the acomp_ctx is still in use,
> we run into a UAF bug as the resources attached to the acomp_ctx are freed
> during hotunplug in zswap_cpu_comp_dead().
>
> The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move to use
> crypto_acomp API for hardware acceleration") when the switch to the
> crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was
> retrieved using get_cpu_ptr() which disables preemption and makes sure the
> CPU cannot go away from under us. Preemption cannot be disabled with the
> crypto_acomp API as a sleepable context is needed.
>
> Commit 8ba2f844f050 ("mm/zswap: change per-cpu mutex and buffer to
> per-acomp_ctx") increased the UAF surface area by making the per-CPU
> buffers dynamic, adding yet another resource that can be freed from under
> zswap compression/decompression by CPU hotunplug.
>
> There are a few ways to fix this:
> (a) Add a refcount for acomp_ctx.
> (b) Disable migration while using the per-CPU acomp_ctx.
> (c) Disable CPU hotunplug while using the per-CPU acomp_ctx by holding
> the CPUs read lock.
>
> Implement (c) since it's simpler than (a), and (b) involves using
> migrate_disable() which is apparently undesired (see huge comment in
> include/linux/preempt.h).
>
> Link: https://lkml.kernel.org/r/20241219212437.2714151-1-yosryahmed@google.com
> Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for hardware acceleration")
> Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
> Reported-by: Johannes Weiner <hannes(a)cmpxchg.org>
> Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/
> Reported-by: Sam Sun <samsun1006219(a)gmail.com>
> Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4OcuruL4tP…
> Reviewed-by: Chengming Zhou <chengming.zhou(a)linux.dev>
> Acked-by: Barry Song <baohua(a)kernel.org>
> Reviewed-by: Nhat Pham <nphamcs(a)gmail.com>
> Cc: Vitaly Wool <vitalywool(a)gmail.com>
> Cc: <stable(a)vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
> Signed-off-by: Sasha Levin <sashal(a)kernel.org>
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index 0030ce8fecfc..c86d4bcbb447 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -875,6 +875,18 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
> return 0;
> }
>
> +/* Prevent CPU hotplug from freeing up the per-CPU acomp_ctx resources */
> +static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct crypto_acomp_ctx __percpu *acomp_ctx)
> +{
> + cpus_read_lock();
> + return raw_cpu_ptr(acomp_ctx);
> +}
> +
> +static void acomp_ctx_put_cpu(void)
> +{
> + cpus_read_unlock();
> +}
> +
> static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
> {
> struct crypto_acomp_ctx *acomp_ctx;
> @@ -887,8 +899,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
> gfp_t gfp;
> u8 *dst;
>
> - acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
> -
> + acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
> mutex_lock(&acomp_ctx->mutex);
>
> dst = acomp_ctx->buffer;
> @@ -944,6 +955,7 @@ static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
> zswap_reject_alloc_fail++;
>
> mutex_unlock(&acomp_ctx->mutex);
> + acomp_ctx_put_cpu();
> return comp_ret == 0 && alloc_ret == 0;
> }
>
> @@ -954,7 +966,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
> struct crypto_acomp_ctx *acomp_ctx;
> u8 *src;
>
> - acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
> + acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
> mutex_lock(&acomp_ctx->mutex);
>
> src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
> @@ -984,6 +996,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
>
> if (src != acomp_ctx->buffer)
> zpool_unmap_handle(zpool, entry->handle);
> + acomp_ctx_put_cpu();
> }
>
> /*********************************
This partially reverts commit fe4f5d4b6616 ("drm/xe: Clean up VM / exec
queue file lock usage."). While it's desired to have the mutex to
protect only the reference to the exec queue, getting and dropping each
mutex and then later getting the GPU timestamp, doesn't produce a
correct result: it introduces multiple opportunities for the task to be
scheduled out and thus wrecking havoc the deltas reported to userspace.
Also, to better correlate the timestamp from the exec queues with the
GPU, disable preemption so they can be updated without allowing the task
to be scheduled out. We leave interrupts enabled as that shouldn't be
enough disturbance for the deltas to matter to userspace.
Test scenario:
* IGT'S `xe_drm_fdinfo --r utilization-single-full-load`
* Platform: LNL, where CI occasionally reports failures
* `stress -c $(nproc)` running in parallel to disturb the
system
This brings a first failure from "after ~150 executions" to "never
occurs after 1000 attempts".
v2: Also keep xe_hw_engine_read_timestamp() call inside the
preemption-disabled section (Umesh)
Cc: stable(a)vger.kernel.org # v6.11+
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa(a)intel.com>
Cc: Matthew Brost <matthew.brost(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3512
Signed-off-by: Lucas De Marchi <lucas.demarchi(a)intel.com>
---
drivers/gpu/drm/xe/xe_drm_client.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index 7d55ad846bac5..2220a09bf9751 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -337,20 +337,18 @@ static void show_run_ticks(struct drm_printer *p, struct drm_file *file)
return;
}
+ /* Let both the GPU timestamp and exec queue be updated together */
+ preempt_disable();
+ gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
+
/* Accumulate all the exec queues from this client */
mutex_lock(&xef->exec_queue.lock);
- xa_for_each(&xef->exec_queue.xa, i, q) {
- xe_exec_queue_get(q);
- mutex_unlock(&xef->exec_queue.lock);
+ xa_for_each(&xef->exec_queue.xa, i, q)
xe_exec_queue_update_run_ticks(q);
- mutex_lock(&xef->exec_queue.lock);
- xe_exec_queue_put(q);
- }
mutex_unlock(&xef->exec_queue.lock);
-
- gpu_timestamp = xe_hw_engine_read_timestamp(hwe);
+ preempt_enable();
xe_force_wake_put(gt_to_fw(hwe->gt), fw_ref);
xe_pm_runtime_put(xe);
--
2.47.0
The patch below does not apply to the 6.12-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.12.y
git checkout FETCH_HEAD
git cherry-pick -x d1cacd74776895f6435941f86a1130e58f6dd226
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011253-citable-monstrous-3ae9@gregkh' --subject-prefix 'PATCH 6.12.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d1cacd74776895f6435941f86a1130e58f6dd226 Mon Sep 17 00:00:00 2001
From: Jakub Kicinski <kuba(a)kernel.org>
Date: Mon, 6 Jan 2025 10:01:36 -0800
Subject: [PATCH] netdev: prevent accessing NAPI instances from another
namespace
The NAPI IDs were not fully exposed to user space prior to the netlink
API, so they were never namespaced. The netlink API must ensure that
at the very least NAPI instance belongs to the same netns as the owner
of the genl sock.
napi_by_id() can become static now, but it needs to move because of
dev_get_by_napi_id().
Cc: stable(a)vger.kernel.org
Fixes: 1287c1ae0fc2 ("netdev-genl: Support setting per-NAPI config values")
Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi")
Reviewed-by: Sridhar Samudrala <sridhar.samudrala(a)intel.com>
Reviewed-by: Joe Damato <jdamato(a)fastly.com>
Link: https://patch.msgid.link/20250106180137.1861472-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/net/core/dev.c b/net/core/dev.c
index faa23042df38..a9f62f5aeb84 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -753,6 +753,36 @@ int dev_fill_forward_path(const struct net_device *dev, const u8 *daddr,
}
EXPORT_SYMBOL_GPL(dev_fill_forward_path);
+/* must be called under rcu_read_lock(), as we dont take a reference */
+static struct napi_struct *napi_by_id(unsigned int napi_id)
+{
+ unsigned int hash = napi_id % HASH_SIZE(napi_hash);
+ struct napi_struct *napi;
+
+ hlist_for_each_entry_rcu(napi, &napi_hash[hash], napi_hash_node)
+ if (napi->napi_id == napi_id)
+ return napi;
+
+ return NULL;
+}
+
+/* must be called under rcu_read_lock(), as we dont take a reference */
+struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id)
+{
+ struct napi_struct *napi;
+
+ napi = napi_by_id(napi_id);
+ if (!napi)
+ return NULL;
+
+ if (WARN_ON_ONCE(!napi->dev))
+ return NULL;
+ if (!net_eq(net, dev_net(napi->dev)))
+ return NULL;
+
+ return napi;
+}
+
/**
* __dev_get_by_name - find a device by its name
* @net: the applicable net namespace
@@ -6293,19 +6323,6 @@ bool napi_complete_done(struct napi_struct *n, int work_done)
}
EXPORT_SYMBOL(napi_complete_done);
-/* must be called under rcu_read_lock(), as we dont take a reference */
-struct napi_struct *napi_by_id(unsigned int napi_id)
-{
- unsigned int hash = napi_id % HASH_SIZE(napi_hash);
- struct napi_struct *napi;
-
- hlist_for_each_entry_rcu(napi, &napi_hash[hash], napi_hash_node)
- if (napi->napi_id == napi_id)
- return napi;
-
- return NULL;
-}
-
static void skb_defer_free_flush(struct softnet_data *sd)
{
struct sk_buff *skb, *next;
diff --git a/net/core/dev.h b/net/core/dev.h
index d043dee25a68..deb5eae5749f 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -22,6 +22,8 @@ struct sd_flow_limit {
extern int netdev_flow_limit_table_len;
+struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id);
+
#ifdef CONFIG_PROC_FS
int __init dev_proc_init(void);
#else
@@ -269,7 +271,6 @@ void xdp_do_check_flushed(struct napi_struct *napi);
static inline void xdp_do_check_flushed(struct napi_struct *napi) { }
#endif
-struct napi_struct *napi_by_id(unsigned int napi_id);
void kick_defer_list_purge(struct softnet_data *sd, unsigned int cpu);
#define XMIT_RECURSION_LIMIT 8
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 125b660004d3..a3bdaf075b6b 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -167,8 +167,6 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
void *hdr;
pid_t pid;
- if (WARN_ON_ONCE(!napi->dev))
- return -EINVAL;
if (!(napi->dev->flags & IFF_UP))
return 0;
@@ -234,7 +232,7 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
rtnl_lock();
rcu_read_lock();
- napi = napi_by_id(napi_id);
+ napi = netdev_napi_by_id(genl_info_net(info), napi_id);
if (napi) {
err = netdev_nl_napi_fill_one(rsp, napi, info);
} else {
@@ -355,7 +353,7 @@ int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info)
rtnl_lock();
rcu_read_lock();
- napi = napi_by_id(napi_id);
+ napi = netdev_napi_by_id(genl_info_net(info), napi_id);
if (napi) {
err = netdev_nl_napi_set_config(napi, info);
} else {
The patch titled
Subject: mm: zswap: move allocations during CPU init outside the lock
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-zswap-move-allocations-during-cpu-init-outside-the-lock.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Yosry Ahmed <yosryahmed(a)google.com>
Subject: mm: zswap: move allocations during CPU init outside the lock
Date: Mon, 13 Jan 2025 21:44:58 +0000
In zswap_cpu_comp_prepare(), allocations are made and assigned to various
members of acomp_ctx under acomp_ctx->mutex. However, allocations may
recurse into zswap through reclaim, trying to acquire the same mutex and
deadlocking.
Move the allocations before the mutex critical section. Only the
initialization of acomp_ctx needs to be done with the mutex held.
Link: https://lkml.kernel.org/r/20250113214458.2123410-1-yosryahmed@google.com
Fixes: 12dcb0ef5406 ("mm: zswap: properly synchronize freeing resources during CPU hotunplug")
Signed-off-by: Yosry Ahmed <yosryahmed(a)google.com>
Cc: Chengming Zhou <chengming.zhou(a)linux.dev>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Nhat Pham <nphamcs(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/zswap.c | 42 ++++++++++++++++++++++++------------------
1 file changed, 24 insertions(+), 18 deletions(-)
--- a/mm/zswap.c~mm-zswap-move-allocations-during-cpu-init-outside-the-lock
+++ a/mm/zswap.c
@@ -820,15 +820,15 @@ static int zswap_cpu_comp_prepare(unsign
{
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
- struct crypto_acomp *acomp;
- struct acomp_req *req;
+ struct crypto_acomp *acomp = NULL;
+ struct acomp_req *req = NULL;
+ u8 *buffer = NULL;
int ret;
- mutex_lock(&acomp_ctx->mutex);
- acomp_ctx->buffer = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
- if (!acomp_ctx->buffer) {
+ buffer = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
+ if (!buffer) {
ret = -ENOMEM;
- goto buffer_fail;
+ goto fail;
}
acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
@@ -836,21 +836,25 @@ static int zswap_cpu_comp_prepare(unsign
pr_err("could not alloc crypto acomp %s : %ld\n",
pool->tfm_name, PTR_ERR(acomp));
ret = PTR_ERR(acomp);
- goto acomp_fail;
+ goto fail;
}
- acomp_ctx->acomp = acomp;
- acomp_ctx->is_sleepable = acomp_is_async(acomp);
- req = acomp_request_alloc(acomp_ctx->acomp);
+ req = acomp_request_alloc(acomp);
if (!req) {
pr_err("could not alloc crypto acomp_request %s\n",
pool->tfm_name);
ret = -ENOMEM;
- goto req_fail;
+ goto fail;
}
- acomp_ctx->req = req;
+ /*
+ * Only hold the mutex after completing allocations, otherwise we may
+ * recurse into zswap through reclaim and attempt to hold the mutex
+ * again resulting in a deadlock.
+ */
+ mutex_lock(&acomp_ctx->mutex);
crypto_init_wait(&acomp_ctx->wait);
+
/*
* if the backend of acomp is async zip, crypto_req_done() will wakeup
* crypto_wait_req(); if the backend of acomp is scomp, the callback
@@ -859,15 +863,17 @@ static int zswap_cpu_comp_prepare(unsign
acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
crypto_req_done, &acomp_ctx->wait);
+ acomp_ctx->buffer = buffer;
+ acomp_ctx->acomp = acomp;
+ acomp_ctx->is_sleepable = acomp_is_async(acomp);
+ acomp_ctx->req = req;
mutex_unlock(&acomp_ctx->mutex);
return 0;
-req_fail:
- crypto_free_acomp(acomp_ctx->acomp);
-acomp_fail:
- kfree(acomp_ctx->buffer);
-buffer_fail:
- mutex_unlock(&acomp_ctx->mutex);
+fail:
+ if (acomp)
+ crypto_free_acomp(acomp);
+ kfree(buffer);
return ret;
}
_
Patches currently in -mm which might be from yosryahmed(a)google.com are
mm-zswap-move-allocations-during-cpu-init-outside-the-lock.patch
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 65a60a590142c54a3f3be11ff162db2d5b0e1e06
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011353-untried-juniper-fe92@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 65a60a590142c54a3f3be11ff162db2d5b0e1e06 Mon Sep 17 00:00:00 2001
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Date: Wed, 13 Nov 2024 21:25:45 +0100
Subject: [PATCH] iio: imu: inv_icm42600: fix timestamps after suspend if
sensor is on
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index e43538e536f0..ef9875d3b79d 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -829,6 +829,8 @@ static int inv_icm42600_suspend(struct device *dev)
static int inv_icm42600_resume(struct device *dev)
{
struct inv_icm42600_state *st = dev_get_drvdata(dev);
+ struct inv_icm42600_sensor_state *gyro_st = iio_priv(st->indio_gyro);
+ struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
int ret;
mutex_lock(&st->lock);
@@ -849,9 +851,12 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
/* restore FIFO data streaming */
- if (st->fifo.on)
+ if (st->fifo.on) {
+ inv_sensors_timestamp_reset(&gyro_st->ts);
+ inv_sensors_timestamp_reset(&accel_st->ts);
ret = regmap_write(st->map, INV_ICM42600_REG_FIFO_CONFIG,
INV_ICM42600_FIFO_CONFIG_STREAM);
+ }
out_unlock:
mutex_unlock(&st->lock);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 65a60a590142c54a3f3be11ff162db2d5b0e1e06
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011306-sedan-synthesis-c459@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 65a60a590142c54a3f3be11ff162db2d5b0e1e06 Mon Sep 17 00:00:00 2001
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Date: Wed, 13 Nov 2024 21:25:45 +0100
Subject: [PATCH] iio: imu: inv_icm42600: fix timestamps after suspend if
sensor is on
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index e43538e536f0..ef9875d3b79d 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -829,6 +829,8 @@ static int inv_icm42600_suspend(struct device *dev)
static int inv_icm42600_resume(struct device *dev)
{
struct inv_icm42600_state *st = dev_get_drvdata(dev);
+ struct inv_icm42600_sensor_state *gyro_st = iio_priv(st->indio_gyro);
+ struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
int ret;
mutex_lock(&st->lock);
@@ -849,9 +851,12 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
/* restore FIFO data streaming */
- if (st->fifo.on)
+ if (st->fifo.on) {
+ inv_sensors_timestamp_reset(&gyro_st->ts);
+ inv_sensors_timestamp_reset(&accel_st->ts);
ret = regmap_write(st->map, INV_ICM42600_REG_FIFO_CONFIG,
INV_ICM42600_FIFO_CONFIG_STREAM);
+ }
out_unlock:
mutex_unlock(&st->lock);
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 13134cc949148e1dfa540a0fe5dc73569bc62155
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011231-bakery-sterling-1f23@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 13134cc949148e1dfa540a0fe5dc73569bc62155 Mon Sep 17 00:00:00 2001
From: Nam Cao <namcao(a)linutronix.de>
Date: Tue, 19 Nov 2024 12:10:56 +0100
Subject: [PATCH] riscv: kprobes: Fix incorrect address calculation
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are
multiplied by four. This is clearly undesirable for this case.
Cast it to (void *) first before any calculation.
Below is a sample before/after. The dumped memory is two kprobe slots, the
first slot has
- c.addiw a0, 0x1c (0x7125)
- ebreak (0x00100073)
and the second slot has:
- c.addiw a0, -4 (0x7135)
- ebreak (0x00100073)
Before this patch:
(gdb) x/16xh 0xff20000000135000
0xff20000000135000: 0x7125 0x0000 0x0000 0x0000 0x7135 0x0010 0x0000 0x0000
0xff20000000135010: 0x0073 0x0010 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
After this patch:
(gdb) x/16xh 0xff20000000125000
0xff20000000125000: 0x7125 0x0073 0x0010 0x0000 0x7135 0x0073 0x0010 0x0000
0xff20000000125010: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000
Fixes: b1756750a397 ("riscv: kprobes: Use patch_text_nosync() for insn slots")
Signed-off-by: Nam Cao <namcao(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti(a)rivosinc.com>
Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer(a)rivosinc.com>
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index 380a0e8cecc0..c0738d6c6498 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -30,7 +30,7 @@ static void __kprobes arch_prepare_ss_slot(struct kprobe *p)
p->ainsn.api.restore = (unsigned long)p->addr + len;
patch_text_nosync(p->ainsn.api.insn, &p->opcode, len);
- patch_text_nosync(p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn));
+ patch_text_nosync((void *)p->ainsn.api.insn + len, &insn, GET_INSN_LENGTH(insn));
}
static void __kprobes arch_prepare_simulate(struct kprobe *p)
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 65a60a590142c54a3f3be11ff162db2d5b0e1e06
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011305-consuming-reptilian-2133@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 65a60a590142c54a3f3be11ff162db2d5b0e1e06 Mon Sep 17 00:00:00 2001
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Date: Wed, 13 Nov 2024 21:25:45 +0100
Subject: [PATCH] iio: imu: inv_icm42600: fix timestamps after suspend if
sensor is on
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index e43538e536f0..ef9875d3b79d 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -829,6 +829,8 @@ static int inv_icm42600_suspend(struct device *dev)
static int inv_icm42600_resume(struct device *dev)
{
struct inv_icm42600_state *st = dev_get_drvdata(dev);
+ struct inv_icm42600_sensor_state *gyro_st = iio_priv(st->indio_gyro);
+ struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
int ret;
mutex_lock(&st->lock);
@@ -849,9 +851,12 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
/* restore FIFO data streaming */
- if (st->fifo.on)
+ if (st->fifo.on) {
+ inv_sensors_timestamp_reset(&gyro_st->ts);
+ inv_sensors_timestamp_reset(&accel_st->ts);
ret = regmap_write(st->map, INV_ICM42600_REG_FIFO_CONFIG,
INV_ICM42600_FIFO_CONFIG_STREAM);
+ }
out_unlock:
mutex_unlock(&st->lock);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 65a60a590142c54a3f3be11ff162db2d5b0e1e06
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011306-wake-happiness-3601@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 65a60a590142c54a3f3be11ff162db2d5b0e1e06 Mon Sep 17 00:00:00 2001
From: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Date: Wed, 13 Nov 2024 21:25:45 +0100
Subject: [PATCH] iio: imu: inv_icm42600: fix timestamps after suspend if
sensor is on
Currently suspending while sensors are one will result in timestamping
continuing without gap at resume. It can work with monotonic clock but
not with other clocks. Fix that by resetting timestamping.
Fixes: ec74ae9fd37c ("iio: imu: inv_icm42600: add accurate timestamping")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol(a)tdk.com>
Link: https://patch.msgid.link/20241113-inv_icm42600-fix-timestamps-after-suspend…
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
diff --git a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
index e43538e536f0..ef9875d3b79d 100644
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
+++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c
@@ -829,6 +829,8 @@ static int inv_icm42600_suspend(struct device *dev)
static int inv_icm42600_resume(struct device *dev)
{
struct inv_icm42600_state *st = dev_get_drvdata(dev);
+ struct inv_icm42600_sensor_state *gyro_st = iio_priv(st->indio_gyro);
+ struct inv_icm42600_sensor_state *accel_st = iio_priv(st->indio_accel);
int ret;
mutex_lock(&st->lock);
@@ -849,9 +851,12 @@ static int inv_icm42600_resume(struct device *dev)
goto out_unlock;
/* restore FIFO data streaming */
- if (st->fifo.on)
+ if (st->fifo.on) {
+ inv_sensors_timestamp_reset(&gyro_st->ts);
+ inv_sensors_timestamp_reset(&accel_st->ts);
ret = regmap_write(st->map, INV_ICM42600_REG_FIFO_CONFIG,
INV_ICM42600_FIFO_CONFIG_STREAM);
+ }
out_unlock:
mutex_unlock(&st->lock);
From: Suraj Sonawane <surajsonawane0215(a)gmail.com>
commit f10593ad9bc36921f623361c9e3dd96bd52d85ee upstream.
Fix a use-after-free bug in sg_release(), detected by syzbot with KASAN:
BUG: KASAN: slab-use-after-free in lock_release+0x151/0xa30
kernel/locking/lockdep.c:5838
__mutex_unlock_slowpath+0xe2/0x750 kernel/locking/mutex.c:912
sg_release+0x1f4/0x2e0 drivers/scsi/sg.c:407
In sg_release(), the function kref_put(&sfp->f_ref, sg_remove_sfp) is
called before releasing the open_rel_lock mutex. The kref_put() call may
decrement the reference count of sfp to zero, triggering its cleanup
through sg_remove_sfp(). This cleanup includes scheduling deferred work
via sg_remove_sfp_usercontext(), which ultimately frees sfp.
After kref_put(), sg_release() continues to unlock open_rel_lock and may
reference sfp or sdp. If sfp has already been freed, this results in a
slab-use-after-free error.
Move the kref_put(&sfp->f_ref, sg_remove_sfp) call after unlocking the
open_rel_lock mutex. This ensures:
- No references to sfp or sdp occur after the reference count is
decremented.
- Cleanup functions such as sg_remove_sfp() and
sg_remove_sfp_usercontext() can safely execute without impacting the
mutex handling in sg_release().
The fix has been tested and validated by syzbot. This patch closes the
bug reported at the following syzkaller link and ensures proper
sequencing of resource cleanup and mutex operations, eliminating the
risk of use-after-free errors in sg_release().
Reported-by: syzbot+7efb5850a17ba6ce098b(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7efb5850a17ba6ce098b
Tested-by: syzbot+7efb5850a17ba6ce098b(a)syzkaller.appspotmail.com
Fixes: cc833acbee9d ("sg: O_EXCL and other lock handling")
Signed-off-by: Suraj Sonawane <surajsonawane0215(a)gmail.com>
Link: https://lore.kernel.org/r/20241120125944.88095-1-surajsonawane0215@gmail.com
Reviewed-by: Bart Van Assche <bvanassche(a)acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Alva Lan <alvalan9(a)foxmail.com>
---
drivers/scsi/sg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index 12344be14232..1946cc96c172 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -390,7 +390,6 @@ sg_release(struct inode *inode, struct file *filp)
mutex_lock(&sdp->open_rel_lock);
scsi_autopm_put_device(sdp->device);
- kref_put(&sfp->f_ref, sg_remove_sfp);
sdp->open_cnt--;
/* possibly many open()s waiting on exlude clearing, start many;
@@ -402,6 +401,7 @@ sg_release(struct inode *inode, struct file *filp)
wake_up_interruptible(&sdp->open_wait);
}
mutex_unlock(&sdp->open_rel_lock);
+ kref_put(&sfp->f_ref, sg_remove_sfp);
return 0;
}
--
2.43.0
Since the new_metric and last_hop_metric variables can reach
the MAX_METRIC(0xffffffff) value, an integer overflow may occur
when multiplying them by 10/9. It can lead to incorrect behavior.
Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.
Fixes: a8d418d9ac25 ("mac80211: mesh: only switch path when new metric is at least 10% better")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ilia Gavrilov <Ilia.Gavrilov(a)infotecs.ru>
---
net/mac80211/mesh_hwmp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index 4e9546e998b6..7d367ff1efc2 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -458,7 +458,7 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
(mpath->sn == orig_sn &&
(rcu_access_pointer(mpath->next_hop) !=
sta ?
- mult_frac(new_metric, 10, 9) :
+ mult_frac((u64)new_metric, 10, 9) :
new_metric) >= mpath->metric)) {
process = false;
fresh_info = false;
@@ -533,7 +533,7 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
if ((mpath->flags & MESH_PATH_FIXED) ||
((mpath->flags & MESH_PATH_ACTIVE) &&
((rcu_access_pointer(mpath->next_hop) != sta ?
- mult_frac(last_hop_metric, 10, 9) :
+ mult_frac((u64)last_hop_metric, 10, 9) :
last_hop_metric) > mpath->metric)))
fresh_info = false;
} else {
--
2.39.5
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x c9a40292a44e78f71258b8522655bffaf5753bdb
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011246-appealing-angler-4f22@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c9a40292a44e78f71258b8522655bffaf5753bdb Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Wed, 8 Jan 2025 10:28:05 -0700
Subject: [PATCH] io_uring/eventfd: ensure io_eventfd_signal() defers another
RCU period
io_eventfd_do_signal() is invoked from an RCU callback, but when
dropping the reference to the io_ev_fd, it calls io_eventfd_free()
directly if the refcount drops to zero. This isn't correct, as any
potential freeing of the io_ev_fd should be deferred another RCU grace
period.
Just call io_eventfd_put() rather than open-code the dec-and-test and
free, which will correctly defer it another RCU grace period.
Fixes: 21a091b970cd ("io_uring: signal registered eventfd to process deferred task work")
Reported-by: Jann Horn <jannh(a)google.com>
Cc: stable(a)vger.kernel.org
Tested-by: Li Zetao <lizetao1(a)huawei.com>
Reviewed-by: Li Zetao<lizetao1(a)huawei.com>
Reviewed-by: Prasanna Kumar T S M <ptsm(a)linux.microsoft.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/eventfd.c b/io_uring/eventfd.c
index fab936d31ba8..100d5da94cb9 100644
--- a/io_uring/eventfd.c
+++ b/io_uring/eventfd.c
@@ -33,20 +33,18 @@ static void io_eventfd_free(struct rcu_head *rcu)
kfree(ev_fd);
}
+static void io_eventfd_put(struct io_ev_fd *ev_fd)
+{
+ if (refcount_dec_and_test(&ev_fd->refs))
+ call_rcu(&ev_fd->rcu, io_eventfd_free);
+}
+
static void io_eventfd_do_signal(struct rcu_head *rcu)
{
struct io_ev_fd *ev_fd = container_of(rcu, struct io_ev_fd, rcu);
eventfd_signal_mask(ev_fd->cq_ev_fd, EPOLL_URING_WAKE);
-
- if (refcount_dec_and_test(&ev_fd->refs))
- io_eventfd_free(rcu);
-}
-
-static void io_eventfd_put(struct io_ev_fd *ev_fd)
-{
- if (refcount_dec_and_test(&ev_fd->refs))
- call_rcu(&ev_fd->rcu, io_eventfd_free);
+ io_eventfd_put(ev_fd);
}
static void io_eventfd_release(struct io_ev_fd *ev_fd, bool put_ref)
For WCN6855, board ID specific NVM needs to be downloaded once board ID
is available, but the default NVM is always downloaded currently, and
the wrong NVM causes poor RF performance which effects user experience.
Fix by downloading board ID specific NVM if board ID is available.
Cc: Bjorn Andersson <bjorande(a)quicinc.com>
Cc: Aiqun Yu (Maria) <quic_aiquny(a)quicinc.com>
Cc: Cheng Jiang <quic_chejiang(a)quicinc.com>
Cc: Johan Hovold <johan(a)kernel.org>
Cc: Jens Glathe <jens.glathe(a)oldschoolsolutions.biz>
Cc: Steev Klimaszewski <steev(a)kali.org>
Cc: Paul Menzel <pmenzel(a)molgen.mpg.de>
Fixes: 095327fede00 ("Bluetooth: hci_qca: Add support for QTI Bluetooth chip wcn6855")
Cc: stable(a)vger.kernel.org # 6.4
Reviewed-by: Johan Hovold <johan+linaro(a)kernel.org>
Tested-by: Johan Hovold <johan+linaro(a)kernel.org>
Tested-by: Steev Klimaszewski <steev(a)kali.org>
Tested-by: Jens Glathe <jens.glathe(a)oldschoolsolutions.biz>
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Thank you Paul, Jens, Steev, Johan, Luiz for code review, various
verification, comments and suggestions. these comments and suggestions
are very good, and all of them are taken by this v2 patch.
Regarding the variant 'g', sorry for that i can say nothing due to
confidential information (CCI), but fortunately, we don't need to
care about its difference against one without 'g' from BT host
perspective, qca_get_hsp_nvm_name_generic() shows how to map BT chip
to firmware.
I will help to backport it to LTS kernels ASAP once this commit
is mainlined.
---
Changes in v2:
- Correct subject and commit message
- Temporarily add nvm fallback logic to speed up backport.
— Add fix/stable tags as suggested by Luiz and Johan
- Link to v1: https://lore.kernel.org/r/20241113-x13s_wcn6855_fix-v1-1-15af0aa2549c@quici…
---
drivers/bluetooth/btqca.c | 44 +++++++++++++++++++++++++++++++++++++++++---
1 file changed, 41 insertions(+), 3 deletions(-)
diff --git a/drivers/bluetooth/btqca.c b/drivers/bluetooth/btqca.c
index dfbbac92242a..ddfe7e3c9b50 100644
--- a/drivers/bluetooth/btqca.c
+++ b/drivers/bluetooth/btqca.c
@@ -717,6 +717,29 @@ static void qca_generate_hsp_nvm_name(char *fwname, size_t max_size,
snprintf(fwname, max_size, "qca/hpnv%02x%s.%x", rom_ver, variant, bid);
}
+static void qca_get_hsp_nvm_name_generic(struct qca_fw_config *cfg,
+ struct qca_btsoc_version ver,
+ u8 rom_ver, u16 bid)
+{
+ const char *variant;
+
+ /* hsp gf chip */
+ if ((le32_to_cpu(ver.soc_id) & QCA_HSP_GF_SOC_MASK) == QCA_HSP_GF_SOC_ID)
+ variant = "g";
+ else
+ variant = "";
+
+ if (bid == 0x0)
+ snprintf(cfg->fwname, sizeof(cfg->fwname), "qca/hpnv%02x%s.bin",
+ rom_ver, variant);
+ else if (bid & 0xff00)
+ snprintf(cfg->fwname, sizeof(cfg->fwname), "qca/hpnv%02x%s.b%x",
+ rom_ver, variant, bid);
+ else
+ snprintf(cfg->fwname, sizeof(cfg->fwname), "qca/hpnv%02x%s.b%02x",
+ rom_ver, variant, bid);
+}
+
static inline void qca_get_nvm_name_generic(struct qca_fw_config *cfg,
const char *stem, u8 rom_ver, u16 bid)
{
@@ -810,8 +833,15 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
/* Give the controller some time to get ready to receive the NVM */
msleep(10);
- if (soc_type == QCA_QCA2066 || soc_type == QCA_WCN7850)
+ switch (soc_type) {
+ case QCA_QCA2066:
+ case QCA_WCN6855:
+ case QCA_WCN7850:
qca_read_fw_board_id(hdev, &boardid);
+ break;
+ default:
+ break;
+ }
/* Download NVM configuration */
config.type = TLV_TYPE_NVM;
@@ -848,8 +878,7 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
"qca/msnv%02x.bin", rom_ver);
break;
case QCA_WCN6855:
- snprintf(config.fwname, sizeof(config.fwname),
- "qca/hpnv%02x.bin", rom_ver);
+ qca_get_hsp_nvm_name_generic(&config, ver, rom_ver, boardid);
break;
case QCA_WCN7850:
qca_get_nvm_name_generic(&config, "hmt", rom_ver, boardid);
@@ -861,9 +890,18 @@ int qca_uart_setup(struct hci_dev *hdev, uint8_t baudrate,
}
}
+download_nvm:
err = qca_download_firmware(hdev, &config, soc_type, rom_ver);
if (err < 0) {
bt_dev_err(hdev, "QCA Failed to download NVM (%d)", err);
+ if (err == -ENOENT && boardid != 0 &&
+ soc_type == QCA_WCN6855) {
+ boardid = 0;
+ qca_get_hsp_nvm_name_generic(&config, ver,
+ rom_ver, boardid);
+ bt_dev_warn(hdev, "QCA fallback to default NVM");
+ goto download_nvm;
+ }
return err;
}
---
base-commit: e88b020190bf5bc3e7ce5bd8003fc39b23cc95fe
change-id: 20241113-x13s_wcn6855_fix-53c573ff7878
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
Commit c7ff693fa2094ba0a9d0a20feb4ab1658eff9c33 ("module: Split
modules_install compression and in-kernel decompression") removed the
MODULE_COMPRESS_NONE, but left it loadpin's Kconfig, and removing it
Signed-off-by: Arulpandiyan Vadivel <arulpandiyan.vadivel(a)siemens.com>
---
security/loadpin/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/loadpin/Kconfig b/security/loadpin/Kconfig
index 848f8b4a60190..94348e2831db9 100644
--- a/security/loadpin/Kconfig
+++ b/security/loadpin/Kconfig
@@ -16,7 +16,7 @@ config SECURITY_LOADPIN_ENFORCE
depends on SECURITY_LOADPIN
# Module compression breaks LoadPin unless modules are decompressed in
# the kernel.
- depends on !MODULES || (MODULE_COMPRESS_NONE || MODULE_DECOMPRESS)
+ depends on !MODULES || MODULE_DECOMPRESS
help
If selected, LoadPin will enforce pinning at boot. If not
selected, it can be enabled at boot with the kernel parameter
--
2.39.5
Hi Maciej,
On 2025/1/13 1:49, Maciej Żenczykowski Wrote:
> (a) I think this looks like a bug on the sending Win10 side, rather
> than a parsing bug in Linux,
> with there being no ZLP, and no short (<512) frame, there's simply no
> way for the receiver to split at the right spot.
>
> Indeed, fixing this on the Linux/parsing side seems non-trivial...
> I guess we could try to treat the connection as simply a serial
> connection (ie. ignore frame boundaries), but then we might have
> issues with other senders...
>
> I guess the most likely 'correct' hack/fix would be to hold on to the
> extra 'N*512' bytes (it doesn't even have to be 1, though likely the N
> is odd), if it starts with a NTH header...
Make sence, it seems we only need to save the rest data beside
dwBlockLength for next unwrap if a hack is acceptable, otherwise I may
need to check if a custom host driver for Windows10 user feasible.
I didn't look carefully into the 1byte and padding stuff with Windows11
host yet, I will take a look then.
> (b) I notice the '512' not '1024', I think this implies a USB2
> connection instead of USB3
> -- could you try replicating this with a USB3 capable data cable (and
> USB3 ports), this should result in 1024 block size instead of 512.
>
> I'm wondering if the win10 stack is avoiding generating N*1024, but
> then hitting N*512 with odd N...
Yes, I am using USB2.0 connection to better capture the crime scene.
Normally the OUT transfer on USB3.0 SuperSpeed connection comes with a bunch
of 1024B Data Pakcet along with a Short Packet less than 1024B in the end from
the Lecroy trace.
It's also reproducible on USB3.0 SuperSpeed connection using dwc3 controller,
but it will cost more time and make it difficult to capture the online data
(limited tracer HW buffer), I can try using software tracing or custom logs
later:
[ 5] 26.00-27.00 sec 183 MBytes 1.54 Gbits/sec
[ 5] 27.00-28.00 sec 182 MBytes 1.53 Gbits/sec
[ 206.123935] configfs.gadget.2: Wrong NDP SIGN
[ 206.129785] configfs.gadget.2: Wrong NTH SIGN, skblen 12208
[ 206.136802] HEAD:0000000004f66a88: 80 06 bc f9 c0 a8 24 66 c0 a8 24 65 f7 24 14 51 aa 1a 30 d5 01 f8 01 26 50 10 20 14 27 3d 00 00
[ 5] 28.00-29.00 sec 128 MBytes 1.07 Gbits/sec
[ 5] 29.00-30.00 sec 191 MBytes 1.61 Gbits/sec
>
> Presumably '512' would be '64' with USB1.0/1.1, but I guess finding a
> USB1.x port/host to test against is likely to be near impossible...
>
> I'll try to see if I can find the source of the bug in the Win
> driver's sources (though based on it being Win10 only, may need to
> search history)
> It's great if you can analyze from the host driver.
I didn't know if the NCM driver open-sourced on github by M$ is the correspond
version. They said that only Win 11 officially support NCM in the issue on github
yet they do have a built-in driver in Win10 and 2004 tag there in the repo.
Once device_register() failed, we should call put_device() to
decrement reference count for cleanup. Or it could cause memory leak.
device_register() includes device_add(). As comment of device_add()
says, 'if device_add() succeeds, you should call device_del() when you
want to get rid of it. If device_add() has not succeeded, use only
put_device() to drop the reference count'.
Found by code review.
Cc: stable(a)vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ma Ke <make24(a)iscas.ac.cn>
---
Changes in v4:
- deleted the redundant initialization;
Changes in v3:
- modified the patch as suggestions;
Changes in v2:
- modified the patch as suggestions.
---
arch/arm/common/locomo.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/arch/arm/common/locomo.c b/arch/arm/common/locomo.c
index cb6ef449b987..45106066a17f 100644
--- a/arch/arm/common/locomo.c
+++ b/arch/arm/common/locomo.c
@@ -223,10 +223,8 @@ locomo_init_one_child(struct locomo *lchip, struct locomo_dev_info *info)
int ret;
dev = kzalloc(sizeof(struct locomo_dev), GFP_KERNEL);
- if (!dev) {
- ret = -ENOMEM;
- goto out;
- }
+ if (!dev)
+ return -ENOMEM;
/*
* If the parent device has a DMA mask associated with it,
@@ -254,10 +252,9 @@ locomo_init_one_child(struct locomo *lchip, struct locomo_dev_info *info)
NO_IRQ : lchip->irq_base + info->irq[0];
ret = device_register(&dev->dev);
- if (ret) {
- out:
- kfree(dev);
- }
+ if (ret)
+ put_device(&dev->dev);
+
return ret;
}
--
2.25.1
Since commit 04f2cbe35699 ("hugetlb: guarantee that COW faults for a
process that called mmap(MAP_PRIVATE) on hugetlbfs will succeed"),
avoid_reserve was introduced for a special case of CoW on hugetlb private
mappings, and only if the owner VMA is trying to allocate yet another
hugetlb folio that is not reserved within the private vma reserved map.
Later on, in commit d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle areas
hole punched by fallocate"), alloc_huge_page() enforced to not consume any
global reservation as long as avoid_reserve=true. This operation doesn't
look correct, because even if it will enforce the allocation to not use
global reservation at all, it will still try to take one reservation from
the spool (if the subpool existed). Then since the spool reserved pages
take from global reservation, it'll also take one reservation globally.
Logically it can cause global reservation to go wrong.
I wrote a reproducer below, trigger this special path, and every run of
such program will cause global reservation count to increment by one, until
it hits the number of free pages:
#define _GNU_SOURCE /* See feature_test_macros(7) */
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/mman.h>
#define MSIZE (2UL << 20)
int main(int argc, char *argv[])
{
const char *path;
int *buf;
int fd, ret;
pid_t child;
if (argc < 2) {
printf("usage: %s <hugetlb_file>\n", argv[0]);
return -1;
}
path = argv[1];
fd = open(path, O_RDWR | O_CREAT, 0666);
if (fd < 0) {
perror("open failed");
return -1;
}
ret = fallocate(fd, 0, 0, MSIZE);
if (ret != 0) {
perror("fallocate");
return -1;
}
buf = mmap(NULL, MSIZE, PROT_READ|PROT_WRITE,
MAP_PRIVATE, fd, 0);
if (buf == MAP_FAILED) {
perror("mmap() failed");
return -1;
}
/* Allocate a page */
*buf = 1;
child = fork();
if (child == 0) {
/* child doesn't need to do anything */
exit(0);
}
/* Trigger CoW from owner */
*buf = 2;
munmap(buf, MSIZE);
close(fd);
unlink(path);
return 0;
}
It can only reproduce with a sub-mount when there're reserved pages on the
spool, like:
# sysctl vm.nr_hugepages=128
# mkdir ./hugetlb-pool
# mount -t hugetlbfs -o min_size=8M,pagesize=2M none ./hugetlb-pool
Then run the reproducer on the mountpoint:
# ./reproducer ./hugetlb-pool/test
Fix it by taking the reservation from spool if available. In general,
avoid_reserve is IMHO more about "avoid vma resv map", not spool's.
I copied stable, however I have no intention for backporting if it's not a
clean cherry-pick, because private hugetlb mapping, and then fork() on top
is too rare to hit.
Cc: linux-stable <stable(a)vger.kernel.org>
Fixes: d85f69b0b533 ("mm/hugetlb: alloc_huge_page handle areas hole punched by fallocate")
Reviewed-by: Ackerley Tng <ackerleytng(a)google.com>
Tested-by: Ackerley Tng <ackerleytng(a)google.com>
Signed-off-by: Peter Xu <peterx(a)redhat.com>
---
mm/hugetlb.c | 22 +++-------------------
1 file changed, 3 insertions(+), 19 deletions(-)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 354eec6f7e84..2bf971f77553 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1394,8 +1394,7 @@ static unsigned long available_huge_pages(struct hstate *h)
static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
struct vm_area_struct *vma,
- unsigned long address, int avoid_reserve,
- long chg)
+ unsigned long address, long chg)
{
struct folio *folio = NULL;
struct mempolicy *mpol;
@@ -1411,10 +1410,6 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
if (!vma_has_reserves(vma, chg) && !available_huge_pages(h))
goto err;
- /* If reserves cannot be used, ensure enough pages are in the pool */
- if (avoid_reserve && !available_huge_pages(h))
- goto err;
-
gfp_mask = htlb_alloc_mask(h);
nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
@@ -1430,7 +1425,7 @@ static struct folio *dequeue_hugetlb_folio_vma(struct hstate *h,
folio = dequeue_hugetlb_folio_nodemask(h, gfp_mask,
nid, nodemask);
- if (folio && !avoid_reserve && vma_has_reserves(vma, chg)) {
+ if (folio && vma_has_reserves(vma, chg)) {
folio_set_hugetlb_restore_reserve(folio);
h->resv_huge_pages--;
}
@@ -3047,17 +3042,6 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
gbl_chg = hugepage_subpool_get_pages(spool, 1);
if (gbl_chg < 0)
goto out_end_reservation;
-
- /*
- * Even though there was no reservation in the region/reserve
- * map, there could be reservations associated with the
- * subpool that can be used. This would be indicated if the
- * return value of hugepage_subpool_get_pages() is zero.
- * However, if avoid_reserve is specified we still avoid even
- * the subpool reservations.
- */
- if (avoid_reserve)
- gbl_chg = 1;
}
/* If this allocation is not consuming a reservation, charge it now.
@@ -3080,7 +3064,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
* from the global free pool (global change). gbl_chg == 0 indicates
* a reservation exists for the allocation.
*/
- folio = dequeue_hugetlb_folio_vma(h, vma, addr, avoid_reserve, gbl_chg);
+ folio = dequeue_hugetlb_folio_vma(h, vma, addr, gbl_chg);
if (!folio) {
spin_unlock_irq(&hugetlb_lock);
folio = alloc_buddy_hugetlb_folio_with_mpol(h, vma, addr);
--
2.47.0
Syzkaller has reported a general protection fault at function
ir_raw_event_store_with_filter(). This crash is caused by a NULL pointer
dereference of dev->raw pointer, even though it is checked for NULL in
the same function, which means there is a race condition. It occurs due
to the incorrect order of actions in the streamzap_disconnect() function:
rc_unregister_device() is called before usb_kill_urb(). The dev->raw
pointer is freed and set to NULL in rc_unregister_device(), and only
after that usb_kill_urb() waits for in-progress requests to finish.
If rc_unregister_device() is called while streamzap_callback() handler is
not finished, this can lead to accessing freed resources. Thus
rc_unregister_device() should be called after usb_kill_urb().
Found by Linux Verification Center (linuxtesting.org) with Syzkaller.
Fixes: 8e9e60640067 ("V4L/DVB: staging/lirc: port lirc_streamzap to ir-core")
Cc: stable(a)vger.kernel.org
Reported-by: syzbot+34008406ee9a31b13c73(a)syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=34008406ee9a31b13c73
Signed-off-by: Murad Masimov <m.masimov(a)mt-integration.ru>
---
drivers/media/rc/streamzap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/rc/streamzap.c b/drivers/media/rc/streamzap.c
index 9b209e687f25..2ce62fe5d60f 100644
--- a/drivers/media/rc/streamzap.c
+++ b/drivers/media/rc/streamzap.c
@@ -385,8 +385,8 @@ static void streamzap_disconnect(struct usb_interface *interface)
if (!sz)
return;
- rc_unregister_device(sz->rdev);
usb_kill_urb(sz->urb_in);
+ rc_unregister_device(sz->rdev);
usb_free_urb(sz->urb_in);
usb_free_coherent(usbdev, sz->buf_in_len, sz->buf_in, sz->dma_in);
--
2.39.2
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x ed2761958ad77e54791802b07095786150eab844
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011341-kisser-strained-c171@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ed2761958ad77e54791802b07095786150eab844 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen(a)linux.intel.com>
Date: Tue, 10 Dec 2024 19:01:20 +0200
Subject: [PATCH] tty: serial: 8250: Fix another runtime PM usage counter
underflow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console
handover") fixed one runtime PM usage counter balance problem that
occurs because .dev is not set during univ8250 setup preventing call to
pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the
runtime PM usage counter underflow as .dev is already set at that time.
Call pm_runtime_get_sync() to balance the RPM usage counter also in
serial8250_register_8250_port() before trying to add the port.
Reported-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console")
Cc: stable <stable(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 5f9f06911795..68baf75bdadc 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -812,6 +812,9 @@ int serial8250_register_8250_port(const struct uart_8250_port *up)
uart->dl_write = up->dl_write;
if (uart->port.type != PORT_8250_CIR) {
+ if (uart_console_registered(&uart->port))
+ pm_runtime_get_sync(uart->port.dev);
+
if (serial8250_isa_config != NULL)
serial8250_isa_config(0, &uart->port,
&uart->capabilities);
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x ed2761958ad77e54791802b07095786150eab844
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011340-empirical-actress-7e43@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ed2761958ad77e54791802b07095786150eab844 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen(a)linux.intel.com>
Date: Tue, 10 Dec 2024 19:01:20 +0200
Subject: [PATCH] tty: serial: 8250: Fix another runtime PM usage counter
underflow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console
handover") fixed one runtime PM usage counter balance problem that
occurs because .dev is not set during univ8250 setup preventing call to
pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the
runtime PM usage counter underflow as .dev is already set at that time.
Call pm_runtime_get_sync() to balance the RPM usage counter also in
serial8250_register_8250_port() before trying to add the port.
Reported-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console")
Cc: stable <stable(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 5f9f06911795..68baf75bdadc 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -812,6 +812,9 @@ int serial8250_register_8250_port(const struct uart_8250_port *up)
uart->dl_write = up->dl_write;
if (uart->port.type != PORT_8250_CIR) {
+ if (uart_console_registered(&uart->port))
+ pm_runtime_get_sync(uart->port.dev);
+
if (serial8250_isa_config != NULL)
serial8250_isa_config(0, &uart->port,
&uart->capabilities);
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x ed2761958ad77e54791802b07095786150eab844
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025011340-happily-deport-200b@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ed2761958ad77e54791802b07095786150eab844 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen(a)linux.intel.com>
Date: Tue, 10 Dec 2024 19:01:20 +0200
Subject: [PATCH] tty: serial: 8250: Fix another runtime PM usage counter
underflow
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The commit f9b11229b79c ("serial: 8250: Fix PM usage_count for console
handover") fixed one runtime PM usage counter balance problem that
occurs because .dev is not set during univ8250 setup preventing call to
pm_runtime_get_sync(). Later, univ8250_console_exit() will trigger the
runtime PM usage counter underflow as .dev is already set at that time.
Call pm_runtime_get_sync() to balance the RPM usage counter also in
serial8250_register_8250_port() before trying to add the port.
Reported-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Fixes: bedb404e91bb ("serial: 8250_port: Don't use power management for kernel console")
Cc: stable <stable(a)kernel.org>
Tested-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen(a)linux.intel.com>
Link: https://lore.kernel.org/r/20241210170120.2231-1-ilpo.jarvinen@linux.intel.c…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/tty/serial/8250/8250_core.c b/drivers/tty/serial/8250/8250_core.c
index 5f9f06911795..68baf75bdadc 100644
--- a/drivers/tty/serial/8250/8250_core.c
+++ b/drivers/tty/serial/8250/8250_core.c
@@ -812,6 +812,9 @@ int serial8250_register_8250_port(const struct uart_8250_port *up)
uart->dl_write = up->dl_write;
if (uart->port.type != PORT_8250_CIR) {
+ if (uart_console_registered(&uart->port))
+ pm_runtime_get_sync(uart->port.dev);
+
if (serial8250_isa_config != NULL)
serial8250_isa_config(0, &uart->port,
&uart->capabilities);