From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 55fb3744552de..99f05057e4c90 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -281,6 +281,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 2a14d69b171f3..b96af42a1b041 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -271,6 +271,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: David Howells <dhowells(a)redhat.com>
[ Upstream commit 880a88f318cf1d2a0f4c0a7ff7b07e2062b434a4 ]
If an AF_RXRPC service socket is opened and bound, but calls are
preallocated, then rxrpc_alloc_incoming_call() will oops because the
rxrpc_backlog struct doesn't get allocated until the first preallocation is
made.
Fix this by returning NULL from rxrpc_alloc_incoming_call() if there is no
backlog struct. This will cause the incoming call to be aborted.
Reported-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Suggested-by: Junvyyang, Tencent Zhuque Lab <zhuque(a)tencent.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
cc: LePremierHomme <kwqcheii(a)proton.me>
cc: Marc Dionne <marc.dionne(a)auristor.com>
cc: Willy Tarreau <w(a)1wt.eu>
cc: Simon Horman <horms(a)kernel.org>
cc: linux-afs(a)lists.infradead.org
Link: https://patch.msgid.link/20250708211506.2699012-3-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Fixes a Critical Kernel Oops**: The commit addresses a NULL pointer
dereference that causes a kernel crash when `rx->backlog` is NULL. At
line 257 of the original code,
`smp_load_acquire(&b->call_backlog_head)` would dereference a NULL
pointer if no preallocation was done.
2. **Minimal and Safe Fix**: The fix is a simple defensive check:
```c
+ if (!b)
+ return NULL;
```
This is placed immediately after obtaining the backlog pointer and
before any usage. The fix has zero risk of regression - if `b` is
NULL, the code would have crashed anyway.
3. **Clear Reproducible Scenario**: The bug occurs in a specific but
realistic scenario - when an AF_RXRPC service socket is opened and
bound but no calls are preallocated (meaning
`rxrpc_service_prealloc()` was never called to allocate the backlog
structure).
4. **Follows Stable Kernel Rules**: This fix meets all criteria for
stable backporting:
- Fixes a real bug that users can hit
- Small and contained change (2 lines)
- Obviously correct with no side effects
- Already tested and merged upstream
5. **Similar to Previously Backported Fixes**: Looking at Similar Commit
#2 which was marked YES, it also fixed an oops in the rxrpc
preallocation/backlog system with minimal changes.
The commit prevents a kernel crash with a trivial NULL check, making it
an ideal candidate for stable backporting.
net/rxrpc/call_accept.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/rxrpc/call_accept.c b/net/rxrpc/call_accept.c
index 99e10eea37321..658b592e58e05 100644
--- a/net/rxrpc/call_accept.c
+++ b/net/rxrpc/call_accept.c
@@ -270,6 +270,9 @@ static struct rxrpc_call *rxrpc_alloc_incoming_call(struct rxrpc_sock *rx,
unsigned short call_tail, conn_tail, peer_tail;
unsigned short call_count, conn_count;
+ if (!b)
+ return NULL;
+
/* #calls >= #conns >= #peers must hold true. */
call_head = smp_load_acquire(&b->call_backlog_head);
call_tail = b->call_backlog_tail;
--
2.39.5
From: Xiang Mei <xmei5(a)asu.edu>
[ Upstream commit dd831ac8221e691e9e918585b1003c7071df0379 ]
To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c)
when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return
value before using it, similar to the existing approach in sch_hfsc.c.
To avoid code duplication, the following changes are made:
1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static
inline function.
2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to
include/net/pkt_sched.h so that sch_qfq can reuse it.
3. Applied qdisc_peek_len in agg_dequeue to avoid crashing.
Signed-off-by: Xiang Mei <xmei5(a)asu.edu>
Reviewed-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and related code, here is my
assessment:
**YES** - This commit should be backported to stable kernel trees.
Here's my extensive explanation:
## Reason for Backporting
1. **Real Bug Fix**: This commit fixes a NULL pointer dereference bug in
`agg_dequeue()` function in `net/sched/sch_qfq.c`. The bug occurs
when `cl->qdisc->ops->peek(cl->qdisc)` returns NULL, but the code at
line 992 (before the fix) directly uses this potentially NULL value
in `qdisc_pkt_len()` without checking:
```c
else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc)))
```
2. **Crash Prevention**: This is not a theoretical issue - it causes an
actual kernel crash (NULL pointer dereference) that can affect system
stability. This meets the stable kernel criteria of fixing "a real
bug that bothers people" including "an oops, a hang, data
corruption."
3. **Similar Patterns in Other Schedulers**: The commit shows that this
pattern of NULL checking after peek operations is already implemented
in other packet schedulers:
- `sch_hfsc.c` already has the `qdisc_peek_len()` function with NULL
checking
- `sch_drr.c` checks for NULL after peek operations (lines 385-387)
- The similar commit #1 shows DRR had a similar issue fixed
4. **Minimal and Contained Fix**: The fix is:
- Small in size (well under 100 lines)
- Obviously correct - it adds proper NULL checking
- Moves existing code to be reusable
- Makes the code more consistent across schedulers
5. **Precedent from Similar Commits**: Looking at the historical
commits:
- Similar commit #2 (sch_codel NULL check) was backported (Status:
YES)
- Similar commit #3 (multiple schedulers NULL handling) was
backported (Status: YES)
- Both dealt with NULL pointer handling in packet scheduler dequeue
paths
6. **Code Consolidation**: The fix properly consolidates the NULL
checking logic:
- Converts `qdisc_warn_nonwc` from a regular function to static
inline (reducing overhead)
- Moves `qdisc_peek_len` from sch_hfsc.c to the common header so it
can be reused
- Uses the same pattern across multiple schedulers for consistency
7. **Tested Pattern**: The `qdisc_peek_len()` function being moved has
been in use in sch_hfsc.c, proving it's a tested and working
solution.
8. **Security Consideration**: While not explicitly a security
vulnerability, NULL pointer dereferences can potentially be exploited
for denial of service attacks, making this fix important for system
stability.
The commit follows all the stable kernel rules: it fixes a real bug
(NULL pointer dereference), is obviously correct (adds NULL check), is
small and contained, and improves consistency across the codebase. The
pattern of backporting similar NULL check fixes in packet schedulers (as
seen in similar commits #2 and #3) supports backporting this fix as
well.
include/net/pkt_sched.h | 25 ++++++++++++++++++++++++-
net/sched/sch_api.c | 10 ----------
net/sched/sch_hfsc.c | 16 ----------------
net/sched/sch_qfq.c | 2 +-
4 files changed, 25 insertions(+), 28 deletions(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index f99a513b40a92..76f9fb651faff 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -113,7 +113,6 @@ struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
struct netlink_ext_ack *extack);
void qdisc_put_rtab(struct qdisc_rate_table *tab);
void qdisc_put_stab(struct qdisc_size_table *tab);
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc);
bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
struct net_device *dev, struct netdev_queue *txq,
spinlock_t *root_lock, bool validate);
@@ -247,4 +246,28 @@ static inline bool tc_qdisc_stats_dump(struct Qdisc *sch,
return true;
}
+static inline void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
+{
+ if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
+ pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
+ txt, qdisc->ops->id, qdisc->handle >> 16);
+ qdisc->flags |= TCQ_F_WARN_NONWC;
+ }
+}
+
+static inline unsigned int qdisc_peek_len(struct Qdisc *sch)
+{
+ struct sk_buff *skb;
+ unsigned int len;
+
+ skb = sch->ops->peek(sch);
+ if (unlikely(skb == NULL)) {
+ qdisc_warn_nonwc("qdisc_peek_len", sch);
+ return 0;
+ }
+ len = qdisc_pkt_len(skb);
+
+ return len;
+}
+
#endif
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 7c5df62421bbd..6a2d258b01daa 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -593,16 +593,6 @@ void __qdisc_calculate_pkt_len(struct sk_buff *skb,
qdisc_skb_cb(skb)->pkt_len = pkt_len;
}
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
-{
- if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
- pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
- txt, qdisc->ops->id, qdisc->handle >> 16);
- qdisc->flags |= TCQ_F_WARN_NONWC;
- }
-}
-EXPORT_SYMBOL(qdisc_warn_nonwc);
-
static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
{
struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index 61b91de8065f0..302413e0aceff 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -836,22 +836,6 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
}
}
-static unsigned int
-qdisc_peek_len(struct Qdisc *sch)
-{
- struct sk_buff *skb;
- unsigned int len;
-
- skb = sch->ops->peek(sch);
- if (unlikely(skb == NULL)) {
- qdisc_warn_nonwc("qdisc_peek_len", sch);
- return 0;
- }
- len = qdisc_pkt_len(skb);
-
- return len;
-}
-
static void
hfsc_adjust_levels(struct hfsc_class *cl)
{
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 6462468bf77c7..cde6c7026ab1d 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -991,7 +991,7 @@ static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg,
if (cl->qdisc->q.qlen == 0) /* no more packets, remove from list */
list_del_init(&cl->alist);
- else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc))) {
+ else if (cl->deficit < qdisc_peek_len(cl->qdisc)) {
cl->deficit += agg->lmax;
list_move_tail(&cl->alist, &agg->active);
}
--
2.39.5
From: Xiang Mei <xmei5(a)asu.edu>
[ Upstream commit dd831ac8221e691e9e918585b1003c7071df0379 ]
To prevent a potential crash in agg_dequeue (net/sched/sch_qfq.c)
when cl->qdisc->ops->peek(cl->qdisc) returns NULL, we check the return
value before using it, similar to the existing approach in sch_hfsc.c.
To avoid code duplication, the following changes are made:
1. Changed qdisc_warn_nonwc(include/net/pkt_sched.h) into a static
inline function.
2. Moved qdisc_peek_len from net/sched/sch_hfsc.c to
include/net/pkt_sched.h so that sch_qfq can reuse it.
3. Applied qdisc_peek_len in agg_dequeue to avoid crashing.
Signed-off-by: Xiang Mei <xmei5(a)asu.edu>
Reviewed-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Link: https://patch.msgid.link/20250705212143.3982664-1-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and related code, here is my
assessment:
**YES** - This commit should be backported to stable kernel trees.
Here's my extensive explanation:
## Reason for Backporting
1. **Real Bug Fix**: This commit fixes a NULL pointer dereference bug in
`agg_dequeue()` function in `net/sched/sch_qfq.c`. The bug occurs
when `cl->qdisc->ops->peek(cl->qdisc)` returns NULL, but the code at
line 992 (before the fix) directly uses this potentially NULL value
in `qdisc_pkt_len()` without checking:
```c
else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc)))
```
2. **Crash Prevention**: This is not a theoretical issue - it causes an
actual kernel crash (NULL pointer dereference) that can affect system
stability. This meets the stable kernel criteria of fixing "a real
bug that bothers people" including "an oops, a hang, data
corruption."
3. **Similar Patterns in Other Schedulers**: The commit shows that this
pattern of NULL checking after peek operations is already implemented
in other packet schedulers:
- `sch_hfsc.c` already has the `qdisc_peek_len()` function with NULL
checking
- `sch_drr.c` checks for NULL after peek operations (lines 385-387)
- The similar commit #1 shows DRR had a similar issue fixed
4. **Minimal and Contained Fix**: The fix is:
- Small in size (well under 100 lines)
- Obviously correct - it adds proper NULL checking
- Moves existing code to be reusable
- Makes the code more consistent across schedulers
5. **Precedent from Similar Commits**: Looking at the historical
commits:
- Similar commit #2 (sch_codel NULL check) was backported (Status:
YES)
- Similar commit #3 (multiple schedulers NULL handling) was
backported (Status: YES)
- Both dealt with NULL pointer handling in packet scheduler dequeue
paths
6. **Code Consolidation**: The fix properly consolidates the NULL
checking logic:
- Converts `qdisc_warn_nonwc` from a regular function to static
inline (reducing overhead)
- Moves `qdisc_peek_len` from sch_hfsc.c to the common header so it
can be reused
- Uses the same pattern across multiple schedulers for consistency
7. **Tested Pattern**: The `qdisc_peek_len()` function being moved has
been in use in sch_hfsc.c, proving it's a tested and working
solution.
8. **Security Consideration**: While not explicitly a security
vulnerability, NULL pointer dereferences can potentially be exploited
for denial of service attacks, making this fix important for system
stability.
The commit follows all the stable kernel rules: it fixes a real bug
(NULL pointer dereference), is obviously correct (adds NULL check), is
small and contained, and improves consistency across the codebase. The
pattern of backporting similar NULL check fixes in packet schedulers (as
seen in similar commits #2 and #3) supports backporting this fix as
well.
include/net/pkt_sched.h | 25 ++++++++++++++++++++++++-
net/sched/sch_api.c | 10 ----------
net/sched/sch_hfsc.c | 16 ----------------
net/sched/sch_qfq.c | 2 +-
4 files changed, 25 insertions(+), 28 deletions(-)
diff --git a/include/net/pkt_sched.h b/include/net/pkt_sched.h
index 15960564e0c36..4d72d24b1f33e 100644
--- a/include/net/pkt_sched.h
+++ b/include/net/pkt_sched.h
@@ -112,7 +112,6 @@ struct qdisc_rate_table *qdisc_get_rtab(struct tc_ratespec *r,
struct netlink_ext_ack *extack);
void qdisc_put_rtab(struct qdisc_rate_table *tab);
void qdisc_put_stab(struct qdisc_size_table *tab);
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc);
bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
struct net_device *dev, struct netdev_queue *txq,
spinlock_t *root_lock, bool validate);
@@ -306,4 +305,28 @@ static inline bool tc_qdisc_stats_dump(struct Qdisc *sch,
return true;
}
+static inline void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
+{
+ if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
+ pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
+ txt, qdisc->ops->id, qdisc->handle >> 16);
+ qdisc->flags |= TCQ_F_WARN_NONWC;
+ }
+}
+
+static inline unsigned int qdisc_peek_len(struct Qdisc *sch)
+{
+ struct sk_buff *skb;
+ unsigned int len;
+
+ skb = sch->ops->peek(sch);
+ if (unlikely(skb == NULL)) {
+ qdisc_warn_nonwc("qdisc_peek_len", sch);
+ return 0;
+ }
+ len = qdisc_pkt_len(skb);
+
+ return len;
+}
+
#endif
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 282423106f15d..ae63b3199b1d1 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -594,16 +594,6 @@ void __qdisc_calculate_pkt_len(struct sk_buff *skb,
qdisc_skb_cb(skb)->pkt_len = pkt_len;
}
-void qdisc_warn_nonwc(const char *txt, struct Qdisc *qdisc)
-{
- if (!(qdisc->flags & TCQ_F_WARN_NONWC)) {
- pr_warn("%s: %s qdisc %X: is non-work-conserving?\n",
- txt, qdisc->ops->id, qdisc->handle >> 16);
- qdisc->flags |= TCQ_F_WARN_NONWC;
- }
-}
-EXPORT_SYMBOL(qdisc_warn_nonwc);
-
static enum hrtimer_restart qdisc_watchdog(struct hrtimer *timer)
{
struct qdisc_watchdog *wd = container_of(timer, struct qdisc_watchdog,
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index afcb83d469ff6..751b1e2c35b3f 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -835,22 +835,6 @@ update_vf(struct hfsc_class *cl, unsigned int len, u64 cur_time)
}
}
-static unsigned int
-qdisc_peek_len(struct Qdisc *sch)
-{
- struct sk_buff *skb;
- unsigned int len;
-
- skb = sch->ops->peek(sch);
- if (unlikely(skb == NULL)) {
- qdisc_warn_nonwc("qdisc_peek_len", sch);
- return 0;
- }
- len = qdisc_pkt_len(skb);
-
- return len;
-}
-
static void
hfsc_adjust_levels(struct hfsc_class *cl)
{
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c
index 5e557b960bde3..e0cefa21ce21c 100644
--- a/net/sched/sch_qfq.c
+++ b/net/sched/sch_qfq.c
@@ -992,7 +992,7 @@ static struct sk_buff *agg_dequeue(struct qfq_aggregate *agg,
if (cl->qdisc->q.qlen == 0) /* no more packets, remove from list */
list_del_init(&cl->alist);
- else if (cl->deficit < qdisc_pkt_len(cl->qdisc->ops->peek(cl->qdisc))) {
+ else if (cl->deficit < qdisc_peek_len(cl->qdisc)) {
cl->deficit += agg->lmax;
list_move_tail(&cl->alist, &agg->active);
}
--
2.39.5
From: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
[ Upstream commit d7a54d02db41f72f0581a3c77c75b0993ed3f6e2 ]
This is currently not initialized for a virtual monitor, leading to a
NULL pointer dereference when - for example - iterating over all the
keys of all the vifs.
Reviewed-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Link: https://patch.msgid.link/20250709233400.8dcefe578497.I4c90a00ae3256520e0631…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code changes, here is my
assessment:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Critical NULL Pointer Dereference Fix**: The commit fixes a NULL
pointer dereference that occurs when iterating over the key_list of
virtual monitor interfaces. This is a crash-inducing bug that affects
system stability.
2. **Clear Bug with Simple Fix**: The issue is straightforward - the
`key_list` was not initialized for virtual monitor interfaces created
via `ieee80211_add_virtual_monitor()`. The fix is minimal and
contained - it simply moves the `INIT_LIST_HEAD(&sdata->key_list)`
initialization from `ieee80211_if_add()` into
`ieee80211_sdata_init()`, ensuring all sdata structures have their
key_list properly initialized.
3. **Real-World Impact**: The bug can be triggered when any code
iterates over all interfaces and their keys. Looking at the code,
functions like `ieee80211_iter_keys()` and
`ieee80211_iter_keys_rcu()` iterate through all interfaces when
called without a specific vif parameter:
```c
list_for_each_entry(sdata, &local->interfaces, list)
list_for_each_entry_safe(key, tmp, &sdata->key_list, list)
```
This would cause a NULL pointer dereference when it encounters a
virtual monitor interface.
4. **Minimal Risk**: The change is extremely low risk - it only adds
initialization of a list head that should have been initialized all
along. There are no architectural changes or feature additions.
5. **Follows Stable Rules**: This perfectly fits the stable kernel
criteria:
- Fixes a real bug (NULL pointer dereference/crash)
- Small and contained change (2 lines moved)
- Obviously correct fix
- No new features or behaviors introduced
The commit is similar in nature to commit #5 in the reference list which
was marked as suitable for backporting - both fix NULL pointer
dereferences in the wifi/mac80211 subsystem with minimal, targeted
changes.
net/mac80211/iface.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 209d6ffa8e426..adfdc14bd91ac 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1121,6 +1121,8 @@ static void ieee80211_sdata_init(struct ieee80211_local *local,
{
sdata->local = local;
+ INIT_LIST_HEAD(&sdata->key_list);
+
/*
* Initialize the default link, so we can use link_id 0 for non-MLD,
* and that continues to work for non-MLD-aware drivers that use just
@@ -2162,8 +2164,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,
ieee80211_init_frag_cache(&sdata->frags);
- INIT_LIST_HEAD(&sdata->key_list);
-
wiphy_delayed_work_init(&sdata->dec_tailroom_needed_wk,
ieee80211_delayed_tailroom_dec);
--
2.39.5
From: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
[ Upstream commit d7a54d02db41f72f0581a3c77c75b0993ed3f6e2 ]
This is currently not initialized for a virtual monitor, leading to a
NULL pointer dereference when - for example - iterating over all the
keys of all the vifs.
Reviewed-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit(a)intel.com>
Link: https://patch.msgid.link/20250709233400.8dcefe578497.I4c90a00ae3256520e0631…
Signed-off-by: Johannes Berg <johannes.berg(a)intel.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
LLM Generated explanations, may be completely bogus:
Based on my analysis of the commit and the code changes, here is my
assessment:
**YES**
This commit should be backported to stable kernel trees for the
following reasons:
1. **Critical NULL Pointer Dereference Fix**: The commit fixes a NULL
pointer dereference that occurs when iterating over the key_list of
virtual monitor interfaces. This is a crash-inducing bug that affects
system stability.
2. **Clear Bug with Simple Fix**: The issue is straightforward - the
`key_list` was not initialized for virtual monitor interfaces created
via `ieee80211_add_virtual_monitor()`. The fix is minimal and
contained - it simply moves the `INIT_LIST_HEAD(&sdata->key_list)`
initialization from `ieee80211_if_add()` into
`ieee80211_sdata_init()`, ensuring all sdata structures have their
key_list properly initialized.
3. **Real-World Impact**: The bug can be triggered when any code
iterates over all interfaces and their keys. Looking at the code,
functions like `ieee80211_iter_keys()` and
`ieee80211_iter_keys_rcu()` iterate through all interfaces when
called without a specific vif parameter:
```c
list_for_each_entry(sdata, &local->interfaces, list)
list_for_each_entry_safe(key, tmp, &sdata->key_list, list)
```
This would cause a NULL pointer dereference when it encounters a
virtual monitor interface.
4. **Minimal Risk**: The change is extremely low risk - it only adds
initialization of a list head that should have been initialized all
along. There are no architectural changes or feature additions.
5. **Follows Stable Rules**: This perfectly fits the stable kernel
criteria:
- Fixes a real bug (NULL pointer dereference/crash)
- Small and contained change (2 lines moved)
- Obviously correct fix
- No new features or behaviors introduced
The commit is similar in nature to commit #5 in the reference list which
was marked as suitable for backporting - both fix NULL pointer
dereferences in the wifi/mac80211 subsystem with minimal, targeted
changes.
net/mac80211/iface.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 7d93e5aa595b2..0485a78eda366 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -1117,6 +1117,8 @@ static void ieee80211_sdata_init(struct ieee80211_local *local,
{
sdata->local = local;
+ INIT_LIST_HEAD(&sdata->key_list);
+
/*
* Initialize the default link, so we can use link_id 0 for non-MLD,
* and that continues to work for non-MLD-aware drivers that use just
@@ -2177,8 +2179,6 @@ int ieee80211_if_add(struct ieee80211_local *local, const char *name,
ieee80211_init_frag_cache(&sdata->frags);
- INIT_LIST_HEAD(&sdata->key_list);
-
wiphy_delayed_work_init(&sdata->dec_tailroom_needed_wk,
ieee80211_delayed_tailroom_dec);
--
2.39.5
Commit e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within
OVERLAY for DCE") accidentally broke the binutils version restriction
that was added in commit 0d437918fb64 ("ARM: 9414/1: Fix build issue
with LD_DEAD_CODE_DATA_ELIMINATION"), reintroducing the segmentation
fault addressed by that workaround.
Restore the binutils version dependency by using
CONFIG_LD_CAN_USE_KEEP_IN_OVERLAY as an additional condition to ensure
that CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION is only enabled with
binutils >= 2.36 and ld.lld >= 21.0.0.
Cc: stable(a)vger.kernel.org
Fixes: e7607f7d6d81 ("ARM: 9443/1: Require linker to support KEEP within OVERLAY for DCE")
Reported-by: Rob Landley <rob(a)landley.net>
Closes: https://lore.kernel.org/6739da7d-e555-407a-b5cb-e5681da71056@landley.net/
Tested-by: Rob Landley <rob(a)landley.net>
Signed-off-by: Nathan Chancellor <nathan(a)kernel.org>
---
arch/arm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 3072731fe09c..962451e54fdd 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -121,7 +121,7 @@ config ARM
select HAVE_KERNEL_XZ
select HAVE_KPROBES if !XIP_KERNEL && !CPU_ENDIAN_BE32 && !CPU_V7M
select HAVE_KRETPROBES if HAVE_KPROBES
- select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if (LD_VERSION >= 23600 || LD_CAN_USE_KEEP_IN_OVERLAY)
+ select HAVE_LD_DEAD_CODE_DATA_ELIMINATION if (LD_VERSION >= 23600 || LD_IS_LLD) && LD_CAN_USE_KEEP_IN_OVERLAY
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI
select HAVE_OPTPROBES if !THUMB2_KERNEL
---
base-commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a
change-id: 20250707-arm-fix-dce-older-binutils-87a5a4b829d9
Best regards,
--
Nathan Chancellor <nathan(a)kernel.org>
Hello,
with kernel v6.16-rc5-121-gbc9ff192a6c9 I see this:
cat /sys/devices/system/cpu/vulnerabilities/tsa
Mitigation: Clear CPU buffers
dmesg | grep micro
[ 1.479203] microcode: Current revision: 0x0a20102e
[ 1.479206] microcode: Updated early from: 0x0a201016
So, this works.
but same machine with 6.6.97:
dmesg | grep micro
[ 0.451496] Transient Scheduler Attacks: Vulnerable: Clear CPU buffers
attempted, no microcode
[ 1.077149] microcode: Current revision: 0x0a20102e
[ 1.077152] microcode: Updated early from: 0x0a201016
so:
cat /sys/devices/system/cpu/vulnerabilities/tsa
Vulnerable: Clear CPU buffers attempted, no microcode
but it is switched on:
zcat /proc/config.gz | grep TSA
CONFIG_MITIGATION_TSA=y
And other stuff which need microcode works:
cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
Mitigation: Safe RET
without microcode you wwould see:
Vulnerable: Safe RET, no microcode
6.12.37 broken too
6.15.6 works
v6.16-rc5-121-gbc9ff192a6c9 works
This is a:
processor : 11
vendor_id : AuthenticAMD
cpu family : 25
model : 33
model name : AMD Ryzen 5 5600X 6-Core Processor
stepping : 0
microcode : 0xa20102e
Is something missing in 6.6.y and 6.12.y?
Thomas
(already submitted @ Fedora; advised to post here to upstream as well)
-------- Forwarded Message --------
Subject: regression in 6.15.5: KVM guest launch FAILSs with missing CPU feature error (sbpb, ibpb-brtype)
Date: Sun, 13 Jul 2025 17:37:57 -0400
From: pgnd <pgnd(a)dev-mail.net>
Reply-To: pgnd(a)dev-mail.net
To: kernel(a)lists.fedoraproject.org
i'm seeing a regression on Fedora 42 kernel 6.15.4 -> 6.15.5 on AMD Ryzen 5 5600G host (x86_64)
kvm guests that launch ok under kernels 6.15.[3,4] fail with the following error when attempting to autostart under 6.15.5:
internal error: Failed to autostart VM: operation failed: guest CPU doesn't match specification: missing features: sbpb,ibpb-brtype
no changes made to libvirt, qemu, or VM defs between kernel versions.
re-booting to old kernel versions restores expected behavior.
maybe related (?) to recent changes in AMD CPU feature exposure / mitigation handling in kernel 6.15.5?
i've opened a bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2379784
I'm announcing the release of the 6.12.38 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.12.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.12.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.12.38
I'm announcing the release of the 6.6.98 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.6.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.6.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.6.98
I'm announcing the release of the 6.1.145 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 6.1.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.1.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 6.1.145
I'm announcing the release of the 5.15.188 kernel.
Only users of AMD x86-based processors need to upgrade, all others may
skip this release.
The updated 5.15.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.15.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
arch/x86/kernel/cpu/amd.c | 5 +++--
2 files changed, 4 insertions(+), 3 deletions(-)
Borislav Petkov (AMD) (1):
x86/CPU/AMD: Properly check the TSA microcode
Greg Kroah-Hartman (1):
Linux 5.15.188
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071423-imbecile-slander-d8e9@gregkh' --subject-prefix 'PATCH 5.4.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071422-target-professor-6f97@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071421-molar-theme-43d7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y
git checkout FETCH_HEAD
git cherry-pick -x 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025071420-occupy-vowel-a8a5@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 76303ee8d54bff6d9a6d55997acd88a6c2ba63cf Mon Sep 17 00:00:00 2001
From: Jann Horn <jannh(a)google.com>
Date: Wed, 2 Jul 2025 10:32:04 +0200
Subject: [PATCH] x86/mm: Disable hugetlb page table sharing on 32-bit
Only select ARCH_WANT_HUGE_PMD_SHARE on 64-bit x86.
Page table sharing requires at least three levels because it involves
shared references to PMD tables; 32-bit x86 has either two-level paging
(without PAE) or three-level paging (with PAE), but even with
three-level paging, having a dedicated PGD entry for hugetlb is only
barely possible (because the PGD only has four entries), and it seems
unlikely anyone's actually using PMD sharing on 32-bit.
Having ARCH_WANT_HUGE_PMD_SHARE enabled on non-PAE 32-bit X86 (which
has 2-level paging) became particularly problematic after commit
59d9094df3d7 ("mm: hugetlb: independent PMD page table shared count"),
since that changes `struct ptdesc` such that the `pt_mm` (for PGDs) and
the `pt_share_count` (for PMDs) share the same union storage - and with
2-level paging, PMDs are PGDs.
(For comparison, arm64 also gates ARCH_WANT_HUGE_PMD_SHARE on the
configuration of page tables such that it is never enabled with 2-level
paging.)
Closes: https://lore.kernel.org/r/srhpjxlqfna67blvma5frmy3aa@altlinux.org
Fixes: cfe28c5d63d8 ("x86: mm: Remove x86 version of huge_pmd_share.")
Reported-by: Vitaly Chikunov <vt(a)altlinux.org>
Suggested-by: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Jann Horn <jannh(a)google.com>
Signed-off-by: Dave Hansen <dave.hansen(a)linux.intel.com>
Acked-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: David Hildenbrand <david(a)redhat.com>
Tested-by: Vitaly Chikunov <vt(a)altlinux.org>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250702-x86-2level-hugetlb-v2-1-1a98096edf92%4…
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 71019b3b54ea..4e0fe688cc83 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -147,7 +147,7 @@ config X86
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
- select ARCH_WANT_HUGE_PMD_SHARE
+ select ARCH_WANT_HUGE_PMD_SHARE if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
After commit 725f31f300e3 ("thermal/of: support thermal zones w/o trips
subnode") was backported on 6.6 stable branch as commit d3304dbc2d5f
("thermal/of: support thermal zones w/o trips subnode"), thermal zones
w/o trips subnode still fail to register since `mask` argument is not
set correctly. When number of trips subnode is 0, `mask` must be 0 to
pass the check in `thermal_zone_device_register_with_trips()`.
Set `mask` to 0 when there's no trips subnode.
Signed-off-by: Hsin-Te Yuan <yuanhsinte(a)chromium.org>
---
drivers/thermal/thermal_of.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/thermal/thermal_of.c b/drivers/thermal/thermal_of.c
index 0f520cf923a1e684411a3077ad283551395eec11..97aeb869abf5179dfa512dd744725121ec7fd0d9 100644
--- a/drivers/thermal/thermal_of.c
+++ b/drivers/thermal/thermal_of.c
@@ -514,7 +514,7 @@ static struct thermal_zone_device *thermal_of_zone_register(struct device_node *
of_ops->bind = thermal_of_bind;
of_ops->unbind = thermal_of_unbind;
- mask = GENMASK_ULL((ntrips) - 1, 0);
+ mask = ntrips ? GENMASK_ULL((ntrips) - 1, 0) : 0;
tz = thermal_zone_device_register_with_trips(np->name, trips, ntrips,
mask, data, of_ops, &tzp,
---
base-commit: a5df3a702b2cba82bcfb066fa9f5e4a349c33924
change-id: 20250707-trip-point-73dae9fd9c74
Best regards,
--
Hsin-Te Yuan <yuanhsinte(a)chromium.org>
The grp->bb_largest_free_order is updated regardless of whether
mb_optimize_scan is enabled. This can lead to inconsistencies between
grp->bb_largest_free_order and the actual s_mb_largest_free_orders list
index when mb_optimize_scan is repeatedly enabled and disabled via remount.
For example, if mb_optimize_scan is initially enabled, largest free
order is 3, and the group is in s_mb_largest_free_orders[3]. Then,
mb_optimize_scan is disabled via remount, block allocations occur,
updating largest free order to 2. Finally, mb_optimize_scan is re-enabled
via remount, more block allocations update largest free order to 1.
At this point, the group would be removed from s_mb_largest_free_orders[3]
under the protection of s_mb_largest_free_orders_locks[2]. This lock
mismatch can lead to list corruption.
To fix this, whenever grp->bb_largest_free_order changes, we now always
attempt to remove the group from its old order list. However, we only
insert the group into the new order list if `mb_optimize_scan` is enabled.
This approach helps prevent lock inconsistencies and ensures the data in
the order lists remains reliable.
Fixes: 196e402adf2e ("ext4: improve cr 0 / cr 1 group scanning")
CC: stable(a)vger.kernel.org
Suggested-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Baokun Li <libaokun1(a)huawei.com>
---
fs/ext4/mballoc.c | 33 ++++++++++++++-------------------
1 file changed, 14 insertions(+), 19 deletions(-)
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 72b20fc52bbf..fada0d1b3fdb 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -1152,33 +1152,28 @@ static void
mb_set_largest_free_order(struct super_block *sb, struct ext4_group_info *grp)
{
struct ext4_sb_info *sbi = EXT4_SB(sb);
- int i;
+ int new, old = grp->bb_largest_free_order;
- for (i = MB_NUM_ORDERS(sb) - 1; i >= 0; i--)
- if (grp->bb_counters[i] > 0)
+ for (new = MB_NUM_ORDERS(sb) - 1; new >= 0; new--)
+ if (grp->bb_counters[new] > 0)
break;
+
/* No need to move between order lists? */
- if (!test_opt2(sb, MB_OPTIMIZE_SCAN) ||
- i == grp->bb_largest_free_order) {
- grp->bb_largest_free_order = i;
+ if (new == old)
return;
- }
- if (grp->bb_largest_free_order >= 0) {
- write_lock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ if (old >= 0 && !list_empty(&grp->bb_largest_free_order_node)) {
+ write_lock(&sbi->s_mb_largest_free_orders_locks[old]);
list_del_init(&grp->bb_largest_free_order_node);
- write_unlock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ write_unlock(&sbi->s_mb_largest_free_orders_locks[old]);
}
- grp->bb_largest_free_order = i;
- if (grp->bb_largest_free_order >= 0 && grp->bb_free) {
- write_lock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+
+ grp->bb_largest_free_order = new;
+ if (test_opt2(sb, MB_OPTIMIZE_SCAN) && new >= 0 && grp->bb_free) {
+ write_lock(&sbi->s_mb_largest_free_orders_locks[new]);
list_add_tail(&grp->bb_largest_free_order_node,
- &sbi->s_mb_largest_free_orders[grp->bb_largest_free_order]);
- write_unlock(&sbi->s_mb_largest_free_orders_locks[
- grp->bb_largest_free_order]);
+ &sbi->s_mb_largest_free_orders[new]);
+ write_unlock(&sbi->s_mb_largest_free_orders_locks[new]);
}
}
--
2.46.1