The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 7ca52541c05c832d32b112274f81a985101f9ba8 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to 'stable@vger.kernel.org' --in-reply-to '2025062025-unengaged-regroup-c3c7@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7ca52541c05c832d32b112274f81a985101f9ba8 Mon Sep 17 00:00:00 2001 From: Eric Dumazet edumazet@google.com Date: Wed, 11 Jun 2025 08:35:01 +0000 Subject: [PATCH] net_sched: sch_sfq: reject invalid perturb period
Gerrard Tai reported that SFQ perturb_period has no range check yet, and this can be used to trigger a race condition fixed in a separate patch.
We want to make sure ctl->perturb_period * HZ will not overflow and is positive.
Tested:
tc qd add dev lo root sfq perturb -10 # negative value : error Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb 1000000000 # too big : error Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb 2000000 # acceptable value tc -s -d qd sh dev lo qdisc sfq 8005: root refcnt 2 limit 127p quantum 64Kb depth 127 flows 128 divisor 1024 perturb 2000000sec Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Gerrard Tai gerrard.tai@starlabs.sg Signed-off-by: Eric Dumazet edumazet@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250611083501.1810459-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 77fa02f2bfcd..a8cca549b5a2 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -656,6 +656,14 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, NL_SET_ERR_MSG_MOD(extack, "invalid quantum"); return -EINVAL; } + + if (ctl->perturb_period < 0 || + ctl->perturb_period > INT_MAX / HZ) { + NL_SET_ERR_MSG_MOD(extack, "invalid perturb period"); + return -EINVAL; + } + perturb_period = ctl->perturb_period * HZ; + if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog, ctl_v1->Scell_log, NULL)) return -EINVAL; @@ -672,14 +680,12 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, headdrop = q->headdrop; maxdepth = q->maxdepth; maxflows = q->maxflows; - perturb_period = q->perturb_period; quantum = q->quantum; flags = q->flags;
/* update and validate configuration */ if (ctl->quantum) quantum = ctl->quantum; - perturb_period = ctl->perturb_period * HZ; if (ctl->flows) maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); if (ctl->divisor) {
commit a17ef9e6c2c1cf0fc6cd6ca6a9ce525c67d1da7f upstream.
sfq_perturbation() reads q->perturb_period locklessly. Add annotations to fix potential issues.
Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20240430180015.3111398-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/sched/sch_sfq.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index f8e569f79f1367563bf92793dbd2a9c0a4ce957b..d6299eb12a8164bcaa0594d3ac2829531bfac588 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -608,6 +608,7 @@ static void sfq_perturbation(struct timer_list *t) struct Qdisc *sch = q->sch; spinlock_t *root_lock = qdisc_lock(qdisc_root_sleeping(sch)); siphash_key_t nkey; + int period;
get_random_bytes(&nkey, sizeof(nkey)); spin_lock(root_lock); @@ -616,8 +617,12 @@ static void sfq_perturbation(struct timer_list *t) sfq_rehash(sch); spin_unlock(root_lock);
- if (q->perturb_period) - mod_timer(&q->perturb_timer, jiffies + q->perturb_period); + /* q->perturb_period can change under us from + * sfq_change() and sfq_destroy(). + */ + period = READ_ONCE(q->perturb_period); + if (period) + mod_timer(&q->perturb_timer, jiffies + period); }
static int sfq_change(struct Qdisc *sch, struct nlattr *opt) @@ -659,7 +664,7 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) q->quantum = ctl->quantum; q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum); } - q->perturb_period = ctl->perturb_period * HZ; + WRITE_ONCE(q->perturb_period, ctl->perturb_period * HZ); if (ctl->flows) q->maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); if (ctl->divisor) { @@ -721,7 +726,7 @@ static void sfq_destroy(struct Qdisc *sch) struct sfq_sched_data *q = qdisc_priv(sch);
tcf_block_put(q->block); - q->perturb_period = 0; + WRITE_ONCE(q->perturb_period, 0); del_timer_sync(&q->perturb_timer); sfq_free(q->ht); sfq_free(q->slots);
commit e4650d7ae4252f67e997a632adfae0dd74d3a99a upstream.
SFQ has an assumption on dealing with packets smaller than 64KB.
Even before BIG TCP, TCA_STAB can provide arbitrary big values in qdisc_pkt_len(skb)
It is time to switch (struct sfq_slot)->allot to a 32bit field.
sizeof(struct sfq_slot) is now 64 bytes, giving better cache locality.
Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://patch.msgid.link/20241008111603.653140-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/sched/sch_sfq.c | 39 +++++++++++++-------------------------- 1 file changed, 13 insertions(+), 26 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index d6299eb12a8164bcaa0594d3ac2829531bfac588..714bdc2c5a682a9767ce5e8a404aa7b4889604a6 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -77,12 +77,6 @@ #define SFQ_EMPTY_SLOT 0xffff #define SFQ_DEFAULT_HASH_DIVISOR 1024
-/* We use 16 bits to store allot, and want to handle packets up to 64K - * Scale allot by 8 (1<<3) so that no overflow occurs. - */ -#define SFQ_ALLOT_SHIFT 3 -#define SFQ_ALLOT_SIZE(X) DIV_ROUND_UP(X, 1 << SFQ_ALLOT_SHIFT) - /* This type should contain at least SFQ_MAX_DEPTH + 1 + SFQ_MAX_FLOWS values */ typedef u16 sfq_index;
@@ -104,7 +98,7 @@ struct sfq_slot { sfq_index next; /* next slot in sfq RR chain */ struct sfq_head dep; /* anchor in dep[] chains */ unsigned short hash; /* hash value (index in ht[]) */ - short allot; /* credit for this slot */ + int allot; /* credit for this slot */
unsigned int backlog; struct red_vars vars; @@ -120,7 +114,6 @@ struct sfq_sched_data { siphash_key_t perturbation; u8 cur_depth; /* depth of longest slot */ u8 flags; - unsigned short scaled_quantum; /* SFQ_ALLOT_SIZE(quantum) */ struct tcf_proto __rcu *filter_list; struct tcf_block *block; sfq_index *ht; /* Hash table ('divisor' slots) */ @@ -456,7 +449,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) */ q->tail = slot; /* We could use a bigger initial quantum for new flows */ - slot->allot = q->scaled_quantum; + slot->allot = q->quantum; } if (++sch->q.qlen <= q->limit) return NET_XMIT_SUCCESS; @@ -493,7 +486,7 @@ sfq_dequeue(struct Qdisc *sch) slot = &q->slots[a]; if (slot->allot <= 0) { q->tail = slot; - slot->allot += q->scaled_quantum; + slot->allot += q->quantum; goto next_slot; } skb = slot_dequeue_head(slot); @@ -512,7 +505,7 @@ sfq_dequeue(struct Qdisc *sch) } q->tail->next = next_a; } else { - slot->allot -= SFQ_ALLOT_SIZE(qdisc_pkt_len(skb)); + slot->allot -= qdisc_pkt_len(skb); } return skb; } @@ -595,7 +588,7 @@ static void sfq_rehash(struct Qdisc *sch) q->tail->next = x; } q->tail = slot; - slot->allot = q->scaled_quantum; + slot->allot = q->quantum; } } sch->q.qlen -= dropped; @@ -625,7 +618,8 @@ static void sfq_perturbation(struct timer_list *t) mod_timer(&q->perturb_timer, jiffies + period); }
-static int sfq_change(struct Qdisc *sch, struct nlattr *opt) +static int sfq_change(struct Qdisc *sch, struct nlattr *opt, + struct netlink_ext_ack *extack) { struct sfq_sched_data *q = qdisc_priv(sch); struct tc_sfq_qopt *ctl = nla_data(opt); @@ -643,14 +637,10 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) (!is_power_of_2(ctl->divisor) || ctl->divisor > 65536)) return -EINVAL;
- /* slot->allot is a short, make sure quantum is not too big. */ - if (ctl->quantum) { - unsigned int scaled = SFQ_ALLOT_SIZE(ctl->quantum); - - if (scaled <= 0 || scaled > SHRT_MAX) - return -EINVAL; + if ((int)ctl->quantum < 0) { + NL_SET_ERR_MSG_MOD(extack, "invalid quantum"); + return -EINVAL; } - if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog, ctl_v1->Scell_log, NULL)) return -EINVAL; @@ -660,10 +650,8 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt) return -ENOMEM; } sch_tree_lock(sch); - if (ctl->quantum) { + if (ctl->quantum) q->quantum = ctl->quantum; - q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum); - } WRITE_ONCE(q->perturb_period, ctl->perturb_period * HZ); if (ctl->flows) q->maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); @@ -759,12 +747,11 @@ static int sfq_init(struct Qdisc *sch, struct nlattr *opt, q->divisor = SFQ_DEFAULT_HASH_DIVISOR; q->maxflows = SFQ_DEFAULT_FLOWS; q->quantum = psched_mtu(qdisc_dev(sch)); - q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum); q->perturb_period = 0; get_random_bytes(&q->perturbation, sizeof(q->perturbation));
if (opt) { - int err = sfq_change(sch, opt); + int err = sfq_change(sch, opt, extack); if (err) return err; } @@ -875,7 +862,7 @@ static int sfq_dump_class_stats(struct Qdisc *sch, unsigned long cl, if (idx != SFQ_EMPTY_SLOT) { const struct sfq_slot *slot = &q->slots[idx];
- xstats.allot = slot->allot << SFQ_ALLOT_SHIFT; + xstats.allot = slot->allot; qs.qlen = slot->qlen; qs.backlog = slot->backlog; }
From: Octavian Purdila tavip@google.com
commit 10685681bafce6febb39770f3387621bf5d67d0b upstream.
The current implementation does not work correctly with a limit of 1. iproute2 actually checks for this and this patch adds the check in kernel as well.
This fixes the following syzkaller reported crash:
UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:210:6 index 65535 is out of range for type 'struct sfq_head[128]' CPU: 0 PID: 2569 Comm: syz-executor101 Not tainted 5.10.0-smp-DEV #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x125/0x19f lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:148 [inline] __ubsan_handle_out_of_bounds+0xed/0x120 lib/ubsan.c:347 sfq_link net/sched/sch_sfq.c:210 [inline] sfq_dec+0x528/0x600 net/sched/sch_sfq.c:238 sfq_dequeue+0x39b/0x9d0 net/sched/sch_sfq.c:500 sfq_reset+0x13/0x50 net/sched/sch_sfq.c:525 qdisc_reset+0xfe/0x510 net/sched/sch_generic.c:1026 tbf_reset+0x3d/0x100 net/sched/sch_tbf.c:319 qdisc_reset+0xfe/0x510 net/sched/sch_generic.c:1026 dev_reset_queue+0x8c/0x140 net/sched/sch_generic.c:1296 netdev_for_each_tx_queue include/linux/netdevice.h:2350 [inline] dev_deactivate_many+0x6dc/0xc20 net/sched/sch_generic.c:1362 __dev_close_many+0x214/0x350 net/core/dev.c:1468 dev_close_many+0x207/0x510 net/core/dev.c:1506 unregister_netdevice_many+0x40f/0x16b0 net/core/dev.c:10738 unregister_netdevice_queue+0x2be/0x310 net/core/dev.c:10695 unregister_netdevice include/linux/netdevice.h:2893 [inline] __tun_detach+0x6b6/0x1600 drivers/net/tun.c:689 tun_detach drivers/net/tun.c:705 [inline] tun_chr_close+0x104/0x1b0 drivers/net/tun.c:3640 __fput+0x203/0x840 fs/file_table.c:280 task_work_run+0x129/0x1b0 kernel/task_work.c:185 exit_task_work include/linux/task_work.h:33 [inline] do_exit+0x5ce/0x2200 kernel/exit.c:931 do_group_exit+0x144/0x310 kernel/exit.c:1046 __do_sys_exit_group kernel/exit.c:1057 [inline] __se_sys_exit_group kernel/exit.c:1055 [inline] __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:1055 do_syscall_64+0x6c/0xd0 entry_SYSCALL_64_after_hwframe+0x61/0xcb RIP: 0033:0x7fe5e7b52479 Code: Unable to access opcode bytes at RIP 0x7fe5e7b5244f. RSP: 002b:00007ffd3c800398 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe5e7b52479 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000 RBP: 00007fe5e7bcd2d0 R08: ffffffffffffffb8 R09: 0000000000000014 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe5e7bcd2d0 R13: 0000000000000000 R14: 00007fe5e7bcdd20 R15: 00007fe5e7b24270
The crash can be also be reproduced with the following (with a tc recompiled to allow for sfq limits of 1):
tc qdisc add dev dummy0 handle 1: root tbf rate 1Kbit burst 100b lat 1s ../iproute2-6.9.0/tc/tc qdisc add dev dummy0 handle 2: parent 1:10 sfq limit 1 ifconfig dummy0 up ping -I dummy0 -f -c2 -W0.1 8.8.8.8 sleep 1
Scenario that triggers the crash:
* the first packet is sent and queued in TBF and SFQ; qdisc qlen is 1
* TBF dequeues: it peeks from SFQ which moves the packet to the gso_skb list and keeps qdisc qlen set to 1. TBF is out of tokens so it schedules itself for later.
* the second packet is sent and TBF tries to queues it to SFQ. qdisc qlen is now 2 and because the SFQ limit is 1 the packet is dropped by SFQ. At this point qlen is 1, and all of the SFQ slots are empty, however q->tail is not NULL.
At this point, assuming no more packets are queued, when sch_dequeue runs again it will decrement the qlen for the current empty slot causing an underflow and the subsequent out of bounds access.
Reported-by: syzbot syzkaller@googlegroups.com Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Octavian Purdila tavip@google.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20241204030520.2084663-2-tavip@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/sched/sch_sfq.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 714bdc2c5a682a9767ce5e8a404aa7b4889604a6..505209a932ab73a91a0b15f990d4c9d7a206a05a 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -649,6 +649,10 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, if (!p) return -ENOMEM; } + if (ctl->limit == 1) { + NL_SET_ERR_MSG_MOD(extack, "invalid limit"); + return -EINVAL; + } sch_tree_lock(sch); if (ctl->quantum) q->quantum = ctl->quantum;
From: Octavian Purdila tavip@google.com
commit 8c0cea59d40cf6dd13c2950437631dd614fbade6 upstream.
Many configuration parameters have influence on others (e.g. divisor -> flows -> limit, depth -> limit) and so it is difficult to correctly do all of the validation before applying the configuration. And if a validation error is detected late it is difficult to roll back a partially applied configuration.
To avoid these issues use a temporary work area to update and validate the configuration and only then apply the configuration to the internal state.
Signed-off-by: Octavian Purdila tavip@google.com Acked-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net --- net/sched/sch_sfq.c | 56 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 44 insertions(+), 12 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 505209a932ab73a91a0b15f990d4c9d7a206a05a..0d916146a5a39b492a89067e19b81628bd72f5d1 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -628,6 +628,15 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, struct red_parms *p = NULL; struct sk_buff *to_free = NULL; struct sk_buff *tail = NULL; + unsigned int maxflows; + unsigned int quantum; + unsigned int divisor; + int perturb_period; + u8 headdrop; + u8 maxdepth; + int limit; + u8 flags; +
if (opt->nla_len < nla_attr_size(sizeof(*ctl))) return -EINVAL; @@ -653,36 +662,59 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, NL_SET_ERR_MSG_MOD(extack, "invalid limit"); return -EINVAL; } + sch_tree_lock(sch); + + limit = q->limit; + divisor = q->divisor; + headdrop = q->headdrop; + maxdepth = q->maxdepth; + maxflows = q->maxflows; + perturb_period = q->perturb_period; + quantum = q->quantum; + flags = q->flags; + + /* update and validate configuration */ if (ctl->quantum) - q->quantum = ctl->quantum; - WRITE_ONCE(q->perturb_period, ctl->perturb_period * HZ); + quantum = ctl->quantum; + perturb_period = ctl->perturb_period * HZ; if (ctl->flows) - q->maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); + maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); if (ctl->divisor) { - q->divisor = ctl->divisor; - q->maxflows = min_t(u32, q->maxflows, q->divisor); + divisor = ctl->divisor; + maxflows = min_t(u32, maxflows, divisor); } if (ctl_v1) { if (ctl_v1->depth) - q->maxdepth = min_t(u32, ctl_v1->depth, SFQ_MAX_DEPTH); + maxdepth = min_t(u32, ctl_v1->depth, SFQ_MAX_DEPTH); if (p) { - swap(q->red_parms, p); - red_set_parms(q->red_parms, + red_set_parms(p, ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog, ctl_v1->Plog, ctl_v1->Scell_log, NULL, ctl_v1->max_P); } - q->flags = ctl_v1->flags; - q->headdrop = ctl_v1->headdrop; + flags = ctl_v1->flags; + headdrop = ctl_v1->headdrop; } if (ctl->limit) { - q->limit = min_t(u32, ctl->limit, q->maxdepth * q->maxflows); - q->maxflows = min_t(u32, q->maxflows, q->limit); + limit = min_t(u32, ctl->limit, maxdepth * maxflows); + maxflows = min_t(u32, maxflows, limit); }
+ /* commit configuration */ + q->limit = limit; + q->divisor = divisor; + q->headdrop = headdrop; + q->maxdepth = maxdepth; + q->maxflows = maxflows; + WRITE_ONCE(q->perturb_period, perturb_period); + q->quantum = quantum; + q->flags = flags; + if (p) + swap(q->red_parms, p); + qlen = sch->q.qlen; while (sch->q.qlen > q->limit) { dropped += sfq_drop(sch, &to_free);
From: Octavian Purdila tavip@google.com
commit b3bf8f63e6179076b57c9de660c9f80b5abefe70 upstream.
It is not sufficient to directly validate the limit on the data that the user passes as it can be updated based on how the other parameters are changed.
Move the check at the end of the configuration update process to also catch scenarios where the limit is indirectly updated, for example with the following configurations:
tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 depth 1 tc qdisc add dev dummy0 handle 1: root sfq limit 2 flows 1 divisor 1
This fixes the following syzkaller reported crash:
------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in net/sched/sch_sfq.c:203:6 index 65535 is out of range for type 'struct sfq_head[128]' CPU: 1 UID: 0 PID: 3037 Comm: syz.2.16 Not tainted 6.14.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x201/0x300 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0xf5/0x120 lib/ubsan.c:429 sfq_link net/sched/sch_sfq.c:203 [inline] sfq_dec+0x53c/0x610 net/sched/sch_sfq.c:231 sfq_dequeue+0x34e/0x8c0 net/sched/sch_sfq.c:493 sfq_reset+0x17/0x60 net/sched/sch_sfq.c:518 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 tbf_reset+0x41/0x110 net/sched/sch_tbf.c:339 qdisc_reset+0x12e/0x600 net/sched/sch_generic.c:1035 dev_reset_queue+0x100/0x1b0 net/sched/sch_generic.c:1311 netdev_for_each_tx_queue include/linux/netdevice.h:2590 [inline] dev_deactivate_many+0x7e5/0xe70 net/sched/sch_generic.c:1375
Reported-by: syzbot syzkaller@googlegroups.com Fixes: 10685681bafc ("net_sched: sch_sfq: don't allow 1 packet limit") Signed-off-by: Octavian Purdila tavip@google.com Acked-by: Cong Wang xiyou.wangcong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net --- net/sched/sch_sfq.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 0d916146a5a39b492a89067e19b81628bd72f5d1..04c3aa446ad3d69c014b06e66795b8dfbc369333 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -658,10 +658,6 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, if (!p) return -ENOMEM; } - if (ctl->limit == 1) { - NL_SET_ERR_MSG_MOD(extack, "invalid limit"); - return -EINVAL; - }
sch_tree_lock(sch);
@@ -702,6 +698,12 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, limit = min_t(u32, ctl->limit, maxdepth * maxflows); maxflows = min_t(u32, maxflows, limit); } + if (limit == 1) { + sch_tree_unlock(sch); + kfree(p); + NL_SET_ERR_MSG_MOD(extack, "invalid limit"); + return -EINVAL; + }
/* commit configuration */ q->limit = limit;
commit 82ffbe7776d0ac084031f114167712269bf3d832 upstream.
SFQ has an assumption of always being able to queue at least one packet.
However, after the blamed commit, sch->q.len can be inflated by packets in sch->gso_skb, and an enqueue() on an empty SFQ qdisc can be followed by an immediate drop.
Fix sfq_drop() to properly clear q->tail in this situation.
Tested:
ip netns add lb ip link add dev to-lb type veth peer name in-lb netns lb ethtool -K to-lb tso off # force qdisc to requeue gso_skb ip netns exec lb ethtool -K in-lb gro on # enable NAPI ip link set dev to-lb up ip -netns lb link set dev in-lb up ip addr add dev to-lb 192.168.20.1/24 ip -netns lb addr add dev in-lb 192.168.20.2/24 tc qdisc replace dev to-lb root sfq limit 100
ip netns exec lb netserver
netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 &
Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback") Reported-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Closes: https://lore.kernel.org/netdev/9da42688-bfaa-4364-8797-e9271f3bdaef@hetzner-... Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://patch.msgid.link/20250606165127.3629486-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/sched/sch_sfq.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 04c3aa446ad3d69c014b06e66795b8dfbc369333..29e17809d1a70258ca268d450289c63a272fbee4 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -310,7 +310,10 @@ static unsigned int sfq_drop(struct Qdisc *sch, struct sk_buff **to_free) /* It is difficult to believe, but ALL THE SLOTS HAVE LENGTH 1. */ x = q->tail->next; slot = &q->slots[x]; - q->tail->next = slot->next; + if (slot->next == x) + q->tail = NULL; /* no more active slots */ + else + q->tail->next = slot->next; q->ht[slot->hash] = SFQ_EMPTY_SLOT; goto drop; }
commit 7ca52541c05c832d32b112274f81a985101f9ba8 upstream.
Gerrard Tai reported that SFQ perturb_period has no range check yet, and this can be used to trigger a race condition fixed in a separate patch.
We want to make sure ctl->perturb_period * HZ will not overflow and is positive.
Tested:
tc qd add dev lo root sfq perturb -10 # negative value : error Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb 1000000000 # too big : error Error: sch_sfq: invalid perturb period.
tc qd add dev lo root sfq perturb 2000000 # acceptable value tc -s -d qd sh dev lo qdisc sfq 8005: root refcnt 2 limit 127p quantum 64Kb depth 127 flows 128 divisor 1024 perturb 2000000sec Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Gerrard Tai gerrard.tai@starlabs.sg Signed-off-by: Eric Dumazet edumazet@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250611083501.1810459-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org --- net/sched/sch_sfq.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c index 29e17809d1a70258ca268d450289c63a272fbee4..cd089c3b226a7a6fbd63479d2f16b6150f53c82b 100644 --- a/net/sched/sch_sfq.c +++ b/net/sched/sch_sfq.c @@ -653,6 +653,14 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, NL_SET_ERR_MSG_MOD(extack, "invalid quantum"); return -EINVAL; } + + if (ctl->perturb_period < 0 || + ctl->perturb_period > INT_MAX / HZ) { + NL_SET_ERR_MSG_MOD(extack, "invalid perturb period"); + return -EINVAL; + } + perturb_period = ctl->perturb_period * HZ; + if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max, ctl_v1->Wlog, ctl_v1->Scell_log, NULL)) return -EINVAL; @@ -669,14 +677,12 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt, headdrop = q->headdrop; maxdepth = q->maxdepth; maxflows = q->maxflows; - perturb_period = q->perturb_period; quantum = q->quantum; flags = q->flags;
/* update and validate configuration */ if (ctl->quantum) quantum = ctl->quantum; - perturb_period = ctl->perturb_period * HZ; if (ctl->flows) maxflows = min_t(u32, ctl->flows, SFQ_MAX_FLOWS); if (ctl->divisor) {
linux-stable-mirror@lists.linaro.org