A private syzbot instance reported "KASAN: slab-use-after-free Read in dev_map_enqueue" in some runtime environments.
Upstream patch fecef4cd42c6 ("tun: Assign missing bpf_net_context") fixes the issue. To bring this patch to stable v6.6, upstream patch 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") must also be backported as a dependency.
The dependency patch (401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.")) comes from a patch series [1]; the second patch in this backport addresses a driver that series missed. Only these two patches were picked up because the purpose of this backport is to fix the particular issue discovered by syzbot. However, Sebastian may consider it a better idea to backport the whole series instead of only these two patches. I'd also appreciate opinions on whether this backport should be applied to other stable branches as well.
Both patches needed some manual work to apply on stable, mostly related to changes in the context lines:
In the case of 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT."), the backport addresses the differences in net/core/dev.c:napi_threaded_poll(), busy_poll_stop(), napi_busy_loop() and net_rx_action() between upstream and stable. This allows the patch to be applied without bringing additional dependencies, such as dad6b9770263 ("net: Allow to use SMP threads for backlog NAPI."). The rest of the changes are made to adapt context lines and are unrelated to the purpose of the patch.
For fecef4cd42c6 ("tun: Assign missing bpf_net_context"), the backport addresses the changes in function parameters introduced by 7cd1107f48e2a ("bpf, xdp: constify some bpf_prog * function arguments") and 4d2bb0bfe874 ("xdp: rely on skb pointer reference in do_xdp_generic and netif_receive_generic_xdp").
Additionally, upstream commit 9da49aa80d68 ("tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()"), which fixes fecef4cd42c6 ("tun: Assign missing bpf_net_context"), is also backported, with trivial changes to adapt to differences in the patch context.
[1] https://lore.kernel.org/all/20240612170303.3896084-1-bigeasy@linutronix.de/
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
---
Changes in v3:
- Additional patch backported: 9da49aa80d68 ("tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()"), which fixes fecef4cd42c6 ("tun: Assign missing bpf_net_context."). Suggested by Sasha's helper bot.
- Link to v2: https://lore.kernel.org/r/20250225-20250204-kasan-slab-use-after-free-read-i...
Changes in v2:
- Fix backport for patch 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") in v1.
- Add context for the patches and SoB tags.
- Extend the recipient list.
- Link to v1: https://lore.kernel.org/r/20250224-20250204-kasan-slab-use-after-free-read-i...
---
Jeongjun Park (1):
      tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()

Sebastian Andrzej Siewior (2):
      net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.
      tun: Assign missing bpf_net_context.
 drivers/net/tun.c      |  7 +++++++
 include/linux/filter.h | 56 +++++++++++++++++++++++++++++++++++++++++---------
 include/linux/sched.h  |  3 +++
 kernel/bpf/cpumap.c    |  3 +++
 kernel/bpf/devmap.c    |  9 +++++++-
 kernel/fork.c          |  1 +
 net/bpf/test_run.c     | 11 +++++++++-
 net/core/dev.c         | 34 +++++++++++++++++++++++++++++-
 net/core/filter.c      | 44 +++++++++++----------------------------
 net/core/lwt_bpf.c     |  3 +++
 10 files changed, 126 insertions(+), 45 deletions(-)
---
base-commit: c0249d3a0c3cf082d56f4285647ddba19ef604a7
change-id: 20250224-20250204-kasan-slab-use-after-free-read-in-dev_map_enqueue__submit-b907af839805
Cheers,
Ricardo
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ Upstream commit 401cb7dae8130fd34eb84648e02ab4c506df7d5e ]
The XDP redirect process is two staged:
- bpf_prog_run_xdp() is invoked to run an eBPF program which inspects the packet and makes decisions. While doing that, the per-CPU variable bpf_redirect_info is used.
- Afterwards xdp_do_redirect() is invoked and accesses bpf_redirect_info and it may also access other per-CPU variables like xskmap_flush_list.
At the very end of the NAPI callback, xdp_do_flush() is invoked which does not access bpf_redirect_info but will touch the individual per-CPU lists.
The per-CPU variables are only used in the NAPI callback, hence disabling bottom halves is the only protection mechanism. Users from preemptible context (like cpu_map_kthread_run()) explicitly disable bottom halves for protection reasons. Without locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking.
PREEMPT_RT has forced-threaded interrupts enabled and every NAPI-callback runs in a thread. If each thread has its own data structure then locking can be avoided.
Create a struct bpf_net_context which contains struct bpf_redirect_info. Define the variable on stack, use bpf_net_ctx_set() to save a pointer to it, bpf_net_ctx_clear() removes it again.

The bpf_net_ctx_set() may nest. For instance a function can be used from within NET_RX_SOFTIRQ/net_rx_action which uses bpf_net_ctx_set() and NET_TX_SOFTIRQ which does not. Therefore only the first invocation updates the pointer.

Use bpf_net_ctx_get_ri() as a wrapper to retrieve the current struct bpf_redirect_info. The returned data structure is zero initialized to ensure nothing is leaked from stack. This is done on first usage of the struct. bpf_net_ctx_set() sets bpf_redirect_info::kern_flags to 0 to note that initialisation is required. First invocation of bpf_net_ctx_get_ri() will memset() the data structure and update bpf_redirect_info::kern_flags. bpf_redirect_info::nh is excluded from memset because it is only used once BPF_F_NEIGH is set, which also sets the nh member. The kern_flags is moved past nh to exclude it from memset.
The pointer to bpf_net_context is saved in the task's task_struct. Always using the bpf_net_context approach has the advantage that there are almost zero differences between PREEMPT_RT and non-PREEMPT_RT builds.
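A minimal sketch of the resulting calling pattern (example_napi_poll() is a hypothetical caller; the helpers are the ones added to include/linux/filter.h below):

static void example_napi_poll(void)
{
	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;

	local_bh_disable();
	/* Only the outermost caller gets a non-NULL pointer back; a
	 * nested bpf_net_ctx_set() returns NULL, and the matching
	 * bpf_net_ctx_clear(NULL) is then a no-op.
	 */
	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);

	/* Run XDP programs here: the redirect helpers reach the
	 * per-task bpf_redirect_info via bpf_net_ctx_get_ri(), which
	 * zeroes the structure lazily on first use.
	 */

	bpf_net_ctx_clear(bpf_net_ctx);
	local_bh_enable();
}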
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240620132727.660738-15-bigeasy@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[rcn: the backport addresses the differences in net/core/dev.c:napi_threaded_poll(), busy_poll_stop(), napi_busy_loop() and net_rx_action() between upstream and stable, allowing the patch to be applied on stable without bringing additional dependencies, such as dad6b9770263 ("net: Allow to use SMP threads for backlog NAPI."), or the whole related series. The rest of the changes adapt context lines and are unrelated to the purpose of the patch.]
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
---
 include/linux/filter.h | 56 +++++++++++++++++++++++++++++++++++++++++---------
 include/linux/sched.h  |  3 +++
 kernel/bpf/cpumap.c    |  3 +++
 kernel/bpf/devmap.c    |  9 +++++++-
 kernel/fork.c          |  1 +
 net/bpf/test_run.c     | 11 +++++++++-
 net/core/dev.c         | 28 ++++++++++++++++++++++++-
 net/core/filter.c      | 44 +++++++++++----------------------------
 net/core/lwt_bpf.c     |  3 +++
 9 files changed, 113 insertions(+), 45 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 5090e940ba3e46fa9cabd8b8bcea08d719b20b51..785addc2786bde5ea9d2ae45177bf4283f5233f4 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -654,21 +654,59 @@ struct bpf_nh_params {
 	};
 };
 
+/* flags for bpf_redirect_info kern_flags */
+#define BPF_RI_F_RF_NO_DIRECT	BIT(0)	/* no napi_direct on return_frame */
+#define BPF_RI_F_RI_INIT	BIT(1)
+
 struct bpf_redirect_info {
 	u64 tgt_index;
 	void *tgt_value;
 	struct bpf_map *map;
 	u32 flags;
-	u32 kern_flags;
 	u32 map_id;
 	enum bpf_map_type map_type;
 	struct bpf_nh_params nh;
+	u32 kern_flags;
 };
 
-DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);
+struct bpf_net_context {
+	struct bpf_redirect_info ri;
+};
 
-/* flags for bpf_redirect_info kern_flags */
-#define BPF_RI_F_RF_NO_DIRECT	BIT(0)	/* no napi_direct on return_frame */
+static inline struct bpf_net_context *bpf_net_ctx_set(struct bpf_net_context *bpf_net_ctx)
+{
+	struct task_struct *tsk = current;
+
+	if (tsk->bpf_net_context != NULL)
+		return NULL;
+	bpf_net_ctx->ri.kern_flags = 0;
+
+	tsk->bpf_net_context = bpf_net_ctx;
+	return bpf_net_ctx;
+}
+
+static inline void bpf_net_ctx_clear(struct bpf_net_context *bpf_net_ctx)
+{
+	if (bpf_net_ctx)
+		current->bpf_net_context = NULL;
+}
+
+static inline struct bpf_net_context *bpf_net_ctx_get(void)
+{
+	return current->bpf_net_context;
+}
+
+static inline struct bpf_redirect_info *bpf_net_ctx_get_ri(void)
+{
+	struct bpf_net_context *bpf_net_ctx = bpf_net_ctx_get();
+
+	if (!(bpf_net_ctx->ri.kern_flags & BPF_RI_F_RI_INIT)) {
+		memset(&bpf_net_ctx->ri, 0, offsetof(struct bpf_net_context, ri.nh));
+		bpf_net_ctx->ri.kern_flags |= BPF_RI_F_RI_INIT;
+	}
+
+	return &bpf_net_ctx->ri;
+}
/* Compute the linear packet data range [data, data_end) which * will be accessed by various program types (cls_bpf, act_bpf, @@ -929,25 +967,23 @@ struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off, const struct bpf_insn *patch, u32 len); int bpf_remove_insns(struct bpf_prog *prog, u32 off, u32 cnt);
-void bpf_clear_redirect_map(struct bpf_map *map); - static inline bool xdp_return_frame_no_direct(void) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
return ri->kern_flags & BPF_RI_F_RF_NO_DIRECT; }
static inline void xdp_set_return_frame_no_direct(void) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
ri->kern_flags |= BPF_RI_F_RF_NO_DIRECT; }
static inline void xdp_clear_return_frame_no_direct(void) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
ri->kern_flags &= ~BPF_RI_F_RF_NO_DIRECT; } @@ -1503,7 +1539,7 @@ static __always_inline long __bpf_xdp_redirect_map(struct bpf_map *map, u64 inde u64 flags, const u64 flag_mask, void *lookup_elem(struct bpf_map *map, u32 key)) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); const u64 action_mask = XDP_ABORTED | XDP_DROP | XDP_PASS | XDP_TX;
/* Lower bits of the flags are used as return code on lookup failure */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 2af0a8859d6473e1a4b06a64a3fe7f5017447a32..f905d771703aaf4a9957dd589f3337b23a1a8654 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -45,6 +45,7 @@ struct bio_list; struct blk_plug; struct bpf_local_storage; struct bpf_run_ctx; +struct bpf_net_context; struct capture_control; struct cfs_rq; struct fs_struct; @@ -1494,6 +1495,8 @@ struct task_struct { /* Used for BPF run context */ struct bpf_run_ctx *bpf_ctx; #endif + /* Used by BPF for per-TASK xdp storage */ + struct bpf_net_context *bpf_net_context;
#ifdef CONFIG_GCC_PLUGIN_STACKLEAK unsigned long lowest_stack; diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index df03e66a687c1c5629c73e5d096629369b73d831..34700eb6a853bdb3f4b5f8ddae652caf94f54de3 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -239,12 +239,14 @@ static int cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames, int xdp_n, struct xdp_cpumap_stats *stats, struct list_head *list) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int nframes;
if (!rcpu->prog) return xdp_n;
rcu_read_lock_bh(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
nframes = cpu_map_bpf_prog_run_xdp(rcpu, frames, xdp_n, stats);
@@ -254,6 +256,7 @@ static int cpu_map_bpf_prog_run(struct bpf_cpu_map_entry *rcpu, void **frames, if (unlikely(!list_empty(list))) cpu_map_bpf_prog_run_skb(rcpu, list, stats);
+ bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock_bh(); /* resched point, may call do_softirq() */
return nframes; diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 5f2356b47b2ddc9aba5680082721bbbadb86bb47..3d8d6d07df73b7b59a102bcfe291d5cf1007f13a 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -196,7 +196,14 @@ static void dev_map_free(struct bpf_map *map) list_del_rcu(&dtab->list); spin_unlock(&dev_map_lock);
-	bpf_clear_redirect_map(map);
+	/* bpf_redirect_info->map is assigned in __bpf_xdp_redirect_map()
+	 * during NAPI callback and cleared after the XDP redirect. There is no
+	 * explicit RCU read section which protects bpf_redirect_info->map but
+	 * local_bh_disable() also marks the beginning of an RCU read section.
+	 * This makes the complete softirq callback RCU protected. Thus after
+	 * following synchronize_rcu() there is no bpf_redirect_info->map == map
+	 * assignment.
+	 */
 	synchronize_rcu();
/* Make sure prior __dev_map_entry_free() have completed. */ diff --git a/kernel/fork.c b/kernel/fork.c index 23efaa2c42e4f8144310435d929de4edaee9c69f..bb8019683edc809cb84404f4a17baa49967b4936 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2466,6 +2466,7 @@ __latent_entropy struct task_struct *copy_process( RCU_INIT_POINTER(p->bpf_storage, NULL); p->bpf_ctx = NULL; #endif + p->bpf_net_context = NULL;
/* Perform scheduler related setup. Assign this task to a CPU. */ retval = sched_fork(clone_flags, p); diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index 905de361f8623f90e1594532300d458b3e480a86..ddbbba4861923fa877f07fb8b0c86847771e3392 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -282,9 +282,10 @@ static int xdp_recv_frames(struct xdp_frame **frames, int nframes, static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog, u32 repeat) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int err = 0, act, ret, i, nframes = 0, batch_sz; struct xdp_frame **frames = xdp->frames; + struct bpf_redirect_info *ri; struct xdp_page_head *head; struct xdp_frame *frm; bool redirect = false; @@ -294,6 +295,8 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog, batch_sz = min_t(u32, repeat, xdp->batch_size);
local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); + ri = bpf_net_ctx_get_ri(); xdp_set_return_frame_no_direct();
for (i = 0; i < batch_sz; i++) { @@ -358,6 +361,7 @@ static int xdp_test_run_batch(struct xdp_test_data *xdp, struct bpf_prog *prog, }
xdp_clear_return_frame_no_direct(); + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); return err; } @@ -393,6 +397,7 @@ static int bpf_test_run_xdp_live(struct bpf_prog *prog, struct xdp_buff *ctx, static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *retval, u32 *time, bool xdp) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; struct bpf_prog_array_item item = {.prog = prog}; struct bpf_run_ctx *old_ctx; struct bpf_cg_run_ctx run_ctx; @@ -418,10 +423,14 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, do { run_ctx.prog_item = &item; local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); + if (xdp) *retval = bpf_prog_run_xdp(prog, ctx); else *retval = bpf_prog_run(prog, ctx); + + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); } while (bpf_test_timer_continue(&t, 1, repeat, &ret, time)); bpf_reset_run_ctx(old_ctx); diff --git a/net/core/dev.c b/net/core/dev.c index 479a3892f98c3cc174859e75efd87043cd290fc2..24460c630d3cdfa2490d15969b7ab62b6ce42003 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4019,10 +4019,13 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, struct net_device *orig_dev, bool *another) { struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int sch_ret;
if (!entry) return skb; + + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); if (*pt_prev) { *ret = deliver_skb(skb, *pt_prev, orig_dev); *pt_prev = NULL; @@ -4051,10 +4054,12 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, break; } *ret = NET_RX_SUCCESS; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; case TC_ACT_SHOT: kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS); *ret = NET_RX_DROP; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; /* used by tc_run */ case TC_ACT_STOLEN: @@ -4064,8 +4069,10 @@ sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, fallthrough; case TC_ACT_CONSUMED: *ret = NET_RX_SUCCESS; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; } + bpf_net_ctx_clear(bpf_net_ctx);
return skb; } @@ -4074,11 +4081,14 @@ static __always_inline struct sk_buff * sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) { struct bpf_mprog_entry *entry = rcu_dereference_bh(dev->tcx_egress); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int sch_ret;
if (!entry) return skb;
+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); + /* qdisc_skb_cb(skb)->pkt_len & tcx_set_ingress() was * already set by the caller. */ @@ -4094,10 +4104,12 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) /* No need to push/pop skb's mac_header here on egress! */ skb_do_redirect(skb); *ret = NET_XMIT_SUCCESS; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; case TC_ACT_SHOT: kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS); *ret = NET_XMIT_DROP; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; /* used by tc_run */ case TC_ACT_STOLEN: @@ -4107,8 +4119,10 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) fallthrough; case TC_ACT_CONSUMED: *ret = NET_XMIT_SUCCESS; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; } + bpf_net_ctx_clear(bpf_net_ctx);
return skb; } @@ -6211,6 +6225,7 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool prefer_busy_poll, u16 budget) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; bool skip_schedule = false; unsigned long timeout; int rc; @@ -6228,6 +6243,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool clear_bit(NAPI_STATE_IN_BUSY_POLL, &napi->state);
local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
if (prefer_busy_poll) { napi->defer_hard_irqs_count = READ_ONCE(napi->dev->napi_defer_hard_irqs); @@ -6250,6 +6266,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool netpoll_poll_unlock(have_poll_lock); if (rc == budget) __busy_poll_stop(napi, skip_schedule); + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); }
@@ -6259,6 +6276,7 @@ void napi_busy_loop(unsigned int napi_id, { unsigned long start_time = loop_end ? busy_loop_current_time() : 0; int (*napi_poll)(struct napi_struct *napi, int budget); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; void *have_poll_lock = NULL; struct napi_struct *napi;
@@ -6277,6 +6295,7 @@ void napi_busy_loop(unsigned int napi_id, int work = 0;
local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); if (!napi_poll) { unsigned long val = READ_ONCE(napi->state);
@@ -6306,6 +6325,7 @@ void napi_busy_loop(unsigned int napi_id, if (work > 0) __NET_ADD_STATS(dev_net(napi->dev), LINUX_MIB_BUSYPOLLRXPACKETS, work); + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable();
if (!loop_end || loop_end(loop_end_arg, start_time)) @@ -6694,6 +6714,7 @@ static void skb_defer_free_flush(struct softnet_data *sd)
static int napi_threaded_poll(void *data) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; struct napi_struct *napi = data; struct softnet_data *sd; void *have; @@ -6705,6 +6726,7 @@ static int napi_threaded_poll(void *data) bool repoll = false;
local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); sd = this_cpu_ptr(&softnet_data); sd->in_napi_threaded_poll = true;
@@ -6720,6 +6742,7 @@ static int napi_threaded_poll(void *data) net_rps_action_and_irq_enable(sd); } skb_defer_free_flush(sd); + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable();
if (!repoll) @@ -6737,10 +6760,12 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) struct softnet_data *sd = this_cpu_ptr(&softnet_data); unsigned long time_limit = jiffies + usecs_to_jiffies(READ_ONCE(netdev_budget_usecs)); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int budget = READ_ONCE(netdev_budget); LIST_HEAD(list); LIST_HEAD(repoll);
+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); start: sd->in_net_rx_action = true; local_irq_disable(); @@ -6793,7 +6818,8 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) sd->in_net_rx_action = false;
 	net_rps_action_and_irq_enable(sd);
-end:;
+end:
+	bpf_net_ctx_clear(bpf_net_ctx);
 }
struct netdev_adjacent { diff --git a/net/core/filter.c b/net/core/filter.c index 84992279f4b10e5513cbe6a77fc7dbb77271ec4e..3457b4238bee24ecd029998ba42b179c23de3880 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2471,12 +2471,9 @@ static const struct bpf_func_proto bpf_clone_redirect_proto = { .arg3_type = ARG_ANYTHING, };
-DEFINE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); -EXPORT_PER_CPU_SYMBOL_GPL(bpf_redirect_info); - int skb_do_redirect(struct sk_buff *skb) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); struct net *net = dev_net(skb->dev); struct net_device *dev; u32 flags = ri->flags; @@ -2512,7 +2509,7 @@ int skb_do_redirect(struct sk_buff *skb)
BPF_CALL_2(bpf_redirect, u32, ifindex, u64, flags) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
if (unlikely(flags & (~(BPF_F_INGRESS) | BPF_F_REDIRECT_INTERNAL))) return TC_ACT_SHOT; @@ -2533,7 +2530,7 @@ static const struct bpf_func_proto bpf_redirect_proto = {
BPF_CALL_2(bpf_redirect_peer, u32, ifindex, u64, flags) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
if (unlikely(flags)) return TC_ACT_SHOT; @@ -2555,7 +2552,7 @@ static const struct bpf_func_proto bpf_redirect_peer_proto = { BPF_CALL_4(bpf_redirect_neigh, u32, ifindex, struct bpf_redir_neigh *, params, int, plen, u64, flags) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
if (unlikely((plen && plen < sizeof(*params)) || flags)) return TC_ACT_SHOT; @@ -4297,30 +4294,13 @@ void xdp_do_flush(void) } EXPORT_SYMBOL_GPL(xdp_do_flush);
-void bpf_clear_redirect_map(struct bpf_map *map) -{ - struct bpf_redirect_info *ri; - int cpu; - - for_each_possible_cpu(cpu) { - ri = per_cpu_ptr(&bpf_redirect_info, cpu); - /* Avoid polluting remote cacheline due to writes if - * not needed. Once we pass this test, we need the - * cmpxchg() to make sure it hasn't been changed in - * the meantime by remote CPU. - */ - if (unlikely(READ_ONCE(ri->map) == map)) - cmpxchg(&ri->map, map, NULL); - } -} - DEFINE_STATIC_KEY_FALSE(bpf_master_redirect_enabled_key); EXPORT_SYMBOL_GPL(bpf_master_redirect_enabled_key);
u32 xdp_master_redirect(struct xdp_buff *xdp) { + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); struct net_device *master, *slave; - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
master = netdev_master_upper_dev_get_rcu(xdp->rxq->dev); slave = master->netdev_ops->ndo_xdp_get_xmit_slave(master, xdp); @@ -4392,7 +4372,7 @@ static __always_inline int __xdp_do_redirect_frame(struct bpf_redirect_info *ri, map = READ_ONCE(ri->map);
/* The map pointer is cleared when the map is being torn - * down by bpf_clear_redirect_map() + * down by dev_map_free() */ if (unlikely(!map)) { err = -ENOENT; @@ -4437,7 +4417,7 @@ static __always_inline int __xdp_do_redirect_frame(struct bpf_redirect_info *ri, int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); enum bpf_map_type map_type = ri->map_type;
if (map_type == BPF_MAP_TYPE_XSKMAP) @@ -4451,7 +4431,7 @@ EXPORT_SYMBOL_GPL(xdp_do_redirect); int xdp_do_redirect_frame(struct net_device *dev, struct xdp_buff *xdp, struct xdp_frame *xdpf, struct bpf_prog *xdp_prog) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); enum bpf_map_type map_type = ri->map_type;
if (map_type == BPF_MAP_TYPE_XSKMAP) @@ -4468,7 +4448,7 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, enum bpf_map_type map_type, u32 map_id, u32 flags) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); struct bpf_map *map; int err;
@@ -4480,7 +4460,7 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, map = READ_ONCE(ri->map);
/* The map pointer is cleared when the map is being torn - * down by bpf_clear_redirect_map() + * down by dev_map_free() */ if (unlikely(!map)) { err = -ENOENT; @@ -4522,7 +4502,7 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri(); enum bpf_map_type map_type = ri->map_type; void *fwd = ri->tgt_value; u32 map_id = ri->map_id; @@ -4558,7 +4538,7 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
BPF_CALL_2(bpf_xdp_redirect, u32, ifindex, u64, flags) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();
if (unlikely(flags)) return XDP_ABORTED; diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c index 4a0797f0a154b29ec9d89d1ea7c7461efd4332eb..5350dce2e52d68b1bf133178474230090a856cff 100644 --- a/net/core/lwt_bpf.c +++ b/net/core/lwt_bpf.c @@ -38,6 +38,7 @@ static inline struct bpf_lwt *bpf_lwt_lwtunnel(struct lwtunnel_state *lwt) static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt, struct dst_entry *dst, bool can_redirect) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int ret;
/* Migration disable and BH disable are needed to protect per-cpu @@ -45,6 +46,7 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt, */ migrate_disable(); local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); bpf_compute_data_pointers(skb); ret = bpf_prog_run_save_cb(lwt->prog, skb);
@@ -77,6 +79,7 @@ static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt, break; }
+ bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); migrate_enable();
[ Sasha's backport helper bot ]
Hi,
Summary of potential issues:

⚠️ Found follow-up fixes in mainline
The upstream commit SHA1 provided is correct: 401cb7dae8130fd34eb84648e02ab4c506df7d5e
WARNING: Author mismatch between patch and upstream commit:
Backport author: Ricardo Cañuelo Navarro <rcn@igalia.com>
Commit author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Found fixes commits:
55e802468e1d sfc: Don't invoke xdp_do_flush() from netpoll.
157f29152b61 netkit: Assign missing bpf_net_context
fecef4cd42c6 tun: Assign missing bpf_net_context.
Note: The patch differs from the upstream commit: --- 1: 401cb7dae8130 ! 1: fe2023f9238ce net: Reference bpf_redirect_info via task_struct on PREEMPT_RT. @@ Metadata ## Commit message ## net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.
+ [ Upstream commit 401cb7dae8130fd34eb84648e02ab4c506df7d5e ] + The XDP redirect process is two staged: - bpf_prog_run_xdp() is invoked to run a eBPF program which inspects the packet and makes decisions. While doing that, the per-CPU variable @@ Commit message Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Link: https://patch.msgid.link/20240620132727.660738-15-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org + [rcn: the backport addresses the differences in + net/core/dev.c:napi_threaded_poll(), busy_poll_stop(), napi_busy_loop() + and net_rx_action() between upstream and stable. This allows the patch + to be applied without bringing additional dependencies, such as + dad6b9770263 ("net: Allow to use SMP threads for backlog NAPI."). These + changes allow applying the patch on stable without bringing the whole + related series. + The rest of the changes are made to adapt context lines and are + unrelated to the purpose of the patch.] + Signed-off-by: Ricardo Cañuelo Navarro rcn@igalia.com
## include/linux/filter.h ## @@ include/linux/filter.h: struct bpf_nh_params { @@ net/bpf/test_run.c: static int bpf_test_run(struct bpf_prog *prog, void *ctx, u3
## net/core/dev.c ## @@ net/core/dev.c: sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_prev, int *ret, + struct net_device *orig_dev, bool *another) { struct bpf_mprog_entry *entry = rcu_dereference_bh(skb->dev->tcx_ingress); - enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_INGRESS; + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int sch_ret;
@@ net/core/dev.c: sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_ + bpf_net_ctx_clear(bpf_net_ctx); return NULL; case TC_ACT_SHOT: - kfree_skb_reason(skb, drop_reason); + kfree_skb_reason(skb, SKB_DROP_REASON_TC_INGRESS); *ret = NET_RX_DROP; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; @@ net/core/dev.c: sch_handle_ingress(struct sk_buff *skb, struct packet_type **pt_
return skb; } -@@ net/core/dev.c: sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) +@@ net/core/dev.c: static __always_inline struct sk_buff * + sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev) { struct bpf_mprog_entry *entry = rcu_dereference_bh(dev->tcx_egress); - enum skb_drop_reason drop_reason = SKB_DROP_REASON_TC_EGRESS; + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int sch_ret;
@@ net/core/dev.c: sch_handle_egress(struct sk_buff *skb, int *ret, struct net_devi + bpf_net_ctx_clear(bpf_net_ctx); return NULL; case TC_ACT_SHOT: - kfree_skb_reason(skb, drop_reason); + kfree_skb_reason(skb, SKB_DROP_REASON_TC_EGRESS); *ret = NET_XMIT_DROP; + bpf_net_ctx_clear(bpf_net_ctx); return NULL; @@ net/core/dev.c: sch_handle_egress(struct sk_buff *skb, int *ret, struct net_devi
return skb; } -@@ net/core/dev.c: enum { - static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, - unsigned flags, u16 budget) +@@ net/core/dev.c: static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) + static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool prefer_busy_poll, + u16 budget) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; bool skip_schedule = false; unsigned long timeout; int rc; -@@ net/core/dev.c: static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, +@@ net/core/dev.c: static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool clear_bit(NAPI_STATE_IN_BUSY_POLL, &napi->state);
local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
- if (flags & NAPI_F_PREFER_BUSY_POLL) { + if (prefer_busy_poll) { napi->defer_hard_irqs_count = READ_ONCE(napi->dev->napi_defer_hard_irqs); -@@ net/core/dev.c: static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, +@@ net/core/dev.c: static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool netpoll_poll_unlock(have_poll_lock); if (rc == budget) __busy_poll_stop(napi, skip_schedule); @@ net/core/dev.c: static void busy_poll_stop(struct napi_struct *napi, void *have_ local_bh_enable(); }
-@@ net/core/dev.c: static void __napi_busy_loop(unsigned int napi_id, +@@ net/core/dev.c: void napi_busy_loop(unsigned int napi_id, { unsigned long start_time = loop_end ? busy_loop_current_time() : 0; int (*napi_poll)(struct napi_struct *napi, int budget); @@ net/core/dev.c: static void __napi_busy_loop(unsigned int napi_id, void *have_poll_lock = NULL; struct napi_struct *napi;
-@@ net/core/dev.c: static void __napi_busy_loop(unsigned int napi_id, +@@ net/core/dev.c: void napi_busy_loop(unsigned int napi_id, int work = 0;
local_bh_disable(); @@ net/core/dev.c: static void __napi_busy_loop(unsigned int napi_id, if (!napi_poll) { unsigned long val = READ_ONCE(napi->state);
-@@ net/core/dev.c: static void __napi_busy_loop(unsigned int napi_id, +@@ net/core/dev.c: void napi_busy_loop(unsigned int napi_id, + if (work > 0) __NET_ADD_STATS(dev_net(napi->dev), LINUX_MIB_BUSYPOLLRXPACKETS, work); - skb_defer_free_flush(this_cpu_ptr(&softnet_data)); + bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable();
if (!loop_end || loop_end(loop_end_arg, start_time)) -@@ net/core/dev.c: static int napi_thread_wait(struct napi_struct *napi) +@@ net/core/dev.c: static void skb_defer_free_flush(struct softnet_data *sd)
- static void napi_threaded_poll_loop(struct napi_struct *napi) + static int napi_threaded_poll(void *data) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; + struct napi_struct *napi = data; struct softnet_data *sd; - unsigned long last_qs = jiffies; - -@@ net/core/dev.c: static void napi_threaded_poll_loop(struct napi_struct *napi) - void *have; - - local_bh_disable(); -+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); -+ - sd = this_cpu_ptr(&softnet_data); - sd->in_napi_threaded_poll = true; - -@@ net/core/dev.c: static void napi_threaded_poll_loop(struct napi_struct *napi) - net_rps_action_and_irq_enable(sd); - } - skb_defer_free_flush(sd); -+ bpf_net_ctx_clear(bpf_net_ctx); - local_bh_enable(); - - if (!repoll) + void *have; +@@ net/core/dev.c: static int napi_threaded_poll(void *data) + bool repoll = false; + + local_bh_disable(); ++ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); + sd = this_cpu_ptr(&softnet_data); + sd->in_napi_threaded_poll = true; + +@@ net/core/dev.c: static int napi_threaded_poll(void *data) + net_rps_action_and_irq_enable(sd); + } + skb_defer_free_flush(sd); ++ bpf_net_ctx_clear(bpf_net_ctx); + local_bh_enable(); + + if (!repoll) @@ net/core/dev.c: static __latent_entropy void net_rx_action(struct softirq_action *h) struct softnet_data *sd = this_cpu_ptr(&softnet_data); unsigned long time_limit = jiffies + - usecs_to_jiffies(READ_ONCE(net_hotdata.netdev_budget_usecs)); + usecs_to_jiffies(READ_ONCE(netdev_budget_usecs)); + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; - int budget = READ_ONCE(net_hotdata.netdev_budget); + int budget = READ_ONCE(netdev_budget); LIST_HEAD(list); LIST_HEAD(repoll);
@@ net/core/filter.c: static const struct bpf_func_proto bpf_clone_redirect_proto = -DEFINE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); -EXPORT_PER_CPU_SYMBOL_GPL(bpf_redirect_info); - - static struct net_device *skb_get_peer_dev(struct net_device *dev) - { - const struct net_device_ops *ops = dev->netdev_ops; -@@ net/core/filter.c: static struct net_device *skb_get_peer_dev(struct net_device *dev) - int skb_do_redirect(struct sk_buff *skb) { - struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); @@ net/core/filter.c: static const struct bpf_func_proto bpf_redirect_peer_proto =
if (unlikely((plen && plen < sizeof(*params)) || flags)) return TC_ACT_SHOT; -@@ net/core/filter.c: void xdp_do_check_flushed(struct napi_struct *napi) +@@ net/core/filter.c: void xdp_do_flush(void) } - #endif + EXPORT_SYMBOL_GPL(xdp_do_flush);
-void bpf_clear_redirect_map(struct bpf_map *map) -{ @@ net/core/lwt_bpf.c: static inline struct bpf_lwt *bpf_lwt_lwtunnel(struct lwtunn + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; int ret;
- /* Disabling BH is needed to protect per-CPU bpf_redirect_info between - * BPF prog and skb_do_redirect(). + /* Migration disable and BH disable are needed to protect per-cpu +@@ net/core/lwt_bpf.c: static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_prog *lwt, */ + migrate_disable(); local_bh_disable(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); bpf_compute_data_pointers(skb); @@ net/core/lwt_bpf.c: static int run_lwt_bpf(struct sk_buff *skb, struct bpf_lwt_p
+ bpf_net_ctx_clear(bpf_net_ctx); local_bh_enable(); + migrate_enable();
- return ret; ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ Upstream commit fecef4cd42c689a200bdd39e6fffa71475904bc1 ]
During the introduction of struct bpf_net_context handling for XDP-redirect, the tun driver has been missed. Jakub also pointed out that there is another call chain to do_xdp_generic() originating from netif_receive_skb() and drivers may use it outside of the NAPI context.
Set the bpf_net_context before invoking the BPF XDP program within the tun driver. Set the bpf_net_context also in do_xdp_generic() if an XDP program is available.
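The fix follows the same bracketing pattern as the previous patch, now around the driver's own XDP invocation. A minimal sketch, with example_tun_run_xdp() as a hypothetical stand-in for the tun paths patched below:

static void example_tun_run_xdp(struct bpf_prog *xdp_prog, struct xdp_buff *xdp)
{
	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
	u32 act;

	local_bh_disable();
	rcu_read_lock();
	/* tun runs XDP from process context (tun_build_skb()/sendmsg),
	 * so no NAPI loop has set current->bpf_net_context beforehand.
	 */
	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);

	act = bpf_prog_run_xdp(xdp_prog, xdp);
	/* handling of act (XDP_REDIRECT and friends) elided */

	bpf_net_ctx_clear(bpf_net_ctx);
	rcu_read_unlock();
	local_bh_enable();
}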
Reported-by: syzbot+0b5c75599f1d872bea6f@syzkaller.appspotmail.com
Reported-by: syzbot+5ae46b237278e2369cac@syzkaller.appspotmail.com
Reported-by: syzbot+c1e04a422bbc0f0f2921@syzkaller.appspotmail.com
Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240704144815.j8xQda5r@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[rcn: Backport the patch to address minor differences in the context lines. These differences are introduced by upstream commits 4d2bb0bfe874 ("xdp: rely on skb pointer reference in do_xdp_generic and netif_receive_generic_xdp") and 7cd1107f48e2a ("bpf, xdp: constify some bpf_prog * function arguments"), which change the parameters in do_xdp_generic() and in calls to netif_receive_generic_xdp(), kfree_skb_reason() and generic_xdp_tx(). These changes aren't significant to the purpose of the patch.]
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
---
 drivers/net/tun.c | 7 +++++++
 net/core/dev.c    | 5 +++++
 2 files changed, 12 insertions(+)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index c1fdf8804d60b6776b7fb78c41ac041b6aeb5a88..f28f57abe59dc1afca310de8f7a0a69107ec33db 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1668,6 +1668,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, int len, int *skb_xdp) { struct page_frag *alloc_frag = ¤t->task_frag; + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; struct bpf_prog *xdp_prog; int buflen = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); char *buf; @@ -1707,6 +1708,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
local_bh_disable(); rcu_read_lock(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); xdp_prog = rcu_dereference(tun->xdp_prog); if (xdp_prog) { struct xdp_buff xdp; @@ -1735,12 +1737,14 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, pad = xdp.data - xdp.data_hard_start; len = xdp.data_end - xdp.data; } + bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock(); local_bh_enable();
return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad);
out: + bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock(); local_bh_enable(); return NULL; @@ -2577,6 +2581,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
if (m->msg_controllen == sizeof(struct tun_msg_ctl) && ctl && ctl->type == TUN_MSG_PTR) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; struct tun_page tpage; int n = ctl->num; int flush = 0, queued = 0; @@ -2585,6 +2590,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
local_bh_disable(); rcu_read_lock(); + bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
for (i = 0; i < n; i++) { xdp = &((struct xdp_buff *)ctl->ptr)[i]; @@ -2599,6 +2605,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) if (tfile->napi_enabled && queued > 0) napi_schedule(&tfile->napi);
+ bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock(); local_bh_enable();
diff --git a/net/core/dev.c b/net/core/dev.c index 24460c630d3cdfa2490d15969b7ab62b6ce42003..a6a63f5b6b8364d2d24553180d4e2138b13614b9 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5051,11 +5051,14 @@ static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; + if (xdp_prog) { struct xdp_buff xdp; u32 act; int err;
+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); act = netif_receive_generic_xdp(skb, &xdp, xdp_prog); if (act != XDP_PASS) { switch (act) { @@ -5069,11 +5072,13 @@ int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) generic_xdp_tx(skb, xdp_prog); break; } + bpf_net_ctx_clear(bpf_net_ctx); return XDP_DROP; } } return XDP_PASS; out_redir: + bpf_net_ctx_clear(bpf_net_ctx); kfree_skb_reason(skb, SKB_DROP_REASON_XDP); return XDP_DROP; }
[ Sasha's backport helper bot ]
Hi,
Summary of potential issues:

ℹ️ This is part 2/3 of a series
⚠️ Found follow-up fixes in mainline
The upstream commit SHA1 provided is correct: fecef4cd42c689a200bdd39e6fffa71475904bc1
WARNING: Author mismatch between patch and upstream commit:
Backport author: Ricardo Cañuelo Navarro <rcn@igalia.com>
Commit author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Found fixes commits:
9da49aa80d68 tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()
Note: The patch differs from the upstream commit: --- 1: fecef4cd42c68 ! 1: d614f795c4a67 tun: Assign missing bpf_net_context. @@ Metadata ## Commit message ## tun: Assign missing bpf_net_context.
+ [ Upstream commit fecef4cd42c689a200bdd39e6fffa71475904bc1 ] + During the introduction of struct bpf_net_context handling for XDP-redirect, the tun driver has been missed. Jakub also pointed out that there is another call chain to @@ Commit message Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Link: https://patch.msgid.link/20240704144815.j8xQda5r@linutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org + [rcn: Backport the patch to address minor differences in the context lines. + These differences are introduced by upstream commits + 4d2bb0bfe874 ("xdp: rely on skb pointer reference in do_xdp_generic and + netif_receive_generic_xdp") and 7cd1107f48e2a ("bpf, xdp: constify some + bpf_prog * function arguments"), which change the parameters in + do_xdp_generic() and in calls to netif_receive_generic_xdp(), + kfree_skb_reason() and generic_xdp_tx(). These changes aren't + significant to the purpose of the patch.] + Signed-off-by: Ricardo Cañuelo Navarro rcn@igalia.com
## drivers/net/tun.c ## @@ drivers/net/tun.c: static struct sk_buff *tun_build_skb(struct tun_struct *tun, @@ drivers/net/tun.c: static int tun_sendmsg(struct socket *sock, struct msghdr *m, ## net/core/dev.c ## @@ net/core/dev.c: static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
- int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) + int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; + @@ net/core/dev.c: static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key); int err;
+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); - act = netif_receive_generic_xdp(pskb, &xdp, xdp_prog); + act = netif_receive_generic_xdp(skb, &xdp, xdp_prog); if (act != XDP_PASS) { switch (act) { -@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) - generic_xdp_tx(*pskb, xdp_prog); +@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) + generic_xdp_tx(skb, xdp_prog); break; } + bpf_net_ctx_clear(bpf_net_ctx); @@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **p return XDP_PASS; out_redir: + bpf_net_ctx_clear(bpf_net_ctx); - kfree_skb_reason(*pskb, SKB_DROP_REASON_XDP); + kfree_skb_reason(skb, SKB_DROP_REASON_XDP); return XDP_DROP; } ---
NOTE: These results are for this patch alone. Full series testing will be performed when all parts are received.
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
[ Sasha's backport helper bot ]
Hi,
Summary of potential issues: ℹ️ This is part 2/3 of a series ⚠️ Found follow-up fixes in mainline
The upstream commit SHA1 provided is correct: fecef4cd42c689a200bdd39e6fffa71475904bc1
WARNING: Author mismatch between patch and upstream commit: Backport author: =?utf-8?q?Ricardo_Ca=C3=B1uelo_Navarro?=rcn@igalia.com Commit author: Sebastian Andrzej Siewiorbigeasy@linutronix.de
Found fixes commits: 9da49aa80d68 tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()
Note: The patch differs from the upstream commit: --- 1: fecef4cd42c68 ! 1: 54a183ca55664 tun: Assign missing bpf_net_context. @@ Metadata ## Commit message ## tun: Assign missing bpf_net_context.
+ [ Upstream commit fecef4cd42c689a200bdd39e6fffa71475904bc1 ] + During the introduction of struct bpf_net_context handling for XDP-redirect, the tun driver has been missed. Jakub also pointed out that there is another call chain to @@ Commit message Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Link: https://patch.msgid.link/20240704144815.j8xQda5r@linutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org + [rcn: Backport the patch to address minor differences in the context lines. + These differences are introduced by upstream commits + 4d2bb0bfe874 ("xdp: rely on skb pointer reference in do_xdp_generic and + netif_receive_generic_xdp") and 7cd1107f48e2a ("bpf, xdp: constify some + bpf_prog * function arguments"), which change the parameters in + do_xdp_generic() and in calls to netif_receive_generic_xdp(), + kfree_skb_reason() and generic_xdp_tx(). These changes aren't + significant to the purpose of the patch.] + Signed-off-by: Ricardo Cañuelo Navarro rcn@igalia.com
## drivers/net/tun.c ## @@ drivers/net/tun.c: static struct sk_buff *tun_build_skb(struct tun_struct *tun, @@ drivers/net/tun.c: static int tun_sendmsg(struct socket *sock, struct msghdr *m, ## net/core/dev.c ## @@ net/core/dev.c: static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key);
- int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) + int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) { + struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx; + @@ net/core/dev.c: static DEFINE_STATIC_KEY_FALSE(generic_xdp_needed_key); int err;
+ bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx); - act = netif_receive_generic_xdp(pskb, &xdp, xdp_prog); + act = netif_receive_generic_xdp(skb, &xdp, xdp_prog); if (act != XDP_PASS) { switch (act) { -@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb) - generic_xdp_tx(*pskb, xdp_prog); +@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb) + generic_xdp_tx(skb, xdp_prog); break; } + bpf_net_ctx_clear(bpf_net_ctx); @@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **p return XDP_PASS; out_redir: + bpf_net_ctx_clear(bpf_net_ctx); - kfree_skb_reason(*pskb, SKB_DROP_REASON_XDP); + kfree_skb_reason(skb, SKB_DROP_REASON_XDP); return XDP_DROP; } ---
NOTE: These results are for this patch alone. Full series testing will be performed when all parts are received.
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
From: Jeongjun Park <aha310510@gmail.com>
[ Upstream commit 9da49aa80d686582bc3a027112a30484c9be6b6e ]
There are cases where do_xdp_generic() returns without clearing the bpf_net_context it set. This causes various memory corruptions, so the missing bpf_net_ctx_clear() must be added.
Reported-by: syzbot+44623300f057a28baf1e@syzkaller.appspotmail.com
Fixes: fecef4cd42c6 ("tun: Assign missing bpf_net_context.")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reported-by: syzbot+3c2b6d5d4bec3b904933@syzkaller.appspotmail.com
Reported-by: syzbot+707d98c8649695eaf329@syzkaller.appspotmail.com
Reported-by: syzbot+c226757eb784a9da3e8b@syzkaller.appspotmail.com
Reported-by: syzbot+61a1cfc2b6632363d319@syzkaller.appspotmail.com
Reported-by: syzbot+709e4c85c904bcd62735@syzkaller.appspotmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
[rcn: trivial backport edit to adapt the patch context.]
Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
---
 net/core/dev.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/net/core/dev.c b/net/core/dev.c
index a6a63f5b6b8364d2d24553180d4e2138b13614b9..3d1bf7be1ab1cc74f50a2e5b8bf05d21def3c5a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5075,6 +5075,7 @@ int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
 			bpf_net_ctx_clear(bpf_net_ctx);
 			return XDP_DROP;
 		}
+		bpf_net_ctx_clear(bpf_net_ctx);
 	}
 	return XDP_PASS;
 out_redir:
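For reference, this is roughly what do_xdp_generic() ends up looking like on linux-6.6.y once all three backports are applied. This is a sketch assembled from the hunks quoted in this thread, condensed to the context handling, not a verbatim copy of the stable tree:

int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
{
	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;

	if (xdp_prog) {
		struct xdp_buff xdp;
		u32 act;
		int err;

		/* Publish the on-stack context before running the program. */
		bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);
		act = netif_receive_generic_xdp(skb, &xdp, xdp_prog);
		if (act != XDP_PASS) {
			switch (act) {
			case XDP_REDIRECT:
				err = xdp_do_generic_redirect(skb->dev, skb,
							      &xdp, xdp_prog);
				if (err)
					goto out_redir;
				break;
			case XDP_TX:
				generic_xdp_tx(skb, xdp_prog);
				break;
			}
			bpf_net_ctx_clear(bpf_net_ctx);
			return XDP_DROP;
		}
		/* The clear on this XDP_PASS leg is the one 9da49aa80d68 adds. */
		bpf_net_ctx_clear(bpf_net_ctx);
	}
	return XDP_PASS;
out_redir:
	bpf_net_ctx_clear(bpf_net_ctx);
	kfree_skb_reason(skb, SKB_DROP_REASON_XDP);
	return XDP_DROP;
}

With this shape, every exit path clears the context, so no dangling pointer to the stack frame is left behind in task_struct.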
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 9da49aa80d686582bc3a027112a30484c9be6b6e
WARNING: Author mismatch between patch and upstream commit:
Backport author: Ricardo Cañuelo Navarro <rcn@igalia.com>
Commit author: Jeongjun Park <aha310510@gmail.com>
Note: The patch differs from the upstream commit:
---
1:  9da49aa80d686 ! 1:  f295380a77d57 tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()
    @@ Metadata
     ## Commit message ##
        tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()

    +    [ Upstream commit 9da49aa80d686582bc3a027112a30484c9be6b6e ]
    +
        There are cases where do_xdp_generic returns bpf_net_context without
        clearing it. This causes various memory corruptions, so the missing
        bpf_net_ctx_clear must be added.
    @@ Commit message
        Reported-by: syzbot+61a1cfc2b6632363d319@syzkaller.appspotmail.com
        Reported-by: syzbot+709e4c85c904bcd62735@syzkaller.appspotmail.com
        Signed-off-by: David S. Miller <davem@davemloft.net>
    +    [rcn: trivial backport edit to adapt the patch context.]
    +    Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>

     ## net/core/dev.c ##
    -@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb)
    +@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
     		bpf_net_ctx_clear(bpf_net_ctx);
     		return XDP_DROP;
     	}
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: 9da49aa80d686582bc3a027112a30484c9be6b6e
WARNING: Author mismatch between patch and upstream commit:
Backport author: Ricardo Cañuelo Navarro <rcn@igalia.com>
Commit author: Jeongjun Park <aha310510@gmail.com>
Note: The patch differs from the upstream commit:
---
1:  9da49aa80d686 ! 1:  c6d34f018cdee tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()
    @@ Metadata
     ## Commit message ##
        tun: Add missing bpf_net_ctx_clear() in do_xdp_generic()

    +    [ Upstream commit 9da49aa80d686582bc3a027112a30484c9be6b6e ]
    +
        There are cases where do_xdp_generic returns bpf_net_context without
        clearing it. This causes various memory corruptions, so the missing
        bpf_net_ctx_clear must be added.
    @@ Commit message
        Reported-by: syzbot+61a1cfc2b6632363d319@syzkaller.appspotmail.com
        Reported-by: syzbot+709e4c85c904bcd62735@syzkaller.appspotmail.com
        Signed-off-by: David S. Miller <davem@davemloft.net>
    +    [rcn: trivial backport edit to adapt the patch context.]
    +    Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>

     ## net/core/dev.c ##
    -@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff **pskb)
    +@@ net/core/dev.c: int do_xdp_generic(struct bpf_prog *xdp_prog, struct sk_buff *skb)
     		bpf_net_ctx_clear(bpf_net_ctx);
     		return XDP_DROP;
     	}
---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.6.y        | Success     | Success    |
On 2025-02-26 10:39:04 [+0100], Ricardo Cañuelo Navarro wrote:
> A private syzbot instance reported "KASAN: slab-use-after-free Read in dev_map_enqueue" under some runtime environments.
>
> Upstream patch fecef4cd42c6 ("tun: Assign missing bpf_net_context") fixes the issue. In order to bring this patch to stable v6.6 it's also necessary to bring upstream patch 401cb7dae813 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") as a dependency.
Just to be clear: A problem exists in v6.6 independent of my doing and 401cb7dae813 happens to fix it? The commit fecef4cd42c6 is a fixup for 401cb7dae813.
If so, can you share syzbot's reproducer and/or backtrace/report?
Sebastian
Hi Sebastian, thanks for answering.
On Wed, Feb 26 2025 at 17:31:58, Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> Just to be clear: A problem exists in v6.6 independent of my doing and 401cb7dae813 happens to fix it? The commit fecef4cd42c6 is a fixup for 401cb7dae813.
>
> If so, can you share syzbot's reproducer and/or backtrace/report?
Actually, fecef4cd42c6 is the one that fixes the issue, but 401cb7dae813 is needed for it to work: it implements the bpf_net_ctx_*() API, and fecef4cd42c6 then uses bpf_net_ctx_set() and bpf_net_ctx_clear() in the tun driver.
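To illustrate the pattern: this is a minimal sketch of how fecef4cd42c6 applies the API in the tun entry points (tun_build_skb(), tun_sendmsg()). The function shown here is hypothetical and the driver logic is elided; only bpf_net_ctx_set()/bpf_net_ctx_clear() come from the actual patches:

static int tun_xdp_entry_sketch(struct tun_struct *tun, struct sk_buff *skb)
{
	struct bpf_net_context __bpf_net_ctx, *bpf_net_ctx;
	int ret = 0;

	/* Register the on-stack context via current->bpf_net_context so
	 * that the XDP redirect machinery finds valid, task-local
	 * bpf_redirect_info on this call chain. */
	bpf_net_ctx = bpf_net_ctx_set(&__bpf_net_ctx);

	/* ... run the XDP program here, possibly returning XDP_REDIRECT ... */

	/* Unregister on every exit path; otherwise a pointer to this
	 * soon-to-die stack frame stays behind in task_struct. */
	bpf_net_ctx_clear(bpf_net_ctx);
	return ret;
}

Without the set()/clear() pair in these entry points, the redirect machinery has no valid context on the tun call chains, which is the gap fecef4cd42c6 closes.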
Here's the syzkaller report on stable v6.6: https://pastebin.com/yD0zVD0c
and here's the C reproducer: https://pastebin.com/d415fUDU
Cheers, Ricardo
On 2025-02-26 10:39:04 [+0100], Ricardo Cañuelo Navarro wrote:
> Both patches needed some manual work in order to be applied on stable, mostly related to changes in the context lines:
Did the same rebase and came to the same set of patches, so:

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Based on the KASAN report Ricardo provided, it is the dst->dev dereference in dev_map_enqueue(). This originates from bpf_redirect_info::tgt_value, which is a per-CPU variable assigned in __bpf_xdp_redirect_map(). Since it is claimed that these patches fix it, the bot must have found a way to share the per-CPU data with another instance/thread. Haven't figured out how…
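For readers following along, the hazard being described looks roughly like this. This is a simplified model of the pre-401cb7dae813 state; the helper name is hypothetical (the real code lives in net/core/filter.c and kernel/bpf/devmap.c), and the interleaving is only one conceivable way the per-CPU slot could be observed stale, since the exact sharing path was not pinned down in this thread:

/* One bpf_redirect_info per CPU, shared by everything that runs there. */
DEFINE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);

/* Context A: an XDP program returns XDP_REDIRECT into a devmap. */
static void record_redirect_sketch(void *devmap_entry)
{
	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);

	ri->tgt_value = devmap_entry;	/* what __bpf_xdp_redirect_map() stores */
}

/* Context B, later on the same CPU: if the flush/enqueue path runs with
 * A's leftover tgt_value after the devmap entry (and its netdev) has been
 * freed, the dst->dev read in dev_map_enqueue() is exactly the reported
 * slab-use-after-free. Moving the state into an on-stack, task-referenced
 * bpf_net_context removes this cross-context sharing. */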
> Signed-off-by: Ricardo Cañuelo Navarro <rcn@igalia.com>
Sebastian