On Mon, Dec 9, 2024 at 2:45 PM Andrii Nakryiko andrii.nakryiko@gmail.com wrote:
Ok, weeding through the perf/uprobe plumbing for BPF, I think we avoid this problem with uprobe BPF link because uprobe_unregister_sync() waits for RCU Tasks Trace GP, and so once we finish uprobe unregistration, we have a guarantee that there is no more uprobe that might dereference our BPF program. (I might have thought about this problem when fixing BPF link for sleepable tracepoints, but I missed the legacy perf_event attach/detach case).
With legacy perf event perf_event_detach_bpf_prog() we don't do any of that, we just NULL out pointer and do bpf_prog_put(), not caring whether this is uprobe, kprobe, or tracepoint...
So one way to solve this is to either teach perf_event_detach_bpf_prog() to delay bpf_prog_put() until after RCU Tasks Trace GP (which is what we do with bpf_prog_array, but not the program itself),
since this is a legacy prog detach api I would just add sync_rcu_tt there. It's a backportable one line change.
or add prog->aux->sleepable_hook flag in addition to prog->aux->sleepable, and then inside bpf_prog_put() check (prog->aux->sleepable || prog->aux->sleepable_hook) and do RCU Tasks Trace delay (in addition to normal call_rcu()).
That sounds like more work and if we introduce sleepable_hook we would better generalize it and rely on it everywhere. Which makes it even more work and certainly not for stable.
Third alternative would be to have something like bpf_prog_put_sleepable() (just like we have bpf_prog_array_free_sleepable()), where this would do additional call_rcu_tasks_trace() even if BPF program itself isn't sleepable.
Sounds like less work than 2, but conceptually it's the same as 1. Just call_rcu_tt vs sync_rcu_tt. And we can wait just fine in that code path. So I'd go with 1.