Benjamin Tissoires <benjamin.tissoires@redhat.com> writes:
On Fri, Feb 9, 2024 at 4:42 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
Benjamin Tissoires <bentiss@kernel.org> writes:
[Putting this as an RFC because I'm pretty sure I'm not doing things correctly at the BPF level.] [Also using bpf-next as the base tree, as there will be conflicting changes otherwise.]
Ideally I'd like to have something similar to bpf_timers, but not in soft IRQ context. So I'm emulating this with a sleepable bpf_tail_call() (see "HID: bpf: allow to defer work in a delayed workqueue").
Why implement a new mechanism? Sounds like what you need is essentially the bpf_timer functionality, just running in a different context, right?
Heh, that's exactly why I sent this as an RFC :)
So yes, the bpf_timer approach is cleaner, but I need it in a workqueue, as an hrtimer callback running in softIRQ context would prevent me from using kzalloc() and waiting for the device.
Right, makes sense.
So why not just add a flag to the timer setup that controls the callback context? I've been toying with something similar for restarting XDP TX for my queueing patch series (though I'm not sure if this will actually end up being needed in the end):
https://git.kernel.org/pub/scm/linux/kernel/git/toke/linux.git/commit/?h=xdp...
Oh, nice. Good idea. But would it be OK to have a "timer-like" object that actually defers the job to a workqueue instead of using an hrtimer?
That's conceptually still a timer, though, isn't it? I.e., it's a mechanism whereby you specify a callback and a delay, and bpf_timer ensures that your callback is called after that delay. IMO it's totally congruent with that API to be able to specify a different execution context as part of the timer setup.
As for how to implement it, I suspect the easiest may be something similar to what the patch I linked above does: keep the hrtimer, and just have a different (kernel) callback function when the timer fires which does an immediate schedule_work() (without the _delayed) and then runs the BPF callback in that workqueue. I.e., keep the delay handling the way the existing bpf_timer implementation does it, and just add an indirection to start the workqueue in the kernel dispatch code.
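To make that indirection a bit more concrete, here is a minimal userspace model in C. It is only a sketch of the control flow and not kernel code: every name in it (MODEL_TIMER_F_SLEEPABLE, model_timer_fire(), model_schedule_work(), model_run_pending()) is made up for illustration, the "workqueue" is a plain ring buffer, and the delay handling is left out entirely because the idea is that the existing hrtimer keeps doing that part. model_timer_fire() stands in for the kernel callback that runs when the hrtimer expires.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag: run the callback from a work queue, not at expiry. */
#define MODEL_TIMER_F_SLEEPABLE 0x1u
#define MODEL_WQ_DEPTH 8

typedef void (*model_cb_t)(void *ctx);

struct model_work {
	model_cb_t cb;
	void *ctx;
};

/* Toy stand-in for a workqueue: a fixed-size ring buffer of pending work. */
struct model_workqueue {
	struct model_work items[MODEL_WQ_DEPTH];
	size_t head;  /* next item to run */
	size_t tail;  /* next free slot */
	size_t count; /* number of pending items */
};

struct model_timer {
	unsigned int flags;
	model_cb_t cb;              /* stands in for the BPF callback */
	void *ctx;
	struct model_workqueue *wq; /* only used with the sleepable flag */
};

/* Stands in for an immediate schedule_work(): queue the work, do not run it. */
void model_schedule_work(struct model_workqueue *wq, model_cb_t cb, void *ctx)
{
	assert(wq->count < MODEL_WQ_DEPTH);
	wq->items[wq->tail].cb = cb;
	wq->items[wq->tail].ctx = ctx;
	wq->tail = (wq->tail + 1) % MODEL_WQ_DEPTH;
	wq->count++;
}

/* Stands in for the kernel-side function called when the timer expires. */
void model_timer_fire(struct model_timer *t)
{
	if (t->flags & MODEL_TIMER_F_SLEEPABLE)
		model_schedule_work(t->wq, t->cb, t->ctx); /* defer to the worker */
	else
		t->cb(t->ctx); /* existing behaviour: run in the expiry context */
}

/* Stands in for the worker thread: drain the queue, return how many ran. */
size_t model_run_pending(struct model_workqueue *wq)
{
	size_t ran = 0;

	while (wq->count > 0) {
		struct model_work w = wq->items[wq->head];

		wq->head = (wq->head + 1) % MODEL_WQ_DEPTH;
		wq->count--;
		w.cb(w.ctx);
		ran++;
	}
	return ran;
}

/* Example callback: counts how many times it has been invoked. */
void count_calls(void *ctx)
{
	(*(int *)ctx)++;
}
```

With the flag clear, model_timer_fire() invokes count_calls() right away. With the flag set, the callback only runs once model_run_pending() drains the queue, which is the point where a real worker would be allowed to sleep.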
I thought I would have to rewrite the entire bpf_timer approach without the softIRQ, but if I can just add a new flag, that will make things way simpler for me.
IMO that would be fine. You may want to wait for the maintainers to chime in before going down this route, though :)
This, however, raises another issue if I were to use bpf_timers: the HID-BPF kfuncs would not be available, as they are only exposed to tracing prog types. When I tried to call them from a bpf_timer callback (in softIRQ), they were not available.
IIUC, the bpf_timer callback is just a function (subprog) from the verifier PoV, so it is verified as whatever program type is creating the timer. So in other words, as long as you set up the timer from inside a tracing prog type, you should have access to all the same kfuncs, I think?
-Toke