On Feb 13 2024, Kumar Kartikeya Dwivedi wrote:
On Tue, 13 Feb 2024 at 18:46, Benjamin Tissoires bentiss@kernel.org wrote:
On Feb 12 2024, Alexei Starovoitov wrote:
On Mon, Feb 12, 2024 at 10:21 AM Benjamin Tissoires benjamin.tissoires@redhat.com wrote:
On Mon, Feb 12, 2024 at 6:46 PM Toke Høiland-Jørgensen toke@redhat.com wrote:
Benjamin Tissoires benjamin.tissoires@redhat.com writes:
[...]
I agree that workqueue delegation fits into the bpf_timer concept and a lot of code can and should be shared.
Thanks Alexei for the detailed answer. I've given it an attempt but still can not figure it out entirely.
All the lessons(bugs) learned with bpf_timer don't need to be re-discovered :) Too bad, bpf_timer_set_callback() doesn't have a flag argument, so we need a new kfunc to set a sleepable callback. Maybe bpf_timer_set_sleepable_cb() ?
OK. So I guess I should drop Toke's suggestion with the bpf_timer_ini() flag?
The verifier will set is_async_cb = true for it (like it does for regular cb-s). And since prog->aux->sleepable is kinda "global" we need another per subprog flag: bool is_sleepable: 1;
done (in push_callback_call())
We can factor out a check "if (prog->aux->sleepable)" into a helper that will check that "global" flag and another env->cur_state->in_sleepable flag that will work similar to active_rcu_lock.
done (I think), cf patch 2 below
Once the verifier starts processing subprog->is_sleepable it will set cur_state->in_sleepable = true; to make all subprogs called from that cb to be recognized as sleepable too.
That's the point I don't know where to put the new code.
I think that would go in the already existing special case for push_async_cb where you get the verifier state of the async callback. You can make setting the boolean in that verifier state conditional on whether it's your kfunc/helper you're processing taking a sleepable callback.
Hehe, thanks a lot. Indeed, it was a simple fix. I tried to put this everywhere but here, and with your help got it working in 2 mins :)
It seems the best place would be in do_check(), but I am under the impression that the code of the callback is added at the end of the instruction list, meaning that I do not know where it starts, and which subprog index it corresponds to.
A bit of a challenge is what to do with global subprogs, since they're verified lazily. They can be called from sleepable and non-sleepable contex. Should be solvable.
I must confess this is way over me (and given that I didn't even managed to make the "easy" case working, that might explain things a little :-P )
I think it will be solvable but made somewhat difficult by the fact that even if we mark subprog_info of some global_func A as in_sleepable, so that we explore it as sleepable during its verification, we might encounter later another global_func that calls a global func, already explored as non-sleepable, in sleepable context. In this case I think we need to redo the verification of that global func as sleepable once again. It could be that it is called from both non-sleepable and sleepable contexts, so both paths (in_sleepable = true, and in_sleepable = false) need to be explored, or we could reject such cases, but it might be a little restrictive.
Some common helper global func unrelated to caller context doing some auxiliary work, called from sleepable timer callback and normal main subprog might be an example where rejection will be prohibitive.
An approach might be to explore main and global subprogs once as we do now, and then keep a list of global subprogs that need to be revisited as in_sleepable (due to being called from a sleepable context) and trigger do_check_common for them again, this might have to be repeated as the list grows on each iteration, but eventually we will have explored all of them as in_sleepable if need be, and the loop will end. Surely, this trades off logical simplicity of verifier code with redoing verification of global subprogs again.
To add items to such a list, for each global subprog we encounter that needs to be analyzed as in_sleepable, we will also collect all its callee global subprogs by walking its instructions (a bit like check_max_stack_depth does).
FWIW, this (or Alexei's suggestion) is still not implemented in v2
Overall I think this feature is needed urgently, so if you don't have cycles to work on this soon, I can prioritize it right after bpf_arena work.
I can try to spare a few cycles on it. Even if your instructions were on spot, I still can't make the subprogs recognized as sleepable.
For reference, this is where I am (probably bogus, but seems to be working when timer_set_sleepable_cb() is called from a sleepable context as mentioned by Toke):
I just skimmed the patch but I think it's already 90% there. The only other change I would suggest is switching from helper to kfunc, as originally proposed by Alexei.
the kfunc was a rabbit hole: - I needed to teach the verifier about BPF_TIMER in kfunc - I needed to teach the verifier about the kfunc itself - I'm failing at calling the callback :(
Anyway, I'm about to send a second RFC so we can discuss on the code and see where my monkey patching capabilities are reaching their limits.
Cheers, Benjamin