On Mon, 29 Apr 2019 11:59:04 -0700 Linus Torvalds torvalds@linux-foundation.org wrote:
I really don't care. Just do what I suggested, and if you have numbers to show problems, then maybe I'll care.
Are you suggesting that I rewrite the code to do it one function at a time? This has always been batch mode. This is not something new. The function tracer has been around longer than the text poke code.
Right now you're just making excuses for this. I described the solution months ago, now I've written a patch, if that's not good enough then we can just skip this all entirely.
Honestly, if you need to rewrite tens of thousands of calls, maybe you're doing something wrong?
# cd /sys/kernel/debug/tracing
# cat available_filter_functions | wc -l
45856
# cat enabled_functions | wc -l
0
# echo function > current_tracer
# cat enabled_functions | wc -l
45856
There, I just enabled 45,856 function call sites in one shot!
How else do you want to update them? Every function in the kernel has a nop that turns into a call to the ftrace_handler; if I add another user of that code, it will change each one as well.
-- Steve
On Mon, Apr 29, 2019 at 12:07 PM Steven Rostedt rostedt@goodmis.org wrote:
Are you suggesting that I rewrite the code to do it one function at a time? This has always been batch mode. This is not something new. The function tracer has been around longer than the text poke code.
Only do the 'call' instructions one at a time. Why would you change _existing_ code?
Linus
On Mon, Apr 29, 2019 at 1:06 PM Linus Torvalds torvalds@linux-foundation.org wrote:
Only do the 'call' instructions one at a time. Why would you change _existing_ code?
Side note: if you want to, you can easily batch up rewriting 'call' instructions to the same target using the exact same code. You just need to change the int3 handler case to calculate the bp_int3_call_return from the fixed one-time address to use something like
this_cpu_write(bp_call_return, int3_address-1+bp_int3_call_size);
instead (and you'd need to also teach the function that there's not just a single int3 live at a time)
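(For the record, a rough sketch of what that handler change might look like -- the handler structure, the is_bp_call_site() lookup and the bp_int3_call_emulate stub name are illustrative assumptions here, not the actual patch:)

	/* sketch: derive the return address per trapping site */
	int poke_int3_handler_call(struct pt_regs *regs)
	{
		void *int3_address = (void *)regs->ip;	/* just past the int3 */

		/* is this one of the call sites currently being rewritten? */
		if (!is_bp_call_site(int3_address))
			return 0;

		/* return address computed from the site, not a fixed one */
		this_cpu_write(bp_call_return,
			       int3_address - 1 + bp_int3_call_size);

		/* divert to the common stub that pushes bp_call_return
		 * and jumps to the new call target */
		regs->ip = (unsigned long)bp_int3_call_emulate;
		return 1;
	}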
Linus
On Mon, 29 Apr 2019 13:06:17 -0700 Linus Torvalds torvalds@linux-foundation.org wrote:
On Mon, Apr 29, 2019 at 12:07 PM Steven Rostedt rostedt@goodmis.org wrote:
Are you suggesting that I rewrite the code to do it one function at a time? This has always been batch mode. This is not something new. The function tracer has been around longer than the text poke code.
Only do the 'call' instructions one at a time. Why would you change _existing_ code?
The function tracing is a call instruction.
On boot:
<function_X>:
	nop
	blah
	blah
After a callback to function tracing is called:
<function_X>
	call custom_trampoline
	blah
	blah
If we have two callbacks attached to that function:
<function_X>
	call iterator_trampoline
	blah
	blah
The update from "call custom_trampoline" to "call iterator_trampoline" is where we have an issue.
We could make this a special case where we do it one at a time, but currently the code is all the same, looking at tables to determine what to do, which is one of three operations:
1) change nop to call function
2) change call function to nop
3) update call function to another call function
#3 is where we have an issue. But if we want this to be different, we would need to change the code significantly and know that we are only updating calls to calls, which would take a bit of accounting to see if that's the change being made.
This thread started about that #3 operation causing a call to be missed because we turn it into a nop while we make the transition, where in reality it needs to be a call to one of the two functions in the transition.
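(For reference, a rough sketch of the breakpoint-based update sequence as it is today -- function names here are illustrative, not the exact ftrace code:)

	/* one call site: 1-byte opcode + 4-byte offset */
	text_poke(ip, &int3_insn, 1);		/* 1) breakpoint the site   */
	run_sync();				/*    all CPUs see the int3 */
	text_poke(ip + 1, new_offset, 4);	/* 2) patch the call offset */
	run_sync();
	text_poke(ip, &new_opcode, 1);		/* 3) remove the breakpoint */
	run_sync();

	/*
	 * A CPU that hits the int3 between 1) and 3) has its ip moved
	 * past the 5-byte site, i.e. the instruction is emulated as a
	 * nop.  That's fine for nop<->call updates, but for a
	 * call->call update (#3 above) it means neither the old nor the
	 * new trampoline gets called for that hit.
	 */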
-- Steve
On Mon, Apr 29, 2019 at 1:30 PM Steven Rostedt rostedt@goodmis.org wrote:
The update from "call custom_trampoline" to "call iterator_trampoline" is where we have an issue.
So it has never worked. Just tell people that they have two choices:
- you do the careful rewriting, which takes more time
- you do it by rewriting as nop and then back, which is what historically has been done, and that is fast and simple, because there's no need to be careful.
Really. I find your complaints completely incomprehensible. You've never rewritten call instructions atomically before, and now you complain about it being slightly more expensive to do it when I give you the code? Yes it damn well will be slightly more expensive. Deal with it.
Btw, once again - I several months ago also gave a suggestion on how it could be done batch-mode by having lots of those small stubs and just generating them dynamically.
You never wrote that code *EITHER*. It's been *months*.
So now I've written the non-batch-mode code for you, and you just *complain* about it?
I'm done with this discussion. I'm totally fed up.
Linus
On Mon, 29 Apr 2019 14:38:35 -0700 Linus Torvalds torvalds@linux-foundation.org wrote:
On Mon, Apr 29, 2019 at 1:30 PM Steven Rostedt rostedt@goodmis.org wrote:
The update from "call custom_trampoline" to "call iterator_trampoline" is where we have an issue.
So it has never worked. Just tell people that they have two choices:
The custom call to iterator translation never worked. One option is to always call the iterator, and just take the hit. There's another solution: to make permanent updates that would handle the live patching case, but not the tracing case. This is more critical for live patching than for tracing (a missed traced function is not as critical as unexpectedly running buggy code). I could look to work on that instead.
- you do the careful rewriting, which takes more time
Which would be the method I mentioned: making the call-to-call update a special case.
- you do it by rewriting as nop and then back, which is what
historically has been done, and that is fast and simple, because there's no need to be careful.
Except in the live kernel patching case where you just put back the buggy function.
Really. I find your complaints completely incomprehensible. You've never rewritten call instructions atomically before, and now you complain about it being slightly more expensive to do it when I give you the code? Yes it damn well will be slightly more expensive. Deal with it.
I wasn't complaining about the expense of doing it that way. I was stating that it would take more time to get it right, as it would require a different implementation for the call-to-call case.
Btw, once again - I several months ago also gave a suggestion on how it could be done batch-mode by having lots of those small stubs and just generating them dynamically.
You never wrote that code *EITHER*. It's been *months*.
Note, I wasn't the one writing the code back then either. I posted a proof of concept for a similar topic (trace events) for the purpose of bringing back the performance lost due to the retpolines (something like 8% hit). Josh took the initiative to do that work, but I don't remember ever getting to a consensus on a solution to that. Yes you had given us ideas, but I remember people (like Andy) having concerns with it. But because it was just an optimization change, and Josh had other things to work on, that work just stalled.
But I just found out that the function tracing code has the same issue, but this can cause real problems with live kernel patching. This is why this patch set started.
So now I've written the non-batch-mode code for you, and you just *complain* about it?
Because this is a different story. The trace event code is updated one at a time (there are patches out there to do it in batch). But this is not trace events. This is function tracing, which only does batching.
I'm done with this discussion. I'm totally fed up.
Note, I wasn't writing this code anyway as I didn't have the time to. I was helping others who took the initiative to do this work. But I guess they didn't have time to move it forward either.
For the live kernel patching case, I'll start working on the permanent changes, where once a function entry is changed, it can never be put back. Then we won't have an issue with the live kernel patching case, only with tracing. But the worst thing there is that you miss tracing functions while an update is being made.
If Nicolai has time, perhaps he can try out the method you suggest and see if we can move this forward.
-- Steve
Steven Rostedt rostedt@goodmis.org writes:
On Mon, 29 Apr 2019 14:38:35 -0700 Linus Torvalds torvalds@linux-foundation.org wrote:
On Mon, Apr 29, 2019 at 1:30 PM Steven Rostedt rostedt@goodmis.org wrote:
The update from "call custom_trampoline" to "call iterator_trampoline" is where we have an issue.
So it has never worked. Just tell people that they have two choices:
The custom call to iterator translation never worked. One option is to always call the iterator, and just take the hit. There's another solution: to make permanent updates that would handle the live patching case, but not the tracing case. This is more critical for live patching than for tracing (a missed traced function is not as critical as unexpectedly running buggy code). I could look to work on that instead.
Making the live patching case permanent would disable tracing of live patched functions though?
For unconditionally calling the iterator: I don't have numbers, but would expect that always searching through something like 50-70 live patching ftrace_ops from some hot path will be noticeable.
If Nicolai has time, perhaps he can try out the method you suggest and see if we can move this forward.
You mean making ftrace handle call->call transitions one by one in non-batch mode, right? Sure, I can do that.
Another question though: is there anything that prevents us from calling ftrace_ops_list_func() directly from ftrace_int3_handler()? We don't have parent_ip, but if that is used for reporting only, maybe setting it to some ftrace_is_in_int3_and_doesnt_now_the_parent() will work? Graph tracing entries would still be lost, but well...
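(Very roughly, something like the below -- the prototype of ftrace_ops_list_func() is simplified here and the placeholder parent value is made up, just to show the idea:)

	/* made-up placeholder for the unavailable parent_ip */
	#define FTRACE_INT3_UNKNOWN_PARENT	((unsigned long)-1)

	/* sketch only: invoke the callback iterator from the int3 handler */
	int ftrace_int3_handler(struct pt_regs *regs)
	{
		unsigned long ip = regs->ip - 1;  /* start of the patched site */

		if (!ftrace_location(ip))
			return 0;

		/*
		 * Call all registered callbacks instead of skipping the
		 * instruction, so nothing is missed during the transition.
		 * The real parent_ip isn't available in this context.
		 */
		ftrace_ops_list_func(ip, FTRACE_INT3_UNKNOWN_PARENT,
				     NULL, regs);

		regs->ip = ip + MCOUNT_INSN_SIZE;  /* step over the call site */
		return 1;
	}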
Thanks,
Nicolai