On Fri, May 01, 2020 at 11:01:07AM +0900, Masami Hiramatsu wrote:
On Tue, 28 Apr 2020 23:36:27 +0200 Jiri Olsa <jolsa@redhat.com> wrote:
On Fri, Apr 17, 2020 at 04:38:10PM +0900, Masami Hiramatsu wrote:
SNIP
The code within the kretprobe handler checks for probe reentrancy, so we won't trigger any _raw_spin_lock_irqsave probe in there.
The problem is outside of it, in kprobe_flush_task, where we call:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave

where _raw_spin_lock_irqsave triggers the kretprobe and installs the kretprobe_trampoline handler on the _raw_spin_lock_irqsave return.
The kretprobe_trampoline handler is then executed with kretprobe_table_locks already locked, and the first thing it does is lock kretprobe_table_locks ;-) The whole lockup path looks like:

  kprobe_flush_task
    kretprobe_table_lock
      raw_spin_lock_irqsave
        _raw_spin_lock_irqsave ---> probe triggered, kretprobe_trampoline installed

        ---> kretprobe_table_locks locked

        kretprobe_trampoline
          trampoline_handler
            kretprobe_hash_lock(current, &head, &flags);  <--- deadlock

Adding kprobe_busy_begin/end helpers that mark a region of code as having a fake probe installed, which prevents any other kprobe from triggering within that region.
Using these helpers in kprobe_flush_task, so that the probe recursion protection check is hit and the kretprobe is never set up, which prevents the lockup above.
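
For readers following along, the idea can be modelled in plain user-space C. This is only a rough sketch under simplifying assumptions, not the kernel patch itself: a single global pointer stands in for the per-CPU current_kprobe slot, a boolean stands in for the table spinlock, and flush_task_model, probed_lock, handler_runs and would_deadlock are hypothetical names invented for the illustration. Only kprobe_busy_begin/end, kprobe_busy and trampoline_handler echo names used in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct kprobe (assumption: heavily reduced). */
struct kprobe { const char *name; };

/* Models the per-CPU "probe currently being handled" slot, for one CPU. */
static struct kprobe *current_kprobe;

/* Dummy probe that only serves to mark a code region as busy. */
static struct kprobe kprobe_busy = { .name = "busy" };

static bool table_lock_held;  /* models one kretprobe_table_locks entry */
static int handler_runs;      /* how often the trampoline handler ran */
static int would_deadlock;    /* handler entered with the lock held */

/* Mark the following code as "already inside a probe". */
void kprobe_busy_begin(void)
{
    assert(current_kprobe == NULL);
    current_kprobe = &kprobe_busy;
}

/* Leave the busy region again. */
void kprobe_busy_end(void)
{
    assert(current_kprobe == &kprobe_busy);
    current_kprobe = NULL;
}

/* Models the trampoline handler: it wants the table lock first. */
static void trampoline_handler(void)
{
    handler_runs++;
    if (table_lock_held)
        would_deadlock++;  /* a real spinlock would spin forever here */
}

/* Models a probed lock function: the probe fires only when no other
 * probe is marked as running (the recursion protection check). */
static void probed_lock(struct kprobe *p)
{
    bool fire = (current_kprobe == NULL);

    table_lock_held = true;
    if (fire) {
        current_kprobe = p;
        trampoline_handler();  /* runs on "return" with the lock held */
        current_kprobe = NULL;
    }
}

static void unlock(void)
{
    table_lock_held = false;
}

/* Hypothetical model of the flush path, with and without the guard. */
void flush_task_model(bool use_busy_guard)
{
    static struct kprobe lock_probe = { .name = "_raw_spin_lock_irqsave" };

    if (use_busy_guard)
        kprobe_busy_begin();
    probed_lock(&lock_probe);
    unlock();
    if (use_busy_guard)
        kprobe_busy_end();
}
```

Calling flush_task_model(false) lets the probe fire while the lock is held, which is the deadlock case above; calling flush_task_model(true) makes the recursion check skip the probe, so the handler never runs inside the locked region.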
Thanks Jiri!
Ingo, could you pick this up?
Ingo, any chance you could take this one?
Hi Ingo,
Should I make a pull request for all kprobes related patches to you?
Looks like Ingo is offline. Thomas, could you please pull this one?
thanks, jirka