Hi,
On 6/4/20 5:32 AM, Will Deacon wrote:
Hi Keno,
Cheers for the really helpful explanation. I have a bunch of questions/comments, since it's not very often that somebody shows up who understands how this is supposed to work and so I'd like to take advantage of that!
On Wed, Jun 03, 2020 at 12:56:24PM -0400, Keno Fischer wrote:
On Wed, Jun 3, 2020 at 11:53 AM Will Deacon will@kernel.org wrote:
However, at the same time as changing this, we should probably make sure to enable the syscall exit pseudo-singlestep trap (similar issue as the other patch I had sent for the signal pseudo-singlestep trap), since otherwise ptracers might get confused about the lack of singlestep trap during a singlestep -> seccomp -> singlestep path (which would give one trap less with this patch than before).
Hmm, I don't completely follow your example. Please could you spell it out a bit more? I fast-forward the stepping state machine on sigreturn, which I thought would be sufficient. Perhaps you're referring to a variant of the situation mentioned by Mark, which I didn't think could happen with ptrace [2].
Sure suppose we have code like the following:
0x0: svc #0 0x4: str x0, [x7] ...
Then, if there's a seccomp filter active that just does SECCOMP_RET_TRACE of everything, right now we get traps:
<- (ip: 0x0) -> PTRACE_SINGLESTEP <- (ip: 0x4 - seccomp trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x4 - TRAP_TRACE trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x8 - TRAP_TRACE trap)
With your proposed patch, we instead get <- (ip: 0x0) -> PTRACE_SINGLESTEP <- (ip: 0x4 - seccomp trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x8 - TRAP_TRACE trap)
Urgh, and actually, I think this is *only* the case if the seccomp handler actually changes a register in the target, right?
In which case, your proposed patch should probably do something like:
if (dir == PTRACE_SYSCALL_EXIT) { bool stepping = test_thread_flag(TIF_SINGLESTEP);
tracehook_report_syscall_exit(regs, stepping); user_rewind_single_step(regs);
}
otherwise I think we could get a spurious SIGTRAP on return to userspace. What do you think?
This has also got me thinking about your other patch to report a pseudo-step exception on entry to a signal handler:
https://lore.kernel.org/r/20200524043827.GA33001@juliacomputing.com
Although we don't actually disarm the step logic there (and so you might expect a spurious SIGTRAP on the second instruction of the handler), I think it's ok because the tracer will either do PTRACE_SINGLESTEP (and rearm the state machine) or PTRACE_CONT (and so stepping will be disabled). Do you agree?
This is problematic, because the ptracer may want to inspect the result of the syscall instruction. On other architectures, this problem is solved with a pseudo-singlestep trap that gets executed if you resume from a syscall-entry-like trap with PTRACE_SINGLESTEP. See the below patch for the change I'm proposing. There is a slight issue with that patch, still: It now makes the x7 issue apply to the singlestep trap at exit, so we should do the patch to fix that issue before we apply that change (or manually check for this situation and issue the pseudo-singlestep trap manually).
I don't see the dependency on the x7 issue; x7 is nobbled on syscall entry, so it will be nobbled in the psuedo-step trap as well as the hardware step trap on syscall return. I'd also like to backport this to stable, without having to backport an optional extension to the ptrace API for preserving x7. Or are you saying that the value of x7 should be PTRACE_SYSCALL_ENTER for the pseudo trap? That seems a bit weird to me, but then this is all weird anyway.
My proposed patch below also changes
<- (ip: 0x0) -> PTRACE_SYSCALL <- (ip: 0x4 - syscall entry trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x8 - TRAP_TRACE trap)
to
<- (ip: 0x0) -> PTRACE_SYSCALL <- (ip: 0x4 - syscall entry trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x4 - pseudo-singlestep exit trap) -> PTRACE_SINGLESTEP <- SIGTRAP (ip: 0x8 - TRAP_TRACE trap)
But I consider that a bugfix, since that's how other architectures behave and I was going to send in this patch for that reason anyway (since this was another one of the aarch64 ptrace quirks we had to work around).
I think that's still the case with my addition above, so that's good. Any other quirks you ran into that we should address here? Now that I have this stuff partially paged-in, it would be good to fix a bunch of this at once. I can send out a v2 which includes the two patches from you once we're agreed on the details.
Since we're discussing arm64 ptrace/kernel quirks, I'm gonna go ahead and describe a strange behavior on arm64 that I could not reproduce on x86, for example. I apologize for hijacking the thread if this is a non-issue or not related.
This is something I noticed when single-stepping over fork and vfork syscalls in GDB, so handling of PTRACE_EVENT_FORK, PTRACE_EVENT_VFORK and PTRACE_EVENT_VFORK_DONE.
The situation seems to happen more reliably with vforks since it is a two stage operation with VFORK and VFORK_DONE.
Suppose we're stopped at a vfork syscall instruction and that the child we spawn will exit immediately. If we attempt to single-step that particular instruction, this is what happens for arm64:
--
[Step over vfork syscall] ptrace(PTRACE_SINGLESTEP, 63049, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=63049, si_uid=13595, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
[vfork event for child 63052] ptrace(PTRACE_GETEVENTMSG, 63049, NULL, [63052]) = 0
...
[Detach child] ptrace(PTRACE_DETACH, 63052, NULL, SIG_0) = 0 ptrace(PTRACE_CONT, 63049, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=63049, si_uid=13595, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
...
ptrace(PTRACE_SINGLESTEP, 63049, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=63049, si_uid=13595, si_status=SIGCHLD, si_utime=0, si_stime=0} ---
--
For x86-64, we have this:
--
[Step over vfork syscall] ptrace(PTRACE_SINGLESTEP, 13484, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=13484, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} --- --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=13493, si_uid=1000, si_status=SIGSTOP, si_utime=0, si_stime=0} ---
[vfork event for child 13493] ptrace(PTRACE_GETEVENTMSG, 13484, NULL, [13493]) = 0
...
[Detach child] ptrace(PTRACE_DETACH, 13493, NULL, SIG_0) = 0 ptrace(PTRACE_CONT, 13484, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=13484, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
...
ptrace(PTRACE_SINGLESTEP, 13484, 0x1, SIG_0) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_TRAPPED, si_pid=13484, si_uid=1000, si_status=SIGTRAP, si_utime=0, si_stime=0} ---
--
There are a couple things off:
1 - x86-64 seems to get an extra SIGSTOP when we single-step over the vfork syscall, though this doesn't seem to do any harm.
2 - This is the one that throws GDB off. In the last single-step request, arm64 gets a SIGCHLD instead of the SIGTRAP x86-64 gets.
I did some experiments with it, and it seems the last SIGCHLD is more prone to being delivered (instead of a SIGTRAP) if we put some load on the machine (by firing off processes or producing a lot of screen output for example).
Does this ring any bells? I suppose signal delivery order is not guaranteed in this context, but x86-64 seems to deliver them consistently in the same order.