As pointed out by Michael Ellerman, the ptrace ABI on powerpc does not allow or require the return code to be set on syscall entry when skipping the syscall. It will always return ENOSYS and the return code must be set on syscall exit.
This code does that, behaving more like strace. It still sets the return code on entry, which is overridden on powerpc, and it always repeats the same on exit. Also, on powerpc the errno is not negated; an error is instead signalled by CR0.SO being set in ccr.
This has been tested on powerpc and amd64.
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 24 +++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 252140a52553..b90a9190ba88 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1738,6 +1738,14 @@ void change_syscall(struct __test_metadata *_metadata,
 		TH_LOG("Can't modify syscall return on this architecture");
 #else
 		regs.SYSCALL_RET = result;
+# if defined(__powerpc__)
+		if (result < 0) {
+			regs.SYSCALL_RET = -result;
+			regs.ccr |= 0x10000000;
+		} else {
+			regs.ccr &= ~0x10000000;
+		}
+# endif
 #endif
 
 #ifdef HAVE_GETREGS
@@ -1796,6 +1804,7 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	int ret, nr;
 	unsigned long msg;
 	static bool entry;
+	int *syscall_nr = args;
 
 	/*
 	 * The traditional way to tell PTRACE_SYSCALL entry/exit
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
+	if (syscall_nr)
+		*syscall_nr = nr;
 
 	if (nr == __NR_getpid)
 		change_syscall(_metadata, tracee, __NR_getppid, 0);
@@ -1889,9 +1903,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_redirected)
 
 TEST_F(TRACE_syscall, ptrace_syscall_errno)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the open syscall, resulting in ESRCH. */
@@ -1900,9 +1915,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_errno)
 
 TEST_F(TRACE_syscall, ptrace_syscall_faked)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the gettid syscall, resulting fake pid. */
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
As pointed out by Michael Ellerman, the ptrace ABI on powerpc does not allow or require the return code to be set on syscall entry when skipping the syscall. It will always return ENOSYS and the return code must be set on syscall exit.
This code does that, behaving more like strace. It still sets the return code on entry, which is overridden on powerpc, and it always repeats the same on exit. Also, on powerpc the errno is not negated; an error is instead signalled by CR0.SO being set in ccr.
This has been tested on powerpc and amd64.
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Yikes, I missed this from a while ago. I apologize for responding so late!
This appears still unfixed; is that correct?
 tools/testing/selftests/seccomp/seccomp_bpf.c | 24 +++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 252140a52553..b90a9190ba88 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1738,6 +1738,14 @@ void change_syscall(struct __test_metadata *_metadata,
 		TH_LOG("Can't modify syscall return on this architecture");
 #else
 		regs.SYSCALL_RET = result;
+# if defined(__powerpc__)
+		if (result < 0) {
+			regs.SYSCALL_RET = -result;
+			regs.ccr |= 0x10000000;
+		} else {
+			regs.ccr &= ~0x10000000;
+		}
+# endif
 #endif
Just so I understand correctly: for ppc to "see" this result, it needs to be both negative AND have this specific register set?
 #ifdef HAVE_GETREGS
@@ -1796,6 +1804,7 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	int ret, nr;
 	unsigned long msg;
 	static bool entry;
+	int *syscall_nr = args;
 
 	/*
 	 * The traditional way to tell PTRACE_SYSCALL entry/exit
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
+	if (syscall_nr)
+		*syscall_nr = nr;
 
 	if (nr == __NR_getpid)
 		change_syscall(_metadata, tracee, __NR_getppid, 0);
@@ -1889,9 +1903,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_redirected)
 
 TEST_F(TRACE_syscall, ptrace_syscall_errno)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the open syscall, resulting in ESRCH. */
@@ -1900,9 +1915,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_errno)
 
 TEST_F(TRACE_syscall, ptrace_syscall_faked)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the gettid syscall, resulting fake pid. */
-- 
2.25.1
On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
As pointed out by Michael Ellerman, the ptrace ABI on powerpc does not allow or require the return code to be set on syscall entry when skipping the syscall. It will always return ENOSYS and the return code must be set on syscall exit.
This code does that, behaving more like strace. It still sets the return code on entry, which is overridden on powerpc, and it always repeats the same on exit. Also, on powerpc the errno is not negated; an error is instead signalled by CR0.SO being set in ccr.
This has been tested on powerpc and amd64.
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kees Cook <keescook@google.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Yikes, I missed this from a while ago. I apologize for responding so late!
This appears still unfixed; is that correct?
Yes. I will send a v2 on top of recent changes to the test.
 tools/testing/selftests/seccomp/seccomp_bpf.c | 24 +++++++++++++++----
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 252140a52553..b90a9190ba88 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -1738,6 +1738,14 @@ void change_syscall(struct __test_metadata *_metadata,
 		TH_LOG("Can't modify syscall return on this architecture");
 #else
 		regs.SYSCALL_RET = result;
+# if defined(__powerpc__)
+		if (result < 0) {
+			regs.SYSCALL_RET = -result;
+			regs.ccr |= 0x10000000;
+		} else {
+			regs.ccr &= ~0x10000000;
+		}
+# endif
 #endif
Just so I understand correctly: for ppc to "see" this result, it needs to be both negative AND have this specific register set?
Yes. According to Documentation/powerpc/syscall64-abi.rst:
" - For the sc instruction, both a value and an error condition are returned. cr0.SO is the error condition, and r3 is the return value. When cr0.SO is clear, the syscall succeeded and r3 is the return value. When cr0.SO is set, the syscall failed and r3 is the error value (that normally corresponds to errno). "
So, while some other arches will return -EINVAL, ppc returns EINVAL. And that is distinguished from, say, read(2) returning 22 bytes read, by using CR.SO.
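To illustrate the convention described above, here is a minimal sketch in C of how a tracer could convert between the powerpc (r3, CR0.SO) pair and the negative-errno convention used on other architectures. The helper names ppc_syscall_result() and ppc_set_syscall_result() and the PPC_CR0_SO macro are hypothetical and not part of the selftest; only the 0x10000000 mask is taken from the patch.

```c
/* CR0.SO as it appears in the ccr image; this is the mask used in the patch. */
#define PPC_CR0_SO 0x10000000UL

/*
 * Hypothetical helper: fold (r3, ccr) into a single "negative errno" value.
 * With CR0.SO set, r3 holds a positive error number; otherwise r3 is the
 * plain return value.
 */
static long ppc_syscall_result(unsigned long r3, unsigned long ccr)
{
	if (ccr & PPC_CR0_SO)
		return -(long)r3;
	return (long)r3;
}

/*
 * Hypothetical inverse, mirroring the logic the patch adds to
 * change_syscall(): a negative result becomes a positive r3 with CR0.SO set,
 * anything else is stored unchanged with CR0.SO cleared.
 */
static void ppc_set_syscall_result(long result, unsigned long *r3,
				   unsigned long *ccr)
{
	if (result < 0) {
		*r3 = (unsigned long)-result;
		*ccr |= PPC_CR0_SO;
	} else {
		*r3 = (unsigned long)result;
		*ccr &= ~PPC_CR0_SO;
	}
}
```

With this, r3 == 22 means EINVAL when CR0.SO is set and "22 bytes read" when it is clear, which is the distinction made above.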
 #ifdef HAVE_GETREGS
@@ -1796,6 +1804,7 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	int ret, nr;
 	unsigned long msg;
 	static bool entry;
+	int *syscall_nr = args;
 
 	/*
 	 * The traditional way to tell PTRACE_SYSCALL entry/exit
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
R0 might be clobbered. The same documentation lists it as volatile. So, during syscall exit, we can't tell for sure that R0 will have the original syscall number. So, we need to grab it during syscall entry, save it somewhere and reuse it. I used the test context/args for that. That's the main change I had to deal with after recent changes to the test. I used the variant struct now.
I only saw the need to do this under tracer_ptrace, as that was the only one changing syscall return values using ptrace. And that can only be done during syscall exit on ppc (this is part of the ptrace ABI, which we can't break). So, changing get_syscall did not seem necessary.
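The save-at-entry, reuse-at-exit idea can be sketched on its own, without ptrace. This is only an illustration: struct tracer_state and tracer_stop_nr() are hypothetical names, and regs_nr stands in for whatever get_syscall() would read at a given stop.

```c
#include <stdbool.h>

/* Hypothetical per-tracee state kept by the tracer between stops. */
struct tracer_state {
	bool in_syscall;	/* true between the entry stop and the exit stop */
	long saved_nr;		/* syscall number recorded at the entry stop */
};

/*
 * Return the syscall number the tracer should act on for this stop.
 * At entry the register value is trusted and remembered; at exit the
 * register may already hold -1 or a redirected number, so the remembered
 * value is returned instead.
 */
static long tracer_stop_nr(struct tracer_state *st, long regs_nr)
{
	if (!st->in_syscall) {
		/* Entry stop: record the original number. */
		st->saved_nr = regs_nr;
		st->in_syscall = true;
		return regs_nr;
	}
	/* Exit stop: ignore the register, reuse the saved number. */
	st->in_syscall = false;
	return st->saved_nr;
}
```

The patch keeps the equivalent state in the int passed through the fixture args rather than in a dedicated struct.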
Thanks. Cascardo.
+	if (syscall_nr)
+		*syscall_nr = nr;
 
 	if (nr == __NR_getpid)
 		change_syscall(_metadata, tracee, __NR_getppid, 0);
@@ -1889,9 +1903,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_redirected)
 
 TEST_F(TRACE_syscall, ptrace_syscall_errno)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the open syscall, resulting in ESRCH. */
@@ -1900,9 +1915,10 @@ TEST_F(TRACE_syscall, ptrace_syscall_errno)
 
 TEST_F(TRACE_syscall, ptrace_syscall_faked)
 {
+	int syscall_nr = -1;
 	/* Swap SECCOMP_RET_TRACE tracer for PTRACE_SYSCALL tracer. */
 	teardown_trace_fixture(_metadata, self->tracer);
-	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, NULL,
+	self->tracer = setup_trace_fixture(_metadata, tracer_ptrace, &syscall_nr,
 					   true);
 
 	/* Tracer should skip the gettid syscall, resulting fake pid. */
-- 
2.25.1
-- Kees Cook
Thadeu Lima de Souza Cascardo cascardo@canonical.com writes:
On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
...
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
R0 might be clobbered. The same documentation lists it as volatile. So, during syscall exit, we can't tell for sure that R0 will have the original syscall number. So, we need to grab it during syscall entry, save it somewhere and reuse it. I used the test context/args for that.
The user r0 (in regs->gpr[0]) shouldn't be clobbered.
But it is modified if the tracer skips the syscall, by setting the syscall number to -1. Or if the tracer changes the syscall number.
So if you need the original syscall number in the exit path then I think you do need to save it at entry.
cheers
On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
Thadeu Lima de Souza Cascardo cascardo@canonical.com writes:
On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
...
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
R0 might be clobbered. The same documentation lists it as volatile. So, during syscall exit, we can't tell for sure that R0 will have the original syscall number. So, we need to grab it during syscall entry, save it somewhere and reuse it. I used the test context/args for that.
The user r0 (in regs->gpr[0]) shouldn't be clobbered.
But it is modified if the tracer skips the syscall, by setting the syscall number to -1. Or if the tracer changes the syscall number.
So if you need the original syscall number in the exit path then I think you do need to save it at entry.
... the selftest code wants to test the updated syscall (-1 or whatever), so this sounds correct. Was this test actually failing on powerpc? (I'd really rather not split entry/exit if I don't have to.)
On Thu, Sep 17, 2020 at 03:37:16PM -0700, Kees Cook wrote:
On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
Thadeu Lima de Souza Cascardo cascardo@canonical.com writes:
On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
...
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
R0 might be clobbered. The same documentation lists it as volatile. So, during syscall exit, we can't tell for sure that R0 will have the original syscall number. So, we need to grab it during syscall entry, save it somewhere and reuse it. I used the test context/args for that.
The user r0 (in regs->gpr[0]) shouldn't be clobbered.
But it is modified if the tracer skips the syscall, by setting the syscall number to -1. Or if the tracer changes the syscall number.
So if you need the original syscall number in the exit path then I think you do need to save it at entry.
... the selftest code wants to test the updated syscall (-1 or whatever), so this sounds correct. Was this test actually failing on powerpc? (I'd really rather not split entry/exit if I don't have to.)
-- Kees Cook
Yes, it started failing when the return code started being changed as well. Though ptrace can change the syscall at entry (IIRC), it can't change the return code. And that is part of the ABI. If ppc is changed so it allows changing the return code during ptrace entry, some strace uses will break. So that is not an option.
Cascardo.
Thadeu Lima de Souza Cascardo cascardo@canonical.com writes:
On Thu, Sep 17, 2020 at 03:37:16PM -0700, Kees Cook wrote:
On Sun, Sep 13, 2020 at 10:34:23PM +1000, Michael Ellerman wrote:
Thadeu Lima de Souza Cascardo cascardo@canonical.com writes:
On Tue, Sep 08, 2020 at 04:18:17PM -0700, Kees Cook wrote:
On Tue, Jun 30, 2020 at 01:47:39PM -0300, Thadeu Lima de Souza Cascardo wrote:
...
@@ -1809,10 +1818,15 @@ void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
 	EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
 			: PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
 
-	if (!entry)
+	if (!entry && !syscall_nr)
 		return;
 
-	nr = get_syscall(_metadata, tracee);
+	if (entry)
+		nr = get_syscall(_metadata, tracee);
+	else
+		nr = *syscall_nr;
This is weird? Shouldn't get_syscall() be modified to do the right thing here instead of depending on the extra arg?
R0 might be clobbered. The same documentation lists it as volatile. So, during syscall exit, we can't tell for sure that R0 will have the original syscall number. So, we need to grab it during syscall entry, save it somewhere and reuse it. I used the test context/args for that.
The user r0 (in regs->gpr[0]) shouldn't be clobbered.
But it is modified if the tracer skips the syscall, by setting the syscall number to -1. Or if the tracer changes the syscall number.
So if you need the original syscall number in the exit path then I think you do need to save it at entry.
... the selftest code wants to test the updated syscall (-1 or whatever), so this sounds correct. Was this test actually failing on powerpc? (I'd really rather not split entry/exit if I don't have to.)
Yes, it started failing when the return code started being changed as well. Though ptrace can change the syscall at entry (IIRC), it can't change the return code. And that is part of the ABI. If ppc is changed so it allows changing the return code during ptrace entry, some strace uses will break. So that is not an option.
Yep.
I don't know that it would break anything to change that part of the ptrace ABI, but it definitely could break things and it would be subtle, so it's not really an option.
So for powerpc we do need the return code change done in the exit hook, sorry.
cheers
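As a rough illustration of why the exit hook is needed, the following toy model (plain C, no ptrace) replays the sequence discussed in this thread. It assumes, as stated in the original patch description, that a skipped syscall on powerpc always ends up returning ENOSYS regardless of what the tracer wrote at entry. struct toy_regs, toy_kernel_skip(), toy_set_result() and toy_run() are hypothetical names; this is not kernel or selftest code.

```c
#include <errno.h>
#include <stdbool.h>

#define PPC_CR0_SO 0x10000000UL	/* mask used in the patch */

/* Toy register file: syscall number, return register and condition register. */
struct toy_regs {
	long nr;
	unsigned long r3;
	unsigned long ccr;
};

/* Toy stand-in for the kernel: a skipped syscall (nr == -1) yields ENOSYS. */
static void toy_kernel_skip(struct toy_regs *regs)
{
	if (regs->nr == -1) {
		regs->r3 = ENOSYS;
		regs->ccr |= PPC_CR0_SO;
	}
}

/* Same encoding the patch adds to change_syscall() for powerpc. */
static void toy_set_result(struct toy_regs *regs, long result)
{
	if (result < 0) {
		regs->r3 = (unsigned long)-result;
		regs->ccr |= PPC_CR0_SO;
	} else {
		regs->r3 = (unsigned long)result;
		regs->ccr &= ~PPC_CR0_SO;
	}
}

/* Replay entry stop, skipped syscall, exit stop; return what the tracee sees. */
static long toy_run(bool set_at_exit, long wanted)
{
	struct toy_regs regs = { .nr = 20, .r3 = 0, .ccr = 0 };

	/* Entry stop: skip the syscall and try to set the result early. */
	regs.nr = -1;
	toy_set_result(&regs, wanted);

	/* The modelled kernel overrides whatever was written at entry. */
	toy_kernel_skip(&regs);

	/* Exit stop: only a write made here survives. */
	if (set_at_exit)
		toy_set_result(&regs, wanted);

	return (regs.ccr & PPC_CR0_SO) ? -(long)regs.r3 : (long)regs.r3;
}
```

In this model toy_run(false, -ESRCH) yields -ENOSYS, while toy_run(true, -ESRCH) yields -ESRCH, which is the behaviour the selftest expects.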