On Tue, Nov 27, 2018 at 10:56:20AM +0000, Russell King - ARM Linux wrote:
On Tue, Nov 27, 2018 at 08:30:32AM -0200, Rafael David Tinoco wrote:
On 11/26/18 9:44 PM, Russell King - ARM Linux wrote:
On Mon, Nov 26, 2018 at 11:41:11PM +0000, Russell King - ARM Linux wrote:
On Mon, Nov 26, 2018 at 11:33:03PM +0000, Russell King - ARM Linux wrote:
On Mon, Nov 26, 2018 at 08:53:35PM -0200, Rafael David Tinoco wrote:
Right now, the only way for task->thread_info->syscall to be updated is if _TIF_SYSCALL_WORK is set in the current task's thread_info->flags (similar to what has_syscall_work() checks for on arm64).
This means that "->syscall" will only be updated if we are tracing syscalls, through ptrace for example. This is NOT the same behavior as arm64, where pt_regs->syscallno is updated at the beginning of the svc0 handler for *every* syscall entry.
So when was it decided that the syscall number will always be required? (We need that to know how far back this has to be backported.)
PS, I rather object to the fact that the required behaviour seems to change, arch maintainers aren't told about it until... some test is created at some random point in the future which then fails.
Surely there's a better way to communicate changes in requirements than discovery-by-random-bug-report ?
Final comment for tonight - the commit introducing /proc/*/syscall says:
This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files. These use task_current_syscall() to show the task's current system call number and argument registers, stack pointer and PC. For a task blocked but not in a syscall, the file shows "-1" in place of the syscall number, followed by only the SP and PC. For a task that's not blocked, it shows "running".
Please validate that a blocked task does indeed show -1 with your patch applied.
Will do. This is done in an upper level (collect_syscall <- task_current_syscall <- proc_pid_syscall):
	if (!try_get_task_stack(target)) {
		/* Task has no stack, so the task isn't in a syscall. */
		*sp = *pc = 0;
		*callno = -1;
		return 0;
	}
I think the only missing part for arm was that one, but I will confirm after fixing the usage of "r7" for obtaining "scno". Will send a v2 in this thread.
There's another question - what's the expected behaviour when we restart a syscall using the restartblock mechanism? Is the syscall number expected to be __NR_restart_syscall or the original syscall number?
I can't find anywhere that this detail is specified (damn the lack of API documentation - I'm tempted to say that we won't implement this until it gets documented properly, and that test can continue failing until such time that happens.)
Having looked around, it seems that the /proc/<PID>/syscall interface was sneaked into the kernel. The patch series which added it was sent in 2008 with a covering message that made no mention of this new interface, instead stating:
http://lkml.iu.edu/hypermail/linux/kernel/0807.2/0551.html
Most of these changes move code around with little or no change, and they should not break anything or change any behavior.
While that statement is absolutely correct, it doesn't highlight the fact that the set of patches _also_ includes a brand new userspace interface exposing things like syscall numbers and arguments in /proc.
There appears to be no documentation at all of this interface, so there is no definition of how it is supposed to work or what it is supposed to expose beyond what little information is in the original patch:
http://lkml.iu.edu/hypermail/linux/kernel/0807.2/0577.html
This adds /proc/PID/syscall and /proc/PID/task/TID/syscall magic files. These use task_current_syscall() to show the task's current system call number and argument registers, stack pointer and PC. For a task blocked but not in a syscall, the file shows "-1" in place of the syscall number, followed by only the SP and PC. For a task that's not blocked, it shows "running".
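To make the three cases in that description concrete, here is a minimal parsing sketch. It assumes only what the commit message states: "running", or "-1" followed by SP and PC, or a syscall number followed by six argument registers, SP and PC. The names syscall_state, syscall_info and parse_syscall_line are hypothetical and exist only for this illustration; none of them is a kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration only; not a kernel or libc API. */
enum syscall_state {
	STATE_UNPARSED,		/* line did not match any described form */
	STATE_RUNNING,		/* "running": task is not blocked */
	STATE_BLOCKED,		/* "-1 SP PC": blocked, but not in a syscall */
	STATE_IN_SYSCALL,	/* "NR ARG0 ... ARG5 SP PC" */
};

struct syscall_info {
	enum syscall_state state;
	long nr;
	unsigned long long args[6];
	unsigned long long sp, pc;
};

/* Parse one line read from /proc/PID/syscall. Returns 0 on success, -1 otherwise. */
static int parse_syscall_line(const char *line, struct syscall_info *out)
{
	int n;

	assert(line != NULL && out != NULL);

	memset(out, 0, sizeof(*out));
	out->state = STATE_UNPARSED;
	out->nr = -1;

	if (strncmp(line, "running", 7) == 0) {
		out->state = STATE_RUNNING;
		return 0;
	}

	/* %llx accepts an optional "0x" prefix, as strtoull() with base 16 does. */
	n = sscanf(line, "%ld %llx %llx %llx %llx %llx %llx %llx %llx",
		   &out->nr, &out->args[0], &out->args[1], &out->args[2],
		   &out->args[3], &out->args[4], &out->args[5],
		   &out->sp, &out->pc);

	if (n == 3 && out->nr == -1) {
		/* Short form: the two values after -1 are SP and PC. */
		out->sp = out->args[0];
		out->pc = out->args[1];
		memset(out->args, 0, sizeof(out->args));
		out->state = STATE_BLOCKED;
		return 0;
	}

	if (n == 9) {
		out->state = STATE_IN_SYSCALL;
		return 0;
	}

	return -1;
}
```

Note that the sketch says nothing about which number should appear in the first field for a restarted or non-traced syscall; that is exactly the part the description leaves open.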
This really isn't a good place to be - this is why commit messages should _not_ just describe what the changes are doing, but also _why_ they are being made. Also, any new user interface needs to be fully and properly documented, because years later, people will move away, knowledge will be lost, and that leaves us with a maintainability problem, exactly like we have right now with this.
With the lack of interface documentation, how do we even know whether the /proc/*/syscall is supposed to show the syscall number of non-traced threads? How do we know that the test that found this is actually correct in reporting a failure? How do we know whether it's supposed to expose __NR_restart_syscall?
So, I thought I'd write a test program:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/fcntl.h>
#include <unistd.h>

static int read_file(const char *fn, char *buf, size_t size)
{
	int fd, ret, nr;

	fd = open(fn, O_RDONLY);
	if (fd == -1)
		return -1;

	for (nr = 0; nr < size; nr += ret) {
		ret = read(fd, buf + nr, size - nr);
		if (ret <= 0)
			break;
	}

	close(fd);

	return nr ? nr : ret;
}

int main()
{
	char fn[64], buf[256];
	int pid, ret;

	pid = fork();
	if (pid == 0) {
		/* child */
		sleep(5);
		exit(0);
	}

	/* parent */
	sleep(1);
	snprintf(fn, sizeof(fn), "/proc/%d/syscall", pid);
	ret = read_file(fn, buf, sizeof(buf));

	printf("%.*s", ret, buf);

	kill(pid, SIGCONT);
	sleep(1);

	ret = read_file(fn, buf, sizeof(buf));

	printf("%.*s", ret, buf);

	return 0;
}
On x86 (32-bit app on 64-bit kernel), it has this behaviour:
$ ./syscall-test
162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59
162 0xffcc5a6c 0xffcc5a6c 0x48d09000 0x0 0xffcc5af4 0xffcc5a74 0xffcc5a2c 0xf77dfa59
which looks good, except:
$ strace -o /dev/null -f ./syscall-test
162 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59
0 0xffc0070c 0xffc0070c 0x48d09000 0x0 0xffc00794 0xffc00714 0xffc006cc 0xf77f3a59
So, if we're syscall ptracing a program, __NR_restart_syscall gets exposed through this interface, but if we aren't, it isn't exposed. Which version is correct? *shrug*, no documentation...
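For anyone who wants to automate the comparison between the two reads, a rough sketch follows. It only compares the leading field of the two snapshots, so it does not depend on which architecture's syscall table the numbers come from. The helper names leading_syscall_nr and syscall_nr_changed are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: extract the leading decimal field of a
 * /proc/PID/syscall line. Returns 0 on success, or -1 if the line does
 * not start with a number (for example "running").
 */
static int leading_syscall_nr(const char *line, long *nr)
{
	char *end;

	assert(line != NULL && nr != NULL);

	errno = 0;
	*nr = strtol(line, &end, 10);
	if (end == line || errno != 0)
		return -1;

	return 0;
}

/*
 * Hypothetical helper: compare the syscall number in two snapshots.
 * Returns 1 if it changed, 0 if it is the same, -1 if either snapshot
 * has no leading number.
 */
static int syscall_nr_changed(const char *before, const char *after)
{
	long a, b;

	if (leading_syscall_nr(before, &a) || leading_syscall_nr(after, &b))
		return -1;

	return a != b;
}
```

Fed the two output lines from each run above, this reports no change for the plain run (162 both times) and a change for the strace run (162, then 0). It cannot say which of the two is the intended behaviour.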