On Sat, Oct 09, 2021 at 03:52:02PM +0300, Alexey Dobriyan wrote:
On Fri, Oct 08, 2021 at 04:55:04PM -0700, Kees Cook wrote:
This makes sure that wchan contains a sensible symbol when a process is blocked.
Specifically this calls the sleep() syscall, and expects the architecture to have called schedule() from a function that has "sleep" somewhere in its name.
This exposes an internal kernel symbol to userspace.
Correct; we're verifying the results of the wchan output, which produces a kernel symbol for blocked processes.
Why would you want to test that?
This is part of a larger series refactoring/fixing wchan[1], and we've now tripped over several different failure conditions, so I want to make sure this doesn't regress in the future.
Doing s/sleep/SLEEP/g doesn't change the kernel, but now the test is broken.
Yes; the test would be doing its job, as that would mean there was a userspace-visible change to wchan, so we'd want to catch it and either fix the kernel or update the test to reflect the new reality.
For example, on the architectures I tested (x86_64, arm64, arm, mips, and powerpc) this is "hrtimer_nanosleep":
+/*
+ * Make sure that wchan returns a reasonable symbol when blocked.
+ */
Test should be "contains C identifier" then?
Nope, this was intentional. Expanding to a C identifier won't catch the "we unwound the stack to the wrong depth and now all wchan shows is '__switch_to'" bug[2]. We're specifically checking that wchan is doing at least the right thing for the most common blocking state.
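To make the difference concrete, here is a minimal sketch of the two candidate checks side by side. The helper names is_c_identifier() and wchan_mentions_sleep() are hypothetical and are not part of the patch; only the strstr() check mirrors what the test actually does.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical looser check: is the string a valid C identifier?
 * Any kernel symbol passes, including one from a bad unwind.
 */
int is_c_identifier(const char *s)
{
	if (!(isalpha((unsigned char)*s) || *s == '_'))
		return 0;
	for (s++; *s; s++)
		if (!(isalnum((unsigned char)*s) || *s == '_'))
			return 0;
	return 1;
}

/* Same substring check as the selftest: wchan must contain "sleep". */
int wchan_mentions_sleep(const char *s)
{
	return strstr(s, "sleep") != NULL;
}
```

With these definitions, "__switch_to" passes is_c_identifier() but fails wchan_mentions_sleep(), while "hrtimer_nanosleep" passes both. That is why the looser check would not catch the wrong-depth unwind.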
+int main(void)
+{
- char buf[64];
- pid_t child;
- int sync[2], fd;
- if (pipe(sync) < 0)
perror_exit("pipe");
- child = fork();
- if (child < 0)
perror_exit("fork");
- if (child == 0) {
/* Child */
if (close(sync[0]) < 0)
perror_exit("child close sync[0]");
if (close(sync[1]) < 0)
perror_exit("child close sync[1]");
Redundant close().
Hmm, did you maybe miss the differing array indexes? This closes the reading end followed by the writing end of the child's pipe.
sleep(10);
_exit(0);
- }
- /* Parent */
- if (close(sync[1]) < 0)
perror_exit("parent close sync[1]");
Redundant close().
It's not, though. This closes the write side of the parent's pipe.
- if (read(sync[0], buf, 1) != 0)
perror_exit("parent read sync[0]");
Racy if child is scheduled out after first close in the child.
No, the first close will close the child's read-side of the pipe, which isn't being used. For example, see[3].
The parent's read of /proc/$child/wchan could technically race if the child is scheduled out after the second close() and before the sleep(), but the parent is doing at least 2 syscalls before then. I'm open to a more exact synchronization method, but this should be sufficient. (For example, using ptrace to catch the sleep syscall entry seemed like overkill.)
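For reference, the pipe handshake on its own can be sketched roughly as below. This is a stripped-down illustration, not the selftest itself: the function name wait_for_child_eof() is made up here, and error handling is reduced to returning -1.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if the parent saw EOF on the pipe,
 * meaning the child has closed its copy of the write end; -1 on error.
 */
int wait_for_child_eof(void)
{
	int sync[2];
	char c;
	pid_t child;
	ssize_t n;

	if (pipe(sync) < 0)
		return -1;

	child = fork();
	if (child < 0) {
		close(sync[0]);
		close(sync[1]);
		return -1;
	}
	if (child == 0) {
		close(sync[0]);	/* child's read end: unused */
		close(sync[1]);	/* child's write end: this is the signal */
		sleep(10);
		_exit(0);
	}

	/* Parent must drop its own write end, or read() never sees EOF. */
	close(sync[1]);
	/* Blocks until every write end is closed, then returns 0 (EOF). */
	n = read(sync[0], &c, 1);
	close(sync[0]);

	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
	return n == 0 ? 0 : -1;
}
```

Neither close() in the child is redundant with the parent's: after fork() each process holds its own copies of both descriptors, and the parent's read() only returns 0 once the last copy of the write end is gone.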
-Kees
[1] https://lore.kernel.org/lkml/20211008111527.438276127@infradead.org/
[2] https://lore.kernel.org/lkml/20211008124052.GA976@C02TD0UTHF1T.local/
[3] https://man7.org/tlpi/code/online/diff/pipes/pipe_sync.c.html
- snprintf(buf, sizeof(buf), "/proc/%d/wchan", child);
- fd = open(buf, O_RDONLY);
- if (fd < 0) {
if (errno == ENOENT)
return 4;
perror_exit(buf);
- }
- memset(buf, 0, sizeof(buf));
- if (read(fd, buf, sizeof(buf) - 1) < 1)
perror_exit(buf);
- if (strstr(buf, "sleep") == NULL) {
fprintf(stderr, "FAIL: did not find 'sleep' in wchan '%s'\n", buf);
return 1;
- }
- printf("ok: found 'sleep' in wchan '%s'\n", buf);
- if (kill(child, SIGKILL) < 0)
perror_exit("kill");
- if (waitpid(child, NULL, 0) != child) {
fprintf(stderr, "waitpid: got the wrong child!?\n");
return 1;
- }
- return 0;
+}