On Wed, Jan 15, 2020 at 9:18 AM Christian Brauner <christian.brauner@ubuntu.com> wrote:
> Commit 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat") introduced the ability to opt out of audit messages for accesses to various proc files since they are not violations of policy. While doing so it somehow switched the check from ns_capable() to has_ns_capability{_noaudit}(). That means it switched from checking the subjective credentials of the task to using the objective credentials.
>
> I couldn't find the original lkml thread and so I don't know why this switch was done. But it seems wrong since ptrace_has_cap() is currently only used in ptrace_may_access(). And it's used to check whether the calling task (subject) has the CAP_SYS_PTRACE capability in the provided user namespace to operate on the target task (object). According to the cred.h comments this would mean the subjective credentials of the calling task need to be used.
>
> This switches it to use security_capable() because we only call ptrace_has_cap() in ptrace_may_access() and in there we already have a stable reference to the calling tasks creds under cred_guard_mutex so there's no need to go through another series of dereferences and rcu locking done in ns_capable{_noaudit}().
This patch breaks the CRIU tests. All of them fail because ptrace() returns EPERM:
$ python test/zdtm.py run -t zdtm/static/env00 --sat
=== Run 1/1 ================ zdtm/static/env00
========================== Run zdtm/static/env00 in h ==========================
Start test
./env00 --pidfile=env00.pid --outfile=env00.out --envname=ENV_00_TEST
Run criu dump
=[strace]=> dump/zdtm/static/env00/44/1/dump.strace
=[log]=> dump/zdtm/static/env00/44/1/dump.log
------------------------ grep Error ------------------------
b'(00.014558) cg: `- [net_cls,net_prio] -> [/] [0]'
b'(00.014634) cg: `- [perf_event] -> [/] [0]'
b'(00.014713) cg: `- [pids] -> [/user.slice/user-0.slice/session-1.scope] [0]'
b'(00.014818) cg: Set 1 is criu one'
b'(00.015123) Warn (compel/src/lib/infect.c:127): Unable to interrupt task: 44 (Operation not permitted)'
b'(00.015302) Unlock network'
b'(00.015423) Unfreezing tasks into 1'
b'(00.015524) \tUnseizing 44 into 1'
b'(00.015701) Error (compel/src/lib/infect.c:346): Unable to detach from 44: No such process'
b'(00.015864) Error (criu/cr-dump.c:1775): Dumping FAILED.'
------------------------ ERROR OVER ------------------------
################### Test zdtm/static/env00 FAIL at CRIU dump ###################
Send the 9 signal to 44
Wait for zdtm/static/env00(44) to die for 0.100000
##################################### FAIL #####################################
Here is the strace output for the criu dump process:
write(4, "(00.014482) cg: `- [name=zdt"..., 61) = 61 <0.000028>
write(4, "(00.014558) cg: `- [net_cls,"..., 53) = 53 <0.000028>
write(4, "(00.014634) cg: `- [perf_eve"..., 47) = 47 <0.000031>
write(4, "(00.014713) cg: `- [pids] ->"..., 80) = 80 <0.000034>
write(4, "(00.014818) cg: Set 1 is criu on"..., 34) = 34 <0.000028>
rt_sigaction(SIGALRM, {sa_handler=0x43de00, sa_mask=[ALRM], sa_flags=SA_RESTORER, sa_restorer=0x7f962247c6b0}, NULL, 8) = 0 <0.000018>
alarm(10) = 0 <0.000025>
ptrace(PTRACE_SEIZE, 44, NULL, 0) = -1 EPERM (Operation not permitted) <0.000022>
write(4, "(00.015123) Warn (compel/src/li"..., 104) = 104 <0.000029>
alarm(0) = 10 <0.000032>
write(4, "(00.015302) Unlock network\n", 27) = 27 <0.000043>
write(4, "(00.015423) Unfreezing tasks int"..., 36) = 36 <0.000036>
The criu process is started with all capabilities in the root user namespace.
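
The failing call from the strace above can be isolated in a standalone sketch like the one below. The helper names try_seize() and seize_paused_child() are made up for illustration and are not part of CRIU or compel. Seizing a freshly forked child is only an assumption about what might exercise the same path; whether it hits the same EPERM depends on the credentials and namespaces involved, so treat it as a starting point and not a confirmed reproducer.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue the same PTRACE_SEIZE request that shows
 * up in the strace (no address, no options) and return 0 on success
 * or the errno value on failure.
 */
static int try_seize(pid_t pid)
{
	assert(pid > 0);

	if (ptrace(PTRACE_SEIZE, pid, NULL, NULL) == -1)
		return errno;
	return 0;
}

/*
 * Hypothetical helper: fork a child that just sleeps, try to seize it,
 * then kill and reap it. Returns 0 if the seize worked, otherwise the
 * errno from fork() or ptrace().
 */
static int seize_paused_child(void)
{
	pid_t pid = fork();
	int err;

	if (pid == -1)
		return errno;

	if (pid == 0) {
		/* Child: sleep until the parent kills it. */
		for (;;)
			pause();
	}

	err = try_seize(pid);

	/* SIGKILL terminates the child whether or not it was seized. */
	kill(pid, SIGKILL);
	waitpid(pid, NULL, 0);
	return err;
}
```

Calling seize_paused_child() from a small main() and printing strerror() of a non-zero result gives the same "Operation not permitted" text as the log whenever the kernel rejects the request.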
I don't have time to investigate this issue right now, but I will provide more details next Tuesday.
The issue was detected by our Travis CI job: https://travis-ci.org/avagin/linux/jobs/638547093
Thanks,
Andrei