 
            On Tue, Mar 10, 2020 at 03:57:35PM -0500, Eric W. Biederman wrote:
Jann Horn jannh@google.com writes:
On Tue, Mar 10, 2020 at 9:00 PM Jann Horn jannh@google.com wrote:
On Tue, Mar 10, 2020 at 8:29 PM Eric W. Biederman ebiederm@xmission.com wrote:
Jann Horn jannh@google.com writes:
On Tue, Mar 10, 2020 at 7:54 PM Eric W. Biederman ebiederm@xmission.com wrote:
During exec some file descriptors are closed and the files struct is unshared. But all of that can happen at other times and it has the same protections during exec as at ordinary times. So stop taking the cred_guard_mutex as it is useless.
Furthermore he cred_guard_mutex is a bad idea because it is deadlock prone, as it is held in serveral while waiting possibly indefinitely for userspace to do something.
[...]
If you make this change, then if this races with execution of a setuid program that afterwards e.g. opens a unix domain socket, an attacker will be able to steal that socket and inject messages into communication with things like DBus. procfs currently has the same race, and that still needs to be fixed, but at least procfs doesn't let you open things like sockets because they don't have a working ->open handler, and it enforces the normal permission check for opening files.
It isn't only exec that can change credentials. Do we need a lock for changing credentials?
[...]
If we need a lock around credential change let's design and build that. Having a mismatch between what a lock is designed to do, and what people use it for can only result in other bugs as people get confused.
Hmm... what benefits do we get from making it a separate lock? I guess it would allow us to make it a per-task lock instead of a signal_struct-wide one? That might be helpful...
But actually, isn't the core purpose of the cred_guard_mutex to guard against concurrent credential changes anyway? That's what almost everyone uses it for, and it's in the name...
Having been through all of the users nope.
Maybe someone tried to repurpose for that. I haven't traced through when it went the it was renamed from cred_exec_mutex to cred_guard_mutex.
The original purpose was to make make exec and ptrace deadlock. But it was seen as being there to allow safely calculating the new credentials before the point of now return. Because if a process is ptraced or not affects the new credential calculations. Unfortunately offering that guarantee fundamentally leads to deadlock.
So ptrace_attach and seccomp use the cred_guard_mutex to guarantee a deadlock.
The common use is to take cred_guard_mutex to guard the window when credentials and process details are out of sync in exec. But there is at least do_io_accounting that seems to have the same justification for holding __pidfd_fget.
With effort I suspect we can replace exec_change_mutex with task_lock. When we are guaranteed to be single threaded placing exec_change_mutex in signal_struct doesn't really help us (except maybe in some races?).
The deep problem is no one really understands cred_guard_mutex so it is a mess. Code with poorly defined semantics is always wrong somewhere
This is a good point. When discussing patches sensitive to credential changes cred_guard_mutex was always introduced as having the purpose to guard against concurrent credential changes. And I'm pretty sure that that's how most people have been using it for quite a long time. I mean, it's at least the case for seccomp and proc and probably quite a few more. So the problem seems to me that it has clear _intended_ semantics that runs into issues in all sorts of cases. So if cred_guard_mutex is not that then we seem to need to provide something that serves it's intended purpose.