On Tue, Aug 9, 2022 at 12:08 PM Eric W. Biederman ebiederm@xmission.com wrote:
Paul Moore paul@paul-moore.com writes:
On Mon, Aug 8, 2022 at 3:43 PM Eric W. Biederman ebiederm@xmission.com wrote:
"Eric W. Biederman" ebiederm@xmission.com writes:
Paul Moore paul@paul-moore.com writes:
I did provide constructive feedback. My feedback to his problem was to address the real problem of bugs in the kernel.
We've heard from several people who have use cases which require adding LSM-level access controls and observability to user namespace creation. This is the problem we are trying to solve here; if you do not like the approach proposed in this patchset please suggest another implementation that allows LSMs visibility into user namespace creation.
Please stop, ignoring my feedback, not detailing what problem or problems you are actually trying to be solved, and threatening to merge code into files that I maintain that has the express purpose of breaking my users.
You just artificially constrained the problems, so that no other solution is acceptable. On that basis alone I am object to this whole approach to steam roll over me and my code.
If you want an example of what kind of harm it can cause to introduce a failure where no failure was before I invite you to look at what happened with sendmail when setuid was modified to fail, when changing the user of a process would cause RLIMIT_NPROC to be exceeded.
I think we are all familiar with the sendmail capabilities bug and the others like it, but using that as an excuse to block additional access controls seems very weak. The Linux Kernel is very different from when the sendmail bug hit (what was that, ~20 years ago?), with advancements in capabilities and other discretionary controls, as well as mandatory access controls which have enabled Linux to be certified through a number of third party security evaluations.
If you are familiar with scenarios like that then why is there not being due diligence performed to ensure that userspace won't break?
What level of due diligence would satisfy you Eric? This feels very much like an unbounded ask that can be used to permanently block any effort to add any form of additional access control around user namespace creation. If that isn't the case, and this request is being made in good faith, please elaborate on what you believe would be sufficient analysis of this patch?
Personally, the fact that the fork()/clone() variants and the unshare() syscall all can fail and return error codes to userspace seems like a reasonable level of safety. If userspace is currently ignoring the return values of fork/clone/unshare that is a problem independent of this patchset. Even looking at the existing create_user_ns() function shows several cases where the user namespace code can forcibly error out due to access controls, memory pressure, etc. I also see that additional restrictions were put on user namespace creation after it was introduced, e.g. the chroot restriction; I'm assuming that didn't break "your" users? If you can describe the analysis you did on that change, perhaps we can do the same for this patchset and call it sufficient, yes?
I am not arguing that what you are proposing is that bad but unexpected failures cause real problems, and at a minimum that needs a better response than: "There is at least one user that wants a failure here".
Let me fix that for you: "There are multiple users who want to have better visibility and access control for user namespace creation."
Visibility sure. Design a proper hook for that. All the proposed hook can do is print an audit message. It can't allocate or manage any state as there is not the corresponding hook when a user namespace is freed. So the proposed hook is not appropriate for increasing visibility.
Auditing very much increases visibility. Look at what the BPF LSM can do, observability is one of its primary goals.
Access control. Not a chance unless it is carefully designed and reviewed.
See the above. The relevant syscalls already have the risk of failure, if userspace is assuming fork/clone/unshare/etc. will never fail then that application is broken independent of this discussion.