On Mon, Jan 06, 2025 at 05:14:26PM -0800, Isaac Manjarres wrote:
On Fri, Jan 03, 2025 at 04:03:44PM +0100, Jann Horn wrote:
On Fri, Jan 3, 2025 at 12:32 AM Isaac J. Manjarres isaacmanjarres@google.com wrote:
Android currently uses the ashmem driver [1] for creating shared memory regions between processes. Ashmem buffers can initially be mapped with PROT_READ, PROT_WRITE, and PROT_EXEC. Processes can then use the ASHMEM_SET_PROT_MASK ioctl command to restrict--never add--the permissions that the buffer can be mapped with.
Processes can remove the ability to map ashmem buffers as executable to ensure that those buffers cannot be exploited to run unintended code.
Is there really code out there that first maps an ashmem buffer with PROT_EXEC, then uses the ioctl to remove execute permission for future mappings? I don't see why anyone would do that.
Hi Jann,
Thanks for your feedback and for taking the time to review these patches!
Not that I'm aware of. The reason why I made this seal have semantics where it prevents future executable mappings is because there are existing applications that allocate an ashmem buffer (default permissions are RWX), map the buffer as RW, and then restrict the permissions to just R.
When the buffer is mapped as RW, do_mmap() unconditionally sets VM_MAYEXEC on the VMA for the mapping, which means that the mapping could later be mapped as executable via mprotect(). Therefore, having the semantics of the seal be that it prevents any executable mappings would break existing code that has already been released. It would make transitioning clients to memfd difficult, because to amend that, the ashmem users would have to first restrict the permissions of the buffer to be RW, then map it as RW, and then restrict the permissions again to be just R, which also means an additional system call.
You could do something similar to my adjustments to the F_SEAL_WRITE changes that clears VM_MAYEXEC in cases where do_mmap() maps an F_SEAL_EXEC'd without PROT_EXEC.
Please note that F_SEAL_EXEC implies:
F_SEAL_SHRINK F_SEAL_GROW F_SEAL_WRITE <- important, obviously F_SEAL_FUTURE_WRITE <- also important
if (seals & F_SEAL_EXEC && inode->i_mode & 0111) seals |= F_SEAL_SHRINK|F_SEAL_GROW|F_SEAL_WRITE|F_SEAL_FUTURE_WRITE;
Though interestingly only _after_ the mapping_deny_writable() call is performed which is... odd.
Probably worth exploring F_SEAL_EXEC semantics in detail if you haven't already to see how viable this is.
For instance, suppose process A allocates a memfd that is meant to be read and written by itself and another process, call it B.
Process A shares the buffer with process B, but process B injects code into the buffer, and compromises process A, such that it makes A map the buffer with PROT_EXEC. This provides an opportunity for process A to run the code that process B injected into the buffer.
If process A had the ability to seal the buffer against future executable mappings before sharing the buffer with process B, this attack would not be possible.
I think if you want to enforce such restrictions in a scenario where the attacker can already make the target process perform semi-arbitrary syscalls, it would probably be more reliable to enforce rules on executable mappings with something like SELinux policy and/or F_SEAL_EXEC.
For SELinux policy, do you mean to not allow execmem permissions? What about scenarios where a process wants to use JIT compilation, but doesn't want memfd data buffers to be executable? My thought was to use this new seal to have a finer granularity to control what buffers can be mapped as executable. If not, could you please clarify?
Also, F_SEAL_EXEC just seals the memfd's current executable permissions, and doesn't affect the mapping permissions at all. Are you saying that F_SEAL_EXEC should be extended to cover mappings as well? If so, it is not clear to me what the semantics of that would be.
I need to dig into how this functions, but I'm guessing then it doesn't immediatley enforce anything preventing an existing mapping from executing? Which differs from F_SEAL_WRITE semantics and then makes it seem like it is already acting like an F_SEAL_FUTURE_EXEC in effect?
Hm need to dig into this a bit.
For instance, if a memfd is non-executable and F_SEAL_EXEC is applied, we can also prevent any executable mappings at that point. I'm not sure if that's the right thing to do though. For instance, there are shared object files that don't have executable permissions, but their code sections should be mapped as executable. So, drawing from that, I'm not sure if it makes sense to tie the file execution permissions to the mapping permissions.
There's also the case where F_SEAL_EXEC is invoked on an executable memfd. In that case, there doesn't seem to be anything to do from a mapping perspective since memfds can be mapped as executable by default?
Android is currently trying to replace ashmem with memfd. However, memfd does not have a provision to permanently remove the ability to map a buffer as executable, and leaves itself open to the type of attack described earlier. However, this should be something that can be achieved via a new file seal.
There are known usecases (e.g. CursorWindow [2]) where a process maps a buffer with read/write permissions before restricting the buffer to being mapped as read-only for future mappings.
Here you're talking about write permission, but the patch is about execute permission?
Sorry, I used this example about write permission to show why I implemented the seal with support for preventing future mappings, since the writable mappings that get created can become executable in the future, as described later in the commit text.
The resulting VMA from the writable mapping has VM_MAYEXEC set, meaning that mprotect() can change the mapping to be executable. Therefore, implementing the seal similar to F_SEAL_WRITE would not be appropriate, since it would not work with the CursorWindow usecase. This is because the CursorWindow process restricts the mapping permissions to read-only after the writable mapping is created. So, adding a file seal for executable mappings that operates like F_SEAL_WRITE would fail.
Therefore, add support for F_SEAL_FUTURE_EXEC, which is handled similarly to F_SEAL_FUTURE_WRITE. This ensures that CursorWindow can continue to create a writable mapping initially, and then restrict the permissions on the buffer to be mappable as read-only by using both F_SEAL_FUTURE_WRITE and F_SEAL_FUTURE_EXEC. After the seal is applied, any calls to mmap() with PROT_EXEC will fail.
[1] https://cs.android.com/android/kernel/superproject/+/common-android-mainline... [2] https://developer.android.com/reference/android/database/CursorWindow
Signed-off-by: Isaac J. Manjarres isaacmanjarres@google.com
include/uapi/linux/fcntl.h | 1 + mm/memfd.c | 39 +++++++++++++++++++++++++++++++++++++- 2 files changed, 39 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h index 6e6907e63bfc..ef066e524777 100644 --- a/include/uapi/linux/fcntl.h +++ b/include/uapi/linux/fcntl.h @@ -49,6 +49,7 @@ #define F_SEAL_WRITE 0x0008 /* prevent writes */ #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ #define F_SEAL_EXEC 0x0020 /* prevent chmod modifying exec bits */ +#define F_SEAL_FUTURE_EXEC 0x0040 /* prevent future executable mappings */ /* (1U << 31) is reserved for signed error codes */
/* diff --git a/mm/memfd.c b/mm/memfd.c index 5f5a23c9051d..cfd62454df5e 100644 --- a/mm/memfd.c +++ b/mm/memfd.c @@ -184,6 +184,7 @@ static unsigned int *memfd_file_seals_ptr(struct file *file) }
#define F_ALL_SEALS (F_SEAL_SEAL | \
F_SEAL_FUTURE_EXEC |\ F_SEAL_EXEC | \ F_SEAL_SHRINK | \ F_SEAL_GROW | \
@@ -357,14 +358,50 @@ static int check_write_seal(unsigned long *vm_flags_ptr) return 0; }
+static inline bool is_exec_sealed(unsigned int seals) +{
return seals & F_SEAL_FUTURE_EXEC;
+}
+static int check_exec_seal(unsigned long *vm_flags_ptr) +{
unsigned long vm_flags = *vm_flags_ptr;
unsigned long mask = vm_flags & (VM_SHARED | VM_EXEC);
/* Executability is not a concern for private mappings. */
if (!(mask & VM_SHARED))
return 0;
Why is it not a concern for private mappings?
I didn't consider private mappings since it wasn't clear as to how they could be a threat to another process. A process can copy the contents of the buffer into another executable region of memory and just run it from there? Or are you saying that because it can do that, is there any value in differentiating between shared and private mappings?
Yes this is my point of view also but I might be missing something.
Thanks, Isaac