=== Overview
arm64 has a feature called Top Byte Ignore, which allows software to embed pointer tags into the top byte of each pointer. Userspace programs (such as HWASan, a memory debugging tool [1]) might use this feature and pass tagged user pointers to the kernel through syscalls or other interfaces.
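To make the tag layout concrete, here is a minimal userspace sketch (not part of the patchset). It assumes the tag occupies bits 63:56 of a 64-bit pointer value; the helper names set_tag(), get_tag() and clear_tag() are hypothetical and used only for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers: the tag is assumed to live in bits 63:56. */
#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Place an 8-bit tag into the top byte, replacing any existing tag. */
static inline uint64_t set_tag(uint64_t addr, uint8_t tag)
{
	return (addr & ~TAG_MASK) | ((uint64_t)tag << TAG_SHIFT);
}

/* Read the tag back from the top byte. */
static inline uint8_t get_tag(uint64_t addr)
{
	return (uint8_t)(addr >> TAG_SHIFT);
}

/* Drop the tag, leaving the plain user address. */
static inline uint64_t clear_tag(uint64_t addr)
{
	return addr & ~TAG_MASK;
}
```

With Top Byte Ignore enabled the hardware ignores the top byte on loads and stores, so a value produced by set_tag() can be dereferenced directly on arm64; the kernel, however, has to be taught about such values, which is what this series does.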
Right now the kernel is already able to handle user faults with tagged pointers, due to these patches:
1. 81cddd65 ("arm64: traps: fix userspace cache maintenance emulation on a tagged pointer")
2. 7dcd9dd8 ("arm64: hw_breakpoint: fix watchpoint matching for tagged pointers")
3. 276e9327 ("arm64: entry: improve data abort handling of tagged pointers")
This patchset extends tagged pointer support to syscall arguments.
As per the proposed ABI change [3], tagged pointers are only allowed to be passed to syscalls when they point to memory ranges obtained by anonymous mmap() or sbrk() (see the patchset [3] for more details).
For non-memory syscalls this is done by untagging user pointers when the kernel performs pointer checking to find out whether the pointer comes from userspace (most notably in access_ok). The untagging is done only when the pointer is being checked; the tag is preserved as the pointer makes its way through the kernel, and the pointer stays tagged when the kernel dereferences it while performing user memory accesses.
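The "untag only for the check" idea can be sketched in plain C as follows. This is a simplified model, not the kernel's access_ok(): USER_LIMIT, untag_sketch() and range_ok() are hypothetical names, and the sketch assumes that untagging sign-extends the address from bit 55.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user address limit (48-bit VA), for illustration only. */
#define USER_LIMIT ((uint64_t)1 << 48)

/* Assumed untagging rule: sign-extend from bit 55, portably. */
static inline uint64_t untag_sketch(uint64_t addr)
{
	const uint64_t sign = (uint64_t)1 << 55;
	uint64_t low = addr & (((uint64_t)1 << 56) - 1);

	return (low ^ sign) - sign;
}

/*
 * Range check on the untagged value only. The caller keeps its own
 * (possibly tagged) copy of the pointer and uses that for the access.
 */
static bool range_ok(uint64_t addr, uint64_t size)
{
	uint64_t start = untag_sketch(addr);

	return size <= USER_LIMIT && start <= USER_LIMIT - size;
}
```

The point of the design is that range_ok() never hands back a modified pointer: the check sees the untagged address, while later accesses still go through the tagged one.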
The mmap and mremap (only new_addr) syscalls do not currently accept tagged addresses. Architectures may interpret the tag as a background colour for the corresponding vma.
Other memory syscalls (mprotect, etc.) don't do user memory accesses but rather deal with memory ranges, and untagged pointers are better suited to describe memory ranges internally. Thus for memory syscalls we untag pointers completely when they enter the kernel.
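For memory syscalls the tag is dropped once, on entry, and everything after that works on a plain address range. A minimal sketch of that shape, assuming sign-extension from bit 55 as the untagging rule; struct range_sketch and mem_syscall_range() are hypothetical names, not kernel code:

```c
#include <stdint.h>

/* Assumed untagging rule: sign-extend from bit 55, portably. */
static inline uint64_t untag_sketch(uint64_t addr)
{
	const uint64_t sign = (uint64_t)1 << 55;
	uint64_t low = addr & (((uint64_t)1 << 56) - 1);

	return (low ^ sign) - sign;
}

/* Hypothetical [start, end) range as a memory syscall would use it. */
struct range_sketch {
	uint64_t start;
	uint64_t end;
};

/* Untag once at the top, then compute the range from the plain address. */
static struct range_sketch mem_syscall_range(uint64_t start, uint64_t len)
{
	struct range_sketch r;

	start = untag_sketch(start);
	r.start = start;
	r.end = start + len;
	return r;
}
```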
=== Other approaches
One of the alternative approaches to untagging that was considered is to completely strip the pointer tag as the pointer enters the kernel with some kind of a syscall wrapper, but that won't work with the countless number of different ioctl calls. With this approach we would need a custom wrapper for each ioctl variation, which doesn't seem practical.
An alternative approach to untagging pointers in memory syscall prologues is to instead allow tagged pointers to be passed to find_vma() (and other vma related functions) and untag them there. Unfortunately, a lot of find_vma() callers then compare or subtract the returned vma start and end fields against the pointer that was being searched. Thus this approach would still require changing all find_vma() callers.
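The find_vma() problem is easy to reproduce outside the kernel. The sketch below uses a hypothetical struct vma_sketch with only start and end fields and shows that both the containment test and the offset computation go wrong when the searched address still carries a tag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vm_start/vm_end fields of a vma. */
struct vma_sketch {
	uint64_t vm_start;
	uint64_t vm_end;
};

/* Typical caller-side check after a lookup: is addr inside the vma? */
static bool addr_in_vma(const struct vma_sketch *vma, uint64_t addr)
{
	return addr >= vma->vm_start && addr < vma->vm_end;
}

/* Typical caller-side arithmetic: offset of addr from the vma start. */
static uint64_t offset_in_vma(const struct vma_sketch *vma, uint64_t addr)
{
	return addr - vma->vm_start;
}
```

With a vma covering 0x7f00001000..0x7f00003000, the untagged address 0x7f00001800 is inside it at offset 0x800, while the same address with tag 0x2a in the top byte fails the containment test and produces a huge bogus offset. Untagging inside the lookup alone would not fix these callers.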
=== Testing
The following testing approaches have been taken to find potential issues with user pointer untagging:
1. Static testing (with sparse [2] and separately with a custom static analyzer based on Clang) to track casts of __user pointers to integer types to find places where untagging needs to be done.
2. Static testing with grep to find parts of the kernel that call find_vma() (and other similar functions) or directly compare against vm_start/vm_end fields of vma.
3. Static testing with grep to find parts of the kernel that compare user pointers with TASK_SIZE or other similar consts and macros.
4. Dynamic testing: adding BUG_ON(has_tag(addr)) to find_vma() and running a modified syzkaller version that passes tagged pointers to the kernel.
Based on the results of the testing the required patches have been added to the patchset.
=== Notes
This patchset is meant to be merged together with "arm64 relaxed ABI" [3].
This patchset is a prerequisite for ARM's memory tagging hardware feature support [4].
This patchset has been merged into the Pixel 2 & 3 kernel trees and is now being used to enable testing of Pixel phones with HWASan.
Thanks!
[1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://github.com/lucvoo/sparse-dev/commit/5f960cb10f56ec2017c128ef9d16060e...
[3] https://lkml.org/lkml/2019/3/18/819
[4] https://community.arm.com/processors/b/blog/posts/arm-a-profile-architecture...
=== History
Changes in v17:
- The "uaccess: add noop untagged_addr definition" patch is dropped, as it was merged into upstream named as "uaccess: add noop untagged_addr definition".
- Merged "mm, arm64: untag user pointers in do_pages_move" into "mm, arm64: untag user pointers passed to memory syscalls".
- Added "arm64: Introduce prctl() options to control the tagged user addresses ABI" patch from Catalin.
- Add tags_lib.so to tools/testing/selftests/arm64/.gitignore.
- Added a comment clarifying untagged in mremap.
- Moved untagging back into mlx4_get_umem_mr() for the IB patch.

Changes in v16:
- Moved untagging for memory syscalls from arm64 wrappers back to generic code.
- Dropped untagging for the following memory syscalls: brk, mmap, munmap; mremap (only dropped for new_address); mmap_pgoff (not used on arm64); remap_file_pages (deprecated); shmat, shmdt (work on shared memory).
- Changed kselftest to LD_PRELOAD a shared library that overrides malloc to return tagged pointers.
- Rebased onto 5.2-rc3.

Changes in v15:
- Removed unnecessary untagging from radeon_ttm_tt_set_userptr().
- Removed unnecessary untagging from amdgpu_ttm_tt_set_userptr().
- Moved untagging to validate_range() in userfaultfd code.
- Moved untagging to ib_uverbs_(re)reg_mr() from mlx4_get_umem_mr().
- Rebased onto 5.1.

Changes in v14:
- Moved untagging for most memory syscalls to an arm64 specific implementation, instead of doing that in the common code.
- Dropped "net, arm64: untag user pointers in tcp_zerocopy_receive", since the provided user pointers don't come from an anonymous map and thus are not covered by this ABI relaxation.
- Dropped "kernel, arm64: untag user pointers in prctl_set_mm*".
- Moved untagging from __check_mem_type() to tee_shm_register().
- Updated untagging for the amdgpu and radeon drivers to cover the MMU notifier, as suggested by Felix.
- Since this ABI relaxation doesn't actually allow tagged instruction pointers, dropped the following patches:
  - Dropped "tracing, arm64: untag user pointers in seq_print_user_ip".
  - Dropped "uprobes, arm64: untag user pointers in find_active_uprobe".
  - Dropped "bpf, arm64: untag user pointers in stack_map_get_build_id_offset".
- Rebased onto 5.1-rc7 (37624b58).

Changes in v13:
- Simplified untagging in tcp_zerocopy_receive().
- Looked at find_vma() callers in drivers/, which made it possible to identify a few other places where untagging is needed.
- Added patch "mm, arm64: untag user pointers in get_vaddr_frames".
- Added patch "drm/amdgpu, arm64: untag user pointers in amdgpu_ttm_tt_get_user_pages".
- Added patch "drm/radeon, arm64: untag user pointers in radeon_ttm_tt_pin_userptr".
- Added patch "IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr".
- Added patch "media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get".
- Added patch "tee/optee, arm64: untag user pointers in check_mem_type".
- Added patch "vfio/type1, arm64: untag user pointers".

Changes in v12:
- Changed untagging in tcp_zerocopy_receive() to also untag zc->address.
- Fixed untagging in prctl_set_mm* to only untag pointers for vma lookups and validity checks, but leave them as is for actual user space accesses.
- Updated the link to the v2 of the "arm64 relaxed ABI" patchset [3].
- Dropped the documentation patch, as the "arm64 relaxed ABI" patchset [3] handles that.

Changes in v11:
- Added "uprobes, arm64: untag user pointers in find_active_uprobe" patch.
- Added "bpf, arm64: untag user pointers in stack_map_get_build_id_offset" patch.
- Fixed "tracing, arm64: untag user pointers in seq_print_user_ip" to correctly perform subtraction with a tagged addr.
- Moved untagged_addr() from SYSCALL_DEFINE3(mprotect) and SYSCALL_DEFINE4(pkey_mprotect) to do_mprotect_pkey().
- Moved untagged_addr() definition for other arches from include/linux/memory.h to include/linux/mm.h.
- Changed untagging in strn*_user() to perform userspace accesses through tagged pointers.
- Updated the documentation to mention that passing tagged pointers to memory syscalls is allowed.
- Updated the test to use malloc'ed memory instead of stack memory.

Changes in v10:
- Added "mm, arm64: untag user pointers passed to memory syscalls" back.
- New patch "fs, arm64: untag user pointers in fs/userfaultfd.c".
- New patch "net, arm64: untag user pointers in tcp_zerocopy_receive".
- New patch "kernel, arm64: untag user pointers in prctl_set_mm*".
- New patch "tracing, arm64: untag user pointers in seq_print_user_ip".

Changes in v9:
- Rebased onto 4.20-rc6.
- Used u64 instead of __u64 in type casts in the untagged_addr macro for arm64.
- Added braces around (addr) in the untagged_addr macro for other arches.

Changes in v8:
- Rebased onto 65102238 (4.20-rc1).
- Added a note to the cover letter on why syscall wrappers/shims that untag user pointers won't work.
- Added a note to the cover letter that this patchset has been merged into the Pixel 2 kernel tree.
- Documentation fixes, in particular added a list of syscalls that don't support tagged user pointers.

Changes in v7:
- Rebased onto 17b57b18 (4.19-rc6).
- Dropped the "arm64: untag user address in __do_user_fault" patch, since the existing patches already handle user faults properly.
- Dropped the "usb, arm64: untag user addresses in devio" patch, since the passed pointer must come from a vma and therefore be untagged.
- Dropped the "arm64: annotate user pointers casts detected by sparse" patch (see the discussion to the replies of the v6 of this patchset).
- Added more context to the cover letter.
- Updated Documentation/arm64/tagged-pointers.txt.

Changes in v6:
- Added annotations for user pointer casts found by sparse.
- Rebased onto 050cdc6c (4.19-rc1+).

Changes in v5:
- Added 3 new patches that add untagging to places found with static analysis.
- Rebased onto 44c929e1 (4.18-rc8).

Changes in v4:
- Added a selftest for checking that passing tagged pointers to the kernel succeeds.
- Rebased onto 81e97f013 (4.18-rc1+).

Changes in v3:
- Rebased onto e5c51f30 (4.17-rc6+).
- Added linux-arch@ to the list of recipients.

Changes in v2:
- Rebased onto 2d618bdf (4.17-rc3+).
- Removed excessive untagging in gup.c.
- Removed untagging pointers returned from __uaccess_mask_ptr.

Changes in v1:
- Rebased onto 4.17-rc1.

Changes in RFC v2:
- Added "#ifndef untagged_addr..." fallback in linux/uaccess.h instead of defining it for each arch individually.
- Updated Documentation/arm64/tagged-pointers.txt.
- Dropped "mm, arm64: untag user addresses in memory syscalls".
- Rebased onto 3eb2ce82 (4.16-rc7).

Signed-off-by: Andrey Konovalov andreyknvl@google.com

Andrey Konovalov (14):
  arm64: untag user pointers in access_ok and __uaccess_mask_ptr
  lib, arm64: untag user pointers in strn*_user
  mm, arm64: untag user pointers passed to memory syscalls
  mm, arm64: untag user pointers in mm/gup.c
  mm, arm64: untag user pointers in get_vaddr_frames
  fs, arm64: untag user pointers in copy_mount_options
  userfaultfd, arm64: untag user pointers
  drm/amdgpu, arm64: untag user pointers
  drm/radeon, arm64: untag user pointers in radeon_gem_userptr_ioctl
  IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr
  media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get
  tee/shm, arm64: untag user pointers in tee_shm_register
  vfio/type1, arm64: untag user pointers in vaddr_get_pfn
  selftests, arm64: add a selftest for passing tagged pointers to kernel

Catalin Marinas (1):
  arm64: Introduce prctl() options to control the tagged user addresses ABI

 arch/arm64/include/asm/processor.h            |  6 ++
 arch/arm64/include/asm/thread_info.h          |  1 +
 arch/arm64/include/asm/uaccess.h              | 11 ++-
 arch/arm64/kernel/process.c                   | 67 +++++++++++++++++++
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +
 drivers/gpu/drm/radeon/radeon_gem.c           |  2 +
 drivers/infiniband/hw/mlx4/mr.c               |  7 +-
 drivers/media/v4l2-core/videobuf-dma-contig.c |  9 +--
 drivers/tee/tee_shm.c                         |  1 +
 drivers/vfio/vfio_iommu_type1.c               |  2 +
 fs/namespace.c                                |  2 +-
 fs/userfaultfd.c                              | 22 +++---
 include/uapi/linux/prctl.h                    |  5 ++
 kernel/sys.c                                  | 16 +++++
 lib/strncpy_from_user.c                       |  3 +-
 lib/strnlen_user.c                            |  3 +-
 mm/frame_vector.c                             |  2 +
 mm/gup.c                                      |  4 ++
 mm/madvise.c                                  |  2 +
 mm/mempolicy.c                                |  3 +
 mm/migrate.c                                  |  2 +-
 mm/mincore.c                                  |  2 +
 mm/mlock.c                                    |  4 ++
 mm/mprotect.c                                 |  2 +
 mm/mremap.c                                   |  7 ++
 mm/msync.c                                    |  2 +
 tools/testing/selftests/arm64/.gitignore      |  2 +
 tools/testing/selftests/arm64/Makefile        | 22 ++++++
 .../testing/selftests/arm64/run_tags_test.sh  | 12 ++++
 tools/testing/selftests/arm64/tags_lib.c      | 62 +++++++++++++++++
 tools/testing/selftests/arm64/tags_test.c     | 18 +++++
 32 files changed, 282 insertions(+), 25 deletions(-)
 create mode 100644 tools/testing/selftests/arm64/.gitignore
 create mode 100644 tools/testing/selftests/arm64/Makefile
 create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh
 create mode 100644 tools/testing/selftests/arm64/tags_lib.c
 create mode 100644 tools/testing/selftests/arm64/tags_test.c

This patch is a part of a series that extends arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.

copy_from_user (and a few other similar functions) are used to copy data from user memory into the kernel memory or vice versa. Since a user can provide a tagged pointer to one of the syscalls that use copy_from_user, we need to correctly handle such pointers.

Do this by untagging user pointers in access_ok and in __uaccess_mask_ptr, before performing access validity checks.

Note that this patch only temporarily untags the pointers to perform the checks, but then passes them as is into the kernel internals.

Reviewed-by: Kees Cook keescook@chromium.org
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Andrey Konovalov andreyknvl@google.com
---
 arch/arm64/include/asm/uaccess.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index e5d5f31c6d36..df729afca0ba 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -73,6 +73,8 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 {
 	unsigned long ret, limit = current_thread_info()->addr_limit;
 
+	addr = untagged_addr(addr);
+
 	__chk_user_ptr(addr);
 	asm volatile(
 	// A + B <= C + 1 for all A,B,C, in four easy steps:
@@ -226,7 +228,8 @@ static inline void uaccess_enable_not_uao(void)
 
 /*
  * Sanitise a uaccess pointer such that it becomes NULL if above the
- * current addr_limit.
+ * current addr_limit. In case the pointer is tagged (has the top byte set),
+ * untag the pointer before checking.
  */
 #define uaccess_mask_ptr(ptr) (__typeof__(ptr))__uaccess_mask_ptr(ptr)
 static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
@@ -234,10 +237,11 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr)
 	void __user *safe_ptr;
 
 	asm volatile(
-	"	bics	xzr, %1, %2\n"
+	"	bics	xzr, %3, %2\n"
 	"	csel	%0, %1, xzr, eq\n"
 	: "=&r" (safe_ptr)
-	: "r" (ptr), "r" (current_thread_info()->addr_limit)
+	: "r" (ptr), "r" (current_thread_info()->addr_limit),
+	  "r" (untagged_addr(ptr))
 	: "cc");
 
 	csdb();
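
For readers less familiar with the bics/csel pair, the following is a plain C model of what the changed __uaccess_mask_ptr() computes. It is an illustrative sketch, not the kernel code: untag_sketch() and mask_ptr_model() are hypothetical names, addr_limit is treated as a bitmask of allowed address bits, and untagging is assumed to sign-extend from bit 55.

```c
#include <stdint.h>

/* Assumed untagging rule: sign-extend from bit 55, portably. */
static inline uint64_t untag_sketch(uint64_t addr)
{
	const uint64_t sign = (uint64_t)1 << 55;
	uint64_t low = addr & (((uint64_t)1 << 56) - 1);

	return (low ^ sign) - sign;
}

/*
 * Model of "bics xzr, %3, %2; csel %0, %1, xzr, eq":
 * test the UNTAGGED pointer against the limit, but return the
 * ORIGINAL (possibly tagged) pointer when the test passes.
 */
static uint64_t mask_ptr_model(uint64_t ptr, uint64_t addr_limit)
{
	uint64_t untagged = untag_sketch(ptr);

	/* bics: any bit set outside addr_limit means "above the limit". */
	if ((untagged & ~addr_limit) != 0)
		return 0;	/* csel picks xzr, i.e. NULL */

	return ptr;		/* csel picks %1, the tag is preserved */
}
```

Under these assumptions a tagged user pointer passes the check and comes back unchanged, while a kernel address, or a tagged value with bit 55 set, is turned into NULL.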
On 12/06/2019 12:43, Andrey Konovalov wrote:
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
This patch is a part of a series that extends arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
strncpy_from_user and strnlen_user accept user addresses as arguments, and do not go through the same path as copy_from_user and others, so here we need to handle the case of tagged user addresses separately.
Untag user pointers passed to these functions.
Note that this patch only temporarily untags the pointers to perform validity checks, but then uses them as is to perform user memory accesses.
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
Acked-by: Kees Cook keescook@chromium.org
Reviewed-by: Catalin Marinas catalin.marinas@arm.com
Signed-off-by: Andrey Konovalov andreyknvl@google.com
---
 lib/strncpy_from_user.c | 3 ++-
 lib/strnlen_user.c      | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 023ba9f3b99f..dccb95af6003 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -6,6 +6,7 @@
 #include <linux/uaccess.h>
 #include <linux/kernel.h>
 #include <linux/errno.h>
+#include <linux/mm.h>
 
 #include <asm/byteorder.h>
 #include <asm/word-at-a-time.h>
@@ -108,7 +109,7 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
 		return 0;
 
 	max_addr = user_addr_max();
-	src_addr = (unsigned long)src;
+	src_addr = (unsigned long)untagged_addr(src);
 	if (likely(src_addr < max_addr)) {
 		unsigned long max = max_addr - src_addr;
 		long retval;
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 7f2db3fe311f..28ff554a1be8 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/export.h>
 #include <linux/uaccess.h>
+#include <linux/mm.h>
 
 #include <asm/word-at-a-time.h>
 
@@ -109,7 +110,7 @@ long strnlen_user(const char __user *str, long count)
 		return 0;
 
 	max_addr = user_addr_max();
-	src_addr = (unsigned long)str;
+	src_addr = (unsigned long)untagged_addr(str);
 	if (likely(src_addr < max_addr)) {
 		unsigned long max = max_addr - src_addr;
 		long retval;
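
The effect of untagging before the src_addr < max_addr comparison can be modelled in a few lines of C. This is a sketch under stated assumptions, not the kernel code: max_user_bytes() and untag_sketch() are hypothetical names, and untagging is assumed to sign-extend from bit 55.

```c
#include <stdint.h>

/* Assumed untagging rule: sign-extend from bit 55, portably. */
static inline uint64_t untag_sketch(uint64_t addr)
{
	const uint64_t sign = (uint64_t)1 << 55;
	uint64_t low = addr & (((uint64_t)1 << 56) - 1);

	return (low ^ sign) - sign;
}

/*
 * How many bytes may be read starting at src before hitting max_addr.
 * Only the bound is computed from the untagged value; the caller would
 * still read through the original, tagged src.
 */
static uint64_t max_user_bytes(uint64_t src, uint64_t max_addr)
{
	uint64_t src_addr = untag_sketch(src);

	if (src_addr < max_addr)
		return max_addr - src_addr;

	return 0;	/* address is not below the user limit */
}
```

Without the untagging step a tagged src would compare as larger than max_addr and the function would refuse a perfectly valid user string.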
On 12/06/2019 12:43, Andrey Konovalov wrote:
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
From: Catalin Marinas catalin.marinas@arm.com
It is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately. This patch introduces a prctl() interface for enabling or disabling the tagged ABI with a global sysctl control for preventing applications from enabling the relaxed ABI (meant for testing user-space prctl() return error checking without reconfiguring the kernel). The ABI properties are inherited by threads of the same application and fork()'ed children but cleared on execve().
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle MTE-specific settings like imprecise vs precise exceptions.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com
---
 arch/arm64/include/asm/processor.h   |  6 +++
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/include/asm/uaccess.h     |  3 +-
 arch/arm64/kernel/process.c          | 67 ++++++++++++++++++++++++++++
 include/uapi/linux/prctl.h           |  5 +++
 kernel/sys.c                         | 16 +++++++
 6 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..fee457456aa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)	ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_TAGGED_ADDR prctl */
+long set_tagged_addr_ctrl(unsigned long arg);
+long get_tagged_addr_ctrl(void);
+#define SET_TAGGED_ADDR_CTRL(arg)	set_tagged_addr_ctrl(arg)
+#define GET_TAGGED_ADDR_CTRL()		get_tagged_addr_ctrl()
+
 /*
  * For CONFIG_GCC_PLUGIN_STACKLEAK
  *
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index f1d032be628a..354a31d2b737 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -99,6 +99,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_SVE			23	/* Scalable Vector Extension in use */
 #define TIF_SVE_VL_INHERIT	24	/* Inherit sve_vl_onexec across exec */
 #define TIF_SSBD		25	/* Wants SSB mitigation */
+#define TIF_TAGGED_ADDR		26	/* Allow tagged user addresses */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index df729afca0ba..995b9ea11a89 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -73,7 +73,8 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 {
 	unsigned long ret, limit = current_thread_info()->addr_limit;
 
-	addr = untagged_addr(addr);
+	if (test_thread_flag(TIF_TAGGED_ADDR))
+		addr = untagged_addr(addr);
 
 	__chk_user_ptr(addr);
 	asm volatile(
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3767fb21a5b8..69d0be1fc708 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -30,6 +30,7 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/stddef.h>
+#include <linux/sysctl.h>
 #include <linux/unistd.h>
 #include <linux/user.h>
 #include <linux/delay.h>
@@ -323,6 +324,7 @@ void flush_thread(void)
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
+	clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
 void release_thread(struct task_struct *dead_task)
@@ -552,3 +554,68 @@ void arch_setup_new_exec(void)
 
 	ptrauth_thread_init_user(current);
 }
+
+/*
+ * Control the relaxed ABI allowing tagged user addresses into the kernel.
+ */
+static unsigned int tagged_addr_prctl_allowed = 1;
+
+long set_tagged_addr_ctrl(unsigned long arg)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
+	if (is_compat_task())
+		return -EINVAL;
+	if (arg & ~PR_TAGGED_ADDR_ENABLE)
+		return -EINVAL;
+
+	if (arg & PR_TAGGED_ADDR_ENABLE)
+		set_thread_flag(TIF_TAGGED_ADDR);
+	else
+		clear_thread_flag(TIF_TAGGED_ADDR);
+
+	return 0;
+}
+
+long get_tagged_addr_ctrl(void)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
+	if (is_compat_task())
+		return -EINVAL;
+
+	if (test_thread_flag(TIF_TAGGED_ADDR))
+		return PR_TAGGED_ADDR_ENABLE;
+
+	return 0;
+}
+
+/*
+ * Global sysctl to disable the tagged user addresses support. This control
+ * only prevents the tagged address ABI enabling via prctl() and does not
+ * disable it for tasks that already opted in to the relaxed ABI.
+ */
+static int zero;
+static int one = 1;
+
+static struct ctl_table tagged_addr_sysctl_table[] = {
+	{
+		.procname	= "tagged_addr",
+		.mode		= 0644,
+		.data		= &tagged_addr_prctl_allowed,
+		.maxlen		= sizeof(int),
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+	{ }
+};
+
+static int __init tagged_addr_init(void)
+{
+	if (!register_sysctl("abi", tagged_addr_sysctl_table))
+		return -EINVAL;
+	return 0;
+}
+
+core_initcall(tagged_addr_init);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 094bb03b9cc2..2e927b3e9d6c 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -229,4 +229,9 @@ struct prctl_mm_map {
 # define PR_PAC_APDBKEY			(1UL << 3)
 # define PR_PAC_APGAKEY			(1UL << 4)
 
+/* Tagged user address controls for arm64 */
+#define PR_SET_TAGGED_ADDR_CTRL		55
+#define PR_GET_TAGGED_ADDR_CTRL		56
+# define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 2969304c29fe..ec48396b4943 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -124,6 +124,12 @@
 #ifndef PAC_RESET_KEYS
 # define PAC_RESET_KEYS(a, b)	(-EINVAL)
 #endif
+#ifndef SET_TAGGED_ADDR_CTRL
+# define SET_TAGGED_ADDR_CTRL(a)	(-EINVAL)
+#endif
+#ifndef GET_TAGGED_ADDR_CTRL
+# define GET_TAGGED_ADDR_CTRL()		(-EINVAL)
+#endif
 
 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -2492,6 +2498,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = PAC_RESET_KEYS(me, arg2);
 		break;
+	case PR_SET_TAGGED_ADDR_CTRL:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = SET_TAGGED_ADDR_CTRL(arg2);
+		break;
+	case PR_GET_TAGGED_ADDR_CTRL:
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = GET_TAGGED_ADDR_CTRL();
+		break;
 	default:
 		error = -EINVAL;
 		break;
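
From userspace, opting in to the relaxed ABI would look roughly like the sketch below. The option numbers are copied from the prctl.h hunk above and are only defined here if the installed headers lack them; the wrapper names enable_tagged_addr_abi() and tagged_addr_abi_enabled() are hypothetical. On kernels or architectures without this support the calls are expected to fail, typically with EINVAL.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values as introduced by this patch; guarded in case headers have them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
#define PR_TAGGED_ADDR_ENABLE   (1UL << 0)
#endif

/* Returns 0 on success or a negative errno value on failure. */
static int enable_tagged_addr_abi(void)
{
	/* arg3..arg5 must be zero, as checked in kernel/sys.c above. */
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == -1)
		return -errno;
	return 0;
}

/* Returns 1 if enabled, 0 if disabled, or a negative errno value. */
static int tagged_addr_abi_enabled(void)
{
	int ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);

	if (ret == -1)
		return -errno;
	return (ret & PR_TAGGED_ADDR_ENABLE) ? 1 : 0;
}
```

Since the flag is cleared on execve(), a program that wants to pass tagged pointers to syscalls would call enable_tagged_addr_abi() early in its own startup and fall back to untagged pointers if the call fails.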
On 12/06/2019 12:43, Andrey Konovalov wrote:
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
if (arg3 || arg4 || arg5)
return -EINVAL;
error = SET_TAGGED_ADDR_CTRL(arg2);
break;
- case PR_GET_TAGGED_ADDR_CTRL:
if (arg2 || arg3 || arg4 || arg5)
return -EINVAL;
error = GET_TAGGED_ADDR_CTRL();
default: error = -EINVAL; break;break;
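For illustration, the effect of the __range_ok() hunk can be modelled in user space. This is only a sketch, not kernel code: model_untag() and model_checked_addr() are made-up names, and the sign extension from bit 55 is an assumption about how arm64's untagged_addr() treats the top byte.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of untagging: sign-extend from bit 55 so that the top byte
 * becomes all zeros (user half) or all ones (kernel half).
 */
static uint64_t model_untag(uint64_t addr)
{
	const uint64_t top_byte = 0xffULL << 56;

	if (addr & (1ULL << 55))
		return addr | top_byte;
	return addr & ~top_byte;
}

/*
 * Address that the range check would see. With the patch applied, the
 * tag is stripped only for tasks that opted in (TIF_TAGGED_ADDR set);
 * otherwise the tagged value is checked as-is and fails the limit test.
 */
static uint64_t model_checked_addr(uint64_t addr, bool tagged_addr_enabled)
{
	return tagged_addr_enabled ? model_untag(addr) : addr;
}
```

With the flag clear, a pointer carrying tag 0x2a reaches the limit comparison unchanged, so it is rejected as being above the user address limit.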
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
From: Catalin Marinas <catalin.marinas@arm.com>
It is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately. This patch introduces a prctl() interface for enabling or disabling the tagged ABI with a global sysctl control for preventing applications from enabling the relaxed ABI (meant for testing user-space prctl() return error checking without reconfiguring the kernel). The ABI properties are inherited by threads of the same application and fork()'ed children but cleared on execve().
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle MTE-specific settings like imprecise vs precise exceptions.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
You need your signed-off-by here since you are contributing it. And thanks for adding the comment to the TIF definition.
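As a rough sketch of how user space might opt in to the ABI described in the commit message above: the constants are the values proposed in this patch (guarded in case the installed headers already define them), and try_enable_tagged_addr() is a hypothetical helper name. The surplus arguments are passed as zero because the patch rejects non-zero values.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values proposed in this patch; the installed headers may lack them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
# define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical helper: ask the kernel to accept tagged user addresses
 * from the calling thread. Returns 0 on success or a negative errno
 * value, e.g. -EINVAL on kernels or architectures without support, or
 * when the sysctl has disabled the opt-in.
 */
static int try_enable_tagged_addr(void)
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
		  0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}
```

A caller that gets a negative result would presumably have to keep passing untagged pointers to the kernel.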
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
[...]
 arch/arm64/include/asm/processor.h   |  6 +++
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/include/asm/uaccess.h     |  3 +-
 arch/arm64/kernel/process.c          | 67 ++++++++++++++++++++++++++++
 include/uapi/linux/prctl.h           |  5 +++
 kernel/sys.c                         | 16 +++++++
 6 files changed, 97 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..fee457456aa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)	ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_TAGGED_ADDR prctl */
+long set_tagged_addr_ctrl(unsigned long arg);
+long get_tagged_addr_ctrl(void);
+#define SET_TAGGED_ADDR_CTRL(arg)	set_tagged_addr_ctrl(arg)
+#define GET_TAGGED_ADDR_CTRL()		get_tagged_addr_ctrl()
+
 /*
  * For CONFIG_GCC_PLUGIN_STACKLEAK
[...]
+/*
+ * Global sysctl to disable the tagged user addresses support. This control
+ * only prevents the tagged address ABI enabling via prctl() and does not
+ * disable it for tasks that already opted in to the relaxed ABI.
+ */
+static int zero;
+static int one = 1;
!!!
And these can't even be const without a cast. Yuk.
(Not your fault though, but it would be nice to have a proc_dobool() to avoid this.)
+static struct ctl_table tagged_addr_sysctl_table[] = {
+	{
+		.procname	= "tagged_addr",
+		.mode		= 0644,
+		.data		= &tagged_addr_prctl_allowed,
+		.maxlen		= sizeof(int),
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+		.extra2		= &one,
+	},
+	{ }
+};
+
+static int __init tagged_addr_init(void)
+{
+	if (!register_sysctl("abi", tagged_addr_sysctl_table))
+		return -EINVAL;
+	return 0;
+}
+
+core_initcall(tagged_addr_init);
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 094bb03b9cc2..2e927b3e9d6c 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -229,4 +229,9 @@ struct prctl_mm_map {
 # define PR_PAC_APDBKEY			(1UL << 3)
 # define PR_PAC_APGAKEY			(1UL << 4)
 
+/* Tagged user address controls for arm64 */
+#define PR_SET_TAGGED_ADDR_CTRL		55
+#define PR_GET_TAGGED_ADDR_CTRL		56
+# define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
Do we expect this prctl to be applicable to other arches, or is it strictly arm64-specific?
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 2969304c29fe..ec48396b4943 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -124,6 +124,12 @@
 #ifndef PAC_RESET_KEYS
 # define PAC_RESET_KEYS(a, b)	(-EINVAL)
 #endif
+#ifndef SET_TAGGED_ADDR_CTRL
+# define SET_TAGGED_ADDR_CTRL(a)	(-EINVAL)
+#endif
+#ifndef GET_TAGGED_ADDR_CTRL
+# define GET_TAGGED_ADDR_CTRL()	(-EINVAL)
+#endif
 
 /*
  * this is where the system-wide overflow UID and GID are defined, for
@@ -2492,6 +2498,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = PAC_RESET_KEYS(me, arg2);
 		break;
+	case PR_SET_TAGGED_ADDR_CTRL:
+		if (arg3 || arg4 || arg5)
<bikeshed>
How do you anticipate these arguments being used in the future?
For the SVE prctls I took the view that "get" could only ever mean one thing, and "put" already had a flags argument with spare bits for future expansion anyway, so forcing the extra arguments to zero would be unnecessary.
Opinions seem to differ on whether requiring surplus arguments to be 0 is beneficial for hygiene, but the glibc prototype for prctl() is
int prctl (int __option, ...);
so it seemed annoying to have to pass extra arguments to it just for the sake of it. IMHO this also makes the code at the call site less readable, since it's not immediately apparent that all those 0s are meaningless.
</bikeshed>
(OTOH, the extra arguments are harmless and prctl is far from being a general-purpose syscall.)
+			return -EINVAL;
+		error = SET_TAGGED_ADDR_CTRL(arg2);
+		break;
+	case PR_GET_TAGGED_ADDR_CTRL:
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = GET_TAGGED_ADDR_CTRL();
Having a "get" prctl is probably a good idea, but is there a clear usecase for it?
(The usecase for PR_SVE_GET_VL was always a bit dubious, since the VL can also be read via an SVE insn or a compiler intrinsic, which is less portable but much cheaper. As for the PR_SVE_SET_VL_INHERIT flag that can be read via PR_SVE_GET_VL, I've never been sure how useful it is to be able to read that...)
[...]
Cheers
---Dave
Hi Dave,
On Thu, Jun 13, 2019 at 12:02:35PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
+/*
+ * Global sysctl to disable the tagged user addresses support. This control
+ * only prevents the tagged address ABI enabling via prctl() and does not
+ * disable it for tasks that already opted in to the relaxed ABI.
+ */
+static int zero;
+static int one = 1;
!!!
And these can't even be const without a cast. Yuk.
(Not your fault though, but it would be nice to have a proc_dobool() to avoid this.)
I had the same reaction. Maybe for another patch sanitising this pattern across the kernel.
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -229,4 +229,9 @@ struct prctl_mm_map {
 # define PR_PAC_APDBKEY			(1UL << 3)
 # define PR_PAC_APGAKEY			(1UL << 4)
 
+/* Tagged user address controls for arm64 */
+#define PR_SET_TAGGED_ADDR_CTRL		55
+#define PR_GET_TAGGED_ADDR_CTRL		56
+# define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
Do we expect this prctl to be applicable to other arches, or is it strictly arm64-specific?
I kept it generic, at least the tagged address part. The MTE bits later on would be arm64-specific.
@@ -2492,6 +2498,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = PAC_RESET_KEYS(me, arg2);
 		break;
+	case PR_SET_TAGGED_ADDR_CTRL:
+		if (arg3 || arg4 || arg5)
<bikeshed>
How do you anticipate these arguments being used in the future?
I don't expect them to be used at all. But since I'm not sure, I'd force them as zero for now rather than ignored. The GET is supposed to return the SET arg2, hence I'd rather not use the other arguments.
For the SVE prctls I took the view that "get" could only ever mean one thing, and "put" already had a flags argument with spare bits for future expansion anyway, so forcing the extra arguments to zero would be unnecessary.
Opinions seem to differ on whether requiring surplus arguments to be 0 is beneficial for hygiene, but the glibc prototype for prctl() is
int prctl (int __option, ...);
so it seemed annoying to have to pass extra arguments to it just for the sake of it. IMHO this also makes the code at the call site less readable, since it's not immediately apparent that all those 0s are meaningless.
It's fine by me to ignore the other arguments. I just followed the pattern of some existing prctl options. I don't have a strong opinion either way.
+			return -EINVAL;
+		error = SET_TAGGED_ADDR_CTRL(arg2);
+		break;
+	case PR_GET_TAGGED_ADDR_CTRL:
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = GET_TAGGED_ADDR_CTRL();
Having a "get" prctl is probably a good idea, but is there a clear usecase for it?
Not sure, maybe some other library (e.g. a JIT compiler) would like to check whether tagged addresses have been enabled during application start and decide to generate tagged pointers for itself. It seemed pretty harmless, unless we add more complex things to the prctl() that cannot be returned in one request.
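The JIT scenario could look roughly like the sketch below. It assumes the "get" semantics from this patch (the return value carries PR_TAGGED_ADDR_ENABLE when the thread has opted in); tagged_addr_enabled() is a hypothetical helper name, and the constants are the values proposed here.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values proposed in this patch; the installed headers may lack them. */
#ifndef PR_GET_TAGGED_ADDR_CTRL
# define PR_GET_TAGGED_ADDR_CTRL 56
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/*
 * Hypothetical helper for a library that did not do the opt-in itself.
 * Returns 1 if tagged addresses are enabled for the calling thread,
 * 0 if they are not, or a negative errno value if the kernel cannot
 * answer (no support, compat task, opt-in disabled via sysctl).
 */
static int tagged_addr_enabled(void)
{
	int ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0UL, 0UL, 0UL, 0UL);

	if (ret == -1)
		return -errno;
	return (ret & PR_TAGGED_ADDR_ENABLE) ? 1 : 0;
}
```

A library would only start handing tagged pointers to syscalls when this returns 1, and would treat every other result as "not enabled".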
On Thu, Jun 13, 2019 at 04:26:32PM +0100, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:02:35PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
+static int zero;
+static int one = 1;
!!!
And these can't even be const without a cast. Yuk.
(Not your fault though, but it would be nice to have a proc_dobool() to avoid this.)
I had the same reaction. Maybe for another patch sanitising this pattern across the kernel.
That's actually already happening (via -mm tree last I looked). tl;dr: it ends up using a cast hidden in a macro. It's in linux-next already along with a checkpatch.pl addition to yell about doing what's being done here. ;)
https://lore.kernel.org/lkml/20190430180111.10688-1-mcroce@redhat.com/#r
On Thu, Jun 13, 2019 at 10:13:54PM -0700, Kees Cook wrote:
On Thu, Jun 13, 2019 at 04:26:32PM +0100, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:02:35PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
+static int zero;
+static int one = 1;
!!!
And these can't even be const without a cast. Yuk.
(Not your fault though, but it would be nice to have a proc_dobool() to avoid this.)
I had the same reaction. Maybe for another patch sanitising this pattern across the kernel.
That's actually already happening (via -mm tree last I looked). tl;dr: it ends up using a cast hidden in a macro. It's in linux-next already along with a checkpatch.pl addition to yell about doing what's being done here. ;)
https://lore.kernel.org/lkml/20190430180111.10688-1-mcroce@redhat.com/#r
Hmmm, that is marginally less bad.
Ideally we'd have a union in there, not just a bunch of void *. I may look at that someday...
Cheers
---Dave
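To make the "cast hidden in a macro" point concrete, here is a small user-space model of the pattern. It is an illustration only: model_bounds, MODEL_ZERO, MODEL_ONE, struct model_ctl and model_write_minmax() are invented names, and the range check is a simplified stand-in for what proc_dointvec_minmax does with extra1/extra2.

```c
#include <errno.h>

/* One shared read-only array instead of per-file "zero" and "one". */
static const int model_bounds[] = { 0, 1 };

/* The const is cast away inside the macros because the slots are void *. */
#define MODEL_ZERO	((void *)&model_bounds[0])
#define MODEL_ONE	((void *)&model_bounds[1])

/* Cut-down stand-in for a sysctl table entry. */
struct model_ctl {
	int	*data;		/* variable controlled by the entry */
	void	*extra1;	/* lower bound, may be NULL */
	void	*extra2;	/* upper bound, may be NULL */
};

/* Store val if it is within [*extra1, *extra2], else return -EINVAL. */
static int model_write_minmax(const struct model_ctl *ctl, int val)
{
	const int *min = ctl->extra1;
	const int *max = ctl->extra2;

	if ((min && val < *min) || (max && val > *max))
		return -EINVAL;
	*ctl->data = val;
	return 0;
}
```

The untyped void * slots are what the union suggestion above would replace.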
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
[...]
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..fee457456aa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)	ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_TAGGED_ADDR prctl */
(A couple of comments I missed in my last reply:)
Name mismatch?
+long set_tagged_addr_ctrl(unsigned long arg);
+long get_tagged_addr_ctrl(void);
+#define SET_TAGGED_ADDR_CTRL(arg)	set_tagged_addr_ctrl(arg)
+#define GET_TAGGED_ADDR_CTRL()		get_tagged_addr_ctrl()
[...]
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3767fb21a5b8..69d0be1fc708 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -30,6 +30,7 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/stddef.h>
+#include <linux/sysctl.h>
 #include <linux/unistd.h>
 #include <linux/user.h>
 #include <linux/delay.h>
@@ -323,6 +324,7 @@ void flush_thread(void)
 	fpsimd_flush_thread();
 	tls_thread_flush();
 	flush_ptrace_hw_breakpoint(current);
+	clear_thread_flag(TIF_TAGGED_ADDR);
 }
 
 void release_thread(struct task_struct *dead_task)
@@ -552,3 +554,68 @@ void arch_setup_new_exec(void)
 
 	ptrauth_thread_init_user(current);
 }
+
+/*
+ * Control the relaxed ABI allowing tagged user addresses into the kernel.
+ */
+static unsigned int tagged_addr_prctl_allowed = 1;
+
+long set_tagged_addr_ctrl(unsigned long arg)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
So, tagging can actually be locked on by having a process enable it and then some possibly unrelated process clearing tagged_addr_prctl_allowed. That feels a bit weird.
Do we want to allow a process that has tagging on to be able to turn it off at all? Possibly things like CRIU might want to do that.
+	if (is_compat_task())
+		return -EINVAL;
+	if (arg & ~PR_TAGGED_ADDR_ENABLE)
+		return -EINVAL;
How do we expect this argument to be extended in the future?
I'm wondering whether this is really a bitmask or an enum, or a mixture of the two. Maybe it doesn't matter.
+	if (arg & PR_TAGGED_ADDR_ENABLE)
+		set_thread_flag(TIF_TAGGED_ADDR);
+	else
+		clear_thread_flag(TIF_TAGGED_ADDR);
I think update_thread_flag() could be used here.
[...]
Cheers
---Dave
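The "locked on" sequence raised in this review can be reproduced with a small user-space model of the control flow in set_tagged_addr_ctrl(). The model_ names are invented for illustration and the is_compat_task() check is left out; only the ordering of the checks follows the patch.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TAGGED_ADDR_ENABLE	(1UL << 0)

/* Stand-in for the per-thread TIF_TAGGED_ADDR flag. */
struct model_task {
	bool tagged_addr;
};

/* Stand-in for the global sysctl (tagged_addr_prctl_allowed). */
static unsigned int model_prctl_allowed = 1;

/* Same order of checks as the patch, minus the compat task test. */
static long model_set_ctrl(struct model_task *t, unsigned long arg)
{
	if (!model_prctl_allowed)
		return -EINVAL;
	if (arg & ~MODEL_TAGGED_ADDR_ENABLE)
		return -EINVAL;

	t->tagged_addr = (arg & MODEL_TAGGED_ADDR_ENABLE) != 0;
	return 0;
}
```

Because the sysctl test comes first, a task that enabled tagging while the sysctl was 1 can no longer disable it once the sysctl is written to 0: the "disable" request fails with -EINVAL and the flag stays set.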
On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
[...]
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..fee457456aa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)	ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_TAGGED_ADDR prctl */
(A couple of comments I missed in my last reply:)
Name mismatch?
Yeah, it went through several names but it seems that I didn't update all places.
[...]
+long set_tagged_addr_ctrl(unsigned long arg)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
So, tagging can actually be locked on by having a process enable it and then some possibly unrelated process clearing tagged_addr_prctl_allowed. That feels a bit weird.
The problem is that if you disable the ABI globally, lots of applications would crash. This sysctl is meant as a way to disable the opt-in to the TBI ABI. Another option would be a kernel command line option (I'm not keen on a Kconfig option).
Do we want to allow a process that has tagging on to be able to turn it off at all? Possibly things like CRIU might want to do that.
I left it in for symmetry but I don't expect it to be used. A potential use-case is doing it per subsequent threads in an application.
+	if (is_compat_task())
+		return -EINVAL;
+	if (arg & ~PR_TAGGED_ADDR_ENABLE)
+		return -EINVAL;
How do we expect this argument to be extended in the future?
Yes, for MTE. That's why I wouldn't allow random bits here.
I'm wondering whether this is really a bitmask or an enum, or a mixture of the two. Maybe it doesn't matter.
User may want to set PR_TAGGED_ADDR_ENABLE | PR_MTE_PRECISE in a single call.
+	if (arg & PR_TAGGED_ADDR_ENABLE)
+		set_thread_flag(TIF_TAGGED_ADDR);
+	else
+		clear_thread_flag(TIF_TAGGED_ADDR);
I think update_thread_flag() could be used here.
Yes. I forgot you added this.
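A sketch of how the bitmask could grow while unknown bits are still rejected, using a set-or-clear helper in the spirit of update_thread_flag(). Everything here is a user-space model with invented names, and the value given to MODEL_MTE_PRECISE is purely hypothetical: the thread only mentions PR_MTE_PRECISE as a possible future flag.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_TAGGED_ADDR_ENABLE	(1UL << 0)
#define MODEL_MTE_PRECISE		(1UL << 1)	/* hypothetical value */
#define MODEL_CTRL_KNOWN_BITS \
	(MODEL_TAGGED_ADDR_ENABLE | MODEL_MTE_PRECISE)

/* Set or clear one bit depending on a condition, in a single call. */
static void model_update_flag(unsigned long *flags, unsigned long bit,
			      bool value)
{
	if (value)
		*flags |= bit;
	else
		*flags &= ~bit;
}

/* Reject unknown bits first, then apply every known bit. */
static long model_set_ctrl_bits(unsigned long *flags, unsigned long arg)
{
	if (arg & ~MODEL_CTRL_KNOWN_BITS)
		return -EINVAL;

	model_update_flag(flags, MODEL_TAGGED_ADDR_ENABLE,
			  arg & MODEL_TAGGED_ADDR_ENABLE);
	model_update_flag(flags, MODEL_MTE_PRECISE,
			  arg & MODEL_MTE_PRECISE);
	return 0;
}
```

Rejecting unknown bits up front is what keeps the remaining bits available for later extensions.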
On 13/06/2019 16:35, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
[...]
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
So, tagging can actually be locked on by having a process enable it and then some possibly unrelated process clearing tagged_addr_prctl_allowed. That feels a bit weird.
The problem is that if you disable the ABI globally, lots of applications would crash. This sysctl is meant as a way to disable the opt-in to the TBI ABI. Another option would be a kernel command line option (I'm not keen on a Kconfig option).
Why are you not keen on a Kconfig option?
[...]
On Thu, Jun 13, 2019 at 04:45:54PM +0100, Vincenzo Frascino wrote:
On 13/06/2019 16:35, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
+/*
+ * Control the relaxed ABI allowing tagged user addresses into the kernel.
+ */
+static unsigned int tagged_addr_prctl_allowed = 1;
+
+long set_tagged_addr_ctrl(unsigned long arg)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
So, tagging can actually be locked on by having a process enable it and then some possibly unrelated process clearing tagged_addr_prctl_allowed. That feels a bit weird.
The problem is that if you disable the ABI globally, lots of applications would crash. This sysctl is meant as a way to disable the opt-in to the TBI ABI. Another option would be a kernel command line option (I'm not keen on a Kconfig option).
Why are you not keen on a Kconfig option?
Because I don't want to rebuild the kernel/reboot just to be able to test how user space handles the ABI opt-in. I'm ok with a Kconfig option to disable this globally in addition to a run-time option (if actually needed, I'm not sure).
On 13/06/2019 16:57, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 04:45:54PM +0100, Vincenzo Frascino wrote:
On 13/06/2019 16:35, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
+/*
+ * Control the relaxed ABI allowing tagged user addresses into the kernel.
+ */
+static unsigned int tagged_addr_prctl_allowed = 1;
+
+long set_tagged_addr_ctrl(unsigned long arg)
+{
+	if (!tagged_addr_prctl_allowed)
+		return -EINVAL;
So, tagging can actually be locked on by having a process enable it and then some possibly unrelated process clearing tagged_addr_prctl_allowed. That feels a bit weird.
The problem is that if you disable the ABI globally, lots of applications would crash. This sysctl is meant as a way to disable the opt-in to the TBI ABI. Another option would be a kernel command line option (I'm not keen on a Kconfig option).
Why are you not keen on a Kconfig option?
Because I don't want to rebuild the kernel/reboot just to be able to test how user space handles the ABI opt-in. I'm ok with a Kconfig option to disable this globally in addition to a run-time option (if actually needed, I'm not sure).
There might be scenarios (e.g. embedded) in which this is not needed, hence having a config option (maybe Y by default) that removes the whole feature from the kernel would be good, obviously in conjunction with the run-time option.
Based on my previous review, if we move the code out of process.c into its own independent file, then when the Kconfig option is turned off we could remove the entire object from the kernel (this would remove the sysctl and still let the prctl return -EINVAL).
These changes could be done later in a separate patch set, though, if the Kconfig is meant to be Y by default.
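The reason the prctl would still return -EINVAL with the feature compiled out is the fallback macro pattern used in the kernel/sys.c hunk. A user-space sketch of that pattern follows; MODEL_CONFIG_TAGGED_ADDR_ABI is a hypothetical option name and the model_ functions are illustrative, not part of the patch.

```c
#include <errno.h>

/*
 * Build with -DMODEL_CONFIG_TAGGED_ADDR_ABI (hypothetical option name)
 * to compile the feature in; this part would live in its own file.
 */
#ifdef MODEL_CONFIG_TAGGED_ADDR_ABI
static long model_set_tagged_addr_ctrl(unsigned long arg)
{
	return (arg & ~1UL) ? -EINVAL : 0;
}
# define MODEL_SET_TAGGED_ADDR_CTRL(arg) model_set_tagged_addr_ctrl(arg)
#endif

/*
 * Generic fallback, same idea as the kernel/sys.c hunk. The (void)
 * cast only keeps the argument "used" in this model; the patch
 * itself expands to a plain (-EINVAL).
 */
#ifndef MODEL_SET_TAGGED_ADDR_CTRL
# define MODEL_SET_TAGGED_ADDR_CTRL(arg) ((void)(arg), -EINVAL)
#endif

/* The generic dispatcher never needs to know whether the feature exists. */
static long model_prctl_set(unsigned long arg)
{
	return MODEL_SET_TAGGED_ADDR_CTRL(arg);
}
```

Built without the option, the whole feature body disappears and the call site still compiles unchanged, failing with -EINVAL at run time.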
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
From: Catalin Marinas <catalin.marinas@arm.com>
It is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately. This patch introduces a prctl() interface for enabling or disabling the tagged ABI with a global sysctl control for preventing applications from enabling the relaxed ABI (meant for testing user-space prctl() return error checking without reconfiguring the kernel). The ABI properties are inherited by threads of the same application and fork()'ed children but cleared on execve().
The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle MTE-specific settings like imprecise vs precise exceptions.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
A question for the user-space folk: if an application opts in to this ABI, would you want the sigcontext.fault_address and/or siginfo.si_addr to contain the tag? We currently clear it early in the arm64 entry.S but we could find a way to pass it down if needed.
On 17/06/2019 14:56, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
[...]
A question for the user-space folk: if an application opts in to this ABI, would you want the sigcontext.fault_address and/or siginfo.si_addr to contain the tag? We currently clear it early in the arm64 entry.S but we could find a way to pass it down if needed.
to me it makes sense to keep the tag in si_addr / fault_address.
but i don't know in detail how those fields are used currently.
keeping the tag is certainly useful for MTE to debug wrong tag failures unless there is a separate mechanism for that.
On Mon, Jun 17, 2019 at 6:56 AM Catalin Marinas catalin.marinas@arm.com wrote:
A question for the user-space folk: if an application opts in to this ABI, would you want the sigcontext.fault_address and/or siginfo.si_addr to contain the tag? We currently clear it early in the arm64 entry.S but we could find a way to pass it down if needed.
For HWASan this would not be useful because we instrument memory accesses with explicit checks anyway. For MTE, on the other hand, it would be very convenient to know the fault address tag without disassembling the code.
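To make the MTE use case concrete, here is a sketch of a SIGSEGV handler that reads the tag straight from si_addr. It assumes the kernel leaves the top byte in si_addr, which is exactly the open question in this thread; with the current behaviour the handler would always print 0x00. The helper names are hypothetical and the code assumes 64-bit pointers.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define TAG_SHIFT 56	/* the tag lives in bits 63:56 of a 64-bit pointer */

/* Top byte of the pointer, i.e. the TBI tag. */
static inline uint8_t ptr_tag(const void *p)
{
	return (uint8_t)((uintptr_t)p >> TAG_SHIFT);
}

/* Same address with the top byte cleared (valid for user addresses only). */
static inline uintptr_t ptr_untagged(const void *p)
{
	return (uintptr_t)p & ~((uintptr_t)0xff << TAG_SHIFT);
}

/* Prints the tag of the faulting address using only async-signal-safe calls. */
static void segv_handler(int sig, siginfo_t *info, void *ucontext)
{
	static const char hex[] = "0123456789abcdef";
	char msg[] = "fault tag: 0x??\n";
	uint8_t tag = ptr_tag(info->si_addr);

	(void)ucontext;
	msg[13] = hex[tag >> 4];
	msg[14] = hex[tag & 0xf];
	if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
		/* nothing useful can be done here */
	}
	_exit(128 + sig);
}

/* Returns 0 on success, -1 on failure (as sigaction() does). */
static int install_segv_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_SIGINFO;
	sa.sa_sigaction = segv_handler;
	return sigaction(SIGSEGV, &sa, NULL);
}
```

Without the tag in si_addr, a tool would have to decode the faulting instruction and read the base register from the saved context to recover the same byte.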
On Mon, Jun 17, 2019 at 09:57:36AM -0700, Evgenii Stepanov wrote:
For HWASan this would not be useful because we instrument memory accesses with explicit checks anyway. For MTE, on the other hand, it would be very convenient to know the fault address tag without disassembling the code.
I could ask this differently: does anything break if, once the user opts in to TBI, fault_address and/or si_addr have a non-zero top byte?
Alternatively, we could present the original FAR_EL1 register as a separate field as we do with ESR_EL1, independently of whether the user opted in to TBI or not.
On Mon, Jun 17, 2019 at 10:18 AM Catalin Marinas catalin.marinas@arm.com wrote:
I could ask this differently: does anything break if, once the user opts in to TBI, fault_address and/or si_addr have a non-zero top byte?
I think it would be fine.
Alternatively, we could present the original FAR_EL1 register as a separate field as we do with ESR_EL1, independently of whether the user opted in to TBI or not.
-- Catalin
On Wed, Jun 12, 2019 at 1:43 PM Andrey Konovalov andreyknvl@google.com wrote:
Catalin, would you like to do the requested changes to this patch yourself and send it to me or should I do that?
On Wed, Jun 19, 2019 at 04:45:02PM +0200, Andrey Konovalov wrote:
Catalin, would you like to do the requested changes to this patch yourself and send it to me or should I do that?
I'll send you an updated version this week.
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
This patch allows tagged pointers to be passed to the following memory syscalls: get_mempolicy, madvise, mbind, mincore, mlock, mlock2, mprotect, mremap, msync, munlock, move_pages.
The mmap and mremap (only new_addr) syscalls do not currently accept tagged addresses. Architectures may interpret the tag as a background colour for the corresponding vma.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 mm/madvise.c   | 2 ++
 mm/mempolicy.c | 3 +++
 mm/migrate.c   | 2 +-
 mm/mincore.c   | 2 ++
 mm/mlock.c     | 4 ++++
 mm/mprotect.c  | 2 ++
 mm/mremap.c    | 7 +++++++
 mm/msync.c     | 2 ++
 8 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 628022e674a7..39b82f8a698f 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -810,6 +810,8 @@ SYSCALL_DEFINE3(madvise, unsigned long, start, size_t, len_in, int, behavior)
 	size_t len;
 	struct blk_plug plug;
 
+	start = untagged_addr(start);
+
 	if (!madvise_behavior_valid(behavior))
 		return error;
 
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 01600d80ae01..78e0a88b2680 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1360,6 +1360,7 @@ static long kernel_mbind(unsigned long start, unsigned long len,
 	int err;
 	unsigned short mode_flags;
 
+	start = untagged_addr(start);
 	mode_flags = mode & MPOL_MODE_FLAGS;
 	mode &= ~MPOL_MODE_FLAGS;
 	if (mode >= MPOL_MAX)
@@ -1517,6 +1518,8 @@ static int kernel_get_mempolicy(int __user *policy,
 	int uninitialized_var(pval);
 	nodemask_t nodes;
 
+	addr = untagged_addr(addr);
+
 	if (nmask != NULL && maxnode < nr_node_ids)
 		return -EINVAL;
 
diff --git a/mm/migrate.c b/mm/migrate.c
index f2ecc2855a12..d22c45cf36b2 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1616,7 +1616,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			goto out_flush;
 		if (get_user(node, nodes + i))
 			goto out_flush;
-		addr = (unsigned long)p;
+		addr = (unsigned long)untagged_addr(p);
 
 		err = -ENODEV;
 		if (node < 0 || node >= MAX_NUMNODES)
diff --git a/mm/mincore.c b/mm/mincore.c
index c3f058bd0faf..64c322ed845c 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -249,6 +249,8 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
 	unsigned long pages;
 	unsigned char *tmp;
 
+	start = untagged_addr(start);
+
 	/* Check the start address: needs to be page-aligned.. */
 	if (start & ~PAGE_MASK)
 		return -EINVAL;
diff --git a/mm/mlock.c b/mm/mlock.c
index 080f3b36415b..e82609eaa428 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -674,6 +674,8 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 	unsigned long lock_limit;
 	int error = -ENOMEM;
 
+	start = untagged_addr(start);
+
 	if (!can_do_mlock())
 		return -EPERM;
 
@@ -735,6 +737,8 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len)
 {
 	int ret;
 
+	start = untagged_addr(start);
+
 	len = PAGE_ALIGN(len + (offset_in_page(start)));
 	start &= PAGE_MASK;
 
diff --git a/mm/mprotect.c b/mm/mprotect.c
index bf38dfbbb4b4..19f981b733bc 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -465,6 +465,8 @@ static int do_mprotect_pkey(unsigned long start, size_t len,
 	const bool rier = (current->personality & READ_IMPLIES_EXEC) &&
 				(prot & PROT_READ);
 
+	start = untagged_addr(start);
+
 	prot &= ~(PROT_GROWSDOWN|PROT_GROWSUP);
 	if (grows == (PROT_GROWSDOWN|PROT_GROWSUP)) /* can't be both */
 		return -EINVAL;
diff --git a/mm/mremap.c b/mm/mremap.c
index fc241d23cd97..64c9a3b8be0a 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -606,6 +606,13 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
 	LIST_HEAD(uf_unmap_early);
 	LIST_HEAD(uf_unmap);
 
+	/*
+	 * Architectures may interpret the tag passed to mmap as a background
+	 * colour for the corresponding vma. For mremap we don't allow tagged
+	 * new_addr to preserve similar behaviour to mmap.
+	 */
+	addr = untagged_addr(addr);
+
 	if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
 		return ret;
 
diff --git a/mm/msync.c b/mm/msync.c
index ef30a429623a..c3bd3e75f687 100644
--- a/mm/msync.c
+++ b/mm/msync.c
@@ -37,6 +37,8 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags)
 	int unmapped_error = 0;
 	int error = -EINVAL;
 
+	start = untagged_addr(start);
+
 	if (flags & ~(MS_ASYNC | MS_INVALIDATE | MS_SYNC))
 		goto out;
 	if (offset_in_page(start))
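As a rough illustration of what the untagged_addr() calls in this patch do to an address, here is a user-space model. It assumes the arm64 definition sign-extends the address from bit 55, so a user address gets its top byte cleared while a kernel address keeps an all-ones top byte. untagged_addr_model is a hypothetical name for this sketch, not the kernel macro.

```c
#include <stdint.h>

/*
 * User-space model of untagging (an assumption about the arm64
 * behaviour, not kernel source): replicate bit 55 into bits 63:56.
 */
static inline uint64_t untagged_addr_model(uint64_t addr)
{
	const uint64_t top_byte = 0xffULL << 56;

	if (addr & (1ULL << 55))
		return addr | top_byte;		/* kernel half: top byte becomes 0xff */
	return addr & ~top_byte;		/* user half: top byte becomes 0x00 */
}
```

With this model, a user pointer 0x00007fff12345000 carrying tag 0x2a maps back to the same untagged value that page-alignment checks and range lookups in the syscalls above expect, and an already untagged address passes through unchanged.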
On 12/06/2019 12:43, Andrey Konovalov wrote:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
On 6/12/19 5:43 AM, Andrey Konovalov wrote:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
On 6/19/19 9:55 AM, Khalid Aziz wrote:
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
I would also recommend updating commit log for all the patches in this series that are changing files under mm/ as opposed to arch/arm64 to not reference arm64 kernel ABI since the change applies to every architecture. So something along the lines of "This patch is part of a series that extends kernel ABI to allow......."
-- Khalid
On Wed, Jun 19, 2019 at 6:46 PM Khalid Aziz khalid.aziz@oracle.com wrote:
I would also recommend updating commit log for all the patches in this series that are changing files under mm/ as opposed to arch/arm64 to not reference arm64 kernel ABI since the change applies to every architecture. So something along the lines of "This patch is part of a series that extends kernel ABI to allow......."
Sure, will do in v18, thanks!
This patch is a part of a series that extends the arm64 kernel ABI to allow tagged user pointers (with the top byte set to something other than 0x00) to be passed as syscall arguments.
mm/gup.c provides a kernel interface that accepts user addresses and manipulates user pages directly (for example get_user_pages, which is used by the futex syscall). Since a user can provide tagged addresses, we need to handle this case.
Add untagging to gup.c functions that use user addresses for vma lookups.
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
---
 mm/gup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/gup.c b/mm/gup.c
index ddde097cf9e4..c37df3d455a2 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -802,6 +802,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 	if (!nr_pages)
 		return 0;
 
+	start = untagged_addr(start);
+
 	VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
 
 	/*
@@ -964,6 +966,8 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm,
 	struct vm_area_struct *vma;
 	vm_fault_t ret, major = 0;
 
+	address = untagged_addr(address);
+
 	if (unlocked)
 		fault_flags |= FAULT_FLAG_ALLOW_RETRY;
 
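To see why the vma lookups in gup.c need an untagged address, consider this simplified user-space model. The struct and function names are hypothetical; the real kernel uses find_vma() over vm_area_struct, and this sketch only mimics the range comparison. strip_tag() clears the top byte and is valid for user addresses only.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a vma: a half-open address range [start, end). */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/* Linear lookup; returns the range containing addr, or NULL. */
static const struct addr_range *lookup_range(const struct addr_range *ranges,
					     size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= ranges[i].start && addr < ranges[i].end)
			return &ranges[i];
	}
	return NULL;
}

/* Clear bits 63:56 (user addresses only). */
static uint64_t strip_tag(uint64_t addr)
{
	return addr & ~(0xffULL << 56);
}
```

A tagged address is numerically far above every mapped range, so the comparison misses unless the tag is stripped first, which is what the two hunks above arrange before the lookup happens.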
On 12/06/2019 12:43, Andrey Konovalov wrote:
Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
mm/gup.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/mm/gup.c b/mm/gup.c index ddde097cf9e4..c37df3d455a2 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -802,6 +802,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, if (!nr_pages) return 0;
+ start = untagged_addr(start);
+
VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
/* @@ -964,6 +966,8 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm, struct vm_area_struct *vma; vm_fault_t ret, major = 0;
+ address = untagged_addr(address);
+
if (unlocked) fault_flags |= FAULT_FLAG_ALLOW_RETRY;
On 6/12/19 5:43 AM, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
mm/gup.c provides a kernel interface that accepts user addresses and manipulates user pages directly (for example get_user_pages, which is used by the futex syscall). Since a user can provide tagged addresses, we need to handle this case.
Add untagging to gup.c functions that use user addresses for vma lookups.
Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
mm/gup.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/mm/gup.c b/mm/gup.c index ddde097cf9e4..c37df3d455a2 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -802,6 +802,8 @@ static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm, if (!nr_pages) return 0;
+ start = untagged_addr(start);
+
VM_BUG_ON(!!pages != !!(gup_flags & FOLL_GET));
/* @@ -964,6 +966,8 @@ int fixup_user_fault(struct task_struct *tsk, struct mm_struct *mm, struct vm_area_struct *vma; vm_fault_t ret, major = 0;
+ address = untagged_addr(address);
+
if (unlocked) fault_flags |= FAULT_FLAG_ALLOW_RETRY;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
get_vaddr_frames uses provided user pointers for vma lookups, which can only be done with untagged pointers. Instead of locating and changing all callers of this function, perform untagging in it.
Acked-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com --- mm/frame_vector.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c index c64dca6e27c2..c431ca81dad5 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, if (WARN_ON_ONCE(nr_frames > vec->nr_allocated)) nr_frames = vec->nr_allocated;
+ start = untagged_addr(start); + down_read(&mm->mmap_sem); locked = 1; vma = find_vma_intersection(mm, start, start + 1);
On 12/06/2019 12:43, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
get_vaddr_frames uses provided user pointers for vma lookups, which can only be done with untagged pointers. Instead of locating and changing all callers of this function, perform untagging in it.
Acked-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
mm/frame_vector.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c index c64dca6e27c2..c431ca81dad5 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, if (WARN_ON_ONCE(nr_frames > vec->nr_allocated)) nr_frames = vec->nr_allocated;
+ start = untagged_addr(start);
+
down_read(&mm->mmap_sem); locked = 1; vma = find_vma_intersection(mm, start, start + 1);
On 6/12/19 5:43 AM, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
get_vaddr_frames uses provided user pointers for vma lookups, which can only be done with untagged pointers. Instead of locating and changing all callers of this function, perform untagging in it.
Acked-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
With the suggested change to commit log in my previous email:
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
mm/frame_vector.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c index c64dca6e27c2..c431ca81dad5 100644 --- a/mm/frame_vector.c +++ b/mm/frame_vector.c @@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, if (WARN_ON_ONCE(nr_frames > vec->nr_allocated)) nr_frames = vec->nr_allocated;
+ start = untagged_addr(start);
+
down_read(&mm->mmap_sem); locked = 1; vma = find_vma_intersection(mm, start, start + 1);
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
In copy_mount_options a user address is being subtracted from TASK_SIZE. If the address is lower than TASK_SIZE, the size is calculated to not allow the exact_copy_from_user() call to cross the TASK_SIZE boundary. However, if the address is tagged, then the size will be calculated incorrectly.
Untag the address before subtracting.
Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com --- fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..2e85712a19ed 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2993,7 +2993,7 @@ void *copy_mount_options(const void __user * data) * the remainder of the page. */ /* copy_from_user cannot cross TASK_SIZE ! */ - size = TASK_SIZE - (unsigned long)data; + size = TASK_SIZE - (unsigned long)untagged_addr(data); if (size > PAGE_SIZE) size = PAGE_SIZE;
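To make the miscalculation concrete, the sketch below models the size computation in userspace. It assumes a 48-bit user address space (TASK_SIZE of 1 << 48), 4 KiB pages, and a tag in bits 63..56; all MODEL_* constants and function names are hypothetical. Untagging is done with a plain mask here, which is only adequate for user addresses.

```c
#include <stdint.h>

#define MODEL_TASK_SIZE	(1ULL << 48)	/* assumption: 48-bit user VA */
#define MODEL_PAGE_SIZE	4096ULL		/* assumption: 4 KiB pages */
#define MODEL_TAG_MASK	(0xffULL << 56)	/* assumption: tag in bits 63..56 */

/* Cap a copy size at one page, as copy_mount_options does. */
static inline uint64_t model_cap_to_page(uint64_t size)
{
	return size > MODEL_PAGE_SIZE ? MODEL_PAGE_SIZE : size;
}

/* Old behaviour: subtract the raw, possibly tagged, address. */
static inline uint64_t model_copy_size_raw(uint64_t data)
{
	/* With a tag set, the subtraction wraps around modulo 2^64. */
	return model_cap_to_page(MODEL_TASK_SIZE - data);
}

/* Patched behaviour: strip the tag before subtracting. */
static inline uint64_t model_copy_size_untagged(uint64_t data)
{
	return model_cap_to_page(MODEL_TASK_SIZE - (data & ~MODEL_TAG_MASK));
}
```

For an address 0x800 bytes below the assumed TASK_SIZE and carrying tag 0x2a (0x2a00fffffffff800), the untagged variant yields 2048, while the raw variant wraps around and is capped at 4096, so the copy would be allowed to run past the boundary.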
On 12/06/2019 12:43, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
In copy_mount_options a user address is being subtracted from TASK_SIZE. If the address is lower than TASK_SIZE, the size is calculated to not allow the exact_copy_from_user() call to cross the TASK_SIZE boundary. However, if the address is tagged, then the size will be calculated incorrectly.
Untag the address before subtracting.
Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..2e85712a19ed 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2993,7 +2993,7 @@ void *copy_mount_options(const void __user * data) * the remainder of the page. */ /* copy_from_user cannot cross TASK_SIZE ! */
- size = TASK_SIZE - (unsigned long)data;
+ size = TASK_SIZE - (unsigned long)untagged_addr(data);
if (size > PAGE_SIZE) size = PAGE_SIZE;
On 6/12/19 5:43 AM, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
In copy_mount_options a user address is being subtracted from TASK_SIZE. If the address is lower than TASK_SIZE, the size is calculated to not allow the exact_copy_from_user() call to cross the TASK_SIZE boundary. However, if the address is tagged, then the size will be calculated incorrectly.
Untag the address before subtracting.
Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com
Please update commit log to make it not arm64 specific since this change affects other architectures as well. Other than that,
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/namespace.c b/fs/namespace.c index b26778bdc236..2e85712a19ed 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2993,7 +2993,7 @@ void *copy_mount_options(const void __user * data) * the remainder of the page. */ /* copy_from_user cannot cross TASK_SIZE ! */
- size = TASK_SIZE - (unsigned long)data;
+ size = TASK_SIZE - (unsigned long)untagged_addr(data);
if (size > PAGE_SIZE) size = PAGE_SIZE;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
The userfaultfd code uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in validate_range().
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com --- fs/userfaultfd.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3b30301c90ec..24d68c3b5ee2 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1263,21 +1263,23 @@ static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx, }
static __always_inline int validate_range(struct mm_struct *mm, - __u64 start, __u64 len) + __u64 *start, __u64 len) { __u64 task_size = mm->task_size;
- if (start & ~PAGE_MASK) + *start = untagged_addr(*start); + + if (*start & ~PAGE_MASK) return -EINVAL; if (len & ~PAGE_MASK) return -EINVAL; if (!len) return -EINVAL; - if (start < mmap_min_addr) + if (*start < mmap_min_addr) return -EINVAL; - if (start >= task_size) + if (*start >= task_size) return -EINVAL; - if (len > task_size - start) + if (len > task_size - *start) return -EINVAL; return 0; } @@ -1327,7 +1329,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, goto out; }
- ret = validate_range(mm, uffdio_register.range.start, + ret = validate_range(mm, &uffdio_register.range.start, uffdio_register.range.len); if (ret) goto out; @@ -1516,7 +1518,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister))) goto out;
- ret = validate_range(mm, uffdio_unregister.start, + ret = validate_range(mm, &uffdio_unregister.start, uffdio_unregister.len); if (ret) goto out; @@ -1667,7 +1669,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx, if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake))) goto out;
- ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len); + ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len); if (ret) goto out;
@@ -1707,7 +1709,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); + ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len); if (ret) goto out; /* @@ -1763,7 +1765,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx, sizeof(uffdio_zeropage)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_zeropage.range.start, + ret = validate_range(ctx->mm, &uffdio_zeropage.range.start, uffdio_zeropage.range.len); if (ret) goto out;
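The reason validate_range() now takes a pointer is that the caller's copy of the start address must end up untagged as well, since it is reused later for vma lookups. A minimal userspace model of that in/out pattern follows; the function name, the explicit task_size and min_addr parameters, and the mask-based untagging are assumptions made for this sketch only.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096ULL		/* assumption: 4 KiB pages */
#define MODEL_TAG_MASK	(0xffULL << 56)	/* assumption: tag in bits 63..56 */

/*
 * Hypothetical model of the patched validate_range(): the start address
 * is untagged in place, so the caller sees the untagged value afterwards.
 */
static inline int model_validate_range(uint64_t task_size, uint64_t min_addr,
				       uint64_t *start, uint64_t len)
{
	*start &= ~MODEL_TAG_MASK;

	if (*start & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;
	if (len & (MODEL_PAGE_SIZE - 1))
		return -EINVAL;
	if (!len)
		return -EINVAL;
	if (*start < min_addr)
		return -EINVAL;
	if (*start >= task_size)
		return -EINVAL;
	if (len > task_size - *start)
		return -EINVAL;
	return 0;
}
```

Passing the address by value would validate an untagged copy but leave the tagged original in the ioctl structure, which is the situation the pointer parameter avoids.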
On 12/06/2019 12:43, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
The userfaultfd code uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in validate_range().
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
fs/userfaultfd.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3b30301c90ec..24d68c3b5ee2 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1263,21 +1263,23 @@ static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx, } static __always_inline int validate_range(struct mm_struct *mm,
- __u64 start, __u64 len)
+ __u64 *start, __u64 len)
{ __u64 task_size = mm->task_size;
- if (start & ~PAGE_MASK)
+ *start = untagged_addr(*start);
+
+ if (*start & ~PAGE_MASK)
return -EINVAL; if (len & ~PAGE_MASK) return -EINVAL; if (!len) return -EINVAL;
- if (start < mmap_min_addr)
+ if (*start < mmap_min_addr)
return -EINVAL;
- if (start >= task_size)
+ if (*start >= task_size)
return -EINVAL;
- if (len > task_size - start)
+ if (len > task_size - *start)
return -EINVAL; return 0;
} @@ -1327,7 +1329,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, goto out; }
- ret = validate_range(mm, uffdio_register.range.start,
+ ret = validate_range(mm, &uffdio_register.range.start,
uffdio_register.range.len); if (ret) goto out;
@@ -1516,7 +1518,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister))) goto out;
- ret = validate_range(mm, uffdio_unregister.start,
+ ret = validate_range(mm, &uffdio_unregister.start,
uffdio_unregister.len); if (ret) goto out;
@@ -1667,7 +1669,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx, if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake))) goto out;
- ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
+ ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len);
if (ret) goto out;
@@ -1707,7 +1709,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
+ ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len);
if (ret) goto out; /*
@@ -1763,7 +1765,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx, sizeof(uffdio_zeropage)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
+ ret = validate_range(ctx->mm, &uffdio_zeropage.range.start,
uffdio_zeropage.range.len); if (ret) goto out;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
In amdgpu_gem_userptr_ioctl() and amdgpu_amdkfd_gpuvm.c/init_user_pages() an MMU notifier is set up with a (tagged) userspace pointer. The untagged address should be used so that MMU notifiers for the untagged address get correctly matched up with the right BO. This patch untags user pointers in amdgpu_gem_userptr_ioctl() for the GEM case and in amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu() for the KFD case. This also makes sure that an untagged pointer is passed to amdgpu_ttm_tt_get_user_pages(), which uses it for vma lookups.
Suggested-by: Felix Kuehling Felix.Kuehling@amd.com Acked-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index a6e5184d436c..5d476e9bbc43 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -1108,7 +1108,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( alloc_flags = 0; if (!offset || !*offset) return -EINVAL; - user_addr = *offset; + user_addr = untagged_addr(*offset); } else if (flags & ALLOC_MEM_FLAGS_DOORBELL) { domain = AMDGPU_GEM_DOMAIN_GTT; alloc_domain = AMDGPU_GEM_DOMAIN_CPU; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index d4fcf5475464..e91df1407618 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -287,6 +287,8 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, uint32_t handle; int r;
+ args->addr = untagged_addr(args->addr); + if (offset_in_page(args->addr | args->size)) return -EINVAL;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
In radeon_gem_userptr_ioctl() an MMU notifier is set up with a (tagged) userspace pointer. The untagged address should be used so that MMU notifiers for the untagged address get correctly matched up with the right BO. This function also calls radeon_ttm_tt_pin_userptr(), which uses provided user pointers for vma lookups, which can only be done with untagged pointers.
This patch untags user pointers in radeon_gem_userptr_ioctl().
Suggested-by: Felix Kuehling Felix.Kuehling@amd.com Acked-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/gpu/drm/radeon/radeon_gem.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 44617dec8183..90eb78fb5eb2 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -291,6 +291,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, uint32_t handle; int r;
+ args->addr = untagged_addr(args->addr); + if (offset_in_page(args->addr | args->size)) return -EINVAL;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/infiniband/hw/mlx4/mr.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c index 355205a28544..13d9f917f249 100644 --- a/drivers/infiniband/hw/mlx4/mr.c +++ b/drivers/infiniband/hw/mlx4/mr.c @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start, * again */ if (!ib_access_writable(access_flags)) { + unsigned long untagged_start = untagged_addr(start); struct vm_area_struct *vma;
down_read(&current->mm->mmap_sem); @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata *udata, u64 start, * cover the memory, but for now it requires a single vma to * entirely cover the MR to support RO mappings. */ - vma = find_vma(current->mm, start); - if (vma && vma->vm_end >= start + length && - vma->vm_start <= start) { + vma = find_vma(current->mm, untagged_start); + if (vma && vma->vm_end >= untagged_start + length && + vma->vm_start <= untagged_start) { if (vma->vm_flags & VM_WRITE) access_flags |= IB_ACCESS_LOCAL_WRITE; } else {
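The coverage check in this hunk compares the start address against vma bounds, which are always untagged. The userspace sketch below illustrates why the comparison only works after untagging. struct model_vma, both function names, and the mask-based untagging are hypothetical; only the shape of the comparison is taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_TAG_MASK	(0xffULL << 56)	/* assumption: tag in bits 63..56 */

/* Hypothetical stand-in for a vma covering [vm_start, vm_end). */
struct model_vma {
	uint64_t vm_start;
	uint64_t vm_end;
};

/* Same comparison as in the hunk: one vma must cover the whole range. */
static inline bool model_vma_covers(const struct model_vma *vma,
				    uint64_t start, uint64_t length)
{
	return vma->vm_end >= start + length && vma->vm_start <= start;
}

/* Patched behaviour: untag first, then compare. */
static inline bool model_mr_is_covered(const struct model_vma *vma,
				       uint64_t start, uint64_t length)
{
	uint64_t untagged_start = start & ~MODEL_TAG_MASK;

	return model_vma_covers(vma, untagged_start, length);
}
```

In this model a tagged start address is numerically far above any vm_end, so the raw comparison fails even when the underlying range lies entirely inside the vma.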
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
videobuf_dma_contig_user_get() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag the pointers in this function.
Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c index e1bf50df4c70..8a1ddd146b17 100644 --- a/drivers/media/v4l2-core/videobuf-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c @@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem) static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem, struct videobuf_buffer *vb) { + unsigned long untagged_baddr = untagged_addr(vb->baddr); struct mm_struct *mm = current->mm; struct vm_area_struct *vma; unsigned long prev_pfn, this_pfn; @@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem, unsigned int offset; int ret;
- offset = vb->baddr & ~PAGE_MASK; + offset = untagged_baddr & ~PAGE_MASK; mem->size = PAGE_ALIGN(vb->size + offset); ret = -EINVAL;
down_read(&mm->mmap_sem);
- vma = find_vma(mm, vb->baddr); + vma = find_vma(mm, untagged_baddr); if (!vma) goto out_up;
- if ((vb->baddr + mem->size) > vma->vm_end) + if ((untagged_baddr + mem->size) > vma->vm_end) goto out_up;
pages_done = 0; prev_pfn = 0; /* kill warning */ - user_address = vb->baddr; + user_address = untagged_baddr;
while (pages_done < (mem->size >> PAGE_SHIFT)) { ret = follow_pfn(vma, user_address, &this_pfn);
On 6/12/19 5:43 AM, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
videobuf_dma_contig_user_get() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag the pointers in this function.
Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Mauro Carvalho Chehab mchehab+samsung@kernel.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
Patch looks good, but commit log should be updated to not be specific to arm64.
Reviewed-by: Khalid Aziz khalid.aziz@oracle.com
drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c index e1bf50df4c70..8a1ddd146b17 100644 --- a/drivers/media/v4l2-core/videobuf-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c @@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem) static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem, struct videobuf_buffer *vb) {
+ unsigned long untagged_baddr = untagged_addr(vb->baddr);
struct mm_struct *mm = current->mm; struct vm_area_struct *vma; unsigned long prev_pfn, this_pfn;
@@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem, unsigned int offset; int ret;
- offset = vb->baddr & ~PAGE_MASK;
+ offset = untagged_baddr & ~PAGE_MASK;
mem->size = PAGE_ALIGN(vb->size + offset); ret = -EINVAL;
down_read(&mm->mmap_sem);
- vma = find_vma(mm, vb->baddr);
+ vma = find_vma(mm, untagged_baddr);
if (!vma) goto out_up;
- if ((vb->baddr + mem->size) > vma->vm_end)
+ if ((untagged_baddr + mem->size) > vma->vm_end)
goto out_up;
pages_done = 0; prev_pfn = 0; /* kill warning */
- user_address = vb->baddr;
+ user_address = untagged_baddr;
while (pages_done < (mem->size >> PAGE_SHIFT)) { ret = follow_pfn(vma, user_address, &this_pfn);
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
tee_shm_register()->optee_shm_unregister()->check_mem_type() uses provided user pointers for vma lookups (via __check_mem_type()), which can only be done with untagged pointers.
Untag user pointers in this function.
Reviewed-by: Kees Cook keescook@chromium.org Acked-by: Jens Wiklander jens.wiklander@linaro.org Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/tee/tee_shm.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index 2da026fd12c9..09ddcd06c715 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -254,6 +254,7 @@ struct tee_shm *tee_shm_register(struct tee_context *ctx, unsigned long addr, shm->teedev = teedev; shm->ctx = ctx; shm->id = -1; + addr = untagged_addr(addr); start = rounddown(addr, PAGE_SIZE); shm->offset = addr - start; shm->size = length;
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
vaddr_get_pfn() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com --- drivers/vfio/vfio_iommu_type1.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 3ddc375e7063..528e39a1c2dd 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -384,6 +384,8 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
down_read(&mm->mmap_sem);
+ vaddr = untagged_addr(vaddr); + vma = find_vma_intersection(mm, vaddr, vaddr + 1);
if (vma && vma->vm_flags & VM_PFNMAP) {
On 12/06/2019 12:43, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
vaddr_get_pfn() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com
drivers/vfio/vfio_iommu_type1.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 3ddc375e7063..528e39a1c2dd 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -384,6 +384,8 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, down_read(&mm->mmap_sem);
+ vaddr = untagged_addr(vaddr);
+
vma = find_vma_intersection(mm, vaddr, vaddr + 1);
if (vma && vma->vm_flags & VM_PFNMAP) {
Hi Andrey,
On 6/12/19 1:43 PM, Andrey Konovalov wrote:
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
vaddr_get_pfn() uses provided user pointers for vma lookups, which can only be done with untagged pointers.
Untag user pointers in this function.
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Andrey Konovalov andreyknvl@google.com
Reviewed-by: Eric Auger eric.auger@redhat.com
Thanks
Eric
drivers/vfio/vfio_iommu_type1.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index 3ddc375e7063..528e39a1c2dd 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -384,6 +384,8 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, down_read(&mm->mmap_sem);
+ vaddr = untagged_addr(vaddr);
+
vma = find_vma_intersection(mm, vaddr, vaddr + 1);
if (vma && vma->vm_flags & VM_PFNMAP) {
This patch is part of a series that extends the arm64 kernel ABI to allow passing tagged user pointers (with the top byte set to something other than 0x00) as syscall arguments.
This patch adds a simple test that calls the uname syscall with a tagged user pointer as an argument. Without the kernel accepting tagged user pointers, the test fails with EFAULT.
Co-developed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Andrey Konovalov andreyknvl@google.com --- tools/testing/selftests/arm64/.gitignore | 2 + tools/testing/selftests/arm64/Makefile | 22 +++++++ .../testing/selftests/arm64/run_tags_test.sh | 12 ++++ tools/testing/selftests/arm64/tags_lib.c | 62 +++++++++++++++++++ tools/testing/selftests/arm64/tags_test.c | 18 ++++++ 5 files changed, 116 insertions(+) create mode 100644 tools/testing/selftests/arm64/.gitignore create mode 100644 tools/testing/selftests/arm64/Makefile create mode 100755 tools/testing/selftests/arm64/run_tags_test.sh create mode 100644 tools/testing/selftests/arm64/tags_lib.c create mode 100644 tools/testing/selftests/arm64/tags_test.c
diff --git a/tools/testing/selftests/arm64/.gitignore b/tools/testing/selftests/arm64/.gitignore
new file mode 100644
index 000000000000..9b6a568de17f
--- /dev/null
+++ b/tools/testing/selftests/arm64/.gitignore
@@ -0,0 +1,2 @@
+tags_test
+tags_lib.so
diff --git a/tools/testing/selftests/arm64/Makefile b/tools/testing/selftests/arm64/Makefile
new file mode 100644
index 000000000000..9dee18727923
--- /dev/null
+++ b/tools/testing/selftests/arm64/Makefile
@@ -0,0 +1,22 @@
+# SPDX-License-Identifier: GPL-2.0
+
+include ../lib.mk
+
+# ARCH can be overridden by the user for cross compiling
+ARCH ?= $(shell uname -m 2>/dev/null || echo not)
+
+ifneq (,$(filter $(ARCH),aarch64 arm64))
+
+TEST_CUSTOM_PROGS := $(OUTPUT)/tags_test
+
+$(OUTPUT)/tags_test: tags_test.c $(OUTPUT)/tags_lib.so
+	$(CC) -o $@ $(CFLAGS) $(LDFLAGS) $<
+
+$(OUTPUT)/tags_lib.so: tags_lib.c
+	$(CC) -o $@ -shared $(CFLAGS) $(LDFLAGS) $^
+
+TEST_PROGS := run_tags_test.sh
+
+all: $(TEST_CUSTOM_PROGS)
+
+endif
diff --git a/tools/testing/selftests/arm64/run_tags_test.sh b/tools/testing/selftests/arm64/run_tags_test.sh
new file mode 100755
index 000000000000..2bbe0cd4220b
--- /dev/null
+++ b/tools/testing/selftests/arm64/run_tags_test.sh
@@ -0,0 +1,12 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+echo "--------------------"
+echo "running tags test"
+echo "--------------------"
+LD_PRELOAD=./tags_lib.so ./tags_test
+if [ $? -ne 0 ]; then
+	echo "[FAIL]"
+else
+	echo "[PASS]"
+fi
diff --git a/tools/testing/selftests/arm64/tags_lib.c b/tools/testing/selftests/arm64/tags_lib.c
new file mode 100644
index 000000000000..55f64fc1aae6
--- /dev/null
+++ b/tools/testing/selftests/arm64/tags_lib.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdlib.h>
+#include <sys/prctl.h>
+
+#define TAG_SHIFT	(56)
+#define TAG_MASK	(0xffUL << TAG_SHIFT)
+
+#define PR_SET_TAGGED_ADDR_CTRL	55
+#define PR_GET_TAGGED_ADDR_CTRL	56
+#define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
+
+void *__libc_malloc(size_t size);
+void __libc_free(void *ptr);
+void *__libc_realloc(void *ptr, size_t size);
+void *__libc_calloc(size_t nmemb, size_t size);
+
+static void *tag_ptr(void *ptr)
+{
+	static int tagged_addr_err = 1;
+	unsigned long tag = 0;
+
+	/*
+	 * Note that this code is racy. We only use it as a part of a single
+	 * threaded test application. Beware of using in multithreaded ones.
+	 */
+	if (tagged_addr_err == 1)
+		tagged_addr_err = prctl(PR_SET_TAGGED_ADDR_CTRL,
+				PR_TAGGED_ADDR_ENABLE, 0, 0, 0);
+
+	if (!ptr)
+		return ptr;
+	if (!tagged_addr_err)
+		tag = rand() & 0xff;
+
+	return (void *)((unsigned long)ptr | (tag << TAG_SHIFT));
+}
+
+static void *untag_ptr(void *ptr)
+{
+	return (void *)((unsigned long)ptr & ~TAG_MASK);
+}
+
+void *malloc(size_t size)
+{
+	return tag_ptr(__libc_malloc(size));
+}
+
+void free(void *ptr)
+{
+	__libc_free(untag_ptr(ptr));
+}
+
+void *realloc(void *ptr, size_t size)
+{
+	return tag_ptr(__libc_realloc(untag_ptr(ptr), size));
+}
+
+void *calloc(size_t nmemb, size_t size)
+{
+	return tag_ptr(__libc_calloc(nmemb, size));
+}
diff --git a/tools/testing/selftests/arm64/tags_test.c b/tools/testing/selftests/arm64/tags_test.c
new file mode 100644
index 000000000000..263b302874ed
--- /dev/null
+++ b/tools/testing/selftests/arm64/tags_test.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdint.h>
+#include <sys/utsname.h>
+
+int main(void)
+{
+	struct utsname *ptr;
+	int err;
+
+	ptr = (struct utsname *)malloc(sizeof(*ptr));
+	err = uname(ptr);
+	free(ptr);
+	return err;
+}
On 12/06/2019 12:43, Andrey Konovalov wrote:
--- /dev/null
+++ b/tools/testing/selftests/arm64/tags_lib.c
@@ -0,0 +1,62 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stdlib.h>
+#include <sys/prctl.h>
+
+#define TAG_SHIFT	(56)
+#define TAG_MASK	(0xffUL << TAG_SHIFT)
+
+#define PR_SET_TAGGED_ADDR_CTRL	55
+#define PR_GET_TAGGED_ADDR_CTRL	56
+#define PR_TAGGED_ADDR_ENABLE	(1UL << 0)
+
+void *__libc_malloc(size_t size);
+void __libc_free(void *ptr);
+void *__libc_realloc(void *ptr, size_t size);
+void *__libc_calloc(size_t nmemb, size_t size);
this does not work on at least musl.
the most robust solution would be to implement the malloc apis with mmap/munmap/mremap, if that's too cumbersome then use dlsym RTLD_NEXT (although that has the slight wart that in glibc it may call calloc so wrapping calloc that way is tricky).
in simple linux tests i'd just use static or stack allocations or mmap.
if a generic preloadable lib solution is needed then do it properly with pthread_once to avoid races etc.
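To make the pthread_once suggestion concrete, here is a minimal sketch of a race-free opt-in helper. It is an illustration only, not part of the posted patch: the names tagged_addr_init() and the two-argument tag_ptr() are hypothetical, the prctl() constants are the values used in tags_lib.c above, and a 64-bit target is assumed. Older toolchains may need -pthread at link time.

```c
#include <pthread.h>
#include <stdint.h>
#include <sys/prctl.h>

/* Values taken from the tags_lib.c patch; guarded in case headers define them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

#define TAG_SHIFT 56

static pthread_once_t tagged_addr_once = PTHREAD_ONCE_INIT;
static int tagged_addr_enabled;

/* Runs exactly once, even if several threads allocate concurrently. */
static void tagged_addr_init(void)
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0)
		tagged_addr_enabled = 1;
}

/*
 * Hypothetical helper: place 'tag' in the top byte of 'ptr' if the relaxed
 * ABI could be enabled; otherwise return the pointer unchanged.
 */
static void *tag_ptr(void *ptr, unsigned char tag)
{
	pthread_once(&tagged_addr_once, tagged_addr_init);

	if (!ptr || !tagged_addr_enabled)
		return ptr;

	return (void *)((uintptr_t)ptr | ((uintptr_t)tag << TAG_SHIFT));
}
```

On a kernel without the prctl() the call fails and pointers stay untagged, so the helper degrades to a no-op rather than handing invalid pointers to syscalls.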
+static void *tag_ptr(void *ptr)
+{
+	static int tagged_addr_err = 1;
+	unsigned long tag = 0;
+
+	/*
+	 * Note that this code is racy. We only use it as a part of a single
+	 * threaded test application. Beware of using in multithreaded ones.
+	 */
+	if (tagged_addr_err == 1)
+		tagged_addr_err = prctl(PR_SET_TAGGED_ADDR_CTRL,
+				PR_TAGGED_ADDR_ENABLE, 0, 0, 0);
+
+	if (!ptr)
+		return ptr;
+	if (!tagged_addr_err)
+		tag = rand() & 0xff;
+
+	return (void *)((unsigned long)ptr | (tag << TAG_SHIFT));
+}
+
+static void *untag_ptr(void *ptr)
+{
+	return (void *)((unsigned long)ptr & ~TAG_MASK);
+}
+
+void *malloc(size_t size)
+{
+	return tag_ptr(__libc_malloc(size));
+}
...
On arm64 the TCR_EL1.TBI0 bit has always been enabled by the kernel, hence userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
This patchset proposes a relaxation of the ABI that makes it possible to pass tagged pointers to syscalls when these pointers are in memory ranges obtained as described in tagged-address-abi.txt, contained in this patch series.
Since it is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately, this patchset documents a new sysctl interface (/proc/sys/abi/tagged_addr) that can be used to prevent applications from enabling the relaxed ABI, and a new prctl() interface that can be used to enable or disable the relaxed ABI.
This patchset should be merged together with [1].
[1] https://patchwork.kernel.org/cover/10674351/
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>

Vincenzo Frascino (2):
  arm64: Define Documentation/arm64/tagged-address-abi.txt
  arm64: Relax Documentation/arm64/tagged-pointers.txt

 Documentation/arm64/tagged-address-abi.txt | 111 +++++++++++++++++++++
 Documentation/arm64/tagged-pointers.txt    |  23 +++--
 2 files changed, 127 insertions(+), 7 deletions(-)
 create mode 100644 Documentation/arm64/tagged-address-abi.txt
On arm64 the TCR_EL1.TBI0 bit has been always enabled hence the userspace (EL0) is allowed to set a non-zero value in the top byte but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed through this document, it is now possible to pass tagged pointers to the syscalls, when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
This change in the ABI requires a mechanism that lets userspace opt in to the relaxed behaviour.
Specify and document how sysctl and prctl() can be used in combination to let userspace opt in to this feature.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 Documentation/arm64/tagged-address-abi.txt | 111 +++++++++++++++++++++
 1 file changed, 111 insertions(+)
 create mode 100644 Documentation/arm64/tagged-address-abi.txt
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
new file mode 100644
index 000000000000..96e149e2c55c
--- /dev/null
+++ b/Documentation/arm64/tagged-address-abi.txt
@@ -0,0 +1,111 @@
+ARM64 TAGGED ADDRESS ABI
+========================
+
+This document describes the usage and semantics of the Tagged Address
+ABI on arm64.
+
+1. Introduction
+---------------
+
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel,
+hence the userspace (EL0) is allowed to set a non-zero value in the top
+byte but the resulting pointers are not allowed at the user-kernel syscall
+ABI boundary.
+
+This document describes a relaxation of the ABI with which it is possible
+to pass tagged tagged pointers to the syscalls, when these pointers are in
+memory ranges obtained as described in paragraph 2.
+
+Since it is not desirable to relax the ABI to allow tagged user addresses
+into the kernel indiscriminately, arm64 provides a new sysctl interface
+(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
+enabling the relaxed ABI and a new prctl() interface that can be used to
+enable or disable the relaxed ABI.
+
+The sysctl is meant also for testing purposes in order to provide a simple
+way for the userspace to verify the return error checking of the prctl()
+command without having to reconfigure the kernel.
+
+The ABI properties are inherited by threads of the same application and
+fork()'ed children but cleared when a new process is spawn (execve()).
+
+2. ARM64 Tagged Address ABI
+---------------------------
+
+From the kernel syscall interface prospective, we define, for the purposes
+of this document, a "valid tagged pointer" as a pointer that either it has
+a zero value set in the top byte or it has a non-zero value, it is in memory
+ranges privately owned by a userspace process and it is obtained in one of
+the following ways:
+ - mmap() done by the process itself, where either:
+   * flags = MAP_PRIVATE | MAP_ANONYMOUS
+   * flags = MAP_PRIVATE and the file descriptor refers to a regular
+     file or "/dev/zero"
+ - a mapping below sbrk(0) done by the process itself
+ - any memory mapped by the kernel in the process's address space during
+   creation and following the restrictions presented above (i.e. data, bss,
+   stack).
+
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can
+control it using the following prctl()s:
+ - PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
+ - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
+   Address ABI.
+
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
+the ABI guarantees the following behaviours:
+
+ - Every current or newly introduced syscall can accept any valid tagged
+   pointers.
+
+ - If a non valid tagged pointer is passed to a syscall then the behaviour
+   is undefined.
+
+ - Every valid tagged pointer is expected to work as an untagged one.
+
+ - The kernel preserves any valid tagged pointers and returns them to the
+   userspace unchanged in all the cases except the ones documented in the
+   "Preserving tags" paragraph of tagged-pointers.txt.
+
+A definition of the meaning of tagged pointers on arm64 can be found in:
+Documentation/arm64/tagged-pointers.txt.
+
+3. ARM64 Tagged Address ABI Exceptions
+--------------------------------------
+
+The behaviours described in paragraph 2, with particular reference to the
+acceptance by the syscalls of any valid tagged pointer are not applicable
+to the following cases:
+ - mmap() addr parameter.
+ - mremap() new_address parameter.
+ - prctl_set_mm() struct prctl_map fields.
+ - prctl_set_mm_map() struct prctl_map fields.
+
+4. Example of correct usage
+---------------------------
+
+void main(void)
+{
+	static int tbi_enabled = 0;
+	unsigned long tag = 0;
+
+	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			 MAP_ANONYMOUS, -1, 0);
+
+	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
+		  0, 0, 0) == 0)
+		tbi_enabled = 1;
+
+	if (!ptr)
+		return -1;
+
+	if (tbi_enabled)
+		tag = rand() & 0xff;
+
+	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+	*ptr = 'a';
+
+	...
+}
+
Hi Vincenzo,
Some minor comments below but it looks fine to me overall. Cc'ing Szabolcs as well since I'd like a view from the libc people.
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..96e149e2c55c --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt @@ -0,0 +1,111 @@ +ARM64 TAGGED ADDRESS ABI +========================
+This document describes the usage and semantics of the Tagged Address +ABI on arm64.
+1. Introduction +---------------
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel, +hence the userspace (EL0) is allowed to set a non-zero value in the top
I'd be clearer here: "userspace (EL0) is allowed to perform a user memory access through a 64-bit pointer with a non-zero top byte" (or something along those lines). Otherwise, setting a non-zero top byte is allowed on any architecture; dereferencing it is the problem.
+byte but the resulting pointers are not allowed at the user-kernel syscall +ABI boundary.
+This document describes a relaxation of the ABI with which it is possible
"relaxation of the ABI that makes it possible to..."
+to pass tagged tagged pointers to the syscalls, when these pointers are in +memory ranges obtained as described in paragraph 2.
"section 2" is better. There are a lot more paragraphs.
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI.
+The sysctl is meant also for testing purposes in order to provide a simple +way for the userspace to verify the return error checking of the prctl() +command without having to reconfigure the kernel.
+The ABI properties are inherited by threads of the same application and +fork()'ed children but cleared when a new process is spawn (execve()).
"spawned".
I guess you could drop these three paragraphs here and mention the inheritance properties when introducing the prctl() below. You can also mention the global sysctl switch after the prctl() was introduced.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes +of this document, a "valid tagged pointer" as a pointer that either it has
"either has" (no 'it') sounds slightly better but I'm not a native English speaker either.
+a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
+ - mmap() done by the process itself, where either:
+   * flags = MAP_PRIVATE | MAP_ANONYMOUS
+   * flags = MAP_PRIVATE and the file descriptor refers to a regular
+     file or "/dev/zero"
+ - a mapping below sbrk(0) done by the process itself
+ - any memory mapped by the kernel in the process's address space during
+   creation and following the restrictions presented above (i.e. data, bss,
+   stack).
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
enable or disable (not sure we need the latter but it doesn't hurt).
I'd add the arg2 description here as well.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
+ - Every current or newly introduced syscall can accept any valid tagged
+   pointers.
+
+ - If a non valid tagged pointer is passed to a syscall then the behaviour
+   is undefined.
+
+ - Every valid tagged pointer is expected to work as an untagged one.
+
+ - The kernel preserves any valid tagged pointers and returns them to the
+   userspace unchanged in all the cases except the ones documented in the
+   "Preserving tags" paragraph of tagged-pointers.txt.
I'd think we need to qualify the context here in which the kernel preserves the tagged pointers. Did you mean on the syscall return?
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the
"section 2"
+acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
+4. Example of correct usage +---------------------------
+void main(void)
+{
+	static int tbi_enabled = 0;
+	unsigned long tag = 0;
+
+	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			 MAP_ANONYMOUS, -1, 0);
+
+	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
+		  0, 0, 0) == 0)
+		tbi_enabled = 1;
+
+	if (!ptr)
+		return -1;
+
+	if (tbi_enabled)
+		tag = rand() & 0xff;
+
+	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+	*ptr = 'a';
+
+	...
+}
-- 2.21.0
Hi Catalin,
On 12/06/2019 16:35, Catalin Marinas wrote:
Hi Vincenzo,
Some minor comments below but it looks fine to me overall. Cc'ing Szabolcs as well since I'd like a view from the libc people.
Thanks for this, I saw Szabolcs' comments.
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..96e149e2c55c --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt @@ -0,0 +1,111 @@ +ARM64 TAGGED ADDRESS ABI +========================
+This document describes the usage and semantics of the Tagged Address +ABI on arm64.
+1. Introduction +---------------
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel, +hence the userspace (EL0) is allowed to set a non-zero value in the top
I'd be clearer here: "userspace (EL0) is allowed to perform a user memory access through a 64-bit pointer with a non-zero top byte" (or something along those lines). Otherwise, setting a non-zero top byte is allowed on any architecture; dereferencing it is the problem.
Ok.
+byte but the resulting pointers are not allowed at the user-kernel syscall +ABI boundary.
+This document describes a relaxation of the ABI with which it is possible
"relaxation of the ABI that makes it possible to..."
+to pass tagged tagged pointers to the syscalls, when these pointers are in +memory ranges obtained as described in paragraph 2.
"section 2" is better. There are a lot more paragraphs.
Agree.
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI.
+The sysctl is meant also for testing purposes in order to provide a simple +way for the userspace to verify the return error checking of the prctl() +command without having to reconfigure the kernel.
+The ABI properties are inherited by threads of the same application and +fork()'ed children but cleared when a new process is spawn (execve()).
"spawned".
I guess you could drop these three paragraphs here and mention the inheritance properties when introducing the prctl() below. You can also mention the global sysctl switch after the prctl() was introduced.
I will move the last two (rewording them) to the _section_ 2, but I would still prefer the Introduction to give an overview of the solution as well.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes +of this document, a "valid tagged pointer" as a pointer that either it has
"either has" (no 'it') sounds slightly better but I'm not a native English speaker either.
+a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
+ - mmap() done by the process itself, where either:
+   * flags = MAP_PRIVATE | MAP_ANONYMOUS
+   * flags = MAP_PRIVATE and the file descriptor refers to a regular
+     file or "/dev/zero"
+ - a mapping below sbrk(0) done by the process itself
+ - any memory mapped by the kernel in the process's address space during
+   creation and following the restrictions presented above (i.e. data, bss,
+   stack).
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
enable or disable (not sure we need the latter but it doesn't hurt).
I'd add the arg2 description here as well.
Good point, I missed this.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
+ - Every current or newly introduced syscall can accept any valid tagged
+   pointers.
+
+ - If a non valid tagged pointer is passed to a syscall then the behaviour
+   is undefined.
+
+ - Every valid tagged pointer is expected to work as an untagged one.
+
+ - The kernel preserves any valid tagged pointers and returns them to the
+   userspace unchanged in all the cases except the ones documented in the
+   "Preserving tags" paragraph of tagged-pointers.txt.
I'd think we need to qualify the context here in which the kernel preserves the tagged pointers. Did you mean on the syscall return?
What this means is that on syscall return the tags are preserved, but if, for example, you have tagged pointers inside siginfo_t, they will not be preserved, because according to tagged-pointers.txt non-zero tags are not preserved when delivering signals.
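As an illustration of the practical consequence for userspace: a signal handler that compares a fault address from siginfo_t against a pointer it holds should compare with the top byte masked off on both sides, since the tag may not survive signal delivery. The sketch below assumes a 64-bit target and an 8-bit tag in bits 63:56; untag_addr() and same_untagged_addr() are hypothetical helper names, not part of the proposed ABI.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_SHIFT 56
#define TAG_MASK  ((uint64_t)0xff << TAG_SHIFT)

/* Strip the top-byte tag, leaving only the address bits. */
static inline uint64_t untag_addr(const void *p)
{
	return (uint64_t)(uintptr_t)p & ~TAG_MASK;
}

/*
 * Compare two user pointers while ignoring their tags, e.g. a tagged heap
 * pointer against an si_addr value whose tag may have been cleared.
 */
static bool same_untagged_addr(const void *a, const void *b)
{
	return untag_addr(a) == untag_addr(b);
}
```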
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the
"section 2"
+acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
+4. Example of correct usage +---------------------------
+void main(void)
+{
+	static int tbi_enabled = 0;
+	unsigned long tag = 0;
+
+	char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			 MAP_ANONYMOUS, -1, 0);
+
+	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
+		  0, 0, 0) == 0)
+		tbi_enabled = 1;
+
+	if (!ptr)
+		return -1;
+
+	if (tbi_enabled)
+		tag = rand() & 0xff;
+
+	ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+	*ptr = 'a';
+
+	...
+}
-- 2.21.0
On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
Hi Catalin,
On 12/06/2019 16:35, Catalin Marinas wrote:
Hi Vincenzo,
Some minor comments below but it looks fine to me overall. Cc'ing Szabolcs as well since I'd like a view from the libc people.
Thanks for this, I saw Szabolcs' comments.
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..96e149e2c55c --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt
[...]
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI.
+The sysctl is meant also for testing purposes in order to provide a simple +way for the userspace to verify the return error checking of the prctl() +command without having to reconfigure the kernel.
+The ABI properties are inherited by threads of the same application and +fork()'ed children but cleared when a new process is spawn (execve()).
"spawned".
I'd just say "cleared by execve()."
"Spawn" suggests (v)fork+exec to me (at least, that's what "spawn" means on certain other OSes).
I guess you could drop these three paragraphs here and mention the inheritance properties when introducing the prctl() below. You can also mention the global sysctl switch after the prctl() was introduced.
I will move the last two (rewording them) to the _section_ 2, but I would still prefer the Introduction to give an overview of the solution as well.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes +of this document, a "valid tagged pointer" as a pointer that either it has
"either has" (no 'it') sounds slightly better but I'm not a native English speaker either.
+a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
+ - mmap() done by the process itself, where either:
+   * flags = MAP_PRIVATE | MAP_ANONYMOUS
+   * flags = MAP_PRIVATE and the file descriptor refers to a regular
+     file or "/dev/zero"
+ - a mapping below sbrk(0) done by the process itself
+ - any memory mapped by the kernel in the process's address space during
+   creation and following the restrictions presented above (i.e. data, bss,
+   stack).
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
enable or disable (not sure we need the latter but it doesn't hurt).
I'd add the arg2 description here as well.
Good point, I missed this.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
For both prctls, you should also document the zeroed arguments up to arg5 (unless we get rid of the enforcement and just ignore them).
Is there a canonical way to detect whether this whole API/ABI is available? (i.e., try to call this prctl / check for an HWCAP bit, etc.)
[...]
Cheers ---Dave
On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
On 12/06/2019 16:35, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
[...]
Is there a canonical way to detect whether this whole API/ABI is available? (i.e., try to call this prctl / check for an HWCAP bit, etc.)
The canonical way is a prctl() call. HWCAP doesn't make sense since it's not a hardware feature. If you really want a different way of detecting this (which I don't think is worth it), we can reinstate the AT_FLAGS bit.
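For illustration, a probe along these lines could look as follows. This is a sketch under assumptions, not an interface defined by the series: the option numbers are the ones used in the selftest patch earlier in the thread, tagged_addr_status() and its enum are hypothetical, and any failure of the prctl() is treated as "not available", which is how an older kernel without the option would appear.

```c
#include <sys/prctl.h>

/* Values from the selftest patch in this thread; guarded in case headers define them. */
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

enum tagged_addr_state {
	TAGGED_ADDR_UNAVAILABLE = -1,	/* prctl() failed: assume no support */
	TAGGED_ADDR_OFF = 0,		/* supported, currently disabled */
	TAGGED_ADDR_ON = 1,		/* supported and enabled */
};

/* Hypothetical helper: query the current state without changing it. */
static enum tagged_addr_state tagged_addr_status(void)
{
	int ret = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);

	if (ret < 0)
		return TAGGED_ADDR_UNAVAILABLE;

	return (ret & PR_TAGGED_ADDR_ENABLE) ? TAGGED_ADDR_ON : TAGGED_ADDR_OFF;
}
```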
On Thu, Jun 13, 2019 at 01:28:21PM +0100, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
On 12/06/2019 16:35, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
[...]
Is there a canonical way to detect whether this whole API/ABI is available? (i.e., try to call this prctl / check for an HWCAP bit, etc.)
The canonical way is a prctl() call. HWCAP doesn't make sense since it's not a hardware feature. If you really want a different way of detecting this (which I don't think is worth it), we can reinstate the AT_FLAGS bit.
Sure, I think this probably makes sense -- I'm still getting my head around which parts of the design are directly related to MTE and which aren't.
I was a bit concerned about the interaction between PR_SET_TAGGED_ADDR_CTRL and the sysctl: the caller might conclude that this API is unavailable when actually tagged addresses are stuck on.
I'm not sure whether this matters, but it's a bit weird.
One option would be to change the semantics, so that the sysctl just forbids turning tagging from off to on. Alternatively, we could return a different error code to distinguish this case.
Or we just leave it as proposed.
Cheers ---Dave
On Thu, Jun 13, 2019 at 02:23:43PM +0100, Dave P Martin wrote:
On Thu, Jun 13, 2019 at 01:28:21PM +0100, Catalin Marinas wrote:
On Thu, Jun 13, 2019 at 12:37:32PM +0100, Dave P Martin wrote:
On Thu, Jun 13, 2019 at 11:15:34AM +0100, Vincenzo Frascino wrote:
On 12/06/2019 16:35, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 03:21:10PM +0100, Vincenzo Frascino wrote:
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
[...]
Is there a canonical way to detect whether this whole API/ABI is available? (i.e., try to call this prctl / check for an HWCAP bit, etc.)
The canonical way is a prctl() call. HWCAP doesn't make sense since it's not a hardware feature. If you really want a different way of detecting this (which I don't think is worth it), we can reinstate the AT_FLAGS bit.
Sure, I think this probably makes sense -- I'm still getting my head around which parts of the design are directly related to MTE and which aren't.
I was a bit concerned about the interaction between PR_SET_TAGGED_ADDR_CTRL and the sysctl: the caller might conclude that this API is unavailable when actually tagged addresses are stuck on.
I'm not sure whether this matters, but it's a bit weird.
One option would be to change the semantics, so that the sysctl just forbids turning tagging from off to on. Alternatively, we could return a different error code to distinguish this case.
This is the intention, just to forbid turning tagging on. We could return -EPERM instead, though my original intent was to simply pretend that the prctl does not exist like in an older kernel version.
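To show what the two options would mean for a caller, here is a hedged sketch that classifies the outcome of the enable call. Note the assumptions: EPERM is only the alternative floated above and is not something the posted patches return, EINVAL is what an unknown prctl() option yields on an older kernel, and classify_enable_errno() and try_enable_tagged_addr() are hypothetical names.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Values from the selftest patch in this thread; guarded in case headers define them. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

enum enable_result {
	ENABLE_OK,		/* relaxed ABI is now on */
	ENABLE_UNSUPPORTED,	/* EINVAL: old kernel, or sysctl hiding the prctl */
	ENABLE_FORBIDDEN,	/* EPERM: hypothetical "blocked by sysctl" code */
	ENABLE_OTHER_ERROR,
};

/* Map an errno value from the enable call to a result (0 means success). */
static enum enable_result classify_enable_errno(int err)
{
	switch (err) {
	case 0:
		return ENABLE_OK;
	case EINVAL:
		return ENABLE_UNSUPPORTED;
	case EPERM:
		return ENABLE_FORBIDDEN;
	default:
		return ENABLE_OTHER_ERROR;
	}
}

static enum enable_result try_enable_tagged_addr(void)
{
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0)
		return classify_enable_errno(0);

	return classify_enable_errno(errno);
}
```

With the "pretend the prctl does not exist" behaviour, a caller cannot tell the first two failure cases apart, which is the ambiguity raised earlier in the thread.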
On 12/06/2019 15:21, Vincenzo Frascino wrote:
On arm64 the TCR_EL1.TBI0 bit has been always enabled hence the userspace (EL0) is allowed to set a non-zero value in the top byte but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed through this document, it is now possible to pass tagged pointers to the syscalls, when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
This change in the ABI requires a mechanism that lets userspace opt in to the relaxed behaviour.
Specify and document how sysctl and prctl() can be used in combination to let userspace opt in to this feature.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>

 Documentation/arm64/tagged-address-abi.txt | 111 +++++++++++++++++++++
 1 file changed, 111 insertions(+)
 create mode 100644 Documentation/arm64/tagged-address-abi.txt
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..96e149e2c55c --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt @@ -0,0 +1,111 @@ +ARM64 TAGGED ADDRESS ABI +========================
+This document describes the usage and semantics of the Tagged Address +ABI on arm64.
+1. Introduction +---------------
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the arm64 kernel, +hence the userspace (EL0) is allowed to set a non-zero value in the top +byte but the resulting pointers are not allowed at the user-kernel syscall +ABI boundary.
+This document describes a relaxation of the ABI with which it is possible +to pass tagged tagged pointers to the syscalls, when these pointers are in
^^^^^^^^^^^^^ typo.
+memory ranges obtained as described in paragraph 2.
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI.
+The sysctl is meant also for testing purposes in order to provide a simple +way for the userspace to verify the return error checking of the prctl() +command without having to reconfigure the kernel.
+The ABI properties are inherited by threads of the same application and +fork()'ed children but cleared when a new process is spawn (execve()).
OK.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^ perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
+ - mmap() done by the process itself, where either:
+   * flags = MAP_PRIVATE | MAP_ANONYMOUS
+   * flags = MAP_PRIVATE and the file descriptor refers to a regular
+     file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
+ - any memory mapped by the kernel in the process's address space during
+   creation and following the restrictions presented above (i.e. data, bss,
+   stack).
OK.
Can a null pointer have a tag? (in case NULL is valid to pass to a syscall)
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
+ - Every current or newly introduced syscall can accept any valid tagged
+   pointers.
+
+ - If a non valid tagged pointer is passed to a syscall then the behaviour
+   is undefined.
+
+ - Every valid tagged pointer is expected to work as an untagged one.
+
+ - The kernel preserves any valid tagged pointers and returns them to the
+   userspace unchanged in all the cases except the ones documented in the
+   "Preserving tags" paragraph of tagged-pointers.txt.
OK.
i guess pointers of another process are not "valid tagged pointers" for the current one, so e.g. in ptrace the ptracer has to clear the tags before PEEK etc.
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
i don't understand the exception: does it mean that passing a tagged address to these syscalls is undefined?
+4. Example of correct usage +---------------------------
+void main(void) +{
- static int tbi_enabled = 0;
- unsigned long tag = 0;
- char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS, -1, 0);
- if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
0, 0, 0) == 0)
tbi_enabled = 1;
- if (!ptr)
return -1;
mmap returns MAP_FAILED on failure.
- if (tbi_enabled)
tag = rand() & 0xff;
- ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
- *ptr = 'a';
- ...
+}
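For reference, a version of the example with the review comment above folded in might look like the sketch below. It compares the mmap() result against MAP_FAILED, adds MAP_PRIVATE to the flags, and returns an int. The names and values that are not in the patch are assumptions: TAG_SHIFT is taken to be 56, the fallback prctl constants are the ones found in mainline include/uapi/linux/prctl.h (this revision of the series may have differed), and the __aarch64__ guard and the function name tagged_store_demo() are purely illustrative.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Assumption: fallback values as in mainline include/uapi/linux/prctl.h. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#endif
#ifndef PR_TAGGED_ADDR_ENABLE
#define PR_TAGGED_ADDR_ENABLE (1UL << 0)
#endif

/* Assumption: the tag occupies bits 63:56 of a 64-bit pointer. */
#define TAG_SHIFT 56

/* Hypothetical helper: returns 0 on success, -1 on failure. */
static int tagged_store_demo(void)
{
	size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
	int tbi_enabled = 0;
	uint64_t tag = 0;
	char *base, *ptr;
	int ok;

	base = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)	/* mmap() reports failure as MAP_FAILED, not NULL */
		return -1;

#ifdef __aarch64__
	/* Only opt in (and hence only tag) on arm64. */
	if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0)
		tbi_enabled = 1;
#endif

	if (tbi_enabled)
		tag = (uint64_t)(rand() & 0xff);

	/* With tag == 0 this leaves the pointer unchanged. */
	ptr = (char *)((uintptr_t)base | (uintptr_t)(tag << TAG_SHIFT));
	*ptr = 'a';
	ok = (*base == 'a');	/* the tagged store must hit the same byte */

	munmap(base, page_size);
	return ok ? 0 : -1;
}
```

Calling tagged_store_demo() from an int main(void) and returning its result gives a complete program. When the prctl() is unavailable the tag stays 0, so the store goes through an ordinary untagged pointer.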
Hi Szabolcs,
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^
perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags = MAP_PRIVATE | MAP_ANONYMOUS
- flags = MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
Just to clarify, this document tries to define the memory ranges from where tagged addresses can be passed into the kernel in the context of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN should not affect this.
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
- any memory mapped by the kernel in the process's address space during
- creation and following the restrictions presented above (i.e. data, bss,
- stack).
OK.
Can a null pointer have a tag? (in case NULL is valid to pass to a syscall)
Good point. I don't think it can. We may change this for MTE where we give a hint tag but no hint address, however, this document only covers TBI for now.
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
- If a non valid tagged pointer is passed to a syscall then the behaviour
- is undefined.
- Every valid tagged pointer is expected to work as an untagged one.
- The kernel preserves any valid tagged pointers and returns them to the
- userspace unchanged in all the cases except the ones documented in the
- "Preserving tags" paragraph of tagged-pointers.txt.
OK.
i guess pointers of another process are not "valid tagged pointers" for the current one, so e.g. in ptrace the ptracer has to clear the tags before PEEK etc.
Another good point. Are there any pros/cons here or use-cases? When we add MTE support, should we handle this differently?
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
i don't understand the exception: does it mean that passing a tagged address to these syscalls is undefined?
I'd say it's as undefined as it is right now without these patches. We may be able to explain this better in the document.
On 13/06/2019 10:20, Catalin Marinas wrote:
Hi Szabolcs,
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^
perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags = MAP_PRIVATE | MAP_ANONYMOUS
- flags = MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
Just to clarify, this document tries to define the memory ranges from where tagged addresses can be passed into the kernel in the context of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN should not affect this.
yes, so either the text should list MAP_* flags that don't affect the pointer tagging semantics or specify private|anon mapping with different wording.
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
- any memory mapped by the kernel in the process's address space during
- creation and following the restrictions presented above (i.e. data, bss,
- stack).
OK.
Can a null pointer have a tag? (in case NULL is valid to pass to a syscall)
Good point. I don't think it can. We may change this for MTE where we give a hint tag but no hint address, however, this document only covers TBI for now.
OK.
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
- If a non valid tagged pointer is passed to a syscall then the behaviour
- is undefined.
- Every valid tagged pointer is expected to work as an untagged one.
- The kernel preserves any valid tagged pointers and returns them to the
- userspace unchanged in all the cases except the ones documented in the
- "Preserving tags" paragraph of tagged-pointers.txt.
OK.
i guess pointers of another process are not "valid tagged pointers" for the current one, so e.g. in ptrace the ptracer has to clear the tags before PEEK etc.
Another good point. Are there any pros/cons here or use-cases? When we add MTE support, should we handle this differently?
i'm not sure what gdb does currently, but it has an 'address_significant' hook used at a few places that drops the tag on aarch64, so it probably avoids passing tagged pointer to ptrace.
i was worried about strace which tries to print structs passed to syscalls and follow pointers in them which currently would work, but if we allow tags in syscalls then it needs some update. (i haven't checked the strace code though)
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
i don't understand the exception: does it mean that passing a tagged address to these syscalls is undefined?
I'd say it's as undefined as it is right now without these patches. We may be able to explain this better in the document.
Hi Szabolcs,
thank you for your review.
On 13/06/2019 11:14, Szabolcs Nagy wrote:
On 13/06/2019 10:20, Catalin Marinas wrote:
Hi Szabolcs,
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^
perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags = MAP_PRIVATE | MAP_ANONYMOUS
- flags = MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
Just to clarify, this document tries to define the memory ranges from where tagged addresses can be passed into the kernel in the context of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN should not affect this.
yes, so either the text should list MAP_* flags that don't affect the pointer tagging semantics or specify private|anon mapping with different wording.
Good point. Could you please propose a wording that would be suitable for this case?
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
I used sbrk(0) to mean "end of the process's data segment", not to imply that it is a syscall, but as a convenient way to identify the mapping. I agree that it is a POSIX function implemented by libc, but when called with 0 it returns the current location of the program break. The break can be changed by brk(), and depending on the new address passed to that syscall this has the effect of allocating or deallocating memory.
Would replacing sbrk(0) with "end of the process's data segment" make it clearer?
I will add what you are suggesting about the heap area.
- any memory mapped by the kernel in the process's address space during
- creation and following the restrictions presented above (i.e. data, bss,
- stack).
OK.
Can a null pointer have a tag? (in case NULL is valid to pass to a syscall)
Good point. I don't think it can. We may change this for MTE where we give a hint tag but no hint address, however, this document only covers TBI for now.
OK.
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable the Tagged Address ABI.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
- If a non valid tagged pointer is passed to a syscall then the behaviour
- is undefined.
- Every valid tagged pointer is expected to work as an untagged one.
- The kernel preserves any valid tagged pointers and returns them to the
- userspace unchanged in all the cases except the ones documented in the
- "Preserving tags" paragraph of tagged-pointers.txt.
OK.
i guess pointers of another process are not "valid tagged pointers" for the current one, so e.g. in ptrace the ptracer has to clear the tags before PEEK etc.
Another good point. Are there any pros/cons here or use-cases? When we add MTE support, should we handle this differently?
i'm not sure what gdb does currently, but it has an 'address_significant' hook used at a few places that drops the tag on aarch64, so it probably avoids passing tagged pointer to ptrace.
i was worried about strace which tries to print structs passed to syscalls and follow pointers in them which currently would work, but if we allow tags in syscalls then it needs some update. (i haven't checked the strace code though)
+A definition of the meaning of tagged pointers on arm64 can be found in:
+Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in paragraph 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
i don't understand the exception: does it mean that passing a tagged address to these syscalls is undefined?
I'd say it's as undefined as it is right now without these patches. We may be able to explain this better in the document.
On 13/06/2019 12:16, Vincenzo Frascino wrote:
Hi Szabolcs,
thank you for your review.
On 13/06/2019 11:14, Szabolcs Nagy wrote:
On 13/06/2019 10:20, Catalin Marinas wrote:
Hi Szabolcs,
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^
perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags = MAP_PRIVATE | MAP_ANONYMOUS
- flags = MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
Just to clarify, this document tries to define the memory ranges from where tagged addresses can be passed into the kernel in the context of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN should not affect this.
yes, so either the text should list MAP_* flags that don't affect the pointer tagging semantics or specify private|anon mapping with different wording.
Good point. Could you please propose a wording that would be suitable for this case?
i don't know all the MAP_ magic, but i think it's enough to change the "flags =" to
- flags have MAP_PRIVATE and MAP_ANONYMOUS set or
- flags have MAP_PRIVATE set and the file descriptor refers to...
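Read as a predicate over the mmap() arguments, the proposed wording could be sketched as follows. This is an illustration of the rule as worded in this thread, not kernel code: the function name mapping_accepts_tags() is hypothetical, and identifying "/dev/zero" by character device numbers 1:5 is an assumption based on the usual Linux device numbering.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdbool.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>

/*
 * Hypothetical check: would a mapping created with these flags and
 * this fd be a source of valid tagged pointers under the proposed text?
 */
static bool mapping_accepts_tags(int flags, int fd)
{
	struct stat st;

	/* Compare the mapping type; a plain bit test would also match
	 * MAP_SHARED_VALIDATE, which shares a bit with MAP_PRIVATE. */
	if ((flags & MAP_TYPE) != MAP_PRIVATE)
		return false;

	/* Private anonymous memory; MAP_FIXED, MAP_GROWSDOWN etc. do not matter. */
	if (flags & MAP_ANONYMOUS)
		return true;

	/* Private file mapping: regular file or /dev/zero only. */
	if (fstat(fd, &st) != 0)
		return false;
	if (S_ISREG(st.st_mode))
		return true;
	return S_ISCHR(st.st_mode) &&
	       major(st.st_rdev) == 1 && minor(st.st_rdev) == 5;
}
```

Testing "flags have X set" rather than "flags = X" is what lets additional flags such as MAP_FIXED pass through unchanged.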
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
I used sbrk(0) with the meaning of "end of the process's data segment" not implying that this is a syscall, but just as a useful way to identify the mapping. I agree that it is a posix function implemented by libc but when it is used with 0 finds the current location of the program break, which can be changed by brk() and depending on the new address passed to this syscall can have the effect of allocating or deallocating memory.
Will changing sbrk(0) with "end of the process's data segment" make it more clear?
i don't understand what's the relevance of the *end* of the data segment.
i'd expect the text to say something about the address range of the data segment.
i can do
mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
and it will be below the end of the data segment.
I will add what you are suggesting about the heap area.
On 13/06/2019 13:28, Szabolcs Nagy wrote:
On 13/06/2019 12:16, Vincenzo Frascino wrote:
Hi Szabolcs,
thank you for your review.
On 13/06/2019 11:14, Szabolcs Nagy wrote:
On 13/06/2019 10:20, Catalin Marinas wrote:
Hi Szabolcs,
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface prospective, we define, for the purposes
^^^^^^^^^^^
perspective
+of this document, a "valid tagged pointer" as a pointer that either it has +a zero value set in the top byte or it has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags = MAP_PRIVATE | MAP_ANONYMOUS
- flags = MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
this does not make it clear if MAP_FIXED or other flags are valid (there are many map flags i don't know, but at least fixed should work and stack/growsdown. i'd expect anything that's not incompatible with private|anon to work).
Just to clarify, this document tries to define the memory ranges from where tagged addresses can be passed into the kernel in the context of TBI only (not MTE); that is for hwasan support. FIXED or GROWSDOWN should not affect this.
yes, so either the text should list MAP_* flags that don't affect the pointer tagging semantics or specify private|anon mapping with different wording.
Good point. Could you please propose a wording that would be suitable for this case?
i don't know all the MAP_ magic, but i think it's enough to change the "flags =" to
- flags have MAP_PRIVATE and MAP_ANONYMOUS set or
- flags have MAP_PRIVATE set and the file descriptor refers to...
Fine by me. I will add it in the next iteration.
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
I used sbrk(0) with the meaning of "end of the process's data segment" not implying that this is a syscall, but just as a useful way to identify the mapping. I agree that it is a posix function implemented by libc but when it is used with 0 finds the current location of the program break, which can be changed by brk() and depending on the new address passed to this syscall can have the effect of allocating or deallocating memory.
Will changing sbrk(0) with "end of the process's data segment" make it more clear?
i don't understand what's the relevance of the *end* of the data segment.
i'd expect the text to say something about the address range of the data segment.
i can do
mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
and it will be below the end of the data segment.
As far as I understand the data segment "lives" below the program break, hence it is a way of describing the range from which the user can obtain a valid tagged pointer.
That said, I am not really sure how you want me to document this (my aim is for this to be clear to userspace developers). Could you please propose something?
I will add what you are suggesting about the heap area.
On 13/06/2019 15:03, Vincenzo Frascino wrote:
On 13/06/2019 13:28, Szabolcs Nagy wrote:
On 13/06/2019 12:16, Vincenzo Frascino wrote:
On 13/06/2019 11:14, Szabolcs Nagy wrote:
On 13/06/2019 10:20, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
I used sbrk(0) with the meaning of "end of the process's data segment" not implying that this is a syscall, but just as a useful way to identify the mapping. I agree that it is a posix function implemented by libc but when it is used with 0 finds the current location of the program break, which can be changed by brk() and depending on the new address passed to this syscall can have the effect of allocating or deallocating memory.
Will changing sbrk(0) with "end of the process's data segment" make it more clear?
i don't understand what's the relevance of the *end* of the data segment.
i'd expect the text to say something about the address range of the data segment.
i can do
mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
and it will be below the end of the data segment.
As far as I understand the data segment "lives" below the program break, hence it is a way of describing the range from which the user can obtain a valid tagged pointer.
That said, I am not really sure how you want me to document this (my aim is for this to be clear to userspace developers). Could you please propose something?
[...], it is in the memory ranges privately owned by a userspace process and it is obtained in one of the following ways:
- mmap done by the process itself, [...]
- brk syscall done by the process itself. (i.e. the heap area between the initial location of the program break at process creation and its current location.)
- any memory mapped by the kernel [...]
the data segment that's part of the process image is already covered by the last point.
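The "heap area" in the second bullet can be pictured as a half-open address range. The sketch below assumes the initial break is sampled with sbrk(0) as early as possible in the program; that only approximates "at process creation", since libc may already have moved the break before user code runs. The names initial_brk, record_initial_brk() and in_heap_area() are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Program break sampled at startup (approximation of the initial break). */
static uintptr_t initial_brk;

/* Call once, as early as possible. */
static void record_initial_brk(void)
{
	initial_brk = (uintptr_t)sbrk(0);
}

/* True if p lies in [initial break, current break). */
static bool in_heap_area(const void *p)
{
	uintptr_t addr = (uintptr_t)p;
	uintptr_t cur = (uintptr_t)sbrk(0);

	return addr >= initial_brk && addr < cur;
}
```

An mmap() with MAP_FIXED at a low address, as in the counter-example earlier in the thread, falls below the initial break and so is outside this range, which is the distinction the "below sbrk(0)" wording could not express.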
On 13/06/2019 16:32, Szabolcs Nagy wrote:
On 13/06/2019 15:03, Vincenzo Frascino wrote:
On 13/06/2019 13:28, Szabolcs Nagy wrote:
On 13/06/2019 12:16, Vincenzo Frascino wrote:
On 13/06/2019 11:14, Szabolcs Nagy wrote:
On 13/06/2019 10:20, Catalin Marinas wrote:
On Wed, Jun 12, 2019 at 05:30:34PM +0100, Szabolcs Nagy wrote:
On 12/06/2019 15:21, Vincenzo Frascino wrote:
- a mapping below sbrk(0) done by the process itself
doesn't the mmap rule cover this?
IIUC it doesn't cover it as that's memory mapped by the kernel automatically on access vs a pointer returned by mmap(). The statement above talks about how the address is obtained by the user.
ok i read 'mapping below sbrk' as an mmap (possibly MAP_FIXED) that happens to be below the heap area.
i think "below sbrk(0)" is not the best term to use: there may be address range below the heap area that can be mmapped and thus below sbrk(0) and sbrk is a posix api not a linux syscall, the libc can implement it with mmap or whatever.
i'm not sure what the right term for 'heap area' is (the address range between syscall(__NR_brk,0) at program startup and its current value?)
I used sbrk(0) with the meaning of "end of the process's data segment" not implying that this is a syscall, but just as a useful way to identify the mapping. I agree that it is a posix function implemented by libc but when it is used with 0 finds the current location of the program break, which can be changed by brk() and depending on the new address passed to this syscall can have the effect of allocating or deallocating memory.
Will changing sbrk(0) with "end of the process's data segment" make it more clear?
i don't understand what's the relevance of the *end* of the data segment.
i'd expect the text to say something about the address range of the data segment.
i can do
mmap((void*)65536, 65536, PROT_READ|PROT_WRITE, MAP_FIXED|MAP_SHARED|MAP_ANON, -1, 0);
and it will be below the end of the data segment.
As far as I understand the data segment "lives" below the program break, hence it is a way of describing the range from which the user can obtain a valid tagged pointer.
That said, I am not really sure how you want me to document this (my aim is for this to be clear to userspace developers). Could you please propose something?
[...], it is in the memory ranges privately owned by a userspace process and it is obtained in one of the following ways:
mmap done by the process itself, [...]
brk syscall done by the process itself. (i.e. the heap area between the initial location of the program break at process creation and its current location.)
any memory mapped by the kernel [...]
the data segment that's part of the process image is already covered by the last point.
Thanks Szabolcs, I will update the document accordingly.
On arm64 the TCR_EL1.TBI0 bit has been always enabled hence the userspace (EL0) is allowed to set a non-zero value in the top byte but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed in this set, it is now possible to pass tagged pointers to the syscalls, when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
Relax the requirements described in tagged-pointers.txt to be compliant with the behaviours guaranteed by the ARM64 Tagged Address ABI.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 Documentation/arm64/tagged-pointers.txt | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/Documentation/arm64/tagged-pointers.txt b/Documentation/arm64/tagged-pointers.txt
index a25a99e82bb1..db58a7e95805 100644
--- a/Documentation/arm64/tagged-pointers.txt
+++ b/Documentation/arm64/tagged-pointers.txt
@@ -18,7 +18,8 @@ Passing tagged addresses to the kernel
 --------------------------------------

 All interpretation of userspace memory addresses by the kernel assumes
-an address tag of 0x00.
+an address tag of 0x00, unless the userspace opts-in the ARM64 Tagged
+Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().

 This includes, but is not limited to, addresses found in:

@@ -31,18 +32,23 @@ This includes, but is not limited to, addresses found in:
  - the frame pointer (x29) and frame records, e.g. when interpreting
    them to generate a backtrace or call graph.

-Using non-zero address tags in any of these locations may result in an
-error code being returned, a (fatal) signal being raised, or other modes
-of failure.
+Using non-zero address tags in any of these locations when the
+userspace application did not opt-in to the ARM64 Tagged Address ABI,
+may result in an error code being returned, a (fatal) signal being raised,
+or other modes of failure.

-For these reasons, passing non-zero address tags to the kernel via
-system calls is forbidden, and using a non-zero address tag for sp is
-strongly discouraged.
+For these reasons, when the userspace application did not opt-in, passing
+non-zero address tags to the kernel via system calls is forbidden, and using
+a non-zero address tag for sp is strongly discouraged.

 Programs maintaining a frame pointer and frame records that use non-zero
 address tags may suffer impaired or inaccurate debug and profiling
 visibility.

+A definition of the meaning of ARM64 Tagged Address ABI and of the
+guarantees that the ABI provides when the userspace opts-in via prctl()
+can be found in: Documentation/arm64/tagged-address-abi.txt.
+
 Preserving tags
 ---------------

@@ -57,6 +63,9 @@ be preserved.
 The architecture prevents the use of a tagged PC, so the upper byte will
 be set to a sign-extension of bit 55 on exception return.

+This behaviours are preserved even when the the userspace opts-in the ARM64
+Tagged Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().
+
 Other considerations
 --------------------
A couple of minor nits below.
On Wed, Jun 12, 2019 at 03:21:11PM +0100, Vincenzo Frascino wrote:
--- a/Documentation/arm64/tagged-pointers.txt +++ b/Documentation/arm64/tagged-pointers.txt @@ -18,7 +18,8 @@ Passing tagged addresses to the kernel
All interpretation of userspace memory addresses by the kernel assumes -an address tag of 0x00. +an address tag of 0x00, unless the userspace opts-in the ARM64 Tagged +Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl(). This includes, but is not limited to, addresses found in: @@ -31,18 +32,23 @@ This includes, but is not limited to, addresses found in:
- the frame pointer (x29) and frame records, e.g. when interpreting them to generate a backtrace or call graph.
-Using non-zero address tags in any of these locations may result in an -error code being returned, a (fatal) signal being raised, or other modes -of failure. +Using non-zero address tags in any of these locations when the +userspace application did not opt-in to the ARM64 Tagged Address ABI,
Nitpick: drop the comma after "ABI," since a predicate follows.
+may result in an error code being returned, a (fatal) signal being raised, +or other modes of failure. -For these reasons, passing non-zero address tags to the kernel via -system calls is forbidden, and using a non-zero address tag for sp is -strongly discouraged. +For these reasons, when the userspace application did not opt-in, passing +non-zero address tags to the kernel via system calls is forbidden, and using +a non-zero address tag for sp is strongly discouraged. Programs maintaining a frame pointer and frame records that use non-zero address tags may suffer impaired or inaccurate debug and profiling visibility. +A definition of the meaning of ARM64 Tagged Address ABI and of the +guarantees that the ABI provides when the userspace opts-in via prctl() +can be found in: Documentation/arm64/tagged-address-abi.txt.
Preserving tags
@@ -57,6 +63,9 @@ be preserved. The architecture prevents the use of a tagged PC, so the upper byte will be set to a sign-extension of bit 55 on exception return. +This behaviours are preserved even when the the userspace opts-in the ARM64
"These" ... "opts in to"
+Tagged Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().
Other considerations
-- 2.21.0
On 12/06/2019 15:21, Vincenzo Frascino wrote:
On arm64 the TCR_EL1.TBI0 bit has always been enabled, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed in this set, it is now possible to pass tagged pointers to the syscalls, when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
Relax the requirements described in tagged-pointers.txt to be compliant with the behaviours guaranteed by the ARM64 Tagged Address ABI.
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com CC: Andrey Konovalov andreyknvl@google.com Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com
Documentation/arm64/tagged-pointers.txt | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
diff --git a/Documentation/arm64/tagged-pointers.txt b/Documentation/arm64/tagged-pointers.txt index a25a99e82bb1..db58a7e95805 100644 --- a/Documentation/arm64/tagged-pointers.txt +++ b/Documentation/arm64/tagged-pointers.txt @@ -18,7 +18,8 @@ Passing tagged addresses to the kernel
All interpretation of userspace memory addresses by the kernel assumes -an address tag of 0x00. +an address tag of 0x00, unless the userspace opts-in the ARM64 Tagged +Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl(). This includes, but is not limited to, addresses found in: @@ -31,18 +32,23 @@ This includes, but is not limited to, addresses found in:
- the frame pointer (x29) and frame records, e.g. when interpreting them to generate a backtrace or call graph.
-Using non-zero address tags in any of these locations may result in an -error code being returned, a (fatal) signal being raised, or other modes -of failure. +Using non-zero address tags in any of these locations when the +userspace application did not opt-in to the ARM64 Tagged Address ABI, +may result in an error code being returned, a (fatal) signal being raised, +or other modes of failure. -For these reasons, passing non-zero address tags to the kernel via -system calls is forbidden, and using a non-zero address tag for sp is -strongly discouraged. +For these reasons, when the userspace application did not opt-in, passing +non-zero address tags to the kernel via system calls is forbidden, and using +a non-zero address tag for sp is strongly discouraged. Programs maintaining a frame pointer and frame records that use non-zero address tags may suffer impaired or inaccurate debug and profiling visibility. +A definition of the meaning of ARM64 Tagged Address ABI and of the +guarantees that the ABI provides when the userspace opts-in via prctl() +can be found in: Documentation/arm64/tagged-address-abi.txt.
OK.
Preserving tags
@@ -57,6 +63,9 @@ be preserved. The architecture prevents the use of a tagged PC, so the upper byte will be set to a sign-extension of bit 55 on exception return. +This behaviours are preserved even when the the userspace opts-in the ARM64
these behaviours.
+Tagged Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().
Other considerations
On arm64 the TCR_EL1.TBI0 bit has always been enabled in the kernel, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
This patchset proposes a relaxation of the ABI that makes it possible to pass tagged pointers to the syscalls when these pointers are in memory ranges obtained as described in tagged-address-abi.txt, which is part of this patch series.
Since it is not desirable to relax the ABI to allow tagged user addresses into the kernel indiscriminately, this patchset documents a new sysctl interface (/proc/sys/abi/tagged_addr) that can be used to prevent applications from enabling the relaxed ABI, and a new prctl() interface that can be used to enable or disable the relaxed ABI.
This patchset should be merged together with [1].
[1] https://patchwork.kernel.org/cover/10674351/
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com CC: Andrey Konovalov andreyknvl@google.com CC: Alexander Viro viro@zeniv.linux.org.uk Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com
Vincenzo Frascino (2): arm64: Define Documentation/arm64/tagged-address-abi.txt arm64: Relax Documentation/arm64/tagged-pointers.txt
Documentation/arm64/tagged-address-abi.txt | 134 +++++++++++++++++++++ Documentation/arm64/tagged-pointers.txt | 23 ++-- 2 files changed, 150 insertions(+), 7 deletions(-) create mode 100644 Documentation/arm64/tagged-address-abi.txt
On arm64 the TCR_EL1.TBI0 bit has always been enabled, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed through this document, it is now possible to pass tagged pointers to the syscalls when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
This change in the ABI requires a mechanism for the userspace to opt in to such an option.
Specify and document the way in which sysctl and prctl() can be used in combination to allow the userspace to opt in to this feature.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 Documentation/arm64/tagged-address-abi.txt | 134 +++++++++++++++++++++
 1 file changed, 134 insertions(+)
 create mode 100644 Documentation/arm64/tagged-address-abi.txt

diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt
new file mode 100644
index 000000000000..0ae900d4bb2d
--- /dev/null
+++ b/Documentation/arm64/tagged-address-abi.txt
@@ -0,0 +1,134 @@
+ARM64 TAGGED ADDRESS ABI
+========================
+
+This document describes the usage and semantics of the Tagged Address
+ABI on arm64.
+
+1. Introduction
+---------------
+
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the kernel, hence
+the userspace (EL0) is entitled to perform a user memory access through a
+64-bit pointer with a non-zero top byte but the resulting pointers are not
+allowed at the user-kernel syscall ABI boundary.
+
+This document describes a relaxation of the ABI that makes it possible to
+to pass tagged pointers to the syscalls, when these pointers are in memory
+ranges obtained as described in section 2.
+
+Since it is not desirable to relax the ABI to allow tagged user addresses
+into the kernel indiscriminately, arm64 provides a new sysctl interface
+(/proc/sys/abi/tagged_addr) that is used to prevent the applications from
+enabling the relaxed ABI and a new prctl() interface that can be used to
+enable or disable the relaxed ABI.
+A detailed description of the newly introduced mechanisms will be provided
+in section 2.
+
+2. ARM64 Tagged Address ABI
+---------------------------
+
+From the kernel syscall interface perspective, we define, for the purposes
+of this document, a "valid tagged pointer" as a pointer that either has a
+zero value set in the top byte or has a non-zero value, it is in memory
+ranges privately owned by a userspace process and it is obtained in one of
+the following ways:
+  - mmap() done by the process itself, where either:
+    * flags have MAP_PRIVATE and MAP_ANONYMOUS
+    * flags have MAP_PRIVATE and the file descriptor refers to a regular
+      file or "/dev/zero"
+  - brk() system call done by the process itself (i.e. the heap area between
+    the initial location of the program break at process creation and its
+    current location).
+  - any memory mapped by the kernel in the process's address space during
+    creation and following the restrictions presented above (i.e. data, bss,
+    stack).
+
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can
+control it using the following:
+ - /proc/sys/abi/tagged_addr: a new sysctl interface that can be used to
+        prevent the applications from enabling the relaxed ABI.
+        The sysctl is meant also for testing purposes in order to provide a
+        simple way for the userspace to verify the return error checking of
+        the prctl() commands without having to reconfigure the kernel.
+        The sysctl supports the following configuration options:
+         - 0: Disable ARM64 Tagged Address ABI for all the applications.
+         - 1 (Default): Enable ARM64 Tagged Address ABI for all the
+                        applications.
+        If the ARM64 Tagged Address ABI is disabled at a certain point in
+        time, all the applications that were using tagging before this event
+        occurs, will continue to use tagging.
+
+ - prctl()s:
+  - PR_SET_TAGGED_ADDR_CTRL: can be used to enable or disable the Tagged
+        Address ABI.
+        The (unsigned int) arg2 argument is a bit mask describing the
+        control mode used:
+          - PR_TAGGED_ADDR_ENABLE: Enable ARM64 Tagged Address ABI.
+        The arguments arg3, arg4, and arg5 are ignored.
+
+  - PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
+        Address ABI.
+        The arguments arg2, arg3, arg4, and arg5 are ignored.
+
+The ABI properties set by the mechanisms described above are inherited by threads
+of the same application and fork()'ed children but cleared by execve().
+
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
+the ABI guarantees the following behaviours:
+
+  - Every current or newly introduced syscall can accept any valid tagged
+    pointers.
+
+  - If a non valid tagged pointer is passed to a syscall then the behaviour
+    is undefined.
+
+  - Every valid tagged pointer is expected to work as an untagged one.
+
+  - The kernel preserves any valid tagged pointers and returns them to the
+    userspace unchanged (i.e. on syscall return) in all the cases except the
+    ones documented in the "Preserving tags" section of tagged-pointers.txt.
+
+A definition of the meaning of tagged pointers on arm64 can be found in:
+Documentation/arm64/tagged-pointers.txt.
+
+3. ARM64 Tagged Address ABI Exceptions
+--------------------------------------
+
+The behaviours described in section 2, with particular reference to the
+acceptance by the syscalls of any valid tagged pointer are not applicable
+to the following cases:
+  - mmap() addr parameter.
+  - mremap() new_address parameter.
+  - prctl_set_mm() struct prctl_map fields.
+  - prctl_set_mm_map() struct prctl_map fields.
+
+Any attempt to use non-zero tagged pointers will lead to undefined behaviour.
+
+4. Example of correct usage
+---------------------------
+
+void main(void)
+{
+        static int tbi_enabled = 0;
+        unsigned long tag = 0;
+
+        char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
+                         MAP_ANONYMOUS, -1, 0);
+
+        if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
+                  0, 0, 0) == 0)
+                tbi_enabled = 1;
+
+        if (ptr == (void *)-1) /* MAP_FAILED */
+                return -1;
+
+        if (tbi_enabled)
+                tag = rand() & 0xff;
+
+        ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
+
+        *ptr = 'a';
+
+        ...
+}
+
On 13/06/2019 16:51, Vincenzo Frascino wrote:
On arm64 the TCR_EL1.TBI0 bit has always been enabled, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed through this document, it is now possible to pass tagged pointers to the syscalls when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
This change in the ABI requires a mechanism for the userspace to opt in to such an option.
Specify and document the way in which sysctl and prctl() can be used in combination to allow the userspace to opt in to this feature.
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com CC: Andrey Konovalov andreyknvl@google.com Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com
Acked-by: Szabolcs Nagy szabolcs.nagy@arm.com
Documentation/arm64/tagged-address-abi.txt | 134 +++++++++++++++++++++ 1 file changed, 134 insertions(+) create mode 100644 Documentation/arm64/tagged-address-abi.txt
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..0ae900d4bb2d --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt @@ -0,0 +1,134 @@ +ARM64 TAGGED ADDRESS ABI +========================
+This document describes the usage and semantics of the Tagged Address +ABI on arm64.
+1. Introduction +---------------
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the kernel, hence +the userspace (EL0) is entitled to perform a user memory access through a +64-bit pointer with a non-zero top byte but the resulting pointers are not +allowed at the user-kernel syscall ABI boundary.
+This document describes a relaxation of the ABI that makes it possible to +to pass tagged pointers to the syscalls, when these pointers are in memory +ranges obtained as described in section 2.
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI. +A detailed description of the newly introduced mechanisms will be provided +in section 2.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface perspective, we define, for the purposes +of this document, a "valid tagged pointer" as a pointer that either has a +zero value set in the top byte or has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of +the following ways:
- mmap() done by the process itself, where either:
- flags have MAP_PRIVATE and MAP_ANONYMOUS
- flags have MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
- brk() system call done by the process itself (i.e. the heap area between
- the initial location of the program break at process creation and its
- current location).
- any memory mapped by the kernel in the process's address space during
- creation and following the restrictions presented above (i.e. data, bss,
- stack).
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following:
- /proc/sys/abi/tagged_addr: a new sysctl interface that can be used to
prevent the applications from enabling the relaxed ABI.
The sysctl is meant also for testing purposes in order to provide a
simple way for the userspace to verify the return error checking of
the prctl() commands without having to reconfigure the kernel.
The sysctl supports the following configuration options:
- 0: Disable ARM64 Tagged Address ABI for all the applications.
- 1 (Default): Enable ARM64 Tagged Address ABI for all the
applications.
If the ARM64 Tagged Address ABI is disabled at a certain point in
time, all the applications that were using tagging before this event
occurs, will continue to use tagging.
- prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable or disable the Tagged
Address ABI.
The (unsigned int) arg2 argument is a bit mask describing the
control mode used:
- PR_TAGGED_ADDR_ENABLE: Enable ARM64 Tagged Address ABI.
The arguments arg3, arg4, and arg5 are ignored.
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
The arguments arg2, arg3, arg4, and arg5 are ignored.
+The ABI properties set by the mechanisms described above are inherited by threads +of the same application and fork()'ed children but cleared by execve().
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications, +the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
- If a non valid tagged pointer is passed to a syscall then the behaviour
- is undefined.
- Every valid tagged pointer is expected to work as an untagged one.
- The kernel preserves any valid tagged pointers and returns them to the
- userspace unchanged (i.e. on syscall return) in all the cases except the
- ones documented in the "Preserving tags" section of tagged-pointers.txt.
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in section 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
+Any attempt to use non-zero tagged pointers will lead to undefined behaviour.
+4. Example of correct usage +---------------------------
+void main(void) +{
- static int tbi_enabled = 0;
- unsigned long tag = 0;
- char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS, -1, 0);
- if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
0, 0, 0) == 0)
tbi_enabled = 1;
- if (ptr == (void *)-1) /* MAP_FAILED */
return -1;
- if (tbi_enabled)
tag = rand() & 0xff;
- ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
- *ptr = 'a';
- ...
+}
I'm happy with the ABI overall, but I think we need a few more tweaks.
On 13/06/2019 16:51, Vincenzo Frascino wrote:
On arm64 the TCR_EL1.TBI0 bit has always been enabled, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed through this document, it is now possible to pass tagged pointers to the syscalls when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
This change in the ABI requires a mechanism for the userspace to opt in to such an option.
Specify and document the way in which sysctl and prctl() can be used in combination to allow the userspace to opt in to this feature.
Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will.deacon@arm.com CC: Andrey Konovalov andreyknvl@google.com Signed-off-by: Vincenzo Frascino vincenzo.frascino@arm.com
Documentation/arm64/tagged-address-abi.txt | 134 +++++++++++++++++++++ 1 file changed, 134 insertions(+) create mode 100644 Documentation/arm64/tagged-address-abi.txt
diff --git a/Documentation/arm64/tagged-address-abi.txt b/Documentation/arm64/tagged-address-abi.txt new file mode 100644 index 000000000000..0ae900d4bb2d --- /dev/null +++ b/Documentation/arm64/tagged-address-abi.txt @@ -0,0 +1,134 @@ +ARM64 TAGGED ADDRESS ABI +========================
+This document describes the usage and semantics of the Tagged Address +ABI on arm64.
+1. Introduction +---------------
+On arm64 the TCR_EL1.TBI0 bit has been always enabled on the kernel, hence
"been always" -> "always been"
+the userspace (EL0) is entitled to perform a user memory access through a +64-bit pointer with a non-zero top byte but the resulting pointers are not +allowed at the user-kernel syscall ABI boundary.
+This document describes a relaxation of the ABI that makes it possible to +to pass tagged pointers to the syscalls, when these pointers are in memory +ranges obtained as described in section 2.
+Since it is not desirable to relax the ABI to allow tagged user addresses +into the kernel indiscriminately, arm64 provides a new sysctl interface +(/proc/sys/abi/tagged_addr) that is used to prevent the applications from +enabling the relaxed ABI and a new prctl() interface that can be used to +enable or disable the relaxed ABI. +A detailed description of the newly introduced mechanisms will be provided +in section 2.
+2. ARM64 Tagged Address ABI +---------------------------
+From the kernel syscall interface perspective, we define, for the purposes +of this document, a "valid tagged pointer" as a pointer that either has a +zero value set in the top byte or has a non-zero value, it is in memory +ranges privately owned by a userspace process and it is obtained in one of
Remove all the remaining "it": "a pointer that either [...], is in memory ranges [...] and is obtained..."
+the following ways:
- mmap() done by the process itself, where either:
- flags have MAP_PRIVATE and MAP_ANONYMOUS
- flags have MAP_PRIVATE and the file descriptor refers to a regular
file or "/dev/zero"
- brk() system call done by the process itself (i.e. the heap area between
- the initial location of the program break at process creation and its
- current location).
- any memory mapped by the kernel in the process's address space during
- creation and following the restrictions presented above (i.e. data, bss,
- stack).
As I commented on v2, the "i.e." is not correct: these 3 sections are not the only ones that are covered by this ABI (.text also is, for instance). Replacing "i.e." with "e.g." would work.
Also, since the rules above say explicitly "done by the process itself", it might be clearer to replace "following the restrictions presented above" with "with the same restrictions as for mmap()".
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following:
- /proc/sys/abi/tagged_addr: a new sysctl interface that can be used to
prevent the applications from enabling the relaxed ABI.
The sysctl is meant also for testing purposes in order to provide a
simple way for the userspace to verify the return error checking of
the prctl() commands without having to reconfigure the kernel.
The sysctl supports the following configuration options:
- 0: Disable ARM64 Tagged Address ABI for all the applications.
- 1 (Default): Enable ARM64 Tagged Address ABI for all the
applications.
I find this very confusing, because it suggests that the default value of PR_GET_TAGGED_ADDR_CTRL for new processes will be set to the value of this sysctl, when in fact this sysctl is about restricting the *availability* of the new ABI. Instead of disabling the ABI, I would talk about disabling access to the new ABI here.
If the ARM64 Tagged Address ABI is disabled at a certain point in
time, all the applications that were using tagging before this event
occurs, will continue to use tagging.
- prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable or disable the Tagged
Address ABI.
The (unsigned int) arg2 argument is a bit mask describing the
control mode used:
- PR_TAGGED_ADDR_ENABLE: Enable ARM64 Tagged Address ABI.
The arguments arg3, arg4, and arg5 are ignored.
Have we definitely decided that arg{3,4,5} are ignored? Catalin?
- PR_GET_TAGGED_ADDR_CTRL: can be used to check the status of the Tagged
Address ABI.
The arguments arg2, arg3, arg4, and arg5 are ignored.
+The ABI properties set by the mechanisms described above are inherited by threads +of the same application and fork()'ed children but cleared by execve().
+As a consequence of invoking PR_SET_TAGGED_ADDR_CTRL prctl() by an applications,
I think this is too vague, you can use this prctl() to disable the new ABI, and it can also fail. Maybe it's best to simply say that the process has successfully opted into the new ABI.
+the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
"pointer". Also, is it really useful to talk about newly introduced syscall? New from which point of view?
- If a non valid tagged pointer is passed to a syscall then the behaviour
- is undefined.
- Every valid tagged pointer is expected to work as an untagged one.
- The kernel preserves any valid tagged pointers and returns them to the
"pointer", "returns it"
- userspace unchanged (i.e. on syscall return) in all the cases except the
- ones documented in the "Preserving tags" section of tagged-pointers.txt.
+A definition of the meaning of tagged pointers on arm64 can be found in: +Documentation/arm64/tagged-pointers.txt.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in section 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
prctl_set_mm() and prctl_set_mm_map() are internal kernel functions, not syscall names. IIUC, we don't want to allow any address field settable via the PR_SET_MM prctl() to be tagged. Catalin, is that correct? I think this needs rephrasing.
Kevin
+Any attempt to use non-zero tagged pointers will lead to undefined behaviour.
+4. Example of correct usage +---------------------------
+void main(void) +{
- static int tbi_enabled = 0;
- unsigned long tag = 0;
- char *ptr = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS, -1, 0);
- if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE,
0, 0, 0) == 0)
tbi_enabled = 1;
- if (ptr == (void *)-1) /* MAP_FAILED */
return -1;
- if (tbi_enabled)
tag = rand() & 0xff;
- ptr = (char *)((unsigned long)ptr | (tag << TAG_SHIFT));
- *ptr = 'a';
- ...
+}
On Tue, Jun 18, 2019 at 02:13:01PM +0100, Kevin Brodsky wrote:
On 13/06/2019 16:51, Vincenzo Frascino wrote:
+The ARM64 Tagged Address ABI is an opt-in feature, and an application can +control it using the following:
- /proc/sys/abi/tagged_addr: a new sysctl interface that can be used to
prevent the applications from enabling the relaxed ABI.
The sysctl is meant also for testing purposes in order to provide a
simple way for the userspace to verify the return error checking of
the prctl() commands without having to reconfigure the kernel.
The sysctl supports the following configuration options:
- 0: Disable ARM64 Tagged Address ABI for all the applications.
- 1 (Default): Enable ARM64 Tagged Address ABI for all the
applications.
I find this very confusing, because it suggests that the default value of PR_GET_TAGGED_ADDR_CTRL for new processes will be set to the value of this sysctl, when in fact this sysctl is about restricting the *availability* of the new ABI. Instead of disabling the ABI, I would talk about disabling access to the new ABI here.
This bullet point needs to be re-written. The sysctl is meant to disable opting in to the ABI. I'd also drop the "meant for testing" part. I put it in my commit log as justification but I don't think it should be part of the ABI document.
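To make the distinction between the two knobs concrete, a small sketch along these lines may help: the sysctl only gates whether opting in is allowed, while PR_GET_TAGGED_ADDR_CTRL reports whether the calling process has actually opted in. This is an illustration only. The helper names read_int_file() and tagged_addr_ctrl_status() are hypothetical, the sysctl path is the one proposed in this series, and the PR_GET_TAGGED_ADDR_CTRL fallback value is an assumption taken from later mainline headers.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Assumption: fallback value matching later mainline <linux/prctl.h>. */
#ifndef PR_GET_TAGGED_ADDR_CTRL
#define PR_GET_TAGGED_ADDR_CTRL 56
#endif

/*
 * Read a single integer from a sysctl-style file.
 * Returns 0 and stores the value on success, -1 if the file is
 * missing or does not start with an integer.
 */
int read_int_file(const char *path, int *value)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/*
 * Query the per-process state. Returns the control bit mask on
 * success, or -1 if the kernel does not support the prctl().
 */
int tagged_addr_ctrl_status(void)
{
    return prctl(PR_GET_TAGGED_ADDR_CTRL, 0UL, 0UL, 0UL, 0UL);
}
```

A caller would pass "/proc/sys/abi/tagged_addr" to read_int_file() to learn whether opting in is permitted system-wide, and compare that with tagged_addr_ctrl_status() for the current process. The two values are independent, which is the point raised in the review.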
- prctl()s:
- PR_SET_TAGGED_ADDR_CTRL: can be used to enable or disable the Tagged
Address ABI.
The (unsigned int) arg2 argument is a bit mask describing the
control mode used:
- PR_TAGGED_ADDR_ENABLE: Enable ARM64 Tagged Address ABI.
The arguments arg3, arg4, and arg5 are ignored.
Have we definitely decided that arg{3,4,5} are ignored? Catalin?
I don't have a strong preference either way. If it's simpler for the user to ignore them, fine by me. I can see in the current prctl commands a mix of ignore vs forced zero.
+the ABI guarantees the following behaviours:
- Every current or newly introduced syscall can accept any valid tagged
- pointers.
"pointer". Also, is it really useful to talk about newly introduced syscall? New from which point of view?
I think we should drop this guarantee. It would have made sense if we allowed tagged pointers everywhere but we already have some exceptions.
+3. ARM64 Tagged Address ABI Exceptions +--------------------------------------
+The behaviours described in section 2, with particular reference to the +acceptance by the syscalls of any valid tagged pointer are not applicable +to the following cases:
- mmap() addr parameter.
- mremap() new_address parameter.
- prctl_set_mm() struct prctl_map fields.
- prctl_set_mm_map() struct prctl_map fields.
prctl_set_mm() and prctl_set_mm_map() are internal kernel functions, not syscall names. IIUC, we don't want to allow any address field settable via the PR_SET_MM prctl() to be tagged. Catalin, is that correct? I think this needs rephrasing.
I fully agree. It should talk about PR_SET_MM, PR_SET_MM_MAP, PR_SET_MM_MAP_SIZE.
On arm64 the TCR_EL1.TBI0 bit has always been enabled, hence the userspace (EL0) is allowed to set a non-zero value in the top byte, but the resulting pointers are not allowed at the user-kernel syscall ABI boundary.
With the relaxed ABI proposed in this set, it is now possible to pass tagged pointers to the syscalls, when these pointers are in memory ranges obtained by an anonymous (MAP_ANONYMOUS) mmap().
Relax the requirements described in tagged-pointers.txt to be compliant with the behaviours guaranteed by the ARM64 Tagged Address ABI.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 Documentation/arm64/tagged-pointers.txt | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/Documentation/arm64/tagged-pointers.txt b/Documentation/arm64/tagged-pointers.txt
index a25a99e82bb1..e33af14478e3 100644
--- a/Documentation/arm64/tagged-pointers.txt
+++ b/Documentation/arm64/tagged-pointers.txt
@@ -18,7 +18,8 @@ Passing tagged addresses to the kernel
 --------------------------------------

 All interpretation of userspace memory addresses by the kernel assumes
-an address tag of 0x00.
+an address tag of 0x00, unless the userspace opts-in the ARM64 Tagged
+Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().

 This includes, but is not limited to, addresses found in:

@@ -31,18 +32,23 @@ This includes, but is not limited to, addresses found in:
  - the frame pointer (x29) and frame records, e.g. when interpreting
    them to generate a backtrace or call graph.

-Using non-zero address tags in any of these locations may result in an
-error code being returned, a (fatal) signal being raised, or other modes
-of failure.
+Using non-zero address tags in any of these locations when the
+userspace application did not opt-in to the ARM64 Tagged Address ABI
+may result in an error code being returned, a (fatal) signal being raised,
+or other modes of failure.

-For these reasons, passing non-zero address tags to the kernel via
-system calls is forbidden, and using a non-zero address tag for sp is
-strongly discouraged.
+For these reasons, when the userspace application did not opt-in, passing
+non-zero address tags to the kernel via system calls is forbidden, and using
+a non-zero address tag for sp is strongly discouraged.

 Programs maintaining a frame pointer and frame records that use non-zero
 address tags may suffer impaired or inaccurate debug and profiling
 visibility.

+A definition of the meaning of ARM64 Tagged Address ABI and of the
+guarantees that the ABI provides when the userspace opts-in via prctl()
+can be found in: Documentation/arm64/tagged-address-abi.txt.
+
 Preserving tags
 ---------------

@@ -57,6 +63,9 @@ be preserved.
 The architecture prevents the use of a tagged PC, so the upper byte will
 be set to a sign-extension of bit 55 on exception return.

+These behaviours are preserved even when the userspace opts-in to the ARM64
+Tagged Address ABI via the PR_SET_TAGGED_ADDR_CTRL prctl().
+
 Other considerations
 --------------------
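As a side note on the "Preserving tags" context quoted in the last hunk: the statement that the upper byte of the PC is set to a sign-extension of bit 55 can be modelled with a few lines of arithmetic. The following is only an illustration of that rule, not kernel code, and sign_extend_bit55() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Model of "the upper byte will be set to a sign-extension of bit 55":
 * bits 63:56 become all ones if bit 55 is set, all zeroes otherwise.
 */
uint64_t sign_extend_bit55(uint64_t addr)
{
    const uint64_t top_byte = UINT64_C(0xff) << 56;

    if (addr & (UINT64_C(1) << 55))
        return addr | top_byte;
    return addr & ~top_byte;
}
```

For a user address (bit 55 clear) this simply drops whatever tag was in the top byte, which is why a tagged PC value does not survive exception return.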