Hi Andrey,
(sorry for the delay)
On Mon, Sep 17, 2018 at 07:01:00PM +0200, Andrey Konovalov wrote:
I took another look at the changes this patchset does to the kernel and here are my thoughts:
I see two ways how a (potentially tagged) user pointer gets into the kernel:
- A pointer is passed to a syscall (directly as an argument or
indirectly as a struct field). 2. A pointer is extracted from user context (registers, etc.) by some kind of a trap/fault handler. (Is there something else?)
Not AFAICT.
In case 1 we also have a special case of a pointer passed to one of the memory syscalls (mmap, mprotect, etc.). These syscalls "are not doing memory accesses but rather dealing with the memory range, hence an untagged pointer is better suited" as pointed out by Catalin (these syscalls do not always use "unsigned long" instead of "void __user *" though, for example shmat uses "void __user *").
If it makes things any simpler, we could revisit this though it seems to me more consistent not to pass a tagged pointer back into the kernel when the original one was untagged (i.e. mmap vs munmap).
(if it wasn't for pointers in structures, I would have just left the problem entirely to the C library to do the untagging before calling the kernel)
Looking at patch #8 ("usb, arm64: untag user addresses in devio") in this series, it seems that that devio ioctl actually accepts a pointer into a vma, so we shouldn't actually be untagging its argument and the patch needs to be dropped.
You are right, the pointer seems to have originated from the kernel as already untagged (mmap() on the driver), so we would expect the user to pass it back an untagged pointer.
Otherwise there's quite a few more cases that needs to be changed (like tcp_zerocopy_receive() for example, more can be found by grepping find_vma() in generic code).
Yes, it's similar to the devio one.
Regarding case 2, it seems that analyzing casts of __user pointers won't really help, since the code (arch/arm64/mm/fault.c) doesn't really use them. However all of this code is arch specific, so it shouldn't really change over time (right?). It looks like dealing with tags passed to the kernel through these fault handlers is already resolved with these patches (and therefore patch #6 ("arm64: untag user address in __do_user_fault") in this series is not actually needed and can be dropped (need to test that)):
I'm less worried about (2) since, as you say, it's under the arch control and, even if it changes slightly over time, we can be aware of this.
Now, I also see two cases when kernel behavior changes depending on whether a pointer is tagged:
- Kernel code checks that a pointer belongs to userspace by comparing
it with TASK_SIZE/addr_limit/user_addr_max()/USER_DS/... . 2. A pointer gets passed to find_vma() or similar functions. (Is there something else?)
I think these are the main cases.
The initial thought that I had here is that the pointers that reach find_vma() must be passed through memory syscalls and therefore shouldn't be untagged and don't require any fixes. There are at least two exceptions to this: 1. get_user_pages() (see patch #4 ("mm, arm64: untag user addresses in mm/gup.c") in this patch series) and 2. __do_page_fault() in arch/arm64/mm/fault.c. Are there any other obvious exceptions?
Vincenzo F did some more in-depth analysis, I'll let him answer whether he has found any more cases.
At a quick grep, arm64_notify_segfault() (it changes the siginfo si_code). There are a few other places which assume that the address is already untagged but we don't seem to have a clear guideline on the ABI. For example, prctl(PR_SET_MM) takes user addresses and we assume they are untagged (as in the mmap() and friends).
I've tried adding BUG_ON(has_tag(addr)) to find_vma() and running a modified syzkaller version that passes tagged pointers to the kernel and failed to find anything else.
I added a similar test with an LD_PRELOAD'ed malloc/free implementation on Debian and seemed alright but I'd rather want something more consistent when defining the user ABI, otherwise we have places where find_vma() is called but requires untagging.
As for case 1, the places where pointers are compared with TASK_SIZE and others can be found with grep. Maybe it makes sense to introduce some kind of routine like is_user_pointer() that handles tagged pointers and refactor the existing code to use it? And maybe add a rule to checkpatch.pl that forbids the direct usage of TASK_SIZE and others.
So I think detecting direct comparisons with TASK_SIZE and others would more useful than finding __user pointer casts (it seems that the latter requires a lot of annotations to be fixed/added), and I should just drop this patch with annotations.
I think point (1) is not too bad, usually found with grep.
As I've said in my previous reply, I kind of came to the same conclusion that searching __user pointer casts to long may not actually scale. If we could add an __untagged annotation to ulong where it matters (e.g. find_vma()), we could identify a ulong (default tagged) and annotate some of those.
However, this analysis on __user * casting was useful even if we don't end up using it. If we come up with a clearer definition of the ABI (which syscalls accept tagged pointers), we may conclude that the only places where untagging matters are a few find_vma() calls in the arch and mm code and can ignore the rest.
FTR, this code is a prerequisite for ARM's memory tagging hardware feature. Interestingly, I can't find SPARC ADI support dealing with this aspect at all. I suspect their user ABI does not allow tagged addresses passed to the kernel.