Re: ❌ FAIL: Test report for kernel 5.4.0-rc2-d6c2c23.cki (stable-next)

15 Oct 2019

On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
...
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
...
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
...
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
...
What do you think we should do here?
It is an ABI break as the man page clearly states that the above case
should return -ENOMEM.
Although I agree with your analysis, I also thought that these sorts of
ABI breaks (changes in error codes) were unfortunately common so we
shouldn't throw the baby out with the bath water here.
...
The options I see:

Revert commit 057d3389108e and try again to document that the memory
syscalls do not support tagged pointers

Change untagged_addr() to only 0-extend from bit 55 or leave the
tag unchanged if bit 55 is 1. We could mask out the tag (0 rather
than sign-extend) but if we had an mlock test passing ULONG_MASK,
then we get -ENOMEM instead of -EINVAL

Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only
break the ABI for applications opting in to this new ABI. We did look
at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on
whether we check the ptrace'd process or the debugger flags

Leave things as they are, consider the address space 56-bit and
change the test to not use LONG_MAX on arm64. This needs to be passed
by the sparc guys since they probably have a similar issue


I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by
Linux. This works if we consider the test to be broken. However, reading
the mlock man page:
   EINVAL The result of the addition addr+len was less than addr
   (e.g., the addition may have resulted in an overflow).

   ENOMEM Some of the specified address range does not correspond to
   mapped pages in the address space of the process.


There is no mention of EINVAL outside the TASK_SIZE, seems to fall more
within the ENOMEM description above.
Sorry, I was being too vague in my wording. What I was trying to say is I'm
ok with (2) or (4), but either way we need to update our ABI documentation
under Documentation/arm64/. I personally don't think userspace will care
about EINVAL vs ENOMEM because the kernel is already horribly unreliable
when it comes to keeping error codes stable, which is why I think we could
get away with (4). For example, Jan (who reported this issue) wrote this
change to LTP last year for one of the mmap tests:
https://github.com/linux-test-project/ltp/commit/e7bab61882847609be3132a2f0d...
The fact that we have tagging at all already means that we differ from
many other architectures.
...
...
...
It's slightly annoying to find this now. We did run (part of) LTP but I
guess we never run the POSIX conformance tests.
Yes, and this stuff was in linux-next for a while so it's worrying that
kernelci didn't spot it either. Hmm.
For some reason the mlock test was skipped around the time we pushed the
TBI patches into -next:
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/ltp-open-posix-tests/...
Coincidence?
...
Internally I don't think we've configured LTP with
--with-open-posix-testsuite, so this explains why we missed it.
Ok, hopefully you can fix that now.
...
...
...
My preference is 2 with a quick attempt below. This basically means
clear the tag if it resembles a valid (tagged) user pointer, otherwise
don't touch it (bit 55 set always means an invalid user pointer). Not
sure how the generated code will look like but we could probably do
something better in assembly directly.
[...]
...

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index b61b50bf68b1..c23c47360664 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)

up with a tagged userland pointer. Clear the tag to get a sane pointer to
pass on to access_ok(), for instance.

*/
-#define untagged_addr(addr)	\
+#define __untagged_addr(addr)	\
   ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
 
+#define untagged_addr(addr)	({					\

u64 __addr = (__force u64)addr;					\
__addr &= __untagged_addr(__addr);				\
(__force __typeof__(addr))__addr;				\

+})



#ifdef CONFIG_KASAN_SW_TAGS
 #define __tag_shifted(tag)	((u64)(tag) << 56)
-#define __tag_reset(addr)	untagged_addr(addr)
+#define __tag_reset(addr)	__untagged_addr(addr)
 #define __tag_get(addr)		(__u8)((u64)(addr) >> 56)
 #else
 #define __tag_shifted(tag)	0UL
This works for me. Szabolcs also suggested just zeroing the top byte but
we wouldn't catch the overflow EINVAL case above, so I'd rather only
mask the tag out if it was a user address (i.e. bit 55 is 0).
I'll spin it as a proper patch while we decide whether we want to do
anything.
Will

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: ❌ FAIL: Test report for kernel 5.4.0-rc2-d6c2c23.cki (stable-next)