Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
We hope that these logs can help you find the problem quickly. For the full detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we ran or if you have any suggestions on how to make future tests more effective.
,-. ,-. ( C ) ( K ) Continuous `-',-.`-' Kernel ( I ) Integration `-' ______________________________________________________________________________
Compile testing ---------------
We compiled the kernel for 3 architectures:
aarch64: make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le: make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64: make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing ---------------- We booted each kernel and ran the following tests:
aarch64: Host 1: ✅ Boot test ✅ xfstests: ext4 ✅ xfstests: xfs ✅ selinux-policy: serge-testsuite ✅ lvm thinp sanity ✅ storage: software RAID testing 🚧 ✅ Storage blktests
Host 2: ✅ Boot test ✅ Podman system integration test (as root) ✅ Podman system integration test (as user) ✅ Loopdev Sanity ✅ jvm test suite ✅ Memory function: memfd_create ✅ AMTU (Abstract Machine Test Utility) ❌ LTP: openposix test suite ✅ Ethernet drivers sanity ✅ Networking socket: fuzz ✅ Networking sctp-auth: sockopts test ✅ Networking: igmp conformance test ✅ Networking TCP: keepalive test ✅ Networking UDP: socket ✅ Networking tunnel: gre basic ✅ Networking tunnel: vxlan basic ✅ audit: audit testsuite test ✅ httpd: mod_ssl smoke sanity ✅ iotop: sanity ✅ tuned: tune-processes-through-perf ✅ Usex - version 1.9-29 ✅ storage: SCSI VPD 🚧 ✅ LTP lite 🚧 ✅ CIFS Connectathon 🚧 ✅ POSIX pjd-fstest suites 🚧 ✅ Memory function: kaslr 🚧 ✅ Networking bridge: sanity 🚧 ✅ Networking MACsec: sanity 🚧 ✅ Networking route: pmtu 🚧 ✅ Networking tunnel: geneve basic test 🚧 ✅ L2TP basic test 🚧 ✅ Networking vnic: ipvlan/basic 🚧 ✅ ALSA PCM loopback test 🚧 ✅ ALSA Control (mixer) Userspace Element test 🚧 ✅ storage: dm/common 🚧 ✅ trace: ftrace/tracer 🚧 ✅ Networking route_func: local 🚧 ✅ Networking route_func: forward 🚧 ✅ Networking ipsec: basic netns transport 🚧 ✅ Networking ipsec: basic netns tunnel
ppc64le: Host 1: ✅ Boot test ✅ xfstests: ext4 ✅ xfstests: xfs ✅ selinux-policy: serge-testsuite ✅ lvm thinp sanity ✅ storage: software RAID testing 🚧 ✅ Storage blktests
Host 2: ✅ Boot test ✅ Podman system integration test (as root) ✅ Podman system integration test (as user) ✅ Loopdev Sanity ✅ jvm test suite ✅ Memory function: memfd_create ✅ AMTU (Abstract Machine Test Utility) ✅ LTP: openposix test suite ✅ Ethernet drivers sanity ✅ Networking socket: fuzz ✅ Networking sctp-auth: sockopts test ✅ Networking TCP: keepalive test ✅ Networking UDP: socket ✅ Networking tunnel: gre basic ✅ Networking tunnel: vxlan basic ✅ audit: audit testsuite test ✅ httpd: mod_ssl smoke sanity ✅ iotop: sanity ✅ tuned: tune-processes-through-perf ✅ Usex - version 1.9-29 🚧 ✅ LTP lite 🚧 ✅ CIFS Connectathon 🚧 ✅ POSIX pjd-fstest suites 🚧 ✅ Memory function: kaslr 🚧 ✅ Networking bridge: sanity 🚧 ✅ Networking MACsec: sanity 🚧 ✅ Networking route: pmtu 🚧 ✅ Networking tunnel: geneve basic test 🚧 ✅ L2TP basic test 🚧 ✅ Networking ipsec: basic netns tunnel 🚧 ✅ Networking vnic: ipvlan/basic 🚧 ✅ ALSA PCM loopback test 🚧 ✅ ALSA Control (mixer) Userspace Element test 🚧 ✅ storage: dm/common 🚧 ✅ trace: ftrace/tracer 🚧 ✅ Networking route_func: local 🚧 ✅ Networking route_func: forward
x86_64: Host 1: ✅ Boot test 🚧 ✅ IPMI driver test 🚧 ✅ IPMItool loop stress test
Host 2: ✅ Boot test ✅ Podman system integration test (as root) ✅ Podman system integration test (as user) ✅ Loopdev Sanity ✅ jvm test suite ✅ Memory function: memfd_create ✅ AMTU (Abstract Machine Test Utility) ✅ LTP: openposix test suite ✅ Ethernet drivers sanity ✅ Networking socket: fuzz ✅ Networking sctp-auth: sockopts test ✅ Networking: igmp conformance test ✅ Networking TCP: keepalive test ✅ Networking UDP: socket ✅ Networking tunnel: gre basic ✅ Networking tunnel: vxlan basic ✅ audit: audit testsuite test ✅ httpd: mod_ssl smoke sanity ✅ iotop: sanity ✅ tuned: tune-processes-through-perf ✅ pciutils: sanity smoke test ✅ Usex - version 1.9-29 ✅ storage: SCSI VPD ✅ stress: stress-ng 🚧 ✅ LTP lite 🚧 ✅ CIFS Connectathon 🚧 ✅ POSIX pjd-fstest suites 🚧 ✅ Memory function: kaslr 🚧 ✅ Networking bridge: sanity 🚧 ✅ Networking MACsec: sanity 🚧 ✅ Networking route: pmtu 🚧 ✅ Networking tunnel: geneve basic test 🚧 ✅ L2TP basic test 🚧 ✅ Networking vnic: ipvlan/basic 🚧 ✅ ALSA PCM loopback test 🚧 ✅ ALSA Control (mixer) Userspace Element test 🚧 ✅ storage: dm/common 🚧 ✅ trace: ftrace/tracer 🚧 ✅ Networking route_func: local 🚧 ✅ Networking route_func: forward 🚧 ✅ Networking ipsec: basic netns transport 🚧 ✅ Networking ipsec: basic netns tunnel
Host 3: ✅ Boot test ✅ Storage SAN device stress - megaraid_sas
Host 4: ✅ Boot test ✅ Storage SAN device stress - mpt3sas driver
Host 5: ✅ Boot test ✅ xfstests: ext4 ✅ xfstests: xfs ✅ selinux-policy: serge-testsuite ✅ lvm thinp sanity ✅ storage: software RAID testing 🚧 ✅ IOMMU boot test 🚧 ✅ Storage blktests
Test sources: https://github.com/CKI-project/tests-beaker 💚 Pull requests are welcome for new tests or improvements to existing tests!
Waived tests ------------ If the test run included waived tests, they are marked with 🚧. Such tests are executed but their results are not taken into account. Tests are waived when their results are not reliable enough, e.g. when they're just introduced or are being fixed.
----- Original Message -----
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
----- Original Message -----
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
+Catalin and Vincenzo
Per Documentation/arm64/tagged-address-abi.rst we now have:
User addresses not accessed by the kernel but used for address space management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use of valid tagged pointers in this context is always allowed.
However this breaks the test above.
What do you think we should do here?
+ Will
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
Per Documentation/arm64/tagged-address-abi.rst we now have:
User addresses not accessed by the kernel but used for address space management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use of valid tagged pointers in this context is always allowed.
However this breaks the test above.
So the problem is that user space passes a 0x7fff_ffff_ffff_f000 start address and untagged_addr sign-extends it to 0xffff_ffff_ffff_f000. The subsequent check in apply_vma_lock_flags() finds that start+PAGE_SIZE is smaller than start, hence -EINVAL instead of -ENOMEM.
What do you think we should do here?
It is an ABI break as the man page clearly states that the above case should return -ENOMEM. The options I see:
1. Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
2. Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
3. Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
4. Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
It's slightly annoying to find this now. We did run (part of) LTP but I guess we never run the POSIX conformance tests.
My preference is 2 with a quick attempt below. This basically means clear the tag if it resembles a valid (tagged) user pointer, otherwise don't touch it (bit 55 set always means an invalid user pointer). Not sure how the generated code will look like but we could probably do something better in assembly directly.
---------8<------------------------------- diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..6b36d080a633 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,15 @@ static inline unsigned long kaslr_offset(void) * up with a tagged userland pointer. Clear the tag to get a sane pointer to * pass on to access_ok(), for instance. */ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
+#define untagged_addr(addr) \ + ((__force u64)(addr) & BIT(55) ? (addr) : __untagged_addr(addr)) + #ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
Per Documentation/arm64/tagged-address-abi.rst we now have:
User addresses not accessed by the kernel but used for address space management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use of valid tagged pointers in this context is always allowed.
However this breaks the test above.
So the problem is that user space passes a 0x7fff_ffff_ffff_f000 start address and untagged_addr sign-extends it to 0xffff_ffff_ffff_f000. The subsequent check in apply_vma_lock_flags() finds that start+PAGE_SIZE is smaller than start, hence -EINVAL instead of -ENOMEM.
What do you think we should do here?
It is an ABI break as the man page clearly states that the above case should return -ENOMEM.
Although I agree with your analysis, I also thought that these sorts of ABI breaks (changes in error codes) were unfortunately common so we shouldn't throw the baby out with the bath water here.
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
It's slightly annoying to find this now. We did run (part of) LTP but I guess we never run the POSIX conformance tests.
Yes, and this stuff was in linux-next for a while so it's worrying that kernelci didn't spot it either. Hmm.
My preference is 2 with a quick attempt below. This basically means clear the tag if it resembles a valid (tagged) user pointer, otherwise don't touch it (bit 55 set always means an invalid user pointer). Not sure how the generated code will look like but we could probably do something better in assembly directly.
---------8<------------------------------- diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..6b36d080a633 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,15 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) \
- ((__force u64)(addr) & BIT(55) ? (addr) : __untagged_addr(addr))
It would be nice not to resort to asm for this, but I think we can do a bit better with something like the code below which just introduces an additional AND instruction.
Will
--->8
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void) * up with a tagged userland pointer. Clear the tag to get a sane pointer to * pass on to access_ok(), for instance. */ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
+#define untagged_addr(addr) ({ \ + u64 __addr = (__force u64)addr; \ + __addr &= __untagged_addr(__addr); \ + (__force __typeof__(addr))__addr; \ +}) + #ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL
Adding Szabolcs and Dave from ARM as we've discussed this internally briefly but we should include the wider audience.
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
Per Documentation/arm64/tagged-address-abi.rst we now have:
User addresses not accessed by the kernel but used for address space management (e.g. ``mmap()``, ``mprotect()``, ``madvise()``). The use of valid tagged pointers in this context is always allowed.
However this breaks the test above.
So the problem is that user space passes a 0x7fff_ffff_ffff_f000 start address and untagged_addr sign-extends it to 0xffff_ffff_ffff_f000. The subsequent check in apply_vma_lock_flags() finds that start+PAGE_SIZE is smaller than start, hence -EINVAL instead of -ENOMEM.
What do you think we should do here?
It is an ABI break as the man page clearly states that the above case should return -ENOMEM.
Although I agree with your analysis, I also thought that these sorts of ABI breaks (changes in error codes) were unfortunately common so we shouldn't throw the baby out with the bath water here.
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow).
ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
It's slightly annoying to find this now. We did run (part of) LTP but I guess we never run the POSIX conformance tests.
Yes, and this stuff was in linux-next for a while so it's worrying that kernelci didn't spot it either. Hmm.
For some reason the mlock test was skipped around the time we pushed the TBI patches into -next:
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/ltp-open-posix-tests/...
Internally I don't think we've configured LTP with --with-open-posix-testsuite, so this explains why we missed it.
My preference is 2 with a quick attempt below. This basically means clear the tag if it resembles a valid (tagged) user pointer, otherwise don't touch it (bit 55 set always means an invalid user pointer). Not sure how the generated code will look like but we could probably do something better in assembly directly.
[...]
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
- u64 __addr = (__force u64)addr; \
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
#ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL
This works for me. Szabolcs also suggested just zeroing the top byte but we wouldn't catch the overflow EINVAL case above, so I'd rather only mask the tag out if it was a user address (i.e. bit 55 is 0).
Hi Catalin,
On 10/15/19 4:26 PM, Catalin Marinas wrote:
Adding Szabolcs and Dave from ARM as we've discussed this internally briefly but we should include the wider audience.
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
[...]
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
I agree with your analysis and vote for option (2) as well, because of what you reported about mlock() error meanings and because being this ABI I would prefer where possible to not diverge from other architectures. [...]
On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
What do you think we should do here?
It is an ABI break as the man page clearly states that the above case should return -ENOMEM.
Although I agree with your analysis, I also thought that these sorts of ABI breaks (changes in error codes) were unfortunately common so we shouldn't throw the baby out with the bath water here.
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
Sorry, I was being too vague in my wording. What I was trying to say is I'm ok with (2) or (4), but either way we need to update our ABI documentation under Documentation/arm64/. I personally don't think userspace will care about EINVAL vs ENOMEM because the kernel is already horribly unreliable when it comes to keeping error codes stable, which is why I think we could get away with (4). For example, Jan (who reported this issue) wrote this change to LTP last year for one of the mmap tests:
https://github.com/linux-test-project/ltp/commit/e7bab61882847609be3132a2f0d...
The fact that we have tagging at all already means that we differ from many other architectures.
It's slightly annoying to find this now. We did run (part of) LTP but I guess we never run the POSIX conformance tests.
Yes, and this stuff was in linux-next for a while so it's worrying that kernelci didn't spot it either. Hmm.
For some reason the mlock test was skipped around the time we pushed the TBI patches into -next:
https://qa-reports.linaro.org/lkft/linux-next-oe/tests/ltp-open-posix-tests/...
Coincidence?
Internally I don't think we've configured LTP with --with-open-posix-testsuite, so this explains why we missed it.
Ok, hopefully you can fix that now.
My preference is 2 with a quick attempt below. This basically means clear the tag if it resembles a valid (tagged) user pointer, otherwise don't touch it (bit 55 set always means an invalid user pointer). Not sure how the generated code will look like but we could probably do something better in assembly directly.
[...]
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
- u64 __addr = (__force u64)addr; \
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
#ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL
This works for me. Szabolcs also suggested just zeroing the top byte but we wouldn't catch the overflow EINVAL case above, so I'd rather only mask the tag out if it was a user address (i.e. bit 55 is 0).
I'll spin it as a proper patch while we decide whether we want to do anything.
Will
On Tue, Oct 15, 2019 at 05:14:53PM +0100, Will Deacon wrote:
On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
Sorry, I was being too vague in my wording. What I was trying to say is I'm ok with (2) or (4), but either way we need to update our ABI documentation under Documentation/arm64/.
Having looked at making that change, I actually think the text is ok as-is if we go with option (2). We only make guarantees about "valid tagged pointer", which are defined to "reference an address in the user process address space" and therefore must have bit 55 == 0.
Untested patch below.
Will
--->8
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001
From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org --- arch/arm64/include/asm/asm-uaccess.h | 7 +++---- arch/arm64/include/asm/memory.h | 10 ++++++++-- arch/arm64/kernel/entry.S | 4 ++-- 3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h index f74909ba29bd..5bf963830b17 100644 --- a/arch/arm64/include/asm/asm-uaccess.h +++ b/arch/arm64/include/asm/asm-uaccess.h @@ -78,10 +78,9 @@ alternative_else_nop_endif /* * Remove the address tag from a virtual address, if present. */ - .macro clear_address_tag, dst, addr - tst \addr, #(1 << 55) - bic \dst, \addr, #(0xff << 56) - csel \dst, \dst, \addr, eq + .macro untagged_addr, dst, addr + sbfx \dst, \addr, #0, #56 + and \dst, \dst, \addr .endm
#endif diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void) * up with a tagged userland pointer. Clear the tag to get a sane pointer to * pass on to access_ok(), for instance. */ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
+#define untagged_addr(addr) ({ \ + u64 __addr = (__force u64)addr; \ + __addr &= __untagged_addr(__addr); \ + (__force __typeof__(addr))__addr; \ +}) + #ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index e304fe04b098..9ae336cc5833 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -604,7 +604,7 @@ el1_da: */ mrs x3, far_el1 inherit_daif pstate=x23, tmp=x2 - clear_address_tag x0, x3 + untagged_addr x0, x3 mov x2, sp // struct pt_regs bl do_mem_abort
@@ -808,7 +808,7 @@ el0_da: mrs x26, far_el1 ct_user_exit_irqoff enable_daif - clear_address_tag x0, x26 + untagged_addr x0, x26 mov x1, x25 mov x2, sp bl do_mem_abort
On Wed, Oct 16, 2019 at 05:29:33AM +0100, Will Deacon wrote:
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001 From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org
Reviewed-by: Catalin Marinas catalin.marinas@arm.com Tested-by: Catalin Marinas catalin.marinas@arm.com
On 10/16/19 5:29 AM, Will Deacon wrote:
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001 From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org
arch/arm64/include/asm/asm-uaccess.h | 7 +++---- arch/arm64/include/asm/memory.h | 10 ++++++++-- arch/arm64/kernel/entry.S | 4 ++-- 3 files changed, 13 insertions(+), 8 deletions(-)
Reviewed-by: Vincenzo Frascino vincenzo.frascino@arm.com Tested-by: Vincenzo Frascino vincenzo.frascino@arm.com
On Wed, Oct 16, 2019 at 6:29 AM Will Deacon will@kernel.org wrote:
On Tue, Oct 15, 2019 at 05:14:53PM +0100, Will Deacon wrote:
On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
Sorry, I was being too vague in my wording. What I was trying to say is I'm ok with (2) or (4), but either way we need to update our ABI documentation under Documentation/arm64/.
Having looked at making that change, I actually think the text is ok as-is if we go with option (2). We only make guarantees about "valid tagged pointer", which are defined to "reference an address in the user process address space" and therefore must have bit 55 == 0.
Untested patch below.
Will
--->8
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001 From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org
Reviewed-by: Andrey Konovalov andreyknvl@google.com
Thanks!
arch/arm64/include/asm/asm-uaccess.h | 7 +++---- arch/arm64/include/asm/memory.h | 10 ++++++++-- arch/arm64/kernel/entry.S | 4 ++-- 3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h index f74909ba29bd..5bf963830b17 100644 --- a/arch/arm64/include/asm/asm-uaccess.h +++ b/arch/arm64/include/asm/asm-uaccess.h @@ -78,10 +78,9 @@ alternative_else_nop_endif /*
- Remove the address tag from a virtual address, if present.
*/
.macro clear_address_tag, dst, addr
tst \addr, #(1 << 55)
bic \dst, \addr, #(0xff << 56)
csel \dst, \dst, \addr, eq
.macro untagged_addr, dst, addr
sbfx \dst, \addr, #0, #56
and \dst, \dst, \addr .endm
#endif diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
+#define untagged_addr(addr) ({ \
u64 __addr = (__force u64)addr; \
__addr &= __untagged_addr(__addr); \
(__force __typeof__(addr))__addr; \
+})
#ifdef CONFIG_KASAN_SW_TAGS #define __tag_shifted(tag) ((u64)(tag) << 56) -#define __tag_reset(addr) untagged_addr(addr) +#define __tag_reset(addr) __untagged_addr(addr) #define __tag_get(addr) (__u8)((u64)(addr) >> 56) #else #define __tag_shifted(tag) 0UL diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index e304fe04b098..9ae336cc5833 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -604,7 +604,7 @@ el1_da: */ mrs x3, far_el1 inherit_daif pstate=x23, tmp=x2
clear_address_tag x0, x3
untagged_addr x0, x3 mov x2, sp // struct pt_regs bl do_mem_abort
@@ -808,7 +808,7 @@ el0_da: mrs x26, far_el1 ct_user_exit_irqoff enable_daif
clear_address_tag x0, x26
untagged_addr x0, x26 mov x1, x25 mov x2, sp bl do_mem_abort
-- 2.23.0.700.g56cf767bdb-goog
----- Original Message -----
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001 From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org
No regressions observed with LTP syscalls/sched/mm/commands and open_posix_testsuite.
Tested-by: Jan Stancek jstancek@redhat.com
On Wed, Oct 16, 2019 at 05:29:33AM +0100, Will Deacon wrote:
On Tue, Oct 15, 2019 at 05:14:53PM +0100, Will Deacon wrote:
On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
Sorry, I was being too vague in my wording. What I was trying to say is I'm ok with (2) or (4), but either way we need to update our ABI documentation under Documentation/arm64/.
Having looked at making that change, I actually think the text is ok as-is if we go with option (2). We only make guarantees about "valid tagged pointer", which are defined to "reference an address in the user process address space" and therefore must have bit 55 == 0.
Untested patch below.
Will
--->8
From 517d979e84191ae9997c9513a88a5b798af6912f Mon Sep 17 00:00:00 2001 From: Will Deacon will@kernel.org Date: Tue, 15 Oct 2019 21:04:18 -0700 Subject: [PATCH] arm64: tags: Preserve tags for addresses translated via TTBR1
Sign-extending TTBR1 addresses when converting to an untagged address breaks the documented POSIX semantics for mlock() in some obscure error cases where we end up returning -EINVAL instead of -ENOMEM as a direct result of rewriting the upper address bits.
Rework the untagged_addr() macro to preserve the upper address bits for TTBR1 addresses and only clear the tag bits for user addresses. This matches the behaviour of the 'clear_address_tag' assembly macro, so rename that and align the implementations at the same time so that they use the same instruction sequences for the tag manipulation.
Cc: Catalin Marinas catalin.marinas@arm.com Link: https://lore.kernel.org/stable/20191014162651.GF19200@arrakis.emea.arm.com/ Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Will Deacon will@kernel.org
arch/arm64/include/asm/asm-uaccess.h | 7 +++---- arch/arm64/include/asm/memory.h | 10 ++++++++-- arch/arm64/kernel/entry.S | 4 ++-- 3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h index f74909ba29bd..5bf963830b17 100644 --- a/arch/arm64/include/asm/asm-uaccess.h +++ b/arch/arm64/include/asm/asm-uaccess.h @@ -78,10 +78,9 @@ alternative_else_nop_endif /*
- Remove the address tag from a virtual address, if present.
*/
- .macro clear_address_tag, dst, addr
- tst \addr, #(1 << 55)
- bic \dst, \addr, #(0xff << 56)
- csel \dst, \dst, \addr, eq
- .macro untagged_addr, dst, addr
- sbfx \dst, \addr, #0, #56
- and \dst, \dst, \addr .endm
#endif diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
Having the same informal name ("untagged") for two different address representations seems like a recipe for confusion. Can we rename one of them? (Say, untagged_address_if_user()?)
- u64 __addr = (__force u64)addr; \
Missing () round addr.
Also, nit: needlessly fragile macro? (OK, callers are unlikely to pass "__addr" for addr, but the __addr variable doesn't do a lot here other than to avoid repeated evaluation of the argument -- I don't expect this to matter given how this macro is used.)
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
Is there any reason why we can't just have
#define untagged_addr ((__force __typeof__(addr))( \ (__force u64)(addr) & GENMASK_ULL(63, 56)))
?
Either way, "kernel" addresses (bit 55 set) become unusable garbage, but we _want_ such addresses passed from userspace to be unusable. Comparison against TASK_SIZE will still police them accurately.
Simply zero-extending would be a less obfuscated way of only ever rounding the address down -- it's the rounding up that spuriously triggers address wraparound and leads to the -EINVAL return.
(Tests for error codes are inherently fragile, and MTE requires POSIX wording to be interpreted in a context not anticipated by the authors -- so I'm still not totally convinced we need a band-aid for this. But anyway...)
[...]
Cheers ---Dave
On Wed, Oct 16, 2019 at 03:44:25PM +0100, Dave P Martin wrote:
On Wed, Oct 16, 2019 at 05:29:33AM +0100, Will Deacon wrote:
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
Having the same informal name ("untagged") for two different address representations seems like a recipe for confusion. Can we rename one of them? (Say, untagged_address_if_user()?)
I agree it's confusing. We can rename the __* one but the other is spread around the kernel (it can be done, though as a subsequent exercise; untagged_uaddr?).
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
Is there any reason why we can't just have
#define untagged_addr ((__force __typeof__(addr))( \ (__force u64)(addr) & GENMASK_ULL(63, 56)))
I guess you meant ~GENMASK_ULL(63,56) or GENMASK_ULL(55,0).
This changes the overflow case for mlock() which would return -ENOMEM instead of -EINVAL (not sure we have a test for it though). Does it matter?
On Wed, Oct 16, 2019 at 03:52:38PM +0100, Catalin Marinas wrote:
On Wed, Oct 16, 2019 at 03:44:25PM +0100, Dave P Martin wrote:
On Wed, Oct 16, 2019 at 05:29:33AM +0100, Will Deacon wrote:
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
Having the same informal name ("untagged") for two different address representations seems like a recipe for confusion. Can we rename one of them? (Say, untagged_address_if_user()?)
I agree it's confusing. We can rename the __* one but the other is spread around the kernel (it can be done, though as a subsequent exercise; untagged_uaddr?).
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
Is there any reason why we can't just have
#define untagged_addr ((__force __typeof__(addr))( \ (__force u64)(addr) & GENMASK_ULL(63, 56)))
I guess you meant ~GENMASK_ULL(63,56) or GENMASK_ULL(55,0).
This changes the overflow case for mlock() which would return -ENOMEM instead of -EINVAL (not sure we have a test for it though). Does it matter?
It looks like SUSv7 has m{,un}local() and mprotect() aligned with one another regarding ENOMEM versus EINVAL:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html https://pubs.opengroup.org/onlinepubs/9699919799/functions/mlock.html
So whatever we do, it should probably have the same effect on both calls.
There's another wrinkle that occurrs to me though. Rounding "kernel" addresses up can spuriously change ENOMEM to EINVAL -- but the fix for userspace addresses (i.e., rounding down) can likewise spuriously change EINVAL to ENOMEM.
Maybe this is OK -- the SUSv7 wording doesn't seem to call out address wraparound as a special case, and therefore supposedly doesn't require EINVAL to be returned for it.
The asymmetry is concerning though, and a bit ugly.
Cheers ---Dave
On 16/10/2019 17:35, Dave Martin wrote:
On Wed, Oct 16, 2019 at 03:52:38PM +0100, Catalin Marinas wrote:
On Wed, Oct 16, 2019 at 03:44:25PM +0100, Dave P Martin wrote:
On Wed, Oct 16, 2019 at 05:29:33AM +0100, Will Deacon wrote:
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b61b50bf68b1..c23c47360664 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
- up with a tagged userland pointer. Clear the tag to get a sane pointer to
- pass on to access_ok(), for instance.
*/ -#define untagged_addr(addr) \ +#define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) +#define untagged_addr(addr) ({ \
Having the same informal name ("untagged") for two different address representations seems like a recipe for confusion. Can we rename one of them? (Say, untagged_address_if_user()?)
I agree it's confusing. We can rename the __* one but the other is spread around the kernel (it can be done, though as a subsequent exercise; untagged_uaddr?).
- __addr &= __untagged_addr(__addr); \
- (__force __typeof__(addr))__addr; \
+})
Is there any reason why we can't just have
#define untagged_addr ((__force __typeof__(addr))( \ (__force u64)(addr) & GENMASK_ULL(63, 56)))
I guess you meant ~GENMASK_ULL(63,56) or GENMASK_ULL(55,0).
This changes the overflow case for mlock() which would return -ENOMEM instead of -EINVAL (not sure we have a test for it though). Does it matter?
It looks like SUSv7 has m{,un}local() and mprotect() aligned with one another regarding ENOMEM versus EINVAL:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/mprotect.html https://pubs.opengroup.org/onlinepubs/9699919799/functions/mlock.html
So whatever we do, it should probably have the same effect on both calls.
There's another wrinkle that occurrs to me though. Rounding "kernel" addresses up can spuriously change ENOMEM to EINVAL -- but the fix for userspace addresses (i.e., rounding down) can likewise spuriously change EINVAL to ENOMEM.
well this was the point of the bit 55 check, to avoid both. (with the assumption that getting EINVAL is not important if overflow happens with len > 2^55 and 0 bit 55)
as far as i can tell the EINVAL for overflow is linux specific.
i think returning ENOMEM for invalid addr,len pairs should be fine, i.e. zero extension is ok.
i think this is consistent with posix requirements, and arguably even with current linux manual which documents EINVAL for overflow in mlock, but also ENOMEM for unmapped pages in the range, so both errors are ok on overflow.
so the bit 55 check only matters if something somewhere relies on the error code in a significant way when calling syscalls with nonsensical arguments.
Maybe this is OK -- the SUSv7 wording doesn't seem to call out address
wraparound as a special case, and therefore supposedly doesn't require EINVAL to be returned for it.
The asymmetry is concerning though, and a bit ugly.
Cheers ---Dave
linux-stable-mirror@lists.linaro.org