Hi Catalin,
On 10/15/19 4:26 PM, Catalin Marinas wrote:
Adding Szabolcs and Dave from ARM as we've discussed this internally briefly but we should include the wider audience.
On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
On Mon, Oct 14, 2019 at 9:29 AM Jan Stancek jstancek@redhat.com wrote:
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git Commit: d6c2c23a29f4 - Merge branch 'stable-next' of ssh://chubbybox:/home/sasha/data/next into stable-next
The results of these automated tests are provided below.
Overall result: FAILED (see details below) Merge: OK Compile: OK Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://artifacts.cki-project.org/pipelines/223563
One or more kernel tests failed:
aarch64: ❌ LTP: openposix test suite
Test [1] is passing value close to LONG_MAX, which on arm64 is now treated as tagged userspace ptr: 057d3389108e ("mm: untag user pointers passed to memory syscalls")
And now seems to hit overflow check after sign extension (EINVAL). Test should probably find different way to choose invalid pointer.
[1] https://github.com/linux-test-project/ltp/blob/master/testcases/open_posix_t...
[...]
The options I see:
Revert commit 057d3389108e and try again to document that the memory syscalls do not support tagged pointers
Change untagged_addr() to only 0-extend from bit 55 or leave the tag unchanged if bit 55 is 1. We could mask out the tag (0 rather than sign-extend) but if we had an mlock test passing ULONG_MASK, then we get -ENOMEM instead of -EINVAL
Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only break the ABI for applications opting in to this new ABI. We did look at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on whether we check the ptrace'd process or the debugger flags
Leave things as they are, consider the address space 56-bit and change the test to not use LONG_MAX on arm64. This needs to be passed by the sparc guys since they probably have a similar issue
I'm in favour of (2) or (4) as long as there's also an update to the docs.
With (4) we'd start differing from other architectures supported by Linux. This works if we consider the test to be broken. However, reading the mlock man page:
EINVAL The result of the addition addr+len was less than addr (e.g., the addition may have resulted in an overflow). ENOMEM Some of the specified address range does not correspond to mapped pages in the address space of the process.
There is no mention of EINVAL outside the TASK_SIZE, seems to fall more within the ENOMEM description above.
I agree with your analysis and vote for option (2) as well, because of what you reported about mlock() error meanings and because being this ABI I would prefer where possible to not diverge from other architectures. [...]