On Fri, 10 Jul 2020 at 10:55, Linus Torvalds <torvalds@linux-foundation.org> wrote:
On Thu, Jul 9, 2020 at 9:29 PM Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
Your patch was applied and re-tested. The warning triggered 10 times.
old: bfe00000-c0000000 new: bfa00000 (val: 7d530067)
Hmm.. It's not even the overlapping case, it's literally just "move exactly 2MB of page tables exactly one pmd down". Which should be the nice efficient case where we can do it without modifying the lower page tables at all, we just move the PMD entry.
There shouldn't be anything in the new address space from bfa00000-bfdfffff.
That PMD value obviously says differently, but it looks like a nice normal PMD value, nothing bad there.
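[ For reference, the "just move the PMD entry" fast path is roughly the shape below. This is a condensed, from-memory paraphrase of move_normal_pmd() in mm/mremap.c around v5.8 (locking and error handling trimmed), not the exact kernel source; the WARN_ON here is presumably the mm/mremap.c:211 warning being reported:

static bool move_normal_pmd(struct vm_area_struct *vma, unsigned long old_addr,
		unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
	pmd_t pmd;

	/*
	 * The destination PMD is expected to be empty, because
	 * free_pgtables() should already have torn down everything
	 * below the VMA being moved.
	 */
	if (WARN_ON(!pmd_none(*new_pmd)))
		return false;

	/* Steal the whole page-table page: clear the old entry ... */
	pmd = *old_pmd;
	pmd_clear(old_pmd);

	/* ... re-plug it one PMD lower, and flush the old range. */
	set_pmd_at(vma->vm_mm, new_addr, new_pmd, pmd);
	flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);

	return true;
} ]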
I'm starting to think that the issue might be that this is because the stack segment is special. Not only does it have the growsdown flag, but that whole thing has the magic guard page logic.
So I wonder if we have installed a guard page _just_ below the old stack, so that we have populated that pmd because of that.
We used to have an _actual_ guard page and then play nasty games with vm_start logic. We've gotten rid of that, though, and now we have that "stack_guard_gap" logic that _should_ mean that vm_start is always exact and proper (and that free_pgtables() should have emptied it, but maybe we have some case we forgot about).
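[ If I'm remembering the current code right, the stack_guard_gap is enforced purely as a check in expand_downwards() in mm/mmap.c, with nothing ever mapped in the gap itself; a rough paraphrase, not the exact source:

int expand_downwards(struct vm_area_struct *vma, unsigned long address)
{
	struct vm_area_struct *prev = vma->vm_prev;

	/*
	 * Enforce stack_guard_gap by refusing the expansion: no guard
	 * page is installed, so vm_start stays exact and no page
	 * tables should get populated below it just for the gap.
	 */
	if (prev && !(prev->vm_flags & VM_GROWSDOWN) &&
	    address - prev->vm_end < stack_guard_gap)
		return -ENOMEM;

	/* ... otherwise grow vma->vm_start down to 'address' ... */
	return 0;
} ]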
[ 741.511684] WARNING: CPU: 1 PID: 15173 at mm/mremap.c:211 move_page_tables.cold+0x0/0x2b
[ 741.598159] Call Trace:
[ 741.600694]  setup_arg_pages+0x22b/0x310
[ 741.621687]  load_elf_binary+0x31e/0x10f0
[ 741.633839]  __do_execve_file+0x5a8/0xbf0
[ 741.637893]  __ia32_sys_execve+0x2a/0x40
[ 741.641875]  do_syscall_32_irqs_on+0x3d/0x2c0
[ 741.657660]  do_fast_syscall_32+0x60/0xf0
[ 741.661691]  do_SYSENTER_32+0x15/0x20
[ 741.665373]  entry_SYSENTER_32+0x9f/0xf2
[ 741.734151] old: bfe00000-c0000000 new: bfa00000 (val: 7d530067)
Nothing looks bad, and the ELF loading phase memory map should be really quite simple.
The only half-way unusual thing is that you have basically exactly 2MB of stack at execve time (easy enough to tune by just setting argv/env right), and it's moved down by exactly 2MB.
And that latter thing is just due to randomization, see arch_align_stack() in arch/x86/kernel/process.c.
So that would explain why it doesn't happen every time.
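[ For completeness, arch_align_stack() itself is tiny; roughly what arch/x86/kernel/process.c has, quoted from memory, so treat it as a sketch rather than the exact source:

unsigned long arch_align_stack(unsigned long sp)
{
	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
		sp -= get_random_int() % 8192;	/* small random downward shift */
	return sp & ~0xf;			/* keep 16-byte alignment */
} ]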
What happens if you apply the attached patch to *always* force the 2MB shift (rather than moving the stack by a random amount), and then run the other program (t.c -> compiled to "a.out")?
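[ Linus's attached t.c is not reproduced here. Purely as a hypothetical illustration of the "tune it by setting argv/env right" point above, a test of that shape could re-exec itself with a bit under 2MB of argv data so the stack is roughly 2MB-sized at execve time; the names and sizes below are made up, not the real test:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NSTR	30
#define STRSZ	(64 * 1024)	/* 30 x 64KB, just under the ~2MB argv limit */

int main(int argc, char **argv)
{
	char *args[NSTR + 2];
	int i;

	if (argc > 1)		/* second pass: the big argv arrived, done */
		return 0;

	args[0] = argv[0];
	for (i = 1; i <= NSTR; i++) {
		args[i] = malloc(STRSZ);
		if (!args[i])
			return 1;
		memset(args[i], 'x', STRSZ - 1);
		args[i][STRSZ - 1] = '\0';
	}
	args[NSTR + 1] = NULL;

	/* re-exec ourselves with ~2MB of arguments on the new stack */
	execv("/proc/self/exe", args);
	perror("execv");
	return 1;
} ]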
I have applied your patch and started the test in a loop for a million iterations, but it only ran 35 times; it looks like the test hit a 1-hour timeout. Kernel messages printed while testing a.out:
a.out (480) used greatest stack depth: 4872 bytes left
On another device: kworker/dying (172) used greatest stack depth: 5044 bytes left
Re-running the test with a longer timeout of 4 hours and will share the findings.
ref: https://lkft.validation.linaro.org/scheduler/job/1555132#L1515
- Naresh