On Aug 23, 2019, at 5:59 PM, Thomas Gleixner tglx@linutronix.de wrote:
On Wed, 21 Aug 2019, Thomas Gleixner wrote:
On Wed, 21 Aug 2019, Song Liu wrote:
On Aug 20, 2019, at 1:23 PM, Song Liu songliubraving@fb.com wrote:
Before 32-bit support, pti_clone_pmds() always added PMD_SIZE to addr. This behavior changed with the 32-bit support: pti_clone_pgtable() increases addr by PUD_SIZE in the pud_none(*pud) case, and by PMD_SIZE in the pmd_none(*pmd) case. However, this is not accurate because addr may not be PUD_SIZE/PMD_SIZE aligned.
Fix this issue by properly rounding up addr to next PUD_SIZE/PMD_SIZE in these two cases.
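For illustration only, here is a minimal userspace sketch of the difference between stepping by a fixed size and rounding up to the next boundary, as the changelog above describes. It is not the kernel code: the SKETCH_* constants assume x86-64 sizes (2 MiB per PMD entry, 1 GiB per PUD entry), and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 sizes: 2 MiB per PMD entry, 1 GiB per PUD entry. */
#define SKETCH_PMD_SIZE (UINT64_C(1) << 21)
#define SKETCH_PUD_SIZE (UINT64_C(1) << 30)

/* Fixed step: adds one entry size, so an unaligned addr stays unaligned. */
static uint64_t step_by_size(uint64_t addr, uint64_t size)
{
    return addr + size;
}

/*
 * Rounded step: advances to the start of the next size-aligned region.
 * For a power-of-two size this equals rounding (addr + 1) up to size.
 */
static uint64_t step_to_next_boundary(uint64_t addr, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two only */
    return (addr + size) & ~(size - 1);
}
```

With a hypothetical unaligned addr of 0x1234000, the fixed step yields 0x1434000, which is still unaligned and lands partway into the next 2 MiB region, while the rounded step yields 0x1400000. For an already aligned addr the two helpers return the same value.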
After poking around more, I found the following doesn't really make sense.
I'm glad you figured that out yourself. Was about to write up something to that effect.
Still interesting questions remain:
How did you end up feeding an unaligned address into that which points to a 0 PUD?
Is this related to Facebook-specific changes and unlikely to affect any regular kernel? I can't come up with a way to trigger that in mainline.
As this is a user page table and the missing mapping is related to mappings required by PTI, how is the machine going in/out of user space in the first place? Or did I just trip over what you called nonsense?
And just because this ended in silence I looked at it myself after Peter told me that this was on a kernel with PTI disabled. Aside from that, my built-in distrust for debug war stories combined with fairy tale changelogs triggered my curiosity anyway.
I am really sorry that I was silent. Somehow I didn't see this in my inbox (or it didn't show up until just now?).
For this patch, I really mixed this up with something else. The issue we are seeing is that a kprobe with CONFIG_KPROBES_ON_FTRACE splits the PMD located at 0xffffffff81a00000. I sent another patch last night, but that might not be the right fix either.
I haven't started testing our PTI enabled kernel, so I am not sure whether there is really an issue with the PTI code.
Thanks, Song