On Fri, 23 Aug 2019 11:36:37 +0200 Peter Zijlstra <peterz@infradead.org> wrote:
On Thu, Aug 22, 2019 at 10:23:35PM -0700, Song Liu wrote:
Since the 4k pages check was removed from cpa [1], set_kernel_text_rw() leads to split_large_page() for all kernel text pages. This means a single kprobe will put all kernel text in 4k pages:
root@ ~# grep ffff81000000- /sys/kernel/debug/page_tables/kernel
0xffffffff81000000-0xffffffff82400000 20M ro PSE x pmd

root@ ~# echo ONE_KPROBE >> /sys/kernel/debug/tracing/kprobe_events
root@ ~# echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable

root@ ~# grep ffff81000000- /sys/kernel/debug/page_tables/kernel
0xffffffff81000000-0xffffffff82400000 20M ro x pte
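The size of the change between the two dumps above can be checked with a short sketch. It assumes x86-64 page sizes of 2 MiB per pmd entry and 4 KiB per pte entry; the helper name entries_for_range is made up for illustration and is not a kernel interface.

```python
# Bytes mapped by one entry at each level (assumption: x86-64 page sizes).
ENTRY_SIZE = {"pte": 4 << 10, "pmd": 2 << 20}


def entries_for_range(line: str) -> int:
    """Count page-table entries covering one line of the page_tables dump.

    Expects the layout seen in the dumps: "START-END SIZE flags... LEVEL".
    """
    fields = line.split()
    start, end = (int(addr, 16) for addr in fields[0].split("-"))
    level = fields[-1]
    return (end - start) // ENTRY_SIZE[level]


before = "0xffffffff81000000-0xffffffff82400000 20M ro PSE x pmd"
after = "0xffffffff81000000-0xffffffff82400000 20M ro x pte"

print(entries_for_range(before))  # 10 large-page entries
print(entries_for_range(after))   # 5120 4k entries
```

For this 20M range, enabling one kprobe replaces 10 large-page mappings with 5120 4k mappings.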
To fix this issue, introduce CPA_FLIP_TEXT_RW to bypass the "Text RO" check in static_protections().
Two helper functions, set_text_rw() and set_text_ro(), are added to flip the _PAGE_RW bit for kernel text.
[1] commit 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely")
ARGH; so this is because ftrace flips the whole kernel range to RW and back for giggles? I'm thinking _that_ is a bug, it's a clear W^X violation.
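The W^X rule referred to here says that a mapping should not be writable and executable at the same time. A minimal sketch of that check against page_tables dump lines follows, assuming the dump prints RW or ro for the write permission and NX or x for the execute permission. The helper name violates_wx and the RW example line are hypothetical and do not come from this thread.

```python
def violates_wx(line: str) -> bool:
    """Return True if a page_tables dump line is both writable and executable."""
    flags = line.split()[2:-1]  # drop the address range, the size and the level
    return "RW" in flags and "x" in flags


# Line taken from the dump quoted earlier: read-only and executable, so fine.
read_only_text = "0xffffffff81000000-0xffffffff82400000 20M ro x pte"

# Hypothetical line for text that has been flipped to RW while still executable.
writable_text = "0xffffffff81000000-0xffffffff82400000 20M RW x pte"

print(violates_wx(read_only_text))
print(violates_wx(writable_text))
```

The first call reports no violation and the second reports one, which is the state the whole text range would be in while it is flipped to RW.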
Since ftrace did this way before text_poke existed and way before anybody cared (back in 2007), it's not really a bug.
Anyway, I believe Nadav has some patches that convert ftrace to use the shadow page modification trick somewhere.
Or we also need the text_poke batch processing (did that get upstream?).
Mapping in 40,000 pages one at a time is noticeable from a human standpoint.
-- Steve