On 20/11/24 16:32, Peter Zijlstra wrote:
On Wed, Nov 20, 2024 at 04:22:16PM +0100, Peter Zijlstra wrote:
On Tue, Nov 19, 2024 at 04:35:00PM +0100, Valentin Schneider wrote:
+void noinstr __flush_tlb_all_noinstr(void)
+{
+	/*
+	 * This is for invocation in early entry code that cannot be
+	 * instrumented. A RMW to CR4 works for most cases, but relies on
+	 * being able to flip either of the PGE or PCIDE bits. Flipping CR4.PCID
+	 * would require also resetting CR3.PCID, so just try with CR4.PGE, else
+	 * do the CR3 write.
+	 *
+	 * XXX: this gives paravirt the finger.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_PGE))
+		__native_tlb_flush_global_noinstr(this_cpu_read(cpu_tlbstate.cr4));
+	else
+		native_flush_tlb_local_noinstr();
+}
Urgh, so that's a lot of ugleh, and cr4 has that pinning stuff and gah.
Why not always just do the CR3 write and call it a day? That should also work for paravirt, no? Just make the whole write_cr3 thing noinstr and voila.
Oh gawd, just having looked at xen_write_cr3() this might not be entirely trivial to mark noinstr :/
... I hadn't even seen that.
AIUI the CR3 RMW is not "enough" if we have PGE enabled, because then global pages aren't flushed.
The question becomes: what is held in global pages and do we care about that when it comes to vmalloc()? I'm starting to think no, but this is x86, I don't know what surprises are waiting for me.
I see e.g. ds_clear_cea() clears PTEs that can have the _PAGE_GLOBAL flag, and it correctly uses the non-deferrable flush_tlb_kernel_range().
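
To make the CR3 vs CR4.PGE point concrete, here is a toy user-space model in C. It is not kernel code: everything prefixed toy_ is hypothetical and exists only for illustration. It assumes no PCID and tracks just a valid bit and a global bit per TLB entry. What it tries to show is that a CR3 write leaves global entries in place when PGE is on, while toggling CR4.PGE drops them as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB_SLOTS 8

/* Toy TLB entry: only validity and the Global bit are modelled. */
struct toy_tlb_entry {
	bool valid;
	bool global;	/* stands in for _PAGE_GLOBAL on the PTE */
};

/* Toy CPU state; a hypothetical structure, not anything from the kernel. */
struct toy_cpu {
	bool pge;	/* stands in for CR4.PGE */
	struct toy_tlb_entry tlb[TOY_TLB_SLOTS];
};

/* Cache a translation in the given slot. */
void toy_tlb_fill(struct toy_cpu *cpu, size_t slot, bool global)
{
	assert(slot < TOY_TLB_SLOTS);
	cpu->tlb[slot].valid = true;
	/* The G bit is only honoured while CR4.PGE is set. */
	cpu->tlb[slot].global = global && cpu->pge;
}

/* Models a CR3 write without PCID: only non-global entries are dropped. */
void toy_write_cr3(struct toy_cpu *cpu)
{
	for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
		if (!cpu->tlb[i].global)
			cpu->tlb[i].valid = false;
	}
}

/* Models flipping CR4.PGE off and back on: every entry is dropped. */
void toy_toggle_cr4_pge(struct toy_cpu *cpu)
{
	for (size_t i = 0; i < TOY_TLB_SLOTS; i++) {
		cpu->tlb[i].valid = false;
		cpu->tlb[i].global = false;
	}
}

/* Same shape as the quoted patch: CR4.PGE toggle if available, else CR3. */
void toy_flush_all(struct toy_cpu *cpu)
{
	if (cpu->pge)
		toy_toggle_cr4_pge(cpu);
	else
		toy_write_cr3(cpu);
}

/* Number of translations still cached. */
size_t toy_tlb_count(const struct toy_cpu *cpu)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_TLB_SLOTS; i++)
		n += cpu->tlb[i].valid;
	return n;
}
```

With pge set, filling one global and one non-global entry and then calling toy_write_cr3() leaves the global entry cached; toy_flush_all() afterwards clears it. Whether that leftover entry matters for the vmalloc() case is exactly the open question above, and the model says nothing about it.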