Ira,
ira.weiny@intel.com writes:
... // ref == 0 dev_access_enable() // ref += 1 ==> disable protection irq() // enable protection // ref = 0 _handler() dev_access_enable() // ref += 1 ==> disable protection dev_access_disable() // ref -= 1 ==> enable protection // WARN_ON(ref != 0) // disable protection do_pmem_thing() // all good here dev_access_disable() // ref -= 1 ==> 0 ==> enable protection
...
First I'm not sure if adding this state to idtentry_state and having that state copied is the right way to go.
Adding the state to idtentry_state is fine at least for most interrupts and exceptions. Emphasis on most.
#PF does not work because #PF can schedule.
It seems like we should start passing this by reference instead of value. But for now this works as an RFC. Comments?
Works as in compiles, right?
static void noinstr idt_save_pkrs(idtentry_state_t state) { state.foo = 1; }
How is that supposed to change the caller state? C programming basics.
Second, I'm not 100% happy with having to save the reference count in the exception handler. It seems like a very ugly layering violation but I don't see a way around it at the moment.
That state is strict per task, right? So why do you want to store it anywhere else that in task/thread storage. That solves your problem of #PF scheduling nicely.
Third, this patch has gone through a couple of revisions as I've had crashes which just don't make sense to me. One particular issue I've had is taking a MCE during memcpy_mcsafe causing my WARN_ON() to fire. The code path was a pmem copy and the ref count should have been elevated due to dev_access_enable() but why was idtentry_enter()->idt_save_pkrs() not called I don't know.
Because #MC does not go through idtentry_enter(). Neither do #NMI, #DB, #BP.
Finally, it looks like the entry/exit code is being refactored into common code. So likely this is best handled somewhat there. Because this can be predicated on CONFIG_ARCH_HAS_SUPERVISOR_PKEYS and handled in a generic fashion. But that is a ways off I think.
The invocation of save/restore might be placed in generic code at least for the common exception and interrupt entries.
+static void noinstr idt_save_pkrs(idtentry_state_t state)
*state. See above.
+#else +/* Define as macros to prevent conflict of inline and noinstr */ +#define idt_save_pkrs(state) +#define idt_restore_pkrs(state)
Empty inlines do not need noinstr because they are optimized out. If you want inlines in a noinstr section then use __always_inline
/**
- idtentry_enter - Handle state tracking on ordinary idtentries
- @regs: Pointer to pt_regs of interrupted context
@@ -604,6 +671,8 @@ idtentry_state_t noinstr idtentry_enter(struct pt_regs *regs) return ret; }
- idt_save_pkrs(ret);
No. This really has no business to be called before the state is established. It's not something horribly urgent and write_pkrs() is NOT noinstr and invokes wrmsrl() which is subject to tracing.
- idt_restore_pkrs(state);
This one is placed correctly.
Thanks,
tglx