On Mon, Apr 29, 2019 at 11:53 AM Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Mon, Apr 29, 2019, 11:42 Andy Lutomirski <luto@kernel.org> wrote:
> > I'm less than 100% convinced about this argument. Sure, an NMI right there won't cause a problem. But an NMI followed by an interrupt will kill us if preemption is on. I can think of three solutions:
> No, because either the sti shadow disables nmi too (that's the case on some CPUs at least) or the iret from nmi does.
> Otherwise you could never trust the whole sti shadow thing - and it very much is part of the architecture.
Is this documented somewhere? And do you actually believe that this is true under KVM, Hyper-V, etc? As I recall, Andrew Cooper dug into the way that VMX dealt with this stuff and concluded that the SDM was blatantly wrong in many cases, which leads me to believe that Xen HVM/PVH is the *only* hypervisor that gets it right.
Steven's point about batched updates is quite valid, though. My personal favorite solution to this whole mess is to rework the whole thing so that the int3 handler simply returns and retries and to replace the sync_core() broadcast with an SMI broadcast. I don't know whether this will actually work on real CPUs and on VMs and whether it's going to crash various BIOSes out there.