Re: [Xen-devel] Patch "x86/entry/64: Remove %ebx handling from error_entry/exit" has been added to the 4.9-stable tree

7 Dec 2018

On Thu, 2018-12-06 at 20:27 +0000, David Woodhouse wrote:
...
On Thu, 2018-12-06 at 10:49 -0800, Andy Lutomirski wrote:
...
On Dec 6, 2018, at 9:36 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:

Basically - what is happening is that xen_load_tls() is invalidating
the %gs selector while %gs is still non-NUL.

If this happens to intersect with a vcpu reschedule, %gs (being
non-NUL) takes precedence over KERNGSBASE, and faults when Xen tries to
reload it.  This results in the failsafe callback being invoked.

I think the correct course of action is to use xen_load_gs_index(0)
(poorly named - it is a hypercall which does swapgs; mov to %gs;
swapgs) before using update_descriptor() to invalidate the segment.

That will reset %gs to 0 without touching KERNGSBASE, and can be queued
in the same multicall as the update_descriptor() hypercall.

Sounds good to me as long as we skip it on native.
Like this?
...
#else
    struct multicall_space mc = __xen_mc_entry(0);
    MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);

    loadsegment(fs, 0);
#endif
That seems to boot and run, at least.
I'm going to experiment with sticking a SCHEDOP_yield in the batch
*after* the update_descriptor requests, to see if I can trigger the
original problem a bit quicker than my current method — which involves
running a hundred machines for a day or two.
Still wondering if the better fix is just to adjust the comments to
admit that xen_failsafe_callback catches the race condition and fixes
it up perfectly, by just letting the %gs selector be zero for a while?
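
For anyone following along, the ordering argument from earlier in the thread can be sketched as a toy userspace model. This is only an illustration, not kernel or Xen code: every name in it (vcpu_model, OP_RESET_GS, OP_INVALIDATE_DESC, count_fault_windows and so on) is made up for the example, and it assumes only what was described above, namely that a reload of %gs faults when the selector is non-null and refers to a descriptor that has already been invalidated. It replays a batch of operations in order and counts the points at which a reschedule would hit that state.

```c
/* Toy userspace model; none of these names are real Xen or Linux APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DESC 4            /* hypothetical tiny descriptor table */

enum op_kind { OP_RESET_GS, OP_INVALIDATE_DESC };

struct op {
    enum op_kind kind;
    unsigned int index;       /* descriptor index, used by OP_INVALIDATE_DESC */
};

struct vcpu_model {
    unsigned int gs_index;    /* 0 stands for the null selector */
    bool desc_valid[NUM_DESC];
};

/* Would reloading %gs after a reschedule fault in this state? */
static bool reload_would_fault(const struct vcpu_model *v)
{
    return v->gs_index != 0 && !v->desc_valid[v->gs_index];
}

/*
 * Apply the batch in order and count the points at which a reschedule
 * would find a non-null %gs that refers to an invalidated descriptor.
 */
static int count_fault_windows(struct vcpu_model v, const struct op *ops, size_t n)
{
    int windows = 0;

    for (size_t i = 0; i < n; i++) {
        if (ops[i].kind == OP_RESET_GS) {
            v.gs_index = 0;
        } else {
            assert(ops[i].index < NUM_DESC);
            v.desc_valid[ops[i].index] = false;
        }
        if (reload_would_fault(&v))
            windows++;
    }
    return windows;
}

/* Start state: %gs refers to descriptor 2 and all descriptors are valid. */
static struct vcpu_model initial_state(void)
{
    struct vcpu_model v = { .gs_index = 2 };

    for (size_t i = 0; i < NUM_DESC; i++)
        v.desc_valid[i] = true;
    return v;
}

/* Batch that only invalidates the descriptor %gs still points at. */
static int windows_without_reset(void)
{
    const struct op ops[] = { { OP_INVALIDATE_DESC, 2 } };

    return count_fault_windows(initial_state(), ops, 1);
}

/* Batch that resets %gs first, then invalidates the descriptor. */
static int windows_with_reset_first(void)
{
    const struct op ops[] = { { OP_RESET_GS, 0 }, { OP_INVALIDATE_DESC, 2 } };

    return count_fault_windows(initial_state(), ops, 2);
}
```

In this model the invalidate-only batch leaves one point at which a reschedule would fault, and the batch that resets %gs first leaves none. It says nothing about GS base handling or about what the failsafe callback does afterwards.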
