Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: stable@vger.kernel.org # 4.11 Reported-by: Hans van Kranenburg hans@knorrie.org Signed-off-by: Juergen Gross jgross@suse.com --- arch/x86/xen/time.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 72bf446c3fee..6e29794573b7 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -361,8 +361,6 @@ void xen_timer_resume(void) { int cpu;
- pvclock_resume(); - if (xen_clockevent != &xen_vcpuop_clockevent) return;
@@ -379,12 +377,15 @@ static const struct pv_time_ops xen_time_ops __initconst = { };
static struct pvclock_vsyscall_time_info *xen_clock __read_mostly; +static u64 xen_clock_value_saved;
void xen_save_time_memory_area(void) { struct vcpu_register_time_memory_area t; int ret;
+ xen_clock_value_saved = xen_clocksource_read() - xen_sched_clock_offset; + if (!xen_clock) return;
@@ -404,7 +405,7 @@ void xen_restore_time_memory_area(void) int ret;
if (!xen_clock) - return; + goto out;
t.addr.v = &xen_clock->pvti;
@@ -421,6 +422,11 @@ void xen_restore_time_memory_area(void) if (ret != 0) pr_notice("Cannot restore secondary vcpu_time_info (err %d)", ret); + +out: + /* Need pvclock_resume() before using xen_clocksource_read(). */ + pvclock_resume(); + xen_sched_clock_offset = xen_clocksource_read() - xen_clock_value_saved; }
static void xen_setup_vsyscall_time_info(void)
On 1/14/19 7:44 AM, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: stable@vger.kernel.org # 4.11 Reported-by: Hans van Kranenburg hans@knorrie.org Signed-off-by: Juergen Gross jgross@suse.com
Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com
Hi Boris,
On 1/14/19 2:54 PM, Boris Ostrovsky wrote:
On 1/14/19 7:44 AM, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") Cc: stable@vger.kernel.org # 4.11 Reported-by: Hans van Kranenburg hans@knorrie.org Signed-off-by: Juergen Gross jgross@suse.com
Reviewed-by: Boris Ostrovsky boris.ostrovsky@oracle.com
Can you please change the address to my work email?
Reported-by: Hans van Kranenburg hans.van.kranenburg@mendix.com
Also (see other email):
Tested-by: Hans van Kranenburg hans.van.kranenburg@mendix.com
Thanks, Hans
On Mon, 14 Jan 2019, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
I see that it's broken, but the changelog could do with an explanation WHY it broke.
Other than that this looks about right.
Thanks,
tglx
On 15/01/2019 11:43, Thomas Gleixner wrote:
On Mon, 14 Jan 2019, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
I see that it's broken, but the changelog could do with an explanation WHY it broke.
This seems to be rather complex.
I believe the mentioned commit just ignored Xen guests resulting in a "stable" clock where it shouldn't, but maybe I have missed another aspect of this commit which is to blame. I tried to fix that by replacing using_native_sched_clock() with a hypervisor specific pvops function.
Unfortunately this didn't work, maybe due to other uses of using_native_sched_clock() added by later patches. In the end it was quite clear that updating the Xen clock offset was the right thing to do, so I ended up with this patch.
Juergen
Hi,
On 1/14/19 1:44 PM, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
[...]
With v3 of the patch, I see the time jump in one log line happen, but only when using PVH.
[ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds) done. [ 49.488743] OOM killer disabled. [ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 49.491117] suspending xenstore... [2000731.388722] xen:events: Xen HVM callback vector for event delivery is enabled [ 49.491750] xen:grant_table: Grant tables using version 1 layout [ 49.810722] OOM killer enabled. [ 49.810744] Restarting tasks ... done. [ 49.856263] Setting capacity to 6291456 [ 50.006002] Setting capacity to 10485760
If I start as PV, it never seems to happen.
Up to you to decide how important this is. :)
FYI this is with v3 on top of the Debian stretch-backports 4.19 kernel, which I'm starting to use now to reboot things with.
-# uname -a Linux appnode-kylie 4.19.0-0.bpo.1-cloud-amd64 #1 SMP Debian 4.19.12-1~bpo9+1+mendix1 (2019-01-15) x86_64 GNU/Linux
https://salsa.debian.org/knorrie-guest/linux/commits/mendix/stretch-backport...
Hans
On 16/01/2019 01:24, Hans van Kranenburg wrote:
Hi,
On 1/14/19 1:44 PM, Juergen Gross wrote:
Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface") broke Xen guest time handling across migration:
[ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done. [ 187.251137] OOM killer disabled. [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 187.252299] suspending xenstore... [ 187.266987] xen:grant_table: Grant tables using version 1 layout [18446743811.706476] OOM killer enabled. [18446743811.706478] Restarting tasks ... done. [18446743811.720505] Setting capacity to 16777216
Fix that by setting xen_sched_clock_offset at resume time to ensure a monotonic clock value.
[...]
With v3 of the patch, I see the time jump in one log line happen, but only when using PVH.
[ 49.486453] Freezing user space processes ... (elapsed 0.002 seconds) done. [ 49.488743] OOM killer disabled. [ 49.488764] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 49.491117] suspending xenstore... [2000731.388722] xen:events: Xen HVM callback vector for event delivery is enabled [ 49.491750] xen:grant_table: Grant tables using version 1 layout [ 49.810722] OOM killer enabled. [ 49.810744] Restarting tasks ... done. [ 49.856263] Setting capacity to 6291456 [ 50.006002] Setting capacity to 10485760
If I start as PV, it never seems to happen.
Up to you to decide how important this is. :)
We could do something like below. Boris?
Juergen --- diff --git a/arch/x86/xen/suspend_hvm.c b/arch/x86/xen/suspend_hvm.c index e666b614cf6d..088f3a6b4be9 100644 --- a/arch/x86/xen/suspend_hvm.c +++ b/arch/x86/xen/suspend_hvm.c @@ -13,6 +13,6 @@ void xen_hvm_post_suspend(int suspend_cancelled) xen_hvm_init_shared_info(); xen_vcpu_restore(); } - xen_callback_vector(); + xen_callback_vector(true); xen_unplug_emulated_devices(); } diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index 0e60bd918695..ba293fda3265 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -55,7 +55,7 @@ void xen_enable_sysenter(void); void xen_enable_syscall(void); void xen_vcpu_restore(void);
-void xen_callback_vector(void); +void xen_callback_vector(bool silent); void xen_hvm_init_shared_info(void); void xen_unplug_emulated_devices(void);
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 93194f3e7540..8d8d50bea215 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1637,7 +1637,7 @@ EXPORT_SYMBOL_GPL(xen_set_callback_via); /* Vector callbacks are better than PCI interrupts to receive event * channel notifications because we can receive vector callbacks on any * vcpu and we don't need PCI support or APIC interactions. */ -void xen_callback_vector(void) +void xen_callback_vector(bool silent) { int rc; uint64_t callback_via; @@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; } - pr_info("Xen HVM callback vector for event delivery is enabled\n"); + if (!silent) + pr_info("Xen HVM callback vector for event delivery is enabled\n"); alloc_intr_gate(HYPERVISOR_CALLBACK_VECTOR, xen_hvm_callback_vector); } } #else -void xen_callback_vector(void) {} +void xen_callback_vector(bool silent) {} #endif
#undef MODULE_PARAM_PREFIX @@ -1692,7 +1693,7 @@ void __init xen_init_IRQ(void) pci_xen_initial_domain(); } if (xen_feature(XENFEAT_hvm_callback_vector)) - xen_callback_vector(); + xen_callback_vector(false);
if (xen_hvm_domain()) { native_init_IRQ();
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; }
pr_info("Xen HVM callback vector for event delivery is
enabled\n");
if (!silent)
pr_info("Xen HVM callback vector for event
delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
-boris
On 16/01/2019 14:17, Boris Ostrovsky wrote:
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; }
pr_info("Xen HVM callback vector for event delivery is
enabled\n");
if (!silent)
pr_info("Xen HVM callback vector for event
delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
What a nice and simple idea!
Extra patch or V4?
Juergen
On 1/16/19 9:33 AM, Juergen Gross wrote:
On 16/01/2019 14:17, Boris Ostrovsky wrote:
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; }
pr_info("Xen HVM callback vector for event delivery is
enabled\n");
if (!silent)
pr_info("Xen HVM callback vector for event
delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
What a nice and simple idea!
Extra patch or V4?
I can add this while committing, I don't think it's worth a whole new patch.
One outstanding question I have is whether anything needs to be added to the commit message (Thomas had some questions)
-boris
On 16/01/2019 16:07, Boris Ostrovsky wrote:
On 1/16/19 9:33 AM, Juergen Gross wrote:
On 16/01/2019 14:17, Boris Ostrovsky wrote:
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; }
pr_info("Xen HVM callback vector for event delivery is
enabled\n");
if (!silent)
pr_info("Xen HVM callback vector for event
delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
What a nice and simple idea!
Extra patch or V4?
I can add this while committing, I don't think it's worth a whole new patch.
One outstanding question I have is whether anything needs to be added to the commit message (Thomas had some questions)
He didn't react to my explanation. I'm interpreting that as him being fine with my explanation, which I believe is not suitable to be added to the commit message.
Juergen
On 1/16/19 10:29 AM, Juergen Gross wrote:
On 16/01/2019 16:07, Boris Ostrovsky wrote:
On 1/16/19 9:33 AM, Juergen Gross wrote:
On 16/01/2019 14:17, Boris Ostrovsky wrote:
On Wed, Jan 16, 2019 at 08:50:13AM +0100, Juergen Gross wrote:
@@ -1650,13 +1650,14 @@ void xen_callback_vector(void) xen_have_vector_callback = 0; return; }
pr_info("Xen HVM callback vector for event delivery is
enabled\n");
if (!silent)
pr_info("Xen HVM callback vector for event
delivery is enabled\n");
How about replacing pr_info() with pr_info_once()?
What a nice and simple idea!
Extra patch or V4?
I can add this while committing, I don't think it's worth a whole new patch.
One outstanding question I have is whether anything needs to be added to the commit message (Thomas had some questions)
He didn't react to my explanation. I'm interpreting that as him being fine with my explanation, which I believe is not suitable to be added to the commit message.
OK. Applied to (now properly named) for-linus-5.0
-boris
linux-stable-mirror@lists.linaro.org