From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 68e7a4d66b0ce04bf18ff2ffded5596ab3618585 ]
vtime_account_system() assumes that the target task to account cputime to is always the current task. This is most often true indeed except on task switch where we call:
vtime_common_task_switch(prev) vtime_account_system(prev)
Here prev is the scheduling-out task where we account the cputime to. It doesn't match current that is already the scheduling-in task at this stage of the context switch.
So we end up checking the wrong task flags to determine if we are accounting guest or system time to the previous task.
As a result the wrong task is used to check if the target is running in guest mode. We may then spuriously account or leak either system or guest time on task switch.
Fix this assumption and also turn vtime_guest_enter/exit() to use the task passed in parameter as well to avoid future similar issues.
Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Rik van Riel riel@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Wanpeng Li wanpengli@tencent.com Fixes: 2a42eb9594a1 ("sched/cputime: Accumulate vtime on top of nsec clocksource") Link: https://lkml.kernel.org/r/20190925214242.21873-1-frederic@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/sched/cputime.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c index 14d2dbf97c531..45c2cd37fe6b0 100644 --- a/kernel/sched/cputime.c +++ b/kernel/sched/cputime.c @@ -738,7 +738,7 @@ void vtime_account_system(struct task_struct *tsk)
write_seqcount_begin(&vtime->seqcount); /* We might have scheduled out from guest path */ - if (current->flags & PF_VCPU) + if (tsk->flags & PF_VCPU) vtime_account_guest(tsk, vtime); else __vtime_account_system(tsk, vtime); @@ -781,7 +781,7 @@ void vtime_guest_enter(struct task_struct *tsk) */ write_seqcount_begin(&vtime->seqcount); __vtime_account_system(tsk, vtime); - current->flags |= PF_VCPU; + tsk->flags |= PF_VCPU; write_seqcount_end(&vtime->seqcount); } EXPORT_SYMBOL_GPL(vtime_guest_enter); @@ -792,7 +792,7 @@ void vtime_guest_exit(struct task_struct *tsk)
write_seqcount_begin(&vtime->seqcount); vtime_account_guest(tsk, vtime); - current->flags &= ~PF_VCPU; + tsk->flags &= ~PF_VCPU; write_seqcount_end(&vtime->seqcount); } EXPORT_SYMBOL_GPL(vtime_guest_exit);