From: Peter Zijlstra peterz@infradead.org
intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other members of struct cpu_hw_events. This memory is released in intel_pmu_cpu_dying() which is wrong. The counterpart of the intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but allocate new memory on the next attempt to online the CPU (leaking the old memory). Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and CPUHP_PERF_X86_PREPARE then the CPU will go back online but never allocate the memory that was released in x86_pmu_dying_cpu().
Make the memory allocation/free symmetrical in regard to the CPU hotplug notifier by moving the deallocation to intel_pmu_cpu_dead().
This started in commit:
a7e3ed1e47011 ("perf: Add support for supplementary event registers").
In principle the bug was introduced in v2.6.39 (!), but it will almost certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
[ bigeasy: Added patch description. ] [ mingo: Added backporting guidance. ]
Reported-by: He Zhe zhe.he@windriver.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org # With developer hat on Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org # With maintainer hat on Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: acme@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jolsa@kernel.org Cc: kan.liang@linux.intel.com Cc: namhyung@kernel.org Cc: stable@vger.kernel.org Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org [ He Zhe: Fixes conflict caused by missing disable_counter_freeze which is introduced since v4.20 af3bdb991a5cb. ] Signed-off-by: He Zhe zhe.he@windriver.com --- This backport is for v4.9. The original commit id is 602cae04c4864.
arch/x86/events/intel/core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 4f85607..f600ab6 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3235,6 +3235,11 @@ static void free_excl_cntrs(int cpu)
static void intel_pmu_cpu_dying(int cpu) { + fini_debug_store_on_cpu(cpu); +} + +static void intel_pmu_cpu_dead(int cpu) +{ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); struct intel_shared_regs *pc;
@@ -3246,8 +3251,6 @@ static void intel_pmu_cpu_dying(int cpu) }
free_excl_cntrs(cpu); - - fini_debug_store_on_cpu(cpu); }
static void intel_pmu_sched_task(struct perf_event_context *ctx, @@ -3324,6 +3327,7 @@ static __initconst const struct x86_pmu core_pmu = { .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, + .cpu_dead = intel_pmu_cpu_dead, };
static __initconst const struct x86_pmu intel_pmu = { @@ -3359,6 +3363,8 @@ static __initconst const struct x86_pmu intel_pmu = { .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, + .cpu_dead = intel_pmu_cpu_dead, + .guest_get_msrs = intel_guest_get_msrs, .sched_task = intel_pmu_sched_task, };
On Mon, Feb 11, 2019 at 09:53:53PM +0800, zhe.he@windriver.com wrote:
From: Peter Zijlstra peterz@infradead.org
intel_pmu_cpu_prepare() allocated memory for ->shared_regs among other members of struct cpu_hw_events. This memory is released in intel_pmu_cpu_dying() which is wrong. The counterpart of the intel_pmu_cpu_prepare() callback is x86_pmu_dead_cpu().
Otherwise if the CPU fails on the UP path between CPUHP_PERF_X86_PREPARE and CPUHP_AP_PERF_X86_STARTING then it won't release the memory but allocate new memory on the next attempt to online the CPU (leaking the old memory). Also, if the CPU down path fails between CPUHP_AP_PERF_X86_STARTING and CPUHP_PERF_X86_PREPARE then the CPU will go back online but never allocate the memory that was released in x86_pmu_dying_cpu().
Make the memory allocation/free symmetrical in regard to the CPU hotplug notifier by moving the deallocation to intel_pmu_cpu_dead().
This started in commit:
a7e3ed1e47011 ("perf: Add support for supplementary event registers").
In principle the bug was introduced in v2.6.39 (!), but it will almost certainly not backport cleanly across the big CPU hotplug rewrite between v4.7-v4.15...
[ bigeasy: Added patch description. ] [ mingo: Added backporting guidance. ]
Reported-by: He Zhe zhe.he@windriver.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org # With developer hat on Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org # With maintainer hat on Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@redhat.com Cc: Jiri Olsa jolsa@redhat.com Cc: Linus Torvalds torvalds@linux-foundation.org Cc: Peter Zijlstra peterz@infradead.org Cc: Thomas Gleixner tglx@linutronix.de Cc: acme@kernel.org Cc: bp@alien8.de Cc: hpa@zytor.com Cc: jolsa@kernel.org Cc: kan.liang@linux.intel.com Cc: namhyung@kernel.org Cc: stable@vger.kernel.org Fixes: a7e3ed1e47011 ("perf: Add support for supplementary event registers"). Link: https://lkml.kernel.org/r/20181219165350.6s3jvyxbibpvlhtq@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org [ He Zhe: Fixes conflict caused by missing disable_counter_freeze which is introduced since v4.20 af3bdb991a5cb. ] Signed-off-by: He Zhe zhe.he@windriver.com
This backport is for v4.9. The original commit id is 602cae04c4864.
Now queued up, thanks.
greg k-h
linux-stable-mirror@lists.linaro.org