4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andi Kleen ak@linux.intel.com
commit a7e3ed1e470116c9d12c2f778431a481a6be8ab6 upstream.
Change logs against Andi's original version:
- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra) - Fixed a major event scheduling issue. There cannot be a ref++ on an event that has already done ref++ once and without calling put_constraint() in between. (Stephane Eranian) - Use thread_cpumask for percore allocation. (Lin Ming) - Use MSR names in the extra reg lists. (Lin Ming) - Remove redundant "c = NULL" in intel_percore_constraints - Fix comment of perf_event_attr::config1
Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event that can be used to monitor any offcore accesses from a core. This is a very useful event for various tunings, and it's also needed to implement the generic LLC-* events correctly.
Unfortunately this event requires programming a mask in a separate register. And worse this separate register is per core, not per CPU thread.
This patch:
- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters. The extra parameters are passed by user space in the perf_event_attr::config1 field.
- Adds support to the Intel perf_event core to schedule per core resources. This adds fairly generic infrastructure that can be also used for other per core resources. The basic code has is patterned after the similar AMD northbridge constraints code.
Thanks to Stephane Eranian who pointed out some problems in the original version and suggested improvements.
Signed-off-by: Andi Kleen ak@linux.intel.com Signed-off-by: Lin Ming ming.m.lin@intel.com Signed-off-by: Peter Zijlstra a.p.zijlstra@chello.nl LKML-Reference: 1299119690-13991-2-git-send-email-ming.m.lin@intel.com Signed-off-by: Ingo Molnar mingo@elte.hu [ He Zhe: Fixes conflict caused by missing disable_counter_freeze which is introduced since v4.20 af3bdb991a5cb. ] Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- arch/x86/events/intel/core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3235,6 +3235,11 @@ static void free_excl_cntrs(int cpu)
static void intel_pmu_cpu_dying(int cpu) { + fini_debug_store_on_cpu(cpu); +} + +static void intel_pmu_cpu_dead(int cpu) +{ struct cpu_hw_events *cpuc = &per_cpu(cpu_hw_events, cpu); struct intel_shared_regs *pc;
@@ -3246,8 +3251,6 @@ static void intel_pmu_cpu_dying(int cpu) }
free_excl_cntrs(cpu); - - fini_debug_store_on_cpu(cpu); }
static void intel_pmu_sched_task(struct perf_event_context *ctx, @@ -3324,6 +3327,7 @@ static __initconst const struct x86_pmu .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, + .cpu_dead = intel_pmu_cpu_dead, };
static __initconst const struct x86_pmu intel_pmu = { @@ -3359,6 +3363,8 @@ static __initconst const struct x86_pmu .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, + .cpu_dead = intel_pmu_cpu_dead, + .guest_get_msrs = intel_guest_get_msrs, .sched_task = intel_pmu_sched_task, };