Commit 5738891229a2 ("perf/x86/amd: Add support for Large Increment per Cycle Events") mistakenly zeroes the upper 16 bits of the count in set_period(). That's fine for counting with perf stat, but not sampling with perf record when only Large Increment events are being sampled. To enable sampling, we sign extend the upper 16 bits of the merged counter pair as described in the Family 17h PPRs:
"Software wanting to preload a value to a merged counter pair writes the high-order 16-bit value to the low-order 16 bits of the odd counter and then writes the low-order 48-bit value to the even counter. Reading the even counter of the merged counter pair returns the full 64-bit value."
Fixes: 5738891229a2 ("perf/x86/amd: Add support for Large Increment per Cycle Events") Signed-off-by: Kim Phillips kim.phillips@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Cc: Stephane Eranian eranian@google.com Cc: Peter Zijlstra peterz@infradead.org Cc: Ingo Molnar mingo@kernel.org Cc: Ingo Molnar mingo@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Borislav Petkov bp@alien8.de Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: "H. Peter Anvin" hpa@zytor.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Michael Petlan mpetlan@redhat.com Cc: Namhyung Kim namhyung@kernel.org Cc: LKML linux-kernel@vger.kernel.org Cc: x86 x86@kernel.org Cc: stable@vger.kernel.org --- v2: no changes.
arch/x86/events/core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 1cbf57dc2ac8..2fdc211e3e56 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1284,11 +1284,11 @@ int x86_perf_event_set_period(struct perf_event *event) wrmsrl(hwc->event_base, (u64)(-left) & x86_pmu.cntval_mask);
/* - * Clear the Merge event counter's upper 16 bits since + * Sign extend the Merge event counter's upper 16 bits since * we currently declare a 48-bit counter width */ if (is_counter_pair(hwc)) - wrmsrl(x86_pmu_event_addr(idx + 1), 0); + wrmsrl(x86_pmu_event_addr(idx + 1), 0xffff);
/* * Due to erratum on certan cpu we need