This is a note to let you know that I've just added the patch titled
perf trace: Add mmap alias for s390
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-trace-add-mmap-alias-for-s390.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Jiri Olsa <jolsa(a)kernel.org>
Date: Wed, 31 May 2017 13:35:57 +0200
Subject: perf trace: Add mmap alias for s390
From: Jiri Olsa <jolsa(a)kernel.org>
[ Upstream commit 54265664c15a68905d8d67d19205e9a767636434 ]
The s390 architecture maps sys_mmap (nr 90) into sys_old_mmap. For this
reason perf trace can't find the proper syscall event to get args format
from and displays it wrongly as 'continued'.
To fix that fill the "alias" field with "old_mmap" for trace's mmap record
to get the correct translation.
Before:
0.042 ( 0.011 ms): vest/43052 fstat(statbuf: 0x3ffff89fd90 ) = 0
0.042 ( 0.028 ms): vest/43052 ... [continued]: mmap()) = 0x3fffd6e2000
0.072 ( 0.025 ms): vest/43052 read(buf: 0x3fffd6e2000, count: 4096 ) = 6
After:
0.045 ( 0.011 ms): fstat(statbuf: 0x3ffff8a0930 ) = 0
0.057 ( 0.018 ms): mmap(arg: 0x3ffff8a0858 ) = 0x3fffd14a000
0.076 ( 0.025 ms): read(buf: 0x3fffd14a000, count: 4096 ) = 6
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Link: http://lkml.kernel.org/r/20170531113557.19175-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/builtin-trace.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -679,6 +679,10 @@ static struct syscall_fmt {
{ .name = "mlockall", .errmsg = true,
.arg_scnprintf = { [0] = SCA_HEX, /* addr */ }, },
{ .name = "mmap", .hexret = true,
+/* The standard mmap maps to old_mmap on s390x */
+#if defined(__s390x__)
+ .alias = "old_mmap",
+#endif
.arg_scnprintf = { [0] = SCA_HEX, /* addr */
[2] = SCA_MMAP_PROT, /* prot */
[3] = SCA_MMAP_FLAGS, /* flags */ }, },
Patches currently in stable-queue which might be from jolsa(a)kernel.org are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-trace-add-mmap-alias-for-s390.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/perf-core-correct-event-creation-with-perf_format_group.patch
queue-4.9/perf-tools-decompress-kernel-module-when-reading-dso-data.patch
queue-4.9/perf-tests-decompress-kernel-module-before-objdump.patch
queue-4.9/perf-header-set-proper-module-name-when-build-id-event-found.patch
This is a note to let you know that I've just added the patch titled
perf tools: Decompress kernel module when reading DSO data
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-tools-decompress-kernel-module-when-reading-dso-data.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Namhyung Kim <namhyung(a)kernel.org>
Date: Thu, 8 Jun 2017 16:31:05 +0900
Subject: perf tools: Decompress kernel module when reading DSO data
From: Namhyung Kim <namhyung(a)kernel.org>
[ Upstream commit 1d6b3c9ba756a5134fd7ad1959acac776d17404b ]
Currently perf decompresses kernel modules when loading the symbol table
but it missed to do it when reading raw data.
Signed-off-by: Namhyung Kim <namhyung(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Adrian Hunter <adrian.hunter(a)intel.com>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: Wang Nan <wangnan0(a)huawei.com>
Cc: kernel-team(a)lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/dso.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -366,7 +366,23 @@ static int __open_dso(struct dso *dso, s
if (!is_regular_file(name))
return -EINVAL;
+ if (dso__needs_decompress(dso)) {
+ char newpath[KMOD_DECOMP_LEN];
+ size_t len = sizeof(newpath);
+
+ if (dso__decompress_kmodule_path(dso, name, newpath, len) < 0) {
+ free(name);
+ return -dso->load_errno;
+ }
+
+ strcpy(name, newpath);
+ }
+
fd = do_open(name);
+
+ if (dso__needs_decompress(dso))
+ unlink(name);
+
free(name);
return fd;
}
Patches currently in stable-queue which might be from namhyung(a)kernel.org are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-trace-add-mmap-alias-for-s390.patch
queue-4.9/perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/perf-tools-decompress-kernel-module-when-reading-dso-data.patch
queue-4.9/perf-tests-decompress-kernel-module-before-objdump.patch
queue-4.9/perf-header-set-proper-module-name-when-build-id-event-found.patch
This is a note to let you know that I've just added the patch titled
perf tools: Fix copyfile_offset update of output offset
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-tools-fix-copyfile_offset-update-of-output-offset.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Jiri Olsa <jolsa(a)kernel.org>
Date: Tue, 9 Jan 2018 14:39:23 +0100
Subject: perf tools: Fix copyfile_offset update of output offset
From: Jiri Olsa <jolsa(a)kernel.org>
[ Upstream commit fa1195ccc0af2d121abe0fe266a1caee8c265eea ]
We need to increase output offset in each iteration, not decrease it as
we currently do.
I guess we were lucky to finish in most cases in first iteration, so the
bug never showed. However it shows a lot when working with big (~4GB)
size data.
Signed-off-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Fixes: 9c9f5a2f1944 ("perf tools: Introduce copyfile_offset() function")
Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/util.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -207,7 +207,7 @@ int copyfile_offset(int ifd, loff_t off_
size -= ret;
off_in += ret;
- off_out -= ret;
+ off_out += ret;
}
munmap(ptr, off_in + size);
Patches currently in stable-queue which might be from jolsa(a)kernel.org are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-trace-add-mmap-alias-for-s390.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/perf-core-correct-event-creation-with-perf_format_group.patch
queue-4.9/perf-tools-decompress-kernel-module-when-reading-dso-data.patch
queue-4.9/perf-tests-decompress-kernel-module-before-objdump.patch
queue-4.9/perf-header-set-proper-module-name-when-build-id-event-found.patch
This is a note to let you know that I've just added the patch titled
perf report: Fix off-by-one for non-activation frames
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-report-fix-off-by-one-for-non-activation-frames.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Milian Wolff <milian.wolff(a)kdab.com>
Date: Wed, 24 May 2017 15:21:25 +0900
Subject: perf report: Fix off-by-one for non-activation frames
From: Milian Wolff <milian.wolff(a)kdab.com>
[ Upstream commit 1982ad48fc82c284a5cc55697a012d3357e84d01 ]
As the documentation for dwfl_frame_pc says, frames that
are no activation frames need to have their program counter
decremented by one to properly find the function of the caller.
This fixes many cases where perf report currently attributes
the cost to the next line. I.e. I have code like this:
~~~~~~~~~~~~~~~
#include <thread>
#include <chrono>
using namespace std;
int main()
{
this_thread::sleep_for(chrono::milliseconds(1000));
this_thread::sleep_for(chrono::milliseconds(100));
this_thread::sleep_for(chrono::milliseconds(10));
return 0;
}
~~~~~~~~~~~~~~~
Now compile and record it:
~~~~~~~~~~~~~~~
g++ -std=c++11 -g -O2 test.cpp
echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
perf record \
--event sched:sched_stat_sleep \
--event sched:sched_process_exit \
--event sched:sched_switch --call-graph=dwarf \
--output perf.data.raw \
./a.out
echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~
Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:9
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:10
| __libc_start_main
| _start
|
--0.91%--main test.cpp:13
__libc_start_main
_start
~~~~~~~~~~~~~~~
With this patch here applied, the issue is fixed. The report becomes
much more usable:
~~~~~~~~~~~~~~~
Overhead Source:Line
........ ...........
100.00% core.c:0
|
---__schedule core.c:0
schedule
do_nanosleep hrtimer.c:0
hrtimer_nanosleep
sys_nanosleep
entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
__nanosleep_nocancel .:0
std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
|
|--90.08%--main test.cpp:8
| __libc_start_main
| _start
|
|--9.01%--main test.cpp:9
| __libc_start_main
| _start
|
--0.91%--main test.cpp:10
__libc_start_main
_start
~~~~~~~~~~~~~~~
Similarly it works for signal frames:
~~~~~~~~~~~~~~~
__noinline void bar(void)
{
volatile long cnt = 0;
for (cnt = 0; cnt < 100000000; cnt++);
}
__noinline void foo(void)
{
bar();
}
void sig_handler(int sig)
{
foo();
}
int main(void)
{
signal(SIGUSR1, sig_handler);
raise(SIGUSR1);
foo();
return 0;
}
~~~~~~~~~~~~~~~~
Before, the report wrongly points to `signal.c:29` after raise():
~~~~~~~~~~~~~~~~
$ perf report --stdio --no-children -g srcline -s srcline
...
100.00% signal.c:11
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:29
__libc_start_main
_start
~~~~~~~~~~~~~~~~
With this patch in, the issue is fixed and we instead get:
~~~~~~~~~~~~~~~~
100.00% signal signal [.] bar
|
---bar signal.c:11
|
|--50.49%--main signal.c:29
| __libc_start_main
| _start
|
--49.51%--0x33a8f
raise .:0
main signal.c:27
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.
Signed-off-by: Milian Wolff <milian.wolff(a)kdab.com>
Signed-off-by: Namhyung Kim <namhyung(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Yao Jin <yao.jin(a)linux.intel.com>
Cc: kernel-team(a)lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/unwind-libdw.c | 6 +++++-
tools/perf/util/unwind-libunwind-local.c | 11 +++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -167,12 +167,16 @@ frame_callback(Dwfl_Frame *state, void *
{
struct unwind_info *ui = arg;
Dwarf_Addr pc;
+ bool isactivation;
- if (!dwfl_frame_pc(state, &pc, NULL)) {
+ if (!dwfl_frame_pc(state, &pc, &isactivation)) {
pr_err("%s", dwfl_errmsg(-1));
return DWARF_CB_ABORT;
}
+ if (!isactivation)
+ --pc;
+
return entry(pc, ui) || !(--ui->max_stack) ?
DWARF_CB_ABORT : DWARF_CB_OK;
}
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -646,6 +646,17 @@ static int get_entries(struct unwind_inf
while (!ret && (unw_step(&c) > 0) && i < max_stack) {
unw_get_reg(&c, UNW_REG_IP, &ips[i]);
+
+ /*
+ * Decrement the IP for any non-activation frames.
+ * this is required to properly find the srcline
+ * for caller frames.
+ * See also the documentation for dwfl_frame_pc(),
+ * which this code tries to replicate.
+ */
+ if (unw_is_signal_frame(&c) <= 0)
+ --ips[i];
+
++i;
}
Patches currently in stable-queue which might be from milian.wolff(a)kdab.com are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
This is a note to let you know that I've just added the patch titled
perf tests: Decompress kernel module before objdump
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-tests-decompress-kernel-module-before-objdump.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Namhyung Kim <namhyung(a)kernel.org>
Date: Thu, 8 Jun 2017 16:31:07 +0900
Subject: perf tests: Decompress kernel module before objdump
From: Namhyung Kim <namhyung(a)kernel.org>
[ Upstream commit 94df1040b1e6aacd8dec0ba3c61d7e77cd695f26 ]
If a kernel modules is compressed, it should be decompressed before
running objdump to parse binary data correctly. This fixes a failure of
object code reading test for me.
Signed-off-by: Namhyung Kim <namhyung(a)kernel.org>
Acked-by: Adrian Hunter <adrian.hunter(a)intel.com>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: Wang Nan <wangnan0(a)huawei.com>
Cc: kernel-team(a)lge.com
Link: http://lkml.kernel.org/r/20170608073109.30699-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/tests/code-reading.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -224,6 +224,8 @@ static int read_object_code(u64 addr, si
unsigned char buf2[BUFSZ];
size_t ret_len;
u64 objdump_addr;
+ const char *objdump_name;
+ char decomp_name[KMOD_DECOMP_LEN];
int ret;
pr_debug("Reading object code for memory address: %#"PRIx64"\n", addr);
@@ -284,9 +286,25 @@ static int read_object_code(u64 addr, si
state->done[state->done_cnt++] = al.map->start;
}
+ objdump_name = al.map->dso->long_name;
+ if (dso__needs_decompress(al.map->dso)) {
+ if (dso__decompress_kmodule_path(al.map->dso, objdump_name,
+ decomp_name,
+ sizeof(decomp_name)) < 0) {
+ pr_debug("decompression failed\n");
+ return -1;
+ }
+
+ objdump_name = decomp_name;
+ }
+
/* Read the object code using objdump */
objdump_addr = map__rip_2objdump(al.map, al.addr);
- ret = read_via_objdump(al.map->dso->long_name, objdump_addr, buf2, len);
+ ret = read_via_objdump(objdump_name, objdump_addr, buf2, len);
+
+ if (dso__needs_decompress(al.map->dso))
+ unlink(objdump_name);
+
if (ret > 0) {
/*
* The kernel maps are inaccurate - assume objdump is right in
Patches currently in stable-queue which might be from namhyung(a)kernel.org are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-trace-add-mmap-alias-for-s390.patch
queue-4.9/perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/perf-tools-decompress-kernel-module-when-reading-dso-data.patch
queue-4.9/perf-tests-decompress-kernel-module-before-objdump.patch
queue-4.9/perf-header-set-proper-module-name-when-build-id-event-found.patch
This is a note to let you know that I've just added the patch titled
perf report: Ensure the perf DSO mapping matches what libdw sees
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Milian Wolff <milian.wolff(a)kdab.com>
Date: Fri, 2 Jun 2017 16:37:52 +0200
Subject: perf report: Ensure the perf DSO mapping matches what libdw sees
From: Milian Wolff <milian.wolff(a)kdab.com>
[ Upstream commit 2538b9e2450ae255337c04356e9e0f8cb9ec48d9 ]
In some situations the libdw unwinder stopped working properly. I.e.
with libunwind we see:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
f199 call_init.part.0 (/usr/lib/ld-2.25.so)
f2a5 _dl_init (/usr/lib/ld-2.25.so)
db9 _dl_start_user (/usr/lib/ld-2.25.so)
~~~~~
But with libdw and without this patch this sample is not properly
unwound:
~~~~~
heaptrack_gui 2228 135073.400112: 641314 cycles:
e8ed _dl_fixup (/usr/lib/ld-2.25.so)
15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so)
ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0)
~~~~~
Debug output showed me that libdw found a module for the last frame
address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch
double-checks what libdw sees and what perf knows. If the mappings
mismatch, we now report the elf known to perf. This fixes the situation
above, and the libdw unwinder produces the same stack as libunwind.
Signed-off-by: Milian Wolff <milian.wolff(a)kdab.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Namhyung Kim <namhyung(a)kernel.org>
Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/unwind-libdw.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -38,6 +38,14 @@ static int __report_module(struct addr_l
return 0;
mod = dwfl_addrmodule(ui->dwfl, ip);
+ if (mod) {
+ Dwarf_Addr s;
+
+ dwfl_module_info(mod, NULL, &s, NULL, NULL, NULL, NULL, NULL);
+ if (s != al->map->start)
+ mod = 0;
+ }
+
if (!mod)
mod = dwfl_report_elf(ui->dwfl, dso->short_name,
dso->long_name, -1, al->map->start,
Patches currently in stable-queue which might be from milian.wolff(a)kdab.com are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
This is a note to let you know that I've just added the patch titled
perf header: Set proper module name when build-id event found
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-header-set-proper-module-name-when-build-id-event-found.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Namhyung Kim <namhyung(a)kernel.org>
Date: Wed, 31 May 2017 21:01:03 +0900
Subject: perf header: Set proper module name when build-id event found
From: Namhyung Kim <namhyung(a)kernel.org>
[ Upstream commit 1deec1bd96ccd8beb04d2112a6d12fe20505c3a6 ]
When perf processes build-id event, it creates DSOs with the build-id.
But it didn't set the module short name (like '[module-name]') so when
processing a kernel mmap event of the module, it cannot found the DSO as
it only checks the short names.
That leads for perf to create a same DSO without the build-id info and
it'll lookup the system path even if the DSO is already in the build-id
cache. After kernel was updated, perf cannot find the DSO and cannot
show symbols in it anymore.
You can see this if you have an old data file (w/ old kernel version):
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols
...
The second message didn't show the build-id. With this patch:
$ perf report -i perf.data.old -v |& grep scsi_mod
build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1
/lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols
...
Now it shows the build-id but still cannot load the symbol table. This
is a different problem which will be fixed in the next patch.
Signed-off-by: Namhyung Kim <namhyung(a)kernel.org>
Acked-by: Jiri Olsa <jolsa(a)kernel.org>
Cc: Andi Kleen <andi(a)firstfloor.org>
Cc: David Ahern <dsahern(a)gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
Cc: kernel-team(a)lge.com
Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org
[ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ]
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/header.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1454,8 +1454,16 @@ static int __event_process_build_id(stru
dso__set_build_id(dso, &bev->build_id);
- if (!is_kernel_module(filename, cpumode))
- dso->kernel = dso_type;
+ if (dso_type != DSO_TYPE_USER) {
+ struct kmod_path m = { .name = NULL, };
+
+ if (!kmod_path__parse_name(&m, filename) && m.kmod)
+ dso__set_short_name(dso, strdup(m.name), true);
+ else
+ dso->kernel = dso_type;
+
+ free(m.name);
+ }
build_id__sprintf(dso->build_id, sizeof(dso->build_id),
sbuild_id);
Patches currently in stable-queue which might be from namhyung(a)kernel.org are
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/perf-trace-add-mmap-alias-for-s390.patch
queue-4.9/perf-report-ensure-the-perf-dso-mapping-matches-what-libdw-sees.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/perf-tools-decompress-kernel-module-when-reading-dso-data.patch
queue-4.9/perf-tests-decompress-kernel-module-before-objdump.patch
queue-4.9/perf-header-set-proper-module-name-when-build-id-event-found.patch
This is a note to let you know that I've just added the patch titled
perf probe: Add warning message if there is unexpected event name
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-probe-add-warning-message-if-there-is-unexpected-event-name.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Sat, 9 Dec 2017 01:26:46 +0900
Subject: perf probe: Add warning message if there is unexpected event name
From: Masami Hiramatsu <mhiramat(a)kernel.org>
[ Upstream commit 9f5c6d8777a2d962b0eeacb2a16f37da6bea545b ]
This improve the error message so that user can know event-name error
before writing new events to kprobe-events interface.
E.g.
======
#./perf probe -x /lib64/libc-2.25.so malloc_get_state*
Internal error: "malloc_get_state@GLIBC_2" is an invalid event name.
Error: Failed to add events.
======
Reported-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Acked-by: Ravi Bangoria <ravi.bangoria(a)linux.vnet.ibm.com>
Reviewed-by: Thomas Richter <tmricht(a)linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Paul Clarke <pc(a)us.ibm.com>
Cc: bhargavb <bhargavaramudu(a)gmail.com>
Cc: linux-rt-users(a)vger.kernel.org
Link: http://lkml.kernel.org/r/151275040665.24652.5188568529237584489.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
tools/perf/util/probe-event.c | 8 ++++++++
1 file changed, 8 insertions(+)
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -2609,6 +2609,14 @@ static int get_new_event_name(char *buf,
out:
free(nbase);
+
+ /* Final validation */
+ if (ret >= 0 && !is_c_func_name(buf)) {
+ pr_warning("Internal error: \"%s\" is an invalid event name.\n",
+ buf);
+ ret = -EINVAL;
+ }
+
return ret;
}
Patches currently in stable-queue which might be from mhiramat(a)kernel.org are
queue-4.9/perf-probe-add-warning-message-if-there-is-unexpected-event-name.patch
This is a note to let you know that I've just added the patch titled
perf/core: Fix error handling in perf_event_alloc()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-core-fix-error-handling-in-perf_event_alloc.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Dan Carpenter <dan.carpenter(a)oracle.com>
Date: Mon, 22 May 2017 12:04:18 +0300
Subject: perf/core: Fix error handling in perf_event_alloc()
From: Dan Carpenter <dan.carpenter(a)oracle.com>
[ Upstream commit 36cc2b9222b5106de34085c4dd8635ac67ef5cba ]
We don't set an error code here which means that perf_event_alloc()
returns ERR_PTR(0) (in other words NULL). The callers are not expecting
that and would Oops.
Signed-off-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Arnaldo Carvalho de Melo <acme(a)redhat.com>
Cc: Jiri Olsa <jolsa(a)redhat.com>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Stephane Eranian <eranian(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vince Weaver <vincent.weaver(a)maine.edu>
Fixes: 375637bc5249 ("perf/core: Introduce address range filtering")
Link: http://lkml.kernel.org/r/20170522090418.hvs6icgpdo53wkn5@mwanda
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/events/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9289,8 +9289,10 @@ perf_event_alloc(struct perf_event_attr
event->addr_filters_offs = kcalloc(pmu->nr_addr_filters,
sizeof(unsigned long),
GFP_KERNEL);
- if (!event->addr_filters_offs)
+ if (!event->addr_filters_offs) {
+ err = -ENOMEM;
goto err_per_task;
+ }
/* force hw sync on the address filters */
event->addr_filters_gen = 1;
Patches currently in stable-queue which might be from dan.carpenter(a)oracle.com are
queue-4.9/block-fix-an-error-code-in-add_partition.patch
queue-4.9/x.509-fix-error-code-in-x509_cert_parse.patch
queue-4.9/rdma-iw_cxgb4-avoid-touch-after-free-error-in-arp-failure-handlers.patch
queue-4.9/drm-amdkfd-null-dereference-involving-create_process.patch
queue-4.9/pnfs-flexfiles-missing-error-code-in-ff_layout_alloc_lseg.patch
queue-4.9/drivers-misc-vmw_vmci-vmci_queue_pair.c-fix-a-couple-integer-overflow-tests.patch
queue-4.9/cxl-unlock-on-error-in-probe.patch
queue-4.9/md-cluster-fix-potential-lock-issue-in-add_new_disk.patch
queue-4.9/ipmi_ssif-unlock-on-allocation-failure.patch
queue-4.9/powercap-fix-an-error-code-in-powercap_register_zone.patch
queue-4.9/perf-core-fix-error-handling-in-perf_event_alloc.patch
queue-4.9/libceph-null-deref-on-crush_decode-error-path.patch
This is a note to let you know that I've just added the patch titled
perf/core: Correct event creation with PERF_FORMAT_GROUP
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-core-correct-event-creation-with-perf_format_group.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Apr 9 17:09:24 CEST 2018
From: Peter Zijlstra <peterz(a)infradead.org>
Date: Tue, 30 May 2017 11:45:12 +0200
Subject: perf/core: Correct event creation with PERF_FORMAT_GROUP
From: Peter Zijlstra <peterz(a)infradead.org>
[ Upstream commit ba5213ae6b88fb170c4771fef6553f759c7d8cdd ]
Andi was asking about PERF_FORMAT_GROUP vs inherited events, which led
to the discovery of a bug from commit:
3dab77fb1bf8 ("perf: Rework/fix the whole read vs group stuff")
- PERF_SAMPLE_GROUP = 1U << 4,
+ PERF_SAMPLE_READ = 1U << 4,
- if (attr->inherit && (attr->sample_type & PERF_SAMPLE_GROUP))
+ if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
is a clear fail :/
While this changes user visible behaviour; it was previously possible
to create an inherited event with PERF_SAMPLE_READ; this is deemed
acceptible because its results were always incorrect.
Reported-by: Andi Kleen <ak(a)linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Alexander Shishkin <alexander.shishkin(a)linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme(a)kernel.org>
Cc: Jiri Olsa <jolsa(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Stephane Eranian <eranian(a)google.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Vince Weaver <vince(a)deater.net>
Fixes: 3dab77fb1bf8 ("perf: Rework/fix the whole read vs group stuff")
Link: http://lkml.kernel.org/r/20170530094512.dy2nljns2uq7qa3j@hirez.programming.…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/events/core.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5669,9 +5669,6 @@ static void perf_output_read_one(struct
__output_copy(handle, values, n * sizeof(u64));
}
-/*
- * XXX PERF_FORMAT_GROUP vs inherited events seems difficult.
- */
static void perf_output_read_group(struct perf_output_handle *handle,
struct perf_event *event,
u64 enabled, u64 running)
@@ -5716,6 +5713,13 @@ static void perf_output_read_group(struc
#define PERF_FORMAT_TOTAL_TIMES (PERF_FORMAT_TOTAL_TIME_ENABLED|\
PERF_FORMAT_TOTAL_TIME_RUNNING)
+/*
+ * XXX PERF_SAMPLE_READ vs inherited events seems difficult.
+ *
+ * The problem is that its both hard and excessively expensive to iterate the
+ * child list, not to mention that its impossible to IPI the children running
+ * on another CPU, from interrupt/NMI context.
+ */
static void perf_output_read(struct perf_output_handle *handle,
struct perf_event *event)
{
@@ -9259,9 +9263,10 @@ perf_event_alloc(struct perf_event_attr
local64_set(&hwc->period_left, hwc->sample_period);
/*
- * we currently do not support PERF_FORMAT_GROUP on inherited events
+ * We currently do not support PERF_SAMPLE_READ on inherited events.
+ * See perf_output_read().
*/
- if (attr->inherit && (attr->read_format & PERF_FORMAT_GROUP))
+ if (attr->inherit && (attr->sample_type & PERF_SAMPLE_READ))
goto err_ns;
if (!has_branch_stack(event))
Patches currently in stable-queue which might be from peterz(a)infradead.org are
queue-4.9/perf-callchain-force-user_ds-when-invoking-perf_callchain_user.patch
queue-4.9/perf-report-fix-off-by-one-for-non-activation-frames.patch
queue-4.9/x86-efi-disable-runtime-services-on-kexec-kernel-if-booted-with-efi-old_map.patch
queue-4.9/perf-tools-fix-copyfile_offset-update-of-output-offset.patch
queue-4.9/sched-numa-use-down_read_trylock-for-the-mmap_sem.patch
queue-4.9/x86-asm-don-t-use-rbp-as-a-temporary-register-in-csum_partial_copy_generic.patch
queue-4.9/perf-core-correct-event-creation-with-perf_format_group.patch
queue-4.9/x86-mm-kaslr-use-the-_asm_mul-macro-for-multiplication-to-work-around-clang-incompatibility.patch
queue-4.9/x86-tsc-provide-tsc-unstable-boot-parameter.patch
queue-4.9/x86-boot-declare-error-as-noreturn.patch
queue-4.9/cpuhotplug-link-lock-stacks-for-hotplug-callbacks.patch
queue-4.9/perf-core-fix-error-handling-in-perf_event_alloc.patch
queue-4.9/sched-deadline-use-the-revised-wakeup-rule-for-suspending-constrained-dl-tasks.patch