summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2019-05-16perf tools: Fix map reference countingJiri Olsa1-3/+1
[ Upstream commit b9abbdfa88024d52c8084d8f46ea4f161606c692 ] By calling maps__insert() we assume to get 2 references on the map, which we relese within maps__remove call. However if there's already same map name, we currently don't bump the reference and can crash, like: Program received signal SIGABRT, Aborted. 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff75e60f5 in raise () from /lib64/libc.so.6 #1 0x00007ffff75d0895 in abort () from /lib64/libc.so.6 #2 0x00007ffff75d0769 in __assert_fail_base.cold () from /lib64/libc.so.6 #3 0x00007ffff75de596 in __assert_fail () from /lib64/libc.so.6 #4 0x00000000004fc006 in refcount_sub_and_test (i=1, r=0x1224e88) at tools/include/linux/refcount.h:131 #5 refcount_dec_and_test (r=0x1224e88) at tools/include/linux/refcount.h:148 #6 map__put (map=0x1224df0) at util/map.c:299 #7 0x00000000004fdb95 in __maps__remove (map=0x1224df0, maps=0xb17d80) at util/map.c:953 #8 maps__remove (maps=0xb17d80, map=0x1224df0) at util/map.c:959 #9 0x00000000004f7d8a in map_groups__remove (map=<optimized out>, mg=<optimized out>) at util/map_groups.h:65 #10 machine__process_ksymbol_unregister (sample=<optimized out>, event=0x7ffff7279670, machine=<optimized out>) at util/machine.c:728 #11 machine__process_ksymbol (machine=<optimized out>, event=0x7ffff7279670, sample=<optimized out>) at util/machine.c:741 #12 0x00000000004fffbb in perf_session__deliver_event (session=0xb11390, event=0x7ffff7279670, tool=0x7fffffffc7b0, file_offset=13936) at util/session.c:1362 #13 0x00000000005039bb in do_flush (show_progress=false, oe=0xb17e80) at util/ordered-events.c:243 #14 __ordered_events__flush (oe=0xb17e80, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:322 #15 0x00000000005005e4 in perf_session__process_user_event (session=session@entry=0xb11390, event=event@entry=0x7ffff72a4af8, ... Add the map to the list and getting the reference event if we find the map with same name. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190416160127.30203-10-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-05-04perf machine: Update kernel map address and re-order properlyWei Li1-12/+20
[ Upstream commit 977c7a6d1e263ff1d755f28595b99e4bc0c48a9f ] Since commit 1fb87b8e9599 ("perf machine: Don't search for active kernel start in __machine__create_kernel_maps"), the __machine__create_kernel_maps() just create a map what start and end are both zero. Though the address will be updated later, the order of map in the rbtree may be incorrect. The commit ee05d21791db ("perf machine: Set main kernel end address properly") fixed the logic in machine__create_kernel_maps(), but it's still wrong in function machine__process_kernel_mmap_event(). To reproduce this issue, we need an environment which the module address is before the kernel text segment. I tested it on an aarch64 machine with kernel 4.19.25: [root@localhost hulk]# grep _stext /proc/kallsyms ffff000008081000 T _stext [root@localhost hulk]# grep _etext /proc/kallsyms ffff000009780000 R _etext [root@localhost hulk]# tail /proc/modules hisi_sas_v2_hw 77824 0 - Live 0xffff00000191d000 nvme_core 126976 7 nvme, Live 0xffff0000018b6000 mdio 20480 1 ixgbe, Live 0xffff0000018ab000 hisi_sas_main 106496 1 hisi_sas_v2_hw, Live 0xffff000001861000 hns_mdio 20480 2 - Live 0xffff000001822000 hnae 28672 3 hns_dsaf,hns_enet_drv, Live 0xffff000001815000 dm_mirror 40960 0 - Live 0xffff000001804000 dm_region_hash 32768 1 dm_mirror, Live 0xffff0000017f5000 dm_log 32768 2 dm_mirror,dm_region_hash, Live 0xffff0000017e7000 dm_mod 315392 17 dm_mirror,dm_log, Live 0xffff000001780000 [root@localhost hulk]# Before fix: [root@localhost bin]# perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost bin]# perf buildid-list -i perf.data 4c4e46c971ca935f781e603a09b52a92e8bdfee8 [vdso] [root@localhost bin]# perf buildid-list -i perf.data -H 0000000000000000000000000000000000000000 /proc/kcore [root@localhost bin]# After fix: [root@localhost tools]# ./perf/perf record sleep 3 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data (9 samples) ] [root@localhost tools]# ./perf/perf buildid-list -i perf.data 28a6c690262896dbd1b5e1011ed81623e6db0610 [kernel.kallsyms] 106c14ce6e4acea3453e484dc604d66666f08a2f [vdso] [root@localhost tools]# ./perf/perf buildid-list -i perf.data -H 28a6c690262896dbd1b5e1011ed81623e6db0610 /proc/kcore Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Li Bin <huawei.libin@huawei.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190228092003.34071-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
2019-05-04perf cs-etm: Add missing case valueSolomon Tan1-0/+1
[ Upstream commit c8fa7a807f3c5f946bd92076fbaf7826edb650dc ] The following error was thrown when compiling `tools/perf` using OpenCSD v0.11.1. This patch fixes said error. CC util/intel-pt-decoder/intel-pt-log.o CC util/cs-etm-decoder/cs-etm-decoder.o util/cs-etm-decoder/cs-etm-decoder.c: In function ‘cs_etm_decoder__buffer_range’: util/cs-etm-decoder/cs-etm-decoder.c:370:2: error: enumeration value ‘OCSD_INSTR_WFI_WFE’ not handled in switch [-Werror=switch-enum] switch (elem->last_i_type) { ^~~~~~ CC util/intel-pt-decoder/intel-pt-decoder.o cc1: all warnings being treated as errors Because `OCSD_INSTR_WFI_WFE` case was added only in v0.11.0, the minimum required OpenCSD library version for this patch is no longer v0.10.0. Signed-off-by: Solomon Tan <solomonbobstoner@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Suzuki K Poulouse <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190322052255.GA4809@w-OptiPlex-7050 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
2019-04-20perf evsel: Free evsel->counts in perf_evsel__exit()Arnaldo Carvalho de Melo1-0/+1
[ Upstream commit 42dfa451d825a2ad15793c476f73e7bbc0f9d312 ] Using gcc's ASan, Changbin reports: ================================================================= ==7494==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) #1 0x5625e5330a5e in zalloc util/util.h:23 #2 0x5625e5330a9b in perf_counts__new util/counts.c:10 #3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 #4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 #5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 #6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 #7 0x5625e51528e6 in run_test tests/builtin-test.c:358 #8 0x5625e5152baf in test_and_print tests/builtin-test.c:388 #9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 #10 0x5625e515572f in cmd_test tests/builtin-test.c:722 #11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 #15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) #1 0x5625e532560d in zalloc util/util.h:23 #2 0x5625e532566b in xyarray__new util/xyarray.c:10 #3 0x5625e5330aba in perf_counts__new util/counts.c:15 #4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 #5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 #6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 #7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 #8 0x5625e51528e6 in run_test tests/builtin-test.c:358 #9 0x5625e5152baf in test_and_print tests/builtin-test.c:388 #10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 #11 0x5625e515572f in cmd_test tests/builtin-test.c:722 #12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 #16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) His patch took care of evsel->prev_raw_counts, but the above backtraces are about evsel->counts, so fix that instead. Reported-by: Changbin Du <changbin.du@gmail.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Link: https://lkml.kernel.org/n/tip-hd1x13g59f0nuhe4anxhsmfp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf top: Fix global-buffer-overflow issueChangbin Du1-0/+2
[ Upstream commit 1e5b0cf8672e622257df024074e6e09bfbcb7750 ] The array str[] should have six elements. ================================================================= ==4322==ERROR: AddressSanitizer: global-buffer-overflow on address 0x56463844e300 at pc 0x564637e7ad0d bp 0x7f30c8c89d10 sp 0x7f30c8c89d00 READ of size 8 at 0x56463844e300 thread T9 #0 0x564637e7ad0c in __ordered_events__flush util/ordered-events.c:316 #1 0x564637e7b0e4 in ordered_events__flush util/ordered-events.c:338 #2 0x564637c6a57d in process_thread /home/changbin/work/linux/tools/perf/builtin-top.c:1073 #3 0x7f30d173a163 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x8163) #4 0x7f30cfffbdee in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x11adee) 0x56463844e300 is located 32 bytes to the left of global variable 'flags' defined in 'util/trace-event-parse.c:229:26' (0x56463844e320) of size 192 0x56463844e300 is located 0 bytes to the right of global variable 'str' defined in 'util/ordered-events.c:268:28' (0x56463844e2e0) of size 32 SUMMARY: AddressSanitizer: global-buffer-overflow util/ordered-events.c:316 in __ordered_events__flush Shadow bytes around the buggy address: 0x0ac947081c10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c50: 00 00 00 00 00 00 00 00 f9 f9 f9 f9 00 00 00 00 =>0x0ac947081c60:[f9]f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c70: 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 0x0ac947081c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081c90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081ca0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x0ac947081cb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Thread T9 created by T0 here: #0 0x7f30d179de5f in __interceptor_pthread_create (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x4ae5f) #1 0x564637c6b954 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1253 #2 0x564637c7173c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642 #3 0x564637d85038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #4 0x564637d85577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #5 0x564637d8597b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #6 0x564637d860e9 in main /home/changbin/work/linux/tools/perf/perf.c:520 #7 0x7f30cff0509a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Jiri Olsa <jolsa@kernel.org> Fixes: 16c66bc167cc ("perf top: Add processing thread") Fixes: 68ca5d07de20 ("perf ordered_events: Add ordered_events__flush_time interface") Link: http://lkml.kernel.org/r/20190316080556.3075-13-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf maps: Purge all maps from the 'names' treeChangbin Du1-0/+15
[ Upstream commit da3a53a7390a89391bd63bead0c2e9af4c5ef3d6 ] Add function __maps__purge_names() to purge all maps from the names tree. We need to cleanup the names tree in maps__exit(). Detected with gcc's ASan. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-12-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf map: Remove map from 'names' tree in __maps__remove()Changbin Du1-0/+3
[ Upstream commit b49265e04410b97b31a5ee66ef6782c1b2d6cd2c ] There are two trees for each map inserted by maps__insert(), so remove it from the 'names' tree in __maps__remove(). Detected with gcc's ASan. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eric Saint-Etienne <eric.saint.etienne@oracle.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 1e6285699b30 ("perf symbols: Fix slowness due to -ffunction-section") Link: http://lkml.kernel.org/r/20190316080556.3075-11-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf hist: Add missing map__put() in error caseChangbin Du1-1/+3
[ Upstream commit cb6186aeffda4d27e56066c79e9579e7831541d3 ] We need to map__put() before returning from failure of sample__resolve_callchain(). Detected with gcc's ASan. Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 9c68ae98c6f7 ("perf callchain: Reference count maps") Link: http://lkml.kernel.org/r/20190316080556.3075-10-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf build-id: Fix memory leak in print_sdt_events()Changbin Du2-0/+2
[ Upstream commit 8bde8516893da5a5fdf06121f74d11b52ab92df5 ] Detected with gcc's ASan: Direct leak of 4356 byte(s) in 120 object(s) allocated from: #0 0x7ff1a2b5a070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x55719aef4814 in build_id_cache__origname util/build-id.c:215 #2 0x55719af649b6 in print_sdt_events util/parse-events.c:2339 #3 0x55719af66272 in print_events util/parse-events.c:2542 #4 0x55719ad1ecaa in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 #5 0x55719aec745d in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 #6 0x55719aec7d1a in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 #7 0x55719aec8184 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 #8 0x55719aeca41a in main /home/changbin/work/linux/tools/perf/perf.c:520 #9 0x7ff1a07ae09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 40218daea1db ("perf list: Show SDT and pre-cached events") Link: http://lkml.kernel.org/r/20190316080556.3075-7-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf config: Fix a memory leak in collect_config()Changbin Du1-2/+1
[ Upstream commit 54569ba4b06d5baedae4614bde33a25a191473ba ] Detected with gcc's ASan: Direct leak of 66 byte(s) in 5 object(s) allocated from: #0 0x7ff3b1f32070 in __interceptor_strdup (/usr/lib/x86_64-linux-gnu/libasan.so.5+0x3b070) #1 0x560c8761034d in collect_config util/config.c:597 #2 0x560c8760d9cb in get_value util/config.c:169 #3 0x560c8760dfd7 in perf_parse_file util/config.c:285 #4 0x560c8760e0d2 in perf_config_from_file util/config.c:476 #5 0x560c876108fd in perf_config_set__init util/config.c:661 #6 0x560c87610c72 in perf_config_set__new util/config.c:709 #7 0x560c87610d2f in perf_config__init util/config.c:718 #8 0x560c87610e5d in perf_config util/config.c:730 #9 0x560c875ddea0 in main /home/changbin/work/linux/tools/perf/perf.c:442 #10 0x7ff3afb8609a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Taeung Song <treeze.taeung@gmail.com> Fixes: 20105ca1240c ("perf config: Introduce perf_config_set class") Link: http://lkml.kernel.org/r/20190316080556.3075-6-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf list: Don't forget to drop the reference to the allocated thread_mapChangbin Du1-0/+1
[ Upstream commit 39df730b09774bd860e39ea208a48d15078236cb ] Detected via gcc's ASan: Direct leak of 2048 byte(s) in 64 object(s) allocated from: 6 #0 0x7f606512e370 in __interceptor_realloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee370) 7 #1 0x556b0f1d7ddd in thread_map__realloc util/thread_map.c:43 8 #2 0x556b0f1d84c7 in thread_map__new_by_tid util/thread_map.c:85 9 #3 0x556b0f0e045e in is_event_supported util/parse-events.c:2250 10 #4 0x556b0f0e1aa1 in print_hwcache_events util/parse-events.c:2382 11 #5 0x556b0f0e3231 in print_events util/parse-events.c:2514 12 #6 0x556b0ee0a66e in cmd_list /home/changbin/work/linux/tools/perf/builtin-list.c:58 13 #7 0x556b0f01e0ae in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 14 #8 0x556b0f01e859 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 15 #9 0x556b0f01edc8 in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 16 #10 0x556b0f01f71f in main /home/changbin/work/linux/tools/perf/perf.c:520 17 #11 0x7f6062ccf09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <changbin.du@gmail.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Fixes: 89896051f8da ("perf tools: Do not put a variable sized type not at the end of a struct") Link: http://lkml.kernel.org/r/20190316080556.3075-3-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-20perf stat: Fix --no-scaleAndi Kleen2-10/+5
[ Upstream commit 75998bb263bf48c1c85d78cd2d2f3a97d3747cab ] The -c option to enable multiplex scaling has been useless for quite some time because scaling is default. It's only useful as --no-scale to disable scaling. But the non scaling code path has bitrotted and doesn't print anything because perf output code relies on value run/ena information. Also even when we don't want to scale a value it's still useful to show its multiplex percentage. This patch: - Fixes help and documentation to show --no-scale instead of -c - Removes -c, only keeps the long option because -c doesn't support negatives. - Enables running/enabled even with --no-scale - And fixes some other problems in the no-scale output. Before: $ perf stat --no-scale -e cycles true Performance counter stats for 'true': <not counted> cycles 0.000984154 seconds time elapsed After: $ ./perf stat --no-scale -e cycles true Performance counter stats for 'true': 706,070 cycles 0.001219821 seconds time elapsed Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> LPU-Reference: 20190314225002.30108-9-andi@firstfloor.org Link: https://lkml.kernel.org/n/tip-xggjvwcdaj2aqy8ib3i4b1g6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05perf script python: Add trace_context extension module to sys.modulesTony Jones1-3/+9
[ Upstream commit cc437642255224e4140fed1f3e3156fc8ad91903 ] In Python3, the result of PyModule_Create (called from scripts/python/Perf-Trace-Util/Context.c) is not automatically added to sys.modules. See: https://bugs.python.org/issue4592 Below is the observed behavior without the fix: # ldd /usr/bin/perf | grep -i python libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000) # perf record /bin/false [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.015 MB perf.data (17 samples) ] # perf script -g python | cat generated Python script: perf-script.py # perf script -s ./perf-script.py Traceback (most recent call last): File "./perf-script.py", line 18, in <module> from perf_trace_context import * ModuleNotFoundError: No module named 'perf_trace_context' Error running python script ./perf-script.py # Committer notes: To build with python3 use: $ make -C tools/perf PYTHON=python3 Use a non-const variable to pass the 'name' arg to PyImport_AppendInittab(), as python2.6 has that as 'char *', which ends up trowing this in some environments: CC /tmp/build/perf/util/parse-branch-options.o util/scripting-engines/trace-event-python.c: In function 'python_start_script': util/scripting-engines/trace-event-python.c:1520:2: error: passing argument 1 of 'PyImport_AppendInittab' discards 'const' qualifier from pointer target type [-Werror] PyImport_AppendInittab("perf_trace_context", initfunc); ^ In file included from /usr/include/python2.6/Python.h:130:0, from util/scripting-engines/trace-event-python.c:22: /usr/include/python2.6/import.h:54:17: note: expected 'char *' but argument is of type 'const char *' PyAPI_FUNC(int) PyImport_AppendInittab(char *name, void (*initfunc)(void)); ^ cc1: all warnings being treated as errors Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jaroslav Škarvada <jskarvad@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support") Link: http://lkml.kernel.org/r/20190124005229.16146-2-tonyj@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05perf script python: Use PyBytes for attr in trace-event-pythonTony Jones1-2/+1
[ Upstream commit 72e0b15cb24a497d7d0d4707cf51ff40c185ae8c ] With Python3. PyUnicode_FromStringAndSize is unsafe to call on attr and will return NULL. Use _PyBytes_FromStringAndSize (as with raw_buf). Below is the observed behavior without the fix. Note it is first necessary to apply the prior fix (Add trace_context extension module to sys,modules): # ldd /usr/bin/perf | grep -i python libpython3.6m.so.1.0 => /usr/lib64/libpython3.6m.so.1.0 (0x00007f8e1dfb2000) # perf record -e raw_syscalls:sys_enter /bin/false [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (21 samples) ] # perf script -g python | cat generated Python script: perf-script.py # perf script -s ./perf-script.py in trace_begin Segmentation fault (core dumped) Signed-off-by: Tony Jones <tonyj@suse.de> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jaroslav Škarvada <jskarvad@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Seeteena Thoufeek <s1seetee@linux.vnet.ibm.com> Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support") Link: http://lkml.kernel.org/r/20190124005229.16146-3-tonyj@suse.de Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05perf report: Add s390 diagnosic sampling descriptor sizeThomas Richter1-0/+5
[ Upstream commit 2187d87eacd46f6214ce3dc9cfd7a558375a4153 ] On IBM z13 machine types 2964 and 2965 the descriptor sizes for sampling and diagnostic sampling entries might be missing in the trailer entry and are set to zero. This leads to a perf report failure when processing diagnostic sampling entries. This patch adds missing descriptor sizes when the trailer entry contains zero for these fields. Output before: [root@s38lp82 perf]# ./perf report --stdio | fgrep Samples 0xabbf0 [0x8]: failed to process type: 68 Error: failed to process sample [root@s38lp82 perf]# Output after: [root@s38lp82 perf]# ./perf report --stdio | fgrep Samples # Total Lost Samples: 0 # Samples: 3K of event 'SF_CYCLES_BASIC_DIAG' # Samples: 162 of event 'CF_DIAG' [root@s38lp82 perf]# Fixes: 2b1444f2e28b ("perf report: Add raw report support for s390 auxiliary trace") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20190211100627.85714-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05perf report: Don't shadow inlined symbol with different addr rangeHe Kuang2-3/+9
[ Upstream commit 7346195e8643482968f547483e0d823ec1982fab ] We can't assume inlined symbols with the same name are equal, because their address range may be different. This will cause the symbols with different addresses be shadowed when adding to the hist entry, and lead to ERANGE error when checking the symbol address during sample parse, the addr should be within the range of [sym.start, sym.end]. The error message is like: "0x36aea60 [0x8]: failed to process type: 68". The second parameter of symbol__new() is the length of the fake symbol for the inline frame, which is the subtraction of the end and start address of base_sym. Signed-off-by: He Kuang <hekuang@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: aa441895f7b4 ("perf report: Compare symbol name for inlined frames when sorting") Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-05perf annotate: Fix getting source line failureWei Li1-2/+2
[ Upstream commit 11db1ad4513d6205d2519e1a30ff4cef746e3243 ] The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33 ("perf annotate: No need to calculate notes->start twice") removed notes->start assignment in symbol__calc_lines(). It will get failed in find_address_in_section() from symbol__tty_annotate() subroutine as the a2l->addr is wrong. So the annotate summary doesn't report the line number of source code correctly. Before fix: liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c void hotspot_1(void) { volatile int i; for (i = 0; i < 0x10000000; i++); for (i = 0; i < 0x10000000; i++); for (i = 0; i < 0x10000000; i++); } int main(void) { hotspot_1(); return 0; } liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1 liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ] liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1 ---------------------------------------------- 19.30 common_while_1[32] 19.03 common_while_1[4e] 19.01 common_while_1[16] 5.04 common_while_1[13] 4.99 common_while_1[4b] 4.78 common_while_1[2c] 4.77 common_while_1[10] 4.66 common_while_1[2f] 4.59 common_while_1[51] 4.59 common_while_1[35] 4.52 common_while_1[19] 4.20 common_while_1[56] 0.51 common_while_1[48] Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period) ----------------------------------------------------------------------------------------------------------------- : : : : Disassembly of section .text: : : 00000000000005fa <hotspot_1>: : hotspot_1(): : void hotspot_1(void) : { 0.00 : 5fa: push %rbp 0.00 : 5fb: mov %rsp,%rbp : volatile int i; : : for (i = 0; i < 0x10000000; i++); 0.00 : 5fe: movl $0x0,-0x4(%rbp) 0.00 : 605: jmp 610 <hotspot_1+0x16> 0.00 : 607: mov -0x4(%rbp),%eax common_while_1[10] 4.77 : 60a: add $0x1,%eax common_while_1[13] 5.04 : 60d: mov %eax,-0x4(%rbp) common_while_1[16] 19.01 : 610: mov -0x4(%rbp),%eax common_while_1[19] 4.52 : 613: cmp $0xfffffff,%eax 0.00 : 618: jle 607 <hotspot_1+0xd> : for (i = 0; i < 0x10000000; i++); ... After fix: liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ] liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1 ---------------------------------------------- 33.34 common_while_1.c:5 33.34 common_while_1.c:6 33.32 common_while_1.c:7 Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period) ----------------------------------------------------------------------------------------------------------------- : : : : Disassembly of section .text: : : 00000000000005fa <hotspot_1>: : hotspot_1(): : void hotspot_1(void) : { 0.00 : 5fa: push %rbp 0.00 : 5fb: mov %rsp,%rbp : volatile int i; : : for (i = 0; i < 0x10000000; i++); 0.00 : 5fe: movl $0x0,-0x4(%rbp) 0.00 : 605: jmp 610 <hotspot_1+0x16> 0.00 : 607: mov -0x4(%rbp),%eax common_while_1.c:5 4.70 : 60a: add $0x1,%eax 4.89 : 60d: mov %eax,-0x4(%rbp) common_while_1.c:5 19.03 : 610: mov -0x4(%rbp),%eax common_while_1.c:5 4.72 : 613: cmp $0xfffffff,%eax 0.00 : 618: jle 607 <hotspot_1+0xd> : for (i = 0; i < 0x10000000; i++); 0.00 : 61a: movl $0x0,-0x4(%rbp) 0.00 : 621: jmp 62c <hotspot_1+0x32> 0.00 : 623: mov -0x4(%rbp),%eax common_while_1.c:6 4.54 : 626: add $0x1,%eax 4.73 : 629: mov %eax,-0x4(%rbp) common_while_1.c:6 19.54 : 62c: mov -0x4(%rbp),%eax common_while_1.c:6 4.54 : 62f: cmp $0xfffffff,%eax ... Signed-off-by: Wei Li <liwei391@huawei.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice") Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-04-03perf intel-pt: Fix TSC slipAdrian Hunter1-12/+8
commit f3b4e06b3bda759afd042d3d5fa86bea8f1fe278 upstream. A TSC packet can slip past MTC packets so that the timestamp appears to go backwards. One estimate is that can be up to about 40 CPU cycles, which is certainly less than 0x1000 TSC ticks, but accept slippage an order of magnitude more to be on the safe side. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: 79b58424b821c ("perf tools: Add Intel PT support for decoding MTC packets") Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03perf pmu: Fix parser error for uncore event aliasKan Liang1-0/+10
commit e94d6b7f615e6dfbaf9fba7db6011db561461d0c upstream. Perf fails to parse uncore event alias, for example: # perf stat -e unc_m_clockticks -a --no-merge sleep 1 event syntax error: 'unc_m_clockticks' \___ parser error Current code assumes that the event alias is from one specific PMU. To find the PMU, perf strcmps the PMU name of event alias with the real PMU name on the system. However, the uncore event alias may be from multiple PMUs with common prefix. The PMU name of uncore event alias is the common prefix. For example, UNC_M_CLOCKTICKS is clock event for iMC, which include 6 PMUs with the same prefix "uncore_imc" on a skylake server. The real PMU names on the system for iMC are uncore_imc_0 ... uncore_imc_5. The strncmp is used to only check the common prefix for uncore event alias. With the patch: # perf stat -e unc_m_clockticks -a --no-merge sleep 1 Performance counter stats for 'system wide': 723,594,722 unc_m_clockticks [uncore_imc_5] 724,001,954 unc_m_clockticks [uncore_imc_3] 724,042,655 unc_m_clockticks [uncore_imc_1] 724,161,001 unc_m_clockticks [uncore_imc_4] 724,293,713 unc_m_clockticks [uncore_imc_2] 724,340,901 unc_m_clockticks [uncore_imc_0] 1.002090060 seconds time elapsed Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: stable@vger.kernel.org Fixes: ea1fa48c055f ("perf stat: Handle different PMU names with common prefix") Link: http://lkml.kernel.org/r/1552672814-156173-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-27perf probe: Fix getting the kernel mapAdrian Hunter1-2/+4
commit eaeffeb9838a7c0dec981d258666bfcc0fa6a947 upstream. Since commit 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines"), perf tools has been creating more than one kernel map, however 'perf probe' assumed there could be only one. Fix by using machine__kernel_map() to get the main kernel map. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Cc: Xu Yu <xuyu@linux.alibaba.com> Fixes: 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines") Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/2ed432de-e904-85d2-5c36-5897ddc5b23b@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf intel-pt: Fix divide by zero when TSC is not availableAdrian Hunter1-0/+2
commit 076333870c2f5bdd9b6d31e7ca1909cf0c84cbfa upstream. When TSC is not available, "timeless" decoding is used but a divide by zero occurs if perf_time_to_tsc() is called. Ensure the divisor is not zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.9+ Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf intel-pt: Fix overlap calculation for paddingAdrian Hunter1-2/+34
commit 5a99d99e3310a565b0cf63f785b347be9ee0da45 upstream. Auxtrace records might have up to 7 bytes of padding appended. Adjust the overlap accordingly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf auxtrace: Define auxtrace record alignmentAdrian Hunter2-2/+5
commit c3fcadf0bb765faf45d6d562246e1d08885466df upstream. Define auxtrace record alignment so that it can be referenced elsewhere. Note this is preparation for patch "perf intel-pt: Fix overlap calculation for padding" Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf tools: Fix split_kallsyms_for_kcore() for trampoline symbolsAdrian Hunter1-0/+2
commit d6d457451eb94fa747dc202765592eb8885a7352 upstream. Kallsyms symbols do not have a size, so the size becomes the distance to the next symbol. Consequently the recently added trampoline symbols end up with large sizes because the trampolines are some distance from one another and the main kernel map. However, symbols that end outside their map can disrupt the symbol tree because, after mapping, it can appear incorrectly that they overlap other symbols. Add logic to truncate symbol size to the end of the corresponding map. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines") Link: http://lkml.kernel.org/r/20190109091835.5570-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf intel-pt: Fix CYC timestamp calculation after OVFAdrian Hunter1-1/+0
commit 03997612904866abe7cdcc992784ef65cb3a4b81 upstream. CYC packet timestamp calculation depends upon CBR which was being cleared upon overflow (OVF). That can cause errors due to failing to synchronize with sideband events. Even if a CBR change has been lost, the old CBR is still a better estimate than zero. So remove the clearing of CBR. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-04perf symbols: Filter out hidden symbols from labelsJiri Olsa1-1/+8
When perf is built with the annobin plugin (RHEL8 build) extra symbols are added to its binary: # nm perf | grep annobin | head -10 0000000000241100 t .annobin_annotate.c 0000000000326490 t .annobin_annotate.c 0000000000249255 t .annobin_annotate.c_end 00000000003283a8 t .annobin_annotate.c_end 00000000001bce18 t .annobin_annotate.c_end.hot 00000000001bce18 t .annobin_annotate.c_end.hot 00000000001bc3e2 t .annobin_annotate.c_end.unlikely 00000000001bc400 t .annobin_annotate.c_end.unlikely 00000000001bce18 t .annobin_annotate.c.hot 00000000001bce18 t .annobin_annotate.c.hot ... Those symbols have no use for report or annotation and should be skipped. Moreover they interfere with the DWARF unwind test on the PPC arch, where they are mixed with checked symbols and then the test fails: # perf test dwarf -v 59: Test dwarf unwind : --- start --- test child forked, pid 8515 unwind: .annobin_dwarf_unwind.c:ip = 0x10dba40dc (0x2740dc) ... got: .annobin_dwarf_unwind.c 0x10dba40dc, expecting test__arch_unwind_sample unwind: failed with 'no error' The annobin symbols are defined as NOTYPE/LOCAL/HIDDEN: # readelf -s ./perf | grep annobin | head -1 40: 00000000001bce4f 0 NOTYPE LOCAL HIDDEN 13 .annobin_init.c They can still pass the check for the label symbol. Adding check for HIDDEN and INTERNAL (as suggested by Nick below) visibility and filter out such symbols. > Just to be awkward, if you are going to ignore STV_HIDDEN > symbols then you should probably also ignore STV_INTERNAL ones > as well... Annobin does not generate them, but you never know, > one day some other tool might create some. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Clifton <nickc@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190128133526.GD15461@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-04perf symbols: Add fallback definitions for GELF_ST_VISIBILITY()Arnaldo Carvalho de Melo1-0/+14
Those aren't present in Alpine Linux 3.4 to edge, so provide fallback defines to get the next patch building there keeping the build bisectable. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Clifton <nickc@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/n/tip-03cg3gya2ju4ba2x6ibb9fuz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-04perf clang: Do not use 'return std::move(something)'Arnaldo Carvalho de Melo1-1/+1
It prevents copy elision, generating this warning when building with fedora:rawhide's clang: clang version 7.0.1 (Fedora 7.0.1-2.fc30) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/bin Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9 Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/9 Selected GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Selected multilib: .;@m64 $ make -C tools/perf CC=clang LIBCLANGLLVM=1 <SNIP> util/c++/clang.cpp: In function 'std::unique_ptr<llvm::SmallVectorImpl<char> > perf::getBPFObjectFromModule(llvm::Module*)': util/c++/clang.cpp:163:18: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move] 163 | return std::move(Buffer); | ~~~~~~~~~^~~~~~~~ util/c++/clang.cpp:163:18: note: remove 'std::move' call cc1plus: all warnings being treated as errors <SNIP> References: http://www.cplusplus.com/forum/general/186411/#msg908572 https://en.cppreference.com/w/cpp/language/return#Notes https://en.cppreference.com/w/cpp/language/copy_elision Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-lehqf5x5q96l0o8myhb6blz6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-02-04perf mem/c2c: Fix perf_mem_events to support powerpcRavi Bangoria1-1/+1
PowerPC hardware does not have a builtin latency filter (--ldlat) for the "mem-load" event and perf_mem_events by default includes "/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to support "perf mem/c2c" on PowerPC. This patch depends on kernel side changes done my Madhavan: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Dick Fowles <fowles@inreach.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-21perf tools: Handle TOPOLOGY headers with no CPUStephane Eranian1-2/+9
This patch fixes an issue in cpumap.c when used with the TOPOLOGY header. In some configurations, some NUMA nodes may have no CPU (empty cpulist). Yet a cpumap map must be created otherwise perf abort with an error. This patch handles this case by creating a dummy map. Before: $ perf record -o - -e cycles noploop 2 | perf script -i - 0x6e8 [0x6c]: failed to process type: 80 After: $ perf record -o - -e cycles noploop 2 | perf script -i - noploop for 2 seconds Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1547885559-1657-1-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-18perf python: Remove -fstack-clash-protection when building with some clang ↵Arnaldo Carvalho de Melo1-0/+2
versions These options are not present in some (all?) clang versions, so when we build for a distro that has a gcc new enough to have these options and that the distro python build config settings use them but clang doesn't support, b00m. This is the case with fedora rawhide (now gearing towards f30), so check if clang has the and remove the missing ones from CFLAGS. Cc: Eduardo Habkost <ehabkost@redhat.com> Cc: Thiago Macieira <thiago.macieira@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-5q50q9w458yawgxf9ez54jbp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-17perf ordered_events: Fix crash in ordered_events__freeJiri Olsa1-2/+4
Song Liu reported crash in 'perf record': > #0 0x0000000000500055 in ordered_events(float, long double,...)(...) () > #1 0x0000000000500196 in ordered_events.reinit () > #2 0x00000000004fe413 in perf_session.process_events () > #3 0x0000000000440431 in cmd_record () > #4 0x00000000004a439f in run_builtin () > #5 0x000000000042b3e5 in main ()" This can happen when we get out of buffers during event processing. The subsequent ordered_events__free() call assumes oe->buffer != NULL and crashes. Add a check to prevent that. Reported-by: Song Liu <liu.song.a23@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Song Liu <liu.song.a23@gmail.com> Tested-by: Song Liu <liu.song.a23@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/20190117113017.12977-1-jolsa@kernel.org Fixes: d5ceb62b3654 ("perf ordered_events: Add 'struct ordered_events_buffer' layer") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-09perf symbols: Add 'arch_cpu_idle' to the list of kernel idle symbolsArnaldo Carvalho de Melo1-0/+1
When testing 'perf top' on a armhf system (32-bit, Orange Pi Zero), I noticed that 'arch_cpu_idle' dominated, add it to the list of idle symbols, so that we can see what is that being done when not idle. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-4q2b5g4p2hrstrhp9t2mrlho@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-08perf tools: Make find_vdso_map() more modularFlorian Fainelli2-7/+6
In preparation for checking that the vectors page on the ARM architecture, refactor the find_vdso_map() function to accept finding an arbitrary string and create a dedicated helper function for that under util/find-map.c and update the filename to find-map.c and all references to it: perf-read-vdso.c and util/vdso.c. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Healy <cphealy@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Russell King <rmk+kernel@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lkml.kernel.org/r/20181221034337.26663-2-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-08Merge tag 'perf-core-for-mingo-4.21-20190104' of ↵Ingo Molnar5-18/+26
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: perf annotate: Ivan Krylov: - Pass filename to objdump via execl, fixing usage with filenames with special characters. perf report: Jin Yao: Fix wrong iteration count in --branch-history perf stat: Jin Yao: - Fix endless wait for child process perf test: Arnaldo Carvalho de Melo: - Use a fallback to get the pathname in vfs_getname in tools build: Jiri Olsa: - Allow overriding CFLAGS assignments. Misc: Arnaldo Carvalho de Melo: - Syncronize UAPI headers Mattias Jacobsson: - Remove redundant va_end() in strbuf_addv() Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-07Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds9-73/+200
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling updates form Ingo Molnar: "A final batch of perf tooling changes: mostly fixes and small improvements" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) perf session: Add comment for perf_session__register_idle_thread() perf thread-stack: Fix thread stack processing for the idle task perf thread-stack: Allocate an array of thread stacks perf thread-stack: Factor out thread_stack__init() perf thread-stack: Allow for a thread stack array perf thread-stack: Avoid direct reference to the thread's stack perf thread-stack: Tidy thread_stack__bottom() usage perf thread-stack: Simplify some code in thread_stack__process() tools gpio: Allow overriding CFLAGS tools power turbostat: Override CFLAGS assignments and add LDFLAGS to build command tools thermal tmon: Allow overriding CFLAGS assignments tools power x86_energy_perf_policy: Override CFLAGS assignments and add LDFLAGS to build command perf c2c: Increase the HITM ratio limit for displayed cachelines perf c2c: Change the default coalesce setup perf trace beauty ioctl: Beautify USBDEVFS_ commands perf trace beauty: Export function to get the files for a thread perf trace: Wire up ioctl's USBDEBFS_ cmd table generator perf beauty ioctl: Add generator for USBDEVFS_ ioctl commands tools headers uapi: Grab a copy of usbdevice_fs.h perf trace: Store the major number for a file when storing its pathname ...
2019-01-04perf strbuf: Remove redundant va_end() in strbuf_addv()Mattias Jacobsson1-1/+0
Each call to va_copy() should have one, and only one, corresponding call to va_end(). In strbuf_addv() some code paths result in va_end() getting called multiple times. Remove the superfluous va_end(). Signed-off-by: Mattias Jacobsson <2pi@mok.nu> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sanskriti Sharma <sansharm@redhat.com> Link: http://lkml.kernel.org/r/20181229141750.16945-1-2pi@mok.nu Fixes: ce49d8436cff ("perf strbuf: Match va_{add,copy} with va_end") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04perf annotate: Pass filename to objdump via execlIvan Krylov1-4/+4
The symbol__disassemble() function uses shell to launch objdump and filter its output via grep. Passing filenames by interpolating them into the command line via "%s" may lead to problems if said filenames contain special characters. Instead, pass the filename as a command line argument where it is not subject to any kind of interpretation, then use quoted shell interpolation to build the strings we need safely. Signed-off-by: Ivan Krylov <krylov.r00t@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20181014111803.5d83b806@Tarkus Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04perf report: Fix wrong iteration count in --branch-historyJin Yao3-13/+22
By calculating the removed loops, we can get the iteration count. But the iteration count could be reported incorrectly, reporting impossibly high counts. That's because previous code uses the number of removed LBR entries for the iteration count. That's not good. Fix this by increasing the iteration count when a loop is detected. When matching the chain, the iteration count would be added up, finally we need to compute the average value when printing out. For example, $ perf report --branch-history --stdio --no-children Before: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:2968 avg_cycles:3) | f1 +0 | main +22 (cycles:1 iter:2968 avg_cycles:3) | main +17 | main +38 (cycles:1 iter:2968 avg_cycles:3) 2968 is an impossible high iteration count and avg_cycles is too small. After: ---f2 +0 | |--33.62%--f1 +9 (cycles:1) | f1 +0 | main +22 (cycles:1) | main +17 | main +38 (cycles:1) | main +27 | f1 +26 (cycles:1) | f1 +24 | f2 +27 (cycles:7) | f2 +0 | f1 +19 (cycles:1) | f1 +14 | f2 +27 (cycles:11) | f2 +0 | f1 +9 (cycles:1 iter:1 avg_cycles:23) | f1 +0 | main +22 (cycles:1 iter:1 avg_cycles:23) | main +17 | main +38 (cycles:1 iter:1 avg_cycles:23) avg_cycles:23 is the average cycles of this iteration. Fixes: c4ee06251d42 ("perf report: Calculate the average cycles of iterations") Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-04Remove 'type' argument from access_ok() functionLinus Torvalds1-1/+1
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument of the user address range verification function since we got rid of the old racy i386-only code to walk page tables by hand. It existed because the original 80386 would not honor the write protect bit when in kernel mode, so you had to do COW by hand before doing any user access. But we haven't supported that in a long time, and these days the 'type' argument is a purely historical artifact. A discussion about extending 'user_access_begin()' to do the range checking resulted this patch, because there is no way we're going to move the old VERIFY_xyz interface to that model. And it's best done at the end of the merge window when I've done most of my merges, so let's just get this done once and for all. This patch was mostly done with a sed-script, with manual fix-ups for the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form. There were a couple of notable cases: - csky still had the old "verify_area()" name as an alias. - the iter_iov code had magical hardcoded knowledge of the actual values of VERIFY_{READ,WRITE} (not that they mattered, since nothing really used it) - microblaze used the type argument for a debug printout but other than those oddities this should be a total no-op patch. I tried to fix up all architectures, did fairly extensive grepping for access_ok() uses, and the changes are trivial, but I may have missed something. Any missed conversion should be trivially fixable, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-02perf session: Add comment for perf_session__register_idle_thread()Adrian Hunter1-0/+7
Add a comment to perf_session__register_idle_thread() to bring attention to a pitfall with the idle task thread structure. The pitfall is that there should really be a 'struct thread' for the idle task of each cpu, but there is only one that can have pid == tid == 0. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Fix thread stack processing for the idle taskAdrian Hunter4-23/+67
perf creates a single 'struct thread' to represent the idle task. That is because threads are identified by PID and TID, and the idle task always has PID == TID == 0. However, there are actually separate idle tasks for each CPU. That creates a problem for thread stack processing which assumes that each thread has a single stack, not one stack per CPU. Fix that by passing through the CPU number, and in the case of the idle "thread", pick the thread stack from an array based on the CPU number. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Allocate an array of thread stacksAdrian Hunter1-12/+17
In preparation for fixing thread stack processing for the idle task, allocate an array of thread stacks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-7-adrian.hunter@intel.com [ No need to check for NULL when calling zfree(), noticed by Jiri Olsa ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Factor out thread_stack__init()Adrian Hunter1-7/+19
In preparation for fixing thread stack processing for the idle task, factor out thread_stack__init(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Allow for a thread stack arrayAdrian Hunter1-6/+34
In preparation for fixing thread stack processing for the idle task, allow for a thread stack array. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Avoid direct reference to the thread's stackAdrian Hunter1-32/+49
In preparation for fixing thread stack processing for the idle task, avoid direct reference to the thread's stack. The thread stack will change to an array of thread stacks, at which point the meaning of the direct reference will change. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-4-adrian.hunter@intel.com [ Rename thread_stack__ts() to thread__stack() since this operates on a 'thread' struct ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Tidy thread_stack__bottom() usageAdrian Hunter1-4/+3
In preparation for fixing thread stack processing for the idle task, tidy thread_stack__bottom() usage. Specifically, the parameter 'thread' is not needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-01-02perf thread-stack: Simplify some code in thread_stack__process()Adrian Hunter1-11/+7
In preparation for fixing thread stack processing for the idle task, simplify some code in thread_stack__process(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20181221120620.9659-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28perf script: Fix LBR skid dump problems in brstackinsnAndi Kleen3-0/+18
This is a fix for another instance of the skid problem Milian recently found [1] The LBRs don't freeze at the exact same time as the PMI is triggered. The perf script brstackinsn code that dumps LBR assembler assumes that the last branch in the LBR leads to the sample point. But with skid it's possible that the CPU executes one or more branches before the sample, but which do not appear in the LBR. What happens then is either that the sample point is before the last LBR branch. In this case the dumper sees a negative length and ignores it. Or it the sample point is long after the last branch. Then the dumper sees a very long block and dumps it upto its block limit (16k bytes), which is noise in the output. On typical sample session this can happen regularly. This patch tries to detect and handle the situation. On the last block that is dumped by the LBR dumper we always stop on the first branch. If the block length is negative just scan forward to the first branch. Otherwise scan until a branch is found. The PT decoder already has a function that uses the instruction decoder to detect branches, so we can just reuse it here. Then when a terminating branch is found print an indication and stop dumping. This might miss a few instructions, but at least shows no runaway blocks. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Link: http://lkml.kernel.org/r/20181120050617.4119-1-andi@firstfloor.org [ Resolved conflict with dd2e18e9ac20 ("perf tools: Support 'srccode' output") ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-28perf python: Do not force closing original perf descriptor in ↵Jiri Olsa1-1/+2
evlist.get_pollfd() Ondřej reported that when compiled with python3, the python extension regresses in evlist.get_pollfd function behaviour. The evlist.get_pollfd function creates file objects from evlist's fds and returns them in a list. The python3 version also sets them to 'close the original descriptor' when the object dies (is closed), by passing True via the 'closefd' arg in the PyFile_FromFd call. The python's closefd doc says: If closefd is False, the underlying file descriptor will be kept open when the file is closed. That's why the following line in python3 closes all evlist fds: evlist.get_pollfd() the returned list is immediately destroyed and that takes down the original events fds. Passing closefd as False to PyFile_FromFd to fix this. Reported-by: Ondřej Lysoněk <olysonek@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jaroslav Škarvada <jskarvad@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 66dfdff03d19 ("perf tools: Add Python 3 support") Link: http://lkml.kernel.org/r/20181226112121.5285-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>