summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2018-06-05Merge branch 'perf-core-for-linus' of ↵Linus Torvalds88-1009/+1858
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Kernel side changes: - x86 Intel uncore driver cleanups and enhancements (Kan Liang) - group scheduling and other fixes (Song Liu - store frame pointer in the sample traces for better profiling (Alexey Budankov) - compat fixes/enhancements (Eugene Syromiatnikov) Tooling side changes, which you can build and install in a single step via: make -C tools/perf clean install perf annotate: - Support 'perf annotate --group' for non-explicit recorded event "groups", showing multiple columns, one for each event, just like when dealing with explicit event groups (those enclosed with {}) (Jin Yao) - Record min/max LBR cycles (>= Skylake) and add 'perf annotate' TUI hotkey to show it (c) (Jin Yao) perf bpf: - Add infrastructure to help in writing eBPF C programs to be used with '-e name.c' type events in tools such as 'record' and 'trace', with headers for common constructs and an examples directory that will get populated as we add more such helpers and the 'perf bpf' (Arnaldo Carvalho de Melo) perf stat: - Display time in precision based on std deviation (Jiri Olsa) - Add --table option to display time of each run (Jiri Olsa) - Display length strings of each run for --table option (Jiri Olsa) perf buildid-cache: - Add --list and --purge-all options (Ravi Bangoria) perf test: - Let 'perf test list' display subtests (Hendrik Brueckner) perf pti: - Create extra kernel maps to help in decoding samples in x86 PTI entry trampolines (Adrian Hunter) - Copy x86 PTI entry trampoline sections in the kcore copy used for annotation and intel_pt CPU traces decoding (Adrian Hunter) ... and a lot of other fixes, enhancements and cleanups I did not list, see the shortlog and git log for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (111 commits) perf/x86/intel/uncore: Clean up client IMC uncore perf/x86/intel/uncore: Expose uncore_pmu_event*() functions perf/x86/intel/uncore: Support IIO free-running counters on SKX perf/x86/intel/uncore: Add infrastructure for free running counters perf/x86/intel/uncore: Add new data structures for free running counters perf/x86/intel/uncore: Correct fixed counter index check in generic code perf/x86/intel/uncore: Correct fixed counter index check for NHM perf/x86/intel/uncore: Introduce customized event_read() for client IMC uncore perf/x86: Store user space frame-pointer value on a sample perf/core: Wire up compat PERF_EVENT_IOC_QUERY_BPF, PERF_EVENT_IOC_MODIFY_ATTRIBUTES perf/core: Fix bad use of igrab() perf/core: Fix group scheduling with mixed hw and sw events perf kcore_copy: Amend the offset of sections that remap kernel text perf kcore_copy: Copy x86 PTI entry trampoline sections perf kcore_copy: Get rid of kernel_map perf kcore_copy: Iterate phdrs perf kcore_copy: Layout sections perf kcore_copy: Calculate offset from phnum perf kcore_copy: Keep a count of phdrs perf kcore_copy: Keep phdr data in a list ...
2018-06-01perf tools intel-pt-decoder: Update insn.h from the kernel sourcesArnaldo Carvalho de Melo1-0/+18
To pick up the changes in: ee6a7354a362 ("kprobes/x86: Prohibit probing on exception masking instructions") That doesn't entail changes in tooling, but silences this perf build warning: Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-o3wfwjnyh7r8l0gi9q3y9f44@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-06-01perf trace beauty prctl: Default header_dir to cwd to work without parmsArnaldo Carvalho de Melo1-1/+1
Useful when checking the effects of header synchs for the files it uses as a input to generate string tables, in retrospect this is how it should've been done from day 1, not requiring the header_dir to be set on the Makefile, will change everything later, so that the only parm, common to all generators will be $(srctree) and $(beauty_outdir). So, to see what it generates, just call it without any parameters: $ tools/perf/trace/beauty/prctl_option.sh static const char *prctl_options[] = { [1] = "SET_PDEATHSIG", [2] = "GET_PDEATHSIG", [3] = "GET_DUMPABLE", [4] = "SET_DUMPABLE", [5] = "GET_UNALIGN", [6] = "SET_UNALIGN", [7] = "GET_KEEPCAPS", [8] = "SET_KEEPCAPS", [9] = "GET_FPEMU", [10] = "SET_FPEMU", [11] = "GET_FPEXC", [12] = "SET_FPEXC", [13] = "GET_TIMING", [14] = "SET_TIMING", [15] = "SET_NAME", [16] = "GET_NAME", [19] = "GET_ENDIAN", [20] = "SET_ENDIAN", [21] = "GET_SECCOMP", [22] = "SET_SECCOMP", [25] = "GET_TSC", [26] = "SET_TSC", [27] = "GET_SECUREBITS", [28] = "SET_SECUREBITS", [29] = "SET_TIMERSLACK", [30] = "GET_TIMERSLACK", [35] = "SET_MM", [36] = "SET_CHILD_SUBREAPER", [37] = "GET_CHILD_SUBREAPER", [38] = "SET_NO_NEW_PRIVS", [39] = "GET_NO_NEW_PRIVS", [40] = "GET_TID_ADDRESS", [41] = "SET_THP_DISABLE", [42] = "GET_THP_DISABLE", [45] = "SET_FP_MODE", [46] = "GET_FP_MODE", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", [2] = "END_CODE", [3] = "START_DATA", [4] = "END_DATA", [5] = "START_STACK", [6] = "START_BRK", [7] = "BRK", [8] = "ARG_START", [9] = "ARG_END", [10] = "ENV_START", [11] = "ENV_END", [12] = "AUXV", [13] = "EXE_FILE", [14] = "MAP", [15] = "MAP_SIZE", }; $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qtotspuztydjttxi7k6mec6h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf tools: Fix perf.data format description of NRCPUS headerArnaldo Carvalho de Melo1-1/+1
In the perf.data HEADER_CPUDESC feadure header we store first the number of available CPUs in the system, then the number of CPUs at the time of writing the header, not the other way around. Reported-by: Thomas-Mich Richter <tmricht@linux.ibm.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Lakshman Annadorai <lakshmana@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Simon Que <sque@chromium.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j7o92acm2vnxjv70y4o3swoc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf script python: Add addr into perf sample dictLeo Yan1-0/+2
ARM CoreSight auxtrace uses 'sample->addr' to record the target address for branch instructions, so the data of 'sample->addr' is required for tracing data analysis. This commit collects data of 'sample->addr' into perf sample dict, finally can be used for python script for parsing event. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: Tor Jeremiassen <tor@ti.com> Cc: coresight@lists.linaro.org Cc: kim.phillips@arm.co Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Link: http://lkml.kernel.org/r/1527497103-3593-3-git-send-email-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf data: Update documentation section on cpu topologyThomas Richter1-0/+8
Add an explanation of each cpu's core and socket identifier to the perf.data file format documentation. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180528074433.16652-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf cs-etm: Fix indexing for decoder packet queueMathieu Poirier1-2/+10
The tail of a queue is supposed to be pointing to the next available slot in a queue. In this implementation the tail is incremented before it is used and as such points to the last used element, something that has the immense advantage of centralizing tail management at a single location and eliminating a lot of redundant code. But this needs to be taken into consideration on the dequeueing side where the head also needs to be incremented before it is used, or the first available element of the queue will be skipped. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Walker <robert.walker@arm.com> Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/1527289854-10755-1-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf bpf: Fix NULL return handling in bpf__prepare_load()YueHaibing1-3/+3
bpf_object__open()/bpf_object__open_buffer can return error pointer or NULL, check the return values with IS_ERR_OR_NULL() in bpf__prepare_load and bpf__prepare_load_buffer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33v9h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf test: "Session topology" dumps core on s390Thomas Richter1-6/+24
The "perf test Session topology" entry fails with core dump on s390. The root cause is a NULL pointer dereference in function check_cpu_topology() line 76 (or line 82 without -v). The session->header.env.cpu variable is NULL because on s390 function process_cpu_topology() returns with error: socket_id number is too big. You may need to upgrade the perf tool. and releases the env.cpu variable via zfree() and sets it to NULL. Here is the gdb output: (gdb) n 76 pr_debug("CPU %d, core %d, socket %d\n", i, (gdb) n Program received signal SIGSEGV, Segmentation fault. 0x00000000010f4d9e in check_cpu_topology (path=0x3ffffffd6c8 "/tmp/perf-test-J6CHMa", map=0x14a1740) at tests/topology.c:76 76 pr_debug("CPU %d, core %d, socket %d\n", i, (gdb) Make sure the env.cpu variable is not used when its NULL. Test for NULL pointer and return TEST_SKIP if so. Output before: [root@p23lp27 perf]# ./perf test -F 39 39: Session topology :Segmentation fault (core dumped) [root@p23lp27 perf]# Output after: [root@p23lp27 perf]# ./perf test -vF 39 39: Session topology : --- start --- templ file: /tmp/perf-test-Ajx59D socket_id number is too big.You may need to upgrade the perf tool. ---- end ---- Session topology: Skip [root@p23lp27 perf]# Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180528073657.11743-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-30perf parse-events: Handle uncore event aliases in small groups properlyKan Liang4-9/+137
Perf stat doesn't count the uncore event aliases from the same uncore block in a group, for example: perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000 # time counts unit events 1.000447342 <not counted> unc_m_cas_count.all 1.000447342 <not counted> unc_m_clockticks 2.000740654 <not counted> unc_m_cas_count.all 2.000740654 <not counted> unc_m_clockticks The output is very misleading. It gives a wrong impression that the uncore event doesn't work. An uncore block could be composed by several PMUs. An uncore event alias is a joint name which means the same event runs on all PMUs of a block. Perf doesn't support mixed events from different PMUs in the same group. It is wrong to put uncore event aliases in a big group. The right way is to split the big group into multiple small groups which only include the events from the same PMU. Only uncore event aliases from the same uncore block should be specially handled here. It doesn't make sense to mix the uncore events with other uncore events from different blocks or even core events in a group. With the patch: # time counts unit events 1.001557653 140,833 unc_m_cas_count.all 1.001557653 1,330,231,332 unc_m_clockticks 2.002709483 85,007 unc_m_cas_count.all 2.002709483 1,429,494,563 unc_m_clockticks Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Amend the offset of sections that remap kernel textAdrian Hunter1-2/+51
x86 PTI entry trampolines all map to the same physical page. If that is reflected in the program headers of /proc/kcore, then do the same for the copy of kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-18-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Copy x86 PTI entry trampoline sectionsAdrian Hunter1-0/+42
Identify and copy any sections for x86 PTI entry trampolines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-17-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Get rid of kernel_mapAdrian Hunter1-18/+52
In preparation to add more program headers, get rid of kernel_map and modules_map by moving ->kernel_map and ->modules_map to newly allocated entries in the ->phdrs list. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Iterate phdrsAdrian Hunter1-15/+10
In preparation to add more program headers, iterate phdrs instead of assuming there is only one for the kernel text and one for the modules. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-15-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Layout sectionsAdrian Hunter1-3/+22
In preparation to add more program headers, layout the relative offset of each section. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Calculate offset from phnumAdrian Hunter1-1/+5
In preparation to add more program headers, calculate offset from the number of phdrs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Keep a count of phdrsAdrian Hunter1-5/+4
In preparation to add more program headers, keep a count of phdrs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-12-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf kcore_copy: Keep phdr data in a listAdrian Hunter1-0/+9
Currently, kcore_copy makes 2 program headers, one for the kernel text (namely kernel_map) and one for the modules (namely modules_map). Now more program headers are needed, but treating each program header as a special case results in much more code. Instead, in preparation to add more program headers, change to keep program header data (phdr_data) in a list. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-11-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf annotate: Show group event string for stdioJin Yao1-1/+5
When we enable the group, for tui/stdio2, the output first line includes the group event string. While for stdio, it will show only one event. For example, perf record -e cycles,branches ./div perf annotate --group --stdio Percent | Source code & Disassembly of div for cycles (44407 samples) ...... The first line doesn't include the event 'branches'. With this patch, it will show the correct group even string. perf annotate --group --stdio Percent | Source code & Disassembly of div for cycles, branches (44407 samples) ...... Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526989115-14435-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Synthesize and process mmap events for x86 PTI entry trampolinesAdrian Hunter5-7/+140
Like the kernel text, the location of x86 PTI entry trampolines must be recorded in the perf.data file. Like the kernel, synthesize a mmap event for that, and add processing for it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-23perf machine: Create maps for x86 PTI entry trampolinesAdrian Hunter5-19/+187
Create maps for x86 PTI entry trampolines, based on symbols found in kallsyms. It is also necessary to keep track of whether the trampolines have been mapped particularly when the kernel dso is kcore. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-9-git-send-email-adrian.hunter@intel.com [ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Allow for extra kernel mapsAdrian Hunter5-12/+34
Identify extra kernel maps by name so that they can be distinguished from the kernel map and module maps. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Fix map_groups__split_kallsyms() for entry trampoline symbolsAdrian Hunter2-0/+21
When kernel symbols are derived from /proc/kallsyms only (not using vmlinux or /proc/kcore) map_groups__split_kallsyms() is used. However that function makes assumptions that are not true with entry trampoline symbols. For now, remove the entry trampoline symbols at that point, as they are no longer needed at that point. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Workaround missing maps for x86 PTI entry trampolinesAdrian Hunter3-5/+106
On x86_64 the PTI entry trampolines are not in the kernel map created by perf tools. That results in the addresses having no symbols and prevents annotation. It also causes Intel PT to have decoding errors at the trampoline addresses. Workaround that by creating maps for the trampolines. At present the kernel does not export information revealing where the trampolines are. Until that happens, the addresses are hardcoded. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-22perf machine: Add nr_cpus_avail()Adrian Hunter4-0/+20
Add a function to return the number of the machine's available CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526986485-6562-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21perf annotate: Support '--group' optionJin Yao1-0/+7
With the '--group' option, even for non-explicit group, 'perf annotate' will enable the group output. For example, $ perf record -e cycles,branches ./div $ perf annotate main --stdio --group : Disassembly of section .text: : : 00000000004004b0 <main>: : main(): : : return i; : } : : int main(void) : { 0.00 0.00 : 4004b0: push %rbx : int i; : int flag; : volatile double x = 1212121212, y = 121212; : : s_randseed = time(0); 0.00 0.00 : 4004b1: xor %edi,%edi : srand(s_randseed); 0.00 0.00 : 4004b3: mov $0x77359400,%ebx : : return i; : } : But if without --group, there is only one event reported. $ perf annotate main --stdio : Disassembly of section .text: : : 00000000004004b0 <main>: : main(): : : return i; : } : : int main(void) : { 0.00 : 4004b0: push %rbx : int i; : int flag; : volatile double x = 1212121212, y = 121212; : : s_randseed = time(0); 0.00 : 4004b1: xor %edi,%edi : srand(s_randseed); 0.00 : 4004b3: mov $0x77359400,%ebx : : return i; : } Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-4-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21perf report: Use perf_evlist__force_leader to support '--group'Jin Yao1-11/+2
Since we created a new function perf_evlist__force_leader(), remove the old code and use that new evlist method. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-3-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-21perf evlist: Introduce force_leader() methodJin Yao2-0/+18
For non-explicit group (e.g. those created with -e '{eventA,eventB}'), 'perf report' supports a option '--group' which can enable group output. We also need to support 'perf annotate' with the same '--group'. Create a new function perf_evlist__force_leader() which contains common code to force setting the group leader. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526914666-31839-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf tools: Fix kernel_start for PTI on x86Adrian Hunter1-1/+6
Opickn x86_64, PTI entry trampolines are less than the start of kernel text, but still above 2^63. So leave kernel_start = 1ULL << 63 for x86_64. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf machine: Add machine__is() to identify machine archAdrian Hunter4-0/+31
Add a function to identify the machine architecture. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1526548928-20790-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf bpf: Fixup include and examples install messagesArnaldo Carvalho de Melo1-0/+2
Before: INSTALL lib install include/bpf/*.h '/home/acme/lib/include/perf/bpf' INSTALL lib install examples/bpf/*.c '/home/acme/lib/examples/perf/bpf' After: INSTALL lib INSTALL include/bpf INSTALL lib INSTALL examples/bpf Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: dd8e4ead6e98 ("perf bpf: Add bpf.h to be used in eBPF proggies") Fixes: 8f12a2ff00e5 ("perf bpf: Add 'examples' directories") Link: https://lkml.kernel.org/n/tip-icljqe87e8pak8mu6mkki9d4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-19perf annotate: Create hotkey 'c' to show min/max cyclesJin Yao3-7/+45
In the 'perf annotate' view, a new hotkey 'c' is created for showing the min/max cycles. For example, when press 'c', the annotate view is: Percent│ IPC Cycle(min/max) │ │ │ Disassembly of section .text: │ │ 000000000003aab0 <random@@GLIBC_2.2.5>: 8.22 │3.92 sub $0x18,%rsp │3.92 mov $0x1,%esi │3.92 xor %eax,%eax │3.92 cmpl $0x0,argp_program_version_hook@@G │3.92 1(2/1) ↓ je 20 │ lock cmpxchg %esi,__abort_msg@@GLIBC_P │ ↓ jne 29 │ ↓ jmp 43 │1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+ 8.93 │1.10 1(5/1) ↓ je 43 When press 'c' again, the annotate view is switched back: Percent│ IPC Cycle │ │ │ Disassembly of section .text: │ │ 000000000003aab0 <random@@GLIBC_2.2.5>: 8.22 │3.92 sub $0x18,%rsp │3.92 mov $0x1,%esi │3.92 xor %eax,%eax │3.92 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x │3.92 1 ↓ je 20 │ lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 │ ↓ jne 29 │ ↓ jmp 43 │1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 8.93 │1.10 1 ↓ je 43 Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526569118-14217-3-git-send-email-yao.jin@linux.intel.com [ Rename all maxmin to minmax ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-18perf annotate: Record the min/max cyclesJin Yao2-1/+17
Currently perf has a feature to account cycles for LBRs For example, on skylake: perf record -b ... perf report or perf annotate And then browsing the annotate browser gives average cycle counts for program blocks. For some analysis it would be useful if we could know not only the average cycles but also the min and max cycles. This patch records the min and max cycles. Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1526569118-14217-2-git-send-email-yao.jin@linux.intel.com [ Switch from max/min to min/max ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-18perf script: Show symbol offsets by defaultSandipan Das2-18/+20
Since the ip shown for a symbol is now always a virtual address, it becomes difficult to correlate this with objdump output and determine the exact instruction address. So, we always show the offset from the start of the symbol. This can be verified on a powerpc64le system running Fedora 27 as follows: # perf probe -a sys_write # perf record -e probe:sys_write -g ~/test Before applying this patch: # perf script test 9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0) c0000000004025b0 sys_write (/lib/modules/4.17.0-rc4+/build/vmlinux) c00000000000b9e0 system_call (/lib/modules/4.17.0-rc4+/build/vmlinux) 7fffb70d8234 __GI___libc_write (/usr/lib64/libc-2.26.so) 7fffb7052c74 _IO_file_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so) 5afc1818 [unknown] ([unknown]) 7fffb7051a60 new_do_write (/usr/lib64/libc-2.26.so) 7fffb7054638 _IO_do_write@@GLIBC_2.17 (/usr/lib64/libc-2.26.so) 7fffb7054bbc _IO_file_overflow@@GLIBC_2.17 (/usr/lib64/libc-2.26.so) 7fffb7055a24 __overflow (/usr/lib64/libc-2.26.so) 7fffb7044548 _IO_puts (/usr/lib64/libc-2.26.so) 10000440 main (/home/sandipan/test) 7fffb6fe36a0 generic_start_main.isra.0 (/usr/lib64/libc-2.26.so) 7fffb6fe3898 __libc_start_main (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) ... After applying this patch: # perf script test 9710 [013] 95614.332431: probe:sys_write: (c0000000004025b0) c0000000004025b0 sys_write+0x10 (/lib/modules/4.17.0-rc4+/build/vmlinux) c00000000000b9e0 system_call+0x58 (/lib/modules/4.17.0-rc4+/build/vmlinux) 7fffb70d8234 __GI___libc_write+0x24 (/usr/lib64/libc-2.26.so) 7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 (/usr/lib64/libc-2.26.so) 5afc1818 [unknown] ([unknown]) 7fffb7051a60 new_do_write+0x90 (/usr/lib64/libc-2.26.so) 7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 (/usr/lib64/libc-2.26.so) 7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c (/usr/lib64/libc-2.26.so) 7fffb7055a24 __overflow+0x64 (/usr/lib64/libc-2.26.so) 7fffb7044548 _IO_puts+0x218 (/usr/lib64/libc-2.26.so) 10000440 main+0x20 (/home/sandipan/test) 7fffb6fe36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fffb6fe3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) ... Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20180517063326.6319-2-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf script: Show virtual addresses instead of offsetsSandipan Das1-1/+1
When perf data is recorded with the call-graph option enabled, the callchain shown by perf script shows the binary offsets of the symbols as the ip. This is incorrect for kernel symbols as the ip values are always off by a fixed offset depending on the architecture. If the offsets from the start of the symbols are printed, they are also incorrect for both kernel and userspace symbols. Without the call-graph option, the callchain shows the virtual addresses of the symbols rather than their binary offsets. The offsets printed in this case are also correct. This fixes the inconsistency in perf script's output. This can be verified on a powerpc64le system running Fedora 27 as follows: # cat /proc/kallsyms | grep sys_write ... c0000000004025a0 T sys_write c0000000004025a0 T __se_sys_write ... # perf probe -a sys_write Before applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff 4125b0 sys_write+0x8000000000008010 1b9e0 system_call+0x8000000000008058 118234 __GI___libc_write+0xffff0000f52c0024 92c74 _IO_file_write@@GLIBC_2.17+0xffff0000f52c0044 5afbfd8a [unknown] 91a60 new_do_write+0xffff0000f52c0090 94638 _IO_do_write@@GLIBC_2.17+0xffff0000f52c0038 94bbc _IO_file_overflow@@GLIBC_2.17+0xffff0000f52c014c 95a24 __overflow+0xffff0000f52c0064 84548 _IO_puts+0xffff0000f52c0218 440 main+0xffffffffe0000020 236a0 generic_start_main.isra.0+0xffff0000f52c0140 23898 __libc_start_main+0xffff0000f52c00b8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... After applying this patch: # perf record -e probe:sys_write -g ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 c00000000000b9e0 system_call+0x58 7fffb70d8234 __GI___libc_write+0x24 7fffb7052c74 _IO_file_write@@GLIBC_2.17+0x44 5afc1818 [unknown] 7fffb7051a60 new_do_write+0x90 7fffb7054638 _IO_do_write@@GLIBC_2.17+0x38 7fffb7054bbc _IO_file_overflow@@GLIBC_2.17+0x14c 7fffb7055a24 __overflow+0x64 7fffb7044548 _IO_puts+0x218 10000440 main+0x20 7fffb6fe36a0 generic_start_main.isra.0+0x140 7fffb6fe3898 __libc_start_main+0xb8 0 [unknown] ... # perf record -e probe:sys_write ~/test # perf script -F ip,sym,symoff c0000000004025b0 sys_write+0x10 ... Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lkml.kernel.org/r/20180517063326.6319-1-sandipan@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf tools: No need to unconditionally read the max_stack sysctlsArnaldo Carvalho de Melo6-10/+18
Let tools that need to have those variables with the sysctl current values use a function that will read them. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-1ljj3oeo5kpt2n1icfd9vowe@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf tools: Read the cache line size lazilyArnaldo Carvalho de Melo5-17/+25
It is not read as commonly as 'page_size', so it makes sense to read it lazily, caching its value when it is first read. Less files open unconditionally at startup. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-35xhrq91u94uc1djtclek1ie@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17tools lib api fs tracing_path: Introduce opendir() methodArnaldo Carvalho de Melo2-5/+5
That takes care of using the right call to get the tracing_path directory, the one that will end up calling tracing_path_set() to figure out where tracefs is mounted. One more step in doing just lazy reading of system structures to reduce the number of operations done unconditionaly at 'perf' start. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-42zzi0f274909bg9mxzl81bu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf parse-events: Use get/put_events_file()Arnaldo Carvalho de Melo3-22/+43
Instead of accessing the trace_events_path variable directly, that may not have been properly initialized wrt detecting where tracefs is mounted. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-id7hzn1ydgkxbumeve5wapqz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17perf tools: Reuse the path to the tracepoint /events/ directoryArnaldo Carvalho de Melo1-8/+7
When using for_each_event() we needlessly rebuild the whole path to the tracepoint directory, reuse the dir_path instead, saving some cycles and reducing the size of the next patch. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-54bcs15n0cp6gwcgpc4hptyc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-17tools lib api fs tracing_path: Introduce get/put_events_file() helpersArnaldo Carvalho de Melo1-6/+5
To make reading events files a tad more compact than with get_tracing_files("events/foo"). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-do6xgtwpmfl8zjs1euxsd2du@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16tools lib api: Unexport 'tracing_path' variableArnaldo Carvalho de Melo2-6/+2
One should use tracing_path_mount() instead, so more things get done lazily instead of at every 'perf' tool call startup. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-fci4yll35idd9yuslp67vqc2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16perf config: Call perf_config__init() lazilyArnaldo Carvalho de Melo3-9/+9
We check what perf_config__init() does at each perf_config() call, namely if the static perf_config instance was created, so instead of bailing out in that case, try to allocate it, bailing if it fails. Now to get the perf_config() call out of the start of perf's main() function, doing it also lazily. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4bo45k6ivsmbxpfpdte4orsg@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16perf bpf: Fix NULL return handling in bpf__prepare_load()YueHaibing1-3/+3
bpf_object__open()/bpf_object__open_buffer can return error pointer or NULL, check the return values with IS_ERR_OR_NULL() in bpf__prepare_load and bpf__prepare_load_buffer Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: netdev@vger.kernel.org Link: https://lkml.kernel.org/n/tip-psf4xwc09n62al2cb9s33v9h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-16perf parse-events: Handle uncore event aliases in small groups properlyKan Liang4-9/+137
Perf stat doesn't count the uncore event aliases from the same uncore block in a group, for example: perf stat -e '{unc_m_cas_count.all,unc_m_clockticks}' -a -I 1000 # time counts unit events 1.000447342 <not counted> unc_m_cas_count.all 1.000447342 <not counted> unc_m_clockticks 2.000740654 <not counted> unc_m_cas_count.all 2.000740654 <not counted> unc_m_clockticks The output is very misleading. It gives a wrong impression that the uncore event doesn't work. An uncore block could be composed by several PMUs. An uncore event alias is a joint name which means the same event runs on all PMUs of a block. Perf doesn't support mixed events from different PMUs in the same group. It is wrong to put uncore event aliases in a big group. The right way is to split the big group into multiple small groups which only include the events from the same PMU. Only uncore event aliases from the same uncore block should be specially handled here. It doesn't make sense to mix the uncore events with other uncore events from different blocks or even core events in a group. With the patch: # time counts unit events 1.001557653 140,833 unc_m_cas_count.all 1.001557653 1,330,231,332 unc_m_clockticks 2.002709483 85,007 unc_m_cas_count.all 2.002709483 1,429,494,563 unc_m_clockticks Reported-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Agustin Vega-Frias <agustinv@codeaurora.org> Cc: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1525727623-19768-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15perf tools: Use the "_stest" symbol to identify the kernel map when loading ↵Adrian Hunter1-8/+8
kcore The first symbol is not necessarily in the kernel text. Instead of using the first symbol, use the _stest symbol to identify the kernel map when loading kcore. This allows for the introduction of symbols to identify the x86_64 PTI entry trampolines. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/1525866228-30321-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15perf bpf: Add probe() helper to reduce kprobes boilerplateArnaldo Carvalho de Melo2-2/+11
So that kprobe definitions become: int probe(function, variables)(void *ctx, int err, var1, var2, ...) The existing 5sec.c, got converted and goes from: SEC("func=hrtimer_nanosleep rqtp->tv_sec") int func(void *ctx, int err, long sec) { } To: int probe(hrtimer_nanosleep, rqtp->tv_sec)(void *ctx, int err, long sec) { } If we decide to add tv_nsec as well, then it becomes: $ cat tools/perf/examples/bpf/5sec.c #include <bpf.h> int probe(hrtimer_nanosleep, rqtp->tv_sec rqtp->tv_nsec)(void *ctx, int err, long sec, long nsec) { return sec == 5; } license(GPL); $ And if we run it, system wide as before and run some 'sleep' with values for the tv_nsec field, we get: # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c 0.000 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=100000000 9641.650 perf_bpf_probe:hrtimer_nanosleep:(ffffffff9811b5f0) tv_sec=5 tv_nsec=123450001 ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-1v9r8f6ds5av0w9pcwpeknyl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15perf bpf: Add license(NAME) helperArnaldo Carvalho de Melo3-4/+7
To further reduce boilerplate. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-vst6hj335s0ebxzqltes3nsc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15perf bpf: Add kprobe example to catch 5s napsArnaldo Carvalho de Melo1-0/+44
Description: . Disable strace like syscall tracing (--no-syscalls), or try tracing just some (-e *sleep). . Attach a filter function to a kernel function, returning when it should be considered, i.e. appear on the output: $ cat tools/perf/examples/bpf/5sec.c #include <bpf.h> SEC("func=hrtimer_nanosleep rqtp->tv_sec") int func(void *ctx, int err, long sec) { return sec == 5; } char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; $ . Run it system wide, so that any sleep of >= 5 seconds and < than 6 seconds gets caught. . Ask for callgraphs using DWARF info, so that userspace can be unwound . While this is running, run something like "sleep 5s". # perf trace --no-syscalls -e tools/perf/examples/bpf/5sec.c/call-graph=dwarf/ 0.000 perf_bpf_probe:func:(ffffffff9811b5f0) tv_sec=5 hrtimer_nanosleep ([kernel.kallsyms]) __x64_sys_nanosleep ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___nanosleep (/usr/lib64/libc-2.26.so) rpl_nanosleep (/usr/bin/sleep) xnanosleep (/usr/bin/sleep) main (/usr/bin/sleep) __libc_start_main (/usr/lib64/libc-2.26.so) _start (/usr/bin/sleep) ^C# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-2nmxth2l2h09f9gy85lyexcq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-05-15perf bpf: Add bpf.h to be used in eBPF proggiesArnaldo Carvalho de Melo3-2/+10
So, the first helper is the one shortening a variable/function section attribute, from, for instance: char _license[] __attribute__((section("license"), used)) = "GPL"; to: char _license[] SEC("license") = "GPL"; Convert empty.c to that and it becomes: # cat ~acme/lib/examples/perf/bpf/empty.c #include <bpf.h> char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zmeg52dlvy51rdlhyumfl5yf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>