summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2018-03-08perf report: Fix the output for stdio events listJiri Olsa2-4/+17
Changing the output header for reporting forced groups via --groups option on non grouped events, like: $ perf record -e 'cycles,instructions' $ perf report --stdio --group Before: # Samples: 24 of event 'anon group { cycles:u, instructions:u }' After: # Samples: 24 of events 'cycles:u, instructions:u' Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Fixes: ad52b8cb4886 ("perf report: Add support to display group output for non group events") Link: http://lkml.kernel.org/r/20180307155020.32613-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf annotate: Fix s390 target function disassemblyThomas Richter1-1/+1
'perf annotate' displays function call assembler instructions with a right arrow. Hitting enter on this line/instruction causes the browser to disassemble this target function and show it on the screen. On s390 this results in an error message 'The called function was not found.' The function call assembly line parsing does not handle the s390 bras and brasl instructions. Function call__parse expects the target as first operand: callq e9140 <__fxstat> S390 has a register number as first operand: brasl %r14,41d60 <abort> Therefore the target addresses on s390 are always zero which is an invalid address. Introduce a s390 specific call parsing function which skips the first operand on s390. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180307134325.96106-1-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Adjust overlap-checking to support sampling modeAdrian Hunter1-3/+4
Adjust overlap-checking to support sampling mode. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520431349-30689-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Remove a check for sampling modeAdrian Hunter1-3/+0
Intel PT code already has some preparation for AUX area sampling mode. However the implementation has changed from the first proposal and one of the side-effects is that it will not be impossible to support snapshot mode and sampling mode at the same time. Although there are no plans to support it, let validation (not yet implemented) control whether it is allowed rather than low-level functions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520431349-30689-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Tidy old_buffer handling in intel_pt_get_trace()Adrian Hunter1-13/+11
intel_pt_get_trace() fixes overlaps between the current buffer and the previous buffer ('old_buffer'). However the previous buffer might not have had usable data (no PSB) so the comparison must be made against the previous buffer that had usable data. Tidy that by keeping a pointer for that purpose in struct intel_pt_queue. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520431349-30689-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Get rid of intel_pt_use_buffer_pid_tid()Adrian Hunter1-36/+3
With the new way sampling support will be implemented, intel_pt_use_buffer_pid_tid() will not be needed. Get rid of it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520431349-30689-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Fix timestamp following overflowAdrian Hunter1-0/+1
timestamp_insn_cnt is used to estimate the timestamp based on the number of instructions since the last known timestamp. If the estimate is not accurate enough decoding might not be correctly synchronized with side-band events causing more trace errors. However there are always timestamps following an overflow, so the estimate is not needed and can indeed result in more errors. Suppress the estimate by setting timestamp_insn_cnt to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Fix error recovery from missing TIP packetAdrian Hunter1-0/+1
When a TIP packet is expected but there is a different packet, it is an error. However the unexpected packet might be something important like a TSC packet, so after the error, it is necessary to continue from there, rather than the next packet. That is achieved by setting pkt_step to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Fix sync_switchAdrian Hunter1-7/+25
sync_switch is a facility to synchronize decoding more closely with the point in the kernel when the context actually switched. The flag when sync_switch is enabled was global to the decoding, whereas it is really specific to the CPU. The trace data for different CPUs is put on different queues, so add sync_switch to the intel_pt_queue structure and use that in preference to the global setting in the intel_pt structure. That fixes problems decoding one CPU's trace because sync_switch was disabled on a different CPU's queue. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf intel-pt: Fix overlap detection to identify consecutive buffers correctlyAdrian Hunter3-35/+34
Overlap detection was not not updating the buffer's 'consecutive' flag. Marking buffers consecutive has the advantage that decoding begins from the start of the buffer instead of the first PSB. Fix overlap detection to identify consecutive buffers correctly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Simplify perf_mmap__read_init()Kan Liang3-12/+4
It isn't necessary to pass the 'start', 'end' and 'overwrite' arguments to perf_mmap__read_init(). The data is stored in the struct perf_mmap. Discard the parameters. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-8-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Simplify perf_mmap__read_event()Kan Liang3-8/+3
It isn't necessary to pass the 'overwrite', 'start' and 'end' argument to perf_mmap__read_event(). Discard them. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-7-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Simplify perf_mmap__consume()Kan Liang3-5/+5
It isn't necessary to pass the 'overwrite' argument to perf_mmap__consume(). Discard it. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-6-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Use stored 'overwrite' in perf_mmap__consume()Kan Liang1-2/+2
The 'overwrite' is set at allocation. It will not be changed. Using it to replace the parameter of perf_mmap__consume(). The parameters will be discarded later. No functional change. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-5-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Use the stored data in perf_mmap__read_event()Kan Liang1-10/+8
Using the 'start', 'end' and 'overwrite' which are stored in struct perf_mmap to replace the parameters of perf_mmap__read_event(). The parameters will be discarded later. No functional change. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-4-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Use the stored scope data in perf_mmap__push()Kan Liang2-14/+14
Using the 'start' and 'end' which are stored in struct perf_mmap to replace the temporary 'start' and 'end'. The temporary variables will be discarded later. It doesn't need to pass 'overwrite' to perf_mmap__push(). It's stored in struct perf_mmap. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-3-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf mmap: Store mmap scope in struct perf_mmap()Kan Liang2-4/+10
There is too much boilerplate in the perf_mmap__read*() interfaces. The 'start' and 'end' variables should be stored in struct perf_mmap at initialization. They will be used later. The old 'startp' and 'endp' pointers are used by perf_mmap__read_event() now. They cannot be removed. So the old 'startp/endp' and new 'md->start/md->end' will exist simultaneously now. The old one will be removed later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf evlist: Store 'overwrite' in struct perf_mmapKan Liang2-3/+6
It has been determined that the map is for overwrite mode (evlist->overwrite_mmap) or non-overwrite mode (evlist->mmap) when calling perf_evlist__alloc_mmap(). Store the information in struct perf_mmap, which will be used later to simplify the perf_mmap__read*() interfaces. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Suggested-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1520350567-80082-1-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf pmu: Auto-merge PMU events created by prefix or glob matchAgustin Vega-Frias3-13/+6
Auto-merge for these events was disabled when auto-merging of non-alias events was disabled in commit 63ce844 (perf stat: Only auto-merge events that are PMU aliases). Non-merging of legacy events is preserved: $ perf stat -ag -e cache-misses,cache-misses sleep 1 Performance counter stats for 'system wide': 86,323 cache-misses 86,323 cache-misses 1.002623307 seconds time elapsed But prefix or glob matching auto-merges the events created: $ perf stat -a -e l3cache/read-miss/ sleep 1 Performance counter stats for 'system wide': 328 l3cache/read-miss/ 1.002627008 seconds time elapsed $ perf stat -a -e l3cache_0_[01]/read-miss/ sleep 1 Performance counter stats for 'system wide': 172 l3cache/read-miss/ 1.002627008 seconds time elapsed As with events created with aliases, auto-merging can be suppressed with the --no-merge option: $ perf stat -a -e l3cache/read-miss/ --no-merge sleep 1 Performance counter stats for 'system wide': 67 l3cache/read-miss/ 67 l3cache/read-miss/ 63 l3cache/read-miss/ 60 l3cache/read-miss/ 1.002622192 seconds time elapsed Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timur Tabi <timur@codeaurora.org> Cc: linux-arm-kernel@lists.infradead.org Change-Id: I0a47eed54c05e1982ca964d743b37f50f60c508c Link: http://lkml.kernel.org/r/1520345084-42646-4-git-send-email-agustinv@codeaurora.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf pmu: Display pmu name when printing unmerged events in statAgustin Vega-Frias3-1/+9
To simplify creation of events accross multiple instances of the same type of PMU stat supports two methods for creating multiple events from a single event specification: 1. A prefix or glob can be used in the PMU name. 2. Aliases, which are listed immediately after the Kernel PMU events by perf list, are used. When the --no-merge option is passed and these events are displayed individually the PMU name is lost and it's not possible to see which count corresponds to which pmu: $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null Performance counter stats for 'system wide': 67 l3cache/read-miss/ 67 l3cache/read-miss/ 63 l3cache/read-miss/ 60 l3cache/read-miss/ 0.001675706 seconds time elapsed $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null Performance counter stats for 'system wide': 12 l3cache_read_miss 17 l3cache_read_miss 10 l3cache_read_miss 8 l3cache_read_miss 0.001661305 seconds time elapsed This change adds the original pmu name to the event. For dynamic pmu events the pmu name is restored in the event name: $ perf stat -a -e l3cache/read-miss/ --no-merge ls > /dev/null Performance counter stats for 'system wide': 63 l3cache_0_3/read-miss/ 74 l3cache_0_1/read-miss/ 64 l3cache_0_2/read-miss/ 74 l3cache_0_0/read-miss/ 0.001675706 seconds time elapsed For alias events the name is added after the event name: $ perf stat -a -e l3cache_read_miss --no-merge ls > /dev/null Performance counter stats for 'system wide': 10 l3cache_read_miss [l3cache_0_3] 12 l3cache_read_miss [l3cache_0_1] 10 l3cache_read_miss [l3cache_0_2] 17 l3cache_read_miss [l3cache_0_0] 0.001661305 seconds time elapsed Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timur Tabi <timur@codeaurora.org> Cc: linux-arm-kernel@lists.infradead.org Change-Id: I8056b9eda74bda33e95065056167ad96e97cb1fb Link: http://lkml.kernel.org/r/1520345084-42646-3-git-send-email-agustinv@codeaurora.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-08perf pmu: Support wildcards on pmu name in dynamic pmu eventsAgustin Vega-Frias2-3/+13
Starting on v4.12 event parsing code for dynamic pmu events already supports prefix-based matching of multiple pmus when creating dynamic events. E.g., in a system with the following dynamic pmus: mypmu_0 mypmu_1 mypmu_2 mypmu_4 passing mypmu/<config>/ as an event spec will result in the creation of the event in all of the pmus. This change expands this matching through the use of fnmatch so glob-like expressions can be used to create events in multiple pmus. E.g., in the system described above if a user only wants to create the event in mypmu_0 and mypmu_1, mypmu_[01]/<config>/ can be passed. Signed-off-by: Agustin Vega-Frias <agustinv@codeaurora.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Timur Tabi <timur@codeaurora.org> Change-Id: Icb25653fc5d5239c20f3bffdfdf4ab4c9c9bb20b Link: http://lkml.kernel.org/r/1520454947-16977-1-git-send-email-agustinv@codeaurora.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf auxtrace: Make auxtrace_queues__add_buffer() return buffer_ptrAdrian Hunter1-8/+15
In preparation for supporting AUX area sampling buffers, auxtrace_queues__add_buffer() needs to be more generic. To that end, make it return buffer_ptr instead of the caller. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf auxtrace: Rename some buffer-queuing functionsAdrian Hunter1-10/+10
Rename some buffer-queuing functions in preparation for supporting AUX area sampling buffers. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf auxtrace: Add missing parameters from kernel-doc commentsAdrian Hunter1-0/+2
Add missing parameters from kernel-doc comments. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Make the cgroup name be const char *Arnaldo Carvalho de Melo2-11/+15
The usual thing is for a constructor to allocate space for its members, not to require that the caller pass a pre-allocated 'name' and then, at its destructor, to free something not allocated by it. Fix it by making cgroup__new() to receive a const char pointer, then allocate cgroup->name that then can continue to be freed at cgroup__delete(), balancing the alloc/free operations inside the cgroup struct methods. This eases calling evlist__findnew_cgroup() from the custom 'perf trace' cgroup parser, that will only call parse_cgroups() when the '-G cgroup' is passed on the command line after '-e event' entries, when it'll behave just like 'perf stat' and 'perf record', i.e. the previous parse_cgroup() users that mandate that -G only can come after a -e. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4leugnuyqi10t98990o3xi1t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Add evlist__add_default_cgroup()Arnaldo Carvalho de Melo2-0/+16
So that tools like 'perf trace' can allow the user to set a cgroup to be used for all the evsels still without a crgroup setup by parse_cgroups(), such as the one to use for the syscalls, vfs_getname and other events involved in strace like syscall tracing. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-zf9jjsbj661r3lk6qb7g8j70@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Add evlist__findnew_cgroup()Arnaldo Carvalho de Melo2-7/+14
Similar to machine__findnew_thread(), etc, i.e. try to find, get a refcount if found and return it, otherwise return a new cgroup object. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-im1omevlihhyneiic4nl3g24@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf sched: Move thread::shortname to thread_runtimeChangbin Du1-1/+0
The thread::shortname only used by sched command, so move it to sched private structure. Signed-off-by: Changbin Du <changbin.du@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1520307457-23668-2-git-send-email-changbin.du@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Introduce cgroup__new() out of open coded equivalentArnaldo Carvalho de Melo1-10/+20
To follow the namespacing convention in tools/perf. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-jaalyl6bkvvji4r5u8wqw4n4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Introduce find_cgroup() methodArnaldo Carvalho de Melo1-2/+10
To break down complexity in add_cgroup(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-5yqshcf5hm837n7c86u7lhjf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Introduce cgroup__get()Arnaldo Carvalho de Melo2-6/+11
The refcount operation counterpart to cgroup__put(), use it when reusing a cgroup. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-14ynvrl7y2cz8gyuy5q5v41g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Rename close_cgroup() to cgroup__put()Arnaldo Carvalho de Melo3-5/+5
It is not really closing the cgroup, but instead dropping a reference count and if it hits zero, then calling delete, which will, among other cleanup shores, close the cgroup fd. So it is really dropping a reference to that cgroup, and the method name for that is "put", so rename close_cgroup() to cgroup__put() to follow this naming convention. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-sccxpnd7bgwc1llgokt6fcey@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Introduce cgroup__delete()Arnaldo Carvalho de Melo1-3/+8
Just to make this code look more like other places in tools/perf. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-j3j72vvn2d5j7tenlghdy195@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Rename 'struct cgroup_sel' to 'struct cgroup'Arnaldo Carvalho de Melo3-7/+8
That name isn't used, is shorter, lets switch to it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-e51yphwgvepd1y4f5fjptmjq@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07perf cgroup: Remove misplaced __maybe_unusedArnaldo Carvalho de Melo1-1/+1
The 'opt' parameter in parse_cgroups() _is_ used. The original patch used '__used' that was even more confusing :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 023695d96ee0 ("perf tool: Add cgroup support") Link: https://lkml.kernel.org/n/tip-4jo2puz0empkoou6bbq460tl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-07Merge branch 'perf/urgent' into perf/core, to resolve conflictIngo Molnar2-10/+14
Conflicts: tools/perf/perf.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-03-06perf tools: Fix trigger class trigger_on()Adrian Hunter1-4/+5
trigger_on() means that the trigger is available but not ready, however trigger_on() was making it ready. That can segfault if the signal comes before trigger_ready(). e.g. (USR2 signal delivery not shown) $ perf record -e intel_pt//u -S sleep 1 perf: Segmentation fault Obtained 16 stack frames. /home/ahunter/bin/perf(sighandler_dump_stack+0x40) [0x4ec550] /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf] /home/ahunter/bin/perf(perf_evsel__disable+0x26) [0x4b9dd6] /home/ahunter/bin/perf() [0x43a45b] /lib/x86_64-linux-gnu/libc.so.6(+0x36caf) [0x7fa76411acaf] /lib/x86_64-linux-gnu/libc.so.6(__xstat64+0x15) [0x7fa7641d2cc5] /home/ahunter/bin/perf() [0x4ec6c9] /home/ahunter/bin/perf() [0x4ec73b] /home/ahunter/bin/perf() [0x4ec73b] /home/ahunter/bin/perf() [0x4ec73b] /home/ahunter/bin/perf() [0x4eca15] /home/ahunter/bin/perf(machine__create_kernel_maps+0x257) [0x4f0b77] /home/ahunter/bin/perf(perf_session__new+0xc0) [0x4f86f0] /home/ahunter/bin/perf(cmd_record+0x722) [0x43c132] /home/ahunter/bin/perf() [0x4a11ae] /home/ahunter/bin/perf(main+0x5d4) [0x427fb4] Note, for testing purposes, this is hard to hit unless you add some sleep() in builtin-record.c before record__open(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: stable@vger.kernel.org Fixes: 3dcc4436fa6f ("perf tools: Introduce trigger class") Link: http://lkml.kernel.org/r/1519807144-30694-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-06perf auxtrace: Prevent decoding when --no-itraceAdrian Hunter1-6/+9
Prevent auxtrace_queues__process_index() from queuing AUX area data for decoding when the --no-itrace option has been used. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1520327598-1317-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf record: Fix crash in pipe modeJiri Olsa1-2/+6
Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x0000000000515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x0000000000515242 in perf_event__synthesize_event_update_name #1 0x00000000005158a4 in perf_event__synthesize_extra_attr #2 0x0000000000443347 in record__synthesize #3 0x00000000004438e3 in __cmd_record #4 0x000000000044514e in cmd_record #5 0x00000000004cbc95 in run_builtin #6 0x00000000004cbf02 in handle_internal_command #7 0x00000000004cc054 in run_argv #8 0x00000000004cc422 in main The reason of the crash is that the evsel does not have ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have single event, because it's not needed. However we need it when we are in pipe mode even for single event as a key for evsel update event. Fixing this by forcing evsel ids allocation event for single event, when we are in pipe mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf mmap: Discard legacy interfaces for mmap read forwardKan Liang3-48/+2
Discards legacy interfaces perf_evlist__mmap_read_forward(), perf_evlist__mmap_read() and perf_evlist__mmap_consume(). No tools use them. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-14-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf python: Switch to new perf_mmap__read_event() interfaceKan Liang1-3/+9
The perf python binding still use the legacy interface. No functional change. Committer notes: Tested before and after with: [root@jouet perf]# export PYTHONPATH=/tmp/build/perf/python [root@jouet perf]# tools/perf/python/twatch.py cpu: 0, pid: 1183, tid: 6293 { type: exit, pid: 1183, ppid: 1183, tid: 6293, ptid: 6293, time: 17886646588257} cpu: 2, pid: 13820, tid: 13820 { type: fork, pid: 13820, ppid: 13820, tid: 6306, ptid: 13820, time: 17886869099529} cpu: 1, pid: 13820, tid: 6306 { type: comm, pid: 13820, tid: 6306, comm: TaskSchedulerFo } ^CTraceback (most recent call last): File "tools/perf/python/twatch.py", line 68, in <module> main() File "tools/perf/python/twatch.py", line 40, in main evlist.poll(timeout = -1) KeyboardInterrupt [root@jouet perf]# No problems found. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1519945751-37786-3-git-send-email-kan.liang@linux.intel.com [ Changed bool parameters from 0 to 'false', as per Jiri comment ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf record: Fix crash in pipe modeJiri Olsa1-2/+6
Currently we can crash perf record when running in pipe mode, like: $ perf record ls | perf report # To display the perf.data header info, please use --header/--header-only options. # perf: Segmentation fault Error: The - file has no samples! The callstack of the crash is: 0x0000000000515242 in perf_event__synthesize_event_update_name 3513 ev = event_update_event__new(len + 1, PERF_EVENT_UPDATE__NAME, evsel->id[0]); (gdb) bt #0 0x0000000000515242 in perf_event__synthesize_event_update_name #1 0x00000000005158a4 in perf_event__synthesize_extra_attr #2 0x0000000000443347 in record__synthesize #3 0x00000000004438e3 in __cmd_record #4 0x000000000044514e in cmd_record #5 0x00000000004cbc95 in run_builtin #6 0x00000000004cbf02 in handle_internal_command #7 0x00000000004cc054 in run_argv #8 0x00000000004cc422 in main The reason of the crash is that the evsel does not have ids array allocated and the pipe's synthesize code tries to access it. We don't force evsel ids allocation when we have single event, because it's not needed. However we need it when we are in pipe mode even for single event as a key for evsel update event. Fixing this by forcing evsel ids allocation event for single event, when we are in pipe mode. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180302161354.30192-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf annotate: Find 'call' instruction target symbol at parsing timeArnaldo Carvalho de Melo2-17/+22
So that we do it just once, not everytime we press enter or -> on a 'call' instruction line. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-uysyojl1e6nm94amzzzs08tf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf record: Throttle user defined frequencies to the maximum allowedArnaldo Carvalho de Melo1-5/+15
# perf record -F 200000 sleep 1 warning: Maximum frequency rate (15,000 Hz) exceeded, throttling from 200,000 Hz to 15,000 Hz. The limit can be raised via /proc/sys/kernel/perf_event_max_sample_rate. The kernel will lower it when perf's interrupts take too long. Use --strict-freq to disable this throttling, refusing to record. [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (15 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 For those wanting that it fails if the desired frequency can't be used: # perf record --strict-freq -F 200000 sleep 1 error: Maximum frequency rate (15,000 Hz) exceeded. Please use -F freq option with a lower value or consider tweaking /proc/sys/kernel/perf_event_max_sample_rate. # Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-oyebruc44nlja499nqkr1nzn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-03-05perf record: Allow asking for the maximum allowed sample rateArnaldo Carvalho de Melo1-0/+23
Add the handy '-F max' shortcut to reading and using the kernel.perf_event_max_sample_rate value as the user supplied sampling frequency: # perf record -F max sleep 1 info: Using a maximum frequency rate of 15,000 Hz [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (14 samples) ] # sysctl kernel.perf_event_max_sample_rate kernel.perf_event_max_sample_rate = 15000 # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 15000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # perf record -F 10 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (4 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 10, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-4y0tiuws62c64gp4cf0hme0m@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-27perf stat: Ignore error thread when enabling system-wide --per-threadJin Yao3-0/+5
If we execute 'perf stat --per-thread' with non-root account (even set kernel.perf_event_paranoid = -1 yet), it reports the error: jinyao@skl:~$ perf stat --per-thread Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, which controls use of the performance events system by unprivileged users (without CAP_SYS_ADMIN). The current value is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN Disallow raw tracepoint access by users without CAP_SYS_ADMIN >= 1: Disallow CPU event access by users without CAP_SYS_ADMIN >= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN To make this setting permanent, edit /etc/sysctl.conf too, e.g.: kernel.perf_event_paranoid = -1 Perhaps the ptrace rule doesn't allow to trace some processes. But anyway the global --per-thread mode had better ignore such errors and continue working on other threads. This patch will record the index of error thread in perf_evsel__open() and remove this thread before retrying. For example (run with non-root, kernel.perf_event_paranoid isn't set): jinyao@skl:~$ perf stat --per-thread ^C Performance counter stats for 'system wide': vmstat-3458 6.171984 cpu-clock:u (msec) # 0.000 CPUs utilized perf-3670 0.515599 cpu-clock:u (msec) # 0.000 CPUs utilized vmstat-3458 1,163,643 cycles:u # 0.189 GHz perf-3670 40,881 cycles:u # 0.079 GHz vmstat-3458 1,410,238 instructions:u # 1.21 insn per cycle perf-3670 3,536 instructions:u # 0.09 insn per cycle vmstat-3458 288,937 branches:u # 46.814 M/sec perf-3670 936 branches:u # 1.815 M/sec vmstat-3458 15,195 branch-misses:u # 5.26% of all branches perf-3670 76 branch-misses:u # 8.12% of all branches 12.651675247 seconds time elapsed Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1516117388-10120-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-22perf cgroup: Simplify arguments when tracking multiple eventsweiping zhang1-1/+16
When using -G with one cgroup and -e with multiple events, only the first event gets the correct cgroup setting, all events from the second onwards will track system-wide events. If the user wants to track multiple events for a specific cgroup, the user must give parameters like the following: $ perf stat -e e1 -e e2 -e e3 -G test,test,test This patch simplify this case, just type one cgroup: $ perf stat -e e1 -e e2 -e e3 -G test $ mkdir -p /sys/fs/cgroup/perf_event/empty_cgroup $ perf stat -e cycles -e cache-misses -a -I 1000 -G empty_cgroup Before: 1.001007226 <not counted> cycles empty_cgroup 1.001007226 7,506 cache-misses After: 1.000834097 <not counted> cycles empty_cgroup 1.000834097 <not counted> cache-misses empty_cgroup Signed-off-by: weiping zhang <zhangweiping@didichuxing.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180129154805.GA6284@localhost.didichuxing.com [ Improved the doc text a bit, providing an example for cgroup + system wide counting ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19perf tools: Add Python 3 supportJaroslav Škarvada3-64/+184
Added Python 3 support while keeping Python 2.7 compatibility. Committer notes: This doesn't make it to auto detect python 3, one has to explicitely ask it to build with python 3 devel files, here are the instructions provided by Jaroslav: --- $ cp -a tools/perf tools/python3-perf $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 all $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 all $ make V=1 prefix=/usr -C tools/python3-perf PYTHON=/usr/bin/python3 DESTDIR=%{buildroot} install-python_ext $ make V=1 prefix=/usr -C tools/perf PYTHON=/usr/bin/python2 DESTDIR=%{buildroot} install-python_ext --- We need to make this automatic, just like the existing tests for checking if the python2 devel files are in place, allowing the build with python3 if available, fallbacking to python2 and then just disabling it if none are available. So, using the PYTHON variable to build it using O= we get: Before this patch: $ rpm -q python3 python3-devel python3-3.6.4-7.fc27.x86_64 python3-devel-3.6.4-7.fc27.x86_64 $ rm -rf /tmp/build/perf/ ; mkdir -p /tmp/build/perf ; make O=/tmp/build/perf PYTHON=/usr/bin/python3 -C tools/perf install-bin make: Entering directory '/home/acme/git/linux/tools/perf' <SNIP> Makefile.config:670: Python 3 is not yet supported; please set Makefile.config:671: PYTHON and/or PYTHON_CONFIG appropriately. Makefile.config:672: If you also have Python 2 installed, then Makefile.config:673: try something like: Makefile.config:674: Makefile.config:675: make PYTHON=python2 Makefile.config:676: Makefile.config:677: Otherwise, disable Python support entirely: Makefile.config:678: Makefile.config:679: make NO_LIBPYTHON=1 Makefile.config:680: Makefile.config:681: *** . Stop. make[1]: *** [Makefile.perf:212: sub-make] Error 2 make: *** [Makefile:110: install-bin] Error 2 make: Leaving directory '/home/acme/git/linux/tools/perf' $ After: $ make O=/tmp/build/perf PYTHON=python3 -C tools/perf install-bin $ ldd ~/bin/perf | grep python libpython3.6m.so.1.0 => /lib64/libpython3.6m.so.1.0 (0x00007f58a31e8000) $ rpm -qf /lib64/libpython3.6m.so.1.0 python3-libs-3.6.4-7.fc27.x86_64 $ Now verify that when using the binding the right ELF file is loaded, using perf trace: $ perf trace -e open* perf test python 0.051 ( 0.016 ms): perf/3927 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 <SNIP> 18: 'import perf' in python : 8.849 ( 0.013 ms): sh/3929 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC ) = 3 <SNIP> 25.572 ( 0.008 ms): python3/3931 openat(dfd: CWD, filename: /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so, flags: CLOEXEC) = 3 <SNIP> Ok <SNIP> $ And using tools/perf/python/twatch.py, to show PERF_RECORD_ metaevents: $ python3 tools/perf/python/twatch.py cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5207, ppid: 16060, tid: 5207, ptid: 16060, time: 10798513015459} cpu: 3, pid: 16060, tid: 16060 { type: fork, pid: 5208, ppid: 16060, tid: 5208, ptid: 16060, time: 10798513562503} cpu: 0, pid: 5208, tid: 5208 { type: comm, pid: 5208, tid: 5208, comm: grep } cpu: 2, pid: 5207, tid: 5207 { type: comm, pid: 5207, tid: 5207, comm: ps } cpu: 2, pid: 5207, tid: 5207 { type: exit, pid: 5207, ppid: 5207, tid: 5207, ptid: 5207, time: 10798551337484} cpu: 3, pid: 5208, tid: 5208 { type: exit, pid: 5208, ppid: 5208, tid: 5208, ptid: 5208, time: 10798551292153} cpu: 3, pid: 601, tid: 601 { type: fork, pid: 5209, ppid: 601, tid: 5209, ptid: 601, time: 10801779977324} ^CTraceback (most recent call last): File "tools/perf/python/twatch.py", line 68, in <module> main() File "tools/perf/python/twatch.py", line 40, in main evlist.poll(timeout = -1) KeyboardInterrupt $ # ps ax|grep twatch 5197 pts/8 S+ 0:00 python3 tools/perf/python/twatch.py # ls -la /proc/5197/smaps -r--r--r--. 1 acme acme 0 Feb 19 13:14 /proc/5197/smaps # grep python /proc/5197/smaps 558111307000-558111309000 r-xp 00000000 fd:00 3151710 /usr/bin/python3.6 558111508000-558111509000 r--p 00001000 fd:00 3151710 /usr/bin/python3.6 558111509000-55811150a000 rw-p 00002000 fd:00 3151710 /usr/bin/python3.6 7ffad6fc1000-7ffad7008000 r-xp 00000000 00:2d 220196 /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so 7ffad7008000-7ffad7207000 ---p 00047000 00:2d 220196 /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so 7ffad7207000-7ffad7208000 r--p 00046000 00:2d 220196 /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so 7ffad7208000-7ffad7215000 rw-p 00047000 00:2d 220196 /tmp/build/perf/python/perf.cpython-36m-x86_64-linux-gnu.so 7ffadea77000-7ffaded3d000 r-xp 00000000 fd:00 3151795 /usr/lib64/libpython3.6m.so.1.0 7ffaded3d000-7ffadef3c000 ---p 002c6000 fd:00 3151795 /usr/lib64/libpython3.6m.so.1.0 7ffadef3c000-7ffadef42000 r--p 002c5000 fd:00 3151795 /usr/lib64/libpython3.6m.so.1.0 7ffadef42000-7ffadefa5000 rw-p 002cb000 fd:00 3151795 /usr/lib64/libpython3.6m.so.1.0 # And with this patch, but building normally, without specifying the PYTHON=python3 part, which will make it use python2 if its devel files are available, like in this test: $ make O=/tmp/build/perf -C tools/perf install-bin $ ldd ~/bin/perf | grep python libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007f6a44410000) $ ldd /tmp/build/perf/python_ext_build/lib/perf.so | grep python libpython2.7.so.1.0 => /lib64/libpython2.7.so.1.0 (0x00007fed28a2c000) $ [acme@jouet perf]$ tools/perf/python/twatch.py cpu: 0, pid: 2817, tid: 2817 { type: fork, pid: 2817, ppid: 2817, tid: 8910, ptid: 2817, time: 11126454335306} cpu: 0, pid: 2817, tid: 2817 { type: comm, pid: 2817, tid: 8910, comm: worker } $ ps ax | grep twatch.py 8909 pts/8 S+ 0:00 /usr/bin/python tools/perf/python/twatch.py $ grep python /proc/8909/smaps 5579de658000-5579de659000 r-xp 00000000 fd:00 3156044 /usr/bin/python2.7 5579de858000-5579de859000 r--p 00000000 fd:00 3156044 /usr/bin/python2.7 5579de859000-5579de85a000 rw-p 00001000 fd:00 3156044 /usr/bin/python2.7 7f0de01f7000-7f0de023e000 r-xp 00000000 00:2d 230695 /tmp/build/perf/python/perf.so 7f0de023e000-7f0de043d000 ---p 00047000 00:2d 230695 /tmp/build/perf/python/perf.so 7f0de043d000-7f0de043e000 r--p 00046000 00:2d 230695 /tmp/build/perf/python/perf.so 7f0de043e000-7f0de044b000 rw-p 00047000 00:2d 230695 /tmp/build/perf/python/perf.so 7f0de6f0f000-7f0de6f13000 r-xp 00000000 fd:00 134975 /usr/lib64/python2.7/lib-dynload/_localemodule.so 7f0de6f13000-7f0de7113000 ---p 00004000 fd:00 134975 /usr/lib64/python2.7/lib-dynload/_localemodule.so 7f0de7113000-7f0de7114000 r--p 00004000 fd:00 134975 /usr/lib64/python2.7/lib-dynload/_localemodule.so 7f0de7114000-7f0de7115000 rw-p 00005000 fd:00 134975 /usr/lib64/python2.7/lib-dynload/_localemodule.so 7f0de7e73000-7f0de8052000 r-xp 00000000 fd:00 3173292 /usr/lib64/libpython2.7.so.1.0 7f0de8052000-7f0de8251000 ---p 001df000 fd:00 3173292 /usr/lib64/libpython2.7.so.1.0 7f0de8251000-7f0de8255000 r--p 001de000 fd:00 3173292 /usr/lib64/libpython2.7.so.1.0 7f0de8255000-7f0de8291000 rw-p 001e2000 fd:00 3173292 /usr/lib64/libpython2.7.so.1.0 $ Signed-off-by: Jaroslav Škarvada <jskarvad@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 20180119205641.24242-1-jskarvad@redhat.com Link: https://lkml.kernel.org/n/tip-8d7dt9kqp83vsz25hagug8fu@git.kernel.org [ Removed explicit check for python version, allowing it to really build with python3 ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-19perf machine: Fix paranoid check in machine__set_kernel_mmap()Namhyung Kim1-1/+1
The machine__set_kernel_mmap() is to setup addresses of the kernel map using external info. But it has a check when the address is given from an incorrect input which should have the start and end address of 0 (i.e. machine__process_kernel_mmap_event). But we also use the end address of 0 for a valid input so change it to check both start and end addresses. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20180219101936.GD1583@sejong Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-02-16perf cpuid: Introduce a platform specific cpuid compare functionThomas Richter2-18/+30
The function get_cpuid_str() is called by perf_pmu__getcpuid() and on s390 returns a complete description of the CPU and its capabilities, which is a comma separated list. To map the CPU type with the value defined in the pmu-events/arch/s390/mapfile.csv, introduce an architecture specific cpuid compare function named strcmp_cpuid_str() The currently used regex algorithm is defined as the weak default and will be used if no platform specific one is defined. This matches the current behavior. Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180213151419.80737-3-tmricht@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>