summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2017-06-27perf event-parse: Use pr_warning()Arnaldo Carvalho de Melo1-2/+2
Convert sole user of warning() in this file to pr_warning(), consolidating error reporting facilities. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3y7yf6v673ujl2rcs34tzv8n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-27perf config: Use pr_warning()Arnaldo Carvalho de Melo1-4/+2
warning() is going away, consolidating error reporting. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-5r3636cwl4z1varo90mervai@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf tools: Fix message because cpu list option is -C not -cAdrian Hunter1-1/+1
Fix message because cpu list option is -C not -c Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-19-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix transactions_sample_typeAdrian Hunter1-0/+1
'transactions_sample_type' is needed to correctly inject transactions samples but it was not being set. Set it from the event sample type. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Remove redundant initial_skip checksAdrian Hunter1-6/+2
'initial_skip' is checked inside the sample synthesis functions which means it is actually being done twice for 'instructions' and 'transactions' samples. Remove the redundant checks. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-17-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add decoder support for CBR eventsAdrian Hunter2-0/+21
Add decoder support for informing the tools of changes to the core-to-bus ratio (CBR). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-16-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add reserved byte to CBR packet payloadAdrian Hunter2-2/+2
Future proof CBR packet decoding by passing through also the undefined 'reserved' byte in the packet payload. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-15-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add decoder support for ptwrite and power event packetsAdrian Hunter4-8/+293
Add decoder support for PTWRITE, MWAIT, PWRE, PWRX and EXSTOP packets. This patch only affects the decoder, so the tools still do not select or consume the new information. That is added in subsequent patches. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Allow decoding with branch tracing disabledAdrian Hunter3-0/+28
The kernel now supports the disabling of branch tracing, however the decoder assumes branch tracing is always enabled. Pass through a parameter to indicate whether branch tracing is enabled and use it to avoid cases when the decoder is expecting branch packets. There are 2 such cases. First, FUP packets which can bind to an IP even when there is no branch tracing. Secondly, the decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-11-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Add missing __fallthroughAdrian Hunter1-1/+1
perf tools uses __fallthrough. Add missing __fallthrough to a switch statement. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-10-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Clear FUP flag on errorAdrian Hunter1-0/+2
Sometimes a FUP packet is associated with a TSX transaction and a flag is set to indicate that. Ensure that flag is cleared on any error condition because at that point the decoder can no longer assume it is correct. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Use FUP always when scanning for an IPAdrian Hunter1-8/+4
The decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Currently the FUP packet is used only in the case of an overflow, however there is no reason for that to be a special case. So just use FUP always when scanning for an IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zeroAdrian Hunter1-3/+5
Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is not updated when a branch target has been suppressed, which is indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so ensure never to set 'last_ip' when packet 'count' is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix last_ip usageAdrian Hunter1-2/+11
Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is considered to be reset to zero whenever there is a synchronization packet (PSB). The decoder wasn't doing that, and was treating the zero value to mean that there was no last IP, whereas compression can be done against the zero value. Fix by setting last_ip to zero when a PSB is received and keep track of have_last_ip. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IPAdrian Hunter1-0/+1
A value of zero is used to indicate that there is no IP. Ensure the value is zero when the state is INTEL_PT_STATE_NO_IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Fix missing stack clearAdrian Hunter1-0/+1
The return compression stack must be cleared whenever there is a PSB. Fix one case where that was not happening. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Improve sample timestampAdrian Hunter1-3/+31
The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. Improve that situation by using the pkt_state to determine when to use the current or previous timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf intel-pt: Move decoder error setting into one conditionAdrian Hunter1-4/+7
Move decoder error setting into one condition. Cc'ed to stable because later fixes depend on it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-21perf stat: Add support to measure SMI costKan Liang3-0/+37
Implementing a new --smi-cost mode in perf stat to measure SMI cost. During the measurement, the /sys/device/cpu/freeze_on_smi will be set. The measurement can be done with one counter (unhalted core cycles), and two free running MSR counters (IA32_APERF and SMI_COUNT). In practice, the percentages of SMI core cycles should be more useful than absolute value. So the output will be the percentage of SMI core cycles and SMI#. metric_only will be set by default. SMI cycles% = (aperf - unhalted core cycles) / aperf Here is an example output. Performance counter stats for 'sudo echo ': SMI cycles% SMI# 0.1% 1 0.010858678 seconds time elapsed Users who wants to get the actual value can apply additional --no-metric-only. Signed-off-by: Kan Liang <Kan.liang@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Elliott <elliott@hpe.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20perf tools: Remove unused _ALL_SOURCE defineArnaldo Carvalho de Melo1-1/+0
Curious as to what this was for I looked at /usr/include/ and only some python headers define this, and it ends up being to enable "extensions" on some old OSes: /* Enable extensions on AIX 3, Interix */ I guess we can remove this one safely. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-omnundlxo2brs552bdl6m0j1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20perf tools: Do parameter validation earlier on fetch_kernel_version()Arnaldo Carvalho de Melo1-5/+10
While trying to reduce util.[ch] I noticed that fetch_kernel_version() and fetch_ubuntu_kernel_version() do lots of operations only to check if they are needed, i.e. it checks if the pointer where to return the kernel version is NULL only after obtaining the kernel version from /proc/version_signature or by parsing the results from uname(). Do it earlier not to confuse people reading this code in the future :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i94qwyekk4tzbu0b9ce1r1mz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-20perf evsel: Adopt find_process()Arnaldo Carvalho de Melo3-39/+39
And make it static, nobody else uses it, if we ever need it in more places we can carve a new source file for process related methods, for now lets reduce util.{c,h} a tad more. Link: http://lkml.kernel.org/n/tip-zgb28rllvypjibw52aaz9p15@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19perf annotate: Return arch from symbol__disassemble() and save it in browserJin Yao2-3/+11
In annotate browser, we will add support to check fused instructions. While this is x86-specific feature so we need the annotate browser to know what the arch it runs on. symbol__disassemble() has figured out the arch. This patch just lets the arch return from symbol__disassemble and save the arch in annotate browser. Signed-off-by: Yao Jin <yao.jin@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1497840958-4759-2-git-send-email-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19perf intel-pt/bts: Remove unused SAMPLE_SIZE defines and bts priv arrayKim Phillips1-2/+0
These defines were probably dragged in from sampling support in earlier patches. They can be put back when needed. Signed-off-by: Kim Phillips <kim.phillips@arm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170616112339.3fb6986e4ff33e353008244b@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19tools: Adopt __aligned from kernel sourcesArnaldo Carvalho de Melo1-1/+2
To have a more compact way to ask the compiler to use a specific alignment, making tools/ look more like kernel source code. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8jiem6ubg9rlpbs7c2p900no@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19tools: Adopt __packed from kernel sourcesArnaldo Carvalho de Melo1-2/+3
To have a more compact way to ask the compiler to not insert alignment paddings in a struct, making tools/ look more like kernel source code. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-byp46nr7hsxvvyc9oupfb40q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19perf tools: Use __maybe_unused consistentlyArnaldo Carvalho de Melo2-2/+4
Instead of defining __unused or redefining __maybe_unused. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-4eleto5pih31jw1q4dypm9pf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19tools: Adopt __scanf from kernel sourcesArnaldo Carvalho de Melo1-2/+2
To have a more compact way to ask the compiler to perform scanf like argument validation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-yzqrhfjrn26lqqtwf55egg0h@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19tools: Adopt __printf from kernel sourcesArnaldo Carvalho de Melo6-21/+17
To have a more compact way to ask the compiler to perform printf like vargargs validation. v2: Fixed up build on arm, squashing a patch by Kim Phillips, thanks! Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dopkqmmuqs04cxzql0024nnu@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-19tools: Adopt __noreturn from kernel sourcesArnaldo Carvalho de Melo3-10/+9
To have a more compact way to specify that a function doesn't return, instead of the open coded: __attribute__((noreturn)) And use it instead of the tools/perf/ specific variation, NORETURN. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-l0y144qzixcy5t4c6i7pdiqj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-16perf unwind: Report module before querying isactivation in dwfl unwindMilian Wolff1-0/+8
The PC returned by dwfl_frame_pc() may map into a not-yet-reported module. We have to report it before we continue unwinding. But when we query for the isactivation flag in dwfl_frame_pc, libdw will actually do one more unwinding step internally which can then break and lead to missed frames or broken stacks. With libunwind we get e.g.: ~~~~~ heaptrack_gui 2228 135073.400474: 613969 cycles: 108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0) 109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0) 147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0) 109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0) 10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0) 211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0) 92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0) 2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0) 297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0) 78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) heaptrack_gui 2228 135073.401156: 569521 cycles: 131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0) 1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0) 21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0) 2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0) 279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0) e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0) f5a1c QGuiApplicationPrivate::createPlatformIntegration (/usr/lib/libQt5Gui.so.5.8.0) f650c QGuiApplicationPrivate::createEventDispatcher (/usr/lib/libQt5Gui.so.5.8.0) 298524 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 1589e8 QApplicationPrivate::init (/usr/lib/libQt5Widgets.so.5.8.0) 78622 main (/home/milian/projects/compiled/other/bin/heaptrack_gui) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) ~~~~~ Note the two frames 1589e8 and 78622 in the first sample. These are missing when unwinding with libdw. The second sample's breakage is more obvious: ~~~~~ heaptrack_gui 2228 135073.400474: 613969 cycles: 108c8e [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1093bc [unknown] (/usr/lib/libQt5Core.so.5.8.0) 109e7b QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1470ff [unknown] (/usr/lib/libQt5Core.so.5.8.0) 147f67 QSystemLocale::query (/usr/lib/libQt5Core.so.5.8.0) 109fbf QLocalePrivate::updateSystemPrivate (/usr/lib/libQt5Core.so.5.8.0) 10aa27 QLocale::QLocale (/usr/lib/libQt5Core.so.5.8.0) 1e02c3 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 2113bb [unknown] (/usr/lib/libQt5Core.so.5.8.0) 211505 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b5df0 QFileInfo::exists (/usr/lib/libQt5Core.so.5.8.0) 92eb2 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93423 [unknown] (/usr/lib/libQt5Core.so.5.8.0) 93d2a QLibraryInfo::location (/usr/lib/libQt5Core.so.5.8.0) 2170af [unknown] (/usr/lib/libQt5Core.so.5.8.0) 297c53 QCoreApplicationPrivate::init (/usr/lib/libQt5Core.so.5.8.0) f7cde QGuiApplicationPrivate::init (/usr/lib/libQt5Gui.so.5.8.0) 20439 __libc_start_main (/usr/lib/libc-2.25.so) 78299 _start (/home/milian/projects/compiled/other/bin/heaptrack_gui) heaptrack_gui 2228 135073.401156: 569521 cycles: 131633 QString::endsWith (/usr/lib/libQt5Core.so.5.8.0) 1a0701 QDir::cleanPath (/usr/lib/libQt5Core.so.5.8.0) 21b82d [unknown] (/usr/lib/libQt5Core.so.5.8.0) 1b3727 QFileInfo::canonicalFilePath (/usr/lib/libQt5Core.so.5.8.0) 2780c7 QFactoryLoader::update (/usr/lib/libQt5Core.so.5.8.0) 279525 QFactoryLoader::QFactoryLoader (/usr/lib/libQt5Core.so.5.8.0) e5bd0 QPlatformIntegrationFactory::create (/usr/lib/libQt5Gui.so.5.8.0) 723dbf [unknown] ([unknown]) ~~~~~ This patch fixes this issue and the libdw unwinder mimicks the libunwind behavior more closely. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Jan Kratochvil <jan.kratochvil@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20170602143753.16907-2-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-14perf tools: Fix build with ARCH=x86_64Jiada Wang1-1/+1
With commit: 0a943cb10ce78 (tools build: Add HOSTARCH Makefile variable) when building for ARCH=x86_64, ARCH=x86_64 is passed to perf instead of ARCH=x86, so the perf build process searchs header files from tools/arch/x86_64/include, which doesn't exist. The following build failure is seen: In file included from util/event.c:2:0: tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory compilation terminated. Fix this issue by using SRCARCH instead of ARCH in perf, just like the main kernel Makefile and tools/objtool's. Signed-off-by: Jiada Wang <jiada_wang@mentor.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Eugeniu Rosca <erosca@de.adit-jv.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Rui Teng <rui.teng@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 0a943cb10ce7 ("tools build: Add HOSTARCH Makefile variable") Link: http://lkml.kernel.org/r/1491793357-14977-2-git-send-email-jiada_wang@mentor.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-14perf evsel: Fix probing of precise_ip level for default cycles eventArnaldo Carvalho de Melo1-0/+12
Since commit 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") returns -EINVAL for sys_perf_event_open() with an attribute with (attr.precise_ip > 0 && attr.sample_period == 0), just like is done in the routine used to probe the max precise level when no events were passed to 'perf record' or 'perf top', i.e.: perf_evsel__new_cycles() perf_event_attr__set_max_precise_ip() The x86 code, in x86_pmu_hw_config(), which is called all the way from sys_perf_event_open() did, starting with the aforementioned commit: /* There's no sense in having PEBS for non sampling events: */ if (!is_sampling_event(event)) return -EINVAL; Which makes it fail for cycles:ppp, cycles:pp and cycles:p, always using just the non precise cycles variant. To make sure that this is the case, I tested it, before this patch, with: # perf probe -L x86_pmu_hw_config <x86_pmu_hw_config@/home/acme/git/linux/arch/x86/events/core.c:0> 0 int x86_pmu_hw_config(struct perf_event *event) 1 { 2 if (event->attr.precise_ip) { <SNIP> 17 if (event->attr.precise_ip > precise) 18 return -EOPNOTSUPP; /* There's no sense in having PEBS for non sampling events: */ 21 if (!is_sampling_event(event)) 22 return -EINVAL; } <SNIP> # perf probe x86_pmu_hw_config:22 Added new events: probe:x86_pmu_hw_config (on x86_pmu_hw_config:22) probe:x86_pmu_hw_config_1 (on x86_pmu_hw_config:22) You can now use it in all perf tools, such as: perf record -e probe:x86_pmu_hw_config_1 -aR sleep 1 # perf trace -e perf_event_open,probe:x86_pmu_hwconfig*/max-stack=16/ perf record usleep 1 0.000 ( 0.015 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.015 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.000 ( 0.021 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.023 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.025 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.023 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 0.028 ( 0.002 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8ba110, cpu: -1, group_fd: -1 ) ... 0.030 ( ): probe:x86_pmu_hw_config:(ffffffff9c0065e1)) x86_pmu_hw_config ([kernel.kallsyms]) hsw_hw_config ([kernel.kallsyms]) x86_pmu_event_init ([kernel.kallsyms]) perf_try_init_event ([kernel.kallsyms]) perf_event_alloc ([kernel.kallsyms]) SYSC_perf_event_open ([kernel.kallsyms]) sys_perf_event_open ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) perf_evsel__new_cycles (/home/acme/bin/perf) perf_evlist__add_default (/home/acme/bin/perf) cmd_record (/home/acme/bin/perf) run_builtin (/home/acme/bin/perf) handle_internal_command (/home/acme/bin/perf) 0.028 ( 0.004 ms): perf/4150 ... [continued]: perf_event_open()) = -1 EINVAL Invalid argument 41.018 ( 0.012 ms): perf/4150 perf_event_open(attr_uptr: 0x7ffebc8b5dd0, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.065 ( 0.011 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.080 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c7db78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 41.103 ( 0.010 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4 41.115 ( 0.006 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5 41.122 ( 0.004 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6 41.128 ( 0.008 ms): perf/4150 perf_event_open(attr_uptr: 0x3c4e748, pid: 4151 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.017 MB perf.data (2 samples) ] # I.e. that return -EINVAL in x86_pmu_hw_config() is hit three times. So fix it by just setting attr.sample_period Now, after this patch: # perf trace --max-stack=2 -e perf_event_open,probe:x86_pmu_hw_config* perf record usleep 1 [ perf record: Woken up 1 times to write data ] 0.000 ( 0.017 ms): perf/8469 perf_event_open(attr_uptr: 0x7ffe36c27d10, pid: -1, cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_event_open_cloexec_flag (/home/acme/bin/perf) 0.050 ( 0.031 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evlist__config (/home/acme/bin/perf) 0.092 ( 0.040 ms): perf/8469 perf_event_open(attr_uptr: 0x24ebb78, pid: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evlist__config (/home/acme/bin/perf) 0.143 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, cpu: -1, group_fd: -1 ) = 4 syscall (/usr/lib64/libc-2.24.so) perf_event_attr__set_max_precise_ip (/home/acme/bin/perf) 0.161 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), group_fd: -1, flags: FD_CLOEXEC) = 4 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.171 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 1, group_fd: -1, flags: FD_CLOEXEC) = 5 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.180 ( 0.007 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 2, group_fd: -1, flags: FD_CLOEXEC) = 6 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) 0.190 ( 0.005 ms): perf/8469 perf_event_open(attr_uptr: 0x24bc748, pid: 8470 (perf), cpu: 3, group_fd: -1, flags: FD_CLOEXEC) = 8 syscall (/usr/lib64/libc-2.24.so) perf_evsel__open (/home/acme/bin/perf) [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ] # The probe one called from perf_event_attr__set_max_precise_ip() works the first time, with attr.precise_ip = 3, wit hthe next ones being the per cpu ones for the cycles:ppp event. And here is the text from a report and alternative proposed patch by Thomas-Mich Richter: --- On s390 the counter and sampling facility do not support a precise IP skid level and sometimes returns EOPNOTSUPP when structure member precise_ip in struct perf_event_attr is not set to zero. On s390 commnd 'perf record -- true' fails with error EOPNOTSUPP. This happens only when no events are specified on command line. The functions called are ... --> perf_evlist__add_default --> perf_evsel__new_cycles --> perf_event_attr__set_max_precise_ip The last function determines the value of structure member precise_ip by invoking the perf_event_open() system call and checking the return code. The first successful open is the value for precise_ip. However the value is determined without setting member sample_period and indicates no sampling. On s390 the counter facility and sampling facility are different. The above procedure determines a precise_ip value of 3 using the counter facility. Later it uses the sampling facility with a value of 3 and fails with EOPNOTSUPP. --- v2: Older compilers (e.g. gcc 4.4.7) don't support referencing members of unnamed union members in the container struct initialization, so move from: struct perf_event_attr attr = { ... .sample_period = 1, }; to right after it as: struct perf_event_attr attr = { ... }; attr.sample_period = 1; v3: We need to reset .sample_period to 0 to let the users of perf_evsel__new_cycles() to properly setup attr.sample_period or attr.sample_freq. Reported by Ingo Molnar. Reported-and-Acked-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Acked-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 18e7a45af91a ("perf/x86: Reject non sampling events with precise_ip") Link: http://lkml.kernel.org/n/tip-yv6nnkl7tzqocrm0hl3x7vf1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf symbols: Kill dso__build_id_is_kmod()Namhyung Kim3-50/+0
The commit e7ee40475760 ("perf symbols: Fix symbols searching for module in buildid-cache") added the function to check kernel modules reside in the build-id cache. This was because there's no way to identify a DSO which is actually a kernel module. So it searched linkname of the file and find ".ko" suffix. But this does not work for compressed kernel modules and now such DSOs hCcave correct symtab_type now. So no need to check it anymore. This patch essentially reverts the commit. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf symbols: Keep DSO->symtab_type after decompressNamhyung Kim1-0/+2
The symsrc__init() overwrites dso->symtab_type as symsrc->type in dso__load_sym(). But for compressed kernel modules in the build-id cache, it should have original symtab type to be decompressed as needed. This fixes perf annotate to show disassembly of the function properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Consolidate error path in __open_dso()Namhyung Kim1-11/+8
On failure, it should free the 'name', so clean up the error path using goto. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Decompress kernel module when reading DSO dataNamhyung Kim1-0/+16
Currently perf decompresses kernel modules when loading the symbol table but it missed to do it when reading raw data. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf annotate: Use dso__decompress_kmodule_path()Namhyung Kim1-24/+3
Convert open-coded decompress routine to use the function. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Introduce dso__decompress_kmodule_{fd,path}Namhyung Kim3-35/+65
Move decompress_kmodule() to util/dso.c and split it into two functions returning fd and (decompressed) file path. The existing user only wants the fd version but the path version will be used soon. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf tools: Fix a memory leak in __open_dso()Namhyung Kim1-1/+3
The 'name' variable should be freed on the error path. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170608073109.30699-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf annotate: Fix symbolic link of build-id cacheNamhyung Kim1-1/+9
The commit 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic link of a build-id file") changed to use dirname to follow the symlink. But it only considers new-style build-id cache names so old names fail on readlink() and force to use system path which might not available. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-team@lge.com Fixes: 6ebd2547dd24 ("perf annotate: Fix a bug following symbolic link of a build-id file") Link: http://lkml.kernel.org/r/20170608073109.30699-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-08perf script: Fix outdated comment for perf-trace-pythonSeongJae Park1-1/+1
Script generated by the '--gen-script' option contains an outdated comment. It mentions a 'perf-trace-python' document while it has been renamed to 'perf-script-python'. Fix it. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 133dc4c39c57 ("perf: Rename 'perf trace' to 'perf script'") Link: http://lkml.kernel.org/r/20170530111827.21732-2-sj38.park@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-05perf report: Ensure the perf DSO mapping matches what libdw seesMilian Wolff1-0/+8
In some situations the libdw unwinder stopped working properly. I.e. with libunwind we see: ~~~~~ heaptrack_gui 2228 135073.400112: 641314 cycles: e8ed _dl_fixup (/usr/lib/ld-2.25.so) 15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so) ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0) 608f3 _GLOBAL__sub_I_kdynamicjobtracker.cpp (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0) f199 call_init.part.0 (/usr/lib/ld-2.25.so) f2a5 _dl_init (/usr/lib/ld-2.25.so) db9 _dl_start_user (/usr/lib/ld-2.25.so) ~~~~~ But with libdw and without this patch this sample is not properly unwound: ~~~~~ heaptrack_gui 2228 135073.400112: 641314 cycles: e8ed _dl_fixup (/usr/lib/ld-2.25.so) 15f06 _dl_runtime_resolve_sse_vex (/usr/lib/ld-2.25.so) ed94c KDynamicJobTracker::KDynamicJobTracker (/home/milian/projects/compiled/kf5/lib64/libKF5KIOWidgets.so.5.35.0) ~~~~~ Debug output showed me that libdw found a module for the last frame address, but it thinks it belongs to /usr/lib/ld-2.25.so. This patch double-checks what libdw sees and what perf knows. If the mappings mismatch, we now report the elf known to perf. This fixes the situation above, and the libdw unwinder produces the same stack as libunwind. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20170602143753.16907-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-05perf report: Include partial stacks unwound with libdwMilian Wolff1-1/+1
So far the whole stack was thrown away when any error occurred before the maximum stack depth was unwound. This is actually a very common scenario though. The stacks that got unwound so far are still interesting. This removes a large chunk of differences when comparing perf script output for libunwind and libdw perf unwinding. E.g. with libunwind: ~~~~~ heaptrack_gui 2228 135073.388524: 479408 cycles: ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms]) ffffffff81181662 perf_event_mmap ([kernel.kallsyms]) ffffffff811cf5ed mmap_region ([kernel.kallsyms]) ffffffff811cfe6b do_mmap ([kernel.kallsyms]) ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms]) ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms]) ffffffff81033acb sys_mmap ([kernel.kallsyms]) ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms]) 192ca mmap64 (/usr/lib/ld-2.25.so) 59a9 _dl_map_object_from_fd (/usr/lib/ld-2.25.so) 83d0 _dl_map_object (/usr/lib/ld-2.25.so) cda1 openaux (/usr/lib/ld-2.25.so) 1834f _dl_catch_error (/usr/lib/ld-2.25.so) cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so) 3481 dl_main (/usr/lib/ld-2.25.so) 17387 _dl_sysdep_start (/usr/lib/ld-2.25.so) 4d37 _dl_start (/usr/lib/ld-2.25.so) d87 _start (/usr/lib/ld-2.25.so) heaptrack_gui 2228 135073.388677: 611329 cycles: 1a3e0 strcmp (/usr/lib/ld-2.25.so) 82b2 _dl_map_object (/usr/lib/ld-2.25.so) cda1 openaux (/usr/lib/ld-2.25.so) 1834f _dl_catch_error (/usr/lib/ld-2.25.so) cfe2 _dl_map_object_deps (/usr/lib/ld-2.25.so) 3481 dl_main (/usr/lib/ld-2.25.so) 17387 _dl_sysdep_start (/usr/lib/ld-2.25.so) 4d37 _dl_start (/usr/lib/ld-2.25.so) d87 _start (/usr/lib/ld-2.25.so) ~~~~~ With libdw without this patch: ~~~~~ heaptrack_gui 2228 135073.388524: 479408 cycles: ffffffff811749ed perf_iterate_ctx ([kernel.kallsyms]) ffffffff81181662 perf_event_mmap ([kernel.kallsyms]) ffffffff811cf5ed mmap_region ([kernel.kallsyms]) ffffffff811cfe6b do_mmap ([kernel.kallsyms]) ffffffff811b0dca vm_mmap_pgoff ([kernel.kallsyms]) ffffffff811cdb0c sys_mmap_pgoff ([kernel.kallsyms]) ffffffff81033acb sys_mmap ([kernel.kallsyms]) ffffffff81631d37 entry_SYSCALL_64_fastpath ([kernel.kallsyms]) heaptrack_gui 2228 135073.388677: 611329 cycles: ~~~~~ With this patch applied, the libdw unwinder will produce the same output as the libunwind unwinder. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/20170601210021.20046-1-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-05perf symbols: Use correct filename for compressed modules in build-id cacheNamhyung Kim1-4/+1
The decompress_kmodule() decompresses kernel modules in order to load symbols from it. In the DSO_BINARY_TYPE__BUILD_ID_CACHE case, it needs the full file path to extract the file extension to determine the decompression method. But overwriting 'name' will fail the decompression since it might point to a non-existing old file. Instead, use dso->long_name for having the correct extension and use the real filename to decompress. In the DSO_BINARY_TYPE__SYSTEM_PATH_KMODULE_COMP case, both names should be the same. This allows resolving symbols in the old modules. Before: $ perf report -i perf.data.old | grep scsi_mod 0.00% cc1 [scsi_mod] [k] 0x0000000000004aa6 0.00% as [scsi_mod] [k] 0x00000000000099e1 0.00% cc1 [scsi_mod] [k] 0x0000000000009830 0.00% cc1 [scsi_mod] [k] 0x0000000000001b8f After: 0.00% cc1 [scsi_mod] [k] scsi_handle_queue_ramp_up 0.00% as [scsi_mod] [k] scsi_sg_alloc 0.00% cc1 [scsi_mod] [k] scsi_setup_cmnd 0.00% cc1 [scsi_mod] [k] scsi_get_command Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170531120105.21731-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-05perf symbols: Set module info when build-id event foundNamhyung Kim4-11/+20
Like machine__findnew_module_dso(), it should set necessary info for kernel modules to find symbol info from the file. Factor out dso__set_module_info() to do it. This is needed for dso__needs_decompress() to detect such DSOs. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170531120105.21731-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-05perf header: Set proper module name when build-id event foundNamhyung Kim1-2/+10
When perf processes build-id event, it creates DSOs with the build-id. But it didn't set the module short name (like '[module-name]') so when processing a kernel mmap event of the module, it cannot found the DSO as it only checks the short names. That leads for perf to create a same DSO without the build-id info and it'll lookup the system path even if the DSO is already in the build-id cache. After kernel was updated, perf cannot find the DSO and cannot show symbols in it anymore. You can see this if you have an old data file (w/ old kernel version): $ perf report -i perf.data.old -v |& grep scsi_mod build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz : cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 Failed to open /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz, continuing without symbols ... The second message didn't show the build-id. With this patch: $ perf report -i perf.data.old -v |& grep scsi_mod build id event received for /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz: cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 /lib/modules/3.19.2-1-ARCH/kernel/drivers/scsi/scsi_mod.ko.gz with build id cafe1ce6ca13a98a5d9ed3425cde249e57a27fc1 not found, continuing without symbols ... Now it shows the build-id but still cannot load the symbol table. This is a different problem which will be fixed in the next patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170531120105.21731-1-namhyung@kernel.org [ Fix the build on older compilers (debian <= 8, fedora <= 21, etc) wrt kmod_path var init ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-06-01perf annotate: Fix branch instruction with multiple operandsKim Phillips1-3/+30
'perf annotate' is dropping the cr* fields from branch instructions. Fix it by adding support to display branch instructions having multiple operands. Power Arch objdump of int_sqrt: 20.36 | c0000000004d2694: subf r10,r10,r3 | c0000000004d2698: v bgt cr6,c0000000004d26a0 <int_sqrt+0x40> 1.82 | c0000000004d269c: mr r3,r10 29.18 | c0000000004d26a0: mr r10,r8 | c0000000004d26a4: v bgt cr7,c0000000004d26ac <int_sqrt+0x4c> | c0000000004d26a8: mr r10,r7 Power Arch Before Patch: 20.36 | subf r10,r10,r3 | v bgt 40 1.82 | mr r3,r10 29.18 | 40: mr r10,r8 | v bgt 4c | mr r10,r7 Power Arch After patch: 20.36 | subf r10,r10,r3 | v bgt cr6,40 1.82 | mr r3,r10 29.18 | 40: mr r10,r8 | v bgt cr7,4c | mr r10,r7 Also support AArch64 conditional branch instructions, which can have up to three operands: Aarch64 Non-simplified (raw objdump) view: │ffff0000083cd11c: ↑ cbz w0, ffff0000083cd100 <security_fil▒ ... 4.44 │ffff000│083cd134: ↓ tbnz w0, #26, ffff0000083cd190 <securit▒ ... 1.37 │ffff000│083cd144: ↓ tbnz w22, #5, ffff0000083cd1a4 <securit▒ │ffff000│083cd148: mov w19, #0x20000 //▒ 1.02 │ffff000│083cd14c: ↓ tbz w22, #2, ffff0000083cd1ac <securit▒ ... 0.68 │ffff000└──3cd16c: ↑ cbnz w0, ffff0000083cd120 <security_fil▒ Aarch64 Simplified, before this patch: │ ↑ cbz 40 ... 4.44 │ │↓ tbnz w0, #26, ffff0000083cd190 <security_file_permiss▒ ... 1.37 │ │↓ tbnz w22, #5, ffff0000083cd1a4 <security_file_permiss▒ │ │ mov w19, #0x20000 // #131072 1.02 │ │↓ tbz w22, #2, ffff0000083cd1ac <security_file_permiss▒ ... 0.68 │ └──cbnz 60 the cbz operand is missing, and the tbz doesn't get simplified processing at all because the parsing function failed to match an address. Aarch64 Simplified, After this patch applied: │ ↑ cbz w0, 40 ... 4.44 │ │↓ tbnz w0, #26, d0 ... 1.37 │ │↓ tbnz w22, #5, e4 │ │ mov w19, #0x20000 // #131072 1.02 │ │↓ tbz w22, #2, ec ... 0.68 │ └──cbnz w0, 60 Originally-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reported-by: Anton Blanchard <anton@samba.org> Reported-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/20170601092959.f60d98912e8a1b66fd1e4c0e@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-05-27perf annotate: Fix failure when filename has special charsRavi Bangoria1-1/+1
When filename contains special chars, perf annotate fails with an error: $ perf annotate --vmlinux ./vmlinux\(test\) --stdio native_safe_halt sh: -c: line 0: syntax error near unexpected token `(' sh: -c: line 0: `objdump --start-address=0xffffffff8184e840 --stop-address=0xffffffff8184e848 -l -d --no-show-raw -S -C ./vmlinux(test) 2>/dev/null|grep -v ./vmlinux(test):|expand' Fix it by surrounding filename in double quotes. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Adam Stylinski <adam.stylinski@etegent.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Link: http://lkml.kernel.org/r/20170505101417.2117-1-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-05-24perf report: Do not drop last inlined frameMilian Wolff1-4/+14
The very last inlined frame, i.e. the one furthest away from the non-inlined frame, was silently dropped. This is apparent when comparing the output of `perf script` and `addr2line`: ~~~~~~ $ perf script --inline ... a.out 26722 80836.309329: 72425 cycles: 21561 __hypot_finite (/usr/lib/libm-2.25.so) ace3 hypot (/usr/lib/libm-2.25.so) a4a main (a.out) std::abs<double> std::_Norm_helper<true>::_S_do_it<double> std::norm<double> main 20510 __libc_start_main (/usr/lib/libc-2.25.so) bd9 _start (a.out) $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt 0x0000000000000a4a std::__complex_abs(doublecomplex ) /usr/include/c++/6.3.1/complex:589 double std::abs<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:597 double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:654 double std::norm<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:664 main /tmp/inlining.cpp:14 ~~~~~ Note how `std::__complex_abs` is missing from the `perf script` output. This is similarly showing up in `perf report`. The patch here fixes this issue, and the output becomes: ~~~~~ a.out 26722 80836.309329: 72425 cycles: 21561 __hypot_finite (/usr/lib/libm-2.25.so) ace3 hypot (/usr/lib/libm-2.25.so) a4a main (a.out) std::__complex_abs std::abs<double> std::_Norm_helper<true>::_S_do_it<double> std::norm<double> main 20510 __libc_start_main (/usr/lib/libc-2.25.so) bd9 _start (a.out) ~~~~~ Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>