summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2023-05-17perf stat: Separate bperf from bpf_profilerDmitrii Dolgov2-2/+7
[ Upstream commit ecc68ee216c6c5b2f84915e1441adf436f1b019b ] It seems that perf stat -b <prog id> doesn't produce any results: $ perf stat -e cycles -b 4 -I 10000 -vvv Control descriptor is not initialized cycles: 0 0 0 time counts unit events 10.007641640 <not supported> cycles Looks like this happens because fentry/fexit progs are getting loaded, but the corresponding perf event is not enabled and not added into the events bpf map. I think there is some mixing up between two type of bpf support, one for bperf and one for bpf_profiler. Both are identified via evsel__is_bpf, based on which perf events are enabled, but for the latter (bpf_profiler) a perf event is required. Using evsel__is_bperf to check only bperf produces expected results: $ perf stat -e cycles -b 4 -I 10000 -vvv Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: size 136 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ [...perf_event_attr for other CPUs...] ------------------------------------------------------------ cycles: 309426 169009 169009 time counts unit events 10.010091271 309426 cycles The final numbers correspond (at least in the level of magnitude) to the same metric obtained via bpftool. Fixes: 112cb56164bc2108 ("perf stat: Introduce config stat.bpf-counter-events") Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Tested-by: Song Liu <song@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230412182316.11628-1-9erthalion6@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf tracepoint: Fix memory leak in is_valid_tracepoint()Yang Jihong1-0/+1
[ Upstream commit 9b86c49710eec7b4fbb78a0232b2dd0972a2b576 ] When is_valid_tracepoint() returns 1, need to call put_events_file() to free `dir_path`. Fixes: 25a7d914274de386 ("perf parse-events: Use get/put_events_file()") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230421025953.173826-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf symbols: Fix return incorrect build_id size in elf_read_build_id()Yang Jihong1-1/+1
[ Upstream commit 1511e4696acb715a4fe48be89e1e691daec91c0e ] In elf_read_build_id(), if gnu build_id is found, should return the size of the actually copied data. If descsz is greater thanBuild_ID_SIZE, write_buildid data access may occur. Fixes: be96ea8ffa788dcc ("perf symbols: Fix issue with binaries using 16-bytes buildids (v2)") Reported-by: Will Ochowicz <Will.Ochowicz@genusplc.com> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Tested-by: Will Ochowicz <Will.Ochowicz@genusplc.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/lkml/CWLP265MB49702F7BA3D6D8F13E4B1A719C649@CWLP265MB4970.GBRP265.PROD.OUTLOOK.COM/T/ Link: https://lore.kernel.org/r/20230427012841.231729-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf cs-etm: Fix timeless decode mode detectionJames Clark1-12/+18
[ Upstream commit 449067f3fc9f340da54e383738286881e6634d0b ] In this context, timeless refers to the trace data rather than the perf event data. But when detecting whether there are timestamps in the trace data or not, the presence of a timestamp flag on any perf event is used. Since commit f42c0ce573df ("perf record: Always get text_poke events with --kcore option") timestamps were added to a tracking event when --kcore is used which breaks this detection mechanism. Fix it by detecting if trace timestamps exist by looking at the ETM config flags. This would have always been a more accurate way of doing it anyway. This fixes the following error message when using --kcore with Coresight: $ perf record --kcore -e cs_etm// --per-thread $ perf report The perf.data/data data has no samples! Fixes: f42c0ce573df79d1 ("perf record: Always get text_poke events with --kcore option") Reported-by: Yang Shi <shy828301@gmail.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: denik@google.com Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/lkml/CAHbLzkrJQTrYBtPkf=jf3OpQ-yBcJe7XkvQstX9j2frz4WF-SQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20230424134748.228137-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf map: Delete two variable initialisations before null pointer checks in ↵Markus Elfring1-2/+1
sort__sym_from_cmp() [ Upstream commit c160118a90d4acf335993d8d59b02ae2147a524e ] Addresses of two data structure members were determined before corresponding null pointer checks in the implementation of the function “sort__sym_from_cmp”. Thus avoid the risk for undefined behaviour by removing extra initialisations for the local variables “from_l” and “from_r” (also because they were already reassigned with the same value behind this pointer check). This issue was detected by using the Coccinelle software. Fixes: 1b9e97a2a95e4941 ("perf tools: Fix report -F symbol_from for data without branch info") Signed-off-by: <elfring@users.sourceforge.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/cocci/54a21fea-64e3-de67-82ef-d61b90ffad05@web.de/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf pmu: zfree() expects a pointer to a pointer to zero it after freeing ↵Arnaldo Carvalho de Melo1-1/+1
its contents [ Upstream commit 57f14b5ae1a97537f2abd2828ee7212cada7036e ] An audit showed just this one problem with zfree(), fix it. Fixes: 9fbc61f832ebf432 ("perf pmu: Add support for PMU capabilities") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf vendor events power9: Remove UTF-8 characters from JSON filesKajol Jain2-3/+3
[ Upstream commit 5d9df8731c0941f3add30f96745a62586a0c9d52 ] Commit 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") added and updated power9 PMU JSON events. However some of the JSON events which are part of other.json and pipeline.json files, contains UTF-8 characters in their brief description. Having UTF-8 character could breaks the perf build on some distros. Fix this issue by removing the UTF-8 characters from other.json and pipeline.json files. Result without the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=utf-8 pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Result with the fix: [command]# file -i pmu-events/arch/powerpc/power9/* pmu-events/arch/powerpc/power9/cache.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/floating-point.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/frontend.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/marked.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/memory.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/nest_metrics.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/other.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pipeline.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/pmc.json: application/json; charset=us-ascii pmu-events/arch/powerpc/power9/translation.json: application/json; charset=us-ascii [command]# Fixes: 3c22ba5243040c13 ("perf vendor events powerpc: Update POWER9 events") Reported-by: Arnaldo Carvalho de Melo <acme@kernel.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/lkml/ZBxP77deq7ikTxwG@kernel.org/ Link: https://lore.kernel.org/r/20230328112908.113158-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf ftrace: Make system wide the default target for latency subcommandYang Jihong1-2/+4
[ Upstream commit ecd4960d908e27e40b63a7046df2f942c148c6f6 ] If no target is specified for 'latency' subcommand, the execution fails because - 1 (invalid value) is written to set_ftrace_pid tracefs file. Make system wide the default target, which is the same as the default behavior of 'trace' subcommand. Before the fix: # perf ftrace latency -T schedule failed to set ftrace pid After the fix: # perf ftrace latency -T schedule ^C# DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 us | 0 | | 2 - 4 us | 0 | | 4 - 8 us | 2828 | #### | 8 - 16 us | 23953 | ######################################## | 16 - 32 us | 408 | | 32 - 64 us | 318 | | 64 - 128 us | 4 | | 128 - 256 us | 3 | | 256 - 512 us | 0 | | 512 - 1024 us | 1 | | 1 - 2 ms | 4 | | 2 - 4 ms | 0 | | 4 - 8 ms | 0 | | 8 - 16 ms | 0 | | 16 - 32 ms | 0 | | 32 - 64 ms | 0 | | 64 - 128 ms | 0 | | 128 - 256 ms | 4 | | 256 - 512 ms | 2 | | 512 - 1024 ms | 0 | | 1 - ... s | 0 | | Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230324032702.109964-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf tests record_offcpu.sh: Fix redirection of stderr to stdinPatrice Duroux1-1/+1
[ Upstream commit 9835b742ac3ee16dee361e7ccda8022f99d1cd94 ] It's not 2&>1, the correct is 2>&1 Fixes: ade1d0307b2fb3d9 ("perf offcpu: Update offcpu test for child process") Signed-off-by: Patrice Duroux <patrice.duroux@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230303193058.21274-1-patrice.duroux@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf vendor events s390: Remove UTF-8 characters from JSON fileThomas Richter1-5/+5
[ Upstream commit eb2feb68cb7d404288493c41480843bc9f404789 ] Commit 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390") contains the verbal description for z16 extended counter set. However some entries of the public description contain UTF-8 characters which breaks the build on some distros. Fix this and remove the UTF-8 characters. Fixes: 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/ZBwkl77/I31AQk12@osiris Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf hist: Improve srcfile sort key performance (really)Namhyung Kim1-6/+1
[ Upstream commit 6094c7744bb0563e833e81d8df8513f9a4e7a257 ] The earlier commit f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") updated the srcfile logic but missed to change the ->cmp() callback which is called for every sample. It should use the same logic like in the srcline to speed up the processing because it'd return the same information repeatedly for the same address. The real processing will be done in sort__srcfile_collapse(). Fixes: f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230323025005.191239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf script: Fix Python support when no libtraceeventAdrian Hunter9-30/+72
[ Upstream commit 80c3a7d9f20401169283b5670dbb8d7ac07a1d55 ] Python scripting can be used without libtraceevent. In particular, scripting for Intel PT does not use tracepoints, and so does not need libtraceevent support. Alter the build and employ conditional compilation to allow Python scripting without libtraceevent. Example: Before: $ ldd `which perf` | grep -i python $ ldd `which perf` | grep -i libtraceevent $ perf record -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data ] $ perf script intel-pt-events.py |& head -3 Error: Couldn't find script `intel-pt-events.py' See perf script -l for available scripts. After: $ ldd `which perf` | grep -i python libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000) $ ldd `which perf` | grep -i libtraceevent $ perf script intel-pt-events.py | head Intel PT Branch Trace, Power Events, Event Trace and PTWRITE Switch In 8021/8021 [000] 11234.097713404 0/0 perf-exec 8021/8021 [000] 11234.098041726 psb offset: 0x0 0 [unknown] ([unknown]) perf-exec 8021/8021 [000] 11234.098041726 cbr 45 freq: 4505 MHz (161%) 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098082170 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098082379 branches:uH tr end 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098083629 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083629 branches:uH call 7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083837 branches:uH tr end 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) IPC: 0.01 (9/938) uname 8021/8021 [000] 11234.098084670 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf scripts intel-pt-events.py: Fix IPC output for Python 2Roman Lozko1-1/+1
[ Upstream commit 1f64cfdebfe0494264271e8d7a3a47faf5f58ec7 ] Integers are not converted to floats during division in Python 2 which results in incorrect IPC values. Fix by switching to new division behavior. Fixes: a483e64c0b62e93a ("perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace") Signed-off-by: Roman Lozko <lozko.roma@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20230310150445.2925841-1-lozko.roma@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf build: Support python/perf.so testingIan Rogers2-4/+8
[ Upstream commit 7a9b223ca0761a7c7c72e569b86b84a907aa0f92 ] Add a build target to echo the python/perf.so's name from Makefile.perf. Use it in tests/make so the correct target is built and tested for. Fixes: caec54705adb73b0 ("perf build: Fix python/perf.so library's name") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Pavithra Gurushankar <gpavithrasha@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Roberto Sassu <roberto.sassu@huawei.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17perf record: Fix "read LOST count failed" msg with sample readKan Liang1-1/+1
[ Upstream commit 07d85ba9d04e1ebd282f656a29ddf08c5b7b32a2 ] Hundreds of "read LOST count failed" error messages may be displayed, when the below command is launched. perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a According to the commit 89e3106fa25fb1b6 ("libperf: Handle read format in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for the leader. However, the record__read_lost_samples() goes through every entry of an evlist, which includes both leader and member. The member event errors out and triggers the error message. Since there may be hundreds of CPUs on a server, the message will be printed hundreds of times, which is very annoying. The message itself is correct, but the pr_err is a overkill. Other error messages in the record__read_lost_samples() are all pr_debug. To make the output format consistent, change the pr_err("read LOST count failed\n"); to pr_debug("read LOST count failed\n");. User can still get the message via -v option. Fixes: e3a23261ad06d598 ("perf record: Read and inject LOST_SAMPLES events") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11perf intel-pt: Fix CYC timestamps after standalone CBRAdrian Hunter1-0/+2
commit 430635a0ef1ce958b7b4311f172694ece2c692b8 upstream. After a standalone CBR (not associated with TSC), update the cycles reference timestamp and reset the cycle count, so that CYC timestamps are calculated relative to that point with the new frequency. Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11perf auxtrace: Fix address filter entire kernel sizeAdrian Hunter1-1/+4
commit 1f9f33ccf0320be21703d9195dd2b36a1c9a07cb upstream. kallsyms is not completely in address order. In find_entire_kern_cb(), calculate the kernel end from the maximum address not the last symbol. Example: Before: $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1 ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530 [bpf] $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1 ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm] $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2ceba000 After: $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2e3e2000 Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230403154831.8651-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17perf stat: Fix counting when initial delay configuredChangbin Du4-16/+18
[ Upstream commit 25f69c69bc3ca8c781a94473f28d443d745768e3 ] When creating counters with initial delay configured, the enable_on_exec field is not set. So we need to enable the counters later. The problem is, when a workload is specified the target__none() is true. So we also need to check stat_config.initial_delay. In this change, we add a new field 'initial_delay' for struct target which could be shared by other subcommands. And define target__enable_on_exec() which returns whether enable_on_exec should be set on normal cases. Before this fix the event is not counted: $ ./perf stat -e instructions -D 100 sleep 2 Events disabled Events enabled Performance counter stats for 'sleep 2': <not counted> instructions 1.901661124 seconds time elapsed 0.001602000 seconds user 0.000000000 seconds sys After fix it works: $ ./perf stat -e instructions -D 100 sleep 2 Events disabled Events enabled Performance counter stats for 'sleep 2': 404,214 instructions 1.901743475 seconds time elapsed 0.001617000 seconds user 0.000000000 seconds sys Fixes: c587e77e100fa40e ("perf stat: Do not delay the workload with --delay") Signed-off-by: Changbin Du <changbin.du@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hui Wang <hw.huiwang@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230302031146.2801588-2-changbin.du@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17perf inject: Fix --buildid-all not to eat up MMAP2Namhyung Kim1-0/+1
commit ce9f1c05d2edfa6cdf2c1a510495d333e11810a8 upstream. When MMAP2 has the PERF_RECORD_MISC_MMAP_BUILD_ID flag, it means the record already has the build-id info. So it marks the DSO as hit, to skip if the same DSO is not processed if it happens to miss the build-id later. But it missed to copy the MMAP2 record itself so it'd fail to symbolize samples for those regions. For example, the following generates 249 MMAP2 events. $ perf record --buildid-mmap -o- true | perf report --stat -i- | grep MMAP2 MMAP2 events: 249 (86.8%) Adding perf inject should not change the number of events like this $ perf record --buildid-mmap -o- true | perf inject -b | \ > perf report --stat -i- | grep MMAP2 MMAP2 events: 249 (86.5%) But when --buildid-all is used, it eats most of the MMAP2 events. $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \ > perf report --stat -i- | grep MMAP2 MMAP2 events: 1 ( 2.5%) With this patch, it shows the original number now. $ perf record --buildid-mmap -o- true | perf inject -b --buildid-all | \ > perf report --stat -i- | grep MMAP2 MMAP2 events: 249 (86.5%) Committer testing: Before: $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (36.2%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (36.2%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2 MMAP2 events: 2 ( 1.9%) $ After: $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (29.3%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (34.3%) $ perf record --buildid-mmap -o- perf stat --null sleep 1 2> /dev/null | perf inject -b --buildid-all | perf report --stat -i- | grep MMAP2 MMAP2 events: 58 (38.4%) $ Fixes: f7fc0d1c915a74ff ("perf inject: Do not inject BUILD_ID record if MMAP2 has it") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230223070155.54251-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10perf tests stat_all_metrics: Change true workload to sleep workload for ↵Kajol Jain1-1/+1
system wide check [ Upstream commit f9fa0778ee7349a9aa3d2ea10e9f2ab843a0b44e ] Testcase stat_all_metrics.sh fails in powerpc: 98: perf all metrics test : FAILED! Logs with verbose: [command]# ./perf test 98 -vv 98: perf all metrics test : --- start --- test child forked, pid 13262 Testing BRU_STALL_CPI Testing COMPLETION_STALL_CPI ---- Testing TOTAL_LOCAL_NODE_PUMPS_P23 Metric 'TOTAL_LOCAL_NODE_PUMPS_P23' not printed in: Error: Invalid event (hv_24x7/PM_PB_LNS_PUMP23,chip=3/) in per-thread mode, enable system wide with '-a'. Testing TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01 Metric 'TOTAL_LOCAL_NODE_PUMPS_RETRIES_P01' not printed in: Error: Invalid event (hv_24x7/PM_PB_RTY_LNS_PUMP01,chip=3/) in per-thread mode, enable system wide with '-a'. ---- Based on above logs, we could see some of the hv-24x7 metric events fails, and logs suggest to run the metric event with -a option. This change happened after the commit a4b8cfcabb1d90ec ("perf stat: Delay metric parsing"), which delayed the metric parsing phase and now before metric parsing phase perf tool identifies, whether target is system-wide or not. With this change, perf_event_open will fails with workload monitoring for uncore events as expected. The perf all metric test case fails as some of the hv-24x7 metric events may need bigger workload with system wide monitoring to get the data. Fix this issue by changing current system wide check from true workload to sleep 0.01 workload. Result with the patch changes in powerpc: 98: perf all metrics test : Ok Fixes: a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230215093827.124921-1-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf record: Fix segfault with --overwrite and --max-sizeYang Jihong1-10/+6
[ Upstream commit 91621be65d6812cd74b2ea09573ff9ee0cbf5666 ] When --overwrite and --max-size options of perf record are used together, a segmentation fault occurs. The following is an example: # perf record -e sched:sched* --overwrite --max-size 1K -a -- sleep 1 [ perf record: Woken up 1 times to write data ] perf: Segmentation fault Obtained 12 stack frames. ./perf/perf(+0x197673) [0x55f99710b673] /lib/x86_64-linux-gnu/libc.so.6(+0x3ef0f) [0x7fa45f3cff0f] ./perf/perf(+0x8eb40) [0x55f997002b40] ./perf/perf(+0x1f6882) [0x55f99716a882] ./perf/perf(+0x794c2) [0x55f996fed4c2] ./perf/perf(+0x7b7c7) [0x55f996fef7c7] ./perf/perf(+0x9074b) [0x55f99700474b] ./perf/perf(+0x12e23c) [0x55f9970a223c] ./perf/perf(+0x12e54a) [0x55f9970a254a] ./perf/perf(+0x7db60) [0x55f996ff1b60] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6) [0x7fa45f3b2c86] ./perf/perf(+0x7dfe9) [0x55f996ff1fe9] Segmentation fault (core dumped) backtrace of the core file is as follows: (gdb) bt #0 record__bytes_written (rec=0x55f99755a200 <record>) at builtin-record.c:234 #1 record__output_max_size_exceeded (rec=0x55f99755a200 <record>) at builtin-record.c:242 #2 record__write (map=0x0, size=12816, bf=0x55f9978da2e0, rec=0x55f99755a200 <record>) at builtin-record.c:263 #3 process_synthesized_event (tool=tool@entry=0x55f99755a200 <record>, event=event@entry=0x55f9978da2e0, sample=sample@entry=0x0, machine=machine@entry=0x55f997893658) at builtin-record.c:618 #4 0x000055f99716a883 in __perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=0x55f9978928b0, machine=machine@entry=0x55f997893658, from=from@entry=0) at util/synthetic-events.c:1895 #5 0x000055f99716a91f in perf_event__synthesize_id_index (tool=tool@entry=0x55f99755a200 <record>, process=process@entry=0x55f997002aa0 <process_synthesized_event>, evlist=<optimized out>, machine=machine@entry=0x55f997893658) at util/synthetic-events.c:1905 #6 0x000055f996fed4c3 in record__synthesize (tail=tail@entry=true, rec=0x55f99755a200 <record>) at builtin-record.c:1997 #7 0x000055f996fef7c8 in __cmd_record (argc=argc@entry=2, argv=argv@entry=0x7ffc67551260, rec=0x55f99755a200 <record>) at builtin-record.c:2802 #8 0x000055f99700474c in cmd_record (argc=<optimized out>, argv=0x7ffc67551260) at builtin-record.c:4258 #9 0x000055f9970a223d in run_builtin (p=0x55f997564d88 <commands+264>, argc=10, argv=0x7ffc67551260) at perf.c:330 #10 0x000055f9970a254b in handle_internal_command (argc=10, argv=0x7ffc67551260) at perf.c:384 #11 0x000055f996ff1b61 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:428 #12 main (argc=<optimized out>, argv=0x7ffc67551260) at perf.c:562 The reason is that record__bytes_written accesses the freed memory rec->thread_data, The process is as follows: __cmd_record -> record__free_thread_data -> zfree(&rec->thread_data) // free rec->thread_data -> record__synthesize -> perf_event__synthesize_id_index -> process_synthesized_event -> record__write -> record__bytes_written // access rec->thread_data We add a member variable "thread_bytes_written" in the struct "record" to save the data size written by the threads. Fixes: 6d57581659f72299 ("perf record: Add support for limit perf output file size") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiwei Sun <jiwei.sun@windriver.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/CAM9d7ci_TRrqBQVQNW8=GwakUr7SsZpYxaaty-S4bxF8zJWyqw@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf stat: Avoid merging/aggregating metric counts twiceIan Rogers1-1/+1
[ Upstream commit 37f322cd58d81a9d46456531281c908de9ef6e42 ] The added perf_stat_merge_counters combines uncore counters. When metrics are enabled, the counts are merged into a metric_leader via the stat-shadow saved_value logic. As the leader now is passed an aggregated count, it leads to all counters being added together twice and counts appearing approximately doubled in metrics. This change disables the saved_value merging of counts for evsels that are merged. It is recommended that later changes remove the saved_value entirely as the two layers of aggregation in the code is confusing. Fixes: 942c5593393d9418 ("perf stat: Add perf_stat_merge_counters()") Reported-by: Perry Taylor <perry.taylor@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20230209064447.83733-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf tools: Fix auto-complete on aarch64Yicong Yang1-3/+8
[ Upstream commit ffd1240e8f0814262ceb957dbe961f6e0aef1e7a ] On aarch64 CPU related events are not under event_source/devices/cpu/events, they're under event_source/devices/armv8_pmuv3_0/events on my machine. Using current auto-complete script will generate below error: [root@localhost bin]# perf stat -e ls: cannot access '/sys/bus/event_source/devices/cpu/events': No such file or directory Fix this by not testing /sys/bus/event_source/devices/cpu/events on aarch64 machine. Fixes: 74cd5815d9af6e6c ("perf tool: Improve bash command line auto-complete for multiple events with comma") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Cc: prime.zeng@hisilicon.com Link: https://lore.kernel.org/r/20230207035057.43394-1-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf test bpf: Skip test if kernel-debuginfo is not presentAthira Rajeev1-1/+5
[ Upstream commit 34266f904abd45731bdade2e92d0536c092ee9bc ] Perf BPF filter test fails in environment where "kernel-debuginfo" is not installed. Test failure logs: <<>> 42: BPF filter : 42.1: Basic BPF filtering : Ok 42.2: BPF pinning : Ok 42.3: BPF prologue generation : FAILED! <<>> Enabling verbose option provided debug logs, which says debuginfo needs to be installed. Snippet of verbose logs: <<>> 42.3: BPF prologue generation : --- start --- test child forked, pid 28218 <<>> Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package. bpf_probe: failed to convert perf probe events Failed to add events selected by BPF test child finished with -1 ---- end ---- BPF filter subtest 3: FAILED! <<>> Here the subtest "BPF prologue generation" failed and logs shows debuginfo is needed. After installing kernel-debuginfo package, testcase passes. The "BPF prologue generation" subtest failed because, the do_test() returns TEST_FAIL without checking the error type returned by parse_events_load_bpf_obj(). parse_events_load_bpf_obj() can also return error of type -ENODATA incase kernel-debuginfo package is not installed. Fix this by adding check for -ENODATA error. Test result after the patch changes: Test failure logs: <<>> 42: BPF filter : 42.1: Basic BPF filtering : Ok 42.2: BPF pinning : Ok 42.3: BPF prologue generation : Skip (clang/debuginfo isn't installed or environment missing BPF support) <<>> Fixes: ba1fae431e74bb42 ("perf test: Add 'perf test BPF'") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lore.kernel.org/linux-perf-users/Y7bIk77mdE4j8Jyi@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf jevents: Correct bad character encodingIan Rogers1-2/+2
[ Upstream commit d2e3dc829e389d686194d06f0a64adda4158faae ] A character encoding issue added a "3D" character that breaks the metrics test. Fixes: 40769665b63d8c84 ("perf jevents: Parse metrics during conversion") Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf stat: Hide invalid uncore event output for aggr modeNamhyung Kim1-5/+46
[ Upstream commit dd15480a3d67b9cf04a1f6f5d60f1c0dc018e22f ] The current display code for perf stat iterates given cpus and build the aggr map to collect the event data for the aggregation mode. But uncore events have their own cpu maps and it won't guarantee that it'd match to the aggr map. For example, per-package uncore events would generate a single value for each socket. When user asks per-core aggregation mode, the output would contain 0 values for other cores. Thus it needs to check the uncore PMU's cpumask and if it matches to the current aggregation id. Before: $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 3.73 Joules power/energy-pkg/ S0-D0-C1 0 <not counted> Joules power/energy-pkg/ S0-D0-C2 0 <not counted> Joules power/energy-pkg/ S0-D0-C3 0 <not counted> Joules power/energy-pkg/ 1.001404046 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog The core 1, 2 and 3 should not be printed because the event is handled in a cpu in the core 0 only. With this change, the output becomes like below. After: $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 2.09 Joules power/energy-pkg/ Fixes: b897613510890d6e ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf intel-pt: Do not try to queue auxtrace data on pipeNamhyung Kim3-0/+39
[ Upstream commit aeb802f872a7c42e4381f36041e77d1745908255 ] When it processes AUXTRACE_INFO, it calls to auxtrace_queue_data() to collect AUXTRACE data first. That won't work with pipe since it needs lseek() to read the scattered aux data. $ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100 # To display the perf.data header info, please use --header/--header-only options. # 0x4118 [0xa0]: failed to process type: 70 Error: failed to process sample For the pipe mode, it can handle the aux data as it gets. But there's no guarantee it can get the aux data in time. So the following warning will be shown at the beginning: WARNING: Intel PT with pipe mode is not recommended. The output cannot relied upon. In particular, time stamps and the order of events may be incorrect. Fixes: dbd134322e74f19d ("perf intel-pt: Add support for decoding AUX area samples") Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20230131023350.1903992-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf inject: Use perf_data__read() for auxtraceNamhyung Kim1-3/+3
[ Upstream commit 1746212daeba95e9ae1639227dc0c3591d41deeb ] In copy_bytes(), it reads the data from the (input) fd and writes it to the output file. But it does with the read(2) unconditionally which caused a problem of mixing buffered vs unbuffered I/O together. You can see the problem when using pipes. $ perf record -e intel_pt// -o- true | perf inject -b > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 0x45c0 [0x30]: failed to process type: 71 It should use perf_data__read() to honor the 'use_stdio' setting. Fixes: 601366678c93618f ("perf data: Allow to use stdio functions for pipe mode") Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20230131023350.1903992-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10perf llvm: Fix inadvertent file creationIan Rogers1-1/+24
[ Upstream commit 9f19aab47ced012eddef1e2bc96007efc7713b61 ] The LLVM template is first echo-ed into command_out and then command_out executed. The echo surrounds the template with double quotes, however, the template itself may contain quotes. This is generally innocuous but in tools/perf/tests/bpf-script-test-prologue.c we see: ... SEC("func=null_lseek file->f_mode offset orig") ... where the first double quote ends the double quote of the echo, then the > redirects output into a file called f_mode. To avoid this inadvertent behavior substitute redirects and similar characters to be ASCII control codes, then substitute the output in the echo back again. Fixes: 5eab5a7ee032acaa ("perf llvm: Display eBPF compiling command in debug output") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: bpf@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: llvm@lists.linux.dev Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Rix <trix@redhat.com> Link: https://lore.kernel.org/r/20230105082609.344538-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18perf test build-id: Fix test check for PE fileAthira Rajeev1-1/+14
Perf test "build id cache operations" fails for PE executable. Logs below from powerpc system. Same is observed on x86 as well. <<>> Adding 5a0fd882b53084224ba47b624c55a469 ./tests/shell/../pe-file.exe: Ok build id: 5a0fd882b53084224ba47b624c55a469 link: /tmp/perf.debug.w0V/.build-id/5a/0fd882b53084224ba47b624c55a469 file: /tmp/perf.debug.w0V/.build-id/5a/../../root/<user>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf failed: file /tmp/perf.debug.w0V/.build-id/5a/../../root/<user>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf does not exist test child finished with -1 ---- end ---- build id cache operations: FAILED! <<>> The test tries to do: <<>> mkdir /tmp/perf.debug.TeY1 perf --buildid-dir /tmp/perf.debug.TeY1 buildid-cache -v -a ./tests/shell/../pe-file.exe <<>> The option "--buildid-dir" sets the build id cache directory as /tmp/perf.debug.TeY1. The option given to buildid-cahe, ie "-a ./tests/shell/../pe-file.exe", is to add the pe-file.exe to the cache. The testcase, sets buildid-dir and adds the file: pe-file.exe to build id cache. To check if the command is run successfully, "check" function looks for presence of the file in buildid cache directory. But the check here expects the added file to be executable. Snippet below: <<>> if [ ! -x $file ]; then echo "failed: file ${file} does not exist" exit 1 fi <<>> The buildid test is done for sha1 binary, md5 binary and also for PE file. The first two binaries are created at runtime by compiling with "--build-id" option and hence the check for sha1/md5 test should use [ ! -x ]. But in case of PE file, the permission for this input file is rw-r--r-- Hence the file added to build id cache has same permissoin Original file: ls tests/pe-file.exe | xargs stat --printf "%n %A \n" tests/pe-file.exe -rw-r--r-- buildid cache file: ls /tmp/perf.debug.w0V/.build-id/5a/../../root/<user>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf | xargs stat --printf "%n %A \n" /tmp/perf.debug.w0V/.build-id/5a/../../root/<user>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf -rw-r--r-- Fix the test to match with the permission of original file in case of FE file. ie if the "tests/pe-file.exe" file is not having exec permission, just check for existence of the buildid file using [ ! -e <file> ] Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230116050131.17221-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-18perf buildid-cache: Fix the file mode with copyfile() while adding file to ↵Athira Rajeev1-3/+7
build-id cache The test "build id cache operations" fails on powerpc as below: Adding 5a0fd882b53084224ba47b624c55a469 ./tests/shell/../pe-file.exe: Ok build id: 5a0fd882b53084224ba47b624c55a469 link: /tmp/perf.debug.ZTu/.build-id/5a/0fd882b53084224ba47b624c55a469 file: /tmp/perf.debug.ZTu/.build-id/5a/../../root/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf failed: file /tmp/perf.debug.ZTu/.build-id/5a/../../root/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf does not exist test child finished with -1 ---- end ---- build id cache operations: FAILED! The failing test is when trying to add pe-file.exe to build id cache. 'perf buildid-cache' can be used to add/remove/manage files from the build-id cache. "-a" option is used to add a file to the build-id cache. Simple command to do so for a PE exe file: # ls -ltr tests/pe-file.exe -rw-r--r--. 1 root root 75595 Jan 10 23:35 tests/pe-file.exe The file is in home directory. # mkdir /tmp/perf.debug.TeY1 # perf --buildid-dir /tmp/perf.debug.TeY1 buildid-cache -v -a tests/pe-file.exe The above will create ".build-id" folder in build id directory, which is /tmp/perf.debug.TeY1. Also adds file to this folder under build id. Example: # ls -ltr /tmp/perf.debug.TeY1/.build-id/5a/0fd882b53084224ba47b624c55a469/ total 76 -rw-r--r--. 1 root root 0 Jan 11 00:38 probes -rwxr-xr-x. 1 root root 75595 Jan 11 00:38 elf We can see in the results that file mode for original file and file in build id directory is different. ie, build id file has executable permission whereas original file doesn’t have. The code path and function (build_id_cache__add to add a file to the cache is in "util/build-id.c". In build_id_cache__add() function, it first attempts to link the original file to destination cache folder. If linking the file fails (which can happen if the destination and source is on a different mount points), it will copy the file to destination. Here copyfile() routine explicitly uses mode as "755" and hence file in the destination will have executable permission. Code snippet: if (link(realname, filename) && errno != EEXIST && copyfile(name, filename)) strace logs: 172285 link("/home/<user_name>/linux/tools/perf/tests/pe-file.exe", "/tmp/perf.debug.TeY1/home/<user_name>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/elf") = -1 EXDEV (Invalid cross-device link) 172285 newfstatat(AT_FDCWD, "tests/pe-file.exe", {st_mode=S_IFREG|0644, st_size=75595, ...}, 0) = 0 172285 openat(AT_FDCWD, "/tmp/perf.debug.TeY1/home/<user_name>/linux/tools/perf/tests/pe-file.exe/5a0fd882b53084224ba47b624c55a469/.elf.KbAnsl", O_RDWR|O_CREAT|O_EXCL, 0600) = 3 172285 fchmod(3, 0755) = 0 172285 openat(AT_FDCWD, "tests/pe-file.exe", O_RDONLY) = 4 172285 mmap(NULL, 75595, PROT_READ, MAP_PRIVATE, 4, 0) = 0x7fffa5cd0000 172285 pwrite64(3, "MZ\220\0\3\0\0\0\4\0\0\0\377\377\0\0\270\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 75595, 0) = 75595 Whereas if the link succeeds, it succeeds in the first attempt itself and the file in the build-id dir will have same permission as original file. Example, above uses /tmp. Instead if we use "--buildid-dir /home/build", linking will work here since mount points are same. Hence the destination file will not have executable permission. Since the testcase "tests/shell/buildid.sh" always looks for executable file, test fails in powerpc environment when test is run from /root. The patch adds a change in build_id_cache__add() to use copyfile_mode() which also passes the file’s original mode as argument. This way the destination file mode also will be same as original file. Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230116050131.17221-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-18perf expr: Prevent normalize() from reading into undefined memory in the ↵Sohom Datta1-1/+4
expression lexer The current implementation does not account for a trailing backslash followed by a null-byte. If a null-byte is encountered following a backslash, normalize() will continue reading (and potentially writing) into garbage memory ignoring the EOS null-byte. Signed-off-by: Sohom Datta <sohomdatta1+git@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221204105836.1012885-1-sohomdatta1+git@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-18perf beauty: Update copy of linux/socket.h with the kernel sourcesArnaldo Carvalho de Melo1-1/+4
To pick the changes in: b5f0de6df6dce8d6 ("net: dev: Convert sa_data to flexible array in struct sockaddr") That don't result in any changes in the tables generated from that header. This silences this perf build warning: Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h' diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-11perf auxtrace: Fix address filter duplicate symbol selectionAdrian Hunter1-1/+1
When a match has been made to the nth duplicate symbol, return success not error. Example: Before: $ cat file.c cat: file.c: No such file or directory $ cat file1.c #include <stdio.h> static void func(void) { printf("First func\n"); } void other(void); int main() { func(); other(); return 0; } $ cat file2.c #include <stdio.h> static void func(void) { printf("Second func\n"); } void other(void) { func(); } $ gcc -Wall -Wextra -o test file1.c file2.c $ perf record -e intel_pt//u --filter 'filter func @ ./test' -- ./test Multiple symbols with name 'func' #1 0x1149 l func which is near main #2 0x1179 l func which is near other Disambiguate symbol name by inserting #n after the name e.g. func #2 Or select a global symbol by inserting #0 or #g or #G Failed to parse address filter: 'filter func @ ./test' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test Failed to parse address filter: 'filter func #2 @ ./test' Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>] Where multiple filters are separated by space or comma. After: $ perf record -e intel_pt//u --filter 'filter func #2 @ ./test' -- ./test First func Second func [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data ] $ perf script --itrace=b -Ftime,flags,ip,sym,addr --ns 1231062.526977619: tr strt 0 [unknown] => 558495708179 func 1231062.526977619: tr end call 558495708188 func => 558495708050 _init 1231062.526979286: tr strt 0 [unknown] => 55849570818d func 1231062.526979286: tr end return 55849570818f func => 55849570819d other Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Reported-by: Dmitrii Dolgov <9erthalion6@gmail.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Dmitry Dolgov <9erthalion6@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230110185659.15979-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf bpf: Avoid build breakage with libbpf < 0.8.0 + LIBBPF_DYNAMIC=1Arnaldo Carvalho de Melo1-0/+2
In 746bd29e348f99b4 ("perf build: Use tools/lib headers from install path") we stopped having the tools/lib/ directory from the kernel sources in the header include path unconditionally, which breaks the build on systems with older versions of libbpf-devel, in this case 0.7.0 as some of the structures and function declarations present in the newer version of libbpf included in the kernel sources (tools/lib/bpf) are not anymore used, just the ones in the system libbpf. So instead of trying to provide alternative functions when the libbpf-bpf_program__set_insns feature test fails, fail a LIBBPF_DYNAMIC=1 build (requesting the use of the system's libbpf) and emit this build error message: $ make LIBBPF_DYNAMIC=1 -C tools/perf Makefile.config:593: *** Error: libbpf devel library needs to be >= 0.8.0 to build with LIBBPF_DYNAMIC, update or build statically with the version that comes with the kernel sources. Stop. $ For v6.3 these tests will be revamped and we'll require libbpf 1.0 as a minimal version for using LIBBPF_DYNAMIC=1, most distros should have it by now or at v6.3 time. Fixes: 746bd29e348f99b4 ("perf build: Use tools/lib headers from install path") Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/CAP-5=fVa51_URGsdDFVTzpyGmdDRj_Dj2EKPuDHNQ0BYgMSzUA@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf build: Fix build error when NO_LIBBPF=1Ian Rogers2-9/+14
The $(LIBBPF) target should only be a dependency of prepare if the static version of libbpf is needed. Add a new LIBBPF_STATIC variable that is set by Makefile.config. Use LIBBPF_STATIC to determine whether the CFLAGS, etc. need updating and for adding $(LIBBPF) as a prepare dependency. As Makefile.config isn't loaded for "clean" as a target, always set LIBBPF_OUTPUT regardless of whether it is needed for $(LIBBPF). This is done to minimize conditional logic for $(LIBBPF)-clean. This issue and an original fix was reported by Mike Leach in: https://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org/ Fixes: 746bd29e348f99b4 ("perf build: Use tools/lib headers from install path") Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230106151320.619514-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf tools: Don't install libtraceevent plugins as its not anymore in the ↵Arnaldo Carvalho de Melo2-20/+0
kernel sources While doing 'make -C tools/perf build-test' one can notice error messages while trying to install libtraceevent plugins, stop doing that as libtraceevent isn't anymore a homie. These are the warnings dealt with: make_install_prefix_slash_O: make install prefix=/tmp/krava/ failed to find: /tmp/krava/etc/bash_completion.d/perf failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_cfg80211.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_scsi.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_xen.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_function.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_sched_switch.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_mac80211.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_kvm.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_kmem.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_hrtimer.so failed to find: /tmp/krava/lib64/traceevent/plugins/plugin_jbd2.so Fixes: 4171925aa9f3f7bf ("tools lib traceevent: Remove libtraceevent") Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lore.kernel.org/lkml/Y7xXz+TSpiCbQGjw@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf kmem: Support field "node" in evsel__process_alloc_event() coping with ↵Leo Yan1-12/+24
recent tracepoint restructuring Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") adds the field "node" into the tracepoints 'kmalloc' and 'kmem_cache_alloc', so this patch modifies the event process function to support the field "node". If field "node" is detected by checking function evsel__field(), it stats the cross allocation. When the "node" value is NUMA_NO_NODE (-1), it means the memory can be allocated from any memory node, in this case, we don't account it as a cross allocation. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf kmem: Support legacy tracepointsLeo Yan1-3/+26
Commit 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") removed tracepoints 'kmalloc_node' and 'kmem_cache_alloc_node', we need to consider the tool should be backward compatible. If it detect the tracepoint "kmem:kmalloc_node", this patch enables the legacy tracepoints, otherwise, it will ignore them. Fixes: 11e9734bcb6a7361 ("mm/slab_common: unify NUMA and UMA version of tracepoints") Reported-by: Ravi Bangoria <ravi.bangoria@amd.com> Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/r/20230108062400.250690-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10perf build: Properly guard libbpf includesIan Rogers2-0/+8
Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT. In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL. Fixes: d6a735ef3277c45f ("perf bpf_counter: Move common functions to bpf_counter.h") Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Mike Leach <mike.leach@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-09perf tests bpf prologue: Fix bpf-script-test-prologue test compile issue ↵Athira Rajeev1-0/+2
with clang While running 'perf test' for bpf, observed that "BPF prologue generation" test case fails to compile with clang. Logs below from powerpc: <stdin>:33:2: error: use of undeclared identifier 'fmode_t' fmode_t f_mode = (fmode_t)_f_mode; ^ <stdin>:37:6: error: use of undeclared identifier 'f_mode'; did you mean '_f_mode'? if (f_mode & FMODE_WRITE) ^~~~~~ _f_mode <stdin>:30:60: note: '_f_mode' declared here int bpf_func__null_lseek(void *ctx, int err, unsigned long _f_mode, ^ 2 errors generated. The test code tests/bpf-script-test-prologue.c uses fmode_t. And the error above is for "fmode_t" which is defined in include/linux/types.h as part of kernel build directory: "/lib/modules/<kernel_version>/build" that comes from kernel devel [ soft link to /usr/src/<kernel_version> ]. Clang picks this header file from "-working-directory" build option that specifies this build folder. But the commit 14e4b9f4289aed2c ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") changed the include directory to use: "/usr/include". Post this change, types.h from /usr/include/ is getting picked upwhich doesn’t contain definition of "fmode_t" and hence fails to compile. Compilation command before this commit: /usr/bin/clang -D__KERNEL__ -D__NR_CPUS__=72 -DLINUX_VERSION_CODE=0x50e00 -xc -I/root/lib/perf/include/bpf -nostdinc -I./arch/powerpc/include -I./arch/powerpc/include/generated -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/<ver>/build -c - -target bpf -g -O2 -o - Compilation command after this commit: /usr/bin/clang -D__KERNEL__ -D__NR_CPUS__=72 -DLINUX_VERSION_CODE=0x50e00 -xc -I/usr/include/ -nostdinc -I./arch/powerpc/include -I./arch/powerpc/include/generated -I./include -I./arch/powerpc/include/uapi -I./arch/powerpc/include/generated/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/compiler-version.h -include ./include/linux/kconfig.h -Wno-unused-value -Wno-pointer-sign -working-directory /lib/modules/<ver>/build -c - -target bpf -g -O2 -o - The difference is addition of -I/usr/include/ in the first line which is causing the error. Fix this by adding typedef for "fmode_t" in the testcase to solve the compile issue. Fixes: 14e4b9f4289aed2c ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/linux-perf-users/20230105120436.92051-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-04perf tools: Fix build on uClibc systems by adding missing sys/types.h includeJesus Sanchez-Palencia1-0/+1
Not all libc implementations define ssize_t as part of stdio.h like glibc does since the standard only requires this type to be defined by unistd.h and sys/types.h. For this reason the perf build is currently broken for toolchains based on uClibc, for instance. Include sys/types.h explicitly to fix that. Committer notes: In addition, in the past this worked in uClibc test systems as there was another way to get to sys/types.h that got removed in that cset: tools/perf/util/trace-event.h /usr/include/traceevent/event_parse.h # This got removed from util/trace-event.h in 378ef0f5d9d7f465 /usr/include/regex.h /usr/include/sys/types.h typedef __ssize_t ssize_t; So the size_t that is used in tools/perf/util/trace-event.h was being obtained indirectly, by chance. Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Jesus Sanchez-Palencia <jesussanp@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20230104193414.606905-1-jesussanp@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-04perf stat: Fix handling of --for-each-cgroup with --bpf-counters to match ↵Namhyung Kim1-5/+18
non BPF mode The --for-each-cgroup can have the same cgroup multiple times, but this confuses BPF counters (since they have the same cgroup id), making only the last cgroup events to be counted. Let's check the cgroup name before adding a new entry to the cgroups list. Before: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': <not counted> msec cpu-clock / <not counted> context-switches / <not counted> cpu-migrations / <not counted> page-faults / <not counted> cycles / <not counted> instructions / <not counted> branches / <not counted> branch-misses / 8,016.04 msec cpu-clock / # 7.998 CPUs utilized 6,152 context-switches / # 767.461 /sec 250 cpu-migrations / # 31.187 /sec 442 page-faults / # 55.139 /sec 613,111,487 cycles / # 0.076 GHz 280,599,604 instructions / # 0.46 insn per cycle 57,692,724 branches / # 7.197 M/sec 3,385,168 branch-misses / # 5.87% of all branches 1.002220125 seconds time elapsed After it becomes similar to the non-BPF mode: $ sudo ./perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Performance counter stats for 'system wide': 8,013.38 msec cpu-clock / # 7.998 CPUs utilized 6,859 context-switches / # 855.944 /sec 334 cpu-migrations / # 41.680 /sec 345 page-faults / # 43.053 /sec 782,326,119 cycles / # 0.098 GHz 471,645,724 instructions / # 0.60 insn per cycle 94,963,430 branches / # 11.851 M/sec 3,685,511 branch-misses / # 3.88% of all branches 1.001864539 seconds time elapsed Committer notes: As a reminder, to test with BPF counters one has to use BUILD_BPF_SKEL=1 in the make command line and have clang/llvm installed when building perf, otherwise the --bpf-counters option will not be available: # perf stat -a --bpf-counters --for-each-cgroup /,/ sleep 1 Error: unknown option `bpf-counters' Usage: perf stat [<options>] [<command>] -a, --all-cpus system-wide collection from all CPUs <SNIP> # Fixes: bb1c15b60b981d10 ("perf stat: Support regex pattern in --for-each-cgroup") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: bpf@vger.kernel.org Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20230104064402.1551516-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-04perf stat: Fix handling of unsupported cgroup events when using BPF countersNamhyung Kim1-11/+3
When --for-each-cgroup option is used, it fails when any of events is not supported and exits immediately. This is not how 'perf stat' handles unsupported events. Let's ignore the failure and proceed with others so that the output is similar to when BPF counters are not used: Before: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Failed to open first cgroup events $ After it shows output similat to when --bpf-counters isn't specified: $ sudo ./perf stat -a --bpf-counters -e L1-icache-loads,L1-dcache-loads --for-each-cgroup system.slice,user.slice sleep 1 Performance counter stats for 'system wide': <not supported> L1-icache-loads system.slice 29,892,418 L1-dcache-loads system.slice <not supported> L1-icache-loads user.slice 52,497,220 L1-dcache-loads user.slice $ Fixes: 944138f048f7d759 ("perf stat: Enable BPF counter with --for-each-cgroup") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/20230104064402.1551516-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-04perf test record_probe_libc_inet_pton: Fix test on s/390 where ↵Thomas Richter1-0/+1
'text_to_binary_address' now appears on the backtrace perf test '84: probe libc's inet_pton & backtrace it with ping' fails on s390. Debugging revealed a changed stack trace for the ping command using probes: ping 35729 [002] 8006.365063: probe_libc:inet_pton: (3ff9603e7c0) 13e7c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) ---> 104371 text_to_binary_address+0xef1 (inlined) 104371 gaih_inet+0xef1 (inlined) 104371 __GI_getaddrinfo+0xef1 (inlined) 5d4b main+0x139b (/usr/bin/ping) The line "---> text_to_binary_address ..." is new. It was introduced with glibc version 2.36.7.2 released with Fedora 37 for s390. Output before # perf test inet_pton 84: probe libc's inet_pton & backtrace it with ping : FAILED! # Output after: # perf test inet_pton 84: probe libc's inet_pton & backtrace it with ping : Ok # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20221228145704.2702487-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03perf lock contention: Fix core dump related to not finding the ↵Thomas Richter1-0/+2
"__sched_text_end" symbol on s/390 The test case perf lock contention dumps core on s390. Run the following commands: # ./perf lock record -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 2.799 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ] # # ./perf lock contention Segmentation fault (core dumped) # The function call stack is lengthy, here are the top 5 functions: # gdb ./perf core.24048 GNU gdb (GDB) Fedora Linux 12.1-6.fc37 Core was generated by `./perf lock contention'. Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start); (gdb) where #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28, addr=1789230) at util/machine.c:3356 #1 0x000000000109f244 in callchain_id (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:957 #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290, addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:586 #3 0x000000000109f4d0 in report_lock_contention_begin_event (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1004 #4 0x00000000010a00ae in evsel__process_contention_begin (evsel=0x30313e0, sample=0x3ffea4f77d0) at builtin-lock.c:1254 #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, event=0x3ff85601ef8, sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464 ..... The issue is in function machine__is_lock_function() in file ./util/machine.c lines 3355: /* should not fail from here */ sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end", &kmap); machine->sched.text_end = kmap->unmap_ip(kmap, sym->start) On s390 the symbol __sched_text_end is *NOT* in the symbol list and the resulting pointer sym is set to NULL. The sym->start is then a NULL pointer access and generates the core dump. The reason why __sched_text_end is not in the symbol list on s390 is simple: When the symbol list is created at perf start up with function calls dso__load +--> dso__load_vmlinux_path +--> dso__load_vmlinux +--> dso__load_sym +--> dso__load_sym_internal (reads kernel symbols) +--> symbols__fixup_end +--> symbols__fixup_duplicate The issue is in function symbols__fixup_duplicate(). It deletes all symbols with have the same address. On s390: # nm -g ~/linux/vmlinux| fgrep c68390 0000000000c68390 T __cpuidle_text_start 0000000000c68390 T __sched_text_end # two symbols have identical addresses and __sched_text_end is considered duplicate (in ascending sort order) and removed from the symbol list. Therefore it is missing and an invalid pointer reference occurs. The code checks for symbol __sched_text_start and when it exists assumes symbol __sched_text_end is also in the symbol table. However this is not the case on s390. Same situation exists for symbol __lock_text_start: 0000000000c68770 T __cpuidle_text_end 0000000000c68770 T __lock_text_start This symbol is also removed from the symbol table but used in function machine__is_lock_function(). To fix this and keep duplicate symbols in the symbol table, set symbol_conf.allow_aliases to true. This prevents the removal of duplicate symbols in function symbols__fixup_duplicate(). Output After: # ./perf lock contention contended total wait max wait avg wait type caller 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8 # Fixes: 0d2997f750d1de39 ("perf lock: Look up callchain for the contended locks") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20221230102627.2410847-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03perf build: Don't propagate subdir to submakes for install_headersIan Rogers1-5/+5
subdir is added to the OUTPUT which fails as part of building install_headers when passed from "make -C tools perf_install". Committer testing: The original reporter (see the Link: below) had trouble with this: $ make -C tools perf_install That ended up with errors like this: /var/home/acme/git/perf-urgent/tools/scripts/Makefile.include:17: *** output directory "/var/home/acme/git/perf-urgent/tools/perf/libperf/perf/" does not exist. Stop. With this patch applied we now get it installed at: INSTALL /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h As expected: $ ls -la /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h -rw-r--r--. 1 acme acme 1146 Jan 3 15:42 /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h And if we clean tools with: $ make -C tools clean it gets cleaned up: $ ls -la /var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h ls: cannot access '/var/home/acme/git/perf-urgent/tools/perf/libperf/include/perf/bpf_perf.h': No such file or directory $ Fixes: 746bd29e348f99b4 ("perf build: Use tools/lib headers from install path") Reported-by: Torsten Hilbrich <torsten.hilbrich@secunet.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fa4b3115-d555-3d7f-54d1-018002e99350@secunet.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-03perf test record_probe_libc_inet_pton: Fix failure due to extra inet_pton() ↵Arnaldo Carvalho de Melo1-1/+1
backtrace in glibc >= 2.35 Starting with glibc 2.35 there are extra inet_pton() calls when doing a IPv6 ping as in one of the 'perf test' entry, which makes it fail: # perf test inet_pton 89: probe libc's inet_pton & backtrace it with ping : FAILED! # If we look at what this script is expecting (commenting out the removal of the temporary files in it): # cat /tmp/expected.aT6 ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\) .*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/libc.so.6|inlined\)$ getaddrinfo\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/libc.so.6\)$ .*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$ # And looking at what we are getting out of 'perf script', to match with the above: # cat /tmp/perf.script.IUC ping 623883 [006] 265438.471610: probe_libc:inet_pton: (7f32bcf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) 29510 __libc_start_call_main+0x80 (/usr/lib64/libc.so.6) ping 623883 [006] 265438.471664: probe_libc:inet_pton: (7f32bcf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) fa6c6 getaddrinfo+0x126 (/usr/lib64/libc.so.6) 491e [unknown] (/usr/bin/ping) # We see that its just the first call to inet_pton() that didn't came thru getaddrinfo(), so if we ignore the first the script matches what it expects, testing that using 'perf probe' + 'perf record' + 'perf script' with callchains on userspace targets is producing the expected results. Since we don't have a 'perf script --skip' to help us here, use tac + grep to do that, resulting in a one liner that makes this script work on both older glibc versions as well as with 2.35. With it, on fedora 36, x86, glibc 2.35: # perf test inet_pton 90: probe libc's inet_pton & backtrace it with ping : Ok # perf test -v inet_pton 90: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 627197 ping 627220 1 267956.962402: probe_libc:inet_pton_1: (7f488bf314c0) 1314c0 __GI___inet_pton+0x0 (/usr/lib64/libc.so.6) fa6c6 getaddrinfo+0x126 (/usr/lib64/libc.so.6) 491e n (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # And on Ubuntu 22.04.1 LTS on a Libre Computer ROC-RK3399-PC arm64 system: Before this patch it works (see that the script used has no 'tac' to remove the first event): root@roc-rk3399-pc:~# dpkg -l | grep libc-bin ii libc-bin 2.35-0ubuntu3.1 arm64 GNU C Library: Binaries root@roc-rk3399-pc:~# grep -w tac ~acme/libexec/perf-core/tests/shell/record+probe_libc_inet_pton.sh root@roc-rk3399-pc:~# perf test inet_pton 86: probe libc's inet_pton & backtrace it with ping : Ok root@roc-rk3399-pc:~# perf test -v inet_pton 86: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 1375 ping 1399 [000] 4114.417450: probe_libc:inet_pton: (ffffb3e26120) 106120 inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc.so.6) d18bc getaddrinfo+0xec (/usr/lib/aarch64-linux-gnu/libc.so.6) 2b68 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok root@roc-rk3399-pc:~# And after it continues to work: root@roc-rk3399-pc:~# grep -w tac ~acme/libexec/perf-core/tests/shell/record+probe_libc_inet_pton.sh perf script -i $perf_data | tac | grep -m1 ^ping -B9 | tac > $perf_script root@roc-rk3399-pc:~# perf test inet_pton 86: probe libc's inet_pton & backtrace it with ping : Ok root@roc-rk3399-pc:~# perf test -v inet_pton 86: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 6995 ping 7019 [005] 4832.160741: probe_libc:inet_pton: (ffffa62e6120) 106120 inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc.so.6) d18bc getaddrinfo+0xec (/usr/lib/aarch64-linux-gnu/libc.so.6) 2b68 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok root@roc-rk3399-pc:~# Reported-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: http://lore.kernel.org/lkml/Y7QyPkPlDYip3cZH@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-02perf tools: Fix segfault when trying to process tracepoints in perf.data and ↵Arnaldo Carvalho de Melo1-0/+12
not linked with libtraceevent When we have a perf.data file with tracepoints, such as: # perf evlist -f probe_perf:lzma_decompress_to_file # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events # We end up segfaulting when using perf built with NO_LIBTRACEEVENT=1 by trying to find an evsel with a NULL 'event_name' variable: (gdb) run report --stdio -f Starting program: /root/bin/perf report --stdio -f Program received signal SIGSEGV, Segmentation fault. 0x000000000055219d in find_evsel (evlist=0xfda7b0, event_name=0x0) at util/sort.c:2830 warning: Source file is more recent than executable. 2830 if (event_name[0] == '%') { Missing separate debuginfos, use: dnf debuginfo-install bzip2-libs-1.0.8-11.fc36.x86_64 cyrus-sasl-lib-2.1.27-18.fc36.x86_64 elfutils-debuginfod-client-0.188-3.fc36.x86_64 elfutils-libelf-0.188-3.fc36.x86_64 elfutils-libs-0.188-3.fc36.x86_64 glibc-2.35-20.fc36.x86_64 keyutils-libs-1.6.1-4.fc36.x86_64 krb5-libs-1.19.2-12.fc36.x86_64 libbrotli-1.0.9-7.fc36.x86_64 libcap-2.48-4.fc36.x86_64 libcom_err-1.46.5-2.fc36.x86_64 libcurl-7.82.0-12.fc36.x86_64 libevent-2.1.12-6.fc36.x86_64 libgcc-12.2.1-4.fc36.x86_64 libidn2-2.3.4-1.fc36.x86_64 libnghttp2-1.51.0-1.fc36.x86_64 libpsl-0.21.1-5.fc36.x86_64 libselinux-3.3-4.fc36.x86_64 libssh-0.9.6-4.fc36.x86_64 libstdc++-12.2.1-4.fc36.x86_64 libunistring-1.0-1.fc36.x86_64 libunwind-1.6.2-2.fc36.x86_64 libxcrypt-4.4.33-4.fc36.x86_64 libzstd-1.5.2-2.fc36.x86_64 numactl-libs-2.0.14-5.fc36.x86_64 opencsd-1.2.0-1.fc36.x86_64 openldap-2.6.3-1.fc36.x86_64 openssl-libs-3.0.5-2.fc36.x86_64 slang-2.3.2-11.fc36.x86_64 xz-libs-5.2.5-9.fc36.x86_64 zlib-1.2.11-33.fc36.x86_64 (gdb) bt #0 0x000000000055219d in find_evsel (evlist=0xfda7b0, event_name=0x0) at util/sort.c:2830 #1 0x0000000000552416 in add_dynamic_entry (evlist=0xfda7b0, tok=0xffb6eb "trace", level=2) at util/sort.c:2976 #2 0x0000000000552d26 in sort_dimension__add (list=0xf93e00 <perf_hpp_list>, tok=0xffb6eb "trace", evlist=0xfda7b0, level=2) at util/sort.c:3193 #3 0x0000000000552e1c in setup_sort_list (list=0xf93e00 <perf_hpp_list>, str=0xffb6eb "trace", evlist=0xfda7b0) at util/sort.c:3227 #4 0x00000000005532fa in __setup_sorting (evlist=0xfda7b0) at util/sort.c:3381 #5 0x0000000000553cdc in setup_sorting (evlist=0xfda7b0) at util/sort.c:3608 #6 0x000000000042eb9f in cmd_report (argc=0, argv=0x7fffffffe470) at builtin-report.c:1596 #7 0x00000000004aee7e in run_builtin (p=0xf64ca0 <commands+288>, argc=3, argv=0x7fffffffe470) at perf.c:330 #8 0x00000000004af0f2 in handle_internal_command (argc=3, argv=0x7fffffffe470) at perf.c:384 #9 0x00000000004af241 in run_argv (argcp=0x7fffffffe29c, argv=0x7fffffffe290) at perf.c:428 #10 0x00000000004af5fc in main (argc=3, argv=0x7fffffffe470) at perf.c:562 (gdb) So check if we have tracepoint events in add_dynamic_entry() and bail out instead: # perf report --stdio -f This perf binary isn't linked with libtraceevent, can't process probe_perf:lzma_decompress_to_file Error: Unknown --sort key: `trace' # Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/Y7MDb7kRaHZB6APC@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-02perf tools: Don't include signature in version stringsAhelenia Ziemiańska2-2/+2
This explodes the build if HEAD is signed, since the generated version is gpg: Signature made Mon 26 Dec 2022 20:34:48 CET, then a few more lines, then the SHA. Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/7c9637711271f50ec2341fb8a7c29585335dab04.1672174189.git.nabijaczleweli@nabijaczleweli.xyz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>