summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-12-05perf tools: Use dedicated non-atomic clear/set bit helpersSean Christopherson15-27/+27
Use the dedicated non-atomic helpers for {clear,set}_bit() and their test variants, i.e. the double-underscore versions. Depsite being defined in atomic.h, and despite the kernel versions being atomic in the kernel, tools' {clear,set}_bit() helpers aren't actually atomic. Move to the double-underscore versions so that the versions that are expected to be atomic (for kernel developers) can be made atomic without affecting users that don't want atomic operations. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: James Morse <james.morse@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Sean Christopherson <seanjc@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Yury Norov <yury.norov@gmail.com> Cc: alexandru elisei <alexandru.elisei@arm.com> Cc: kvm@vger.kernel.org Cc: kvmarm@lists.cs.columbia.edu Cc: kvmarm@lists.linux.dev Cc: linux-arm-kernel@lists.infradead.org Link: http://lore.kernel.org/lkml/20221119013450.2643007-6-seanjc@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo32-120/+259
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf list: List callback support for libpfmIan Rogers2-90/+71
Missed previously, add libpfm support for 'perf list' callbacks and thereby JSON support. Committer notes: Add __maybe_unused to the args of the new print_libpfm_events() in the else HAVE_LIBPFM block. Fixes: e42b0ee61282a2f9 ("perf list: Add JSON output option") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221118024607.409083-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf list: JSON escape encoding improvementsIan Rogers1-42/+67
Use strbuf to make the string under construction's length unlimited. Use the format %s to mean a literal string copy and %S to signify a need to escape the string. Add supported for escaping a newline character. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221118024607.409083-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf list: Support newlines in wordwrapIan Rogers1-4/+7
Rather than a newline starting from column 0, record a newline was seen and then add the newline and space before the next word. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221118024607.409083-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf symbol: correction while adjusting symbolAjay Kaher1-1/+1
perf doesn't provide proper symbol information for specially crafted .debug files. Sometimes .debug file may not have similar program header as runtime ELF file. For example if we generate .debug file using objcopy --only-keep-debug resulting file will not contain .text, .data and other runtime sections. That means corresponding program headers will have zero FileSiz and modified Offset. Example: program header of text section of libxxx.so: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flags Align LOAD 0x00000000003d3000 0x00000000003d3000 0x00000000003d3000 0x000000000055ae80 0x000000000055ae80 R E 0x1000 Same program header after executing: objcopy --only-keep-debug libxxx.so libxxx.so.debug LOAD 0x0000000000001000 0x00000000003d3000 0x00000000003d3000 0x0000000000000000 0x000000000055ae80 R E 0x1000 Offset and FileSiz have been changed. Following formula will not provide correct value, if program header taken from .debug file (syms_ss): sym.st_value -= phdr.p_vaddr - phdr.p_offset; Correct program header information is located inside runtime ELF file (runtime_ss). Fixes: 2d86612aacb7805f ("perf symbol: Correct address for bss symbols") Signed-off-by: Ajay Kaher <akaher@vmware.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Makhalov <amakhalov@vmware.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srivatsa S. Bhat <srivatsab@vmware.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Vasavi Sirnapalli <vsirnapalli@vmware.com> Link: http://lore.kernel.org/lkml/1669198696-50547-1-git-send-email-akaher@vmware.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf vendor events intel: Update events and metrics for alderlakeZhengjun Xing11-2691/+1525
Update JSON events and metrics for alderlake to perf. Based on ADL JSON event list v1.16: https://github.com/intel/perfmon/tree/main/ADL/events Generate the event list and metrics with the converter scripts: https://github.com/intel/perfmon/pull/32 Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221124031441.110134-4-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf vendor events intel: Add metrics for Alderlake-NZhengjun Xing1-0/+583
Add JSON metrics for Alderlake-N to perf. It only included E-core metrics. E-core metrics based on E-core TMA v2.2 (E-core_TMA_Metrics.csv) It is downloaded from: https://github.com/intel/perfmon/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221124031441.110134-3-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf vendor events intel: Add uncore event list for Alderlake-NZhengjun Xing2-0/+208
Add JSON uncore events for Alderlake-N Based on JSON list v1.16: https://github.com/intel/perfmon/tree/main/ADL/events/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221124031441.110134-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf vendor events intel: Add core event list for Alderlake-NZhengjun Xing8-1/+1075
Alderlake-N only has E-core, it has been moved to non-hybrid code path on the kernel side, so add the cpuid for Alderlake-N separately. Add core event list for Alderlake-N, it is based on the ADL gracemont v1.16 JSON file. https://github.com/intel/perfmon/tree/main/ADL/events/ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221124031441.110134-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Tidy up JSON metric-only output when no metricsNamhyung Kim1-10/+17
It printed empty strings for each metric. I guess it's needed for CSV output to match the column number. We could just ignore the empty metrics in JSON but it ended up with a broken JSON object with a trailing comma. So I added a dummy '"metric-value" : "none"' part. To do that, it needs to pass struct outstate to print_metric_end() to check if any metric value is printed or not. Before: # perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true {"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : ""} After: # perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true {"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "metric-value" : "none"} Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-16-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Rename "aggregate-number" to "cpu-count" in JSONNamhyung Kim1-4/+4
As the JSON output has been broken for a little while, I guess there are not many users. Let's rename the field to more intuitive one. :) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-15-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Fix JSON output in metric-only modeNamhyung Kim1-18/+24
It generated a broken JSON output when aggregation mode or cgroup is used with --metric-only option. Also get rid of the header line and make the output single line for each entry. It needs to know whether the current metric is the first one or not. So add 'first' field in the outstate and mark it false after printing. Before: # perf stat -a -j --metric-only true {"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"} {{"metric-value" : "0.797"}{"metric-value" : "1.65"}{"metric-value" : "0.89"} ^ # perf stat -a -j --metric-only --per-socket true {"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"} {"socket" : "S0", "aggregate-number" : 8, {"metric-value" : "0.295"}{"metric-value" : "1.88"}{"metric-value" : "0.64"} ^ After: # perf stat -a -j --metric-only true {"GHz" : "0.990", "insn per cycle" : "2.06", "branch-misses of all branches" : "0.59"} # perf stat -a -j --metric-only --per-socket true {"socket" : "S0", "aggregate-number" : 8, "GHz" : "0.439", "insn per cycle" : "2.14", "branch-misses of all branches" : "0.51"} Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-14-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Pass through 'struct outstate'Namhyung Kim4-63/+50
Now most of the print functions take a pointer to the struct outstate. We have one in the evlist__print_counters() and pass it through the child functions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-13-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Do not pass runtime_stat to printout()Namhyung Kim1-5/+4
It always passes a pointer to rt_stat as it's the only one. Let's not pass it and directly refer it in the printout(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Pass struct outstate to printout()Namhyung Kim1-20/+18
The printout() takes a lot of arguments and sets an outstate with the value. Instead, we can fill the outstate first and then pass it to reduce the number of arguments. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Pass 'struct outstate' to print_metric_begin()Namhyung Kim1-22/+28
It passes prefix and cgroup pointers but the outstate already has them. Let's pass the outstate pointer instead. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Use 'struct outstate' in evlist__print_counters()Namhyung Kim1-11/+14
This is a preparation for the later cleanup. No functional changes intended. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Pass const char *prefix to display routinesNamhyung Kim2-10/+10
This is a minor cleanup and preparation for the later change. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Remove metric_only argument in print_counter_aggrdata()Namhyung Kim1-11/+6
It already passes the stat_config argument, then it can find the value in the config. No need to pass it separately. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Remove prefix argument in print_metric_headers()Namhyung Kim1-16/+10
It always passes a whitespace to the function, thus we can just add it to the function body. Furthermore, it's only used in the normal output mode. Well, actually CSV used it but it doesn't need to since we don't care about the indentation or alignment in the CSV output. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Use scnprintf() in prepare_interval()Namhyung Kim1-10/+10
It should not use sprintf() anymore. Let's pass the buffer size and use the safer scnprintf() instead. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Do not align time prefix in CSV outputNamhyung Kim1-3/+6
We don't care about the alignment in the CSV output as it's intended for machine processing. Let's get rid of it to make the output more compact. Before: # perf stat -a --summary -I 1 -x, true 0.001149309,219.20,msec,cpu-clock,219322251,100.00,219.200,CPUs utilized 0.001149309,144,,context-switches,219241902,100.00,656.935,/sec 0.001149309,38,,cpu-migrations,219173705,100.00,173.358,/sec 0.001149309,61,,page-faults,219093635,100.00,278.285,/sec 0.001149309,10679310,,cycles,218746228,100.00,0.049,GHz 0.001149309,6288296,,instructions,218589869,100.00,0.59,insn per cycle 0.001149309,1386904,,branches,218428851,100.00,6.327,M/sec 0.001149309,56863,,branch-misses,218219951,100.00,4.10,of all branches summary,219.20,msec,cpu-clock,219322251,100.00,20.025,CPUs utilized summary,144,,context-switches,219241902,100.00,656.935,/sec summary,38,,cpu-migrations,219173705,100.00,173.358,/sec summary,61,,page-faults,219093635,100.00,278.285,/sec summary,10679310,,cycles,218746228,100.00,0.049,GHz summary,6288296,,instructions,218589869,100.00,0.59,insn per cycle summary,1386904,,branches,218428851,100.00,6.327,M/sec summary,56863,,branch-misses,218219951,100.00,4.10,of all branches After: 0.001148449,224.75,msec,cpu-clock,224870589,100.00,224.747,CPUs utilized 0.001148449,176,,context-switches,224775564,100.00,783.103,/sec 0.001148449,38,,cpu-migrations,224707428,100.00,169.079,/sec 0.001148449,61,,page-faults,224629326,100.00,271.416,/sec 0.001148449,12172071,,cycles,224266368,100.00,0.054,GHz 0.001148449,6901907,,instructions,224108764,100.00,0.57,insn per cycle 0.001148449,1515655,,branches,223946693,100.00,6.744,M/sec 0.001148449,70027,,branch-misses,223735385,100.00,4.62,of all branches summary,224.75,msec,cpu-clock,224870589,100.00,21.066,CPUs utilized summary,176,,context-switches,224775564,100.00,783.103,/sec summary,38,,cpu-migrations,224707428,100.00,169.079,/sec summary,61,,page-faults,224629326,100.00,271.416,/sec summary,12172071,,cycles,224266368,100.00,0.054,GHz summary,6901907,,instructions,224108764,100.00,0.57,insn per cycle summary,1515655,,branches,223946693,100.00,6.744,M/sec summary,70027,,branch-misses,223735385,100.00,4.62,of all branches Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Move summary prefix printing logic in CSV outputNamhyung Kim1-7/+7
It matches to the prefix (interval timestamp), so better to have them together. No functional change intended. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24perf stat: Fix cgroup display in JSON outputNamhyung Kim1-1/+1
It missed the 'else' keyword after checking json output mode. Fixes: 41cb875242e71bf1 ("perf stat: Split print_cgroup() function") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221123180208.2068936-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-24Merge tag 'pci-v6.1-fixes-3' of ↵Linus Torvalds1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Update MAINTAINERS to add Manivannan Sadhasivam as Qcom PCIe RC maintainer (replacing Stanimir Varbanov) and include DT PCI bindings in the "PCI native host bridge and endpoint drivers" entry. * tag 'pci-v6.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: MAINTAINERS: Include PCI bindings in host bridge entry MAINTAINERS: Add Manivannan Sadhasivam as Qcom PCIe RC maintainer
2022-11-23Merge tag 'spi-fix-v6.1-rc6' of ↵Linus Torvalds4-7/+20
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few fixes, all device specific. The most important ones are for the i.MX driver which had a couple of nasty data corruption inducing errors appear after the change to support PIO mode in the last merge window (one introduced by the change and one latent one which the PIO changes exposed). Thanks to Frieder, Fabio, Marc and Marek for jumping on that and resolving the issues quickly once they were found" * tag 'spi-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-imx: spi_imx_transfer_one(): check for DMA transfer first spi: tegra210-quad: Fix duplicate resource error spi: dw-dma: decrease reference count in dw_spi_dma_init_mfld() spi: spi-imx: Fix spi_bus_clk if requested clock is higher than input clock spi: mediatek: Fix DEVAPC Violation at KO Remove
2022-11-23Merge tag '9p-for-6.1-rc7' of https://github.com/martinetd/linuxLinus Torvalds2-11/+22
Pull 9p fixes from Dominique Martinet: - 9p now uses a variable size for its recv buffer, but every place hadn't been updated properly to use it and some buffer overflows have been found and needed fixing. There's still one place where msize is incorrectly used in a safety check (p9_check_errors), but all paths leading to it should already be avoiding overflows and that patch took a bit more time to get right for zero-copy requests so I'll send it for 6.2 - yet another race condition in p9_conn_cancel introduced by a fix for a syzbot report in the same place. Maybe at some point we'll get it right without burning it all down... * tag '9p-for-6.1-rc7' of https://github.com/martinetd/linux: 9p/xen: check logical size for buffer size 9p/fd: Use P9_HDRSZ for header size 9p/fd: Fix write overflow in p9_read_work 9p/fd: fix issue of list_del corruption in p9_fd_cancel()
2022-11-23fscache: fix OOB Read in __fscache_acquire_volumeDavid Howells2-3/+6
The type of a->key[0] is char in fscache_volume_same(). If the length of cache volume key is greater than 127, the value of a->key[0] is less than 0. In this case, klen becomes much larger than 255 after type conversion, because the type of klen is size_t. As a result, memcmp() is read out of bounds. This causes a slab-out-of-bounds Read in __fscache_acquire_volume(), as reported by Syzbot. Fix this by changing the type of the stored key to "u8 *" rather than "char *" (it isn't a simple string anyway). Also put in a check that the volume name doesn't exceed NAME_MAX. BUG: KASAN: slab-out-of-bounds in memcmp+0x16f/0x1c0 lib/string.c:757 Read of size 8 at addr ffff888016f3aa90 by task syz-executor344/3613 Call Trace: memcmp+0x16f/0x1c0 lib/string.c:757 memcmp include/linux/fortify-string.h:420 [inline] fscache_volume_same fs/fscache/volume.c:133 [inline] fscache_hash_volume fs/fscache/volume.c:171 [inline] __fscache_acquire_volume+0x76c/0x1080 fs/fscache/volume.c:328 fscache_acquire_volume include/linux/fscache.h:204 [inline] v9fs_cache_session_get_cookie+0x143/0x240 fs/9p/cache.c:34 v9fs_session_init+0x1166/0x1810 fs/9p/v9fs.c:473 v9fs_mount+0xba/0xc90 fs/9p/vfs_super.c:126 legacy_get_tree+0x105/0x220 fs/fs_context.c:610 vfs_get_tree+0x89/0x2f0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x1326/0x1e20 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568 Fixes: 62ab63352350 ("fscache: Implement volume registration") Reported-by: syzbot+a76f6a6e524cf2080aa3@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Zhang Peng <zhangpeng362@huawei.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Jeff Layton <jlayton@kernel.org> cc: v9fs-developer@lists.sourceforge.net cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/Y3OH+Dmi0QIOK18n@codewreck.org/ # Zhang Peng's v1 fix Link: https://lore.kernel.org/r/20221115140447.2971680-1-zhangpeng362@huawei.com/ # Zhang Peng's v2 fix Link: https://lore.kernel.org/r/166869954095.3793579.8500020902371015443.stgit@warthog.procyon.org.uk/ # v1 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-11-23perf lock contention: Do not use BPF task local storageNamhyung Kim2-12/+23
It caused some troubles when a lock inside kmalloc is contended because task local storage would allocate memory using kmalloc. It'd create a recusion and even crash in my system. There could be a couple of workarounds but I think the simplest one is to use a pre-allocated hash map. We could fix the task local storage to use the safe BPF allocator, but it takes time so let's change this until it happens actually. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Blake Jones <blakejones@google.com> Cc: Chris Li <chriscli@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221118190109.1512674-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23MAINTAINERS: Update John Garry's email address for arm64 perf toolingJohn Garry1-1/+1
Update my address. Signed-off-by: John Garry <john.g.garry@oracle.com> Acked-by: Will Deacon <will@kernel.org> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20221121113018.1899426-1-john.g.garry@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Fix record test on KVM guestsMichael Petlan1-1/+1
Using precise flag with br_inst_retired.near_call causes the test fail on KVM guests, even when the guests have PMU forwarding enabled and the event itself is supported. Remove the precise flag in order to make the test work on KVM guests. Signed-off-by: Michael Petlan <mpetlan@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20221122083121.6012-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf inject: Set PERF_RECORD_MISC_BUILD_ID_SIZENamhyung Kim1-1/+2
With perf inject -b, it synthesizes build-id event for DSOs. But it missed to set the size and resulted in having trailing zeros. As perf record sets the size in write_build_id(), let's set the size here as well. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221119002750.1568027-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Skip watchpoint tests if no watchpoints availableNaveen N. Rao1-5/+7
On IBM Power9, perf watchpoint tests fail since no hardware breakpoints are available. Detect this by checking the error returned by perf_event_open() and skip the tests in that case. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Reviewed-by: Kajol Jain<kjain@linux.ibm.com> Tested-by: Kajol Jain<kjain@linux.ibm.com> Link: https://lore.kernel.org/r/20221121102747.208289-1-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org
2022-11-23perf trace: Remove unused bpf map 'syscalls'Leo Yan2-118/+0
augmented_raw_syscalls.c defines the bpf map 'syscalls' which is initialized by perf tool in user space to indicate which system calls are enabled for tracing, on the other flip eBPF program relies on the map to filter out the trace events which are not enabled. The map also includes a field 'string_args_len[6]' which presents the string length if the corresponding argument is a string type. Now the map 'syscalls' is not used, bpf program doesn't use it as filter anymore, this is replaced by using the function bpf_tail_call() and PROG_ARRAY syscalls map. And we don't need to explicitly set the string length anymore, bpf_probe_read_str() is smart to copy the string and return string length. Therefore, it's safe to remove the bpf map 'syscalls'. To consolidate the code, this patch removes the definition of map 'syscalls' from augmented_raw_syscalls.c and drops code for using the map in the perf trace. Note, since function trace__set_ev_qualifier_bpf_filter() is removed, calling trace__init_syscall_bpf_progs() from it is also removed. We don't need to worry it because trace__init_syscall_bpf_progs() is still invoked from trace__init_syscalls_bpf_prog_array_maps() for initialization the system call's bpf program callback. After: # perf trace -e examples/bpf/augmented_raw_syscalls.c,open* --max-events 10 perf stat --quiet sleep 0.001 openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libelf.so.1", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libdw.so.1", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind.so.8", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind-aarch64.so.8", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libcrypto.so.3", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libslang.so.2", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libperl.so.5.34", O_RDONLY|O_CLOEXEC) = 3 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 # perf trace -e examples/bpf/augmented_raw_syscalls.c --max-events 10 perf stat --quiet sleep 0.001 ... [continued]: execve()) = 0 brk(NULL) = 0xaaaab1d28000 faccessat(-100, "/etc/ld.so.preload", 4) = -1 ENOENT (No such file or directory) openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 close(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>) = 0 openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3 read(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>, 0xfffff33f70d0, 832) = 832 munmap(0xffffb5519000, 28672) = 0 munmap(0xffffb55b7000, 32880) = 0 mprotect(0xffffb55a6000, 61440, PROT_NONE) = 0 Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221121075237.127706-6-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf augmented_raw_syscalls: Remove unused variable 'syscall'Leo Yan1-1/+0
The local variable 'syscall' is not used anymore, remove it. Signed-off-by: Leo Yan <leo.yan@linaro.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221121075237.127706-5-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf trace: Handle failure when trace point folder is missedLeo Yan1-7/+10
On Arm64 a case is perf tools fails to find the corresponding trace point folder for system calls listed in the table 'syscalltbl_arm64', e.g. the generated system call table contains "lookup_dcookie" but we cannot find out the matched trace point folder for it. We need to figure out if there have any issue for the generated system call table, on the other hand, we need to handle the case when trace point folder is missed under sysfs, this patch sets the flag syscall::nonexistent as true and returns the error from trace__read_syscall_info(). Another problem is for trace__syscall_info(), it returns two different values if a system call doesn't exist: at the first time calling trace__syscall_info() it returns NULL when the system call doesn't exist, later if call trace__syscall_info() again for the same missed system call, it returns pointer of syscall. trace__syscall_info() checks the condition 'syscalls.table[id].name == NULL', but the name will be assigned in the first invoking even the system call is not found. So checking system call's name in trace__syscall_info() is not the right thing to do, this patch simply checks flag syscall::nonexistent to make decision if a system call exists or not, finally trace__syscall_info() returns the consistent result (NULL) if a system call doesn't existed. Fixes: b8b1033fcaa091d8 ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221121075237.127706-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf trace: Return error if a system call doesn't existLeo Yan1-2/+2
When a system call is not detected, the reason is either because the system call ID is out of scope or failure to find the corresponding path in the sysfs, trace__read_syscall_info() returns zero. Finally, without returning an error value it introduces confusion for the caller. This patch lets the function trace__read_syscall_info() to return -EEXIST when a system call doesn't exist. Fixes: b8b1033fcaa091d8 ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20221121075237.127706-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf trace: Use macro RAW_SYSCALL_ARGS_NUM to replace numberLeo Yan1-4/+7
This patch defines a macro RAW_SYSCALL_ARGS_NUM to replace the open coded number '6'. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20221121075237.127706-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf list: Add JSON output optionIan Rogers2-67/+245
Output events and metrics in a JSON format by overriding the print callbacks. Currently other command line options aren't supported and metrics are repeated once per metric group. Committer testing: $ perf list cache List of pre-defined events (to be used in -e or -M): L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-prefetches [Hardware cache event] L1-icache-load-misses [Hardware cache event] L1-icache-loads [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event] $ perf list --json cache [ { "Unit": "cache", "EventName": "L1-dcache-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-dcache-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-dcache-prefetches", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-icache-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "L1-icache-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "branch-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "branch-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "dTLB-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "dTLB-loads", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "iTLB-load-misses", "EventType": "Hardware cache event" }, { "Unit": "cache", "EventName": "iTLB-loads", "EventType": "Hardware cache event" } ] $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf list: Reorganize to use callbacks to allow honouring command line optionsIan Rogers7-512/+624
Rather than controlling the list output with passed flags, add callbacks that are called when an event or metric are encountered. State is passed to the callback so that command line options can be respected, alternatively the callbacks can be changed. Fix a few bugs: - wordwrap to columns metric descriptions and expressions; - remove unnecessary whitespace after PMU event names; - the metric filter is a glob but matched using strstr which will always fail, switch to using a proper globmatch, - the detail flag gives details for extra kernel PMU events like branch-instructions. In metricgroup.c switch from struct mep being a rbtree of metricgroups containing a list of metrics, to the tree directly containing all the metrics. In general the alias for a name is passed to the print routine rather than being contained in the name with OR. Committer notes: Check the asprint() return to address this on fedora 36: util/print-events.c: In function ‘print_sdt_events’: util/print-events.c:183:33: error: ignoring return value of ‘asprintf’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result] 183 | asprintf(&evt_name, "%s@%s(%.12s)", sdt_name->s, path, bid); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors $ gcc --version | head -1 gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2) $ Fix ps.pmu_glob setting when dealing with *:* events, it was being left with a freed pointer that then at the end of cmd_list() would be double freed. Check if pmu_name is NULL in default_print_event() before calling strglobmatch(pmu_name, ...) to avoid a segfault. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xin Gao <gaoxin@cdjrlc.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20221114210723.2749751-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf build: Fix LIBTRACEEVENT_DYNAMICIan Rogers2-4/+24
The tools/lib includes fixes break LIBTRACEVENT_DYNAMIC as the makefile erroneously had dependencies on building libtraceevent even when not linking with it. This change fixes the issues with LIBTRACEEVENT_DYNAMIC by making the built files optional. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nicolas Schier <nicolas@fjasle.eu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221116224631.207631-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace data symbol test workload with datasymNamhyung Kim1-28/+1
So that it can get rid of requirement of a compiler. $ sudo ./perf test -v 109 109: Test data symbol : --- start --- test child forked, pid 844526 Recording workload... [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.354 MB /tmp/__perf_test.perf.data.GFeZO (4847 samples) ] Cleaning up files... test child finished with 0 ---- end ---- Test data symbol: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-13-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'datasym' test workloadNamhyung Kim4-0/+28
The datasym workload is to check if perf mem command gets the data addresses precisely. This is needed for data symbol test. $ perf test -w datasym I had to keep the buf1 in the data section, otherwise it could end up in the BSS and was mmaped as a separate //anon region, then it was not symbolized at all. It needs to be fixed separately. Committer notes: Add a -U _FORTIFY_SOURCE to the datasym CFLAGS, as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace brstack test workloadNamhyung Kim1-54/+12
So that it can get rid of requirement of a compiler. Also rename the symbols to match with the perf test workload. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Acked-by: German Gomez <german.gomez@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'brstack' test workloadNamhyung Kim4-0/+44
The brstack is to run different kinds of branches repeatedly. This is necessary for brstack test case to verify if it has correct branch info. $ perf test -w brstack I renamed the internal functions to have brstack_ prefix as it's too generic name. Add a -U_FORTIFY_SOURCE to the brstack CFLAGS, as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace arm spe fork test workload with sqrtloopNamhyung Kim1-43/+1
So that it can get rid of requirement of a compiler. I've also removed killall as it'll kill perf process now and run the test workload for 10 sec instead. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Tested-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'sqrtloop' test workloadNamhyung Kim4-0/+49
The sqrtloop creates a child process to run an infinite loop calling sqrt() with rand(). This is needed for ARM SPE fork test. $ perf test -w sqrtloop It can take an optional argument to specify how long it will run in seconds (default: 1). Committer notes: Explicitely ignored the sqrt() return to fix the build on systems where the compiler complains it isn't being used. And added a sqrtloop specific CFLAGS to disable optimizations to make this a bit more robust wrt dead code elimination. Doing that a -U_FORTIFY_SOURCE needs to be added, as -O0 is incompatible with it. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Replace arm callgraph fp test workload with leafloopNamhyung Kim1-31/+3
So that it can get rid of requirement of a compiler. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23perf test: Add 'leafloop' test workloadNamhyung Kim4-0/+39
The leafloop workload is to run an infinite loop in the test_leaf function. This is needed for the ARM fp callgraph test to verify if it gets the correct callchains. $ perf test -w leafloop Committer notes: Add a: -U_FORTIFY_SOURCE to the leafloop CFLAGS as the main perf flags set it and it requires building with optimization, and this new test has a -O0. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20221116233854.1596378-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>