| Age | Commit message | Author | Files | Lines |
|
Perf only looks at attr.config when determining what was programmed into
ETMCR. These bits could theoretically be in any of the config fields.
Add a generic helper to find the value of any named format field in any
config field and then use it to get the attributes relevant to ETMCR.
The kernel will also stop publishing the ETMCR register bits in a header
[1] so preempt that by defining them here.
Move field_prep() to util.h so we can define it alongside field_get().
Unfortunately FIELD_PREP() and FIELD_GET() from the kernel can't be used
as they require the mask to be a compile time constant.
[1]: https://lore.kernel.org/linux-arm-kernel/20251128-james-cs-syncfreq-v8-10-4d319764cc58@linaro.org/
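As a rough illustration of why a runtime mask matters, the following is a minimal sketch of field_get()/field_prep() style helpers that accept a mask computed at run time. The signatures and bodies are assumptions made for illustration and may not match the definitions in util.h; both helpers assume a non-zero, contiguous mask and rely on the GCC/Clang builtin __builtin_ctzll().

```c
#include <stdint.h>

/*
 * Illustrative stand-ins for the helpers named above; the real perf
 * definitions may differ. The mask is an ordinary runtime value, which
 * the kernel's FIELD_GET()/FIELD_PREP() macros do not accept.
 * Both helpers assume a non-zero, contiguous mask.
 */
static inline uint64_t field_get(uint64_t mask, uint64_t reg)
{
	/* __builtin_ctzll() returns the index of the lowest set bit. */
	return (reg & mask) >> __builtin_ctzll(mask);
}

static inline uint64_t field_prep(uint64_t mask, uint64_t val)
{
	/* Shift the value up to the field position and drop any overflow. */
	return (val << __builtin_ctzll(mask)) & mask;
}
```

With a mask looked up from a named format field at run time, field_get(mask, attr_config) would extract that field's value regardless of which bits the driver assigned to it.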
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This pattern occurs a few times and we'll add another one later, so add
a helper function for it.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test that evsel__set_config_if_unset() behaves as expected. This also
tests the user config change tracking mechanism as it depends on it.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Requiring the 'pmu->perf_event_attr_init_default' callback to be set to
track user changes is a bit of a trap to fall in. It's hard to see that
this is required when depending on the user change tracking.
It's possible to want all 0 defaults so not set it, but at the same time
still do some programmatic setting of configs with
evsel__set_config_if_unset(). Also if a PMU reverts to 0 defaults and
deletes its existing callback, it will silently break existing uses of
evsel__set_config_if_unset().
One way to fix this would be to assert in evsel__set_config_if_unset()
if the changes weren't tracked, but that would be a possibly untested
runtime failure. Instead, always track it as it's harmless and
simplifies testing too.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This will be used by aux PMUs to read an already written value when
configuring their events, and also for testing.
Its helper perf_pmu__format_unpack() does the opposite of the existing
pmu_format_value() so rename that one to perf_pmu__format_pack() so it's
clear how they are related.
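The pack/unpack relationship can be sketched as below. This is a simplified illustration, not the perf implementation: the names format_pack() and format_unpack() are hypothetical, and a plain 64-bit mask stands in for the bitmap that the real helpers take. The mask may be sparse.

```c
#include <stdint.h>

/* Scatter the low bits of val into the set bits of mask, lowest first. */
static uint64_t format_pack(uint64_t mask, uint64_t val)
{
	uint64_t out = 0;
	int src = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		if (val & (1ULL << src))
			out |= 1ULL << bit;
		src++;
	}
	return out;
}

/* Gather the bits of config selected by mask back into a dense value. */
static uint64_t format_unpack(uint64_t mask, uint64_t config)
{
	uint64_t val = 0;
	int dst = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (!(mask & (1ULL << bit)))
			continue;
		if (config & (1ULL << bit))
			val |= 1ULL << dst;
		dst++;
	}
	return val;
}
```

For any value that fits in the field, format_unpack(mask, format_pack(mask, val)) returns val again, which is the property a test of the two helpers would rely on.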
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Misleadingly, evsel__set_config_if_unset() only works with the config
field and not config1, config2, etc. This is fine at the moment because
all users of it happen to operate on bits that are in that config field.
Fix it before there are any new users of the function which operate on
bits in different config fields.
In theory it's also possible for a driver to move an existing bit to
another config field and this fixes that scenario too, although this
hasn't happened yet either.
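A minimal sketch of the intended behaviour, under stated assumptions: the struct and function names below are hypothetical simplifications, the number of config fields is illustrative, and the field mask is assumed to be contiguous. The point is only that the helper indexes the config field the format lives in, and leaves the value alone when the user already set any of its bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CONFIGS 5 /* hypothetical count of configN fields */

/* Hypothetical, simplified stand-in for the event attributes. */
struct sketch_attr {
	uint64_t config[NR_CONFIGS];    /* current configN values */
	uint64_t user_bits[NR_CONFIGS]; /* bits the user set explicitly */
};

/* Hypothetical description of where a named format field lives. */
struct sketch_field {
	int idx;       /* which configN holds the field */
	uint64_t mask; /* contiguous, non-zero bit mask within it */
};

static bool set_config_if_unset(struct sketch_attr *attr,
				const struct sketch_field *f, uint64_t val)
{
	if (attr->user_bits[f->idx] & f->mask)
		return false; /* the user chose a value; leave it alone */

	/* Clear the field, then write the new value at its position. */
	attr->config[f->idx] &= ~f->mask;
	attr->config[f->idx] |= (val << __builtin_ctzll(f->mask)) & f->mask;
	return true;
}
```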
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we only track which bits were set by the user in attr->config.
But all configN fields should be treated equally as they can all have
default and user overridden values.
Track them all by making get_config_chgs() generic and calling it once
for each config value.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sparse config fields are technically supported although currently
unused. field_prep() only works for contiguous bitfields so replace it
with pmu_format_value().
pmu_format_value() also takes a bitmap rather than a u64 so replace
'u64 bits' with format->bits.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At least one of these was put here to avoid a Python binding linking
issue which is no longer present. Put them back in their correct
location to avoid confusion about which file to add a new evsel__*
function to later.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/ZEbAS2yx2fguW60w@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the evsel argument first to match the other evsel__* functions
and remove the redundant pmu argument, which can be accessed via evsel.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The ADD_CONFIG_TERM() macros build the __type argument out of a partial
EVSEL__CONFIG_TERM_x enum name. This means that they can't be called
from a function where __type is a variable and it's also impossible to
grep the codebase to find usages of these enums as they're never typed
in full.
Fix this by removing the macros and replacing them with an
add_config_term() function. It seems the main reason these existed in
the first place was to avoid type punning and to write to a specific
field in the union, but the same thing can be achieved with a single
write to a u64 'val' field.
Running the Perf tests with "-fsanitize=undefined -fno-sanitize-recover"
results in no new issues as a result of this change.
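The difference between the two styles can be sketched as follows. All names here are hypothetical and heavily trimmed; they are not the perf definitions, only an illustration of token pasting versus a plain function that writes a single u64 'val' member.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down term types; the real enum has many more. */
enum sketch_term_type {
	SKETCH_TERM_PERIOD,
	SKETCH_TERM_FREQ,
};

struct sketch_term {
	enum sketch_term_type type;
	union {
		uint64_t period;
		uint64_t freq;
		uint64_t val; /* single write target shared by every type */
	} val;
};

/*
 * Macro style: the enum name is pasted together, so the full name
 * "SKETCH_TERM_FREQ" never appears at the call site (nothing to grep
 * for) and the type cannot be passed in as a variable.
 */
#define ADD_TERM_MACRO(term, __type, __field, __val) \
	do {                                         \
		(term)->type = SKETCH_TERM_##__type; \
		(term)->val.__field = (__val);       \
	} while (0)

/* Function style: takes the full enum value, which may be a variable. */
static void add_term(struct sketch_term *term, enum sketch_term_type type,
		     uint64_t val)
{
	term->type = type;
	term->val.val = val;
}
```

Because every member in this sketch is a u64, writing through 'val' stores the same bytes that a write to the specific member would.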
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
strerror() has thread safety issues and strerror_r() requires stack
allocated buffers.
Code in perf has already been using the "%m" formatting flag, a widely
supported glibc extension that prints the current errno's description.
Expand the usage of this formatting flag and remove usage of
strerror()/strerror_r().
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Yunseong Kim <ysk@kzalloc.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's a lot of infrastructure for generating a relatively simple
array used by one function.
Move the array into the function and remove the supporting build logic.
At the same time opportunistically const-ify the array.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add metrics taken from Section 1.2 "Performance Measurement" of the
Performance Monitor Counters for AMD Family 1Ah Model 50h-57h Processors
document available at the link below.
The recommended metrics are sourced from Table 1 "Guidance for Common
Performance Statistics with Complex Event Selects".
The pipeline utilization metrics are sourced from Table 2 "Guidance
for Pipeline Utilization Analysis Statistics". These are useful for
finding performance bottlenecks by analyzing activity at different
stages of the pipeline. There are metric groups available for Level 1
and Level 2 analysis.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add uncore events taken from Section 1.6 "L3 Cache Performance Monitor
Counters" and Section 2.2 "UMC Performance Monitor Events" of the
Performance Monitor Counters for AMD Family 1Ah Model 50h-57h Processors
document available at the link below.
This constitutes events which capture L3 cache and UMC command activity.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add core events taken from Section 1.5 "Core Performance Monitor
Counters" of the Performance Monitor Counters for AMD Family 1Ah Model
50h-57h Processors document available at the link below.
This constitutes events which capture information on op dispatch,
execution and retirement, branch prediction, L1 and L2 cache activity,
TLB activity, etc.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a regular expression in the map file so that appropriate JSON event
files are used for AMD Zen 6 processors. Restrict the regular expression
for AMD Zen 5 processors to known model ranges since they also belong to
Family 1Ah.
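The overlap problem can be sketched with POSIX regular expressions. The patterns and CPUID strings below are hypothetical and only for illustration: they assume a "vendor-family-model-stepping" string with the family in decimal (1Ah is 26) and the model in hex, and they assume models 50h-57h for the newer parts. They are not the entries added to the map file.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if cpuid matches the POSIX extended regular expression. */
static bool cpuid_matches(const char *pattern, const char *cpuid)
{
	regex_t re;
	bool match;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return false; /* treat an invalid pattern as "no match" */
	match = regexec(&re, cpuid, 0, NULL, 0) == 0;
	regfree(&re);
	return match;
}

/* Hypothetical patterns, for illustration only. */
static const char model_50_57_pattern[] =
	"^AuthenticAMD-26-5[0-7]-[[:xdigit:]]+$";
static const char whole_family_pattern[] =
	"^AuthenticAMD-26-[[:xdigit:]]+-[[:xdigit:]]+$";
```

A pattern covering the whole family also matches the newer models, which is why the older entry has to be narrowed to known model ranges once a second generation shares the family number.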
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
[ Moved this one to the front of the series to keep the tree bisectable, as per Ian Rogers' suggestion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch adds the OpenHW Core-V CVA6 RISC-V JSON file.
For more info:
https://openhwfoundation.org/news/2023/11/07/openhw-group-announces-core-v-cva6-platform-project-for-risc-v-software-development-and-testing/
Signed-off-by: Manuel Hernández Méndez <manuel.hernandez@openchip.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The function addr_location__put() was renamed addr_location__exit() in
commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy
functions"). Make the comment preceding the function consistent with
the function itself.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kexin Sun <kexinsun@smail.nju.edu.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ratnadira Widyasari <ratnadiraw@smu.edu.sg>
Cc: Xutong Ma <xutong.ma@inria.fr>
Cc: Yumbo Lyu <yunbolyu@smu.edu.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These are hard to interpret in the raw output because they are printed
as hex but are defined in perf_event.h as decimal. Make it much easier
to read the raw callchains by just printing their names.
For example:
$ perf report -D
1798195372321 0x4638 [0xb0]: PERF_RECORD_SAMPLE(IP, 0x4002): 44922/44922: 0x7c8046dd3400 period: 120218 addr: 0
... FP chain: nr:12
..... 0: fffffffffffffe00 (PERF_CONTEXT_USER)
..... 1: 00007c8046dd3400
..... 2: 00007c8046db86d3
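A minimal sketch of the name lookup, not the perf implementation. The marker values are those of enum perf_callchain_context in the perf_event.h UAPI header; only a subset is reproduced, and the SKETCH_ prefixed macros and the context_name() helper are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Subset of the callchain context markers, written as u64 values. */
#define SKETCH_CONTEXT_HV     ((uint64_t)-32)
#define SKETCH_CONTEXT_KERNEL ((uint64_t)-128)
#define SKETCH_CONTEXT_USER   ((uint64_t)-512)
#define SKETCH_CONTEXT_GUEST  ((uint64_t)-2048)

/* Return the marker name for a callchain entry, or NULL for a real IP. */
static const char *context_name(uint64_t ip)
{
	switch (ip) {
	case SKETCH_CONTEXT_HV:
		return "PERF_CONTEXT_HV";
	case SKETCH_CONTEXT_KERNEL:
		return "PERF_CONTEXT_KERNEL";
	case SKETCH_CONTEXT_USER:
		return "PERF_CONTEXT_USER";
	case SKETCH_CONTEXT_GUEST:
		return "PERF_CONTEXT_GUEST";
	default:
		return NULL;
	}
}
```

The header defines PERF_CONTEXT_USER as -512, which is the fffffffffffffe00 seen in entry 0 of the dump above when printed as a 64-bit hex value.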
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ Add PERF_CONTEXT_USER_DEFERRED too, as per Namhyung's review comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These events are never countable by the PMU and are only intended to
be used as external inputs to trace. Therefore showing them in 'perf
list' is misleading so remove them.
The generator script doesn't emit these events when used with the new
telemetry-solution input files [1].
'perf list' should only show countable events because there are events
that are sometimes implemented, sometimes countable and sometimes not,
for example TRB_TRIG. If we always include any implemented events
whether they are countable or not then it's not possible to tell whether
they are usable in perf without going to the docs, defeating the point
of 'perf list'.
It's also not useful yet to display implemented events that are not
countable (for help in using trace rather than perf stat), because
PMU_OVFS and PMU_HOVFS are practically always implemented and TRB_TRIG
is always implemented when there is TRBE.
[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/data/pmu/cpu
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The correct call-stack option for branch stack sampling should be "stack"
instead of "call_stack". Correct it.
$ perf record -e instructions -j call_stack -- sleep 1
unknown branch filter call_stack, check man page
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-j, --branch-filter <branch filter mask>
branch stack filter modes
Fixes: 955f6def5590ce6c ("perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index"")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I think the return value of SKIP (2) should be used when the entire
test suite was skipped rather than only a few of the tests, while FAIL
should be reserved for when any of the tests failed.
$ perf test -vv 109
109: perf all metricgroups test:
--- start ---
test child forked, pid 2493003
Testing Backend
Testing Bad
Testing BadSpec
Testing BigFootprint
Testing BrMispredicts
Testing Branches
Testing BvBC
Testing BvBO
Testing BvCB
Testing BvFB
Testing BvIO
Testing BvMB
Testing BvML
Testing BvMP
Testing BvMS
Testing BvMT
Testing BvOB
Testing BvUW
Testing CacheHits
Testing CacheMisses
Testing CodeGen
Testing Compute
Testing Cor
Testing DSB
Testing DSBmiss
Testing DataSharing
Testing Default
Testing Default2
Testing Default3
Testing Default4
Ignoring failures in Default4 that may contain unsupported legacy events
Testing Fed
Testing FetchBW
Testing FetchLat
Testing Flops
Testing FpScalar
Testing FpVector
Testing Frontend
Testing HPC
Testing IcMiss
Testing InsType
Testing LSD
Testing LockCont
Testing MachineClears
Testing Machine_Clears
Testing Mem
Testing MemOffcore
Testing MemoryBW
Testing MemoryBound
Testing MemoryLat
Testing MemoryTLB
Testing Memory_BW
Testing Memory_Lat
Testing MicroSeq
Testing OS
Testing Offcore
Testing PGO
Testing Pipeline
Testing PortsUtil
Testing Power
Testing Prefetches
Testing Ret
Testing Retire
Testing SMT
Testing Snoop
Testing SoC
Testing Summary
Testing TmaL1
Testing TmaL2
Testing TmaL3mem
Testing TopdownL1
Testing TopdownL2
Testing TopdownL3
Testing TopdownL4
Testing TopdownL5
Testing TopdownL6
Testing smi
Testing tma_L1_group
Testing tma_L2_group
Testing tma_L3_group
Testing tma_L4_group
Testing tma_L5_group
Testing tma_L6_group
Testing tma_alu_op_utilization_group
Testing tma_assists_group
Testing tma_backend_bound_group
Testing tma_bad_speculation_group
Testing tma_branch_mispredicts_group
Testing tma_branch_resteers_group
Testing tma_code_stlb_miss_group
Testing tma_core_bound_group
Testing tma_divider_group
Testing tma_dram_bound_group
Testing tma_dtlb_load_group
Testing tma_dtlb_store_group
Testing tma_fetch_bandwidth_group
Testing tma_fetch_latency_group
Testing tma_fp_arith_group
Testing tma_fp_vector_group
Testing tma_frontend_bound_group
Testing tma_heavy_operations_group
Testing tma_icache_misses_group
Testing tma_issue2P
Testing tma_issueBM
Testing tma_issueBW
Testing tma_issueComp
Testing tma_issueD0
Testing tma_issueFB
Testing tma_issueFL
Testing tma_issueL1
Testing tma_issueLat
Testing tma_issueMC
Testing tma_issueMS
Testing tma_issueMV
Testing tma_issueRFO
Testing tma_issueSL
Testing tma_issueSO
Testing tma_issueSmSt
Testing tma_issueSpSt
Testing tma_issueSyncxn
Testing tma_issueTLB
Testing tma_itlb_misses_group
Testing tma_l1_bound_group
Testing tma_l2_bound_group
Testing tma_l3_bound_group
Testing tma_light_operations_group
Testing tma_load_stlb_miss_group
Testing tma_machine_clears_group
Testing tma_memory_bound_group
Testing tma_microcode_sequencer_group
Testing tma_mite_group
Testing tma_other_light_ops_group
Testing tma_ports_utilization_group
Testing tma_ports_utilized_0_group
Testing tma_ports_utilized_3m_group
Testing tma_retiring_group
Testing tma_serializing_operation_group
Testing tma_store_bound_group
Testing tma_store_stlb_miss_group
Testing transaction
---- end(0) ----
109: perf all metricgroups test : Ok
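The return value policy described at the top of this message can be sketched as a small aggregation function. This is an illustration in C with hypothetical names, not the test script itself; the exit codes (0 for OK, 1 for FAIL, 2 for SKIP) are assumed here, with only the SKIP value stated in the message.

```c
#include <stddef.h>

/* Assumed exit codes; only SKIP == 2 is stated in the message. */
enum { SKETCH_OK = 0, SKETCH_FAIL = 1, SKETCH_SKIP = 2 };

/*
 * Combine per-case results into one suite result:
 * any failure fails the suite, and the suite is reported as skipped
 * only when every single case was skipped.
 */
static int suite_result(const int *results, size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++) {
		if (results[i] == SKETCH_FAIL)
			return SKETCH_FAIL;
		if (results[i] == SKETCH_SKIP)
			skipped++;
	}
	return (n > 0 && skipped == n) ? SKETCH_SKIP : SKETCH_OK;
}
```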
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I think the return value of SKIP (2) should be used when the entire
test suite was skipped rather than only a few of the tests, while FAIL
should be reserved for when any of the tests failed.
$ perf test -vv 110
110: perf all metrics test:
--- start ---
test child forked, pid 2496399
Testing tma_core_bound
Testing tma_info_core_ilp
Testing tma_info_memory_l2mpki
Testing tma_memory_bound
Testing tma_bottleneck_irregular_overhead
Testing tma_bottleneck_mispredictions
Testing tma_info_bad_spec_branch_misprediction_cost
Testing tma_info_bad_spec_ipmisp_cond_ntaken
Testing tma_info_bad_spec_ipmisp_cond_taken
Testing tma_info_bad_spec_ipmisp_indirect
Testing tma_info_bad_spec_ipmisp_ret
Testing tma_info_bad_spec_ipmispredict
Testing tma_info_branches_callret
Testing tma_info_branches_cond_nt
Testing tma_info_branches_cond_tk
Testing tma_info_branches_jump
Testing tma_info_branches_other_branches
Testing tma_branch_mispredicts
Testing tma_clears_resteers
Testing tma_machine_clears
Testing tma_mispredicts_resteers
Testing tma_bottleneck_big_code
Testing tma_icache_misses
Testing tma_itlb_misses
Testing tma_unknown_branches
Testing tma_info_bad_spec_spec_clears_ratio
Testing tma_other_mispredicts
Testing tma_branch_instructions
Testing tma_info_frontend_tbpc
Testing tma_info_inst_mix_bptkbranch
Testing tma_info_inst_mix_ipbranch
Testing tma_info_inst_mix_ipcall
Testing tma_info_inst_mix_iptb
Testing tma_info_system_ipfarbranch
Testing tma_info_thread_uptb
Testing tma_bottleneck_branching_overhead
Testing tma_nop_instructions
Testing tma_bottleneck_compute_bound_est
Testing tma_divider
Testing tma_ports_utilized_3m
Testing tma_bottleneck_instruction_fetch_bw
Testing tma_frontend_bound
Testing tma_assists
Testing tma_other_nukes
Testing tma_serializing_operation
Testing tma_bottleneck_data_cache_memory_bandwidth
Testing tma_fb_full
Testing tma_mem_bandwidth
Testing tma_sq_full
Testing tma_bottleneck_data_cache_memory_latency
Testing tma_l1_latency_dependency
Testing tma_l2_bound
Testing tma_l3_hit_latency
Testing tma_mem_latency
Testing tma_store_latency
Testing tma_bottleneck_memory_synchronization
Testing tma_contested_accesses
Testing tma_data_sharing
Testing tma_false_sharing
Testing tma_bottleneck_memory_data_tlbs
Testing tma_dtlb_load
Testing tma_dtlb_store
Testing tma_backend_bound
Testing tma_bottleneck_other_bottlenecks
Testing tma_bottleneck_useful_work
Testing tma_retiring
Testing tma_info_memory_fb_hpki
Testing tma_info_memory_l1mpki
Testing tma_info_memory_l1mpki_load
Testing tma_info_memory_l2hpki_all
Testing tma_info_memory_l2hpki_load
Testing tma_info_memory_l2mpki_all
Testing tma_info_memory_l2mpki_load
Testing tma_l1_bound
Testing tma_l3_bound
Testing tma_info_memory_l2mpki_rfo
Testing tma_fp_scalar
Testing tma_fp_vector
Testing tma_fp_vector_128b
Testing tma_fp_vector_256b
Testing tma_fp_vector_512b
Testing tma_port_0
Testing tma_x87_use
Testing tma_info_botlnk_l0_core_bound_likely
Testing tma_info_core_fp_arith_utilization
Testing tma_info_pipeline_execute
Testing tma_info_system_gflops
Testing tma_info_thread_execute_per_issue
Testing tma_dsb
Testing tma_info_botlnk_l2_dsb_bandwidth
Testing tma_info_frontend_dsb_coverage
Testing tma_decoder0_alone
Testing tma_dsb_switches
Testing tma_info_botlnk_l2_dsb_misses
Testing tma_info_frontend_dsb_switch_cost
Testing tma_info_frontend_ipdsb_miss_ret
Testing tma_mite
Testing tma_mite_4wide
Testing CPUs_utilized
Testing backend_cycles_idle
[Ignored backend_cycles_idle] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop':
<not counted> cpu-cycles:u
<not supported> stalled-cycles-backend:u
1.014051473 seconds time elapsed
1.005718000 seconds user
0.008013000 seconds sys
Testing branch_frequency
Testing branch_miss_rate
Testing cs_per_second
Testing cycles_frequency
Testing frontend_cycles_idle
[Ignored frontend_cycles_idle] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop':
<not counted> cpu-cycles:u
<not supported> stalled-cycles-frontend:u
1.012813656 seconds time elapsed
1.004603000 seconds user
0.008004000 seconds sys
Testing insn_per_cycle
Testing migrations_per_second
Testing page_faults_per_second
Testing stalled_cycles_per_instruction
[Ignored stalled_cycles_per_instruction] failed but as a Default metric this can be expected
Error: No supported events found. The stalled-cycles-backend:u event is not supported.
Testing tma_bad_speculation
Testing l1d_miss_rate
Testing llc_miss_rate
Testing dtlb_miss_rate
Testing itlb_miss_rate
[Ignored itlb_miss_rate] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop':
<not supported> iTLB-loads:u
3,097 iTLB-load-misses:u
1.012766732 seconds time elapsed
1.004318000 seconds user
0.008002000 seconds sys
Testing l1i_miss_rate
[Ignored l1i_miss_rate] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop':
<not counted> L1-icache-load-misses:u
<not supported> L1-icache-loads:u
1.013606395 seconds time elapsed
1.001371000 seconds user
0.011968000 seconds sys
Testing l1_prefetch_miss_rate
[Ignored l1_prefetch_miss_rate] failed but as a Default metric this can be expected
Error: No supported events found. The L1-dcache-prefetches:u event is not supported.
Testing tma_info_botlnk_l2_ic_misses
Testing tma_info_frontend_fetch_upc
Testing tma_info_frontend_icache_miss_latency
Testing tma_info_frontend_ipunknown_branch
Testing tma_info_frontend_lsd_coverage
Testing tma_info_memory_tlb_code_stlb_mpki
Testing tma_info_pipeline_fetch_dsb
Testing tma_info_pipeline_fetch_lsd
Testing tma_info_pipeline_fetch_mite
Testing tma_info_pipeline_fetch_ms
Testing tma_fetch_bandwidth
Testing tma_lsd
Testing tma_branch_resteers
Testing tma_code_l2_hit
Testing tma_code_l2_miss
Testing tma_code_stlb_hit
Testing tma_code_stlb_miss
Testing tma_code_stlb_miss_2m
Testing tma_code_stlb_miss_4k
Testing tma_lcp
Testing tma_ms_switches
Testing tma_info_core_flopc
Testing tma_info_inst_mix_iparith
Testing tma_info_inst_mix_iparith_avx128
Testing tma_info_inst_mix_iparith_avx256
Testing tma_info_inst_mix_iparith_avx512
Testing tma_info_inst_mix_iparith_scalar_dp
Testing tma_info_inst_mix_iparith_scalar_sp
Testing tma_info_inst_mix_ipflop
Testing tma_info_inst_mix_ippause
Testing tma_fetch_latency
Testing tma_fp_arith
Testing tma_fp_assists
Testing tma_info_system_cpu_utilization
Testing tma_info_system_dram_bw_use
[Skipped tma_info_system_dram_bw_use] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_TRK_REQUESTS.ALL:u <not supported> UNC_ARB_COH_TRK_REQUESTS.ALL:u 1,013,554,749 duration_time 1.013527265 seconds time elapsed 1.005417000 seconds user 0.008011000 seconds sys
Testing tma_info_frontend_l2mpki_code
Testing tma_info_frontend_l2mpki_code_all
Testing tma_info_inst_mix_ipload
Testing tma_info_inst_mix_ipstore
Testing tma_info_memory_latency_load_l2_miss_latency
Testing tma_lock_latency
Testing tma_info_memory_core_l1d_cache_fill_bw_2t
Testing tma_info_memory_core_l2_cache_fill_bw_2t
Testing tma_info_memory_core_l3_cache_access_bw_2t
Testing tma_info_memory_core_l3_cache_fill_bw_2t
Testing tma_info_memory_l1d_cache_fill_bw
Testing tma_info_memory_l2_cache_fill_bw
Testing tma_info_memory_l3_cache_access_bw
Testing tma_info_memory_l3_cache_fill_bw
Testing tma_info_memory_l3mpki
Testing tma_info_memory_load_miss_real_latency
Testing tma_info_memory_mix_bus_lock_pki
Testing tma_info_memory_mix_uc_load_pki
Testing tma_info_memory_mlp
Testing tma_info_memory_tlb_load_stlb_mpki
Testing tma_info_memory_tlb_page_walks_utilization
Testing tma_info_memory_tlb_store_stlb_mpki
Testing tma_info_system_mem_parallel_reads
[Skipped tma_info_system_mem_parallel_reads] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_DAT_OCCUPANCY.RD/cmask=1/ 1.013354884 seconds time elapsed 1.009239000 seconds user 0.004004000 seconds sys
Testing tma_info_system_mem_read_latency
[Skipped tma_info_system_mem_read_latency] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_TRK_OCCUPANCY.RD <not counted> UNC_ARB_TRK_REQUESTS.RD 1.012882143 seconds time elapsed 1.004600000 seconds user 0.008036000 seconds sys
Testing tma_info_thread_cpi
Testing tma_streaming_stores
Testing tma_dram_bound
Testing tma_store_bound
Testing tma_l2_hit_latency
Testing tma_load_stlb_hit
Testing tma_load_stlb_miss
Testing tma_load_stlb_miss_1g
Testing tma_load_stlb_miss_2m
Testing tma_load_stlb_miss_4k
Testing tma_store_stlb_hit
Testing tma_store_stlb_miss
Testing tma_store_stlb_miss_1g
Testing tma_store_stlb_miss_2m
Testing tma_store_stlb_miss_4k
Testing tma_info_memory_latency_data_l2_mlp
Testing tma_info_memory_latency_load_l2_mlp
Testing tma_info_pipeline_ipassist
Testing tma_microcode_sequencer
Testing tma_ms
Testing tma_info_system_kernel_cpi
[Failed tma_info_system_kernel_cpi] Metric contains missing events
Error: No supported events found. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
Testing tma_info_system_kernel_utilization
[Failed tma_info_system_kernel_utilization] Metric contains missing events
Error: No supported events found. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
Testing tma_info_pipeline_retire
Testing tma_info_thread_clks
Testing tma_info_thread_uoppi
Testing tma_memory_operations
Testing tma_other_light_ops
Testing tma_ports_utilization
Testing tma_ports_utilized_0
Testing tma_ports_utilized_1
Testing tma_ports_utilized_2
Testing C10_Pkg_Residency
[Failed C10_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c10-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c10-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C2_Pkg_Residency
[Failed C2_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c2-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c2-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C3_Pkg_Residency
[Failed C3_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { msr/tsc/, cstate_pkg/c3-residency/ } Error: No supported events found. Invalid event (msr/tsc/u) in per-thread mode, enable system wide with '-a'.
Testing C6_Core_Residency
[Failed C6_Core_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c6-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C6_Pkg_Residency
[Failed C6_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c6-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C7_Core_Residency
[Failed C7_Core_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c7-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C7_Pkg_Residency
[Failed C7_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c7-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C8_Pkg_Residency
[Failed C8_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c8-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c8-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C9_Pkg_Residency
[Failed C9_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c9-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c9-residency/u) in per-thread mode, enable system wide with '-a'.
Testing tma_info_core_epc
Testing tma_info_system_core_frequency
Testing tma_info_system_power
[Skipped tma_info_system_power] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> Joules power/energy-pkg/u 1,013,238,256 duration_time 1.013223072 seconds time elapsed 0.995924000 seconds user 0.011903000 seconds sys
Testing tma_info_system_power_license0_utilization
Testing tma_info_system_power_license1_utilization
Testing tma_info_system_power_license2_utilization
Testing tma_info_system_turbo_utilization
Testing tma_info_inst_mix_ipswpf
Testing tma_info_memory_prefetches_useless_hwpf
Testing tma_info_core_coreipc
Testing tma_info_thread_ipc
Testing tma_heavy_operations
Testing tma_light_operations
Testing tma_info_core_core_clks
Testing tma_info_system_smt_2t_utilization
Testing tma_info_thread_slots_utilization
Testing UNCORE_FREQ
[Skipped UNCORE_FREQ] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_CLOCK.SOCKET:u 1,015,993,466 duration_time 1.015949387 seconds time elapsed 1.007676000 seconds user 0.008029000 seconds sys
Testing tma_info_system_socket_clks
[Failed tma_info_system_socket_clks] Metric contains missing events
Error: No supported events found. Invalid event (UNC_CLOCK.SOCKET:u) in per-thread mode, enable system wide with '-a'.
Testing tma_info_inst_mix_instructions
Testing tma_info_system_cpus_utilized
Testing tma_info_system_mux
Testing tma_info_system_time
Testing tma_info_thread_slots
Testing tma_few_uops_instructions
Testing tma_4k_aliasing
Testing tma_cisc
Testing tma_fp_divider
Testing tma_int_divider
Testing tma_slow_pause
Testing tma_split_loads
Testing tma_split_stores
Testing tma_store_fwd_blk
Testing tma_alu_op_utilization
Testing tma_load_op_utilization
Testing tma_mixing_vectors
Testing tma_store_op_utilization
Testing tma_port_1
Testing tma_port_5
Testing tma_port_6
Testing smi_cycles
[Skipped smi_cycles] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> msr/smi/u <not supported> msr/aperf/u 3,965,789,327 cycles:u 1.012779591 seconds time elapsed 1.004579000 seconds user 0.007972000 seconds sys
Testing smi_num
[Failed smi_num] Metric contains missing events
Error: No supported events found. Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'.
Testing tsx_aborted_cycles
Testing tsx_cycles_per_elision
Testing tsx_cycles_per_transaction
Testing tsx_transactional_cycles
---- end(-1) ----
110: perf all metrics test : FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It uses tools/perf/include which assumes it's running from the root of
the linux kernel source tree. But you can run perf from other places
like tools/perf, then the include path won't match. We can use the
shelldir variable to locate the test script in the tree.
$ cd tools/perf
$ ./perf test dlfilter
63: dlfilter C API : Ok
101: perf script --dlfilter tests : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Building the dlfilter may fail for reasons unrelated to perf itself, for
example when perf test is run without the source code or from a
different directory. Skip the test in that case, as it is not an error
in perf.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The keep_feat() determines which header features will be kept or
discarded. Usually 'perf inject' will add build-IDs based on -b, -B or
other related options. But it loses the build-IDs when none of those options
are used. This is meaningful only when --buildid-mmap is not used.
The following example shows the impact of this change.
$ perf record --no-buildid-mmap true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (5 samples) ]
$ perf inject -i perf.data -o perf.data.inject
$ perf buildid-list -i perf.data
08cccc2a9388d5247ccb3e864f3063b975b0a15d /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
fd5c4d5673256cd6bda51725dba048dabb0f854e [kernel.kallsyms]
97a36ce1140071be5c36b147fa0bed173e05a602 [vdso]
$ perf buildid-list -i perf.data.inject
97a36ce1140071be5c36b147fa0bed173e05a602 [vdso]
With this change, perf.data.inject would show the same list (of course,
you need to run perf inject again).
Reported-by: Gabriel Marin <gmx@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that the SHA-1 code is no longer used, remove it.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recent patches [1] [2] added an implementation of SHA-1 to perf and
switched build ID generation over to it.
I had understood the choice of SHA-1, which is a legacy algorithm, to be
for backwards compatibility.
It turns out, though, that there's no backwards compatibility
requirement here other than the size of the build ID field, which is
fixed at 20 bytes. Not only did the hash algorithm already change (from
MD5 to SHA-1), but the inputs to the hash changed too: from 'load_addr
|| code' to just 'code', and now again to 'code || symtab || strsym'
[3]. Different linkers generate different build IDs, with the LLVM
linker using BLAKE3 hashes for example [4].
Therefore, we might as well switch to a more modern algorithm. Let's go
with BLAKE2s. It's faster than SHA-1, isn't cryptographically broken,
is easier to implement than BLAKE3, and the kernel's implementation in
lib/crypto/blake2s.c is easily borrowed. It also natively supports
variable-length hashes, so it can directly produce the needed 20 bytes.
Also make the following additional improvements:
- Hash the three inputs incrementally, so they don't all have to be
concatenated into one buffer.
- Add tag/length prefixes to each of the three inputs, so that distinct
input tuples reliably result in distinct hashes.
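The two improvements above can be sketched with Python's hashlib, which
provides BLAKE2s with a selectable digest size. This is only an
illustration: the tag values and the 8-byte little-endian length field
are assumptions, not the framing perf actually uses, and build_id() is
a hypothetical name.

```python
import hashlib
import struct


def build_id(code: bytes, symtab: bytes, strsym: bytes) -> bytes:
    """Hash three inputs incrementally into a 20-byte BLAKE2s digest."""
    # BLAKE2s supports variable-length output, so ask for 20 bytes directly.
    h = hashlib.blake2s(digest_size=20)
    for tag, data in ((0, code), (1, symtab), (2, strsym)):
        # Assumed framing: 1-byte tag plus 8-byte little-endian length.
        # The prefix keeps ("ab", "") distinct from ("a", "b").
        h.update(struct.pack("<BQ", tag, len(data)))
        # Feed each input separately; no concatenated buffer is needed.
        h.update(data)
    return h.digest()
```

Without the prefixes, the tuples (b"ab", b"", b"") and (b"a", b"b", b"")
would feed identical byte streams to the hash and collide.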
[1] https://lore.kernel.org/linux-perf-users/20250521225307.743726-1-yuzhuo@google.com/
[2] https://lore.kernel.org/linux-perf-users/20250625202311.23244-1-ebiggers@kernel.org/
[3] https://lore.kernel.org/linux-perf-users/20251125080748.461014-1-namhyung@kernel.org/
[4] https://github.com/llvm/llvm-project/commit/d3e5b6f7539b86995aef6e2075c1edb3059385ce
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add BLAKE2s support to the perf utility library. The code is borrowed
from the kernel. This will replace the use of SHA-1 in genelf.c.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When nothing is excluded from the cmds list, for example when cmds
contains ["read-vdso32"] and excludes contains ["archive"], the main
loop completes with ci == cj == 0. In the original code the loop
processing the remaining elements in the list was conditional:
if (ci != cj) { ...}
So the remaining elements are skipped and we end up in the assertion
loop with ci < cmds->cnt, where we incorrectly assert that the list
elements are NULL and fail with the following error:
help.c:104: exclude_cmds: Assertion `cmds->names[ci] == NULL' failed.
Fix this by moving the if (ci != cj) check inside of a broader loop.
If ci != cj, left shift the list elements, as before, and then
unconditionally advance the ci and cj indices which also covers the
ci == cj case.
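A simplified Python model of the fixed loop, for illustration only. It
works on sorted lists of plain strings rather than on struct cmdnames,
and uses None where the C code uses NULL; the function body is a
sketch of the logic described above, not the actual help.c code.

```python
def exclude_cmds(cmds, excludes):
    """Drop from sorted list cmds every name found in sorted list excludes."""
    ci = cj = ei = 0
    while ci < len(cmds) and ei < len(excludes):
        if cmds[ci] < excludes[ei]:
            # Keep this entry, shifting it left if a gap has opened up.
            if ci != cj:
                cmds[cj] = cmds[ci]
                cmds[ci] = None
            ci += 1
            cj += 1
        elif cmds[ci] == excludes[ei]:
            # Excluded: clear the slot and move on in both lists.
            cmds[ci] = None
            ci += 1
            ei += 1
        else:
            ei += 1
    # The fix: always walk the remaining entries. Shift only when
    # ci != cj, but advance both indices unconditionally.
    while ci < len(cmds):
        if ci != cj:
            cmds[cj] = cmds[ci]
            cmds[ci] = None
        ci += 1
        cj += 1
    # Everything past cj must have been vacated.
    assert all(name is None for name in cmds[cj:])
    del cmds[cj:]
    return cmds
```

With cmds = ["read-vdso32"] and excludes = ["archive"] the main loop
exits with ci == cj == 0, the tail loop advances both indices to 1, and
the final assertion has nothing left to check.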
Fixes: 1fdf938168c4d26f ("perf tools: Fix use-after-free in help_unknown_cmd()")
Reviewed-by: Guilherme Amadio <amadio@gentoo.org>
Signed-off-by: Sri Jayaramappa <sjayaram@akamai.com>
Tested-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Joshua Hunt <johunt@akamai.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20251202213632.2873731-1-sjayaram@akamai.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a test that checks that inline functions are correctly displayed in
'perf script' for the inlineloop workload.
Committer testing:
# perf test 'addr2line inline unwinding'
76: test addr2line inline unwinding : Ok
# perf test -vv 'addr2line inline unwinding'
76: test addr2line inline unwinding:
--- start ---
test child forked, pid 1508628
Inline unwinding verification test
[ perf record: Woken up 129 times to write data ]
[ perf record: Captured and wrote 32.282 MB /tmp/perf-test-inline-addr2line.L4Sz8QtADJ/perf.data (4014 samples) ]
Inline unwinding verification test [Success]
---- end(0) ----
76: test addr2line inline unwinding : Ok
#
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
sample__fprintf_callchain() was using map__fprintf_srcline() which won't
report inline line numbers.
Fix by using the srcline from the callchain and falling back to the map
variant.
Fixes: 25da4fab5f66e659 ("perf evsel: Move fprintf methods to separate source file")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow the addr2line style to be specified on the `perf report` command
line or in the .perfconfig file.
Committer testing:
The methods:
# perf probe -x ~/bin/perf -F *__addr2line
cmd__addr2line
libbfd__addr2line
libdw__addr2line
llvm__addr2line
#
So if we configure one of them, say 'addr2line':
# perf config addr2line.style=addr2line
# perf config addr2line.style
addr2line.style=addr2line
#
And have probes on all of them:
# perf probe -x ~/bin/perf *__addr2line
Added new events:
probe_perf:cmd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:llvm__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libbfd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libdw__addr2line (on *__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw__addr2line -aR sleep 1
#
Only the selected method should be used:
# perf stat -e probe_perf:*_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For hierarchical output, try: perf report --hierarchy)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:cmd__addr2line
0 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
0 probe_perf:libdw__addr2line
0.035915611 seconds time elapsed
0.028008000 seconds user
0.009051000 seconds sys
#
I checked and that is the case for the other methods.
Also when using:
# perf config addr2line.style=libdw,llvm
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
0 probe_perf:cmd__addr2line
23 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
44 probe_perf:libdw__addr2line
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The a2l_style is only relevant to the command line version, so rename
to make this clearer.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add an implementation of addr2line that uses libdw.
Other addr2line implementations are slow, particularly in the case of
forking addr2line.
Add an implementation that caches the libdw information in the dso and
uses it to find the file and line number information.
Inline information is supported, but because cu_walk_functions_at visits
the leaf function last, add an inline_list__append_tail to reverse the
list's order.
Committer testing:
# perf probe -x ~/bin/perf libdw__addr2line
Added new event:
probe_perf:libdw_addr2line (on libdw__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw_addr2line -aR sleep 1
#
# perf stat -e probe_perf:libdw_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For tracepoint events, try: perf report -s trace_fields)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
0.037260744 seconds time elapsed
0.025299000 seconds user
0.011918000 seconds sys
#
Adding probes to the other addr2line implementations (llvm__addr2line,
libbfd__addr2line and cmd__addr2line) I noticed some fallbacks to the
llvm one:
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
23 probe_perf:llvm_addr2line
0 probe_perf:libbfd_addr2line
0 probe_perf:cmd_addr2line
Something to investigate further, but at least we don't fall back to the
cmd based one :-)
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The purpose of this workload is to gather samples in an inlined
function. This can be used to test whether inlined addr2line works
correctly.
Committer testing:
$ perf record perf test -w inlineloop 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.161 MB perf.data (4005 samples) ]
$ perf report --stdio --dso perf -s srcfile,srcline
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
$
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The addition of addr_location__exit() causes use-after-put on the maps
and map references in the unwind info. Add the gets and then add the
map_symbol__exit() calls.
Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The test is based on an error/fix posted to linux-perf-users.
Reported-by: Sri Jayaramappa <sjayaram@akamai.com>
Reviewed-by: Sri Jayaramappa <sjayaram@akamai.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Closes: https://lore.kernel.org/linux-perf-users/20251202213632.2873731-1-sjayaram@akamai.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit bc22de9bcdb22491 ("perf stat: Display time in precision based on
std deviation") added the elapsed time for multi-run workloads. It tried
to pick the output precision most useful to the user, but as a result
the formatting varies from run to run. This change makes the output
format fixed.
Before:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101140 +- 0.000149 seconds time elapsed ( +- 0.15% )
0.1011396 +- 0.0000218 seconds time elapsed ( +- 0.02% )
0.101331 +- 0.000124 seconds time elapsed ( +- 0.12% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001317 +- 0.000146 seconds time elapsed ( +- 0.01% )
1.001377 +- 0.000172 seconds time elapsed ( +- 0.02% )
1.00253 +- 0.00131 seconds time elapsed ( +- 0.13% )
```
After:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101406408 +- 0.000064778 seconds time elapsed ( +- 0.06% )
0.101367315 +- 0.000027253 seconds time elapsed ( +- 0.03% )
0.101434164 +- 0.000084750 seconds time elapsed ( +- 0.08% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001525467 +- 0.000051703 seconds time elapsed ( +- 0.01% )
1.001375093 +- 0.000116200 seconds time elapsed ( +- 0.01% )
1.001141025 +- 0.000046361 seconds time elapsed ( +- 0.00% )
```
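A rough Python sketch of the fixed format shown in the "After" output:
nine decimal places for both the mean and the deviation, regardless of
how large the deviation is. The function name and exact spacing are
assumptions made for illustration; this is not the perf stat code.

```python
def format_elapsed(avg: float, stddev: float) -> str:
    """Format a multi-run elapsed time with a fixed nine-digit precision."""
    # Relative deviation in percent, guarded against a zero mean.
    pct = 100.0 * stddev / avg if avg else 0.0
    return f"{avg:.9f} +- {stddev:.9f} seconds time elapsed ( +- {pct:.2f}% )"
```

Because the precision no longer depends on the deviation, every line
has the same number of digits and repeated runs line up.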
Closes: https://lore.kernel.org/lkml/aTQRgAOpKyI53TEq@gmail.com/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Raise the minimum shellcheck version for perf builds to 0.7.2, so that
systems with shellcheck versions below 0.7.2 will automatically skip the
shell script checking, even if NO_SHELLCHECK is unset.
Since commit 241f21be7d0fdf3c ("perf test perftool_testsuite: Use
absolute paths"), shellcheck versions before 0.7.2 break the perf build
with several SC1090 [2] warnings due to its too strict dynamic source
handling [1], e.g.:
In tests/shell/base_probe/test_line_semantics.sh line 20:
. "$DIR_PATH/../common/init.sh"
^---------------------------^ SC1090: Can't follow non-constant source. Use a directive to specify location.
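The version gate described above could be sketched as follows. This is only an illustration in shell, not the logic in the perf Makefiles; the helper name `version_ge` is hypothetical and it assumes a sort(1) that supports `-V`.

```shell
#!/bin/bash
# version_ge A B: succeed when dotted version A is at least B.
# Assumes a sort(1) with -V (version sort), e.g. GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

min_version=0.7.2
for v in 0.7.1 0.7.2 0.10.0; do
    if version_ge "$v" "$min_version"; then
        echo "shellcheck $v: run the checks"
    else
        echo "shellcheck $v: skip the checks"
    fi
done
```

A version sort is used rather than a string comparison so that 0.10.0 is treated as newer than 0.7.2.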
Fixes: 241f21be7d0fdf3c ("perf test perftool_testsuite: Use absolute paths")
Signed-off-by: Nicolas Schier <n.schier@avm.de>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Brnak <jbrnak@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Nicolas Schier <nsc@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philipp Hahn <p.hahn@avm.de>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Link: https://github.com/koalaman/shellcheck/issues/1998 # [1]
Link: https://www.shellcheck.net/wiki/SC1090 # [2]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On s390 'perf test's 'perf stat tests', subtest test_hybrid fails for
z/VM systems. The root cause is this statement:
$(perf stat -a -- sleep 0.1 2>&1 |\
grep -E "/cpu-cycles/[uH]*| cpu-cycles[:uH]* " -c)
The 'perf stat' output on a s390 z/VM system is
# perf stat -a -- sleep 0.1 2>&1
Performance counter stats for 'system wide':
56 context-switches # 46.3 cs/sec cs_per_second
1,210.41 msec cpu-clock # 11.9 CPUs CPUs_utilized
12 cpu-migrations # 9.9 migrations/sec ...
81 page-faults # 66.9 faults/sec ...
0.100891009 seconds time elapsed
The grep command does not match any line and exits with status 1.
As the bash script is executed with 'set -e', it aborts on the first
command that returns a non-zero exit status.
Fix this and use 'wc -l' to count matching lines instead of 'grep ... -c'.
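The difference between the two ways of counting can be reproduced with a small sketch. It is not the code from the test script; the helper name `count_matches` and the sample input are made up for illustration.

```shell
#!/bin/bash
set -e

# Hypothetical helper: count the lines of $1 that match the extended regex $2.
# 'grep -c' would print 0 and exit with status 1 when nothing matches, which
# aborts a 'set -e' script. With 'grep | wc -l' the pipeline's status is that
# of 'wc', so a count of 0 is not treated as an error (unless 'pipefail' is set).
count_matches() {
    printf '%s\n' "$1" | grep -E "$2" | wc -l
}

n=$(count_matches "12 cpu-migrations" "cpu-cycles")
echo "matching lines: $n"
```

The script reaches the final echo even though no line matches, which is the behaviour the test needs on systems without a cpu-cycles event.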
Output before:
# perf test 102
102: perf stat tests : FAILED!
#
Output after:
# perf test 102
102: perf stat tests : Ok
#
Fixes: bb6e7cb11d97ce19 ("perf tools: Add fallback for exclude_guest")
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jan Polensky <japo@linux.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adjust some oddly indented fprintf() calls.
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a --time option to restrict the range of converted samples using a
time range string, like the --time option of perf-script and perf-report.
Committer testing:
Put a probe on the ICMP receive path handling broadcast packets:
# perf probe icmp_rcv:64
Added new event:
probe:icmp_rcv_L64 (on icmp_rcv:64)
You can now use it in all perf tools, such as:
perf record -e probe:icmp_rcv_L64 -aR sleep 1
# perf record -e probe:icmp_rcv_L64 ping -c 10 -b 127.255.255.255
WARNING: pinging broadcast address
PING 127.255.255.255 (127.255.255.255) 56(84) bytes of data.
^C
--- 127.255.255.255 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9217ms
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.034 MB perf.data (10 samples) ]
# perf script
ping 52785 [009] 5847.300394: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5848.325018: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5849.349007: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5850.372979: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5851.396988: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5852.420954: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5853.444934: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5854.468926: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5855.492914: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5856.516883: probe:icmp_rcv_L64: (ffffffffaadb337e)
#
Now get some slices using perf script:
# perf script --time 40%
ping 52785 [009] 5847.300394: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5848.325018: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5849.349007: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5850.372979: probe:icmp_rcv_L64: (ffffffffaadb337e)
# perf script --time 40%-60%
ping 52785 [009] 5851.396988: probe:icmp_rcv_L64: (ffffffffaadb337e)
ping 52785 [009] 5852.420954: probe:icmp_rcv_L64: (ffffffffaadb337e)
#
And finally use this new feature:
# perf data convert --to-json out.json --time 0%-10%
[ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
[ perf data convert: Converted and wrote 0.001 MB (1 samples) ]
[ perf data convert: Skipped 9 samples ]
# cat out.json
{
"linux-perf-json-version": 1,
"headers": {
"header-version": 1,
"captured-on": "2026-01-06T22:26:40Z",
"data-offset": 520,
"data-size": 34648,
"feat-offset": 35168,
"hostname": "number",
"os-release": "6.17.12-300.fc43.x86_64",
"arch": "x86_64",
"cpu-desc": "AMD Ryzen 9 9950X3D 16-Core Processor",
"cpuid": "AuthenticAMD,26,68,0",
"nrcpus-online": 32,
"nrcpus-avail": 32,
"perf-version": "6.19.rc4.gf4c270685d3d",
"cmdline": [
"/home/acme/bin/perf"
]
},
"samples": [
{
"timestamp": 5847300394661,
"pid": 52785,
"tid": 52785,
"cpu": 9,
"comm": "ping",
"callchain": [
{
"ip": "0xffffffffaadb337f",
"symbol": "icmp_rcv",
"dso": "[kernel.kallsyms]"
}
],
"__probe_ip": "ffffffffaadb337e"
}
]
}
#
Signed-off-by: Derek Foreman <derek.foreman@collabora.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Non-distro builds now require a new version of libbfd, so skip the test
if the library is too old.
The grep test isn't as strong as the feature test in
test-libbfd-threadsafe.c, but there seems to be precedent for feature
testing this way here and it's good enough for the build-test rule. If
the function exists but returns an error it will be picked up by the
feature test when attempting the build.
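A grep-style check of this kind might look like the sketch below. The helper name `header_declares` and the symbol `example_init` are placeholders; the header and function that the build-test rule actually looks for are not stated here.

```shell
#!/bin/bash
# header_declares HEADER SYMBOL: succeed when SYMBOL appears as a whole word
# in the readable file HEADER. This only shows that the name is present; it
# cannot tell whether the function works, which is left to a real feature test.
header_declares() {
    [ -r "$1" ] && grep -qw -- "$2" "$1"
}

# Demonstration with a throwaway header and a placeholder symbol.
hdr=$(mktemp)
printf 'int example_init(void);\n' > "$hdr"
if header_declares "$hdr" example_init; then
    echo "symbol found, run the test"
else
    echo "symbol missing, skip the test"
fi
rm -f "$hdr"
```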
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The non-distro build requires libbfd 2.42 since commit b72b8132d8fd
("perf libbfd: Ensure libbfd is initialized prior to use"). Add a
feature test so that it's obvious why the build fails if this criterion
isn't met.
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
HAVE_LIBBFD_BUILDID_SUPPORT isn't used in the codebase so remove the
feature test that sets it.
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
None of the if statements or variable assignments in the non-distro
block actually affect the feature checks. Just do them all in one place
so the flow isn't obscured.
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
FEATURE_CHECK_LDFLAGS-disassembler-{four-args,init-styled} setting
As the building mechanism is now able to retry detection with different
combinations of linking flags, setting FEATURE_CHECK_LDFLAGS-disassembler-four-args
and FEATURE_CHECK_LDFLAGS-disassembler-init-styled is not necessary anymore,
so remove it.
James Clark notes:
Use the same technique to find the set of bfd-related libraries to link as in:
3308ffc5016e6136 ("tools, build: Retry detection of bfd-related features")
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Bill Wendling <morbo@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If perf is built into an output directory then so is
libperf-jvmti.so.
If the perf binary in that directory is used to run `perf test`, PWD
needn't also be that directory, meaning libperf-jvmti.so won't be found
and the test is skipped.
Add an additional check for libperf-jvmti.so in the same directory as
the perf binary for this case, so that the test is no longer skipped.
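The lookup order can be sketched as below. This is a minimal illustration under assumptions, not the test script itself: the helper name `find_lib` is hypothetical and only the two locations mentioned above are searched.

```shell
#!/bin/bash
# find_lib NAME BIN: print the path of library NAME if it exists in the
# current directory or next to the binary BIN; fail otherwise.
find_lib() {
    lib_name=$1
    bin_path=$2
    for dir in "$PWD" "$(dirname "$bin_path")"; do
        if [ -e "$dir/$lib_name" ]; then
            printf '%s\n' "$dir/$lib_name"
            return 0
        fi
    done
    return 1
}

# Fall back to ./perf when no perf binary is on PATH (assumption for the demo).
perf_bin=$(command -v perf || echo ./perf)
find_lib libperf-jvmti.so "$perf_bin" ||
    echo "libperf-jvmti.so not found, the test would be skipped"
```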
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|