Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/intel-hybrid.txt | 214 |
-rw-r--r-- | tools/perf/Documentation/perf-annotate.txt | 7 |
-rw-r--r-- | tools/perf/Documentation/perf-buildid-cache.txt | 2 |
-rw-r--r-- | tools/perf/Documentation/perf-config.txt | 11 |
-rw-r--r-- | tools/perf/Documentation/perf-data.txt | 5 |
-rw-r--r-- | tools/perf/Documentation/perf-iostat.txt | 88 |
-rw-r--r-- | tools/perf/Documentation/perf-record.txt | 1 |
-rw-r--r-- | tools/perf/Documentation/perf-report.txt | 10 |
-rw-r--r-- | tools/perf/Documentation/perf-stat.txt | 29 |
-rw-r--r-- | tools/perf/Documentation/perf-top.txt | 2 |
-rw-r--r-- | tools/perf/Documentation/perf.txt | 12 |
-rw-r--r-- | tools/perf/Documentation/topdown.txt | 18 |
12 files changed, 394 insertions, 5 deletions
diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt new file mode 100644 index 000000000000..07f0aa3bf682 --- /dev/null +++ b/tools/perf/Documentation/intel-hybrid.txt @@ -0,0 +1,214 @@ +Intel hybrid support +-------------------- +Support for Intel hybrid events within perf tools. + +For some Intel platforms, such as AlderLake, which is hybrid platform and +it consists of atom cpu and core cpu. Each cpu has dedicated event list. +Part of events are available on core cpu, part of events are available +on atom cpu and even part of events are available on both. + +Kernel exports two new cpu pmus via sysfs: +/sys/devices/cpu_core +/sys/devices/cpu_atom + +The 'cpus' files are created under the directories. For example, + +cat /sys/devices/cpu_core/cpus +0-15 + +cat /sys/devices/cpu_atom/cpus +16-23 + +It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus. + +Quickstart + +List hybrid event +----------------- + +As before, use perf-list to list the symbolic event. + +perf list + +inst_retired.any + [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom] +inst_retired.any + [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core] + +The 'Unit: xxx' is added to brief description to indicate which pmu +the event is belong to. Same event name but with different pmu can +be supported. + +Enable hybrid event with a specific pmu +--------------------------------------- + +To enable a core only event or atom only event, following syntax is supported: + + cpu_core/<event name>/ +or + cpu_atom/<event name>/ + +For example, count the 'cycles' event on core cpus. + + perf stat -e cpu_core/cycles/ + +Create two events for one hardware event automatically +------------------------------------------------------ + +When creating one event and the event is available on both atom and core, +two events are created automatically. One is for atom, the other is for +core. 
Most of hardware events and cache events are available on both +cpu_core and cpu_atom. + +For hardware events, they have pre-defined configs (e.g. 0 for cycles). +But on hybrid platform, kernel needs to know where the event comes from +(from atom or from core). The original perf event type PERF_TYPE_HARDWARE +can't carry pmu information. So now this type is extended to be PMU aware +type. The PMU type ID is stored at attr.config[63:32]. + +PMU type ID is retrieved from sysfs. +/sys/devices/cpu_atom/type +/sys/devices/cpu_core/type + +The new attr.config layout for PERF_TYPE_HARDWARE: + +PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA + AA: hardware event ID + EEEEEEEE: PMU type ID + +Cache event is similar. The type PERF_TYPE_HW_CACHE is extended to be +PMU aware type. The PMU type ID is stored at attr.config[63:32]. + +The new attr.config layout for PERF_TYPE_HW_CACHE: + +PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB + BB: hardware cache ID + CC: hardware cache op ID + DD: hardware cache op result ID + EEEEEEEE: PMU type ID + +When enabling a hardware event without specified pmu, such as, +perf stat -e cycles -a (use system-wide in this example), two events +are created automatically. + + ------------------------------------------------------------ + perf_event_attr: + size 120 + config 0x400000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + exclude_guest 1 + ------------------------------------------------------------ + +and + + ------------------------------------------------------------ + perf_event_attr: + size 120 + config 0x800000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + exclude_guest 1 + ------------------------------------------------------------ + +type 0 is PERF_TYPE_HARDWARE. +0x4 in 0x400000000 indicates it's cpu_core pmu. +0x8 in 0x800000000 indicates it's cpu_atom pmu (atom pmu type id is random). 
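Editor's illustration (not part of the patch): the attr.config layouts described above can be sketched in C as follows. This is a minimal sketch only; the helper names (hybrid_hw_config, hybrid_cache_config, hybrid_config_pmu_type, hybrid_config_selfcheck) are hypothetical and do not exist in perf or the kernel headers. The PMU type IDs 4 and 8 are simply the values shown in the perf_event_attr dumps above; on a real system they would have to be read from /sys/devices/cpu_core/type and /sys/devices/cpu_atom/type.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the extended PERF_TYPE_HARDWARE config,
 * layout 0xEEEEEEEE000000AA (EEEEEEEE = PMU type ID, AA = hardware event ID).
 */
static inline uint64_t hybrid_hw_config(uint32_t pmu_type, uint8_t hw_id)
{
	return ((uint64_t)pmu_type << 32) | hw_id;
}

/*
 * Hypothetical helper: build the extended PERF_TYPE_HW_CACHE config,
 * layout 0xEEEEEEEE00DDCCBB (BB = cache ID, CC = op ID, DD = op result ID).
 */
static inline uint64_t hybrid_cache_config(uint32_t pmu_type, uint8_t cache_id,
					   uint8_t op_id, uint8_t result_id)
{
	return ((uint64_t)pmu_type << 32) |
	       ((uint64_t)result_id << 16) |
	       ((uint64_t)op_id << 8) |
	       cache_id;
}

/* Hypothetical helper: recover the PMU type ID from attr.config[63:32]. */
static inline uint32_t hybrid_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> 32);
}

/*
 * Round-trip check against the two config values in the dumps above.
 * Hardware event ID 0 is 'cycles'; PMU type IDs 4 and 8 are assumed
 * from the example and may differ on other machines.
 */
static inline void hybrid_config_selfcheck(void)
{
	assert(hybrid_hw_config(4, 0) == 0x400000000ULL); /* cpu_core cycles */
	assert(hybrid_hw_config(8, 0) == 0x800000000ULL); /* cpu_atom cycles */
	assert(hybrid_config_pmu_type(0x400000000ULL) == 4);
}
```

Under the same assumptions, hybrid_cache_config(8, 1, 0, 0) would yield 0x800000001, which is the kind of value one would expect for an L1 instruction cache read-access event on the PMU with type ID 8.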
+ +The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus), +and create 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus). + +For perf-stat result, it displays two events: + + Performance counter stats for 'system wide': + + 6,744,979 cpu_core/cycles/ + 1,965,552 cpu_atom/cycles/ + +The first 'cycles' is core event, the second 'cycles' is atom event. + +Thread mode example: +-------------------- + +perf-stat reports the scaled counts for hybrid event and with a percentage +displayed. The percentage is the event's running time/enabling time. + +One example, 'triad_loop' runs on cpu16 (atom core), while we can see the +scaled value for core cycles is 160,444,092 and the percentage is 0.47%. + +perf stat -e cycles -- taskset -c 16 ./triad_loop + +As previous, two events are created. + +------------------------------------------------------------ +perf_event_attr: + size 120 + config 0x400000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + enable_on_exec 1 + exclude_guest 1 +------------------------------------------------------------ + +and + +------------------------------------------------------------ +perf_event_attr: + size 120 + config 0x800000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + enable_on_exec 1 + exclude_guest 1 +------------------------------------------------------------ + + Performance counter stats for 'taskset -c 16 ./triad_loop': + + 233,066,666 cpu_core/cycles/ (0.43%) + 604,097,080 cpu_atom/cycles/ (99.57%) + +perf-record: +------------ + +If there is no '-e' specified in perf record, on hybrid platform, +it creates two default 'cycles' and adds them to event list. One +is for core, the other is for atom. + +perf-stat: +---------- + +If there is no '-e' specified in perf stat, on hybrid platform, +besides of software events, following events are created and +added to event list in order. 
+ +cpu_core/cycles/, +cpu_atom/cycles/, +cpu_core/instructions/, +cpu_atom/instructions/, +cpu_core/branches/, +cpu_atom/branches/, +cpu_core/branch-misses/, +cpu_atom/branch-misses/ + +Of course, both perf-stat and perf-record support to enable +hybrid event with a specific pmu. + +e.g. +perf stat -e cpu_core/cycles/ +perf stat -e cpu_atom/cycles/ +perf stat -e cpu_core/r1a/ +perf stat -e cpu_atom/L1-icache-loads/ +perf stat -e cpu_core/cycles/,cpu_atom/instructions/ +perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' + +But '{cpu_core/cycles/,cpu_atom/instructions/}' will return +warning and disable grouping, because the pmus in group are +not matched (cpu_core vs. cpu_atom). diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt index 1b5042f134a8..80c1be5d566c 100644 --- a/tools/perf/Documentation/perf-annotate.txt +++ b/tools/perf/Documentation/perf-annotate.txt @@ -124,6 +124,13 @@ OPTIONS --group:: Show event group information together +--demangle:: + Demangle symbol names to human readable form. It's enabled by default, + disable with --no-demangle. + +--demangle-kernel:: + Demangle kernel symbol names to human readable form (for C++ kernels). + --percent-type:: Set annotation percent type from following choices: global-period, local-period, global-hits, local-hits diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt index bb167e32a1d7..cd8ce6e8ec12 100644 --- a/tools/perf/Documentation/perf-buildid-cache.txt +++ b/tools/perf/Documentation/perf-buildid-cache.txt @@ -57,7 +57,7 @@ OPTIONS -u:: --update=:: Update specified file of the cache. Note that this doesn't remove - older entires since those may be still needed for annotating old + older entries since those may be still needed for annotating old (or remote) perf.data. Only if there is already a cache which has exactly same build-id, that is replaced by new one. 
It can be used to update kallsyms and kernel dso to vmlinux in order to support diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt index 153bde14bbe0..b0872c801866 100644 --- a/tools/perf/Documentation/perf-config.txt +++ b/tools/perf/Documentation/perf-config.txt @@ -123,6 +123,7 @@ Given a $HOME/.perfconfig like this: queue-size = 0 children = true group = true + skip-empty = true [llvm] dump-obj = true @@ -393,6 +394,12 @@ annotate.*:: This option works with tui, stdio2 browsers. + annotate.demangle:: + Demangle symbol names to human readable form. Default is 'true'. + + annotate.demangle_kernel:: + Demangle kernel symbol names to human readable form. Default is 'true'. + hist.*:: hist.percentage:: This option control the way to calculate overhead of filtered entries - @@ -525,6 +532,10 @@ report.*:: 0.07% 0.00% noploop ld-2.15.so [.] strcmp 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del + report.skip-empty:: + This option can change default stat behavior with empty results. + If it's set true, 'perf report --stat' will not show 0 stats. + top.*:: top.children:: Same as 'report.children'. So if it is enabled, the output of 'top' diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt index 726b9bc9e1a7..417bf17e265c 100644 --- a/tools/perf/Documentation/perf-data.txt +++ b/tools/perf/Documentation/perf-data.txt @@ -17,7 +17,7 @@ Data file related processing. COMMANDS -------- convert:: - Converts perf data file into another format (only CTF [1] format is support by now). + Converts perf data file into another format. It's possible to set data-convert debug variable to get debug messages from conversion, like: perf --debug data-convert data convert ... @@ -27,6 +27,9 @@ OPTIONS for 'convert' --to-ctf:: Triggers the CTF conversion, specify the path of CTF data directory. +--to-json:: + Triggers JSON conversion. Specify the JSON filename to output. 
+ --tod:: Convert time to wall clock time. diff --git a/tools/perf/Documentation/perf-iostat.txt b/tools/perf/Documentation/perf-iostat.txt new file mode 100644 index 000000000000..165176944031 --- /dev/null +++ b/tools/perf/Documentation/perf-iostat.txt @@ -0,0 +1,88 @@ +perf-iostat(1) +=============== + +NAME +---- +perf-iostat - Show I/O performance metrics + +SYNOPSIS +-------- +[verse] +'perf iostat' list +'perf iostat' <ports> -- <command> [<options>] + +DESCRIPTION +----------- +Mode is intended to provide four I/O performance metrics per each PCIe root port: + +- Inbound Read - I/O devices below root port read from the host memory, in MB + +- Inbound Write - I/O devices below root port write to the host memory, in MB + +- Outbound Read - CPU reads from I/O devices below root port, in MB + +- Outbound Write - CPU writes to I/O devices below root port, in MB + +OPTIONS +------- +<command>...:: + Any command you can specify in a shell. + +list:: + List all PCIe root ports. + +<ports>:: + Select the root ports for monitoring. Comma-separated list is supported. + +EXAMPLES +-------- + +1. List all PCIe root ports (example for 2-S platform): + + $ perf iostat list + S0-uncore_iio_0<0000:00> + S1-uncore_iio_0<0000:80> + S0-uncore_iio_1<0000:17> + S1-uncore_iio_1<0000:85> + S0-uncore_iio_2<0000:3a> + S1-uncore_iio_2<0000:ae> + S0-uncore_iio_3<0000:5d> + S1-uncore_iio_3<0000:d7> + +2. Collect metrics for all PCIe root ports: + + $ perf iostat -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct + 357708+0 records in + 357707+0 records out + 375083606016 bytes (375 GB, 349 GiB) copied, 215.974 s, 1.7 GB/s + + Performance counter stats for 'system wide': + + port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB) + 0000:00 1 0 2 3 + 0000:80 0 0 0 0 + 0000:17 352552 43 0 21 + 0000:85 0 0 0 0 + 0000:3a 3 0 0 0 + 0000:ae 0 0 0 0 + 0000:5d 0 0 0 0 + 0000:d7 0 0 0 0 + +3. 
Collect metrics for comma-separated list of PCIe root ports: + + $ perf iostat 0000:17,0:3a -- dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct + 357708+0 records in + 357707+0 records out + 375083606016 bytes (375 GB, 349 GiB) copied, 197.08 s, 1.9 GB/s + + Performance counter stats for 'system wide': + + port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound Write(MB) + 0000:17 358559 44 0 22 + 0000:3a 3 2 0 0 + + 197.081983474 seconds time elapsed + +SEE ALSO +-------- +linkperf:perf-stat[1]
\ No newline at end of file diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt index f3161c9673e9..d71bac847936 100644 --- a/tools/perf/Documentation/perf-record.txt +++ b/tools/perf/Documentation/perf-record.txt @@ -695,6 +695,7 @@ measurements: wait -n ${perf_pid} exit $? +include::intel-hybrid.txt[] SEE ALSO -------- diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt index f546b5e9db05..24efc0583c93 100644 --- a/tools/perf/Documentation/perf-report.txt +++ b/tools/perf/Documentation/perf-report.txt @@ -112,6 +112,8 @@ OPTIONS - ins_lat: Instruction latency in core cycles. This is the global instruction latency - local_ins_lat: Local instruction latency version + - p_stage_cyc: On powerpc, this presents the number of cycles spent in a + pipeline stage. And currently supported only on powerpc. By default, comm, dso and symbol keys are used. (i.e. --sort comm,dso,symbol) @@ -224,6 +226,9 @@ OPTIONS --dump-raw-trace:: Dump raw trace in ASCII. +--disable-order:: + Disable raw trace ordering. + -g:: --call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>:: Display call chains using type, min percent threshold, print limit, @@ -472,7 +477,7 @@ OPTIONS but probably we'll make the default not to show the switch-on/off events on the --group mode and if there is only one event besides the off/on ones, go straight to the histogram browser, just like 'perf report' with no events - explicitely specified does. + explicitly specified does. --itrace:: Options for decoding instruction tracing data. The options are: @@ -566,6 +571,9 @@ include::itrace.txt[] sampled cycles 'Avg Cycles' - block average sampled cycles +--skip-empty:: + Do not print 0 results in the --stat output. 
+ include::callchain-overhead-calculation.txt[] SEE ALSO diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt index 08a1714494f8..45c2467e4eb2 100644 --- a/tools/perf/Documentation/perf-stat.txt +++ b/tools/perf/Documentation/perf-stat.txt @@ -93,6 +93,19 @@ report:: 1.102235068 seconds time elapsed +--bpf-counters:: + Use BPF programs to aggregate readings from perf_events. This + allows multiple perf-stat sessions that are counting the same metric (cycles, + instructions, etc.) to share hardware counters. + To use BPF programs on common events by default, use + "perf config stat.bpf-counter-events=<list_of_events>". + +--bpf-attr-map:: + With option "--bpf-counters", different perf-stat sessions share + information about shared BPF programs and maps via a pinned hashmap. + Use "--bpf-attr-map" to specify the path of this pinned hashmap. + The default path is /sys/fs/bpf/perf_attr_map. + ifdef::HAVE_LIBPFM[] --pfm-events events:: Select a PMU event using libpfm4 syntax (see http://perfmon2.sf.net) @@ -142,7 +155,10 @@ Do not aggregate counts across all monitored CPUs. -n:: --null:: - null run - don't start any counters +null run - Don't start any counters. + +This can be useful to measure just elapsed wall-clock time - or to assess the +raw overhead of perf stat itself, without running any counters. -v:: --verbose:: @@ -468,6 +484,15 @@ convenient for post processing. --summary:: Print summary for interval mode (-I). +--no-csv-summary:: +Don't print 'summary' in the first column for CSV summary output. +This option must be used with -x and --summary. + +This option can be enabled in perf config by setting the variable +'stat.no-csv-summary'. + +$ perf config stat.no-csv-summary=true + EXAMPLES -------- @@ -527,6 +552,8 @@ The fields are in this order: Additional metrics may be printed with all earlier fields being empty. 
+include::intel-hybrid.txt[] + SEE ALSO -------- linkperf:perf-top[1], linkperf:perf-list[1] diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt index ee2024691d46..bba5ffb05463 100644 --- a/tools/perf/Documentation/perf-top.txt +++ b/tools/perf/Documentation/perf-top.txt @@ -317,7 +317,7 @@ Default is to monitor all CPUS. but probably we'll make the default not to show the switch-on/off events on the --group mode and if there is only one event besides the off/on ones, go straight to the histogram browser, just like 'perf top' with no events - explicitely specified does. + explicitly specified does. --stitch-lbr:: Show callgraph with stitched LBRs, which may have more complete diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt index c130a3c46a90..9c330cdfa973 100644 --- a/tools/perf/Documentation/perf.txt +++ b/tools/perf/Documentation/perf.txt @@ -76,3 +76,15 @@ SEE ALSO linkperf:perf-stat[1], linkperf:perf-top[1], linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-list[1] + +linkperf:perf-annotate[1],linkperf:perf-archive[1], +linkperf:perf-bench[1], linkperf:perf-buildid-cache[1], +linkperf:perf-buildid-list[1], linkperf:perf-c2c[1], +linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1], +linkperf:perf-evlist[1], linkperf:perf-ftrace[1], +linkperf:perf-help[1], linkperf:perf-inject[1], +linkperf:perf-intel-pt[1], linkperf:perf-kallsyms[1], +linkperf:perf-kmem[1], linkperf:perf-kvm[1], linkperf:perf-lock[1], +linkperf:perf-mem[1], linkperf:perf-probe[1], linkperf:perf-sched[1], +linkperf:perf-script[1], linkperf:perf-test[1], +linkperf:perf-trace[1], linkperf:perf-version[1] diff --git a/tools/perf/Documentation/topdown.txt b/tools/perf/Documentation/topdown.txt index 10f07f9455b8..c6302df4cf29 100644 --- a/tools/perf/Documentation/topdown.txt +++ b/tools/perf/Documentation/topdown.txt @@ -72,6 +72,7 @@ For example, the perf_event_attr structure can be initialized 
with The Fixed counter 3 must be the leader of the group. #include <linux/perf_event.h> +#include <sys/mman.h> #include <sys/syscall.h> #include <unistd.h> @@ -95,6 +96,11 @@ int slots_fd = perf_event_open(&slots, 0, -1, -1, 0); if (slots_fd < 0) ... error ... +/* Memory mapping the fd permits _rdpmc calls from userspace */ +void *slots_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, slots_fd, 0); +if (slots_p == MAP_FAILED) + ... error ... + /* * Open metrics event file descriptor for current task. * Set slots event as the leader of the group. @@ -110,6 +116,14 @@ int metrics_fd = perf_event_open(&metrics, 0, -1, slots_fd, 0); if (metrics_fd < 0) ... error ... +/* Memory mapping the fd permits _rdpmc calls from userspace */ +void *metrics_p = mmap(0, getpagesize(), PROT_READ, MAP_SHARED, metrics_fd, 0); +if (metrics_p == MAP_FAILED) + ... error ... + +Note: the file descriptors returned by the perf_event_open calls must be memory +mapped to permit use of the RDPMC instruction. Permission may also be granted +by writing to the /sys/devices/cpu/rdpmc sysfs node. The RDPMC instruction (or _rdpmc compiler intrinsic) can now be used to read slots and the topdown metrics at different points of the program: @@ -141,6 +155,10 @@ as the parallelism and overlap in the CPU program execution will cause too much measurement inaccuracy. For example instrumenting individual basic blocks is definitely too fine grained. +_rdpmc calls should not be mixed with reading the metrics and slots counters +through system calls, as the kernel will reset these counters after each system +call. + Decoding metrics values =======================