summaryrefslogtreecommitdiff
path: root/tools/perf/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2015-04-15 00:37:47 +0300
committerLinus Torvalds <torvalds@linux-foundation.org>2015-04-15 00:37:47 +0300
commit6c8a53c9e6a151fffb07f8b4c34bd1e33dddd467 (patch)
tree791caf826ef136c521a97b7878f226b6ba1c1d75 /tools/perf/Documentation
parente95e7f627062be5e6ce971ce873e6234c91ffc50 (diff)
parent066450be419fa48007a9f29e19828f2a86198754 (diff)
downloadlinux-6c8a53c9e6a151fffb07f8b4c34bd1e33dddd467.tar.xz
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf changes from Ingo Molnar: "Core kernel changes: - One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively. (Right now it's limited to root-only, but in the future we might allow unprivileged use as well.) (Alexei Starovoitov) - Another non-trivial feature is per event clockid support: this allows, amongst other things, the selection of different clock sources for event timestamps traced via perf. This feature is sought by people who'd like to merge perf generated events with external events that were measured with different clocks: - cluster wide profiling - for system wide tracing with user-space events, - JIT profiling events etc. Matching perf tooling support is added as well, available via the -k, --clockid <clockid> parameter to perf record et al. (Peter Zijlstra) Hardware enablement kernel changes: - x86 Intel Processor Trace (PT) support: which is a hardware tracer on steroids, available on Broadwell CPUs. The hardware trace stream is directly output into the user-space ring-buffer, using the 'AUX' data format extension that was added to the perf core to support hardware constraints such as the necessity to have the tracing buffer physically contiguous. This patch-set was developed for two years and this is the result. A simple way to make use of this is to use BTS tracing, the PT driver emulates BTS output - available via the 'intel_bts' PMU. More explicit PT specific tooling support is in the works as well - will probably be ready by 4.2. (Alexander Shishkin, Peter Zijlstra) - x86 Intel Cache QoS Monitoring (CQM) support: this is a hardware feature of Intel Xeon CPUs that allows the measurement and allocation/partitioning of caches to individual workloads. These kernel changes expose the measurement side as a new PMU driver, which exposes various QoS related PMU events. (The partitioning change is work in progress and is planned to be merged as a cgroup extension.) (Matt Fleming, Peter Zijlstra; CPU feature detection by Peter P Waskiewicz Jr) - x86 Intel Haswell LBR call stack support: this is a new Haswell feature that allows the hardware recording of call chains, plus tooling support. To activate this feature you have to enable it via the new 'lbr' call-graph recording option: perf record --call-graph lbr perf report or: perf top --call-graph lbr This hardware feature is a lot faster than stack walk or dwarf based unwinding, but has some limitations: - It reuses the current LBR facility, so LBR call stack and branch record can not be enabled at the same time. - It is only available for user-space callchains. (Yan, Zheng) - x86 Intel Broadwell CPU support and various event constraints and event table fixes for earlier models. (Andi Kleen) - x86 Intel HT CPUs event scheduling workarounds. This is a complex CPU bug affecting the SNB,IVB,HSW families that results in counter value corruption. The mitigation code is automatically enabled and is transparent. (Maria Dimakopoulou, Stephane Eranian) The perf tooling side had a ton of changes in this cycle as well, so I'm only able to list the user visible changes here, in addition to the tooling changes outlined above: User visible changes affecting all tools: - Improve support of compressed kernel modules (Jiri Olsa) - Save DSO loading errno to better report errors (Arnaldo Carvalho de Melo) - Bash completion for subcommands (Yunlong Song) - Add 'I' event modifier for perf_event_attr.exclude_idle bit (Jiri Olsa) - Support missing -f to override perf.data file ownership. (Yunlong Song) - Show the first event with an invalid filter (David Ahern, Arnaldo Carvalho de Melo) User visible changes in individual tools: 'perf data': New tool for converting perf.data to other formats, initially for the CTF (Common Trace Format) from LTTng (Jiri Olsa, Sebastian Siewior) 'perf diff': Add --kallsyms option (David Ahern) 'perf list': Allow listing events with 'tracepoint' prefix (Yunlong Song) Sort the output of the command (Yunlong Song) 'perf kmem': Respect -i option (Jiri Olsa) Print big numbers using thousands' group (Namhyung Kim) Allow -v option (Namhyung Kim) Fix alignment of slab result table (Namhyung Kim) 'perf probe': Support multiple probes on different binaries on the same command line (Masami Hiramatsu) Support unnamed union/structure members data collection. (Masami Hiramatsu) Check kprobes blacklist when adding new events. (Masami Hiramatsu) 'perf record': Teach 'perf record' about perf_event_attr.clockid (Peter Zijlstra) Support recording running/enabled time (Andi Kleen) 'perf sched': Improve the performance of 'perf sched replay' on high CPU core count machines (Yunlong Song) 'perf report' and 'perf top': Allow annotating entries in callchains in the hists browser (Arnaldo Carvalho de Melo) Indicate which callchain entries are annotated in the TUI hists browser (Arnaldo Carvalho de Melo) Add pid/tid filtering to 'report' and 'script' commands (David Ahern) Consider PERF_RECORD_ events with cpumode == 0 in 'perf top', removing one cause of long term memory usage buildup, i.e. not processing PERF_RECORD_EXIT events (Arnaldo Carvalho de Melo) 'perf stat': Report unsupported events properly (Suzuki K. Poulose) Output running time and run/enabled ratio in CSV mode (Andi Kleen) 'perf trace': Handle legacy syscalls tracepoints (David Ahern, Arnaldo Carvalho de Melo) Only insert blank duration bracket when tracing syscalls (Arnaldo Carvalho de Melo) Filter out the trace pid when no threads are specified (Arnaldo Carvalho de Melo) Dump stack on segfaults (Arnaldo Carvalho de Melo) No need to explicitely enable evsels for workload started from perf, let it be enabled via perf_event_attr.enable_on_exec, removing some events that take place in the 'perf trace' before a workload is really started by it. (Arnaldo Carvalho de Melo) Allow mixing with tracepoints and suppressing plain syscalls. (Arnaldo Carvalho de Melo) There's also been a ton of infrastructure work done, such as the split-out of perf's build system into tools/build/ and other changes - see the shortlog and changelog for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (358 commits) perf/x86/intel/pt: Clean up the control flow in pt_pmu_hw_init() perf evlist: Fix type for references to data_head/tail perf probe: Check the orphaned -x option perf probe: Support multiple probes on different binaries perf buildid-list: Fix segfault when show DSOs with hits perf tools: Fix cross-endian analysis perf tools: Fix error path to do closedir() when synthesizing threads perf tools: Fix synthesizing fork_event.ppid for non-main thread perf tools: Add 'I' event modifier for exclude_idle bit perf report: Don't call map__kmap if map is NULL. perf tests: Fix attr tests perf probe: Fix ARM 32 building error perf tools: Merge all perf_event_attr print functions perf record: Add clockid parameter perf sched replay: Use replay_repeat to calculate the runavg of cpu usage instead of the default value 10 perf sched replay: Support using -f to override perf.data file ownership perf sched replay: Fix the EMFILE error caused by the limitation of the maximum open files perf sched replay: Handle the dead halt of sem_wait when create_tasks() fails for any task perf sched replay: Fix the segmentation fault problem caused by pr_err in threads perf sched replay: Realloc the memory of pid_to_task stepwise to adapt to the different pid_max configurations ...
Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r--tools/perf/Documentation/Build.txt49
-rw-r--r--tools/perf/Documentation/perf-buildid-cache.txt24
-rw-r--r--tools/perf/Documentation/perf-data.txt40
-rw-r--r--tools/perf/Documentation/perf-diff.txt8
-rw-r--r--tools/perf/Documentation/perf-kmem.txt4
-rw-r--r--tools/perf/Documentation/perf-list.txt7
-rw-r--r--tools/perf/Documentation/perf-probe.txt16
-rw-r--r--tools/perf/Documentation/perf-record.txt30
-rw-r--r--tools/perf/Documentation/perf-report.txt5
-rw-r--r--tools/perf/Documentation/perf-script.txt6
-rw-r--r--tools/perf/Documentation/perf-trace.txt6
-rw-r--r--tools/perf/Documentation/perf.txt7
12 files changed, 188 insertions, 14 deletions
diff --git a/tools/perf/Documentation/Build.txt b/tools/perf/Documentation/Build.txt
new file mode 100644
index 000000000000..f6fc6507ba55
--- /dev/null
+++ b/tools/perf/Documentation/Build.txt
@@ -0,0 +1,49 @@
+
+1) perf build
+=============
+The perf build process consists of several separated building blocks,
+which are linked together to form the perf binary:
+ - libperf library (static)
+ - perf builtin commands
+ - traceevent library (static)
+ - GTK ui library
+
+Several makefiles govern the perf build:
+
+ - Makefile
+ top level Makefile working as a wrapper that calls the main
+ Makefile.perf with a -j option to do parallel builds.
+
+ - Makefile.perf
+ main makefile that triggers build of all perf objects including
+ installation and documentation processing.
+
+ - tools/build/Makefile.build
+ main makefile of the build framework
+
+ - tools/build/Build.include
+ build framework generic definitions
+
+ - Build makefiles
+ makefiles that defines build objects
+
+Please refer to tools/build/Documentation/Build.txt for more
+information about build framework.
+
+
+2) perf build
+=============
+The Makefile.perf triggers the build framework for build objects:
+ perf, libperf, gtk
+
+resulting in following objects:
+ $ ls *-in.o
+ gtk-in.o libperf-in.o perf-in.o
+
+Those objects are then used in final linking:
+ libperf-gtk.so <- gtk-in.o libperf-in.o
+ perf <- perf-in.o libperf-in.o
+
+
+NOTE this description is omitting other libraries involved, only
+ focusing on build framework outcomes
diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index 0294c57b1f5e..dd07b55f58d8 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -12,9 +12,9 @@ SYNOPSIS
DESCRIPTION
-----------
-This command manages the build-id cache. It can add and remove files to/from
-the cache. In the future it should as well purge older entries, set upper
-limits for the space used by the cache, etc.
+This command manages the build-id cache. It can add, remove, update and purge
+files to/from the cache. In the future it should as well set upper limits for
+the space used by the cache, etc.
OPTIONS
-------
@@ -36,14 +36,24 @@ OPTIONS
actually made.
-r::
--remove=::
- Remove specified file from the cache.
+ Remove a cached binary which has same build-id of specified file
+ from the cache.
+-p::
+--purge=::
+ Purge all cached binaries including older caches which have specified
+ path from the cache.
-M::
--missing=::
List missing build ids in the cache for the specified file.
-u::
---update::
- Update specified file of the cache. It can be used to update kallsyms
- kernel dso to vmlinux in order to support annotation.
+--update=::
+ Update specified file of the cache. Note that this doesn't remove
+ older entires since those may be still needed for annotating old
+ (or remote) perf.data. Only if there is already a cache which has
+ exactly same build-id, that is replaced by new one. It can be used
+ to update kallsyms and kernel dso to vmlinux in order to support
+ annotation.
+
-v::
--verbose::
Be more verbose.
diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt
new file mode 100644
index 000000000000..be8fa1a0a97e
--- /dev/null
+++ b/tools/perf/Documentation/perf-data.txt
@@ -0,0 +1,40 @@
+perf-data(1)
+==============
+
+NAME
+----
+perf-data - Data file related processing
+
+SYNOPSIS
+--------
+[verse]
+'perf data' [<common options>] <command> [<options>]",
+
+DESCRIPTION
+-----------
+Data file related processing.
+
+COMMANDS
+--------
+convert::
+ Converts perf data file into another format (only CTF [1] format is support by now).
+ It's possible to set data-convert debug variable to get debug messages from conversion,
+ like:
+ perf --debug data-convert data convert ...
+
+OPTIONS for 'convert'
+---------------------
+--to-ctf::
+ Triggers the CTF conversion, specify the path of CTF data directory.
+
+-i::
+ Specify input perf data file path.
+
+-v::
+--verbose::
+ Be more verbose (show counter open errors, etc).
+
+SEE ALSO
+--------
+linkperf:perf[1]
+[1] Common Trace Format - http://www.efficios.com/ctf
diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt
index e463caa3eb49..d1deb573877f 100644
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -20,12 +20,20 @@ If no parameters are passed it will assume perf.data.old and perf.data.
The differential profile is displayed only for events matching both
specified perf.data files.
+If no parameters are passed the samples will be sorted by dso and symbol.
+As the perf.data files could come from different binaries, the symbols addresses
+could vary. So perf diff is based on the comparison of the files and
+symbols name.
+
OPTIONS
-------
-D::
--dump-raw-trace::
Dump raw trace in ASCII.
+--kallsyms=<file>::
+ kallsyms pathname
+
-m::
--modules::
Load module symbols. WARNING: use only with -k and LIVE kernel
diff --git a/tools/perf/Documentation/perf-kmem.txt b/tools/perf/Documentation/perf-kmem.txt
index 7c8fbbf3f61c..150253cc3c97 100644
--- a/tools/perf/Documentation/perf-kmem.txt
+++ b/tools/perf/Documentation/perf-kmem.txt
@@ -25,6 +25,10 @@ OPTIONS
--input=<file>::
Select the input file (default: perf.data unless stdin is a fifo)
+-v::
+--verbose::
+ Be more verbose. (show symbol address, etc)
+
--caller::
Show per-callsite statistics
diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 3e2aec94f806..bada8933fdd4 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -26,6 +26,7 @@ counted. The following modifiers exist:
u - user-space counting
k - kernel counting
h - hypervisor counting
+ I - non idle counting
G - guest counting (in KVM guests)
H - host counting (not in KVM guests)
p - precise level
@@ -127,6 +128,12 @@ To limit the list use:
One or more types can be used at the same time, listing the events for the
types specified.
+Support raw format:
+
+. '--raw-dump', shows the raw-dump of all the events.
+. '--raw-dump [hw|sw|cache|tracepoint|pmu|event_glob]', shows the raw-dump of
+ a certain kind of events.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-top[1],
diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index aaa869be3dc1..239609c09f83 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -47,6 +47,12 @@ OPTIONS
-v::
--verbose::
Be more verbose (show parsed arguments, etc).
+ Can not use with -q.
+
+-q::
+--quiet::
+ Be quiet (do not show any messages including errors).
+ Can not use with -v.
-a::
--add=::
@@ -96,7 +102,7 @@ OPTIONS
Dry run. With this option, --add and --del doesn't execute actual
adding and removal operations.
---max-probes::
+--max-probes=NUM::
Set the maximum number of probe points for an event. Default is 128.
-x::
@@ -104,8 +110,13 @@ OPTIONS
Specify path to the executable or shared library file for user
space tracing. Can also be used with --funcs option.
+--demangle::
+ Demangle application symbols. --no-demangle is also available
+ for disabling demangling.
+
--demangle-kernel::
- Demangle kernel symbols.
+ Demangle kernel symbols. --no-demangle-kernel is also available
+ for disabling kernel demangling.
In absence of -m/-x options, perf probe checks if the first argument after
the options is an absolute path name. If its an absolute path, perf probe
@@ -137,6 +148,7 @@ Each probe argument follows below syntax.
[NAME=]LOCALVAR|$retval|%REG|@SYMBOL[:TYPE]
'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), local array with fixed index (e.g. array[1], var->array[0], var->pointer[2]), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
+'$vars' special argument is also available for NAME, it is expanded to the local variables which can access at given probe point.
'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo. You can specify 'string' type only for the local variable or structure member which is an array of or a pointer to 'char' or 'unsigned char' type.
On x86 systems %REG is always the short form of the register: for example %AX. %RAX or %EAX is not valid.
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 31e977459c51..4847a793de65 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -55,6 +55,11 @@ OPTIONS
If you want to profile write accesses in [0x1000~1008), just set
'mem:0x1000/8:w'.
+ - a group of events surrounded by a pair of brace ("{event1,event2,...}").
+ Each event is separated by commas and the group should be quoted to
+ prevent the shell interpretation. You also need to use --group on
+ "perf report" to view group events together.
+
--filter=<filter>::
Event filter.
@@ -62,9 +67,6 @@ OPTIONS
--all-cpus::
System-wide collection from all CPUs.
--l::
- Scale counter values.
-
-p::
--pid=::
Record events on existing process ID (comma separated list).
@@ -107,6 +109,10 @@ OPTIONS
specification with appended unit character - B/K/M/G. The
size is rounded up to have nearest pages power of two value.
+--group::
+ Put all events in a single event group. This precedes the --event
+ option and remains only for backward compatibility. See --event.
+
-g::
Enables call-graph (stack chain/backtrace) recording.
@@ -115,13 +121,19 @@ OPTIONS
implies -g.
Allows specifying "fp" (frame pointer) or "dwarf"
- (DWARF's CFI - Call Frame Information) as the method to collect
+ (DWARF's CFI - Call Frame Information) or "lbr"
+ (Hardware Last Branch Record facility) as the method to collect
the information used to show the call graphs.
In some systems, where binaries are build with gcc
--fomit-frame-pointer, using the "fp" method will produce bogus
call graphs, using "dwarf", if available (perf tools linked to
the libunwind library) should be used instead.
+ Using the "lbr" method doesn't require any compiler options. It
+ will produce call graphs from the hardware LBR registers. The
+ main limition is that it is only available on new Intel
+ platforms, such as Haswell. It can only get user call chain. It
+ doesn't work with branch stack sampling at the same time.
-q::
--quiet::
@@ -235,6 +247,16 @@ Capture machine state (registers) at interrupt, i.e., on counter overflows for
each sample. List of captured registers depends on the architecture. This option
is off by default.
+--running-time::
+Record running and enabled time for read events (:S)
+
+-k::
+--clockid::
+Sets the clock id to use for the various time fields in the perf_event_type
+records. See clock_gettime(). In particular CLOCK_MONOTONIC and
+CLOCK_MONOTONIC_RAW are supported, some events might also allow
+CLOCK_BOOTTIME, CLOCK_REALTIME and CLOCK_TAI.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1]
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index dd7cccdde498..4879cf638824 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -40,6 +40,11 @@ OPTIONS
Only consider symbols in these comms. CSV that understands
file://filename entries. This option will affect the percentage of
the overhead column. See --percentage for more info.
+--pid=::
+ Only show events for given process ID (comma separated list).
+
+--tid=::
+ Only show events for given thread ID (comma separated list).
-d::
--dsos=::
Only consider symbols in these dsos. CSV that understands
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index a21eec05bc42..79445750fcb3 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -193,6 +193,12 @@ OPTIONS
Only display events for these comms. CSV that understands
file://filename entries.
+--pid=::
+ Only show events for given process ID (comma separated list).
+
+--tid=::
+ Only show events for given thread ID (comma separated list).
+
-I::
--show-info::
Display extended information about the perf.data file. This adds
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index 7e1b1f2bb83c..ba03fd5d1a54 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -55,6 +55,9 @@ OPTIONS
--uid=::
Record events in threads owned by uid. Name or number.
+--filter-pids=::
+ Filter out events for these pids and for 'trace' itself (comma separated list).
+
-v::
--verbose=::
Verbosity level.
@@ -115,6 +118,9 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
--syscalls::
Trace system calls. This options is enabled by default.
+--event::
+ Trace other events, see 'perf list' for a complete list.
+
PAGEFAULTS
----------
diff --git a/tools/perf/Documentation/perf.txt b/tools/perf/Documentation/perf.txt
index 1e8e400b4493..2b131776363e 100644
--- a/tools/perf/Documentation/perf.txt
+++ b/tools/perf/Documentation/perf.txt
@@ -13,11 +13,16 @@ SYNOPSIS
OPTIONS
-------
--debug::
- Setup debug variable (just verbose for now) in value
+ Setup debug variable (see list below) in value
range (0, 10). Use like:
--debug verbose # sets verbose = 1
--debug verbose=2 # sets verbose = 2
+ List of debug variables allowed to set:
+ verbose - general debug messages
+ ordered-events - ordered events object debug messages
+ data-convert - data convert command debug messages
+
--buildid-dir::
Setup buildid cache directory. It has higher priority than
buildid.dir config file option.