Diffstat (limited to 'tools/perf/Documentation')
-rw-r--r-- | tools/perf/Documentation/intel-pt.txt | 44
-rw-r--r-- | tools/perf/Documentation/itrace.txt | 4
-rw-r--r-- | tools/perf/Documentation/perf-inject.txt | 3
-rw-r--r-- | tools/perf/Documentation/perf-script.txt | 3
4 files changed, 54 insertions, 0 deletions
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt
index c94c9de3173e..be764f9ec769 100644
--- a/tools/perf/Documentation/intel-pt.txt
+++ b/tools/perf/Documentation/intel-pt.txt
@@ -671,6 +671,7 @@ The letters are:
 	e	synthesize tracing error events
 	d	create a debug log
 	g	synthesize a call chain (use with i or x)
+	l	synthesize last branch entries (use with i or x)
 
 "Instructions" events look like they were recorded by "perf record -e
 instructions".
@@ -707,12 +708,26 @@ on the sample is *not* adjusted and reflects the last known value of TSC.
 
 For Intel PT, the default period is 100us.
 
+Setting it to a zero period means "as often as possible".
+
+In the case of Intel PT that is the same as a period of 1 and a unit of
+'instructions' (i.e. --itrace=i1i).
+
 Also the call chain size (default 16, max. 1024) for instructions or
 transactions events can be specified. e.g.
 
 	--itrace=ig32
 	--itrace=xg32
 
+Also the number of last branch entries (default 64, max. 1024) for instructions or
+transactions events can be specified. e.g.
+
+	--itrace=il10
+	--itrace=xl10
+
+Note that last branch entries are cleared for each sample, so there is no overlap
+from one sample to the next.
+
 To disable trace decoding entirely, use the option --no-itrace.
 
 
@@ -749,3 +764,32 @@ perf inject also accepts the --itrace option in which case tracing data is
 removed and replaced with the synthesized events. e.g.
 
 	perf inject --itrace -i perf.data -o perf.data.new
+
+Below is an example of using Intel PT with autofdo. It requires autofdo
+(https://github.com/google/autofdo) and gcc version 5. The bubble
+sort example is from the AutoFDO tutorial (https://gcc.gnu.org/wiki/AutoFDO/Tutorial)
+amended to take the number of elements as a parameter.
+
+	$ gcc-5 -O3 sort.c -o sort_optimized
+	$ ./sort_optimized 30000
+	Bubble sorting array of 30000 elements
+	2254 ms
+
+	$ cat ~/.perfconfig
+	[intel-pt]
+		mispred-all
+
+	$ perf record -e intel_pt//u ./sort 3000
+	Bubble sorting array of 3000 elements
+	58 ms
+	[ perf record: Woken up 2 times to write data ]
+	[ perf record: Captured and wrote 3.939 MB perf.data ]
+	$ perf inject -i perf.data -o inj --itrace=i100usle --strip
+	$ ./create_gcov --binary=./sort --profile=inj --gcov=sort.gcov -gcov_version=1
+	$ gcc-5 -O3 -fauto-profile=sort.gcov sort.c -o sort_autofdo
+	$ ./sort_autofdo 30000
+	Bubble sorting array of 30000 elements
+	2155 ms
+
+Note there is currently no advantage to using Intel PT instead of LBR, but
+that may change in the future if greater use is made of the data.
diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 2ff946677e3b..65453f4c7006 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -6,6 +6,7 @@
 		e	synthesize error events
 		d	create a debug log
 		g	synthesize a call chain (use with i or x)
+		l	synthesize last branch entries (use with i or x)
 
 	The default is all events i.e. the same as --itrace=ibxe
 
@@ -20,3 +21,6 @@
 
 	Also the call chain size (default 16, max. 1024) for instructions or
 	transactions events can be specified.
+
+	Also the number of last branch entries (default 64, max. 1024) for
+	instructions or transactions events can be specified.
diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 0c721c3e37e1..0b1cedeef895 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -50,6 +50,9 @@ OPTIONS
 
 include::itrace.txt[]
 
+--strip::
+	Use with --itrace to strip out non-synthesized events.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-archive[1]
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index dc3ec783b7bd..b3b42f9285df 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -249,6 +249,9 @@ include::itrace.txt[]
 --full-source-path::
 	Show the full path for source files for srcline output.
 
+--ns::
+	Use 9 decimal places when displaying time (i.e. show the nanoseconds)
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
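
As a reading aid for option strings such as --itrace=i100usle or --itrace=xl10 in the patch above, here is a minimal Python sketch of how the letters, period and sizes combine. It is not perf's parser: the function name parse_itrace and the result keys are hypothetical, the meaning of "b" (branches) is inferred from the default --itrace=ibxe, and the unit suffixes (i, t, ms, us, ns with ns as the default) come from the unchanged part of itrace.txt rather than from this diff. Only the letters documented at the time of this patch are handled.

```python
import re

# Simple on/off letters. "b" is assumed to mean branch events; the other
# meanings are taken from the letter list in the patched documentation.
FLAGS = {"b": "branches", "x": "transactions", "e": "errors", "d": "debug_log"}

# Letters that accept an optional size, with the documented defaults.
SIZED = {"g": ("callchain", 16), "l": ("last_branch", 64)}
MAX_SIZE = 1024  # documented maximum for both g and l

# One token: "i" with optional period and unit, "g"/"l" with optional size,
# or a plain flag letter.
TOKEN = re.compile(r"i(?:(\d+)(i|t|ms|us|ns)?)?|[gl](\d+)?|[bxed]")


def parse_itrace(opts):
    """Split an --itrace option string into a dict (illustrative only)."""
    result = {}
    pos = 0
    while pos < len(opts):
        m = TOKEN.match(opts, pos)
        if m is None:
            raise ValueError("unrecognised itrace option at %r" % opts[pos:])
        letter = m.group(0)[0]
        if letter == "i":
            result["instructions"] = True
            if m.group(1) is not None:
                # A bare number is assumed to be in nanoseconds.
                result["period"] = (int(m.group(1)), m.group(2) or "ns")
        elif letter in SIZED:
            name, default = SIZED[letter]
            size = int(m.group(3)) if m.group(3) is not None else default
            if size > MAX_SIZE:
                raise ValueError("%s size %d exceeds %d" % (name, size, MAX_SIZE))
            result[name] = size
        else:
            result[FLAGS[letter]] = True
        pos = m.end()
    return result
```

Under these assumptions, the string used in the autofdo example, i100usle, reads as: instructions events every 100us, last branch entries with the default count of 64, and error events.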