diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-03-27 23:42:32 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-03-27 23:42:32 +0300 |
commit | 7b58b82b86c8b65a2b57a4c6cb96a460654f9e09 (patch) | |
tree | a13e19f216389f16f1cb6641d54751f167482515 /tools/perf/util/intel-pt.c | |
parent | 02f9a04d76b76b80b05ddc33ceabe806b84fda3c (diff) | |
parent | ab0809af0bee88b689ba289ec8c40aa2be3a17ec (diff) | |
download | linux-7b58b82b86c8b65a2b57a4c6cb96a460654f9e09.tar.xz |
Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
"New features:
perf ftrace:
- Add -n/--use-nsec option to the 'latency' subcommand.
Default: usecs:
$ sudo perf ftrace latency -T dput -a sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 2098375 | ############################# |
1 - 2 us | 61 | |
2 - 4 us | 33 | |
4 - 8 us | 13 | |
8 - 16 us | 124 | |
16 - 32 us | 123 | |
32 - 64 us | 1 | |
64 - 128 us | 0 | |
128 - 256 us | 1 | |
256 - 512 us | 0 | |
Better granularity with nsec:
$ sudo perf ftrace latency -T dput -a -n sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 0 | |
1 - 2 ns | 0 | |
2 - 4 ns | 0 | |
4 - 8 ns | 0 | |
8 - 16 ns | 0 | |
16 - 32 ns | 0 | |
32 - 64 ns | 0 | |
64 - 128 ns | 1163434 | ############## |
128 - 256 ns | 914102 | ############# |
256 - 512 ns | 884 | |
512 - 1024 ns | 613 | |
1 - 2 us | 31 | |
2 - 4 us | 17 | |
4 - 8 us | 7 | |
8 - 16 us | 123 | |
16 - 32 us | 83 | |
perf lock:
- Add -c/--combine-locks option to merge lock instances in the same
class into a single entry.
# perf lock report -c
Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)
rcu_read_lock 251225 0 0 0 0 0
hrtimer_bases.lock 39450 0 0 0 0 0
&sb->s_type->i_l... 10301 1 662 662 662 662
ptlock_ptr(page) 10173 2 701 1402 760 642
&(ei->i_block_re... 8732 0 0 0 0 0
&xa->xa_lock 8088 0 0 0 0 0
&base->lock 6705 0 0 0 0 0
&p->pi_lock 5549 0 0 0 0 0
&dentry->d_lockr... 5010 4 1274 5097 1844 789
&ep->lock 3958 0 0 0 0 0
- Add -F/--field option to customize the list of fields to output:
$ perf lock report -F contended,wait_max -k avg_wait
Name contended max wait(ns) avg wait(ns)
slock-AF_INET6 1 23543 23543
&lruvec->lru_lock 5 18317 11254
slock-AF_INET6 1 10379 10379
rcu_node_1 1 2104 2104
&dentry->d_lockr... 1 1844 1844
&dentry->d_lockr... 1 1672 1672
&newf->file_lock 15 2279 1025
&dentry->d_lockr... 1 792 792
- Add --synth=no option for record, as there is no need to symbolize,
lock names comes from the tracepoints.
perf record:
- Threaded recording, opt-in, via the new --threads command line
option.
- Improve AMD IBS (Instruction-Based Sampling) error handling
messages.
perf script:
- Add 'brstackinsnlen' field (use it with -F) for branch stacks.
- Output branch sample type in 'perf script'.
perf report:
- Add "addr_from" and "addr_to" sort dimensions.
- Print branch stack entry type in 'perf report --dump-raw-trace'
- Fix symbolization for chrooted workloads.
Hardware tracing:
Intel PT:
- Add CFE (Control Flow Event) and EVD (Event Data) packets support.
- Add MODE.Exec IFLAG bit support.
Explanation about these features from the "Intel® 64 and IA-32
architectures software developer’s manual combined volumes: 1, 2A,
2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:
https://cdrdv2.intel.com/v1/dl/getContent/671200
At page 3951:
"32.2.4
Event Trace is a capability that exposes details about the
asynchronous events, when they are generated, and when their
corresponding software event handler completes execution. These
include:
o Interrupts, including NMI and SMI, including the interrupt
vector when defined.
o Faults, exceptions including the fault vector.
- Page faults additionally include the page fault address,
when in context.
o Event handler returns, including IRET and RSM.
o VM exits and VM entries.¹
- VM exits include the values written to the “exit reason”
and “exit qualification” VMCS fields. INIT and SIPI events.
o TSX aborts, including the abort status returned for the RTM
instructions.
o Shutdown.
Additionally, it provides indication of the status of the
Interrupt Flag (IF), to indicate when interrupts are masked"
ARM CoreSight:
- Use advertised caps/min_interval as default sample_period on ARM
spe.
- Update deduction of TRCCONFIGR register for branch broadcast on
ARM's CoreSight ETM.
Vendor Events (JSON):
Intel:
- Update events and metrics for: Alderlake, Broadwell, Broadwell DE,
BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont,
GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX,
Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP,
Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
Tigerlake, TremontX, Westmere EP-SP, and Westmere EX.
ARM:
- Add support for HiSilicon CPA PMU aliasing.
perf stat:
- Fix forked applications enablement of counters.
- The 'slots' should only be printed on a different order than the
one specified on the command line when 'topdown' events are
present, fix it.
Miscellaneous:
- Sync msr-index, cpufeatures header files with the kernel sources.
- Stop using some deprecated libbpf APIs in 'perf trace'.
- Fix some spelling mistakes.
- Refactor the maps pointers usage to pave the way for using refcount
debugging.
- Only offer the --tui option on perf top, report and annotate when
perf was built with libslang.
- Don't mention --to-ctf in 'perf data --help' when not linking with
the required library, libbabeltrace.
- Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by
array_size.cocci.
- Enhance the matching of sub-commands abbreviations:
'perf c2c rec' -> 'perf c2c record'
'perf c2c recport -> error
- Set build-id using build-id header on new mmap records.
- Fix generation of 'perf --version' string.
perf test:
- Add test for the arm_spe event.
- Add test to check unwinding using fame-pointer (fp) mode on arm64.
- Make metric testing more robust in 'perf test'.
- Add error message for unsupported branch stack cases.
libperf:
- Add API for allocating new thread map array.
- Fix typo in perf_evlist__open() failure error messages in libperf
tests.
perf c2c:
- Replace bitmap_weight() with bitmap_empty() where appropriate"
* tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits)
perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
perf tools: Enhance the matching of sub-commands abbreviations
libperf tests: Fix typo in perf_evlist__open() failure error messages
tools arm64: Import cputype.h
perf lock: Add -F/--field option to control output
perf lock: Extend struct lock_key to have print function
perf lock: Add --synth=no option for record
tools headers cpufeatures: Sync with the kernel sources
tools headers cpufeatures: Sync with the kernel sources
perf stat: Fix forked applications enablement of counters
tools arch x86: Sync the msr-index.h copy with the kernel sources
perf evsel: Make evsel__env() always return a valid env
perf build-id: Fix spelling mistake "Cant" -> "Can't"
perf header: Fix spelling mistake "could't" -> "couldn't"
perf script: Add 'brstackinsnlen' for branch stacks
perf parse-events: Move slots only with topdown
perf ftrace latency: Update documentation
perf ftrace latency: Add -n/--use-nsec option
perf tools: Fix version kernel tag
...
Diffstat (limited to 'tools/perf/util/intel-pt.c')
-rw-r--r-- | tools/perf/util/intel-pt.c | 164 |
1 files changed, 161 insertions, 3 deletions
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index e8613cbda331..ec43d364d0de 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -46,6 +46,12 @@ #define MAX_TIMESTAMP (~0ULL) +#define INTEL_PT_CFG_PASS_THRU BIT_ULL(0) +#define INTEL_PT_CFG_PWR_EVT_EN BIT_ULL(4) +#define INTEL_PT_CFG_BRANCH_EN BIT_ULL(13) +#define INTEL_PT_CFG_EVT_EN BIT_ULL(31) +#define INTEL_PT_CFG_TNT_DIS BIT_ULL(55) + struct range { u64 start; u64 end; @@ -71,6 +77,7 @@ struct intel_pt { bool mispred_all; bool use_thread_stack; bool callstack; + bool cap_event_trace; unsigned int br_stack_sz; unsigned int br_stack_sz_plus; int have_sched_switch; @@ -115,6 +122,12 @@ struct intel_pt { bool sample_pebs; struct evsel *pebs_evsel; + u64 evt_sample_type; + u64 evt_id; + + u64 iflag_chg_sample_type; + u64 iflag_chg_id; + u64 tsc_bit; u64 mtc_bit; u64 mtc_freq_bits; @@ -953,12 +966,26 @@ static bool intel_pt_branch_enable(struct intel_pt *pt) evlist__for_each_entry(pt->session->evlist, evsel) { if (intel_pt_get_config(pt, &evsel->core.attr, &config) && - (config & 1) && !(config & 0x2000)) + (config & INTEL_PT_CFG_PASS_THRU) && + !(config & INTEL_PT_CFG_BRANCH_EN)) return false; } return true; } +static bool intel_pt_disabled_tnt(struct intel_pt *pt) +{ + struct evsel *evsel; + u64 config; + + evlist__for_each_entry(pt->session->evlist, evsel) { + if (intel_pt_get_config(pt, &evsel->core.attr, &config) && + config & INTEL_PT_CFG_TNT_DIS) + return true; + } + return false; +} + static unsigned int intel_pt_mtc_period(struct intel_pt *pt) { struct evsel *evsel; @@ -1214,6 +1241,10 @@ static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt, params.first_timestamp = pt->first_timestamp; params.max_loops = pt->max_loops; + /* Cannot walk code without TNT, so force 'quick' mode */ + if (params.branch_enable && intel_pt_disabled_tnt(pt) && !params.quick) + params.quick = 1; + if (pt->filts.cnt > 0) params.pgd_ip = intel_pt_pgd_ip; @@ -1316,6 +1347,8 @@ static void intel_pt_set_pid_tid_cpu(struct intel_pt *pt, static void intel_pt_sample_flags(struct intel_pt_queue *ptq) { + struct intel_pt *pt = ptq->pt; + ptq->insn_len = 0; if (ptq->state->flags & INTEL_PT_ABORT_TX) { ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT; @@ -1346,6 +1379,17 @@ static void intel_pt_sample_flags(struct intel_pt_queue *ptq) ptq->flags |= PERF_IP_FLAG_TRACE_BEGIN; if (ptq->state->type & INTEL_PT_TRACE_END) ptq->flags |= PERF_IP_FLAG_TRACE_END; + + if (pt->cap_event_trace) { + if (ptq->state->type & INTEL_PT_IFLAG_CHG) { + if (!ptq->state->from_iflag) + ptq->flags |= PERF_IP_FLAG_INTR_DISABLE; + if (ptq->state->from_iflag != ptq->state->to_iflag) + ptq->flags |= PERF_IP_FLAG_INTR_TOGGLE; + } else if (!ptq->state->to_iflag) { + ptq->flags |= PERF_IP_FLAG_INTR_DISABLE; + } + } } static void intel_pt_setup_time_range(struct intel_pt *pt, @@ -2160,6 +2204,78 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) return err; } +static int intel_pt_synth_events_sample(struct intel_pt_queue *ptq) +{ + struct intel_pt *pt = ptq->pt; + union perf_event *event = ptq->event_buf; + struct perf_sample sample = { .ip = 0, }; + struct { + struct perf_synth_intel_evt cfe; + struct perf_synth_intel_evd evd[INTEL_PT_MAX_EVDS]; + } raw; + int i; + + if (intel_pt_skip_event(pt)) + return 0; + + intel_pt_prep_p_sample(pt, ptq, event, &sample); + + sample.id = ptq->pt->evt_id; + sample.stream_id = ptq->pt->evt_id; + + raw.cfe.type = ptq->state->cfe_type; + raw.cfe.reserved = 0; + raw.cfe.ip = !!(ptq->state->flags & INTEL_PT_FUP_IP); + raw.cfe.vector = ptq->state->cfe_vector; + raw.cfe.evd_cnt = ptq->state->evd_cnt; + + for (i = 0; i < ptq->state->evd_cnt; i++) { + raw.evd[i].et = 0; + raw.evd[i].evd_type = ptq->state->evd[i].type; + raw.evd[i].payload = ptq->state->evd[i].payload; + } + + sample.raw_size = perf_synth__raw_size(raw) + + ptq->state->evd_cnt * sizeof(struct perf_synth_intel_evd); + sample.raw_data = perf_synth__raw_data(&raw); + + return intel_pt_deliver_synth_event(pt, event, &sample, + pt->evt_sample_type); +} + +static int intel_pt_synth_iflag_chg_sample(struct intel_pt_queue *ptq) +{ + struct intel_pt *pt = ptq->pt; + union perf_event *event = ptq->event_buf; + struct perf_sample sample = { .ip = 0, }; + struct perf_synth_intel_iflag_chg raw; + + if (intel_pt_skip_event(pt)) + return 0; + + intel_pt_prep_p_sample(pt, ptq, event, &sample); + + sample.id = ptq->pt->iflag_chg_id; + sample.stream_id = ptq->pt->iflag_chg_id; + + raw.flags = 0; + raw.iflag = ptq->state->to_iflag; + + if (ptq->state->type & INTEL_PT_BRANCH) { + raw.via_branch = 1; + raw.branch_ip = ptq->state->to_ip; + } else { + sample.addr = 0; + } + sample.flags = ptq->flags; + + sample.raw_size = perf_synth__raw_size(raw); + sample.raw_data = perf_synth__raw_data(&raw); + + return intel_pt_deliver_synth_event(pt, event, &sample, + pt->iflag_chg_sample_type); +} + static int intel_pt_synth_error(struct intel_pt *pt, int code, int cpu, pid_t pid, pid_t tid, u64 ip, u64 timestamp) { @@ -2266,6 +2382,19 @@ static int intel_pt_sample(struct intel_pt_queue *ptq) return err; } + if (pt->synth_opts.intr_events) { + if (state->type & INTEL_PT_EVT) { + err = intel_pt_synth_events_sample(ptq); + if (err) + return err; + } + if (state->type & INTEL_PT_IFLAG_CHG) { + err = intel_pt_synth_iflag_chg_sample(ptq); + if (err) + return err; + } + } + if (pt->sample_pwr_events) { if (state->type & INTEL_PT_PSB_EVT) { err = intel_pt_synth_psb_sample(ptq); @@ -3429,7 +3558,7 @@ static int intel_pt_synth_events(struct intel_pt *pt, id += 1; } - if (pt->synth_opts.pwr_events && (evsel->core.attr.config & 0x10)) { + if (pt->synth_opts.pwr_events && (evsel->core.attr.config & INTEL_PT_CFG_PWR_EVT_EN)) { attr.config = PERF_SYNTH_INTEL_MWAIT; err = intel_pt_synth_event(session, "mwait", &attr, id); if (err) @@ -3463,6 +3592,28 @@ static int intel_pt_synth_events(struct intel_pt *pt, id += 1; } + if (pt->synth_opts.intr_events && (evsel->core.attr.config & INTEL_PT_CFG_EVT_EN)) { + attr.config = PERF_SYNTH_INTEL_EVT; + err = intel_pt_synth_event(session, "evt", &attr, id); + if (err) + return err; + pt->evt_sample_type = attr.sample_type; + pt->evt_id = id; + intel_pt_set_event_name(evlist, id, "evt"); + id += 1; + } + + if (pt->synth_opts.intr_events && pt->cap_event_trace) { + attr.config = PERF_SYNTH_INTEL_IFLAG_CHG; + err = intel_pt_synth_event(session, "iflag", &attr, id); + if (err) + return err; + pt->iflag_chg_sample_type = attr.sample_type; + pt->iflag_chg_id = id; + intel_pt_set_event_name(evlist, id, "iflag"); + id += 1; + } + return 0; } @@ -3790,7 +3941,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event, } info = &auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] + 1; - info_end = (void *)info + auxtrace_info->header.size; + info_end = (void *)auxtrace_info + auxtrace_info->header.size; if (intel_pt_has(auxtrace_info, INTEL_PT_FILTER_STR_LEN)) { size_t len; @@ -3829,6 +3980,13 @@ int intel_pt_process_auxtrace_info(union perf_event *event, intel_pt_print_info_str("Filter string", pt->filter); } + if ((void *)info < info_end) { + pt->cap_event_trace = *info++; + if (dump_trace) + fprintf(stdout, " Cap Event Trace %d\n", + pt->cap_event_trace); + } + pt->timeless_decoding = intel_pt_timeless_decoding(pt); if (pt->timeless_decoding && !pt->tc.time_mult) pt->tc.time_mult = 1; |