diff options
author | Kim Phillips <kim.phillips@amd.com> | 2020-09-02 01:09:43 +0300 |
---|---|---|
committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2020-09-04 22:32:22 +0300 |
commit | 08ed77e414ab234258166f5628d91a7a0e681ff4 (patch) | |
tree | b9a1616934cb229ebed5d32de1150dc881b02cc4 /tools/perf/pmu-events/jevents.c | |
parent | ab22eea35f1f120c4a379aa26e5333e7e41bb303 (diff) | |
download | linux-08ed77e414ab234258166f5628d91a7a0e681ff4.tar.xz |
perf vendor events amd: Add recommended events
Add support for events listed in Section 2.1.15.2 "Performance
Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803
Rev 0.54 - Sep 12, 2019".
perf now supports these new events (-e):
all_dc_accesses
all_tlbs_flushed
l1_dtlb_misses
l2_cache_accesses_from_dc_misses
l2_cache_accesses_from_ic_misses
l2_cache_hits_from_dc_misses
l2_cache_hits_from_ic_misses
l2_cache_misses_from_dc_misses
l2_cache_misses_from_ic_miss
l2_dtlb_misses
l2_itlb_misses
sse_avx_stalls
uops_dispatched
uops_retired
l3_accesses
l3_misses
and these metrics (-M):
branch_misprediction_ratio
all_l2_cache_accesses
all_l2_cache_hits
all_l2_cache_misses
ic_fetch_miss_ratio
l2_cache_accesses_from_l2_hwpf
l2_cache_hits_from_l2_hwpf
l2_cache_misses_from_l2_hwpf
l3_read_miss_latency
l1_itlb_misses
all_remote_links_outbound
nps1_die_to_dram
The nps1_die_to_dram event may need perf stat's --metric-no-group
switch if the number of available data fabric counters is less
than the number it uses (8).
Committer testing:
On a AMD Ryzen 3900x system:
Before:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$"
#
After:
# perf list all_dc_accesses all_tlbs_flushed l1_dtlb_misses l2_cache_accesses_from_dc_misses l2_cache_accesses_from_ic_misses l2_cache_hits_from_dc_misses l2_cache_hits_from_ic_misses l2_cache_misses_from_dc_misses l2_cache_misses_from_ic_miss l2_dtlb_misses l2_itlb_misses sse_avx_stalls uops_dispatched uops_retired l3_accesses l3_misses | grep -v "^Metric Groups:$" | grep -v "^$" | grep -v "^recommended:$"
all_dc_accesses
[All L1 Data Cache Accesses]
all_tlbs_flushed
[All TLBs Flushed]
l1_dtlb_misses
[L1 DTLB Misses]
l2_cache_accesses_from_dc_misses
[L2 Cache Accesses from L1 Data Cache Misses (including prefetch)]
l2_cache_accesses_from_ic_misses
[L2 Cache Accesses from L1 Instruction Cache Misses (including
prefetch)]
l2_cache_hits_from_dc_misses
[L2 Cache Hits from L1 Data Cache Misses]
l2_cache_hits_from_ic_misses
[L2 Cache Hits from L1 Instruction Cache Misses]
l2_cache_misses_from_dc_misses
[L2 Cache Misses from L1 Data Cache Misses]
l2_cache_misses_from_ic_miss
[L2 Cache Misses from L1 Instruction Cache Misses]
l2_dtlb_misses
[L2 DTLB Misses & Data page walks]
l2_itlb_misses
[L2 ITLB Misses & Instruction page walks]
sse_avx_stalls
[Mixed SSE/AVX Stalls]
uops_dispatched
[Micro-ops Dispatched]
uops_retired
[Micro-ops Retired]
l3_accesses
[L3 Accesses. Unit: amd_l3]
l3_misses
[L3 Misses (includes Chg2X). Unit: amd_l3]
#
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses,l2_cache_hits_from_dc_misses,l2_cache_hits_from_ic_misses,l2_cache_misses_from_dc_misses,l2_cache_misses_from_ic_miss,l2_dtlb_misses,l2_itlb_misses,sse_avx_stalls,uops_dispatched,uops_retired,l3_accesses,l3_misses sleep 2
Performance counter stats for 'system wide':
433,439,949 all_dc_accesses (35.66%)
443 all_tlbs_flushed (35.66%)
2,985,885 l1_dtlb_misses (35.66%)
18,318,019 l2_cache_accesses_from_dc_misses (35.68%)
50,114,810 l2_cache_accesses_from_ic_misses (35.72%)
12,423,978 l2_cache_hits_from_dc_misses (35.74%)
40,703,103 l2_cache_hits_from_ic_misses (35.74%)
6,698,673 l2_cache_misses_from_dc_misses (35.74%)
12,090,892 l2_cache_misses_from_ic_miss (35.74%)
614,267 l2_dtlb_misses (35.74%)
216,036 l2_itlb_misses (35.74%)
11,977 sse_avx_stalls (35.74%)
999,276,223 uops_dispatched (35.73%)
1,075,311,620 uops_retired (35.69%)
1,420,763 l3_accesses
540,164 l3_misses
2.002344121 seconds time elapsed
# perf stat -a -e all_dc_accesses,all_tlbs_flushed,l1_dtlb_misses,l2_cache_accesses_from_dc_misses,l2_cache_accesses_from_ic_misses sleep 2
Performance counter stats for 'system wide':
175,943,104 all_dc_accesses
310 all_tlbs_flushed
2,280,359 l1_dtlb_misses
11,700,151 l2_cache_accesses_from_dc_misses
25,414,963 l2_cache_accesses_from_ic_misses
2.001957818 seconds time elapsed
#
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jon Grimm <jon.grimm@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: Martin Liška <mliska@suse.cz>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vijay Thakkar <vijaythakkar@me.com>
Cc: William Cohen <wcohen@redhat.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: http://lore.kernel.org/lkml/20200901220944.277505-3-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools/perf/pmu-events/jevents.c')
-rw-r--r-- | tools/perf/pmu-events/jevents.c | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c index fc9c158bfa13..e37d17830c5c 100644 --- a/tools/perf/pmu-events/jevents.c +++ b/tools/perf/pmu-events/jevents.c @@ -240,6 +240,7 @@ static struct map { { "hisi_sccl,hha", "hisi_sccl,hha" }, { "hisi_sccl,l3c", "hisi_sccl,l3c" }, { "L3PMC", "amd_l3" }, + { "DFPMC", "amd_df" }, {} }; |