From 35503ce12a2c3d5d9a94e3cd85a06739b0120f79 Mon Sep 17 00:00:00 2001
From: Jiri Olsa
Date: Wed, 31 Aug 2022 14:40:41 +0200
Subject: perf script: Skip dummy event attr check

Hongtao Yu reported problem when displaying uregs in perf script for
system wide perf.data:

  # perf script -F uregs | head -10
  Samples for 'dummy:HG' event do not have UREGS attribute set. Cannot print 'uregs' field.

The problem is the extra dummy event added for system wide, which does
not have proper sample_type setup.

Skipping attr check completely for dummy event as suggested by Namhyung,
because it does not have any samples anyway.

Reported-by: Hongtao Yu
Suggested-by: Namhyung Kim
Signed-off-by: Jiri Olsa
Acked-by: Ian Rogers
Cc: Alexander Shishkin
Cc: Ingo Molnar
Cc: Mark Rutland
Cc: Peter Zijlstra
Link: https://lore.kernel.org/r/20220831124041.219925-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-script.c | 2 ++
 1 file changed, 2 insertions(+)

(limited to 'tools/perf/builtin-script.c')

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 13580a9c50b8..304d234d8e84 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -566,6 +566,8 @@ static struct evsel *find_first_output_type(struct evlist *evlist,
 	struct evsel *evsel;
 
 	evlist__for_each_entry(evlist, evsel) {
+		if (evsel__is_dummy_event(evsel))
+			continue;
 		if (output_type(evsel->core.attr.type) == (int)type)
 			return evsel;
 	}
-- 
cgit v1.2.3

From 82b2425fad2dd47204b3da589b679220f8aacc0e Mon Sep 17 00:00:00 2001
From: Zhengjun Xing
Date: Thu, 8 Sep 2022 15:00:30 +0800
Subject: perf script: Fix Cannot print 'iregs' field for hybrid systems

Commit b91e5492f9d7ca89 ("perf record: Add a dummy event on hybrid
systems to collect metadata records") adds a dummy event on hybrid
systems to fix the symbol "unknown" issue when the workload is created
in a P-core but runs on an E-core. The added dummy event will cause
"perf script -F iregs" to fail.
Dummy events do not have "iregs" attribute set, so when we do
evsel__check_attr, the "iregs" attribute check will fail, so the issue
happened.

The following commit [1] has fixed a similar issue by skipping the attr
check for the dummy event because it does not have any samples anyway.
It works okay for the normal mode, but the issue still happened when
running the test in the pipe mode. In the pipe mode, it calls
process_attr() which still checks the attr for the dummy event. This
commit fixed the issue by skipping the attr check for the dummy event in
the API evsel__check_attr, Otherwise, we have to patch everywhere when
evsel__check_attr() is called.

Before:

  #./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
  Samples for 'dummy:HG' event do not have IREGS attribute set. Cannot print 'iregs' field.
  0x120 [0x90]: failed to process type: 64
  #

After:

  # ./perf record -o - --intr-regs=di,r8,dx,cx -e br_inst_retired.near_call:p -c 1000 --per-thread true 2>/dev/null|./perf script -F iregs |head -5
  ABI:2 CX:0x55b8efa87000 DX:0x55b8efa7e000 DI:0xffffba5e625efbb0 R8:0xffff90e51f8ae100
  ABI:2 CX:0x7f1dae1e4000 DX:0xd0 DI:0xffff90e18c675ac0 R8:0x71
  ABI:2 CX:0xcc0 DX:0x1 DI:0xffff90e199880240 R8:0x0
  ABI:2 CX:0xffff90e180dd7500 DX:0xffff90e180dd7500 DI:0xffff90e180043500 R8:0x1
  ABI:2 CX:0x50 DX:0xffff90e18c583bd0 DI:0xffff90e1998803c0 R8:0x58
  #

[1]https://lore.kernel.org/lkml/20220831124041.219925-1-jolsa@kernel.org/

Fixes: b91e5492f9d7ca89 ("perf record: Add a dummy event on hybrid systems to collect metadata records")
Suggested-by: Namhyung Kim
Signed-off-by: Xing Zhengjun
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Ian Rogers
Cc: Ingo Molnar
Cc: Kan Liang
Cc: Peter Zijlstra
Link: https://lore.kernel.org/r/20220908070030.3455164-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-script.c | 3 +++
 1 file changed, 3 insertions(+)

(limited to 'tools/perf/builtin-script.c')

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 304d234d8e84..029b4330e59b 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -445,6 +445,9 @@ static int evsel__check_attr(struct evsel *evsel, struct perf_session *session)
 	struct perf_event_attr *attr = &evsel->core.attr;
 	bool allow_user_set;
 
+	if (evsel__is_dummy_event(evsel))
+		return 0;
+
 	if (perf_header__has_feat(&session->header, HEADER_STAT))
 		return 0;
 
-- 
cgit v1.2.3

From 0ddea8e2a0c20ff32a28ef21574f704d8f4699a2 Mon Sep 17 00:00:00 2001
From: Anshuman Khandual
Date: Wed, 24 Aug 2022 10:18:20 +0530
Subject: perf branch: Extend branch type classification

This updates the perf tool with generic branch type classification with
new ABI extender place holder i.e PERF_BR_EXTEND_ABI, the new 4 bit
branch type field i.e perf_branch_entry.new_type, new generic page fault
related branch types and some arch specific branch types as added earlier
in the kernel.
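
The two-level lookup described above can be sketched in isolation. This is a minimal illustration, not the perf code itself: every `demo_` and `DEMO_` name is hypothetical, the numeric values mirror the enum entries visible in this patch, and the legacy names before "ERET" are assumed because the diff context does not show them.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the PERF_BR_* values touched by this patch. */
enum {
	DEMO_BR_UNKNOWN    = 0,
	DEMO_BR_EXTEND_ABI = 15,	/* "look at new_type instead" marker */
	DEMO_BR_MAX        = 16,
};

enum { DEMO_BR_NEW_MAX = 8 };

/* Only the two fields relevant here; the real struct has more bits. */
struct demo_branch_flags {
	unsigned int type:4;		/* legacy 4-bit branch type */
	unsigned int new_type:4;	/* valid only when type == EXTEND_ABI */
};

/* Legacy table; entries before "ERET" are assumed, not taken from the diff. */
static const char *demo_type_name(unsigned int type)
{
	static const char *const names[DEMO_BR_MAX] = {
		"N/A", "COND", "UNCOND", "IND", "CALL", "IND_CALL", "RET",
		"SYSCALL", "SYSRET", "COND_CALL", "COND_RET",
		"ERET", "IRQ", "SERROR", "NO_TX",
		"",	/* slot for EXTEND_ABI so the table has no NULL hole */
	};

	return type < DEMO_BR_MAX ? names[type] : NULL;
}

/* Extended table, indexed by new_type. */
static const char *demo_new_type_name(unsigned int new_type)
{
	static const char *const names[DEMO_BR_NEW_MAX] = {
		"FAULT_ALGN", "FAULT_DATA", "FAULT_INST",
		"ARCH_1", "ARCH_2", "ARCH_3", "ARCH_4", "ARCH_5",
	};

	return new_type < DEMO_BR_NEW_MAX ? names[new_type] : NULL;
}

/* Pick the table based on whether the legacy field says "extended". */
static const char *demo_branch_label(struct demo_branch_flags f)
{
	if (f.type == DEMO_BR_UNKNOWN)
		return "";
	if (f.type == DEMO_BR_EXTEND_ABI)
		return demo_new_type_name(f.new_type);
	return demo_type_name(f.type);
}
```

Reserving one legacy value as an escape keeps the original 4-bit field stable for existing consumers while leaving room for further types in the second field.
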

Committer note:

Add an extra entry to the branch_type_name array to cope with
PERF_BR_EXTEND_ABI, to address build warnings on some compiler/systems,
like:

  75 8.89 ubuntu:20.04-x-powerpc64el : FAIL gcc version 10.3.0 (Ubuntu 10.3.0-1ubuntu1~20.04)
    inlined from 'branch_type_stat_display' at util/branch.c:152:4:
  /usr/powerpc64le-linux-gnu/include/bits/stdio2.h:100:10: error: '%8s' directive argument is null [-Werror=format-overflow=]
    100 | return __fprintf_chk (__stream, __USE_FORTIFY_LEVEL - 1, __fmt,
        | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    101 | __va_arg_pack ());
        | ~~~~~~~~~~~~~~~~~

Signed-off-by: Anshuman Khandual
Cc: Alexander Shishkin
Cc: Catalin Marinas
Cc: Ingo Molnar
Cc: James Clark
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Robin Murphy
Cc: Stephen Rothwell
Cc: Suzuki Poulouse
Cc: Thomas Gleixner
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220824044822.70230-7-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/include/uapi/linux/perf_event.h | 16 +++++++++-
 tools/perf/builtin-script.c           |  2 +-
 tools/perf/util/branch.c              | 55 +++++++++++++++++++++++++++++++++--
 tools/perf/util/branch.h              |  6 +++-
 tools/perf/util/session.c             |  2 +-
 5 files changed, 75 insertions(+), 6 deletions(-)

(limited to 'tools/perf/builtin-script.c')

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 146c137ff0c1..0f7c7ce29899 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -255,9 +255,22 @@ enum {
 	PERF_BR_IRQ		= 12,	/* irq */
 	PERF_BR_SERROR		= 13,	/* system error */
 	PERF_BR_NO_TX		= 14,	/* not in transaction */
+	PERF_BR_EXTEND_ABI	= 15,	/* extend ABI */
 	PERF_BR_MAX,
 };
 
+enum {
+	PERF_BR_NEW_FAULT_ALGN		= 0,	/* Alignment fault */
+	PERF_BR_NEW_FAULT_DATA		= 1,	/* Data fault */
+	PERF_BR_NEW_FAULT_INST		= 2,	/* Inst fault */
+	PERF_BR_NEW_ARCH_1		= 3,	/* Architecture specific */
+	PERF_BR_NEW_ARCH_2		= 4,	/* Architecture specific */
+	PERF_BR_NEW_ARCH_3		= 5,	/* Architecture specific */
+	PERF_BR_NEW_ARCH_4		= 6,	/* Architecture specific */
+	PERF_BR_NEW_ARCH_5		= 7,	/* Architecture specific */
+	PERF_BR_NEW_MAX,
+};
+
 #define PERF_SAMPLE_BRANCH_PLM_ALL \
 	(PERF_SAMPLE_BRANCH_USER|\
 	 PERF_SAMPLE_BRANCH_KERNEL|\
@@ -1375,7 +1388,8 @@ struct perf_branch_entry {
 		abort:1,    /* transaction abort */
 		cycles:16,  /* cycle count to last branch */
 		type:4,     /* branch type */
-		reserved:40;
+		new_type:4, /* additional branch type */
+		reserved:36;
 };
 
 union perf_sample_weight {
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 029b4330e59b..886f53cfa257 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -882,7 +882,7 @@ static int print_bstack_flags(FILE *fp, struct branch_entry *br)
 		       br->flags.in_tx ? 'X' : '-',
 		       br->flags.abort ? 'A' : '-',
 		       br->flags.cycles,
-		       br->flags.type ? branch_type_name(br->flags.type) : "-");
+		       get_branch_type(br));
 }
 
 static int perf_sample__fprintf_brstack(struct perf_sample *sample,
diff --git a/tools/perf/util/branch.c b/tools/perf/util/branch.c
index abc673347bee..675cbbe80ce3 100644
--- a/tools/perf/util/branch.c
+++ b/tools/perf/util/branch.c
@@ -21,7 +21,10 @@ void branch_type_count(struct branch_type_stat *st, struct branch_flags *flags,
 	if (flags->type == PERF_BR_UNKNOWN || from == 0)
 		return;
 
-	st->counts[flags->type]++;
+	if (flags->type == PERF_BR_EXTEND_ABI)
+		st->new_counts[flags->new_type]++;
+	else
+		st->counts[flags->type]++;
 
 	if (flags->type == PERF_BR_COND) {
 		if (to > from)
@@ -36,6 +39,25 @@ void branch_type_count(struct branch_type_stat *st, struct branch_flags *flags,
 		st->cross_4k++;
 }
 
+const char *branch_new_type_name(int new_type)
+{
+	const char *branch_new_names[PERF_BR_NEW_MAX] = {
+		"FAULT_ALGN",
+		"FAULT_DATA",
+		"FAULT_INST",
+		"ARCH_1",
+		"ARCH_2",
+		"ARCH_3",
+		"ARCH_4",
+		"ARCH_5"
+	};
+
+	if (new_type >= 0 && new_type < PERF_BR_NEW_MAX)
+		return branch_new_names[new_type];
+
+	return NULL;
+}
+
 const char *branch_type_name(int type)
 {
 	const char *branch_names[PERF_BR_MAX] = {
@@ -53,7 +75,8 @@ const char *branch_type_name(int type)
 		"ERET",
 		"IRQ",
 		"SERROR",
-		"NO_TX"
+		"NO_TX",
+		"", // Needed for PERF_BR_EXTEND_ABI that ends up triggering some compiler warnings about NULL deref
 	};
 
 	if (type >= 0 && type < PERF_BR_MAX)
@@ -62,6 +85,17 @@ const char *branch_type_name(int type)
 	return NULL;
 }
 
+const char *get_branch_type(struct branch_entry *e)
+{
+	if (e->flags.type == PERF_BR_UNKNOWN)
+		return "";
+
+	if (e->flags.type == PERF_BR_EXTEND_ABI)
+		return branch_new_type_name(e->flags.new_type);
+
+	return branch_type_name(e->flags.type);
+}
+
 void branch_type_stat_display(FILE *fp, struct branch_type_stat *st)
 {
 	u64 total = 0;
@@ -108,6 +142,15 @@ void branch_type_stat_display(FILE *fp, struct branch_type_stat *st)
 				100.0 *
 				(double)st->counts[i] / (double)total);
 	}
+
+	for (i = 0; i < PERF_BR_NEW_MAX; i++) {
+		if (st->new_counts[i] > 0)
+			fprintf(fp, "\n%8s: %5.1f%%",
+				branch_new_type_name(i),
+				100.0 *
+				(double)st->new_counts[i] / (double)total);
+	}
+
 }
 
 static int count_str_scnprintf(int idx, const char *str, char *bf, int size)
@@ -123,6 +166,9 @@ int branch_type_str(struct branch_type_stat *st, char *bf, int size)
 	for (i = 0; i < PERF_BR_MAX; i++)
 		total += st->counts[i];
 
+	for (i = 0; i < PERF_BR_NEW_MAX; i++)
+		total += st->new_counts[i];
+
 	if (total == 0)
 		return 0;
 
@@ -140,6 +186,11 @@ int branch_type_str(struct branch_type_stat *st, char *bf, int size)
 			printed += count_str_scnprintf(j++, branch_type_name(i), bf + printed, size - printed);
 	}
 
+	for (i = 0; i < PERF_BR_NEW_MAX; i++) {
+		if (st->new_counts[i] > 0)
+			printed += count_str_scnprintf(j++, branch_new_type_name(i), bf + printed, size - printed);
+	}
+
 	if (st->cross_4k > 0)
 		printed += count_str_scnprintf(j++, "CROSS_4K", bf + printed, size - printed);
 
diff --git a/tools/perf/util/branch.h b/tools/perf/util/branch.h
index 17b2ccc61094..8d251b35428a 100644
--- a/tools/perf/util/branch.h
+++ b/tools/perf/util/branch.h
@@ -24,7 +24,8 @@ struct branch_flags {
 			u64 abort:1;
 			u64 cycles:16;
 			u64 type:4;
-			u64 reserved:40;
+			u64 new_type:4;
+			u64 reserved:36;
 		};
 	};
 };
@@ -72,6 +73,7 @@ static inline struct branch_entry *perf_sample__branch_entries(struct perf_sampl
 struct branch_type_stat {
 	bool	branch_to;
 	u64	counts[PERF_BR_MAX];
+	u64	new_counts[PERF_BR_NEW_MAX];
 	u64	cond_fwd;
 	u64	cond_bwd;
 	u64	cross_4k;
@@ -82,6 +84,8 @@ void branch_type_count(struct branch_type_stat *st, struct branch_flags *flags,
 		       u64 from, u64 to);
 
 const char *branch_type_name(int type);
+const char *branch_new_type_name(int new_type);
+const char *get_branch_type(struct branch_entry *e);
 void branch_type_stat_display(FILE *fp, struct branch_type_stat *st);
 int branch_type_str(struct branch_type_stat *st, char *bf, int bfsize);
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 192c9274f7ad..47d5a50e616a 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1180,7 +1180,7 @@ static void branch_stack__printf(struct perf_sample *sample, bool callstack)
 				e->flags.abort ? "A" : " ",
 				e->flags.in_tx ? "T" : " ",
 				(unsigned)e->flags.reserved,
-				e->flags.type ? branch_type_name(e->flags.type) : "");
+				get_branch_type(e));
 		} else {
 			if (i == 0) {
 				printf("..... %2"PRIu64": %016" PRIx64 "\n"
-- 
cgit v1.2.3

From 1337b9dcb03b1c81448eed1b70296148f62730b8 Mon Sep 17 00:00:00 2001
From: Namhyung Kim
Date: Mon, 3 Oct 2022 13:46:47 -0700
Subject: perf tools: Remove special handling of system-wide evsel

For system-wide evsels, the thread map should be dummy - i.e. it has a
single entry of -1. But the code guarantees such a thread map, so no
need to handle it specially.

No functional change intended.
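
Why the special case is redundant can be shown with a toy model. This is only a sketch under stated assumptions: `demo_thread_map` and the `demo_` helpers are hypothetical simplifications of the libperf thread map, and the NULL-means-one behaviour of `demo_thread_map_nr()` is assumed here for illustration.

```c
#include <stdlib.h>

/* Hypothetical, simplified thread map: a count plus that many pids. */
struct demo_thread_map {
	int nr;
	int pid[];
};

/* Build the "dummy" map: exactly one entry whose pid is -1. */
static struct demo_thread_map *demo_thread_map_new_dummy(void)
{
	struct demo_thread_map *map = malloc(sizeof(*map) + sizeof(int));

	if (map) {
		map->nr = 1;
		map->pid[0] = -1;
	}
	return map;
}

/* NULL-safe entry count (assumption: a missing map counts as one thread). */
static int demo_thread_map_nr(const struct demo_thread_map *map)
{
	return map ? map->nr : 1;
}

/* Shape of the code before the patch: override the count when system-wide. */
static int demo_nthreads_old(int system_wide, const struct demo_thread_map *map)
{
	return system_wide ? 1 : demo_thread_map_nr(map);
}

/* Shape of the code after the patch: trust the map. */
static int demo_nthreads_new(const struct demo_thread_map *map)
{
	return demo_thread_map_nr(map);
}
```

As long as a system-wide evsel always carries the one-entry dummy map, both helpers return 1, which is the invariant the commit message relies on.
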

Reviewed-by: Adrian Hunter
Signed-off-by: Namhyung Kim
Cc: Ian Rogers
Cc: Ingo Molnar
Cc: Jiri Olsa
Cc: Kan Liang
Cc: Leo Yan
Cc: Peter Zijlstra
Link: https://lore.kernel.org/r/20221003204647.1481128-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/lib/perf/evsel.c      |  3 ---
 tools/perf/builtin-script.c |  3 ---
 tools/perf/util/evsel.c     | 12 ++----------
 tools/perf/util/stat.c      |  3 ---
 4 files changed, 2 insertions(+), 19 deletions(-)

(limited to 'tools/perf/builtin-script.c')

diff --git a/tools/lib/perf/evsel.c b/tools/lib/perf/evsel.c
index 8ce5bbd09666..8b51b008a81f 100644
--- a/tools/lib/perf/evsel.c
+++ b/tools/lib/perf/evsel.c
@@ -515,9 +515,6 @@ int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
 	if (ncpus == 0 || nthreads == 0)
 		return 0;
 
-	if (evsel->system_wide)
-		nthreads = 1;
-
 	evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
 	if (evsel->sample_id == NULL)
 		return -ENOMEM;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 886f53cfa257..7fa467ed91dc 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2243,9 +2243,6 @@ static void __process_stat(struct evsel *counter, u64 tstamp)
 	struct perf_cpu cpu;
 	static int header_printed;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	if (!header_printed) {
 		printf("%3s %8s %15s %15s %15s %15s %s\n",
 		       "CPU", "THREAD", "VAL", "ENA", "RUN", "TIME", "EVENT");
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a27092339b81..76605fde3507 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1813,7 +1813,7 @@ static struct perf_thread_map *empty_thread_map;
 static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		struct perf_thread_map *threads)
 {
-	int nthreads;
+	int nthreads = perf_thread_map__nr(threads);
 
 	if ((perf_missing_features.write_backward && evsel->core.attr.write_backward) ||
 	    (perf_missing_features.aux_output && evsel->core.attr.aux_output))
@@ -1839,11 +1839,6 @@ static int __evsel__prepare_open(struct evsel *evsel, struct perf_cpu_map *cpus,
 		threads = empty_thread_map;
 	}
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
-
 	if (evsel->core.fd == NULL &&
 	    perf_evsel__alloc_fd(&evsel->core, perf_cpu_map__nr(cpus), nthreads) < 0)
 		return -ENOMEM;
@@ -2061,10 +2056,7 @@ static int evsel__open_cpu(struct evsel *evsel, struct perf_cpu_map *cpus,
 	if (threads == NULL)
 		threads = empty_thread_map;
 
-	if (evsel->core.system_wide)
-		nthreads = 1;
-	else
-		nthreads = threads->nr;
+	nthreads = perf_thread_map__nr(threads);
 
 	if (evsel->cgrp)
 		pid = evsel->cgrp->fd;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index ce5e9e372fc4..cef943377ad7 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -420,9 +420,6 @@ static int process_counter_maps(struct perf_stat_config *config,
 	int ncpus = evsel__nr_cpus(counter);
 	int idx, thread;
 
-	if (counter->core.system_wide)
-		nthreads = 1;
-
 	for (thread = 0; thread < nthreads; thread++) {
 		for (idx = 0; idx < ncpus; idx++) {
 			if (process_counter_values(config, counter, idx, thread,
-- 
cgit v1.2.3

From d79310700590b8b40d8c867012d6c899ea6fd505 Mon Sep 17 00:00:00 2001
From: Ravi Bangoria
Date: Thu, 6 Oct 2022 21:09:46 +0530
Subject: perf script: Add missing fields in usage hint

A few fields are missing in the usage message printed when an unknown
field option is passed. Add them to the list.

Signed-off-by: Ravi Bangoria
Acked-by: Jiri Olsa
Cc: Alexander Shishkin
Cc: Ali Saidi
Cc: Ananth Narayan
Cc: Andi Kleen
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: H. Peter Anvin
Cc: Ian Rogers
Cc: Ingo Molnar
Cc: Joe Mario
Cc: Kan Liang
Cc: Kim Phillips
Cc: Leo Yan
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Cc: Sandipan Das
Cc: Santosh Shukla
Cc: Stephane Eranian
Cc: Thomas Gleixner
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-9-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo
---
 tools/perf/builtin-script.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

(limited to 'tools/perf/builtin-script.c')

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7fa467ed91dc..7ca238277d83 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3846,9 +3846,10 @@ int cmd_script(int argc, const char **argv)
 		     "Valid types: hw,sw,trace,raw,synth. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,srcline,period,iregs,uregs,brstack,"
-		     "brstacksym,flags,bpf-output,brstackinsn,brstackinsnlen,brstackoff,"
-		     "callindent,insn,insnlen,synth,phys_addr,metric,misc,ipc,tod,"
-		     "data_page_size,code_page_size,ins_lat",
+		     "brstacksym,flags,data_src,weight,bpf-output,brstackinsn,"
+		     "brstackinsnlen,brstackoff,callindent,insn,insnlen,synth,"
+		     "phys_addr,metric,misc,srccode,ipc,tod,data_page_size,"
+		     "code_page_size,ins_lat",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
-- 
cgit v1.2.3