summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2024-08-11perf arch events: Fix duplicate RISC-V SBI firmware event nameEric Lin5-5/+5
[ Upstream commit 63ba5b0fb4f54db256ec43b3062b2606b383055d ] Currently, the RISC-V firmware JSON file has duplicate event name "FW_SFENCE_VMA_RECEIVED". According to the RISC-V SBI PMU extension[1], the event name should be "FW_SFENCE_VMA_ASID_SENT". Before this patch: $ perf list firmware: fw_access_load [Load access trap event. Unit: cpu] fw_access_store [Store access trap event. Unit: cpu] .... fw_set_timer [Set timer event. Unit: cpu] fw_sfence_vma_asid_received [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu] fw_sfence_vma_received [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu] After this patch: $ perf list firmware: fw_access_load [Load access trap event. Unit: cpu] fw_access_store [Store access trap event. Unit: cpu] ..... fw_set_timer [Set timer event. Unit: cpu] fw_sfence_vma_asid_received [Received SFENCE.VMA with ASID request from other HART event. Unit: cpu] fw_sfence_vma_asid_sent [Sent SFENCE.VMA with ASID request to other HART event. Unit: cpu] fw_sfence_vma_received [Received SFENCE.VMA request from other HART event. Unit: cpu] Link: https://github.com/riscv-non-isa/riscv-sbi-doc/blob/master/src/ext-pmu.adoc#event-firmware-events-type-15 [1] Fixes: 8f0dcb4e7364 ("perf arch events: riscv sbi firmware std event files") Fixes: c4f769d4093d ("perf vendor events riscv: add Sifive U74 JSON file") Fixes: acbf6de674ef ("perf vendor events riscv: Add StarFive Dubhe-80 JSON file") Fixes: 7340c6df49df ("perf vendor events riscv: add T-HEAD C9xx JSON file") Fixes: f5102e31c209 ("riscv: andes: Support specifying symbolic firmware and hardware raw event") Signed-off-by: Eric Lin <eric.lin@sifive.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Nikita Shubin <n.shubin@yadro.com> Reviewed-by: Inochi Amaoto <inochiama@outlook.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20240719115018.27356-1-eric.lin@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-11perf tool: fix dereferencing NULL al->mapsCasey Chen1-1/+1
[ Upstream commit 4c17736689ccfc44ec7dcc472577f25c34cf8724 ] With 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"), when cpumode is 3 (macro PERF_RECORD_MISC_HYPERVISOR), thread__find_map() could return with al->maps being NULL. The path below could add a callchain_cursor_node with NULL ms.maps. add_callchain_ip() thread__find_symbol(.., &al) thread__find_map(.., &al) // al->maps becomes NULL ms.maps = maps__get(al.maps) callchain_cursor_append(..., &ms, ...) node->ms.maps = maps__get(ms->maps) Then the path below would dereference NULL maps and get segfault. fill_callchain_info() maps__machine(node->ms.maps); Fix it by checking if maps is NULL in fill_callchain_info(). Fixes: 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions") Signed-off-by: Casey Chen <cachen@purestorage.com> Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: yzhong@purestorage.com Link: https://lore.kernel.org/r/20240722211548.61455-1-cachen@purestorage.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf dso: Fix build when libunwind is enabledJames Clark3-2/+7
[ Upstream commit 92717bc077892d1ce60fee07aee3a33f33909b85 ] Now that symsrc_filename is always accessed through an accessor, we also need a free() function for it to avoid the following compilation error: util/unwind-libunwind-local.c:416:12: error: lvalue required as unary ‘&’ operand 416 | zfree(&dso__symsrc_filename(dso)); Fixes: 1553419c3c10 ("perf dso: Fix address sanitizer build") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@arm.com> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20240715094715.3914813-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf stat: Fix the hard-coded metrics calculation on the hybridKan Liang1-0/+7
commit 3612ca8e2935c4c142d99e33b8effa7045ce32b5 upstream. The hard-coded metrics is wrongly calculated on the hybrid machine. $ perf stat -e cycles,instructions -a sleep 1 Performance counter stats for 'system wide': 18,205,487 cpu_atom/cycles/ 9,733,603 cpu_core/cycles/ 9,423,111 cpu_atom/instructions/ # 0.52 insn per cycle 4,268,965 cpu_core/instructions/ # 0.23 insn per cycle The insn per cycle for cpu_core should be 4,268,965 / 9,733,603 = 0.44. When finding the metric events, the find_stat() doesn't take the PMU type into account. The cpu_atom/cycles/ is wrongly used to calculate the IPC of the cpu_core. In the hard-coded metrics, the events from a different PMU are only SW_CPU_CLOCK and SW_TASK_CLOCK. They both have the stat type, STAT_NSECS. Except the SW CLOCK events, check the PMU type as well. Fixes: 0a57b910807a ("perf stat: Use counts rather than saved_value") Reported-by: Khalil, Amiri <amiri.khalil@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240606180316.4122904-1-kan.liang@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-03perf dso: Fix address sanitizer buildIan Rogers9-44/+56
[ Upstream commit 1553419c3c10cf386496e68b90b5d0ce966ac614 ] Various files had been missed from having accessor functions added for the sake of dso reference count checking. Add the function calls and missing dso accessor functions. Fixes: ee756ef7491e ("perf dso: Add reference count checking and accessor functions") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Yunseong Kim <yskelg@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf intel-pt: Fix exclude_guest settingAdrian Hunter1-0/+12
[ Upstream commit b40934ae32232140e85dc7dc1c3ea0e296986723 ] In the past, the exclude_guest setting has had no effect on Intel PT tracing, but that may not be the case in the future. Set the flag correctly based upon whether KVM is using Intel PT "Host/Guest" mode, which is determined by the kvm_intel module parameter pt_mode: pt_mode=0 System-wide mode : host and guest output to host buffer pt_mode=1 Host/Guest mode : host/guest output to host/guest buffers respectively Fixes: 6e86bfdc4a60 ("perf intel-pt: Support decoding of guest kernel") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240625104532.11990-3-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf intel-pt: Fix aux_watermark calculation for 64-bit sizeAdrian Hunter1-1/+2
[ Upstream commit 36b4cd990a8fd3f5b748883050e9d8c69fe6398d ] aux_watermark is a u32. For a 64-bit size, cap the aux_watermark calculation at UINT_MAX instead of truncating it to 32-bits. Fixes: 874fc35cdd55 ("perf intel-pt: Use aux_watermark") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240625104532.11990-2-adrian.hunter@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf stat: Fix a segfault with --per-cluster --metric-onlyNamhyung Kim1-0/+3
[ Upstream commit caa463bb79a82ac5fce05a0dcf7893d94c84fc5e ] The new --per-cluster option was added recently but it forgot to update the aggr_header fields which are used for --metric-only option. And it resulted in a segfault due to NULL string in fputs(). Fixes: cbc917a1b03b ("perf stat: Support per-cluster aggregation") Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Tested-by: Yicong Yang <yangyicong@hisilicon.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20240628000604.1296808-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf report: Fix condition in sort__sym_cmp()Namhyung Kim1-1/+1
[ Upstream commit cb39d05e67dc24985ff9f5150e71040fa4d60ab8 ] It's expected that both hist entries are in the same hists when comparing two. But the current code in the function checks one without dso sort key and other with the key. This would make the condition true in any case. I guess the intention of the original commit was to add '!' for the right side too. But as it should be the same, let's just remove it. Fixes: 69849fc5d2119 ("perf hists: Move sort__has_dso into struct perf_hpp_list") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240621170528.608772-2-namhyung@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf pmus: Fixes always false when compare duplicates aliasesJunhao He1-2/+3
[ Upstream commit dd9a426eade634bf794c7e0f1b0c6659f556942f ] In the previous loop, all the members in the aliases[j-1] have been freed and set to NULL. But in this loop, the function pmu_alias_is_duplicate() compares the aliases[j] with the aliases[j-1] that has already been disposed, so the function will always return false and duplicate aliases will never be discarded. If we find duplicate aliases, it skips the zfree aliases[j], which is accompanied by a memory leak. We can use the next aliases[j+1] to theck for duplicate aliases to fixes the aliases NULL pointer dereference, then goto zfree code snippet to release it. After patch testing: $ perf list --unit=hisi_sicl,cpa pmu uncore cpa: cpa_p0_rd_dat_32b [Number of read ops transmitted by the P0 port which size is 32 bytes. Unit: hisi_sicl,cpa] cpa_p0_rd_dat_64b [Number of read ops transmitted by the P0 port which size is 64 bytes. Unit: hisi_sicl,cpa] Fixes: c3245d2093c1 ("perf pmu: Abstract alias/event struct") Signed-off-by: Junhao He <hejunhao3@huawei.com> Cc: ravi.bangoria@amd.com Cc: james.clark@arm.com Cc: prime.zeng@hisilicon.com Cc: cuigaosheng1@huawei.com Cc: jonathan.cameron@huawei.com Cc: linuxarm@huawei.com Cc: yangyicong@huawei.com Cc: robh@kernel.org Cc: renyu.zj@linux.alibaba.com Cc: kjain@linux.ibm.com Cc: john.g.garry@oracle.com Cc: linux-arm-kernel@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240614094318.11607-1-hejunhao3@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03tools/perf: Fix the string match for "/tmp/perf-$PID.map" files in dso__loadAthira Rajeev3-1/+18
[ Upstream commit b0979f008f1352a44cd3c8877e3eb8a1e3e1c6f3 ] Perf test for perf probe of function from different CU fails as below: ./perf test -vv "test perf probe of function from different CU" 116: test perf probe of function from different CU: --- start --- test child forked, pid 2679 Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.Msa7iy89bx/testfile Error: Failed to add events. --- Cleaning up --- "foo" does not hit any event. Error: Failed to delete events. ---- end(-1) ---- 116: test perf probe of function from different CU : FAILED! The test does below to probe function "foo" : # gcc -g -Og -flto -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.c -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o # gcc -g -Og -c /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.c -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o # gcc -g -Og -o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-foo.o /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile-main.o # ./perf probe -x /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile foo Failed to find symbol foo in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7/testfile Error: Failed to add events. Perf probe fails to find symbol foo in the executable placed in /tmp/perf-uprobe-different-cu-sh.XniNxNEVT7 Simple reproduce: # mktemp -d /tmp/perf-checkXXXXXXXXXX /tmp/perf-checkcWpuLRQI8j # gcc -g -o test test.c # cp test /tmp/perf-checkcWpuLRQI8j/ # nm /tmp/perf-checkcWpuLRQI8j/test | grep foo 00000000100006bc T foo # ./perf probe -x /tmp/perf-checkcWpuLRQI8j/test foo Failed to find symbol foo in /tmp/perf-checkcWpuLRQI8j/test Error: Failed to add events. But it works with any files like /tmp/perf/test. Only for patterns with "/tmp/perf-", this fails. Further debugging, commit 80d496be89ed ("perf report: Add support for profiling JIT generated code") added support for profiling JIT generated code. This patch handles dso's of form "/tmp/perf-$PID.map" . The check used "if (strncmp(self->name, "/tmp/perf-", 10) == 0)" to match "/tmp/perf-$PID.map". With this commit, any dso in /tmp/perf- folder will be considered separately for processing (not only JIT created map files ). Fix this by changing the string pattern to check for "/tmp/perf-%d.map". Add a helper function is_perf_pid_map_name to do this check. In "struct dso", dso->long_name holds the long name of the dso file. Since the /tmp/perf-$PID.map check uses the complete name, use dso___long_name for the string name. With the fix, # ./perf test "test perf probe of function from different CU" 117: test perf probe of function from different CU : Ok Fixes: 56cbeacf1435 ("perf probe: Add test for regression introduced by switch to die_get_decl_file()") Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: akanksha@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240623064850.83720-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf test: Make test_arm_callgraph_fp.sh more robustJames Clark2-21/+26
[ Upstream commit ff16aeb9b83441b8458d4235496cf320189a0c60 ] The 2 second sleep can cause the test to fail on very slow network file systems because Perf ends up being killed before it finishes starting up. Fix it by making the leafloop workload end after a fixed time like the other workloads so there is no need to kill it after 2 seconds. Also remove the 1 second start sampling delay because it is similarly fragile. Instead, search through all samples for a matching one, rather than just checking the first sample and hoping it's in the right place. Fixes: cd6382d82752 ("perf test arm64: Test unwinding using fame-pointer (fp) mode") Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: German Gomez <german.gomez@arm.com> Cc: Spoorthy S <spoorts2@in.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240612140316.3006660-1-james.clark@arm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03perf maps: Fix use after free in __maps__fixup_overlap_and_insertIan Rogers1-4/+5
[ Upstream commit 0b90dfda222e38b7ca8dad6e098e36f5186f0b94 ] In the case 'before' and 'after' are broken out from pos, maps_by_address may be changed by __maps__insert, as such it needs re-reading. Don't ignore the return value from __maps_insert. Fixes: 659ad3492b91 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Steinar H . Gunderson <sesse@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240521165109.708593-2-irogers@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-08perf dsos: When adding a dso into sorted dsos maintain the sort orderIan Rogers1-5/+21
dsos__add would add at the end of the dso array possibly requiring a later find to re-sort the array. Patterns of find then add were becoming O(n*log n) due to the sorts. Change the add routine to be O(n) rather than O(1) but to maintain the sorted-ness of the dsos array so that later finds don't need the O(n*log n) sort. Fixes: 3f4ac23a9908 ("perf dsos: Switch backing storage to array from rbtree/list") Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Steinar Gunderson <sesse@google.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Matt Fleming <matt@readmodwrite.com> Link: https://lore.kernel.org/r/20240703172117.810918-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-08perf comm str: Avoid sort during insertIan Rogers1-11/+18
The array is sorted, so just move the elements and insert in order. Fixes: 13ca628716c6 ("perf comm: Add reference count checking to 'struct comm_str'") Reported-by: Matt Fleming <matt@readmodwrite.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Matt Fleming <matt@readmodwrite.com> Cc: Steinar Gunderson <sesse@google.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20240703172117.810918-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-06-05perf bpf: Fix handling of minimal vmlinux.h file when interrupting the buildNamhyung Kim1-0/+1
Ingo reported that he was seeing these when hitting Control+C during a perf tools build: Makefile.perf:1149: *** Missing bpftool input for generating vmlinux.h. Stop. The failure happens when you don't have vmlinux.h or vmlinux with BTF. ifeq ($(VMLINUX_H),) ifeq ($(VMLINUX_BTF),) $(error Missing bpftool input for generating vmlinux.h) endif endif VMLINUX_BTF can be empty if you didn't build a kernel or it doesn't have a BTF section and the current kernel also has no BTF. This is totally ok. But VMLINUX_H should be set to the minimal version in the source tree (unless you overwrite it manually) when you don't pass GEN_VMLINUX_H=1 (which requires VMLINUX_BTF should not be empty). The problem is that it's defined in Makefile.config which is not included for `make clean`. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/CAM9d7ch5HTr+k+_GpbMrX0HUo5BZ11byh1xq0Two7B7RQACuNw@mail.gmail.com Link: http://lore.kernel.org/lkml/ZjssGrj+abyC6mYP@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-05Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"Arnaldo Carvalho de Melo1-4/+2
This reverts commit 7d1405c71df21f6c394b8a885aa8a133f749fa22. This causes segfaults in some cases, as reported by Milian: ``` sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e raw_syscalls:sys_enter ls ... [ perf record: Woken up 3 times to write data ] malloc(): invalid next size (unsorted) Aborted ``` Backtrace with GDB + debuginfod: ``` malloc(): invalid next size (unsorted) Thread 1 "perf" received signal SIGABRT, Aborted. __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78 #2 0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/ raise.c:26 #3 0x00007ffff6e384c3 in __GI_abort () at abort.c:79 #4 0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea "%s\n") at ../sysdeps/posix/libc_fatal.c:132 #5 0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850 "malloc(): invalid next size (unsorted)") at malloc.c:5772 #6 0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0 <main_arena>, bytes=bytes@entry=368) at malloc.c:4081 #7 0x00007ffff6eb877e in __libc_calloc (n=<optimized out>, elem_size=<optimized out>) at malloc.c:3754 #8 0x000055555569bdb6 in perf_session.do_write_header () #9 0x00005555555a373a in __cmd_record.constprop.0 () #10 0x00005555555a6846 in cmd_record () #11 0x000055555564db7f in run_builtin () #12 0x000055555558ed77 in main () ``` Valgrind memcheck: ``` ==45136== Invalid write of size 8 ==45136== at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf) ==45136== by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ==45136== Syscall param write(buf) points to unaddressable byte(s) ==45136== at 0x575953D: __libc_write (write.c:26) ==45136== by 0x575953D: write (write.c:24) ==45136== by 0x35761F: ion (in /usr/bin/perf) ==45136== by 0x357778: writen (in /usr/bin/perf) ==45136== by 0x1548F7: record__write (in /usr/bin/perf) ==45136== by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd ==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675) ==45136== by 0x3574AB: zalloc (in /usr/bin/perf) ==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf) ==45136== by 0x15A845: cmd_record (in /usr/bin/perf) ==45136== by 0x201B7E: run_builtin (in /usr/bin/perf) ==45136== by 0x142D76: main (in /usr/bin/perf) ==45136== ----- Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/ Reported-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Milian Wolff <milian.wolff@kdab.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@kernel.org # 6.8+ Link: https://lore.kernel.org/lkml/Zl9ksOlHJHnKM70p@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28tools headers: Update the syscall tables and unistd.h, mostly to support the ↵Arnaldo Carvalho de Melo4-1/+5
new 'mseal' syscall But also to wire up shadow stacks on 32-bit x86, picking up those changes from these csets: ff388fe5c481d39c ("mseal: wire up mseal syscall") 2883f01ec37dd866 ("x86/shstk: Enable shadow stacks for x32") This makes 'perf trace' support it, now its possible, for instance to do: # perf trace -e mseal --max-stack=16 Here is an example with the 'sendmmsg' syscall: root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1 0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1 syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode_prepare ([kernel.kallsyms]) syscall_exit_to_user_mode ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) [0x117ce7] (/usr/lib64/libc.so.6 (deleted)) root@x1:~# To do a system wide tracing of the new 'mseal' syscall with a backtrace of at most 16 entries. This addresses these perf tools build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H J Lu <hjl.tools@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with ↵Arnaldo Carvalho de Melo1-1/+7
the kernel sources to pick POSTED_MSI_NOTIFICATION To pick up the change in: f5a3562ec9dd29e6 ("x86/irq: Reserve a per CPU IDT vector for posted MSIs") That picks up this new vector: $ cp arch/x86/include/asm/irq_vectors.h tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h $ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh > after $ diff -u before after --- before 2024-05-27 12:50:47.708863932 -0300 +++ after 2024-05-27 12:51:15.335113123 -0300 @@ -1,6 +1,7 @@ static const char *x86_irq_vectors[] = { [0x02] = "NMI", [0x80] = "IA32_SYSCALL", + [0xeb] = "POSTED_MSI_NOTIFICATION", [0xec] = "LOCAL_TIMER", [0xed] = "HYPERV_STIMER0", [0xee] = "HYPERV_REENLIGHTENMENT", $ Now those will be known when pretty printing the irq_vectors:* tracepoints. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/ZlS34M0x30EFVhbg@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27perf beauty: Update copy of linux/socket.h with the kernel sourcesArnaldo Carvalho de Melo1-1/+2
To pick up the fixes in: 0645fbe760afcc53 ("net: have do_accept() take a struct proto_accept_arg argument") That just changes a function prototype, not touching things used by the perf scrape scripts such as: $ tools/perf/trace/beauty/sockaddr.sh | head -5 static const char *socket_families[] = { [0] = "UNSPEC", [1] = "LOCAL", [2] = "INET", [3] = "AX25", $ This addresses this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlSrceExgjrUiDb5@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERYArnaldo Carvalho de Melo2-7/+9
There is no scrape script yet for those, but the warning pointed out we need to update the array with the F_LINUX_SPECIFIC_BASE entries, do it. Now 'perf trace' can decode that cmd and also use it in filter, as in: root@number:~# perf trace -e syscalls:*enter_fcntl --filter 'cmd != SETFL && cmd != GETFL' 0.000 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7fffdc6a8a50) 0.013 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a8aa0) 0.090 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a88e0) ^Croot@number:~# This picks up the changes in: c62b758bae6af16f ("fcntl: add F_DUPFD_QUERY fcntl()") Addressing this perf tools build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZlSqNQH9mFw2bmjq@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27tools headers UAPI: Sync linux/prctl.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+22
To pick the changes in: 628d701f2de5b9a1 ("powerpc/dexcr: Add DEXCR prctl interface") 6b9391b581fddd85 ("riscv: Include riscv_set_icache_flush_ctx prctl") That adds some PowerPC and a RISC-V specific prctl options: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/perf/trace/beauty/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after --- before 2024-05-27 12:14:21.358032781 -0300 +++ after 2024-05-27 12:14:32.364530185 -0300 @@ -65,6 +65,9 @@ [68] = "GET_MEMORY_MERGE", [69] = "RISCV_V_SET_CONTROL", [70] = "RISCV_V_GET_CONTROL", + [71] = "RISCV_SET_ICACHE_FLUSH_CTX", + [72] = "PPC_GET_DEXCR", + [73] = "PPC_SET_DEXCR", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", $ That now will be used to decode the syscall option and also to compose filters, for instance: [root@five ~]# perf trace -e syscalls:sys_enter_prctl --filter option==SET_NAME 0.000 Isolated Servi/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23f13b7aee) 0.032 DOM Worker/3474327 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23deb25670) 7.920 :3474328/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fbb10) 7.935 StreamT~s #374/3474328 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24fb970) 8.400 Isolated Servi/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24bab10) 8.418 StreamT~s #374/3474329 syscalls:sys_enter_prctl(option: SET_NAME, arg2: 0x7f23e24ba970) ^C[root@five ~]# This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/lkml/ZlSklGWp--v_Ije7@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27tools include UAPI: Sync linux/stat.h with the kernel sourcesArnaldo Carvalho de Melo1-1/+3
To get the changes in: 2a82bb02941fb53d ("statx: stx_subvol") To pick up this change and support it: $ tools/perf/trace/beauty/statx_mask.sh > before $ cp include/uapi/linux/stat.h tools/perf/trace/beauty/include/uapi/linux/stat.h $ tools/perf/trace/beauty/statx_mask.sh > after $ diff -u before after --- before 2024-05-22 13:39:49.742470571 -0300 +++ after 2024-05-22 13:39:59.157883101 -0300 @@ -14,4 +14,5 @@ [ilog2(0x00001000) + 1] = "MNT_ID", [ilog2(0x00002000) + 1] = "DIOALIGN", [ilog2(0x00004000) + 1] = "MNT_ID_UNIQUE", + [ilog2(0x00008000) + 1] = "SUBVOL", }; $ Now we'll see it like we see these: # perf trace -e statx 0.000 ( 0.015 ms): systemd-userwo/3982299 statx(dfd: 6, filename: ".", mask: TYPE|INO|MNT_ID, buffer: 0x7ffd8945e850) = 0 <SNIP> 180.559 ( 0.007 ms): (ostnamed)/3982957 statx(dfd: 4, filename: "sys", flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT|STATX_DONT_SYNC, mask: TYPE, buffer: 0x7fff13161190) = 0 180.918 ( 0.011 ms): (ostnamed)/3982957 statx(dfd: CWD, filename: "/run/systemd/mount-rootfs/sys/kernel/security", flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT|STATX_DONT_SYNC, mask: MNT_ID, buffer: 0x7fff13161120) = 0 180.956 ( 0.010 ms): (ostnamed)/3982957 statx(dfd: CWD, filename: "/run/systemd/mount-rootfs/sys/fs/cgroup", flags: SYMLINK_NOFOLLOW|NO_AUTOMOUNT|STATX_DONT_SYNC, mask: MNT_ID, buffer: 0x7fff13161120) = 0 <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Zk5nO9yT0oPezUoo@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-26Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"Arnaldo Carvalho de Melo4-103/+68
This reverts commit 617824a7f0f73e4de325cf8add58e55b28c12493. This made a simple 'perf record -e cycles:pp make -j199' stop working on the Ampere ARM64 system Linus uses to test ARM64 kernels, as discussed at length in the threads in the Link tags below. The fix provided by Ian wasn't acceptable and work to fix this will take time we don't have at this point, so lets revert this and work on it on the next devel cycle. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Bhaskar Chowdhury <unixbhaskar@gmail.com> Cc: Ethan Adams <j.ethan.adams@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tycho Andersen <tycho@tycho.pizza> Cc: Yang Jihong <yangjihong@bytedance.com> Link: https://lore.kernel.org/lkml/CAHk-=wi5Ri=yR2jBVk-4HzTzpoAWOgstr1LEvg_-OXtJvXXJOA@mail.gmail.com Link: https://lore.kernel.org/lkml/CAHk-=wiWvtFyedDNpoV7a8Fq_FpbB+F5KmWK2xPY3QoYseOf_A@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-22Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of ↵Linus Torvalds288-5882/+18043
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: "General: - Integrate the shellcheck utility with the build of perf to allow catching shell problems early in areas such as 'perf test', 'perf trace' scrape scripts, etc - Add 'uretprobe' variant in the 'perf bench uprobe' tool - Add script to run instances of 'perf script' in parallel - Allow parsing tracepoint names that start with digits, such as 9p/9p_client_req, etc. Make sure 'perf test' tests it even on systems where those tracepoints aren't available - Add Kan Liang to MAINTAINERS as a perf tools reviewer - Add support for using the 'capstone' disassembler library in various tools, such as 'perf script' and 'perf annotate'. This is an alternative for the use of the 'xed' and 'objdump' disassemblers Data-type profiling improvements: - Resolve types for a->b->c by backtracking the assignments until it finds DWARF info for one of those members - Support for global variables, keeping a cache to speed up lookups - Handle the 'call' instruction, dealing with effects on registers and handling its return when tracking register data types - Handle x86's segment based addressing like %gs:0x28, to support things like per CPU variables, the stack canary, etc - Data-type profiling got big speedups when using capstone for disassembling. The objdump outoput parsing method is left as a fallback when capstone fails or isn't available. There are patches posted for 6.11 that to use a LLVM disassembler - Support event group display in the TUI when annotating types with --data-type, for instance to show memory load and store events for the data type fields - Optimize the 'perf annotate' data structures, reducing memory usage - Add a initial 'perf test' for 'perf annotate', checking that a target symbol appears on the output, specifying objdump via the command line, etc Vendor Events: - Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand Ridge, Ice Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra Forest, Sky Lake X, Sky Lake and Snow Ridge X. Remove info metrics erroneously in TopdownL1 - Add AMD's Zen 5 core and uncore events and metrics. Those come from the "Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh Processors" document, with events that capture information on op dispatch, execution and retirement, branch prediction, L1 and L2 cache activity, TLB activity, etc - Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/ AmpereOneX Miscellaneous: - Sync header copies with the kernel sources - Move some header copies used only for generating translation string tables for ioctl cmds and other syscall integer arguments to a new directory under tools/perf/beauty/, to separate from copies in tools/include/ that are used to build the tools - Introduce scrape script for several syscall 'flags'/'mask' arguments - Improve cpumap utilization, fixing up pairing of refcounts, using the right iterators (perf_cpu_map__for_each_cpu), etc - Give more details about raw event encodings in 'perf list', show tracepoint encoding in the detailed output - Refactor the DSOs handling code, reducing memory usage - Document the BPF event modifier and add a 'perf test' for it - Improve the event parser, better error messages and add further 'perf test's for it - Add reference count checking to 'struct comm_str' and 'struct mem_info' - Make ARM64's 'perf test' entries for the Neoverse N1 more robust - Tweak the ARM64's Coresight 'perf test's - Improve ARM64's CoreSight ETM version detection and error reporting - Fix handling of symbols when using kcore - Fix PAI (Processor Activity Instrumentation) counter names for s390 virtual machines in 'perf report' - Fix -g/--call-graph option failure in 'perf sched timehist' - Add LIBTRACEEVENT_DIR build option to allow building with libtraceevent installed in non-standard directories, such as when doing cross builds - Various 'perf test' and 'perf bench' fixes - Improve 'perf probe' error message for long C++ probe names" * tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (260 commits) tools lib subcmd: Show parent options in help perf pmu: Count sys and cpuid JSON events separately perf stat: Don't display metric header for non-leader uncore events perf annotate-data: Ensure the number of type histograms perf annotate: Fix segfault on sample histogram perf daemon: Fix file leak in daemon_session__control libsubcmd: Fix parse-options memory leak perf lock: Avoid memory leaks from strdup() perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency perf tools: Ignore deleted cgroups perf parse: Allow tracepoint names to start with digits perf parse-events: Add new 'fake_tp' parameter for tests perf parse-events: pass parse_state to add_tracepoint perf symbols: Fix ownership of string in dso__load_vmlinux() perf symbols: Update kcore map before merging in remaining symbols perf maps: Re-use __maps__free_maps_by_name() perf symbols: Remove map from list before updating addresses perf tracepoint: Don't scan all tracepoints to test if one exists perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT perf thread: Fixes to thread__new() related to initializing comm ...
2024-05-11perf pmu: Count sys and cpuid JSON events separatelyIan Rogers2-23/+53
Sys events are eagerly loaded as each event has a compat option that may mean the event is or isn't associated with the PMU. These shouldn't be counted as loaded_json_events as that is used for JSON events matching the CPUID that may or may not have been loaded. The mismatch causes issues on ARM64 that uses sys events. Fixes: e6ff1eed3584362d ("perf pmu: Lazily add JSON events") Closes: https://lore.kernel.org/lkml/20240510024729.1075732-1-justin.he@arm.com/ Reported-by: Jia He <justin.he@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240511003601.2666907-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-11perf stat: Don't display metric header for non-leader uncore eventsIan Rogers1-0/+3
On an Intel tigerlake laptop a metric like: { "BriefDescription": "Test", "MetricExpr": "imc_free_running@data_read@ + imc_free_running@data_write@", "MetricGroup": "Test", "MetricName": "Test", "ScaleUnit": "6.103515625e-5MiB" }, Will have 4 events: uncore_imc_free_running_0/data_read/ uncore_imc_free_running_0/data_write/ uncore_imc_free_running_1/data_read/ uncore_imc_free_running_1/data_write/ If aggregration is disabled with metric-only 2 column headers are needed: $ perf stat -M test --metric-only -A -a sleep 1 Performance counter stats for 'system wide': MiB Test MiB Test CPU0 1821.0 1820.5 But when not, the counts aggregated in the metric leader and only 1 column should be shown: $ perf stat -M test --metric-only -a sleep 1 Performance counter stats for 'system wide': MiB Test 5909.4 1.001258915 seconds time elapsed Achieve this by skipping events that aren't metric leaders when printing column headers and aggregation isn't disabled. The bug is long standing, the fixes tag is set to a refactor as that is as far back as is reasonable to backport. Fixes: 088519f318be3a41 ("perf stat: Move the display functions to stat-display.c") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaige Ye <ye@kaige.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240510051309.2452468-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-11perf annotate-data: Ensure the number of type histogramsNamhyung Kim1-1/+4
Arnaldo reported that there is a case where nr_histograms and histograms don't agree each other. It ended up in a segfault trying to access a NULL histograms array. Let's make sure to update the nr_histograms when the histograms array is changed. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510210452.2449944-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-11perf annotate: Fix segfault on sample histogramNamhyung Kim1-4/+5
A symbol can have no samples, then accessing the annotated_source->samples hashmap will result in a segfault. Fixes: a3f7768bcf48281d ("perf annotate: Fix memory leak in annotated_source") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510210452.2449944-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf daemon: Fix file leak in daemon_session__controlSamasth Norway Ananda1-2/+2
The open() function returns -1 on error. The 'control' and 'ack' file descriptors are both initialized with open() and further validated with 'if' statement. 'if (!control)' would evaluate to 'true' if returned value on error were '0' but it is actually '-1'. Fixes: edcaa47958c7438b ("perf daemon: Add 'ping' command") Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510003424.2016914-1-samasth.norway.ananda@oracle.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf lock: Avoid memory leaks from strdup()Ian Rogers1-14/+4
Leak sanitizer complains about the strdup-ed arguments not being freed and given cmd_record doesn't modify the given strings, remove the strdups. Original discussion in this patch: https://lore.kernel.org/lkml/20240430184156.1824083-1-irogers@google.com/ Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240509053123.1918093-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf sched: Rename 'switches' column header to 'count' and add usage ↵Madadi Vineeth Reddy2-1/+37
description, options for latency Rename 'Switches' to 'Count' and document metrics shown for perf sched latency output. Also add options possible with perf sched latency. Initially, after seeing the output of 'perf sched latency', the term 'Switches' seemed like it's the number of context switches-in for a particular task, but upon going through the code, it was observed that it's actually keeping track of number of times a delay was calculated so that it is used in calculation of the average delay. Actually, the switches here is a subset of number of context switches-in because there are some cases where the count is not incremented in switch-in handler 'add_sched_in_event'. For example when a task is switched-in while it's state is not ready to run(!= THREAD_WAIT_CPU). commit d9340c1db3f52460 ("perf sched: Display time in milliseconds, reorganize output") changed it from the original count to switches. So, renamed switches to count to make things a bit more clearer and added the metrics description of latency in the document. Reviewed-by: Aditya Gupta <adityag@linux.ibm.com> Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240328090005.8321-1-vineethr@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf tools: Ignore deleted cgroupsNamhyung Kim2-4/+5
On large systems, cgroups can be created and deleted often. That means there's a race between perf tools and cgroups when it gets the cgroup name and opens the cgroup. I got a report that 'perf stat' with many cgroups failed quite often due to the missing cgroups on such a large machine. I think we can ignore such cgroups when expanding events and use id 0 if it fails to read the cgroup id. IIUC 0 is not a vaild cgroup id so it won't update event counts for the failed cgroups. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240509182235.2319599-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf parse: Allow tracepoint names to start with digitsDominique Martinet2-2/+9
Tracepoints can start with digits, although we don't have many of these: $ rg -g '*.h' '\bTRACE_EVENT\([0-9]' net/mac802154/trace.h 53:TRACE_EVENT(802154_drv_return_int, ... net/ieee802154/trace.h 66:TRACE_EVENT(802154_rdev_add_virtual_intf, ... include/trace/events/9p.h 124:TRACE_EVENT(9p_client_req, ... Just allow names to start with digits too so e.g. "perf trace -e '9p:*'" works Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510-perf_digit-v4-3-db1553f3233b@codewreck.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf parse-events: Add new 'fake_tp' parameter for testsDominique Martinet8-19/+34
The next commit will allow tracepoints starting with digits, but most systems do not have any available by default so tests should skip the actual "check if it exists in /sys/kernel/debug/tracing" step. In order to do that, add a new boolean flag specifying if we should actually "format" the probe or not. Originally-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510-perf_digit-v4-2-db1553f3233b@codewreck.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf parse-events: pass parse_state to add_tracepointDominique Martinet3-15/+21
The next patch will add another flag to parse_state that we will want to pass to evsel__newtp_idx(), so pass the whole parse_state all the way down instead of giving only the index Originally-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510-perf_digit-v4-1-db1553f3233b@codewreck.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf symbols: Fix ownership of string in dso__load_vmlinux()James Clark1-3/+8
The linked commit updated dso__load_vmlinux() to call dso__set_long_name() before loading the symbols. Loading the symbols may not succeed but dso__set_long_name() takes ownership of the string. The two callers of this function free the string themselves on failure cases, resulting in the following error: $ perf record -- ls $ perf report free(): double free detected in tcache 2 Fix it by always taking ownership of the string, even on failure. This means the string is either freed at the very first early exit condition, or later when the dso is deleted or the long name is replaced. Now no special return value is needed to signify that the caller needs to free the string. Fixes: e59fea47f83e8a9a ("perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-5-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf symbols: Update kcore map before merging in remaining symbolsJames Clark1-19/+21
When loading kcore, the main vmlinux map is updated in the same loop that merges the remaining maps. If a map that overlaps is merged in before kcore, the list can become unsortable when the main map addresses are updated. This will later trigger the check_invariants() assert: $ perf record $ perf report util/maps.c:96: check_invariants: Assertion `map__end(prev) <= map__start(map) || map__start(prev) == map__start(map)' failed. Aborted Fix it by moving the main map update prior to the loop so that maps__merge_in() can split it if necessary. Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf maps: Re-use __maps__free_maps_by_name()James Clark1-7/+7
maps__merge_in() hard codes the steps to free the maps_by_name list. It seems to not map__put() each element before freeing, and it sets maps_by_name_sorted to true after freeing, which may be harmless but is inconsistent with maps__init() and other functions. maps__maps_by_name_addr() is also quite hard to read because we already have maps__maps_by_name() and maps__maps_by_address(), but the function is only used in that place so delete it. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf symbols: Remove map from list before updating addressesJames Clark1-4/+6
Make the order of operations remove, update, add. Updating addresses before the map is removed causes the ordering check to fail when the map is removed. This can be reproduced when running Perf on an Arm system with a static kernel and Perf uses kcore rather than other sources: $ perf record -- ls $ perf report util/maps.c:96: check_invariants: Assertion `map__end(prev) <= map__start(map) || map__start(prev) == map__start(map)' failed Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507141210.195939-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf tracepoint: Don't scan all tracepoints to test if one existsIan Rogers2-35/+24
In is_valid_tracepoint, rather than scanning "/sys/kernel/tracing/events/*/*" skipping any path where "/sys/kernel/tracing/events/*/*/id" doesn't exist, and then testing if "*:*" matches the tracepoint name, just use the given tracepoint name replace the ':' with '/' and see if the id file exists. This turns a nested directory search into a single file available test. Rather than return 1 for valid and 0 for invalid, return true and false. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240509153245.1990426-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORTJames Clark1-28/+28
check_allowed_ops() is used from both HAVE_DWARF_GETLOCATIONS_SUPPORT and HAVE_DWARF_CFI_SUPPORT sections, so move it into the right place so that it's available when either are defined. This shows up when doing a static cross compile for arm64: $ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- LDFLAGS="-static" \ EXTRA_PERFLIBS="-lexpat" util/dwarf-aux.c:1723:6: error: implicit declaration of function 'check_allowed_ops' Fixes: 55442cc2f22d0727 ("perf dwarf-aux: Check allowed DWARF Ops") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508141458.439017-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf thread: Fixes to thread__new() related to initializing commIan Rogers1-9/+5
Freeing the thread on failure won't work with reference count checking, use thread__delete(). Don't allocate the comm_str, use a stack allocation instead. Fixes: f6005cafebab72f8 ("perf thread: Add reference count checking") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf report: Avoid SEGV in report__setup_sample_type()Ian Rogers1-1/+1
In some cases evsel->name is lazily initialized in evsel__name(). If not initialized passing NULL to strstr() leads to a SEGV. Fixes: ccb17caecfbd542f ("perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf comm: Fix comm_str__put() for reference count checkingIan Rogers1-14/+31
Searching for the entry in the array needs to avoid the intermediate pointer with reference count checking. Refactor the array removal to binary search for the entry. Change the array to hold an entry with a reference count (so the intermediate pointer can work) and remove from the array when the reference count on a comm_str falls to 1. Fixes: 13ca628716c6f2c3 ("perf comm: Add reference count checking to 'struct comm_str'") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-10perf ui browser: Avoid SEGV on titleIan Rogers1-1/+1
If the title is NULL then it can lead to a SEGV. Fixes: 769e6a1e15bdbbaf ("perf ui browser: Don't save pointer to stack memory") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240508035301.1554434-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-08perf dwarf-aux: Print array type name with "[]"Namhyung Kim1-1/+3
It's confusing both pointers and arrays are printed as *. Let's print array types with [] so that we can identify them easily. Although it's interchangable, sometimes it can cause confusion with size like in the below example. Note that it is not the same with C syntax where it goes to the variable names, but we want to have it in the type names (like in Go language). Before: mov [20] 0x68(reg5) -> reg0 type='struct page**' size=0x80 (die:0x4e61d32) After: mov [20] 0x68(reg5) -> reg0 type='struct page*[]' size=0x80 (die:0x4e61d32) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240507041338.2081775-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-08perf hist: Avoid 'struct hist_entry_iter' mem_info memory leakIan Rogers2-28/+19
'struct mem_info' is reference counted while 'struct branch_info' and he_cache (struct hist_entry **) are not. Break apart the priv field in 'struct hist_entry_iter' so that we can know which values are owned by the iter and do the appropriate free or put. Move hide_unresolved to marginally shrink the size of the now grown struct. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-08perf mem-info: Add reference count checkingIan Rogers11-88/+135
Add reference count checking and switch 'struct mem_info' usage to use accessor functions. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-08perf mem-info: Move mem-info out of mem-events and symbolIan Rogers15-63/+85
Move mem-info to its own header rather than having it split between mem-events and symbol. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>