summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
8 daysMerge tag 'riscv-for-linus-6.16-mw1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the FWFT SBI extension, which is part of SBI 3.0 and a dependency for many new SBI and ISA extensions - Support for getrandom() in the VDSO - Support for mseal - Optimized routines for raid6 syndrome and recovery calculations - kexec_file() supports loading Image-formatted kernel binaries - Improvements to the instruction patching framework to allow for atomic instruction patching, along with rules as to how systems need to behave in order to function correctly - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha, some SiFive vendor extensions - Various fixes and cleanups, including: misaligned access handling, perf symbol mangling, module loading, PUD THPs, and improved uaccess routines * tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits) riscv: uaccess: Only restore the CSR_STATUS SUM bit RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension RISC-V: Documentation: Add enough title underlines to CMODX riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE MAINTAINERS: Update Atish's email address riscv: uaccess: do not do misaligned accesses in get/put_user() riscv: process: use unsigned int instead of unsigned long for put_user() riscv: make unsafe user copy routines use existing assembly routines riscv: hwprobe: export Zabha extension riscv: Make regs_irqs_disabled() more clear perf symbols: Ignore mapping symbols on riscv RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND riscv: module: Optimize PLT/GOT entry counting riscv: Add support for PUD THP riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop ...
9 daysMerge tag 'riscv-mw2-6.16-rc1' of ↵Palmer Dabbelt1-0/+1
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next riscv patches for 6.16-rc1, part 2 * Performance improvements - Add support for vdso getrandom - Implement raid6 calculations using vectors - Introduce svinval tlb invalidation * Cleanup - A bunch of deduplication of the macros we use for manipulating instructions * Misc - Introduce a kunit test for kprobes - Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :)) [Palmer: There was a rebase between part 1 and part 2, so I've had to do some more git surgery here... at least two rounds of surgery...] * alex-pr-2: (866 commits) RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension riscv: Add kprobes KUnit test riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM riscv: kprobes: Move branch_funct3 to insn.h riscv: kprobes: Move branch_rs2_idx to insn.h Linux 6.15-rc6 Input: xpad - fix xpad_device sorting Input: xpad - add support for several more controllers Input: xpad - fix Share button on Xbox One controllers ...
9 daysperf symbols: Ignore mapping symbols on riscvHaibo Xu1-0/+6
RISCV ELF use mapping symbols with special names $x, $d to identify regions of RISCV code or code with different ISAs[1]. These symbols don't identify functions, so will confuse the perf output. The patch filters out these symbols at load time, similar to "4886f2ca perf symbols: Ignore mapping symbols on aarch64". [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/ master/riscv-elf.adoc#mapping-symbol Signed-off-by: Haibo Xu <haibo1.xu@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
11 daysMerge tag 'perf-tools-for-v6.16-1-2025-06-03' of ↵Linus Torvalds308-11885/+21998
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: "perf report/top/annotate TUI: - Accept the left arrow key as a Zoom out if done on the first column - Show if source code toggle status in title, to help spotting bugs with the various disassemblers (capstone, llvm, objdump) - Provide feedback on unhandled hotkeys Build: - Better inform when certain features are not available with warnings in the build process and in 'perf version --build-options' or 'perf -vv' perf record: - Improve the --off-cpu code by synthesizing events for switch-out -> switch-in intervals using a BPF program. This can be fine tuned using a --off-cpu-thresh knob perf report: - Add 'tgid' sort key perf mem/c2c: - Add 'op', 'cache', 'snoop', 'dtlb' output fields - Add support for 'ldlat' on AMD IBS (Instruction Based Sampling) perf ftrace: - Use process/session specific trace settings instead of messing with the global ftrace knobs perf trace: - Implement syscall summary in BPF - Support --summary-mode=cgroup - Always print return value for syscalls returning a pid - The rseq and set_robust_list don't return a pid, just -errno perf lock contention: - Symbolize zone->lock using BTF - Add -J/--inject-delay option to estimate impact on application performance by optimization of kernel locking behavior perf stat: - Improve hybrid support for the NMI watchdog warning Symbol resolution: - Handle 'u' and 'l' symbols in /proc/kallsyms, resolving some Rust symbols - Improve Rust demangler Hardware tracing: Intel PT: - Fix PEBS-via-PT data_src - Do not default to recording all switch events - Fix pattern matching with python3 on the SQL viewer script arm64: - Fixups for the hip08 hha PMU Vendor events: - Update Intel events/metrics files for alderlake, alderlaken, arrowlake, bonnell, broadwell, broadwellde, broadwellx, cascadelakex, clearwaterforest, elkhartlake, emeraldrapids, grandridge, graniterapids, haswell, haswellx, icelake, icelakex, ivybridge, ivytown, jaketown, lunarlake, meteorlake, nehalemep, nehalemex, rocketlake, sandybridge, sapphirerapids, sierraforest, skylake, skylakex, snowridgex, tigerlake, westmereep-dp, westmereep-sp, westmereep-sx python support: - Add support for event counts in the python binding, add a counting.py example perf list: - Display the PMU name associated with a perf metric in JSON perf test: - Hybrid improvements for metric value validation test - Fix LBR test by ignoring idle task - Add AMD IBS sw filter ana d'ldlat' tests - Add 'perf trace --summary-mode=cgroup' test - Add tests for the various language symbol demanglers Miscellaneous: - Allow specifying the cpu an event will be tied using '-e event/cpu=N/' - Sync various headers with the kernel sources - Add annotations to use clang's -Wthread-safety and fix some problems it detected - Make dump_stack() use perf's symbol resolution to provide better backtraces - Intel TPEBS support cleanups and fixes. TPEBS stands for Timed PEBS (Precision Event-Based Sampling), that adds timing info, the retirement latency of instructions - Various memory allocation (some detected by ASAN) and reference counting fixes - Add a 8-byte aligned PERF_RECORD_COMPRESSED2 to replace PERF_RECORD_COMPRESSED - Skip unsupported event types in perf.data files, don't stop when finding one - Improve lookups using hashmaps and binary searches" * tag 'perf-tools-for-v6.16-1-2025-06-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (206 commits) perf callchain: Always populate the addr_location map when adding IP perf lock contention: Reject more than 10ms delays for safety perf trace: Set errpid to false for rseq and set_robust_list perf symbol: Move demangling code out of symbol-elf.c perf trace: Always print return value for syscalls returning a pid perf script: Print PERF_AUX_FLAG_COLLISION flag perf mem: Show absolute percent in mem_stat output perf mem: Display sort order only if it's available perf mem: Describe overhead calculation in brief perf record: Fix incorrect --user-regs comments Revert "perf thread: Ensure comm_lock held for comm_list" perf test trace_summary: Skip --bpf-summary tests if no libbpf perf test intel-pt: Skip jitdump test if no libelf perf intel-tpebs: Avoid race when evlist is being deleted perf test demangle-java: Don't segv if demangling fails perf symbol: Fix use-after-free in filename__read_build_id perf pmu: Avoid segv for missing name/alias_name in wildcarding perf machine: Factor creating a "live" machine out of dwarf-unwind perf test: Add AMD IBS sw filter test perf mem: Count L2 HITM for c2c statistic ...
2025-05-31perf callchain: Always populate the addr_location map when adding IPIan Rogers3-5/+11
Dropping symbols also meant the callchain maps wasn't populated, but the callchain map is needed to find the DSO. Plumb the symbols option better, falling back to thread__find_map() rather than thread__find_symbol() when symbols are disabled. Fixes: 02b2705017d2e5ad ("perf callchain: Allow symbols to be optional when resolving a callchain") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Graham Woodward <graham.woodward@arm.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <martin.liska@hey.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steve Clevenger <scclevenger@os.amperecomputing.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Yujie Liu <yujie.liu@intel.com> Cc: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Zixian Cai <fzczx123@gmail.com> Link: https://lore.kernel.org/r/20250529044000.759937-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-31perf lock contention: Reject more than 10ms delays for safetyNamhyung Kim2-2/+11
Delaying kernel operations can be dangerous and the kernel may kill (non-sleepable) BPF programs running for long in the future. Limit the max delay to 10ms and update the document about it. $ sudo ./perf lock con -abl -J 100000us@cgroup_mutex true lock delay is too long: 100000us (> 10ms) Usage: perf lock contention [<options>] -J, --inject-delay <TIME@FUNC> Inject delays to specific locks Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250515181042.555189-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-30perf trace: Set errpid to false for rseq and set_robust_listAnubhav Shelat1-2/+2
The 'rseq' and 'set_robust_list' syscalls don't return a pid, so set errpid for both to false. Fixes: 0c1019e3463b263a ("perf trace: Mark the 'rseq' arg in the rseq syscall as coming from user space") Fixes: 1de5b5dcb8353f36 ("perf trace: Mark the 'head' arg in the set_robust_list syscall as coming from user space") Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250529143334.1469669-2-ashelat@redhat.com [ Remove explicit .errpid = false, omitting its initialization zeroes it, as noted by Namhyung ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-29perf symbol: Move demangling code out of symbol-elf.cIan Rogers4-96/+87
symbol-elf.c is used when building with libelf, symbol-minimal is used otherwise. There is no reason the demangling code with no dependencies on libelf is part of symbol-elf.c so move to symbol.c. This allows demangling tests to pass with NO_LIBELF=1. Structurally, while moving the functions rename demangle_sym() to dso__demangle_sym() which is already a function exposed in symbol.h and the only purpose of which in symbol-elf.c was to call demangle_sym(). Change the calls to demangle_sym() in symbol-elf.c to calls to dso__demangle_sym(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528210858.499898-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf trace: Always print return value for syscalls returning a pidAnubhav Shelat1-1/+1
The syscalls that were consistently observed were set_robust_list and rseq. This is because perf cannot find their child process. This change ensures that the return value is always printed. Before: 0.256 ( 0.001 ms): set_robust_list(head: 0x7f09c77dba20, len: 24) = 0.259 ( 0.001 ms): rseq(rseq: 0x7f09c77dc0e0, rseq_len: 32, sig: 1392848979) = After: 0.270 ( 0.002 ms): set_robust_list(head: 0x7f0bb14a6a20, len: 24) = 0 0.273 ( 0.002 ms): rseq(rseq: 0x7f0bb14a70e0, rseq_len: 32, sig: 1392848979) = 0 Committer notes: As discussed in the thread in the Link: tag below, these two don't return a pid, but for syscalls returning one, we need to print the result and if we manage to find the children in 'perf trace' data structures, then print its name as well. Fixes: 11c8e39f5133aed9 ("perf trace: Infrastructure to show COMM strings for syscalls returning PIDs") Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403160411.159238-2-ashelat@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf script: Print PERF_AUX_FLAG_COLLISION flagLeo Yan1-2/+3
Print out the collision flag for AUX trace data. This is helpful for inspecting sample collisions. After: 0x217b60@/data_nvme1n1/niayan01/upstream/perf.data [0x40]: event: 11 . . ... raw event: size 64 bytes . 0000: 0b 00 00 00 00 00 40 00 d2 ef 3f 00 00 00 00 00 ......@...?..... . 0010: ff 0f 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................ . 0020: 1c 01 00 00 1c 01 00 00 10 bf 38 d6 11 01 00 00 ..........8..... . 0030: 03 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00 ................ 3 1176120114960 0x217b60 [0x40]: PERF_RECORD_AUX offset: 0x3fefd2 size: 0xfff flags: 0x8 [C] The added character '[C]' indicates the collision. Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250528153519.188644-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf mem: Show absolute percent in mem_stat outputNamhyung Kim1-1/+1
Currently the output sums up to 100% for each entry. But it can be confusing when it's displayed with 'overhead'. Before: $ perf mem report -F overhead,sample,cache,comm ... # -------------- Cache -------------- # Overhead Samples L1 L2 L3 L1-buf Other Command # ........ ............ ................................... ............... # 25.38% 517 34.6% 0.0% 15.8% 23.3% 26.2% swapper 9.03% 239 35.4% 0.8% 9.1% 22.1% 32.6% chrome 8.61% 233 45.3% 1.2% 8.9% 22.7% 21.9% Chrome_ChildIOT 7.81% 189 33.6% 0.4% 5.5% 35.9% 24.6% Isolated Web Co 3.73% 103 40.4% 0.3% 2.7% 39.4% 17.2% gnome-shell Let's convert it to use absolute percent value so that it can add up to the overhead for that entry. After: # -------------- Cache -------------- # Overhead Samples L1 L2 L3 L1-buf Other Command # ........ ............ ................................... ............... # 25.38% 517 8.8% 0.0% 4.0% 5.9% 6.7% swapper 9.03% 239 3.2% 0.1% 0.8% 2.0% 2.9% chrome 8.61% 233 3.9% 0.1% 0.8% 2.0% 1.9% Chrome_ChildIOT 7.81% 189 2.6% 0.0% 0.4% 2.8% 1.9% Isolated Web Co 3.73% 103 1.5% 0.0% 0.1% 1.5% 0.6% gnome-shell This aligns well with the existing 'mem' sort key. $ perf mem report -s comm,mem -H ... # # Overhead Samples Command / Memory access # ......................... .......................................... # 25.38% 517 swapper 8.78% 150 L1 hit 6.66% 72 RAM hit 5.92% 137 LFB/MAB hit 4.02% 157 L3 hit 0.00% 1 L3 miss 9.03% 239 chrome 3.19% 117 L1 hit 2.94% 35 RAM hit 1.99% 48 LFB/MAB hit 0.82% 32 L3 hit 0.08% 5 L2 hit 0.00% 2 L3 miss We can add an option or a config to change the setting later. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf mem: Display sort order only if it's availableNamhyung Kim1-1/+4
IOW it's not used when -F option is used alone. Let's make it conditional to skip printing incorrect information. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf mem: Describe overhead calculation in briefRavi Bangoria1-0/+19
Unlike perf-report which uses sample period for overhead calculation, perf-mem overhead is calculated using sample weight. Describe perf-mem overhead calculation method in it's man page. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250523222157.1259998-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf record: Fix incorrect --user-regs commentsDapeng Mi1-1/+1
The comment of "--user-regs" option is not correct, fix it. "on interrupt," -> "in user space," Fixes: 84c417422798c897 ("perf record: Support direct --user-regs arguments") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250403060810.196028-1-dapeng1.mi@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28Revert "perf thread: Ensure comm_lock held for comm_list"Arnaldo Carvalho de Melo3-20/+8
This reverts commit 8f454c95817d15ee529d58389612ea4b34f5ffb3. 'perf top' is freezing on exit sometimes, bisected to this one, revert. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Fei Lang <langfei@huawei.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/aDcyvvOKZkRYbjul@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test trace_summary: Skip --bpf-summary tests if no libbpfIan Rogers1-0/+6
If perf is built without libbpf (e.g. NO_LIBBPF=1) then the --bpf-summary perf trace tests will fail. Skip the tests as this is expected behavior. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test intel-pt: Skip jitdump test if no libelfIan Rogers1-0/+5
jitdump support is only present if building with libelf. Skip the intel-pt jitdump test if perf isn't compiled with libelf support. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf intel-tpebs: Avoid race when evlist is being deletedIan Rogers1-2/+10
Reading through the evsel->evlist may seg fault if a sample arrives when the evlist is being deleted. Detect this case and ignore samples arriving when the evlist is being deleted. Fixes: bcfab08db7fb38bf ("perf intel-tpebs: Filter non-workload samples") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test demangle-java: Don't segv if demangling failsIan Rogers1-0/+5
The buffer returned by dso__demangle_sym() may be NULL, don't segv in strcmp if this happens. Currently this happens for NO_LIBELF=1 builds. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250528032637.198960-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf symbol: Fix use-after-free in filename__read_build_idIan Rogers1-98/+70
The same buf is used for the program headers and reading notes. As the notes memory may be reallocated then this corrupts the memory pointed to by the phdr. Using the same buffer is in any case a logic error. Rather than deal with the duplicated code, introduce an elf32 boolean and a union for either the elf32 or elf64 headers that are in use. Let the program headers have their own memory and grow the buffer for notes as necessary. Before `perf list -j` compiled with asan would crash with: ``` ==4176189==ERROR: AddressSanitizer: heap-use-after-free on address 0x5160000070b8 at pc 0x555d3b15075b bp 0x7ffebb5a8090 sp 0x7ffebb5a8088 READ of size 8 at 0x5160000070b8 thread T0 #0 0x555d3b15075a in filename__read_build_id tools/perf/util/symbol-minimal.c:212:25 #1 0x555d3ae43aff in filename__sprintf_build_id tools/perf/util/build-id.c:110:8 ... 0x5160000070b8 is located 312 bytes inside of 560-byte region [0x516000006f80,0x5160000071b0) freed by thread T0 here: #0 0x555d3ab21840 in realloc (perf+0x264840) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e) #1 0x555d3b1506e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:206:11 ... previously allocated by thread T0 here: #0 0x555d3ab21423 in malloc (perf+0x264423) (BuildId: 12dff2f6629f738e5012abdf0e90055518e70b5e) #1 0x555d3b1503a2 in filename__read_build_id tools/perf/util/symbol-minimal.c:182:9 ... ``` Note: this bug is long standing and not introduced by the other asan fix in commit fa9c4977fbfb ("perf symbol-minimal: Fix double free in filename__read_build_id"). Fixes: b691f64360ecec49 ("perf symbols: Implement poor man's ELF parser") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250528032637.198960-2-irogers@google.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf pmu: Avoid segv for missing name/alias_name in wildcardingIan Rogers1-0/+3
The pmu name or alias_name fields may be NULL and should be skipped if so. This is done in all loops of perf_pmu___name_match except the final wildcard loop which was an oversight. Fixes: 63e287131cf0c59b ("perf pmu: Rename name matching for no suffix or wildcard variants") Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250527215035.187992-1-irogers@google.com [ Fixup the Fixes: tag to the right commit ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf machine: Factor creating a "live" machine out of dwarf-unwindIan Rogers3-38/+51
Factor out for use in places other than the dwarf unwinding tests for libunwind. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250313052952.871958-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf test: Add AMD IBS sw filter testNamhyung Kim1-0/+67
The kernel v6.14 added 'swfilt' to support privilege filtering in software so that IBS can be used by regular users. Add a test case in x86 to verify the behavior. $ sudo perf test -vv 'IBS software filter' 113: AMD IBS software filtering: --- start --- test child forked, pid 178826 check availability of IBS swfilt run perf record with modifier and swfilt [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 0.000 MB /dev/null ] check number of samples with swfilt [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.037 MB - ] [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.041 MB - ] ---- end(0) ---- 113: AMD IBS software filtering : Ok Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a 9950x3d Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250524002754.1266681-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-28perf mem: Count L2 HITM for c2c statisticYicong Yang1-1/+4
L2 HITM is not counted in c2c statistic decoding. Count it for lcl_hitm like how we handle L2 Peer snoop. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: CaiJingtao <caijingtao@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Yushan Wang <wangyushan12@huawei.com> Cc: Zeng Tao <prime.zeng@hisilicon.com> Cc: xueshan2@huawei.com Link: https://lore.kernel.org/r/20250425033845.57671-4-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27perf arm-spe: Add support for SPE Data Source packet on HiSilicon HIP12Yicong Yang2-0/+113
Add data source encoding for HiSilicon HIP12 and coresponding mapping to the perf's memory data source. This will help to synthesize the data and support upper layer tools like perf-mem and perf-c2c. Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: CaiJingtao <caijingtao@huawei.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Cc: Yushan Wang <wangyushan12@huawei.com> Cc: Zeng Tao <prime.zeng@hisilicon.com> Cc: xueshan2@huawei.com Link: https://lore.kernel.org/r/20250425033845.57671-3-yangyicong@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-27Merge tag 'x86-core-2025-05-25' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core x86 updates from Ingo Molnar: "Boot code changes: - A large series of changes to reorganize the x86 boot code into a better isolated and easier to maintain base of PIC early startup code in arch/x86/boot/startup/, by Ard Biesheuvel. Motivation & background: | Since commit | | c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C") | | dated Jun 6 2017, we have been using C code on the boot path in a way | that is not supported by the toolchain, i.e., to execute non-PIC C | code from a mapping of memory that is different from the one provided | to the linker. It should have been obvious at the time that this was a | bad idea, given the need to sprinkle fixup_pointer() calls left and | right to manipulate global variables (including non-pointer variables) | without crashing. | | This C startup code has been expanding, and in particular, the SEV-SNP | startup code has been expanding over the past couple of years, and | grown many of these warts, where the C code needs to use special | annotations or helpers to access global objects. This tree includes the first phase of this work-in-progress x86 boot code reorganization. Scalability enhancements and micro-optimizations: - Improve code-patching scalability (Eric Dumazet) - Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper) CPU features enumeration updates: - Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S. Darwish) - Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish, Thomas Gleixner) - Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish) Memory management changes: - Allow temporary MMs when IRQs are on (Andy Lutomirski) - Opt-in to IRQs-off activate_mm() (Andy Lutomirski) - Simplify choose_new_asid() and generate better code (Borislav Petkov) - Simplify 32-bit PAE page table handling (Dave Hansen) - Always use dynamic memory layout (Kirill A. Shutemov) - Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov) - Make 5-level paging support unconditional (Kirill A. Shutemov) - Stop prefetching current->mm->mmap_lock on page faults (Mateusz Guzik) - Predict valid_user_address() returning true (Mateusz Guzik) - Consolidate initmem_init() (Mike Rapoport) FPU support and vector computing: - Enable Intel APX support (Chang S. Bae) - Reorgnize and clean up the xstate code (Chang S. Bae) - Make task_struct::thread constant size (Ingo Molnar) - Restore fpu_thread_struct_whitelist() to fix CONFIG_HARDENED_USERCOPY=y (Kees Cook) - Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg Nesterov) - Always preserve non-user xfeatures/flags in __state_perm (Sean Christopherson) Microcode loader changes: - Help users notice when running old Intel microcode (Dave Hansen) - AMD: Do not return error when microcode update is not necessary (Annie Li) - AMD: Clean the cache if update did not load microcode (Boris Ostrovsky) Code patching (alternatives) changes: - Simplify, reorganize and clean up the x86 text-patching code (Ingo Molnar) - Make smp_text_poke_batch_process() subsume smp_text_poke_batch_finish() (Nikolay Borisov) - Refactor the {,un}use_temporary_mm() code (Peter Zijlstra) Debugging support: - Add early IDT and GDT loading to debug relocate_kernel() bugs (David Woodhouse) - Print the reason for the last reset on modern AMD CPUs (Yazen Ghannam) - Add AMD Zen debugging document (Mario Limonciello) - Fix opcode map (!REX2) superscript tags (Masami Hiramatsu) - Stop decoding i64 instructions in x86-64 mode at opcode (Masami Hiramatsu) CPU bugs and bug mitigations: - Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov) - Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov) - Restructure and harmonize the various CPU bug mitigation methods (David Kaplan) - Fix spectre_v2 mitigation default on Intel (Pawan Gupta) MSR API: - Large MSR code and API cleanup (Xin Li) - In-kernel MSR API type cleanups and renames (Ingo Molnar) PKEYS: - Simplify PKRU update in signal frame (Chang S. Bae) NMI handling code: - Clean up, refactor and simplify the NMI handling code (Sohil Mehta) - Improve NMI duration console printouts (Sohil Mehta) Paravirt guests interface: - Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov) SEV support: - Share the sev_secrets_pa value again (Tom Lendacky) x86 platform changes: - Introduce the <asm/amd/> header namespace (Ingo Molnar) - i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to <asm/amd/fch.h> (Mario Limonciello) Fixes and cleanups: - x86 assembly code cleanups and fixes (Uros Bizjak) - Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf, Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)" * tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits) x86/bugs: Fix spectre_v2 mitigation default on Intel x86/bugs: Restructure ITS mitigation x86/xen/msr: Fix uninitialized variable 'err' x86/msr: Remove a superfluous inclusion of <asm/asm.h> x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only x86/mm/64: Make 5-level paging support unconditional x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model x86/mm/64: Always use dynamic memory layout x86/bugs: Fix indentation due to ITS merge x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor() x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2() x86/cpuid: Rename have_cpuid_p() to cpuid_feature() x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h> x86/msr: Add rdmsrl_on_cpu() compatibility wrapper x86/mm: Fix kernel-doc descriptions of various pgtable methods x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too x86/boot: Defer initialization of VM space related global variables ...
2025-05-25Merge branch 'locking/futex' into locking/core, to pick up pending futex changesIngo Molnar8-1/+103
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-24perf tests switch-tracking: Fix timestamp comparisonLeo Yan1-1/+1
The test might fail on the Arm64 platform with the error: # perf test -vvv "Track with sched_switch" Missing sched_switch events # The issue is caused by incorrect handling of timestamp comparisons. The comparison result, a signed 64-bit value, was being directly cast to an int, leading to incorrect sorting for sched events. The case does not fail everytime, usually I can trigger the failure after run 20 ~ 30 times: # while true; do perf test "Track with sched_switch"; done 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : FAILED! 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok 106: Track with sched_switch : FAILED! 106: Track with sched_switch : Ok 106: Track with sched_switch : Ok I used cross compiler to build Perf tool on my host machine and tested on Debian / Juno board. Generally, I think this issue is not very specific to GCC versions. As both internal CI and my local env can reproduce the issue. My Host Build compiler: # aarch64-linux-gnu-gcc --version aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0 Juno Board: # lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 12 (bookworm) Release: 12 Codename: bookworm Fix this by explicitly returning 0, 1, or -1 based on whether the result is zero, positive, or negative. Fixes: d44bc558297222d9 ("perf tests: Add a test for tracking with sched_switch") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250331172759.115604-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf pmu intel: Adjust cpumaks for sub-NUMA clusters on graniterapidsIan Rogers1-5/+263
On graniterapids the cache home agent (CHA) and memory controller (IMC) PMUs all have their cpumask set to per-socket information. In order for per NUMA node aggregation to work correctly the PMUs cpumask needs to be set to CPUs for the relevant sub-NUMA grouping. For example, on a 2 socket graniterapids machine with sub NUMA clustering of 3, for uncore_cha and uncore_imc PMUs the cpumask is "0,120" leading to aggregation only on NUMA nodes 0 and 3: ``` $ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1 Performance counter stats for 'system wide': N0 1 277,835,681,344 UNC_CHA_CLOCKTICKS N0 1 19,242,894,228 UNC_M_CLOCKTICKS N3 1 277,803,448,124 UNC_CHA_CLOCKTICKS N3 1 19,240,741,498 UNC_M_CLOCKTICKS 1.002113847 seconds time elapsed ``` By updating the PMUs cpumasks to "0,120", "40,160" and "80,200" then the correctly 6 NUMA node aggregations are achieved: ``` $ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1 Performance counter stats for 'system wide': N0 1 92,748,667,796 UNC_CHA_CLOCKTICKS N0 0 6,424,021,142 UNC_M_CLOCKTICKS N1 0 92,753,504,424 UNC_CHA_CLOCKTICKS N1 1 6,424,308,338 UNC_M_CLOCKTICKS N2 0 92,751,170,084 UNC_CHA_CLOCKTICKS N2 0 6,424,227,402 UNC_M_CLOCKTICKS N3 1 92,745,944,144 UNC_CHA_CLOCKTICKS N3 0 6,423,752,086 UNC_M_CLOCKTICKS N4 0 92,725,793,788 UNC_CHA_CLOCKTICKS N4 1 6,422,393,266 UNC_M_CLOCKTICKS N5 0 92,717,504,388 UNC_CHA_CLOCKTICKS N5 0 6,421,842,618 UNC_M_CLOCKTICKS 1.003406645 seconds time elapsed ``` In general, having the perf tool adjust cpumasks isn't desirable as ideally the PMU driver would be advertising the correct cpumask. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250515181417.491401-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf tests trace_summary.sh: Run in exclusive modeArnaldo Carvalho de Melo1-1/+1
And it is being successfull only when running alone, probably because there are some tests that add the vfs_getname probe that gets used by 'perf trace' and alter how it does syscall arg pathname resolution. This should be removed or made a fallback to the preferred BPF mode of getting syscall parameters, but till then, run this in exclusive mode. For reference, here are some of the tests that run close to this one: 127: perf record offcpu profiling tests : Ok 128: perf all PMU test : Ok 129: perf stat --bpf-counters test : Ok 130: Check Arm CoreSight trace data recording and synthesized samples: Skip 131: Check Arm CoreSight disassembly script completes without errors : Skip 132: Check Arm SPE trace data recording and synthesized samples : Skip 133: Test data symbol : Ok 134: Miscellaneous Intel PT testing : Skip 135: test Intel TPEBS counting mode : Skip 136: perf script task-analyzer tests : Ok 137: Check open filename arg using perf trace + vfs_getname : Ok 138: perf trace summary : Ok Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aC-hHTgArwlF_zu9@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf test: Add cgroup summary test case for 'perf trace'Namhyung Kim1-0/+6
$ sudo ./perf test -vv 112 112: perf trace summary: --- start --- test child forked, pid 1018940 testing: perf trace -s -- true testing: perf trace -S -- true testing: perf trace -s --summary-mode=thread -- true testing: perf trace -S --summary-mode=total -- true testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true testing: perf trace -as --summary-mode=total --no-bpf-summary -- true testing: perf trace -as --summary-mode=thread --bpf-summary -- true testing: perf trace -as --summary-mode=total --bpf-summary -- true testing: perf trace -aS --summary-mode=total --bpf-summary -- true testing: perf trace -as --summary-mode=cgroup --bpf-summary -- true testing: perf trace -aS --summary-mode=cgroup --bpf-summary -- true ---- end(0) ---- 112: perf trace summary : Ok Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250522142551.1062417-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf python: Add counting.py as example for counting perf eventsGautam Menghani1-0/+36
Add counting.py - a python version of counting.c to demonstrate measuring and reading of counts for given perf events. Committer testing: Build perf and make the generated python binding somewhere you can point to to avoid using the one in the distro python3-perf (fedora, may be different in other distros): $ make -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin Copy /tmp/build/perf-tools-next/python/perf.cpython-313-x86_64-linux-gnu.so to somewhere outside this toolbox container and then use it with root: # export PYTHONPATH=/root/python/ # ls -la /root/python/ total 10640 drwxr-xr-x. 1 root root 72 May 21 11:40 . dr-xr-x---. 1 root root 574 May 21 11:40 .. -rwxr-xr-x. 1 acme acme 10894360 May 21 11:40 perf.cpython-313-x86_64-linux-gnu.so # tools/perf/python/counting.py | head -5 For evsel(software/cpu-clock/) val: 2930946 enable: 2932479 run: 2932479 For evsel(software/cpu-clock/) val: 2924975 enable: 2926267 run: 2926267 For evsel(software/cpu-clock/) val: 2921017 enable: 2922430 run: 2922430 For evsel(software/cpu-clock/) val: 2914966 enable: 2916549 run: 2916549 For evsel(software/cpu-clock/) val: 2910027 enable: 2911589 run: 2911589 # Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> [ make the API take a CPU and thread then compute from these the appropriate indices. ] Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/linux-perf-users/CAP-5=fWb-=hCYmpg7U5N9C94EucQGTOS7YwR2-fo4ptOexzxyg@mail.gmail.com/ Link: https://lore.kernel.org/r/20250519195148.1708988-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf python: Add evlist close supportGautam Menghani1-0/+16
Add support for the evlist close function. Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250519195148.1708988-7-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf python: Add evsel read methodGautam Menghani1-0/+37
Add the evsel read method to enable python to read counter data for the given evsel. Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/linux-perf-users/20250512055748.479786-1-gautam@linux.ibm.com/ Link: https://lore.kernel.org/r/20250519195148.1708988-6-irogers@google.com [ make the API take a CPU and thread then compute from these the appropriate indices. ] Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-23perf python: Add support for 'struct perf_counts_values' to return counter dataGautam Menghani1-1/+91
Add support for the perf_counts_values struct to enable the python bindings to read and return the counter data. Committer notes: Use T_ULONG instead of Py_T_ULONG, as all the other PyMemberDef arrays, fixing the build with older python3 versions. Use { .name = NULL, } to finish the new PyMemberDef pyrf_counts_values_members array, again as the other arrays to please some clang versions, ditto for PyGetSetDef. Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250519195148.1708988-5-irogers@google.com Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21perf python: Add evsel cpus and threads functionsIan Rogers1-0/+33
Allow access to cpus and thread_map structs associated with an evsel. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250519195148.1708988-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21perf test amd: Skip amd-ibs-period test on kernel < v6.15Ravi Bangoria1-0/+29
Bunch of IBS kernel fixes went in v6.15-rc1 [1]. The amd-ibs-period test will fail without those kernel patches. Skip the test on system running kernel older than v6.15 to distinguish genuine new failures vs known failure due to old kernel. Since all the related IBS fixes went in -rc1 itself, the ">= 6.15" check will work for any custom compiled v6.15-* kernel as well. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Closes: https://lore.kernel.org/r/aCfuGXUnNIbnYo_r@x1 Link: https://lore.kernel.org/r/20250115054438.1021-1-ravi.bangoria@amd.com [1] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21perf thread: Ensure comm_lock held for comm_listIan Rogers3-8/+20
Add thread safety annotations for comm_list and add locking for two instances where the list is accessed without the lock held (in contradiction to ____thread__set_comm()). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Fei Lang <langfei@huawei.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/20250519224645.1810891-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21perf rwsem: Add clang's -Wthread-safety annotationsIan Rogers4-5/+23
Add annotations used by clang's -Wthread-safety. Fix dsos compilation errors caused by a lock of annotations. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Fei Lang <langfei@huawei.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/20250519224645.1810891-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21perf dso: Minor refactor to allow clang's Wthread-safety analysisIan Rogers1-19/+26
The pattern: ``` if (x) { lock(...) } block1; if (x) { unlock(...) } ``` defeats clang's -Wthread-safety analysis where it complains of locks held on one path and not another. Add helper functions for "block1" then restructure as: ``` if (x) { lock(...); block1(); unlock(...); } else { block1(); } ``` Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com> Cc: Fei Lang <langfei@huawei.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Link: https://lore.kernel.org/r/20250519224645.1810891-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-21tools headers: Synchronize prctl.h ABI headerSebastian Andrzej Siewior1-1/+3
The prctl.h ABI header was slightly updated during the development of the interface. In particular the "immutable" parameter became a bit in the option argument. Synchronize prctl.h ABI header again and make use of the definition in the testsuite and "perf bench futex". Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: André Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/r/20250517151455.1065363-5-bigeasy@linutronix.de
2025-05-20perf ftrace: Use process/session specific trace settingsThomas Richter1-14/+87
Executing 'perf ftrace' commands 'ftrace', 'profile' and 'latency' leave tracing disabled as can seen in this output: # echo 1 > /sys/kernel/debug/tracing/tracing_on # cat /sys/kernel/debug/tracing/tracing_on 1 # perf ftrace trace --graph-opts depth=5 sleep 0.1 > /dev/null # cat /sys/kernel/debug/tracing/tracing_on 0 # The 'tracing_on' file is not restored to its value before the command. To fix that this patch uses the .../tracing/instances/XXX subdirectory feature. Each 'perf ftrace' invocation creates its own session/process specific subdirectory and does not change the global state in the .../tracing directory itself. Use rmdir(../tracing/instances/dir) to stop process/session specific tracing and delete all process/session specific setings. Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20250520093726.2009696-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-20tools include UAPI: Sync linux/vhost.h with the kernel sourcesArnaldo Carvalho de Melo1-2/+2
To get the changes in: a940e0a685575424 ("vhost: fix VHOST_*_OWNER documentation") That just changed lines in comments This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/vhost.h include/uapi/linux/vhost.h Please see tools/include/uapi/README for further details. Acked-by: Stefano Garzarella <sgarzare@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250519214126.1652491-2-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-20perf test probe_vfs_getname: Add regex for searching probe lineLeo Yan1-2/+10
Since commit 611851010c74046c ("fs: dedup handling of struct filename init and refcounts bumps"), the kernel has been refactored to use a new inline function initname(), moving name initialization into it. As a result, the perf probe test can no longer find the source line that matches the defined regular expressions. This causes the script to fail when attempting to add probes. Add a regular expression to search for the call site of initname(). This provides a valid source line number for adding the probe. Keeps the older regular expressions for passing test on older kernels. Fixes: 611851010c74046c ("fs: dedup handling of struct filename init and refcounts bumps") Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jakub Brnak <jbrnak@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250519082755.1669187-1-leo.yan@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16perf record: Fix a asan runtime error in util/maps.cChun-Tse Shao1-3/+6
If I build perf with asan and run Zstd test: $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined" $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv 83: Zstd perf.data compression/decompression: ... util/maps.c:1046:5: runtime error: null pointer passed as argument 2, which is declared to never be null ... The issue was caused by `bsearch`. The patch adds a check to ensure argument 2 and 3 are not NULL and 0. Testing with the commands above confirms that the runtime error is resolved. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303183646.327510-2-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2Chun-Tse Shao5-13/+51
The original PERF_RECORD_COMPRESS is not 8-byte aligned, which can cause asan runtime error: # Build with asan $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=undefined" # Test success with many asan runtime errors: $ /tmp/perf/perf test "Zstd perf.data compression/decompression" -vv 83: Zstd perf.data compression/decompression: ... util/session.c:1959:13: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 13 byte alignment 0x7f69e3f99653: note: pointer points here d0 3a 50 69 44 00 00 00 00 00 08 00 bb 07 00 00 00 00 00 00 44 00 00 00 00 00 00 00 ff 07 00 00 ^ util/session.c:2163:22: runtime error: member access within misaligned address 0x7f69e3f99653 for type 'union perf_event', which requires 8 byte alignment 0x7f69e3f99653: note: pointer points here d0 3a 50 69 44 00 00 00 00 00 08 00 bb 07 00 00 00 00 00 00 44 00 00 00 00 00 00 00 ff 07 00 00 ^ ... Since there is no way to align compressed data in zstd compression, this patch add a new event type `PERF_RECORD_COMPRESSED2`, which adds a field `data_size` to specify the actual compressed data size. The `header.size` contains the total record size, including the padding at the end to make it 8-byte aligned. Tested with `Zstd perf.data compression/decompression` Signed-off-by: Chun-Tse Shao <ctshao@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250303183646.327510-1-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-16perf intel-tpebs: Filter non-workload samplesIan Rogers1-1/+58
If perf is running with a benchmark then we want the retirement latency samples associated with the benchmark rather than from the system as a whole. Use the workload's PID to filter out samples that aren't from the workload or its children. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250430200108.243234-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-15perf test: Allow tolerance for leader sampling testChun-Tse Shao1-6/+27
There is a known issue that the leader sampling is inconsistent, since throttle only affect leader, not the slave. The detail is in [1]. To maintain test coverage, this patch sets a tolerance rate of 80% to accommodate the throttled samples and prevent test failures due to throttling. [1] lore.kernel.org/20250328182752.769662-1-ctshao@google.com Suggested-by: Ian Rogers <irogers@google.com> Suggested-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Co-developed-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20250430140611.599078-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14perf test: Add stat uniquifying testChun-Tse Shao1-0/+69
The `stat+uniquify.sh` test retrieves all uniquified `clockticks` events from `perf list -v clockticks` and check if `perf stat -e clockticks -A` contains all of them. Committer testing: root@x1:~# grep -m1 "model name" /proc/cpuinfo model name : 13th Gen Intel(R) Core(TM) i7-1365U root@x1:~# perf list clockticks List of pre-defined events (to be used in -e or -M): uncore_clock/clockticks/ [Kernel PMU event] uncore memory: unc_m_clockticks [Number of clocks. Unit: uncore_imc] root@x1:~# root@x1:~# perf test uniquifying 92: perf stat events uniquifying : Ok root@x1:~# perf test -vv uniquifying 92: perf stat events uniquifying: --- start --- test child forked, pid 1552628 stat event uniquifying test ---- end(0) ---- 92: perf stat events uniquifying : Ok root@x1:~# Signed-off-by: Chun-Tse Shao <ctshao@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250513215401.2315949-4-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-14perf parse-events: Use wildcard processing to set an event to merge intoIan Rogers5-72/+88
The merge stat code fails for uncore events if they are repeated twice, for example `perf stat -e clockticks,clockticks -I 1000` as the counts of the second set of uncore events will be merged into the first counter. Reimplement the logic to have a first_wildcard_match so that merged later events correctly merge into the first wildcard event that they will be aggregated into. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250513215401.2315949-3-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>