summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-27perf intel-pt: Fix unassigned instruction op (discovered by MemorySanitizer)Adrian Hunter2-0/+4
MemorySanitizer discovered instances where the instruction op value was not assigned.: WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5581c00a76b3 in intel_pt_sample_flags tools/perf/util/intel-pt.c:1527:17 Uninitialized value was stored to memory at #0 0x5581c005ddf8 in intel_pt_walk_insn tools/perf/util/intel-pt-decoder/intel-pt-decoder.c:1256:25 The op value is used to set branch flags for branch instructions encountered when walking the code, so fix by setting op to INTEL_PT_OP_OTHER in other cases. Fixes: 4c761d805bb2d2ea ("perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type") Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Closes: https://lore.kernel.org/linux-perf-users/20240320162619.1272015-1-irogers@google.com/ Link: https://lore.kernel.org/r/20240326083223.10883-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf record: Fix comment misspellingsHoward Chu2-3/+3
Fix comment misspellings Signed-off-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425060427.1800663-1-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf annotate: Update DSO binary type when trying build-idNamhyung Kim1-0/+2
dso__disassemble_filename() tries to get the filename for objdump (or capstone) using build-id. But I found sometimes it didn't disassemble some functions. It turned out that those functions belong to a DSO which has no binary type set. It seems it sets the binary type for some special files only - like kernel (kallsyms or kcore) or BPF images. And there's a logic to skip dso with DSO_BINARY_TYPE__NOT_FOUND. As it's checked the build-id cache link, it should set the binary type as DSO_BINARY_TYPE__BUILD_ID_CACHE. Fixes: 873a83731f1cc85c ("perf annotate: Skip DSOs not found") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf annotate: Fallback disassemble to objdump when capstone failsNamhyung Kim1-0/+14
I found some cases that capstone failed to disassemble. Probably my capstone is an old version but anyway there's a chance it can fail. And then it silently stopped in the middle. In my case, it didn't understand "RDPKRU" instruction. Let's check if the capstone disassemble reached the end of the function and fallback to objdump if not. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf annotate-data: Check if 'struct annotation_source' was allocated on ↵Namhyung Kim1-1/+1
'perf report' TUI As it removed the sample accounting for code when no symbol sort key is given for 'perf report' TUI, it might not have allocated the 'struct annotated_source' yet. Let's check if it's NULL first. Fixes: 6cdd977ec24e1538 ("perf report: Do not collect sample histogram unnecessarily") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240424230015.1054013-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf test: Add a new test for 'perf annotate'Namhyung Kim1-0/+83
Add a basic 'perf annotate' test: $ ./perf test annotate -vv 76: perf annotate basic tests: --- start --- test child forked, pid 846989 fbcd0-fbd55 l noploop perf does have symbol 'noploop' Basic perf annotate test : 0 0xfbcd0 <noploop>: 0.00 : fbcd0: pushq %rbp 0.00 : fbcd1: movq %rsp, %rbp 0.00 : fbcd4: pushq %r12 0.00 : fbcd6: pushq %rbx 0.00 : fbcd7: movl $1, %ebx 0.00 : fbcdc: subq $0x10, %rsp 0.00 : fbce0: movq %fs:0x28, %rax 0.00 : fbce9: movq %rax, -0x18(%rbp) 0.00 : fbced: xorl %eax, %eax 0.00 : fbcef: testl %edi, %edi 0.00 : fbcf1: jle 0xfbd04 0.00 : fbcf3: movq (%rsi), %rdi 0.00 : fbcf6: movl $0xa, %edx 0.00 : fbcfb: xorl %esi, %esi 0.00 : fbcfd: callq 0x41920 0.00 : fbd02: movl %eax, %ebx 0.00 : fbd04: leaq -0x7b(%rip), %r12 # fbc90 <sighandler> 0.00 : fbd0b: movl $2, %edi 0.00 : fbd10: movq %r12, %rsi 0.00 : fbd13: callq 0x40a00 0.00 : fbd18: movl $0xe, %edi 0.00 : fbd1d: movq %r12, %rsi 0.00 : fbd20: callq 0x40a00 0.00 : fbd25: movl %ebx, %edi 0.00 : fbd27: callq 0x407c0 0.10 : fbd2c: movl 0x89785e(%rip), %eax # 993590 <done> 0.00 : fbd32: testl %eax, %eax 99.90 : fbd34: je 0xfbd2c 0.00 : fbd36: movq -0x18(%rbp), %rax 0.00 : fbd3a: subq %fs:0x28, %rax 0.00 : fbd43: jne 0xfbd50 0.00 : fbd45: addq $0x10, %rsp 0.00 : fbd49: xorl %eax, %eax 0.00 : fbd4b: popq %rbx 0.00 : fbd4c: popq %r12 0.00 : fbd4e: popq %rbp 0.00 : fbd4f: retq 0.00 : fbd50: callq 0x407e0 0.00 : fbcd0: pushq %rbp 0.00 : fbcd1: movq %rsp, %rbp 0.00 : fbcd4: pushq %r12 0.00 : fbcd0: push %rbp 0.00 : fbcd1: mov %rsp,%rbp 0.00 : fbcd4: push %r12 Basic annotate test [Success] ---- end(0) ---- 76: perf annotate basic tests : Ok Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240424001231.849972-1-namhyung@kernel.org [ Improved a bit the error messages ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Tidy the setting of the default event nameIan Rogers4-7/+19
Add comments. Pass ownership of the event name to save on a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Minor grouping tidy upIan Rogers2-1/+6
Add comments. Ensure leader->group_name is freed before overwriting it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-event: Constify event_symbol arraysIan Rogers2-4/+4
Moves 352 bytes from .data to .data.rel.ro. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Improvements to modifier parsingIan Rogers4-182/+194
Use a struct/bitmap rather than a copied string from lexer. In lexer give improved error message when too many precise flags are given or repeated modifiers. Before: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Bad modifier ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Bad modifier ... $ perf stat -e '{instructions:p,cycles:pp}:pp' -a true event syntax error: '..cycles:pp}:pp' \___ Bad modifier ... After: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Duplicate modifier 'k' (kernel) ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Maximum precise value is 3 ... $ perf stat -e '{instructions:p,cycles:pp}:pp' true event syntax error: '..cycles:pp}:pp' \___ Maximum combined precise value is 3, adding precision to "cycles:pp" ... Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Inline parse_events_evlist_errorIan Rogers3-13/+8
Inline parse_events_evlist_error that is only used in parse_events_error. Modify parse_events_error to not report a parser error unless errors haven't already been reported. Make it clearer that the latter case only happens for unrecognized input. Before: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error event syntax error: '..les/period=99999999999999999999/' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events After: $ perf stat -e 'cycles/period=99999999999999999999/xyz' true event syntax error: '..les/period=99999999999999999999/xyz' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ Unrecognized input Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Improve error message for bad numbersIan Rogers1-16/+24
Use the error handler from the parse_state to give a more informative error message. Before: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events After: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error event syntax error: '..les/period=99999999999999999999/' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Inline parse_events_update_listsIan Rogers3-31/+25
The helper function just wraps a splice and free. Making the free inline removes a comment, so then it just wraps a splice which we can make inline too. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Prefer sysfs/JSON hardware events over legacyIan Rogers4-68/+103
It was requested that RISC-V be able to add events to the perf tool so the PMU driver didn't need to map legacy events to config encodings: https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/ This change makes the priority of events specified without a PMU the same as those specified with a PMU, namely sysfs and JSON events are checked first before using the legacy encoding. The hw_term is made more generic as a hardware_event that encodes a pair of string and int value, allowing parse_events_multi_pmu_add to fall back on a known encoding when the sysfs/JSON adding fails for core events. As this covers PE_VALUE_SYM_HW, that token is removed and related code simplified. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Constify parse_events_add_numericIan Rogers2-10/+12
Allow the term list to be const so that other functions can pass const term lists. Add const as necessary to called functions. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Handle PE_TERM_HW in name_or_rawIan Rogers1-26/+5
Avoid duplicate logic for name_or_raw and PE_TERM_HW by having a rule to turn PE_TERM_HW into a name_or_raw. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Legacy cache names on all PMUs and lower priorityIan Rogers2-9/+32
Prior behavior is to not look for legacy cache names in sysfs/JSON and to create events on all core PMUs. New behavior is to look for sysfs/JSON events first on all PMUs, for core PMUs add a legacy event if the sysfs/JSON event isn't present. This is done so that there is consistency with how event names in terms are handled and their prioritization of sysfs/JSON over legacy. It may make sense to use a legacy cache event name as an event name on a non-core PMU so we should allow it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf tests parse-events: Use "branches" rather than "cache-references"Ian Rogers1-3/+3
Switch from "cache-references" to "branches" in test as Intel has a sysfs event for "cache-references" and changing the priority for sysfs over legacy causes the test to fail. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf pmu: Refactor perf_pmu__match()Ian Rogers3-26/+22
Move all implementation to pmu code. Don't allocate a fnmatch wildcard pattern, matching ignoring the suffix already handles this, and only use fnmatch if the given PMU name has a '*' in it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Avoid copying an empty listIan Rogers1-12/+13
In parse_events_add_pmu, delay copying the list of terms until it is known the list contains terms. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Directly pass PMU to parse_events_add_pmu()Ian Rogers1-29/+17
Avoid passing the name of a PMU then finding it again, just directly pass the PMU. parse_events_multi_pmu_add_or_add_pmu() is the only version that needs to find a PMU, so move the find there. Remove the error message as parse_events_multi_pmu_add_or_add_pmu will given an error at the end when a name isn't either a PMU name or event name. Without the error message being created the location in the input parameter (loc) can be removed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf parse-events: Factor out '<event_or_pmu>/.../' parsingIan Rogers3-73/+80
Factor out the case of an event or PMU name followed by a slash based term list. This is with a view to sharing the code with new legacy hardware parsing. Use early return to reduce indentation in the code. Make parse_events_add_pmu static now it doesn't need sharing with parse-events.y. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf scripts python: Add a script to run instances of 'perf script' in parallelAdrian Hunter2-1/+1013
Add a Python script to run a perf script command multiple times in parallel, using perf script options --cpu and --time so that each job processes a different chunk of the data. Extend perf script tests to test also the new script. The script supports the use of normal 'perf script' options like --dlfilter and --script, so that the benefit of running parallel jobs naturally extends to them also. In addition, a command can be provided (refer --pipe-to option) to pipe standard output to a custom command. Refer to the script's own help text at the end of the patch for more details. The script is useful for Intel PT traces, that can be efficiently decoded by 'perf script' when split by CPU and/or time ranges. Running jobs in parallel can decrease the overall decoding time. Committer testing: Ian reported that shellcheck found some issues, I installed it as there are no warnings about it not being available, but when available it fails the build with: TEST /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log CC /tmp/build/perf-tools-next/util/header.o In tests/shell/script.sh line 20: rm -rf "${temp_dir}/"* ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* . In tests/shell/script.sh line 83: output1_dir="${temp_dir}/output1" ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 84: output2_dir="${temp_dir}/output2" ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 86: python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?). For more information: https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif... https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev... https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ... Did these fixes: - rm -rf "${temp_dir}/"* + rm -rf "${temp_dir:?}/"* And: @@ -83,8 +83,8 @@ test_parallel_perf() output1_dir="${temp_dir}/output1" output2_dir="${temp_dir}/output2" perf record -o "${perf_data}" --sample-cpu uname - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" After that: root@number:~# perf test -vv "perf script tests" 97: perf script tests: --- start --- test child forked, pid 4084139 DB test [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ] <SNIP> DB test [Success] parallel-perf test Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ] Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 2 completed, 2 running Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 4 completed, 0 running All jobs finished successfully parallel-perf.py done Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 4 completed, 0 running Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 8 completed, 0 running Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 12 completed, 0 running Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 16 completed, 0 running Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 20 completed, 0 running Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 24 completed, 0 running Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 27 completed, 1 running Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 28 completed, 0 running All jobs finished successfully parallel-perf.py done parallel-perf test [Success] --- Cleaning up --- ---- end(0) ---- 97: perf script tests : Ok root@number:~# Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240423133248.10206-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27tools lib rbtree: Pick some improvements from the kernel rbtree codeArnaldo Carvalho de Melo2-3/+3
The tools/lib/rbtree.c code came from the kernel, removing the EXPORT_SYMBOL() that make sense only there, unfortunately it is not being checked with tools/perf/check_headers.sh, will try to remedy this, till then pick the improvements from: b0687c1119b4e8c8 ("lib/rbtree: use '+' instead of '|' for setting color.") That I noticed by doing: diff -u tools/lib/rbtree.c lib/rbtree.c diff -u tools/include/linux/rbtree_augmented.h include/linux/rbtree_augmented.h There is one other cases, but lets pick it in separate patches. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Noah Goldstein <goldstein.w.n@gmail.com> Link: https://lore.kernel.org/lkml/ZigZzeFoukzRKG1Q@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-27perf tests shell kprobes: Add missing description as used by 'perf test' outputArnaldo Carvalho de Melo1-0/+1
Before: root@x1:~# perf test 76 76: SPDX-License-Identifier: GPL-2.0 : Ok root@x1:~# After: root@x1:~# perf test 76 76: Add 'perf probe's, list and remove them. : Ok root@x1:~# Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://lore.kernel.org/lkml/ZigRDKUGkcDqD-yW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-23tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-1/+8
To pick up the changes from these csets: be482ff9500999f5 ("x86/bhi: Enumerate Branch History Injection (BHI) bug") 0f4a837615ff925b ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.before $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ make -C tools/perf O=/tmp/build/perf-tools-next <SNIP> CC /tmp/build/perf-tools-next/trace/beauty/tracepoints/x86_msr.o <SNIP> CC /tmp/build/perf-tools-next/util/amd-sample-raw.o <SNIP> $ objdump -dS /tmp/build/perf-tools-next/util/amd-sample-raw.o > amd-sample-raw.o.after $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > x86_msr.after $ diff -u x86_msr.before x86_msr.after $ diff -u amd-sample-raw.o.before amd-sample-raw.o.after Just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/ZifCnEZFx5MZQuIW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-22tools include UAPI: Sync linux/vhost.h with the kernel sourcesArnaldo Carvalho de Melo1-6/+14
To get the changes in: 2855c2a7820bc819 ("vhost-vdpa: change ioctl # for VDPA_GET_VRING_SIZE") 1496c47065f9f841 ("vhost-vdpa: uapi to support reporting per vq size") To pick up these changes and support them: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before $ cp include/uapi/linux/vhost.h tools/perf/trace/beauty/include/uapi/linux/vhost.h $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after $ diff -u before after --- before 2024-04-22 13:39:37.185674799 -0300 +++ after 2024-04-22 13:39:52.043344784 -0300 @@ -50,5 +50,6 @@ [0x7F] = "VDPA_GET_VRING_DESC_GROUP", [0x80] = "VDPA_GET_VQS_COUNT", [0x81] = "VDPA_GET_GROUP_NUM", + [0x82] = "VDPA_GET_VRING_SIZE", [0x8] = "NEW_WORKER", }; $ For instance, see how those 'cmd' ioctl arguments get translated, now VDPA_GET_VRING_SIZE will be as well: # perf trace -a -e ioctl --max-events=10 0.000 ( 0.011 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0 21.353 ( 0.014 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0 25.766 ( 0.014 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0 25.845 ( 0.034 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0 25.916 ( 0.011 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0 25.941 ( 0.025 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ATOMIC, arg: 0x7ffe4a22c840) = 0 32.915 ( 0.009 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_RMFB, arg: 0x7ffe4a22cf9c) = 0 42.522 ( 0.013 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0 42.579 ( 0.031 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0 42.644 ( 0.010 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0 # This addresses this perf tools build warning: diff -u tools/perf/trace/beauty/include/uapi/linux/vhost.h include/uapi/linux/vhost.h But this specific process, usually boring, this time around catch a problem, namely the addition of VDPA_GET_VRING_SIZE used an ioctl number already taken, which went on unnoticed and only got caught when the tools/perf/trace/beauty/vhost_virtio_ioctl.sh script was run as part of the perf tools process of updating the tools copies of system headers it uses for creating id->string tables that, well, broke the perf tools build because there were multiple initializations in the strings table for the 0x80 entry... I'm adding here a link to the discussion, that is lacking in the fix for the reported problem, and a quote from one of the developers involved: "Thanks a lot for taking care of this! So given the header is actually buggy pls hang on to this change until I merge the fix for the header (you were CC'd on the patch). It's great we have this redundancy which allowed us to catch the bug in time, and many thanks to Namhyung Kim for reporting the issue!" This is here as a hint for anyone thinking about ways to automate checking these issues in a more automated way... ;-) Link: https://lore.kernel.org/lkml/ 20240402172151-mutt-send-email-mst@kernel.org Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zhu Lingshan <lingshan.zhu@intel.com> Link: https://lore.kernel.org/lkml/ZiaW-csEZLKK48BE@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-22Merge remote-tracking branch 'torvalds/master' into perf-tools-nextArnaldo Carvalho de Melo2570-29435/+63268
To pick up fixes sent via perf-tools, by Namhyung Kim. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-21Linux 6.9-rc5Linus Torvalds1-1/+1
2024-04-21Merge tag 'char-misc-6.9-rc5' of ↵Linus Torvalds11-80/+104
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc driver fixes from Greg KH: "Here are some small char/misc and other driver fixes for 6.9-rc5. Included in here are the following: - binder driver fix for reported problem - speakup crash fix - mei driver fixes for reported problems - comdei driver fix - interconnect driver fixes - rtsx driver fix - peci.h kernel doc fix All of these have been in linux-next for over a week with no reported problems" * tag 'char-misc-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: peci: linux/peci.h: fix Excess kernel-doc description warning binder: check offset alignment in binder_get_object() comedi: vmk80xx: fix incomplete endpoint checking mei: vsc: Unregister interrupt handler for system suspend Revert "mei: vsc: Call wake_up() in the threaded IRQ handler" misc: rtsx: Fix rts5264 driver status incorrect when card removed mei: me: disable RPL-S on SPS and IGN firmwares speakup: Avoid crash on very long word interconnect: Don't access req_list while it's being manipulated interconnect: qcom: x1e80100: Remove inexistent ACV_PERF BCM
2024-04-21Merge tag 'driver-core-6.9-rc5' of ↵Linus Torvalds2-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull kernfs bugfix and documentation update from Greg KH: "Here are two changes for 6.9-rc5 that deal with "driver core" stuff, that do the following: - sysfs reference leak fix - embargoed-hardware-issues.rst update for Power Both of these have been in linux-next for over a week with no reported issues" * tag 'driver-core-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: embargoed-hardware-issues.rst: Add myself for Power fs: sysfs: Fix reference leak in sysfs_break_active_protection()
2024-04-21Merge tag 'tty-6.9-rc5' of ↵Linus Torvalds12-34/+81
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes for 6.9-rc5 that resolve a bunch of reported problems. Included in here are: - MAINTAINERS and .mailmap update for Richard Genoud - serial core regression fixes from 6.9-rc1 changes - pci id cleanups - serial core crash fix - stm32 driver fixes - 8250 driver fixes All of these have been in linux-next for a while with no reported problems" * tag 'tty-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: stm32: Reset .throttled state in .startup() serial: stm32: Return IRQ_NONE in the ISR if no handling happend serial: core: Fix missing shutdown and startup for serial base port serial: core: Clearing the circular buffer before NULLifying it MAINTAINERS: mailmap: update Richard Genoud's email address serial/pmac_zilog: Remove flawed mitigation for rx irq flood serial: 8250_pci: Remove redundant PCI IDs serial: core: Fix regression when runtime PM is not enabled serial: mxs-auart: add spinlock around changing cts state serial: 8250_dw: Revert: Do not reclock if already at correct rate serial: 8250_lpc18xx: disable clks on error in probe()
2024-04-21Merge tag 'usb-6.9-rc5' of ↵Linus Torvalds19-64/+147
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt driver fixes from Greg KH: "Here are some small USB and Thunderbolt driver fixes for 6.9-rc5. Included in here are: - MAINTAINER file update for invalid email address - usb-serial device id updates - typec driver fixes - thunderbolt / usb4 driver fixes - usb core shutdown fixes - cdc-wdm driver revert for reported problem in -rc1 - usb gadget driver fixes - xhci driver fixes All of these have been in linux-next for a while with no reported problems" * tag 'usb-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits) USB: serial: option: add Telit FN920C04 rmnet compositions usb: dwc3: ep0: Don't reset resource alloc flag Revert "usb: cdc-wdm: close race between read and workqueue" USB: serial: option: add Rolling RW101-GL and RW135-GL support USB: serial: option: add Lonsung U8300/U9300 product USB: serial: option: add support for Fibocom FM650/FG650 USB: serial: option: support Quectel EM060K sub-models USB: serial: option: add Fibocom FM135-GL variants usb: misc: onboard_usb_hub: Disable the USB hub clock on failure thunderbolt: Avoid notify PM core about runtime PM resume thunderbolt: Fix wake configurations after device unplug usb: dwc2: host: Fix dereference issue in DDMA completion flow. usb: typec: mux: it5205: Fix ChipID value typo MAINTAINERS: Drop Li Yang as their email address stopped working usb: gadget: fsl: Initialize udc before using it usb: Disable USB3 LPM at shutdown usb: gadget: f_ncm: Fix UAF ncm object at re-bind after usb ep transport error usb: typec: tcpm: Correct the PDO counting in pd_set usb: gadget: functionfs: Wait for fences before enqueueing DMABUF usb: gadget: functionfs: Fix inverted DMA fence direction ...
2024-04-21Merge tag 'sched_urgent_for_v6.9_rc5' of ↵Linus Torvalds3-6/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: - Add a missing memory barrier in the concurrency ID mm switching * tag 'sched_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Add missing memory barrier in switch_mm_cid
2024-04-21Merge tag 'x86_urgent_for_v6.9_rc5' of ↵Linus Torvalds5-12/+87
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Fix CPU feature dependencies of GFNI, VAES, and VPCLMULQDQ - Print the correct error code when FRED reports a bad event type - Add a FRED-specific INT80 handler without the special dances that need to happen in the current one - Enable the using-the-default-return-thunk-but-you-should-not warning only on configs which actually enable those special return thunks - Check the proper feature flags when selecting BHI retpoline mitigation * tag 'x86_urgent_for_v6.9_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpufeatures: Fix dependencies for GFNI, VAES, and VPCLMULQDQ x86/fred: Fix incorrect error code printout in fred_bad_type() x86/fred: Fix INT80 emulation for FRED x86/retpolines: Enable the default thunk warning only on relevant configs x86/bugs: Fix BHI retpoline check
2024-04-20Merge tag 'block-6.9-20240420' of git://git.kernel.dk/linuxLinus Torvalds4-13/+28
Pull block fixes from Jens Axboe: "Just two minor fixes that should go into the 6.9 kernel release, one fixing a regression with partition scanning errors, and one fixing a WARN_ON() that can get triggered if we race with a timer" * tag 'block-6.9-20240420' of git://git.kernel.dk/linux: blk-iocost: do not WARN if iocg was already offlined block: propagate partition scanning errors to the BLKRRPART ioctl
2024-04-20Merge tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds1-2/+2
Pull email address update from James Bottomley: "My IBM email has stopped working, so update to a working email address" * tag 'email' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: MAINTAINERS: update to working email address
2024-04-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds25-159/+267
Pull kvm fixes from Paolo Bonzini: "This is a bit on the large side, mostly due to two changes: - Changes to disable some broken PMU virtualization (see below for details under "x86 PMU") - Clean up SVM's enter/exit assembly code so that it can be compiled without OBJECT_FILES_NON_STANDARD. This fixes a warning "Unpatched return thunk in use. This should not happen!" when running KVM selftests. Everything else is small bugfixes and selftest changes: - Fix a mostly benign bug in the gfn_to_pfn_cache infrastructure where KVM would allow userspace to refresh the cache with a bogus GPA. The bug has existed for quite some time, but was exposed by a new sanity check added in 6.9 (to ensure a cache is either GPA-based or HVA-based). - Drop an unused param from gfn_to_pfn_cache_invalidate_start() that got left behind during a 6.9 cleanup. - Fix a math goof in x86's hugepage logic for KVM_SET_MEMORY_ATTRIBUTES that results in an array overflow (detected by KASAN). - Fix a bug where KVM incorrectly clears root_role.direct when userspace sets guest CPUID. - Fix a dirty logging bug in the where KVM fails to write-protect SPTEs used by a nested guest, if KVM is using Page-Modification Logging and the nested hypervisor is NOT using EPT. x86 PMU: - Drop support for virtualizing adaptive PEBS, as KVM's implementation is architecturally broken without an obvious/easy path forward, and because exposing adaptive PEBS can leak host LBRs to the guest, i.e. can leak host kernel addresses to the guest. - Set the enable bits for general purpose counters in PERF_GLOBAL_CTRL at RESET time, as done by both Intel and AMD processors. - Disable LBR virtualization on CPUs that don't support LBR callstacks, as KVM unconditionally uses PERF_SAMPLE_BRANCH_CALL_STACK when creating the perf event, and would fail on such CPUs. Tests: - Fix a flaw in the max_guest_memory selftest that results in it exhausting the supply of ucall structures when run with more than 256 vCPUs. - Mark KVM_MEM_READONLY as supported for RISC-V in set_memory_region_test" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits) KVM: Drop unused @may_block param from gfn_to_pfn_cache_invalidate_start() KVM: selftests: Add coverage of EPT-disabled to vmx_dirty_log_test KVM: x86/mmu: Fix and clarify comments about clearing D-bit vs. write-protecting KVM: x86/mmu: Remove function comments above clear_dirty_{gfn_range,pt_masked}() KVM: x86/mmu: Write-protect L2 SPTEs in TDP MMU when clearing dirty status KVM: x86/mmu: Precisely invalidate MMU root_role during CPUID update KVM: VMX: Disable LBR virtualization if the CPU doesn't support LBR callstacks perf/x86/intel: Expose existence of callback support to KVM KVM: VMX: Snapshot LBR capabilities during module initialization KVM: x86/pmu: Do not mask LVTPC when handling a PMI on AMD platforms KVM: x86: Snapshot if a vCPU's vendor model is AMD vs. Intel compatible KVM: x86: Stop compiling vmenter.S with OBJECT_FILES_NON_STANDARD KVM: SVM: Create a stack frame in __svm_sev_es_vcpu_run() KVM: SVM: Save/restore args across SEV-ES VMRUN via host save area KVM: SVM: Save/restore non-volatile GPRs in SEV-ES VMRUN via host save area KVM: SVM: Clobber RAX instead of RBX when discarding spec_ctrl_intercepted KVM: SVM: Drop 32-bit "support" from __svm_sev_es_vcpu_run() KVM: SVM: Wrap __svm_sev_es_vcpu_run() with #ifdef CONFIG_KVM_AMD_SEV KVM: SVM: Create a stack frame in __svm_vcpu_run() for unwinding KVM: SVM: Remove a useless zeroing of allocated memory ...
2024-04-20Merge tag 'powerpc-6.9-3' of ↵Linus Torvalds3-6/+11
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix wireguard loading failure on pre-Power10 due to Power10 crypto routines - Fix papr-vpd selftest failure due to missing variable initialization - Avoid unnecessary get/put in spapr_tce_platform_iommu_attach_dev() Thanks to Geetika Moolchandani, Jason Gunthorpe, Michal Suchánek, Nathan Lynch, and Shivaprasad G Bhat. * tag 'powerpc-6.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc/papr-vpd: Fix missing variable initialization powerpc/crypto/chacha-p10: Fix failure on non Power10 powerpc/iommu: Refactor spapr_tce_platform_iommu_attach_dev()
2024-04-20Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds4-39/+156
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A couple clk driver fixes, a build fix, and a deadlock fix: - Mediatek mt7988 has broken PCIe because the wrong parent is used - Mediatek clk drivers may deadlock when registering their clks because the clk provider device is repeatedly runtime PM resumed and suspended during probe and clk registration. Resuming the clk provider device deadlocks with an ABBA deadlock due to genpd_lock and the clk prepare_lock. The fix is to keep the device runtime resumed while registering clks. - Another runtime PM related deadlock, this time with disabling unused clks during late init. We get an ABBA deadlock where a device is runtime PM resuming (or suspending) while the disabling of unused clks is happening in parallel. That runtime PM action calls into the clk framework and tries to grab the clk prepare_lock while the disabling of unused clks holds the prepare_lock and is waiting for that runtime PM action to complete. The fix is to runtime resume all the clk provider devices before grabbing the clk prepare_lock during disable unused. - A build fix to provide an empty devm_clk_rate_exclusive_get() function when CONFIG_COMMON_CLK=n" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: mediatek: mt7988-infracfg: fix clocks for 2nd PCIe port clk: mediatek: Do a runtime PM get on controllers during probe clk: Get runtime PM before walking tree for clk_summary clk: Get runtime PM before walking tree during disable_unused clk: Initialize struct clk_core kref earlier clk: Don't hold prepare_lock when calling kref_put() clk: Remove prepare_lock hold assertion in __clk_release() clk: Provide !COMMON_CLK dummy for devm_clk_rate_exclusive_get()
2024-04-20MAINTAINERS: update to working email addressJames Bottomley1-2/+2
jejb@linux.ibm.com no longer works. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2024-04-20Merge tag 'perf-tools-fixes-for-v6.9-2024-04-19' of ↵Linus Torvalds19-742/+817
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "A random set of small bug fixes: - Fix perf annotate TUI when used with data type profiling - Work around BPF verifier about sighand lock checking And a set of kernel header synchronization" * tag 'perf-tools-fixes-for-v6.9-2024-04-19' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools/include: Sync arm64 asm/cputype.h with the kernel sources tools/include: Sync asm-generic/bitops/fls.h with the kernel sources tools/include: Sync x86 asm/msr-index.h with the kernel sources tools/include: Sync x86 asm/irq_vectors.h with the kernel sources tools/include: Sync x86 CPU feature headers with the kernel sources tools/include: Sync uapi/sound/asound.h with the kernel sources tools/include: Sync uapi/linux/kvm.h and asm/kvm.h with the kernel sources tools/include: Sync uapi/linux/fs.h with the kernel sources tools/include: Sync uapi/drm/i915_drm.h with the kernel sources perf lock contention: Add a missing NULL check perf annotate: Make sure to call symbol__annotate2() in TUI
2024-04-20Merge tag 'hardening-v6.9-rc5' of ↵Linus Torvalds2-7/+22
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Correctly disable UBSAN configs in configs/hardening (Nathan Chancellor) - Add missing signed integer overflow trap types to arm64 handler * tag 'hardening-v6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: ubsan: Add awareness of signed integer overflow traps configs/hardening: Disable CONFIG_UBSAN_SIGNED_WRAP configs/hardening: Fix disabling UBSAN configurations
2024-04-20Merge tag 'for-linus-iommufd' of ↵Linus Torvalds2-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: "Two fixes for the selftests: - CONFIG_IOMMUFD_TEST needs CONFIG_IOMMUFD_DRIVER to work - The kconfig fragment sshould include fault injection so the fault injection test can work" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Add config needed for iommufd_fail_nth iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest
2024-04-19Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds3-5/+11
Pull rdma fixes from Jason Gunthorpe: - Add a missing mutex_destroy() in rxe - Enhance the debugging print for cm_destroy failures to help debug these - Fix mlx5 MAD processing in cases where multiport devices are running in switchedev mode * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Fix port number for counter query in multi-port configuration RDMA/cm: Print the old state when cm_destroy_id gets timeout RDMA/rxe: Fix the problem "mutex_destroy missing"
2024-04-19Merge tag '9p-fixes-for-6.9-rc5' of ↵Linus Torvalds4-6/+23
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs Pull fs/9p fixes from Eric Van Hensbergen: "This contains a reversion of one of the original 6.9 patches which seems to have been the cause of most of the instability. It also incorporates several fixes to legacy support and cache fixes. There are few additional changes to improve stability, but I want another week of testing before sending them upstream" * tag '9p-fixes-for-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: drop inodes immediately on non-.L too fs/9p: Revert "fs/9p: fix dups even in uncached mode" fs/9p: remove erroneous nlink init from legacy stat2inode 9p: explicitly deny setlease attempts fs/9p: fix the cache always being enabled on files with qid flags fs/9p: translate O_TRUNC into OTRUNC fs/9p: only translate RWX permissions for plain 9P2000
2024-04-19Merge tag 'fuse-fixes-6.9-rc5' of ↵Linus Torvalds6-27/+58
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse fixes from Miklos Szeredi: - Fix two bugs in the new passthrough mode - Fix a statx bug introduced in v6.6 - Fix code documentation * tag 'fuse-fixes-6.9-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: cuse: add kernel-doc comments to cuse_process_init_reply() fuse: fix leaked ENOSYS error on first statx call fuse: fix parallel dio write on file open in passthrough mode fuse: fix wrong ff->iomode state changes from parallel dio write
2024-04-19Merge tag 'arm64-fixes' of ↵Linus Torvalds3-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Fix a kernel fault during page table walking in huge_pte_alloc() with PTABLE_LEVELS=5 due to using p4d_offset() instead of p4d_alloc() - head.S fix and cleanup to disable the MMU before toggling the HCR_EL2.E2H bit when entering the kernel with the MMU on from the EFI stub. Changing this bit (currently from VHE to nVHE) causes some system registers as well as page table descriptors to be interpreted differently, potentially resulting in spurious MMU faults - Fix translation fault in swsusp_save() accessing MEMBLOCK_NOMAP memory ranges due to kernel_page_present() returning true in most configurations other than rodata_full == true, CONFIG_DEBUG_PAGEALLOC=y or CONFIG_KFENCE=y * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: hibernate: Fix level3 translation fault in swsusp_save() arm64/head: Disable MMU at EL2 before clearing HCR_EL2.E2H arm64/head: Drop unnecessary pre-disable-MMU workaround arm64/hugetlb: Fix page table walk in huge_pte_alloc()
2024-04-19Merge tag 's390-6.9-4' of ↵Linus Torvalds5-11/+46
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Fix NULL pointer dereference in program check handler - Fake IRBs are important events relevant for problem analysis. Add traces when queueing and delivering - Fix a race condition in ccw_device_set_online() that can cause the online process to fail - Deferred condition code 1 response indicates that I/O was not started and should be retried. The current QDIO implementation handles a cc1 response as an error, resulting in a failed QDIO setup. Fix that by retrying the setup when a cc1 response is received * tag 's390-6.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Fix NULL pointer dereference s390/cio: log fake IRB events s390/cio: fix race condition during online processing s390/qdio: handle deferred cc1
2024-04-19Merge tag 'bootconfig-fixes-v6.9-rc4' of ↵Linus Torvalds3-10/+21
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull bootconfig fixes from Masami Hiramatsu: - Fix potential static_command_line buffer overrun. Currently we allocate the memory for static_command_line based on "boot_command_line", but it will copy "command_line" into it. So we use the length of "command_line" instead of "boot_command_line" (as we previously did) - Use memblock_free_late() in xbc_exit() instead of memblock_free() after the buddy system is initialized - Fix a kerneldoc warning * tag 'bootconfig-fixes-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: bootconfig: Fix the kerneldoc of _xbc_exit() bootconfig: use memblock_free_late to free xbc memory to buddy init/main.c: Fix potential static_command_line memory overflow