kernel/linux.git/tools/perf/bench, branch v6.12.80

perf bench: Fix perf bench syscall loop count

2025-04-10T12:39:23+00:00

[ Upstream commit 957d194163bf983da98bf7ec7e4f86caff8cd0eb ]

The command 'perf bench syscall fork -l 100000' offers the option -l to run for a specified number of iterations. However, this option is not always honored: the number is silently limited to 10000 iterations, as can be seen below.

Output before:

  # perf bench syscall fork -l 100000
  # Running 'syscall/fork' benchmark:
  # Executed 10,000 fork() calls
  Total time: 23.388 [sec]
  2338.809800 usecs/op
  427 ops/sec
  #

When the loop count is explicitly specified with option -l or --loops, also honor a higher number of iterations.

Output after:

  # perf bench syscall fork -l 100000
  # Running 'syscall/fork' benchmark:
  # Executed 100,000 fork() calls
  Total time: 716.982 [sec]
  7169.829510 usecs/op
  139 ops/sec
  #

This patch fixes the issue for basic, execve, fork and getpgid.

Fixes: ece7f7c0507c ("perf bench syscall: Add fork syscall benchmark")
Signed-off-by: Thomas Richter
Acked-by: Sumanth Korikkar
Tested-by: Athira Rajeev
Cc: Tiezhu Yang
Link: https://lore.kernel.org/r/20250304092349.2618082-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim
Signed-off-by: Sasha Levin
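A minimal sketch of the behaviour described above, assuming the 10000 cap is meant to apply only when the loop count was left at its default. The names effective_loops(), LOOPS_DEFAULT and LOOPS_CAPPED, and the default value itself, are illustrative assumptions and are not taken from the perf source:

```c
#include <stdbool.h>

/* Hypothetical default; the real default in perf may differ. */
#define LOOPS_DEFAULT 10000000
/* Cap mentioned in the commit message for slow syscalls. */
#define LOOPS_CAPPED  10000

/*
 * Return the number of iterations to run. The cap is applied only when
 * the caller did not override the default, so an explicit -l/--loops
 * value is always honored.
 */
static int effective_loops(int requested, bool expensive_syscall)
{
	if (expensive_syscall && requested == LOOPS_DEFAULT)
		return LOOPS_CAPPED;
	return requested;
}
```

With this shape, an expensive benchmark such as fork still defaults to 10000 iterations, while a request for 100000 iterations is passed through unchanged.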

perf bench: Fix undefined behavior in cmpworker()

2025-02-17T09:05:12+00:00

commit 62892e77b8a64b9dc0e1da75980aa145347b6820 upstream.

The comparison function cmpworker() violates the C standard's requirements for qsort() comparison functions, which mandate symmetry and transitivity:

Symmetry: If x < y, then y > x.
Transitivity: If x < y and y < z, then x < z.

In its current implementation, cmpworker() incorrectly returns 0 when w1->tid < w2->tid, which breaks both symmetry and transitivity. This violation causes undefined behavior, potentially leading to issues such as memory corruption in glibc [1].

Fix the issue by returning -1 when w1->tid < w2->tid, ensuring compliance with the C standard and preventing undefined behavior.

Link: https://www.qualys.com/2024/01/30/qsort.txt [1]
Fixes: 121dd9ea0116 ("perf bench: Add epoll parallel epoll_wait benchmark")
Cc: stable@vger.kernel.org
Signed-off-by: Kuan-Wei Chiu
Reviewed-by: James Clark
Link: https://lore.kernel.org/r/20250116110842.4087530-1-visitorckw@gmail.com
Signed-off-by: Namhyung Kim
Signed-off-by: Greg Kroah-Hartman
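As an illustration, a three-way comparator of the kind the message describes might look like the sketch below. The struct layout and the sort_workers() helper are hypothetical stand-ins; only the name cmpworker() and the tid field come from the commit message:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the benchmark's per-thread state. */
struct worker {
	int tid;
};

/*
 * Three-way comparison as qsort() requires: negative if w1 sorts first,
 * positive if w2 sorts first, zero if they are equal.
 */
static int cmpworker(const void *p1, const void *p2)
{
	const struct worker *w1 = p1;
	const struct worker *w2 = p2;

	if (w1->tid > w2->tid)
		return 1;
	if (w1->tid < w2->tid)
		return -1;
	return 0;
}

/* Sort an array of workers by ascending tid. */
static void sort_workers(struct worker *w, size_t n)
{
	qsort(w, n, sizeof(*w), cmpworker);
}
```

A comparator that only ever returns 0 or 1 tells qsort() that x equals y while also saying y is greater than x, which is the inconsistency the fix removes.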

perf tool: Constify tool pointers

2024-08-12T21:05:14+00:00

The tool pointer (to a struct largely of function pointers) is passed around but is unchanged except at initialization. Change parameter and variable types to be const to lower the possibilities of what could happen with a tool.

Reviewed-by: Adrian Hunter
Signed-off-by: Ian Rogers
Tested-by: Adrian Hunter
Tested-by: Leo Yan
Cc: Alexander Shishkin
Cc: Anshuman Khandual
Cc: Athira Rajeev
Cc: Huacai Chen
Cc: Ilkka Koskinen
Cc: Ingo Molnar
Cc: James Clark
Cc: Jiri Olsa
Cc: John Garry
Cc: Jonathan Cameron
Cc: Kan Liang
Cc: Leo Yan
Cc: Mark Rutland
Cc: Mike Leach
Cc: Namhyung Kim
Cc: Nick Desaulniers
Cc: Nick Terrell
Cc: Oliver Upton
Cc: Peter Zijlstra
Cc: Song Liu
Cc: Sun Haiyong
Cc: Suzuki Poulouse
Cc: Will Deacon
Cc: Yanteng Si
Cc: Yicong Yang
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo

perf bench: Make bench its own library

2024-06-26T18:07:28+00:00

Make the benchmark code into a library so it may be linked against things like the python module to avoid compiling code twice.

Signed-off-by: Ian Rogers
Reviewed-by: James Clark
Cc: Suzuki K Poulose
Cc: Kees Cook
Cc: Palmer Dabbelt
Cc: Albert Ou
Cc: Nick Terrell
Cc: Gary Guo
Cc: Alex Gaynor
Cc: Boqun Feng
Cc: Wedson Almeida Filho
Cc: Ze Gao
Cc: Alice Ryhl
Cc: Andrei Vagin
Cc: Yicong Yang
Cc: Jonathan Cameron
Cc: Guo Ren
Cc: Miguel Ojeda
Cc: Will Deacon
Cc: Mike Leach
Cc: Leo Yan
Cc: Oliver Upton
Cc: John Garry
Cc: Benno Lossin
Cc: Björn Roy Baron
Cc: Andreas Hindborg
Cc: Paul Walmsley
Signed-off-by: Namhyung Kim
Link: https://lore.kernel.org/r/20240625214117.953777-6-irogers@google.com

tools/perf: Fix timing issue with parallel threads in perf bench wake-up-parallel

2024-06-14T04:27:49+00:00

perf bench futex fails as below and hangs intermittently when run on a powerpc system:

  ./perf bench futex wake-parallel
  Running 'futex/wake-parallel' benchmark:
  Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time.
  [Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%)
  [Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%)
  [Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%)
  [Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%)
  [Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%)
  [Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%)
  perf: couldn't wakeup all tasks (0/1)
  perf: couldn't wakeup all tasks (0/1)
  perf: couldn't wakeup all tasks (0/1)
  perf: couldn't wakeup all tasks (0/1)
  perf: couldn't wakeup all tasks (0/1)
  perf: couldn't wakeup all tasks (0/1)

The system on which the benchmark was run is configured with 640 CPUs. After debugging, this turned out to be a timing issue. The benchmark creates as many threads as there are CPUs and each issues a futex_wait. It then does a usleep for 0.1 second before initiating futex_wake. On a system configuration with more threads, this usleep time is not enough.

The patch changes the usleep argument from 100000 to 200000 microseconds. With the patch, multiple iterations were run and no further issues were seen.

Reported-by: Disha Goel
Signed-off-by: Athira Rajeev
Reviewed-by: Ian Rogers
Tested-by: Disha Goel
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim
Link: https://lore.kernel.org/r/20240607044354.82225-3-atrajeev@linux.vnet.ibm.com

tools/perf: Fix perf bench epoll to enable the run when some CPU's are offline

2024-06-14T04:27:26+00:00

Perf bench epoll fails as below when run on a powerpc system:

  ./perf bench epoll wait
  Running 'epoll/wait' benchmark:
  Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs.
  perf: pthread_create: No such file or directory

What was different about the setup where this perf bench was run is that the partition had 640 CPUs, but not all of them were online: only 80 CPUs were online. While creating threads and using epoll_wait, the code sets the affinity using a cpumask. The cpumask size used is 80, which is picked from "nrcpus = perf_cpu_map__nr(cpu)". The benchmark then fails while setting affinity for CPU numbers of 80 or higher, because it attempts to set a bit position which is not allocated in the cpumask.

Fix this by changing the size of the cpumask to the number of possible CPUs rather than the number of online CPUs.

Signed-off-by: Athira Rajeev
Reviewed-by: Ian Rogers
Tested-by: Disha Goel
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim
Link: https://lore.kernel.org/r/20240607044354.82225-2-atrajeev@linux.vnet.ibm.com
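The effect of an undersized mask can be sketched with the glibc CPU_*_S macros. This is an illustration under assumptions, not the perf code: cpu_fits_in_mask() is a hypothetical helper, and the sketch relies on CPU_SET_S() ignoring a CPU number that lies beyond the allocated size. Note that CPU_ALLOC() rounds the allocation up to whole machine words, so the exact cut-off can sit slightly above the requested count:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: allocate a CPU mask sized for 'mask_cpus' CPUs,
 * try to set 'cpu' in it, and report whether the bit was actually set.
 */
static bool cpu_fits_in_mask(int cpu, int mask_cpus)
{
	size_t size = CPU_ALLOC_SIZE(mask_cpus);
	cpu_set_t *set = CPU_ALLOC(mask_cpus);
	bool is_set;

	if (!set)
		return false;

	CPU_ZERO_S(size, set);
	/* Has no effect when 'cpu' lies beyond the allocated mask. */
	CPU_SET_S(cpu, size, set);
	is_set = CPU_ISSET_S(cpu, size, set) != 0;

	CPU_FREE(set);
	return is_set;
}
```

For example, with a mask sized for 80 CPUs, a request for CPU 600 leaves the mask empty, whereas a mask sized for 640 possible CPUs holds it.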

tools/perf: Fix perf bench futex to enable the run when some CPU's are offline

2024-06-14T04:26:58+00:00

Perf bench futex fails as below when run on a powerpc system:

  ./perf bench futex all
  Running futex/hash benchmark...
  Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.
  perf: pthread_create: No such file or directory

What was different about the setup where this perf bench was run is that the partition had 640 CPUs, but not all of them were online: only 80 CPUs were online. While blocking the threads with futex_wait, the code sets the affinity using a cpumask. The cpumask size used is 80, which is picked from "nrcpus = perf_cpu_map__nr(cpu)". The benchmark then fails while setting affinity for CPU numbers of 80 or higher, because it attempts to set a bit position which is not allocated in the cpumask.

Fix this by changing the size of the cpumask to the number of possible CPUs rather than the number of online CPUs.

Signed-off-by: Athira Rajeev
Reviewed-by: Ian Rogers
Tested-by: Disha Goel
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim
Link: https://lore.kernel.org/r/20240607044354.82225-1-atrajeev@linux.vnet.ibm.com

perf bench internals inject-build-id: Fix trap divide when collecting just one DSO

2024-05-07T15:44:02+00:00

'perf bench internals inject-build-id' suffers from the following error when only one DSO is collected.

  # perf bench internals inject-build-id -v
  Collected 1 DSOs
  traps: internals-injec[2305] trap divide error ip:557566ba6394 sp:7ffd4de97fe0 error:0 in perf[557566b2a000+23d000]
  Build-id injection benchmark
  Iteration #1
  Floating point exception

This patch removes the unnecessary minus one from the divisor, which also corrects the randomization range.

Signed-off-by: He Zhe
Fixes: 0bf02a0d80427f26 ("perf bench: Add build-id injection benchmark")
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Ian Rogers
Cc: Jiri Olsa
Cc: Mark Rutland
Cc: Namhyung Kim
Link: https://lore.kernel.org/r/20240507065026.2652929-1-zhe.he@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo
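The arithmetic behind the trap can be sketched as follows. The helper name pick_dso_index() and its parameters are hypothetical; the sketch only assumes that a random value is reduced to an index with the remainder operator, as the mention of a divisor suggests:

```c
/*
 * Hypothetical helper: map a random value onto an index in [0, nr_dsos).
 *
 * Using (nr_dsos - 1) as the divisor instead would divide by zero when
 * nr_dsos == 1, and would never select the last index, nr_dsos - 1.
 * The caller must ensure nr_dsos is at least 1.
 */
static unsigned int pick_dso_index(unsigned int random_value,
				   unsigned int nr_dsos)
{
	return random_value % nr_dsos;
}
```

With a single DSO the result is always index 0, and with four DSOs every index from 0 to 3 is reachable.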

perf bench uprobe: Add uretprobe variant of uprobe benchmarks

2024-04-12T20:54:02+00:00

Name benchmarks with _ret at the end to avoid creating a new set of benchmarks.

Signed-off-by: Ian Rogers
Acked-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Andrei Vagin
Cc: Ingo Molnar
Cc: Kan Liang
Cc: Kees Kook
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: https://lore.kernel.org/r/20240406040911.1603801-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo

perf bench uprobe: Remove lib64 from libc.so.6 binary path

2024-04-12T20:54:02+00:00

bpf_program__attach_uprobe_opts will search LD_LIBRARY_PATH and so specifying `/lib64` is unnecessary and causes failures for libc.so.6 paths like `/lib/x86_64-linux-gnu/libc.so.6`.

Fixes: 7b47623b8cae8149 ("perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk")
Signed-off-by: Ian Rogers
Acked-by: Jiri Olsa
Cc: Adrian Hunter
Cc: Alexander Shishkin
Cc: Andrei Vagin
Cc: Ingo Molnar
Cc: Kan Liang
Cc: Kees Kook
Cc: Mark Rutland
Cc: Namhyung Kim
Cc: Peter Zijlstra
Link: https://lore.kernel.org/r/20240406040911.1603801-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo