| Age | Commit message | Author | Files | Lines |
|
I think the return value of SKIP (2) should be used when the entire
test suite was skipped rather than only a few tests. FAIL should be
reserved for the case where any of the tests failed.
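A minimal sketch of that policy in C, not the actual test script: only
SKIP == 2 comes from the text above, while the other enum values and the
helper name suite_result() are assumptions made for illustration.

```c
#include <stddef.h>

/* Only TEST_SKIP == 2 is taken from the text; OK and FAIL values are assumed. */
enum test_result { TEST_OK = 0, TEST_FAIL = 1, TEST_SKIP = 2 };

/* Hypothetical helper: fold per-test results into one result for the suite. */
static enum test_result suite_result(const enum test_result *results, size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++) {
		/* Any single failure makes the whole suite fail. */
		if (results[i] == TEST_FAIL)
			return TEST_FAIL;
		if (results[i] == TEST_SKIP)
			skipped++;
	}

	/*
	 * Report SKIP only when every test was skipped. Treating an empty
	 * suite as OK is an assumption of this sketch.
	 */
	return (n > 0 && skipped == n) ? TEST_SKIP : TEST_OK;
}
```

With this folding, a run where a few metric groups are skipped but the
rest pass still reports success.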
$ perf test -vv 109
109: perf all metricgroups test:
--- start ---
test child forked, pid 2493003
Testing Backend
Testing Bad
Testing BadSpec
Testing BigFootprint
Testing BrMispredicts
Testing Branches
Testing BvBC
Testing BvBO
Testing BvCB
Testing BvFB
Testing BvIO
Testing BvMB
Testing BvML
Testing BvMP
Testing BvMS
Testing BvMT
Testing BvOB
Testing BvUW
Testing CacheHits
Testing CacheMisses
Testing CodeGen
Testing Compute
Testing Cor
Testing DSB
Testing DSBmiss
Testing DataSharing
Testing Default
Testing Default2
Testing Default3
Testing Default4
Ignoring failures in Default4 that may contain unsupported legacy events
Testing Fed
Testing FetchBW
Testing FetchLat
Testing Flops
Testing FpScalar
Testing FpVector
Testing Frontend
Testing HPC
Testing IcMiss
Testing InsType
Testing LSD
Testing LockCont
Testing MachineClears
Testing Machine_Clears
Testing Mem
Testing MemOffcore
Testing MemoryBW
Testing MemoryBound
Testing MemoryLat
Testing MemoryTLB
Testing Memory_BW
Testing Memory_Lat
Testing MicroSeq
Testing OS
Testing Offcore
Testing PGO
Testing Pipeline
Testing PortsUtil
Testing Power
Testing Prefetches
Testing Ret
Testing Retire
Testing SMT
Testing Snoop
Testing SoC
Testing Summary
Testing TmaL1
Testing TmaL2
Testing TmaL3mem
Testing TopdownL1
Testing TopdownL2
Testing TopdownL3
Testing TopdownL4
Testing TopdownL5
Testing TopdownL6
Testing smi
Testing tma_L1_group
Testing tma_L2_group
Testing tma_L3_group
Testing tma_L4_group
Testing tma_L5_group
Testing tma_L6_group
Testing tma_alu_op_utilization_group
Testing tma_assists_group
Testing tma_backend_bound_group
Testing tma_bad_speculation_group
Testing tma_branch_mispredicts_group
Testing tma_branch_resteers_group
Testing tma_code_stlb_miss_group
Testing tma_core_bound_group
Testing tma_divider_group
Testing tma_dram_bound_group
Testing tma_dtlb_load_group
Testing tma_dtlb_store_group
Testing tma_fetch_bandwidth_group
Testing tma_fetch_latency_group
Testing tma_fp_arith_group
Testing tma_fp_vector_group
Testing tma_frontend_bound_group
Testing tma_heavy_operations_group
Testing tma_icache_misses_group
Testing tma_issue2P
Testing tma_issueBM
Testing tma_issueBW
Testing tma_issueComp
Testing tma_issueD0
Testing tma_issueFB
Testing tma_issueFL
Testing tma_issueL1
Testing tma_issueLat
Testing tma_issueMC
Testing tma_issueMS
Testing tma_issueMV
Testing tma_issueRFO
Testing tma_issueSL
Testing tma_issueSO
Testing tma_issueSmSt
Testing tma_issueSpSt
Testing tma_issueSyncxn
Testing tma_issueTLB
Testing tma_itlb_misses_group
Testing tma_l1_bound_group
Testing tma_l2_bound_group
Testing tma_l3_bound_group
Testing tma_light_operations_group
Testing tma_load_stlb_miss_group
Testing tma_machine_clears_group
Testing tma_memory_bound_group
Testing tma_microcode_sequencer_group
Testing tma_mite_group
Testing tma_other_light_ops_group
Testing tma_ports_utilization_group
Testing tma_ports_utilized_0_group
Testing tma_ports_utilized_3m_group
Testing tma_retiring_group
Testing tma_serializing_operation_group
Testing tma_store_bound_group
Testing tma_store_stlb_miss_group
Testing transaction
---- end(0) ----
109: perf all metricgroups test : Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I think the return value of SKIP (2) should be used when the entire
test suite was skipped rather than only a few tests. FAIL should be
reserved for the case where any of the tests failed.
$ perf test -vv 110
110: perf all metrics test:
--- start ---
test child forked, pid 2496399
Testing tma_core_bound
Testing tma_info_core_ilp
Testing tma_info_memory_l2mpki
Testing tma_memory_bound
Testing tma_bottleneck_irregular_overhead
Testing tma_bottleneck_mispredictions
Testing tma_info_bad_spec_branch_misprediction_cost
Testing tma_info_bad_spec_ipmisp_cond_ntaken
Testing tma_info_bad_spec_ipmisp_cond_taken
Testing tma_info_bad_spec_ipmisp_indirect
Testing tma_info_bad_spec_ipmisp_ret
Testing tma_info_bad_spec_ipmispredict
Testing tma_info_branches_callret
Testing tma_info_branches_cond_nt
Testing tma_info_branches_cond_tk
Testing tma_info_branches_jump
Testing tma_info_branches_other_branches
Testing tma_branch_mispredicts
Testing tma_clears_resteers
Testing tma_machine_clears
Testing tma_mispredicts_resteers
Testing tma_bottleneck_big_code
Testing tma_icache_misses
Testing tma_itlb_misses
Testing tma_unknown_branches
Testing tma_info_bad_spec_spec_clears_ratio
Testing tma_other_mispredicts
Testing tma_branch_instructions
Testing tma_info_frontend_tbpc
Testing tma_info_inst_mix_bptkbranch
Testing tma_info_inst_mix_ipbranch
Testing tma_info_inst_mix_ipcall
Testing tma_info_inst_mix_iptb
Testing tma_info_system_ipfarbranch
Testing tma_info_thread_uptb
Testing tma_bottleneck_branching_overhead
Testing tma_nop_instructions
Testing tma_bottleneck_compute_bound_est
Testing tma_divider
Testing tma_ports_utilized_3m
Testing tma_bottleneck_instruction_fetch_bw
Testing tma_frontend_bound
Testing tma_assists
Testing tma_other_nukes
Testing tma_serializing_operation
Testing tma_bottleneck_data_cache_memory_bandwidth
Testing tma_fb_full
Testing tma_mem_bandwidth
Testing tma_sq_full
Testing tma_bottleneck_data_cache_memory_latency
Testing tma_l1_latency_dependency
Testing tma_l2_bound
Testing tma_l3_hit_latency
Testing tma_mem_latency
Testing tma_store_latency
Testing tma_bottleneck_memory_synchronization
Testing tma_contested_accesses
Testing tma_data_sharing
Testing tma_false_sharing
Testing tma_bottleneck_memory_data_tlbs
Testing tma_dtlb_load
Testing tma_dtlb_store
Testing tma_backend_bound
Testing tma_bottleneck_other_bottlenecks
Testing tma_bottleneck_useful_work
Testing tma_retiring
Testing tma_info_memory_fb_hpki
Testing tma_info_memory_l1mpki
Testing tma_info_memory_l1mpki_load
Testing tma_info_memory_l2hpki_all
Testing tma_info_memory_l2hpki_load
Testing tma_info_memory_l2mpki_all
Testing tma_info_memory_l2mpki_load
Testing tma_l1_bound
Testing tma_l3_bound
Testing tma_info_memory_l2mpki_rfo
Testing tma_fp_scalar
Testing tma_fp_vector
Testing tma_fp_vector_128b
Testing tma_fp_vector_256b
Testing tma_fp_vector_512b
Testing tma_port_0
Testing tma_x87_use
Testing tma_info_botlnk_l0_core_bound_likely
Testing tma_info_core_fp_arith_utilization
Testing tma_info_pipeline_execute
Testing tma_info_system_gflops
Testing tma_info_thread_execute_per_issue
Testing tma_dsb
Testing tma_info_botlnk_l2_dsb_bandwidth
Testing tma_info_frontend_dsb_coverage
Testing tma_decoder0_alone
Testing tma_dsb_switches
Testing tma_info_botlnk_l2_dsb_misses
Testing tma_info_frontend_dsb_switch_cost
Testing tma_info_frontend_ipdsb_miss_ret
Testing tma_mite
Testing tma_mite_4wide
Testing CPUs_utilized
Testing backend_cycles_idle
[Ignored backend_cycles_idle] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop': <not counted> cpu-cycles:u <not supported> stalled-cycles-backend:u 1.014051473 seconds time elapsed 1.005718000 seconds user 0.008013000 seconds sys
Testing branch_frequency
Testing branch_miss_rate
Testing cs_per_second
Testing cycles_frequency
Testing frontend_cycles_idle
[Ignored frontend_cycles_idle] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop': <not counted> cpu-cycles:u <not supported> stalled-cycles-frontend:u 1.012813656 seconds time elapsed 1.004603000 seconds user 0.008004000 seconds sys
Testing insn_per_cycle
Testing migrations_per_second
Testing page_faults_per_second
Testing stalled_cycles_per_instruction
[Ignored stalled_cycles_per_instruction] failed but as a Default metric this can be expected
Error: No supported events found. The stalled-cycles-backend:u event is not supported.
Testing tma_bad_speculation
Testing l1d_miss_rate
Testing llc_miss_rate
Testing dtlb_miss_rate
Testing itlb_miss_rate
[Ignored itlb_miss_rate] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop': <not supported> iTLB-loads:u 3,097 iTLB-load-misses:u 1.012766732 seconds time elapsed 1.004318000 seconds user 0.008002000 seconds sys
Testing l1i_miss_rate
[Ignored l1i_miss_rate] failed but as a Default metric this can be expected
Performance counter stats for 'perf test -w noploop': <not counted> L1-icache-load-misses:u <not supported> L1-icache-loads:u 1.013606395 seconds time elapsed 1.001371000 seconds user 0.011968000 seconds sys
Testing l1_prefetch_miss_rate
[Ignored l1_prefetch_miss_rate] failed but as a Default metric this can be expected
Error: No supported events found. The L1-dcache-prefetches:u event is not supported.
Testing tma_info_botlnk_l2_ic_misses
Testing tma_info_frontend_fetch_upc
Testing tma_info_frontend_icache_miss_latency
Testing tma_info_frontend_ipunknown_branch
Testing tma_info_frontend_lsd_coverage
Testing tma_info_memory_tlb_code_stlb_mpki
Testing tma_info_pipeline_fetch_dsb
Testing tma_info_pipeline_fetch_lsd
Testing tma_info_pipeline_fetch_mite
Testing tma_info_pipeline_fetch_ms
Testing tma_fetch_bandwidth
Testing tma_lsd
Testing tma_branch_resteers
Testing tma_code_l2_hit
Testing tma_code_l2_miss
Testing tma_code_stlb_hit
Testing tma_code_stlb_miss
Testing tma_code_stlb_miss_2m
Testing tma_code_stlb_miss_4k
Testing tma_lcp
Testing tma_ms_switches
Testing tma_info_core_flopc
Testing tma_info_inst_mix_iparith
Testing tma_info_inst_mix_iparith_avx128
Testing tma_info_inst_mix_iparith_avx256
Testing tma_info_inst_mix_iparith_avx512
Testing tma_info_inst_mix_iparith_scalar_dp
Testing tma_info_inst_mix_iparith_scalar_sp
Testing tma_info_inst_mix_ipflop
Testing tma_info_inst_mix_ippause
Testing tma_fetch_latency
Testing tma_fp_arith
Testing tma_fp_assists
Testing tma_info_system_cpu_utilization
Testing tma_info_system_dram_bw_use
[Skipped tma_info_system_dram_bw_use] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_TRK_REQUESTS.ALL:u <not supported> UNC_ARB_COH_TRK_REQUESTS.ALL:u 1,013,554,749 duration_time 1.013527265 seconds time elapsed 1.005417000 seconds user 0.008011000 seconds sys
Testing tma_info_frontend_l2mpki_code
Testing tma_info_frontend_l2mpki_code_all
Testing tma_info_inst_mix_ipload
Testing tma_info_inst_mix_ipstore
Testing tma_info_memory_latency_load_l2_miss_latency
Testing tma_lock_latency
Testing tma_info_memory_core_l1d_cache_fill_bw_2t
Testing tma_info_memory_core_l2_cache_fill_bw_2t
Testing tma_info_memory_core_l3_cache_access_bw_2t
Testing tma_info_memory_core_l3_cache_fill_bw_2t
Testing tma_info_memory_l1d_cache_fill_bw
Testing tma_info_memory_l2_cache_fill_bw
Testing tma_info_memory_l3_cache_access_bw
Testing tma_info_memory_l3_cache_fill_bw
Testing tma_info_memory_l3mpki
Testing tma_info_memory_load_miss_real_latency
Testing tma_info_memory_mix_bus_lock_pki
Testing tma_info_memory_mix_uc_load_pki
Testing tma_info_memory_mlp
Testing tma_info_memory_tlb_load_stlb_mpki
Testing tma_info_memory_tlb_page_walks_utilization
Testing tma_info_memory_tlb_store_stlb_mpki
Testing tma_info_system_mem_parallel_reads
[Skipped tma_info_system_mem_parallel_reads] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_DAT_OCCUPANCY.RD/cmask=1/ 1.013354884 seconds time elapsed 1.009239000 seconds user 0.004004000 seconds sys
Testing tma_info_system_mem_read_latency
[Skipped tma_info_system_mem_read_latency] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_ARB_DAT_OCCUPANCY.RD:u <not counted> UNC_ARB_TRK_OCCUPANCY.RD <not counted> UNC_ARB_TRK_REQUESTS.RD 1.012882143 seconds time elapsed 1.004600000 seconds user 0.008036000 seconds sys
Testing tma_info_thread_cpi
Testing tma_streaming_stores
Testing tma_dram_bound
Testing tma_store_bound
Testing tma_l2_hit_latency
Testing tma_load_stlb_hit
Testing tma_load_stlb_miss
Testing tma_load_stlb_miss_1g
Testing tma_load_stlb_miss_2m
Testing tma_load_stlb_miss_4k
Testing tma_store_stlb_hit
Testing tma_store_stlb_miss
Testing tma_store_stlb_miss_1g
Testing tma_store_stlb_miss_2m
Testing tma_store_stlb_miss_4k
Testing tma_info_memory_latency_data_l2_mlp
Testing tma_info_memory_latency_load_l2_mlp
Testing tma_info_pipeline_ipassist
Testing tma_microcode_sequencer
Testing tma_ms
Testing tma_info_system_kernel_cpi
[Failed tma_info_system_kernel_cpi] Metric contains missing events
Error: No supported events found. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
Testing tma_info_system_kernel_utilization
[Failed tma_info_system_kernel_utilization] Metric contains missing events
Error: No supported events found. Access to performance monitoring and observability operations is limited. Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open access to performance monitoring and observability operations for processes without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability. More information can be found at 'Perf events and tool security' document: https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html perf_event_paranoid setting is 2: -1: Allow use of (almost) all events by all users Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK >= 0: Disallow raw and ftrace function tracepoint access >= 1: Disallow CPU event access >= 2: Disallow kernel profiling To make the adjusted perf_event_paranoid setting permanent preserve it in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
Testing tma_info_pipeline_retire
Testing tma_info_thread_clks
Testing tma_info_thread_uoppi
Testing tma_memory_operations
Testing tma_other_light_ops
Testing tma_ports_utilization
Testing tma_ports_utilized_0
Testing tma_ports_utilized_1
Testing tma_ports_utilized_2
Testing C10_Pkg_Residency
[Failed C10_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c10-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c10-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C2_Pkg_Residency
[Failed C2_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c2-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c2-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C3_Pkg_Residency
[Failed C3_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { msr/tsc/, cstate_pkg/c3-residency/ } Error: No supported events found. Invalid event (msr/tsc/u) in per-thread mode, enable system wide with '-a'.
Testing C6_Core_Residency
[Failed C6_Core_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c6-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C6_Pkg_Residency
[Failed C6_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c6-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c6-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C7_Core_Residency
[Failed C7_Core_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_core/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_core/c7-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C7_Pkg_Residency
[Failed C7_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c7-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c7-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C8_Pkg_Residency
[Failed C8_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c8-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c8-residency/u) in per-thread mode, enable system wide with '-a'.
Testing C9_Pkg_Residency
[Failed C9_Pkg_Residency] Metric contains missing events
WARNING: grouped events cpus do not match. Events with CPUs not matching the leader will be removed from the group. anon group { cstate_pkg/c9-residency/, msr/tsc/ } Error: No supported events found. Invalid event (cstate_pkg/c9-residency/u) in per-thread mode, enable system wide with '-a'.
Testing tma_info_core_epc
Testing tma_info_system_core_frequency
Testing tma_info_system_power
[Skipped tma_info_system_power] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> Joules power/energy-pkg/u 1,013,238,256 duration_time 1.013223072 seconds time elapsed 0.995924000 seconds user 0.011903000 seconds sys
Testing tma_info_system_power_license0_utilization
Testing tma_info_system_power_license1_utilization
Testing tma_info_system_power_license2_utilization
Testing tma_info_system_turbo_utilization
Testing tma_info_inst_mix_ipswpf
Testing tma_info_memory_prefetches_useless_hwpf
Testing tma_info_core_coreipc
Testing tma_info_thread_ipc
Testing tma_heavy_operations
Testing tma_light_operations
Testing tma_info_core_core_clks
Testing tma_info_system_smt_2t_utilization
Testing tma_info_thread_slots_utilization
Testing UNCORE_FREQ
[Skipped UNCORE_FREQ] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> UNC_CLOCK.SOCKET:u 1,015,993,466 duration_time 1.015949387 seconds time elapsed 1.007676000 seconds user 0.008029000 seconds sys
Testing tma_info_system_socket_clks
[Failed tma_info_system_socket_clks] Metric contains missing events
Error: No supported events found. Invalid event (UNC_CLOCK.SOCKET:u) in per-thread mode, enable system wide with '-a'.
Testing tma_info_inst_mix_instructions
Testing tma_info_system_cpus_utilized
Testing tma_info_system_mux
Testing tma_info_system_time
Testing tma_info_thread_slots
Testing tma_few_uops_instructions
Testing tma_4k_aliasing
Testing tma_cisc
Testing tma_fp_divider
Testing tma_int_divider
Testing tma_slow_pause
Testing tma_split_loads
Testing tma_split_stores
Testing tma_store_fwd_blk
Testing tma_alu_op_utilization
Testing tma_load_op_utilization
Testing tma_mixing_vectors
Testing tma_store_op_utilization
Testing tma_port_1
Testing tma_port_5
Testing tma_port_6
Testing smi_cycles
[Skipped smi_cycles] Not supported events
Performance counter stats for 'perf test -w noploop': <not supported> msr/smi/u <not supported> msr/aperf/u 3,965,789,327 cycles:u 1.012779591 seconds time elapsed 1.004579000 seconds user 0.007972000 seconds sys
Testing smi_num
[Failed smi_num] Metric contains missing events
Error: No supported events found. Invalid event (msr/smi/u) in per-thread mode, enable system wide with '-a'.
Testing tsx_aborted_cycles
Testing tsx_cycles_per_elision
Testing tsx_cycles_per_transaction
Testing tsx_transactional_cycles
---- end(-1) ----
110: perf all metrics test : FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The test uses tools/perf/include, which assumes it is run from the root
of the Linux kernel source tree. But perf can be run from other places
such as tools/perf, and then the include path won't match. Use the
shelldir variable to locate the test script in the tree.
$ cd tools/perf
$ ./perf test dlfilter
63: dlfilter C API : Ok
101: perf script --dlfilter tests : Ok
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Building the dlfilter may fail for reasons unrelated to perf. Skip the
test in that case, as it is not an error in perf itself. This can happen
when perf test is run without the source code or from a different
directory.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
keep_feat() determines which header features are kept or discarded.
Usually 'perf inject' adds build-IDs based on -b, -B or other related
options. But it loses the build-IDs when none of those options are
used. This is meaningful only when --buildid-mmap is not used.
The following example shows the impact of this change.
$ perf record --no-buildid-mmap true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (5 samples) ]
$ perf inject -i perf.data -o perf.data.inject
$ perf buildid-list -i perf.data
08cccc2a9388d5247ccb3e864f3063b975b0a15d /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
fd5c4d5673256cd6bda51725dba048dabb0f854e [kernel.kallsyms]
97a36ce1140071be5c36b147fa0bed173e05a602 [vdso]
$ perf buildid-list -i perf.data.inject
97a36ce1140071be5c36b147fa0bed173e05a602 [vdso]
With this change, perf.data.inject would show the same list (of course,
you need to run perf inject again).
Reported-by: Gabriel Marin <gmx@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that the SHA-1 code is no longer used, remove it.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Recent patches [1] [2] added an implementation of SHA-1 to perf and
used it for build ID generation.
I had understood the choice of SHA-1, which is a legacy algorithm, to be
for backwards compatibility.
It turns out, though, that there's no backwards compatibility
requirement here other than the size of the build ID field, which is
fixed at 20 bytes. Not only did the hash algorithm already change (from
MD5 to SHA-1), but the inputs to the hash changed too: from 'load_addr
|| code' to just 'code', and now again to 'code || symtab || strsym'
[3]. Different linkers generate different build IDs, with the LLVM
linker using BLAKE3 hashes for example [4].
Therefore, we might as well switch to a more modern algorithm. Let's go
with BLAKE2s. It's faster than SHA-1, isn't cryptographically broken,
is easier to implement than BLAKE3, and the kernel's implementation in
lib/crypto/blake2s.c is easily borrowed. It also natively supports
variable-length hashes, so it can directly produce the needed 20 bytes.
Also make the following additional improvements:
- Hash the three inputs incrementally, so they don't all have to be
concatenated into one buffer.
- Add tag/length prefixes to each of the three inputs, so that distinct
input tuples reliably result in distinct hashes.
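The two improvements above can be illustrated with a small sketch. It
is not the genelf.c code: a toy 64-bit FNV-1a hash stands in for
BLAKE2s, and the tag values, the 8-byte length encoding, and all names
here are assumptions. It only shows why incremental updates with
tag/length prefixes keep distinct input tuples distinct.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy incremental hash state (64-bit FNV-1a), a stand-in for BLAKE2s. */
struct toy_hash { uint64_t h; };

struct input { const void *data; size_t len; };

static void toy_init(struct toy_hash *s)
{
	s->h = 0xcbf29ce484222325ULL;
}

/* Absorb bytes one call at a time; no concatenated buffer is needed. */
static void toy_update(struct toy_hash *s, const void *data, size_t len)
{
	const unsigned char *p = data;

	for (size_t i = 0; i < len; i++) {
		s->h ^= p[i];
		s->h *= 0x100000001b3ULL;
	}
}

/* Absorb one input preceded by a 1-byte tag and a 64-bit LE length. */
static void toy_update_framed(struct toy_hash *s, unsigned char tag,
			      const void *data, size_t len)
{
	unsigned char prefix[9];

	prefix[0] = tag;
	for (int i = 0; i < 8; i++)
		prefix[1 + i] = (unsigned char)(((uint64_t)len >> (8 * i)) & 0xff);
	toy_update(s, prefix, sizeof(prefix));
	toy_update(s, data, len);
}

/* Hash three inputs in order, with or without the tag/length framing. */
static uint64_t hash_inputs(const struct input in[3], int framed)
{
	struct toy_hash s;

	toy_init(&s);
	for (int i = 0; i < 3; i++) {
		if (framed)
			toy_update_framed(&s, (unsigned char)(i + 1),
					  in[i].data, in[i].len);
		else
			toy_update(&s, in[i].data, in[i].len);
	}
	return s.h;
}
```

Without framing, the tuples ("ab", "c", "") and ("a", "bc", "") feed
the same byte stream to the hash and so collide; with framing the
streams differ, because each input's length is part of what is hashed.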
[1] https://lore.kernel.org/linux-perf-users/20250521225307.743726-1-yuzhuo@google.com/
[2] https://lore.kernel.org/linux-perf-users/20250625202311.23244-1-ebiggers@kernel.org/
[3] https://lore.kernel.org/linux-perf-users/20251125080748.461014-1-namhyung@kernel.org/
[4] https://github.com/llvm/llvm-project/commit/d3e5b6f7539b86995aef6e2075c1edb3059385ce
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add BLAKE2s support to the perf utility library. The code is borrowed
from the kernel. This will replace the use of SHA-1 in genelf.c.
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fangrui Song <maskray@sourceware.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The docs for YNL event ops currently render raw python structs. For
example in:
https://docs.kernel.org/netlink/specs/ethtool.html#cable-test-ntf
event: {‘attributes’: [‘header’, ‘status’, ‘nest’], ‘__lineno__’: 2385}
Handle event ops correctly and render their op attributes:
event: attributes: [header, status]
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260112153436.75495-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This test occasionally fails due to exceeding timing bounds, as
run in continuous testing on netdev.bots:
https://netdev.bots.linux.dev/contest.html?test=txtimestamp-sh
A common pattern is a single elevated delay between USR and SND.
# 8.36 [+0.00] test SND
# 8.36 [+0.00] USR: 1767864384 s 240994 us (seq=0, len=0)
# 8.44 [+0.08] ERROR: 18461 us expected between 10000 and 18000
# 8.44 [+0.00] SND: 1767864384 s 259455 us (seq=42, len=10) (USR +18460 us)
# 8.52 [+0.07] SND: 1767864384 s 339523 us (seq=42, len=10) (USR +10005 us)
# 8.52 [+0.00] USR: 1767864384 s 409580 us (seq=0, len=0)
# 8.60 [+0.08] SND: 1767864384 s 419586 us (seq=42, len=10) (USR +10005 us)
# 8.60 [+0.00] USR: 1767864384 s 489645 us (seq=0, len=0)
# 8.68 [+0.08] SND: 1767864384 s 499651 us (seq=42, len=10) (USR +10005 us)
# 8.68 [+0.00] USR-SND: count=4, avg=12119 us, min=10005 us, max=18460 us
(Note that other delays are nowhere near the large 8ms tolerance.)
One hypothesis is that the task is descheduled between taking the USR
timestamp and sending the packet. Possibly in printing.
Delay taking the timestamp closer to sendmsg, and delay printing until
after sendmsg.
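A minimal sketch of that ordering, assuming a connected socket and
using send() for brevity; the helper name send_with_usr_timestamp() is
hypothetical and this is not the txtimestamp.c code itself:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical helper: take the user-space (USR) timestamp immediately
 * before the send call and defer all printing until after it returns.
 */
static ssize_t send_with_usr_timestamp(int fd, const void *buf, size_t len,
				       struct timespec *ts_usr)
{
	ssize_t ret;

	assert(ts_usr != NULL);

	/* No printing or other slow work between these two calls. */
	clock_gettime(CLOCK_REALTIME, ts_usr);
	ret = send(fd, buf, len, 0);

	/* Printing happens only after the packet was handed to the kernel. */
	printf("USR: %lld s %ld us\n", (long long)ts_usr->tv_sec,
	       (long)(ts_usr->tv_nsec / 1000));
	return ret;
}
```

Keeping the window between the timestamp and the send call as small as
possible reduces the chance that a deschedule lands inside the measured
USR to SND interval.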
With this change, failure rate is significantly lower in current runs.
Link: https://lore.kernel.org/netdev/20260107110521.1aab55e9@kernel.org/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112163355.3510150-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When no exclusion occurs from the cmds list (for example, cmds contains
["read-vdso32"] and excludes contains ["archive"]), the main loop
completes with ci == cj == 0. In the original code the loop processing
the remaining elements in the list was conditional:
if (ci != cj) { ...}
So the remaining elements are never processed, and we reach the
assertion loop with ci < cmds->cnt. There we incorrectly assert that
the list elements are NULL and fail with the following error:
help.c:104: exclude_cmds: Assertion `cmds->names[ci] == NULL' failed.
Fix this by moving the if (ci != cj) check inside of a broader loop.
If ci != cj, left shift the list elements, as before, and then
unconditionally advance the ci and cj indices, which also covers the
ci == cj case.
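The fixed control flow can be sketched as follows. This is a simplified
stand-in, not the help.c code: entries are plain strings, excluded
slots are set to NULL instead of being freed, and the function name
exclude_sorted() is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for exclude_cmds(): both arrays are sorted.
 * Returns the new number of entries in names[].
 */
static size_t exclude_sorted(const char **names, size_t cnt,
			     const char *const *excludes, size_t ecnt)
{
	size_t ci = 0, cj = 0, ei = 0;

	while (ci < cnt && ei < ecnt) {
		int cmp = strcmp(names[ci], excludes[ei]);

		if (cmp < 0) {
			/* Keep this entry, compacting left if there is a gap. */
			if (ci != cj) {
				names[cj] = names[ci];
				names[ci] = NULL;
			}
			ci++;
			cj++;
		} else if (cmp == 0) {
			/* Excluded: drop the entry and leave a gap behind. */
			names[ci++] = NULL;
			ei++;
		} else {
			ei++;
		}
	}

	/* Remaining entries: shift only when needed, but always advance. */
	while (ci < cnt) {
		if (ci != cj) {
			names[cj] = names[ci];
			names[ci] = NULL;
		}
		ci++;
		cj++;
	}

	/* Everything past the compacted prefix must now be empty. */
	for (ci = cj; ci < cnt; ci++)
		assert(names[ci] == NULL);

	return cj;
}
```

With cmds = ["read-vdso32"] and excludes = ["archive"], the tail loop
advances ci and cj together to 1, so the assertion loop has nothing
left to check and the single command is kept.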
Fixes: 1fdf938168c4d26f ("perf tools: Fix use-after-free in help_unknown_cmd()")
Reviewed-by: Guilherme Amadio <amadio@gentoo.org>
Signed-off-by: Sri Jayaramappa <sjayaram@akamai.com>
Tested-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Joshua Hunt <johunt@akamai.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20251202213632.2873731-1-sjayaram@akamai.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a test that checks that inline functions are displayed correctly
in 'perf script' output for the inlineloop workload.
Committer testing:
# perf test 'addr2line inline unwinding'
76: test addr2line inline unwinding : Ok
# perf test -vv 'addr2line inline unwinding'
76: test addr2line inline unwinding:
--- start ---
test child forked, pid 1508628
Inline unwinding verification test
[ perf record: Woken up 129 times to write data ]
[ perf record: Captured and wrote 32.282 MB /tmp/perf-test-inline-addr2line.L4Sz8QtADJ/perf.data (4014 samples) ]
Inline unwinding verification test [Success]
---- end(0) ----
76: test addr2line inline unwinding : Ok
#
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
sample__fprintf_callchain() was using map__fprintf_srcline() which won't
report inline line numbers.
Fix by using the srcline from the callchain and falling back to the map
variant.
Fixes: 25da4fab5f66e659 ("perf evsel: Move fprintf methods to separate source file")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow the addr2line style to be specified on the `perf report` command
line or in the .perfconfig file.
Committer testing:
The methods:
# perf probe -x ~/bin/perf -F *__addr2line
cmd__addr2line
libbfd__addr2line
libdw__addr2line
llvm__addr2line
#
So if we configure one of them, say 'addr2line':
# perf config addr2line.style=addr2line
# perf config addr2line.style
addr2line.style=addr2line
#
And have probes on all of them:
# perf probe -x ~/bin/perf *__addr2line
Added new events:
probe_perf:cmd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:llvm__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libbfd__addr2line (on *__addr2line in /home/acme/bin/perf)
probe_perf:libdw__addr2line (on *__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw__addr2line -aR sleep 1
#
Only the selected method should be used:
# perf stat -e probe_perf:*_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For hierarchical output, try: perf report --hierarchy)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:cmd__addr2line
0 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
0 probe_perf:libdw__addr2line
0.035915611 seconds time elapsed
0.028008000 seconds user
0.009051000 seconds sys
#
I checked and that is the case for the other methods.
Also when using:
# perf config addr2line.style=libdw,llvm
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
0 probe_perf:cmd__addr2line
23 probe_perf:llvm__addr2line
0 probe_perf:libbfd__addr2line
44 probe_perf:libdw__addr2line
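The counts above suggest an ordered list of styles with fall-through to the next entry. A minimal Python sketch of that idea follows; perf itself is written in C, and the style names in `KNOWN_STYLES`, the `resolve()` helper and the fake backends are assumptions made for illustration.

```python
KNOWN_STYLES = ("libdw", "llvm", "libbfd", "addr2line")  # names assumed for this sketch

def parse_styles(value):
    """Split a comma-separated style list, preserving the user's order."""
    styles = [s.strip() for s in value.split(",") if s.strip()]
    for s in styles:
        if s not in KNOWN_STYLES:
            raise ValueError(f"unknown addr2line style: {s}")
    return styles

def resolve(addr, styles, backends):
    """Try each configured backend in order; return the first hit."""
    for style in styles:
        hit = backends[style](addr)
        if hit is not None:
            return style, hit
    return None, None

# Hypothetical backends: the first one fails for odd addresses.
backends = {
    "libdw": lambda addr: None if addr % 2 else ("inlineloop.c", 21),
    "llvm": lambda addr: ("inlineloop.c", 20),
}
styles = parse_styles("libdw,llvm")
```

In this sketch a backend that is not listed is never called, which matches the zero counts for the unselected methods.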
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pull x86 kvm fixes from Paolo Bonzini:
- Avoid freeing stack-allocated node in kvm_async_pf_queue_task
- Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
selftests: kvm: try getting XFD and XSAVE state out of sync
selftests: kvm: replace numbered sync points with actions
x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task
|
|
Add tests for special arithmetic shift right.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With 64K page on arm64, verifier_arena_globals1 failed like below:
...
libbpf: map 'arena': failed to create: -E2BIG
...
#509/1 verifier_arena_globals1/check_reserve1:FAIL
...
For 64K page, if the number of arena pages is (1UL << 20), the total
memory will exceed 4G and this will cause map creation failure.
Adjusting ARENA_PAGES based on the actual page size fixed the problem.
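The arithmetic behind the failure can be checked with a few lines of Python. The helper names are hypothetical and the actual ARENA_PAGES expression used by the selftest is not shown here; this only works through the page-count numbers.

```python
GIB = 1 << 30
ARENA_LIMIT = 4 * GIB  # the 4G ceiling mentioned in the commit message

def arena_bytes(pages, page_size):
    """Total arena size for a given page count and page size."""
    return pages * page_size

def max_arena_pages(page_size):
    """Largest page count whose total size stays within the 4G ceiling."""
    return ARENA_LIMIT // page_size

for page_size in (4096, 65536):
    total = arena_bytes(1 << 20, page_size)
    print(page_size, total // GIB, "GiB;", max_arena_pages(page_size), "pages fit")
```

With 4K pages, (1UL << 20) pages is exactly 4 GiB; with 64K pages the same count is 64 GiB, and only 65536 pages fit under the ceiling.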
Cc: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The current selftest sk_bypass_prot_mem only supports 4K page.
When running with 64K page on arm64, the following failure happens:
...
check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32
...
#385/1 sk_bypass_prot_mem/TCP :FAIL
...
check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32
...
#385/2 sk_bypass_prot_mem/UDP :FAIL
...
Adding support to 64K page as well fixed the failure.
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
On arm64 with 64K page, I observed the following test failure:
...
subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read:
actual 4696 <= expected 65536
#97/3 dmabuf_iter/lots_of_buffers:FAIL
With 4K page on x86, the total_bytes_read is 4593.
With 64K page on arm64, the total_bytes_read is 4696.
In progs/dmabuf_iter.c, for each iteration, the output is
BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter);
The only difference between 4K and 64K page is 'size' in
the above BPF_SEQ_PRINTF. The 4K page will output '4096' and
the 64K page will output '65536'. So the total_bytes_read with 64K page
is slightly greater than with 4K page.
Adjusting the total_bytes_read from 65536 to 4096 fixed the issue.
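The size of the gap can be worked through in Python. The inode, name and exporter values below are made up for illustration; only the format layout and the two totals come from the text above.

```python
def record(inode, size, name, exporter):
    # Mirrors the "%lu\n%llu\n%s\n%s\n" layout quoted above.
    return f"{inode}\n{size}\n{name}\n{exporter}\n"

# Hypothetical field values, identical except for the size.
r4k = record(1234, 4096, "udmabuf", "udmabuf")
r64k = record(1234, 65536, "udmabuf", "udmabuf")

extra_per_record = len(r64k) - len(r4k)  # one more digit in "65536"
gap = 4696 - 4593                        # totals reported above
```

Each record grows by a single byte, so a gap of 103 bytes is consistent with roughly a hundred records, and both totals stay far below 65536 while exceeding 4096.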
Cc: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Translation functions may return an invalid address in case of errors.
If the address is not checked the further use of the invalid value
will cause an address corruption.
Consistently check for a valid address returned by translation
functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for
type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX
or ULLONG_MAX is used to indicate an address error.
Propagating an invalid address from a failed translation may cause
userspace to think it has received a valid SPA, when in fact it is
wrong. The CXL userspace API, using trace events, expects ULLONG_MAX
to indicate a translation failure. If ULLONG_MAX is not returned
immediately, subsequent calculations can transform that bad address
into a different value (!ULLONG_MAX), and an invalid SPA may be
returned to userspace. This can lead to incorrect diagnostics and
erroneous corrective actions.
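A small Python sketch of why the sentinel has to be checked immediately. The function names and the single "add a base" step are hypothetical stand-ins for the real translation chain; 64-bit wraparound is emulated with a mask.

```python
U64_MASK = (1 << 64) - 1
ULLONG_MAX = U64_MASK  # the value userspace expects for a failed translation

def translate(dpa, fail=False):
    """Hypothetical translation step: returns ULLONG_MAX on error."""
    return ULLONG_MAX if fail else dpa

def to_spa_unchecked(dpa, base, fail=False):
    # Bug pattern: arithmetic on the sentinel wraps to a plausible address.
    return (translate(dpa, fail) + base) & U64_MASK

def to_spa_checked(dpa, base, fail=False):
    addr = translate(dpa, fail)
    if addr == ULLONG_MAX:
        return ULLONG_MAX  # propagate the error unchanged
    return (addr + base) & U64_MASK
```

In the unchecked variant a failed translation plus a base of 0x100000000 yields 0xffffffff, which no longer looks like an error to the caller.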
[ dj: Added user impact statement from Alison. ]
[ dj: Fixed checkpatch tab alignment issue. ]
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Fixes: c3dd67681c70 ("cxl/region: Add inject and clear poison by region offset")
Fixes: b78b9e7b7979 ("cxl/region: Refactor address translation funcs for testing")
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Instead of having the pv spinlock function definitions in paravirt.h,
move them into the new header paravirt-spinlock.h.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260105110520.21356-22-jgross@suse.com
|
|
Some architectures will start to implement this function.
Make sure it works correctly.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-4-97ea7a06a543@linutronix.de
|
|
SYS_clock_getres might have been redirected by libc to some other system
call than the actual clock_getres. For testing it is required to use
exactly this system call.
Use the system call number exported by the UAPI headers which is always
correct.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-3-97ea7a06a543@linutronix.de
|
|
Some architectures will start to implement this function.
Make sure that tests can be written for it.
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251223-vdso-compat-time32-v1-2-97ea7a06a543@linutronix.de
|
|
Having a single large pv_ops array has the main disadvantage of needing all
prototypes of the single array members in one header file. This is adding up
to the need to include lots of otherwise unrelated headers.
In order to allow multiple smaller pv_ops arrays dedicated to one area of the
kernel each, allow multiple arrays in objtool.
For better performance limit the possible names of the arrays to start with
"pv_ops".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20260105110520.21356-19-jgross@suse.com
|
|
Test 684b: Create CAKE_MQ with default setting (4 queues)
Test 7ee8: Create CAKE_MQ with bandwidth limit (4 queues)
Test 1f87: Create CAKE_MQ with rtt time (4 queues)
Test e9cf: Create CAKE_MQ with besteffort flag (4 queues)
Test 7c05: Create CAKE_MQ with diffserv8 flag (4 queues)
Test 5a77: Create CAKE_MQ with diffserv4 flag (4 queues)
Test 8f7a: Create CAKE_MQ with flowblind flag (4 queues)
Test 7ef7: Create CAKE_MQ with dsthost and nat flag (4 queues)
Test 2e4d: Create CAKE_MQ with wash flag (4 queues)
Test b3e6: Create CAKE_MQ with flowblind and no-split-gso flag (4 queues)
Test 62cd: Create CAKE_MQ with dual-srchost and ack-filter flag (4 queues)
Test 0df3: Create CAKE_MQ with dual-dsthost and ack-filter-aggressive flag (4 queues)
Test 9a75: Create CAKE_MQ with memlimit and ptm flag (4 queues)
Test cdef: Create CAKE_MQ with fwmark and atm flag (4 queues)
Test 93dd: Create CAKE_MQ with overhead 0 and mpu (4 queues)
Test 1475: Create CAKE_MQ with conservative and ingress flag (4 queues)
Test 7bf1: Delete CAKE_MQ with conservative and ingress flag (4 queues)
Test ee55: Replace CAKE_MQ with mpu (4 queues)
Test 6df9: Change CAKE_MQ with mpu (4 queues)
Test 67e2: Show CAKE_MQ class (4 queues)
Test 2de4: Change bandwidth of CAKE_MQ (4 queues)
Test 5f62: Fail to create CAKE_MQ with autorate-ingress flag (4 queues)
Test 038e: Fail to change setting of sub-qdisc under CAKE_MQ
Test 7bdc: Fail to replace sub-qdisc under CAKE_MQ
Test 18e0: Fail to install CAKE_MQ on single queue device
Reviewed-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: Jonas Köppeler <j.koeppeler@tu-berlin.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20260109-mq-cake-sub-qdisc-v8-6-8d613fece5d8@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit 59953303827e ("objtool: Disassemble code with libopcodes instead
of running objdump") added support for using libopcodes for disassembly.
However, the feature detection checks for libbfd availability but then
unconditionally links against libopcodes:
ifeq ($(feature-libbfd),1)
OBJTOOL_LDFLAGS += -lopcodes
endif
This causes build failures in environments where libbfd is installed but
libopcodes is not, since the test-libbfd.c feature test only links
against -lbfd and -ldl, not -lopcodes:
/usr/bin/ld: cannot find -lopcodes: No such file or directory
collect2: error: ld returned 1 exit status
make[4]: *** [Makefile:109: objtool] Error 1
Additionally, the shared feature framework uses $(CC) which is the
cross-compiler in cross-compilation builds. Since objtool is a host tool
that links with $(HOSTCC) against host libraries, the feature detection
can falsely report libopcodes as available when the cross-compiler's
sysroot has it but the host system doesn't.
Fix this by replacing the feature framework check with a direct inline
test that uses $(HOSTCC) to compile and link a test program against
libopcodes, similar to how xxhash availability is detected.
Fixes: 59953303827e ("objtool: Disassemble code with libopcodes instead of running objdump")
Assisted-by: claude-opus-4-5-20251101
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251223120357.2492008-1-sashal@kernel.org
|
|
When using the x32 toolchain, compilation fails because the printf
specifier "%lx" (long), doesn't match the type of the "checksum" variable
(long long). Fix this by changing the printf specifier to "%llx" and
casting "checksum" to unsigned long long.
Fixes: a3493b33384a ("objtool/klp: Add --debug-checksum=<funcs> to show per-instruction checksums")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/a1158c99-fe0e-a218-4b5b-ffac212489f6@redhat.com
|
|
Stop defining these privately and instead move them to the uapi
errno.h so that they become canonical instead of copy pasta.
Cc: linux-api@vger.kernel.org
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Link: https://patch.msgid.link/176826402587.3490369.17659117524205214600.stgit@frogsfrogsfrogs
Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The patch 'Replace atoi() with a robust strtoi()' introduced a bug
in parse_cpu_set(), which relies on partial parsing of the input string.
The function parses CPU specifications like '0-3,5' by incrementing
a pointer through the string. strtoi() rejects strings with trailing
characters, causing parse_cpu_set() to fail on any CPU list with
multiple entries.
Restore the original use of atoi() in parse_cpu_set().
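The difference between the two parsing styles can be illustrated in Python. This is a sketch, not the rtla code: `atoi_like()` imitates C atoi() by reading a leading integer and ignoring the rest, `strict_int()` imitates a parser that rejects trailing characters, and `parse_cpu_list()` is a hypothetical stand-in for parse_cpu_set().

```python
import re

def atoi_like(s):
    """Parse a leading decimal integer and ignore whatever follows."""
    m = re.match(r"\s*([+-]?\d+)", s)
    return int(m.group(1)) if m else 0

def strict_int(s):
    """Reject any trailing characters; raises ValueError on '0-3,5'."""
    return int(s)

def skip_digits(spec, pos):
    while pos < len(spec) and spec[pos].isdigit():
        pos += 1
    return pos

def parse_cpu_list(spec):
    """Walk a list such as '0-3,5' by advancing an offset through the string."""
    cpus, pos = [], 0
    while pos < len(spec):
        start = end = atoi_like(spec[pos:])   # relies on partial parsing
        pos = skip_digits(spec, pos)
        if pos < len(spec) and spec[pos] == "-":
            pos += 1
            end = atoi_like(spec[pos:])
            pos = skip_digits(spec, pos)
        cpus.extend(range(start, end + 1))
        if pos < len(spec):
            if spec[pos] != ",":
                raise ValueError(f"unexpected character at offset {pos}")
            pos += 1
    return cpus
```

Swapping `atoi_like()` for `strict_int()` inside the loop would fail on the very first call, because the remaining string still contains "-3,5".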
Fixes: 7e9dfccf8f11 ("rtla: Replace atoi() with a robust strtoi()")
Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20260112192642.212848-2-costa.shul@redhat.com
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
|
|
The "struct alg" object contains a union of 3 xfrm structures:
union {
struct xfrm_algo;
struct xfrm_algo_aead;
struct xfrm_algo_auth;
}
All of them end with a flexible array member used to store key material,
but the flexible array appears at *different offsets* in each struct.
Because of this, the union itself is variable-sized, and placing it above
char buf[...] triggers:
ipsec.c:835:5: warning: field 'u' with variable sized type 'union
(unnamed union at ipsec.c:831:3)' not at the end of a struct or class
is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
835 | } u;
| ^
One fix is to use "TRAILING_OVERLAP()", which works with one flexible
array member only.
But in "struct alg" a flexible array member exists in all union members,
though not at the same offset, so TRAILING_OVERLAP cannot be applied.
So the fix is to explicitly overlay the key buffer at the correct offset
for the largest union member (xfrm_algo_auth). This ensures that the
flexible-array region and the fixed buffer line up.
No functional change.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Link: https://patch.msgid.link/20260109152201.15668-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With CONFIG_CFI enabled, the kernel strictly enforces that indirect
function calls use a function pointer type that matches the target
function. As bpf_testmod_ctx_release() signature differs from the
btf_dtor_kfunc_t pointer type used for the destructor calls in
bpf_obj_free_fields(), add a stub function with the correct type to
fix the type mismatch.
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260110082548.113748-9-samitolvanen@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add 'stop' subcommand to kublk utility that uses the new
UBLK_CMD_TRY_STOP_DEV command when --safe option is specified.
This allows stopping a device only if it has no active openers,
returning -EBUSY otherwise.
Also add test_generic_16.sh to test the new functionality.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
As pointed out during review of the --list-attrs support the GET
ops very often return the same attrs from do and dump. Make the
output more readable by combining the reply information, from:
Do request attributes:
- ifindex: u32
netdev ifindex
Do reply attributes:
- ifindex: u32
netdev ifindex
[ .. other attrs .. ]
Dump reply attributes:
- ifindex: u32
netdev ifindex
[ .. other attrs .. ]
To, after:
Do request attributes:
- ifindex: u32
netdev ifindex
Do and Dump reply attributes:
- ifindex: u32
netdev ifindex
[ .. other attrs .. ]
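A minimal sketch of the merging rule, assuming a hypothetical `op` dictionary with optional "do" and "dump" entries that each carry a "reply" attribute list; the real spec objects used by the tool may look different.

```python
def reply_sections(op):
    """Return (title, attrs) pairs for an operation's reply attributes."""
    do = op.get("do", {}).get("reply")
    dump = op.get("dump", {}).get("reply")
    # Identical replies are printed once under a combined title.
    if do is not None and do == dump:
        return [("Do and Dump reply attributes:", do)]
    sections = []
    if do is not None:
        sections.append(("Do reply attributes:", do))
    if dump is not None:
        sections.append(("Dump reply attributes:", dump))
    return sections
```

When the two lists differ, both sections are still printed separately, so no information is lost by the merge.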
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Event and notify handling is quite different from do / dump
handling. Forcing it into print_mode_attrs() doesn't really
buy us anything as events and notifications do not have requests.
Call print_attr_list() directly. Apart from subjective code
clarity this also removes the word "reply" from the output:
Before:
Event reply attributes:
Now:
Event attributes:
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We'll soon add more code to the --doc handling. Factor it out
to avoid making main() too long.
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
--list-attrs also provides information about the operation itself.
So --doc seems more appropriate. Add an alias.
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Improve the clarity of --help. Reorder, provide some grouping and
add help messages to most of the options.
No functional changes intended.
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We already use textwrap when printing "doc" section about an attribute,
but only to indent the text. Switch to using fill() to split and indent
all the lines. While at it indent the text by 2 more spaces, so that it
doesn't align with the name of the attribute.
Before (I'm drawing a "box" at ~60 cols here, in an attempt for clarity):
| - irq-suspend-timeout: uint |
| The timeout, in nanoseconds, of how long to suspend irq|
|processing, if event polling finds events |
After:
| - irq-suspend-timeout: uint |
| The timeout, in nanoseconds, of how long to suspend |
| irq processing, if event polling finds events |
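The effect of switching to fill() can be reproduced with the standard textwrap module. The helper name `format_doc`, the indent of 4 and the width of 60 are assumptions for this sketch, not the values used by the tool.

```python
import textwrap

def format_doc(doc, indent, width=60):
    """Wrap `doc` to `width` columns with every line indented by `indent`."""
    prefix = " " * indent
    return textwrap.fill(doc, width=width,
                         initial_indent=prefix,
                         subsequent_indent=prefix)

doc = ("The timeout, in nanoseconds, of how long to suspend irq "
       "processing, if event polling finds events")
print(format_doc(doc, indent=4))
```

By contrast, textwrap.indent() only prefixes the lines that already exist, so a long single-line doc string would still overflow the terminal width.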
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It's a little hard to make sense of the output of --list-attrs,
it looks like a wall of text. Sprinkle a little bit of formatting -
make op and attr names bold, and Enum: / Flags: keywords italics.
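One common way to get bold and italic text in a terminal is ANSI SGR escape sequences. The sketch below assumes that approach and a hypothetical `style()` helper that falls back to plain text when stdout is not a terminal; the tool's own implementation may differ.

```python
import sys

BOLD, ITALIC, RESET = "\033[1m", "\033[3m", "\033[0m"

def style(text, code, enabled=None):
    """Wrap `text` in an ANSI code, or return it unchanged when disabled."""
    if enabled is None:
        enabled = sys.stdout.isatty()
    return f"{code}{text}{RESET}" if enabled else text

print(style("irq-suspend-timeout", BOLD) + ": uint")
print(style("Enum:", ITALIC) + " example values")
```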
Tested-by: Gal Pressman <gal@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20260110233142.3921386-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of having a pre-filled array xen_mmu_ops for Xen PV paravirt
functions, drop the array and assign each element individually.
This is in preparation of reducing the paravirt include hell by
splitting paravirt.h into multiple more fine grained header files,
which will in turn require to split up the pv_ops vector as well.
Dropping the pre-filled array makes life easier for objtool to
detect missing initializers in multiple pv_ops_ arrays.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://patch.msgid.link/20260105110520.21356-18-jgross@suse.com
|
|
The a2l_style is only relevant to the command line version, so rename
to make this clearer.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add an implementation of addr2line that uses libdw.
Other addr2line implementations are slow, particularly in the case of
forking addr2line.
Add an implementation that caches the libdw information in the dso and
uses it to find the file and line number information.
Inline information is supported but because cu_walk_functions_at visits
the leaf function last add an inline_list__append_tail to reverse the
list's order.
Committer testing:
# perf probe -x ~/bin/perf libdw__addr2line
Added new event:
probe_perf:libdw_addr2line (on libdw__addr2line in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:libdw_addr2line -aR sleep 1
#
# perf stat -e probe_perf:libdw_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
# (Tip: For tracepoint events, try: perf report -s trace_fields)
#
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
0.037260744 seconds time elapsed
0.025299000 seconds user
0.011918000 seconds sys
#
Adding probes to the other addr2line implementations (llvm__addr2line,
libbfd__addr2line and cmd__addr2line) I noticed some fallbacks to the
llvm one:
Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':
44 probe_perf:libdw_addr2line
23 probe_perf:llvm_addr2line
0 probe_perf:libbfd_addr2line
0 probe_perf:cmd_addr2line
Something to investigate further, but at least we don't fallback to the
cmd based one :-)
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The purpose of this workload is to gather samples in an inlined
function. This can be used to test whether inlined addr2line works
correctly.
Committer testing:
$ perf record perf test -w inlineloop 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.161 MB perf.data (4005 samples) ]
$ perf report --stdio --dso perf -s srcfile,srcline
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'cpu/cycles/Pu'
# Event count (approx.): 5535180842
#
# Overhead Source File Source:Line
# ........ ............ ...............
#
99.04% inlineloop.c inlineloop.c:21
0.46% inlineloop.c inlineloop.c:20
#
$
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The addition of addr_location__exit() causes use-after put on the maps
and map references in the unwind info. Add the gets and then add the
map_symbol__exit() calls.
Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As stated in commit 1c09b195d37f ("cpuset: fix a regression in validating
config change"), it is not allowed to clear masks of a cpuset if
there're tasks in it. This is specific to v1 since empty "cpuset.cpus"
or "cpuset.mems" will cause the v2 cpuset to inherit the effective CPUs
or memory nodes from its parent. So it is OK to have empty cpus or mems
even if there are tasks in the cpuset.
Move this empty cpus/mems check in validate_change() to
cpuset1_validate_change() to allow more flexibility in setting
cpus or mems in v2. cpuset_is_populated() needs to be moved into
cpuset-internal.h as it is needed by the empty cpus/mems checking code.
Also add a test case to test_cpuset_prs.sh to verify that.
Reported-by: Chen Ridong <chenridong@huaweicloud.com>
Closes: https://lore.kernel.org/lkml/7a3ec392-2e86-4693-aa9f-1e668a668b9c@huaweicloud.com/
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Currently, when setting a cpuset's cpuset.cpus to a value that conflicts
with the cpuset.cpus/cpuset.cpus.exclusive of a sibling partition,
the sibling's partition state becomes invalid. This is overly harsh and
is probably not necessary.
The cpuset.cpus.exclusive control file, if set, will override the
cpuset.cpus of the same cpuset when creating a cpuset partition.
So cpuset.cpus has less priority than cpuset.cpus.exclusive in setting up
a partition. However, it cannot override a conflicting cpuset.cpus file
in a sibling cpuset and the partition creation process will fail. This
is inconsistent. That will also make using cpuset.cpus.exclusive less
valuable as a tool to set up cpuset partitions as the users have to
check if such a cpuset.cpus conflict exists or not.
Fix these problems by making sure that once a cpuset.cpus.exclusive
is set without failure, it will always be allowed to form a valid
partition as long as at least one CPU can be granted from its parent
irrespective of the state of the siblings' cpuset.cpus values. Of
course, setting cpuset.cpus.exclusive will fail if it conflicts with
the cpuset.cpus.exclusive or the cpuset.cpus.exclusive.effective value
of a sibling.
A partition can still be created by setting only cpuset.cpus without
setting cpuset.cpus.exclusive. However, any conflicting CPUs in sibling's
cpuset.cpus.exclusive.effective and cpuset.cpus.exclusive values will
be removed from its cpuset.cpus.exclusive.effective as long as there
is still one or more CPUs left and can be granted from its parent. This
CPU stripping is currently done in rm_siblings_excl_cpus().
The new code will now try its best to enable the creation of new
partitions with only cpuset.cpus set without invalidating existing ones.
However it is not guaranteed that all the CPUs requested in cpuset.cpus
will be used in the new partition even when all these CPUs can be
granted from the parent.
This is similar to the fact that cpuset.cpus.effective may not be
able to include all the CPUs requested in cpuset.cpus. In this case,
the parent may not be able to grant all the exclusive CPUs requested in
cpuset.cpus to cpuset.cpus.exclusive.effective if some of them have
already been granted to other partitions earlier.
With the creation of multiple sibling partitions by setting
only cpuset.cpus, this does have the side effect that their exact
cpuset.cpus.exclusive.effective settings will depend on the order of
partition creation if there are conflicts. Due to the exclusive nature
of the CPUs in a partition, it is not easy to make it fair other than
the old behavior of invalidating all the conflicting partitions.
For example,
# echo "0-2" > A1/cpuset.cpus
# echo "root" > A1/cpuset.cpus.partition
# cat A1/cpuset.cpus.partition
root
# cat A1/cpuset.cpus.exclusive.effective
0-2
# echo "2-4" > B1/cpuset.cpus
# echo "root" > B1/cpuset.cpus.partition
# cat B1/cpuset.cpus.partition
root
# cat B1/cpuset.cpus.exclusive.effective
3-4
# cat B1/cpuset.cpus.effective
3-4
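The stripping and its order dependence can be modelled with Python sets. This is a simplified model of the behaviour shown in the example, not the kernel logic in rm_siblings_excl_cpus(); the helper name is hypothetical and parent constraints are ignored.

```python
def effective_exclusive(requested, sibling_exclusive):
    """CPUs left for a new partition after removing siblings' exclusive CPUs."""
    return set(requested) - set(sibling_exclusive)

# A1 becomes a partition first, then B1 (as in the example above).
a1 = effective_exclusive(range(0, 3), set())   # {0, 1, 2}
b1 = effective_exclusive(range(2, 5), a1)      # {3, 4}

# Reversed creation order: B1 keeps CPU 2 and A1 loses it.
b1_first = effective_exclusive(range(2, 5), set())     # {2, 3, 4}
a1_second = effective_exclusive(range(0, 3), b1_first)  # {0, 1}
```

The contested CPU 2 goes to whichever partition is created first, which is the order dependence described above.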
For users who want to be sure that they can get most of the CPUs they
want, cpuset.cpus.exclusive should be used instead if they can set
it successfully without failure. Setting cpuset.cpus.exclusive will
guarantee that sibling conflicts from then onward are no longer possible.
To make this change, we have to separate out the is_cpu_exclusive()
check in cpus_excl_conflict() into a cgroup v1 only
cpuset1_cpus_excl_conflict() helper. The cpus_allowed_validate_change()
helper is now no longer needed and can be removed.
Some existing tests in test_cpuset_prs.sh are updated and new ones are
added to reflect the new behavior. The cgroup-v2.rst doc file is also
updated to clarify what exclusive CPUs will be used when a partition
is created.
Reported-by: Sun Shaojie <sunshaojie@kylinos.cn>
Closes: https://lore.kernel.org/lkml/20251117015708.977585-1-sunshaojie@kylinos.cn/
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
The test is based on an error/fix posted to linux-perf-users.
Reported-by: Sri Jayaramappa <sjayaram@akamai.com>
Reviewed-by: Sri Jayaramappa <sjayaram@akamai.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Closes: https://lore.kernel.org/linux-perf-users/20251202213632.2873731-1-sjayaram@akamai.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit bc22de9bcdb22491 ("perf stat: Display time in precision based on
std deviation") added multirun workload elapsed time. There was an
effort to make the precision in the output most useful for the user;
however, when gathering results over multiple runs, the formatting
varies from run to run. This change makes the output format fixed.
Before:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101140 +- 0.000149 seconds time elapsed ( +- 0.15% )
0.1011396 +- 0.0000218 seconds time elapsed ( +- 0.02% )
0.101331 +- 0.000124 seconds time elapsed ( +- 0.12% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001317 +- 0.000146 seconds time elapsed ( +- 0.01% )
1.001377 +- 0.000172 seconds time elapsed ( +- 0.02% )
1.00253 +- 0.00131 seconds time elapsed ( +- 0.13% )
```
After:
```
$ while :; do perf stat --null --repeat 3 sleep 0.1 2>&1 | grep elapsed; done
0.101406408 +- 0.000064778 seconds time elapsed ( +- 0.06% )
0.101367315 +- 0.000027253 seconds time elapsed ( +- 0.03% )
0.101434164 +- 0.000084750 seconds time elapsed ( +- 0.08% )
^C
$ while :; do perf stat --null --repeat 3 sleep 1 2>&1 | grep elapsed; done
1.001525467 +- 0.000051703 seconds time elapsed ( +- 0.01% )
1.001375093 +- 0.000116200 seconds time elapsed ( +- 0.01% )
1.001141025 +- 0.000046361 seconds time elapsed ( +- 0.00% )
```
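The fixed format can be sketched outside of perf as follows. This is an illustration only, assuming nine decimal places for both the mean and the deviation as seen in the "After" output; format_elapsed() is a hypothetical helper, not a function from the perf sources.

```python
def format_elapsed(avg, stddev):
    """Format a mean elapsed time and its deviation with fixed precision.

    Both values always get nine decimal places, so the columns line up
    across runs regardless of how large the deviation is.
    """
    pct = 100.0 * stddev / avg if avg else 0.0
    return f"{avg:.9f} +- {stddev:.9f} seconds time elapsed ( +- {pct:.2f}% )"


print(format_elapsed(0.101406408, 0.000064778))
print(format_elapsed(1.001141025, 0.000046361))
```

Because the precision no longer depends on the measured deviation, lines collected from repeated invocations have the same layout and are easier to compare or parse.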
Closes: https://lore.kernel.org/lkml/aTQRgAOpKyI53TEq@gmail.com/
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instead of having a pre-filled array xen_cpu_ops for Xen PV paravirt
functions, drop the array and assign each element individually.
This is in preparation for reducing the paravirt include hell by
splitting paravirt.h into multiple, more fine-grained header files,
which will in turn require splitting up the pv_ops vector as well.
Dropping the pre-filled array makes it easier for objtool to
detect missing initializers in multiple pv_ops_ arrays.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://patch.msgid.link/20260105110520.21356-17-jgross@suse.com
|