| Age | Commit message (Collapse) | Author | Files | Lines |
|
This will be used by aux PMUs to read an already written value for
configuring their events and for also testing.
Its helper perf_pmu__format_unpack() does the opposite of the existing
pmu_format_value() so rename that one to perf_pmu__format_pack() so it's
clear how they are related.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Misleadingly, evsel__set_config_if_unset() only works with the config
field and not config1, config2, etc. This is fine at the moment because
all users of it happen to operate on bits that are in that config field.
Fix it before there are any new users of the function which operate on
bits in different config fields.
In theory it's also possible for a driver to move an existing bit to
another config field and this fixes that scenario too, although this
hasn't happened yet either.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently we only track which bits were set by the user in attr->config.
But all configN fields should be treated equally as they can all have
default and user overridden values.
Track them all by making get_config_chgs() generic and calling it once
for each config value.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sparse config fields are technically supported although currently
unused. field_prep() only works for contiguous bitfields so replace it
with pmu_format_value().
pmu_format_value() also takes a bitmap rather than a u64 so replace
'u64 bits' with format->bits.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At least one of these were put here to avoid a Python binding linking
issue which is no longer present. Put them back in their correct
location to avoid confusion about which file to add a new evsel__*
function to later.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/all/ZEbAS2yx2fguW60w@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make the evsel argument first to match the other evsel__* functions
and remove the redundant pmu argument, which can be accessed via evsel.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The ADD_CONFIG_TERM() macros build the __type argument out of a partial
EVSEL__CONFIG_TERM_x enum name. This means that they can't be called
from a function where __type is a variable and it's also impossible to
grep the codebase to find usages of these enums as they're never typed
in full.
Fix this by removing the macros and replacing them with an
add_config_term() function. It seems the main reason these existed in
the first place was to avoid type punning and to write to a specific
field in the union, but the same thing can be achieved with a single
write to a u64 'val' field.
Running the Perf tests with "-fsanitize=undefined -fno-sanitize-recover"
results in no new issues as a result of this change.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Cross-merge BPF and other fixes after downstream PR.
No conflicts.
Adjacent:
Auto-merging MAINTAINERS
Auto-merging Makefile
Auto-merging kernel/bpf/verifier.c
Auto-merging kernel/sched/ext.c
Auto-merging mm/memcontrol.c
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1
intercepts for VMSAVE/VMLOAD always work regardless of
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK.
Then, more interestingly, it makes sure that when L1 does not intercept
VMLOAD/VMSAVE, they work as intended in L2. When
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from
L2 should interpret the GPA as an L2 GPA and translate it through the
NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1,
VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA.
To test this, put two VMCBs (0 and 1) in L1's physical address space,
and have a single L2 GPA where:
- L2 VMCB GPA == L1 VMCB(0) GPA
- L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1.
This setup allows detecting how the GPA is interpreted based on which L1
VMCB is actually accessed.
In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by
VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The
test userspace code then checks that the write was made to the correct
VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1),
and writes a new value to that VMCB. L2 then executes VMLOAD to load the
new value and makes sure it's reflected correctly in KERNERL_GS_BASE.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
strerror() has thread safety issues, strerror_r() requires stack
allocated buffers.
Code in perf has already been using the "%m" formatting flag that is a
widely support glibc extension to print the current errno's description.
Expand the usage of this formatting flag and remove usage of
strerror()/strerror_r().
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Yunseong Kim <ysk@kzalloc.com>
Cc: Zhongqiu Han <quic_zhonhan@quicinc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's a lot of infrastructure for generating a relatively simple
array used by one function.
Move the array into the function and remove the supporting build logic.
At the same time opportunistically const-ify the array.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add metrics taken from Section 1.2 "Performance Measurement" of the
Performance Monitor Counters for AMD Family 1Ah Model 50h-57h Processors
document available at the link below.
The recommended metrics are sourced from Table 1 "Guidance for Common
Performance Statistics with Complex Event Selects".
The pipeline utilization metrics are sourced from Table 2 "Guidance
for Pipeline Utilization Analysis Statistics". These are useful for
finding performance bottlenecks by analyzing activity at different
stages of the pipeline. There are metric groups available for Level 1
and Level 2 analysis.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add uncore events taken from Section 1.6 "L3 Cache Performance Monitor
Counters" and Section 2.2 "UMC Performance Monitor Events" of the
Performance Monitor Counters for AMD Family 1Ah Model 50h-57h Processors
document available at the link below.
This constitutes events which capture L3 cache and UMC command activity.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add core events taken from Section 1.5 "Core Performance Monitor
Counters" of the Performance Monitor Counters for AMD Family 1Ah Model
50h-57h Processors document available at the link below.
This constitutes events which capture information on op dispatch,
execution and retirement, branch prediction, L1 and L2 cache activity,
TLB activity, etc.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://bugzilla.kernel.org/attachment.cgi?id=309149
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a regular expression in the map file so that appropriate JSON event
files are used for AMD Zen 6 processors. Restrict the regular expression
for AMD Zen 5 processors to known model ranges since they also belong to
Family 1Ah.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephane Eranian <eranian@google.com>
[ Moved this one to the front of the series to keep the tree bisectable, as per Ian Rogers suggestion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch add the OpenHW Core-V CVA6 Risc-V JSON file.
For more info:
https://openhwfoundation.org/news/2023/11/07/openhw-group-announces-core-v-cva6-platform-project-for-risc-v-software-development-and-testing/
Signed-off-by: Manuel Hernández Méndez <manuel.hernandez@openchip.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Instead of calling memstress_setup_ept_mappings() only in the first
iteration in the loop, move it before the loop.
The call needed to happen within the loop before commit e40e72fec0de
("KVM: selftests: Stop passing VMX metadata to TDP mapping functions"),
as memstress_setup_ept_mappings() used to take in a pointer to vmx_pages
and pass it into tdp_identity_map_1g() (to get the EPT root GPA). This
is no longer the case, as tdp_identity_map_1g() gets the EPT root
through stage2 MMU.
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260113171456.2097312-1-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Now that get/put_unaligned() use memcpy() the -Wpacked and -Wattributes
warnings don't need disabling anymore.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251016205126.2882625-5-irogers@google.com
|
|
To pick up the changes in:
vdso: Switch get/put_unaligned() from packed struct to memcpy
As the code is dependent on __unqual_scalar_typeof, update also the tools
version of compiler_types.h to include this.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Link: https://patch.msgid.link/20251016205126.2882625-4-irogers@google.com
|
|
The cache parameter of getcpu() is useless nowadays for various reasons.
* It is never passed by userspace for either the vDSO or syscalls.
* It is never used by the kernel.
* It could not be made to work on the current vDSO architecture.
* The structure definition is not part of the UAPI headers.
* vdso_getcpu() is superseded by restartable sequences in any case.
Remove the struct and its header.
As a side-effect this gets rid of an unwanted inclusion of the linux/
header namespace from vDSO code.
[ tglx: Adapt to s390 upstream changes */
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Link: https://patch.msgid.link/20251230-getcpu_cache-v3-1-fb9c5f880ebe@linutronix.de
|
|
Pull bpf fixes from Alexei Starovoitov:
- Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
Dong)
- Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)
- Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)
- Check that BPF insn array is not allowed as a map for const strings
(Deepanshu Kartikey)
* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix reference count leak in bpf_prog_test_run_xdp()
bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
bpf, test_run: Subtract size of xdp_frame from allowed metadata size
riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
|
|
The ldimm64 instruction for map value supports an offset.
For insn array maps it wasn't tested before, as normally
such instructions aren't generated. However, this is still
possible to pass such instructions, so add a few tests to
check that correct offsets work properly and incorrect
offsets are rejected.
Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20260111153047.8388-4-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The BPF verifier was recently updated to treat pointers to struct types
returned from BPF kfuncs as implicitly trusted by default. Add a new
test case to exercise this new implicit trust semantic.
The KF_ACQUIRE flag was dropped from the bpf_get_root_mem_cgroup()
kfunc because it returns a global pointer to root_mem_cgroup without
performing any explicit reference counting. This makes it an ideal
candidate to verify the new implicit trusted pointer semantics.
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260113083949.2502978-3-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Teach the BPF verifier to treat pointers to struct types returned from
BPF kfuncs as implicitly trusted (PTR_TO_BTF_ID | PTR_TRUSTED) by
default. Returning untrusted pointers to struct types from BPF kfuncs
should be considered an exception only, and certainly not the norm.
Update existing selftests to reflect the change in register type
printing (e.g. `ptr_` becoming `trusted_ptr_` in verifier error
messages).
Link: https://lore.kernel.org/bpf/aV4nbCaMfIoM0awM@google.com/
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260113083949.2502978-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The RPS bitmask bounds check uses ~(RPS_MAX_CPUS - 1) which equals ~15 =
0xfff0, only allowing CPUs 0-3.
Change the mask to ~((1UL << RPS_MAX_CPUS) - 1) = ~0xffff to allow CPUs
0-15.
Fixes: 5ebfb4cc3048 ("selftests/net: toeplitz test")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The toeplitz.py test passed the hex mask without "0x" prefix (e.g.,
"300" for CPUs 8,9). The toeplitz.c strtoul() call wrongly parsed this
as decimal 300 (0x12c) instead of hex 0x300.
Pass the prefixed mask to toeplitz.c, and the unprefixed one to sysfs.
Fixes: 9cf9aa77a1f6 ("selftests: drv-net: hw: convert the Toeplitz test to Python")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add test cases that verify that when the "onlink" keyword is specified,
both address families (with and without VRF) accept routes with a
gateway address that is reachable via a different interface than the one
specified.
Output without "ipv6: Allow for nexthop device mismatch with "onlink"":
# ./fib-onlink-tests.sh | grep mismatch
TEST: nexthop device mismatch [ OK ]
TEST: nexthop device mismatch [ OK ]
TEST: nexthop device mismatch [FAIL]
TEST: nexthop device mismatch [FAIL]
Output with "ipv6: Allow for nexthop device mismatch with "onlink"":
# ./fib-onlink-tests.sh | grep mismatch
TEST: nexthop device mismatch [ OK ]
TEST: nexthop device mismatch [ OK ]
TEST: nexthop device mismatch [ OK ]
TEST: nexthop device mismatch [ OK ]
That is, the IPv4 tests were always passing, but the IPv6 ones only pass
after the specified patch.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-6-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A multicast gateway address should be rejected when "onlink" is
specified, but it is only tested as part of the IPv6 tests. Add an
equivalent IPv4 test.
# ./fib-onlink-tests.sh -v
[...]
COMMAND: ip ro add table 254 169.254.101.12/32 via 233.252.0.1 dev veth1 onlink
Error: Nexthop has invalid gateway.
TEST: Invalid gw - multicast address [ OK ]
[...]
COMMAND: ip ro add table 1101 169.254.102.12/32 via 233.252.0.1 dev veth5 onlink
Error: Nexthop has invalid gateway.
TEST: Invalid gw - multicast address, VRF [ OK ]
[...]
Tests passed: 37
Tests failed: 0
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The command in the test fails as expected because IPv6 forbids a nexthop
device mismatch:
# ./fib-onlink-tests.sh -v
[...]
COMMAND: ip -6 ro add table 1101 2001:db8:102::103/128 via 2001:db8:701::64 dev veth5 onlink
Error: Nexthop has invalid gateway or device mismatch.
TEST: Gateway resolves to wrong nexthop device - VRF [ OK ]
[...]
Where:
# ip route get 2001:db8:701::64 vrf lisa
2001:db8:701::64 dev veth7 table 1101 proto kernel src 2001:db8:701::1 metric 256 pref medium
This is in contrast to IPv4 where a nexthop device mismatch is allowed
when "onlink" is specified:
# ip route get 169.254.7.2 vrf lisa
169.254.7.2 dev veth7 table 1101 src 169.254.7.1 uid 0
# ip ro add table 1101 169.254.102.103/32 via 169.254.7.2 dev veth5 onlink
# echo $?
0
Remove these tests in preparation for aligning IPv6 with IPv4 and
allowing nexthop device mismatch when "onlink" is specified.
A subsequent patch will add tests that verify that both address families
allow a nexthop device mismatch with "onlink".
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
According to the test description, these tests fail because of a wrong
nexthop device:
# ./fib-onlink-tests.sh -v
[...]
COMMAND: ip ro add table 254 169.254.101.102/32 via 169.254.3.1 dev veth1 onlink
Error: Nexthop has invalid gateway.
TEST: Gateway resolves to wrong nexthop device [ OK ]
COMMAND: ip ro add table 1101 169.254.102.103/32 via 169.254.7.1 dev veth5 onlink
Error: Nexthop has invalid gateway.
TEST: Gateway resolves to wrong nexthop device - VRF [ OK ]
[...]
But this is incorrect. They fail because the gateway addresses are local
addresses:
# ip -4 address show
[...]
28: veth3@if27: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netns peer_ns-Urqh3o
inet 169.254.3.1/24 scope global veth3
[...]
32: veth7@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lisa state UP group default qlen 1000 link-netns peer_ns-Urqh3o
inet 169.254.7.1/24 scope global veth7
Therefore, using a local address that matches the nexthop device fails
as well:
# ip ro add table 254 169.254.101.102/32 via 169.254.3.1 dev veth3 onlink
Error: Nexthop has invalid gateway.
Using a gateway address with a "wrong" nexthop device is actually valid
and allowed:
# ip route get 169.254.1.2
169.254.1.2 dev veth1 src 169.254.1.1 uid 0
# ip ro add table 254 169.254.101.102/32 via 169.254.1.2 dev veth3 onlink
# echo $?
0
Remove these tests given that their output is confusing and that the
scenario that they are testing is already covered by other tests.
A subsequent patch will add tests for the nexthop device mismatch
scenario.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a test using large chunks for zcrx memory area.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
|
|
GRO test groups the cases into categories, e.g. "tcp" case
checks coalescing in presence of:
- packets with bad csum,
- sequence number mismatch,
- timestamp option value mismatch,
- different TCP options.
Since we now have TAP support grouping the cases like that
lowers our reporting granularity. This matters even more for
NICs performing HW GRO and LRO since it appears that most
implementation have _some_ bugs. Flagging the whole group
of tests as failed prevents us from catching regressions
in the things that work today.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Run the test against HW GRO and LRO. NICs I have pass the base cases.
Interestingly all are happy to build GROs larger than 64k.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We'll need to do a lot more feature handling to test HW-GRO and LRO.
Clean up the feature handling for SW GRO a bit to let the next commit
focus on the new test cases, only.
Make sure HW GRO-like features are not enabled for the SW tests.
Be more careful about changing features as "nothing changed"
situations may result in non-zero error code from ethtool.
Don't disable TSO on the local interface (receiver) when running over
netdevsim, we just want GSO to break up the segments on the sender.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Now that cmd() can be printed directly remove the old formatting.
Before:
# fragmented ip6 doesn't coalesce:
# Expected {200 100 100 }, Total 3 packets
# Received {200 100 }, Total 2 packets.
# /root/ksft-net-drv/drivers/net/gro: incorrect number of packets
Now:
# CMD: drivers/net/gro --ipv6 --dmac 9e:[...]
# EXIT: 1
# STDOUT: fragmented ip6 doesn't coalesce:
# STDERR: Expected {200 100 100 }, Total 3 packets
# Received {200 100 }, Total 2 packets.
# /root/ksft-net-drv/drivers/net/gro: incorrect number of packets
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Teach cmd() how to print itself, to make debug prints easier.
Example output (leading # due to ksft_pr()):
# CMD: /root/ksft-net-drv/drivers/net/gro
# EXIT: 1
# STDOUT: ipv6 with ext header does coalesce:
# STDERR: Expected {200 }, Total 1 packets
# Received {100 [!=200]100 [!=0]}, Total 2 packets.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Make printing multi-line logs easier by automatically prefixing
each line in ksft_pr(). Make use of this when formatting exceptions.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When building ynltool with parallel make (-jN), a warning is emitted:
make[1]: warning: jobserver unavailable: using -j1.
Add '+' to parent make rule.
The warning trips up local runs of NIPA's ingest_mdir.py, which
correctly fails on make warnings.
This occurs because SRC_VERSION uses $(shell make ...) to make
kernelversion. The $(shell) function inherits make's MAKEFLAGS env var
which specifies "--jobserver-auth=R,W" pointing to file descriptors that
the invoked make sub-shell does not have access to.
Observed with:
$ make --version | head -1
GNU Make 4.3
Instead of suppressing MAKEFLAGS and foregoing all future MAKEFLAGS
(some of which may be desirable, such as variable overrides) or
introducing a new make target, we instead just ignore the warning by
piping stderr to /dev/null. If 'make kernelversion' fails, the ' || echo
"unknown"' phrase will catch the failure.
Before:
NIPA ingest_mdir.py:
ynl
Full series FAIL (1)
Generated files up to date; build has 1 warnings/errors; no diff in
generated;
After:
NIPA ingest_mdir.py:
Series level tests:
ynl OKAY
Validated output:
$ ./ynltool/ynltool --version
ynltool 6.19.0-rc4
Signed-off-by: Bobby Eshleman <bobbyeshleman@meta.com>
Link: https://patch.msgid.link/20260112-ynl-make-fix-v1-1-c399e76925ad@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix KVM's long-standing buggy handling of SVM's exit_code as a 32-bit
value. Per the APM and Xen commit d1bd157fbc ("Big merge the HVM
full-virtualisation abstractions.") (which is arguably more trustworthy
than KVM), offset 0x70 is a single 64-bit value:
070h 63:0 EXITCODE
Track exit_code as a single u64 to prevent reintroducing bugs where KVM
neglects to correctly set bits 63:32.
Fixes: 6aa8b732ca01 ("[PATCH] kvm: userspace interface")
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230211347.4099600-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a test to verify KVM correctly handles a variety of edge cases related
to APICv updates, and in particular updates that are triggered while L2 is
actively running.
Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Calling the str_is_empty function to clarify the code and
no functional changes are introduced.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-12-dolinux.peng@gmail.com
|
|
This patch checks whether the BTF is sorted by name in ascending
order. If sorted, binary search will be used when looking up types.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-6-dolinux.peng@gmail.com
|
|
This patch introduces binary search optimization for BTF type lookups
when the BTF instance contains sorted types.
The optimization significantly improves performance when searching for
types in large BTF instances with sorted types. For unsorted BTF, the
implementation falls back to the original linear search.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-5-dolinux.peng@gmail.com
|
|
This introduces a new BTF sorting phase that specifically sorts
BTF types by name in ascending order, so that the binary search
can be used to look up types.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-4-dolinux.peng@gmail.com
|
|
This patch introduces test cases for the btf__permute function to ensure
it works correctly with both base BTF and split BTF scenarios.
The test suite includes:
- test_permute_base: Validates permutation on base BTF
- test_permute_split: Tests permutation on split BTF
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com
|
|
Introduce btf__permute() API to allow in-place rearrangement of BTF types.
This function reorganizes BTF type order according to a provided array of
type IDs, updating all type references to maintain consistency.
Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-2-dolinux.peng@gmail.com
|
|
The function addr_location__put() was renamed addr_location__exit() in
commit 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy
functions"). Make the comment preceding the function consistent with
the function itself.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kexin Sun <kexinsun@smail.nju.edu.cn>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ratnadira Widyasari <ratnadiraw@smu.edu.sg>
Cc: Xutong Ma <xutong.ma@inria.fr>
Cc: Yumbo Lyu <yunbolyu@smu.edu.sg>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These are hard to interpret in the raw output because they are printed
as hex but are defined in perf_event.h as decimal. Make it much easier
to read the raw callchains by just printing their names.
For example:
$ perf report -D
1798195372321 0x4638 [0xb0]: PERF_RECORD_SAMPLE(IP, 0x4002): 44922/44922: 0x7c8046dd3400 period: 120218 addr: 0
... FP chain: nr:12
..... 0: fffffffffffffe00 (PERF_CONTEXT_USER)
..... 1: 00007c8046dd3400
..... 2: 00007c8046db86d3
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ Add PERF_CONTEXT_USER_DEFERRED too, as per Namhyung's review comment ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These events are never countable by the PMU and are only intended to
be used as external inputs to trace. Therefore showing them in 'perf
list' is misleading so remove them.
The generator script doesn't emit these events when used with the new
telemetry-solution input files [1].
'perf list' should only show countable events because there are events
that are sometimes implemented, sometimes countable and sometimes not,
for example TRB_TRIG. If we always include any implemented events
whether they are countable or not then it's not possible to tell whether
they are usable in perf without going to the docs, defeating the point
of 'perf list'.
It's also not useful yet to display implemented events that are not
countable (for help in using trace rather than perf stat), because
PMU_OVFS and PMU_HOVFS are practically always implemented and TRB_TRIG
is always implemented when there is TRBE.
[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/tree/main/data/pmu/cpu
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The correct call-stack option for branch stack sampling should be "stack"
instead of "call_stack". Correct it.
$perf record -e instructions -j call_stack -- sleep 1
unknown branch filter call_stack, check man page
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-j, --branch-filter <branch filter mask>
branch stack filter modes
Fixes: 955f6def5590ce6c ("perf record: Add remaining branch filters: "no_cycles", "no_flags" & "hw_index"")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|