summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2022-07-19libbpf: fallback to tracefs mount point if debugfs is not mountedAndrii Nakryiko1-21/+40
Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if debugfs (/sys/kernel/debug/tracing) isn't mounted. Acked-by: Yonghong Song <yhs@fb.com> Suggested-by: Connor O'Brien <connoro@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19selftests/bpf: validate .bss section bigger than 8MB is possible nowAndrii Nakryiko2-0/+6
Add a simple big 16MB array and validate access to the very last byte of it to make sure that kernel supports > KMALLOC_MAX_SIZE value_size for BPF array maps (which are backing .bss in this case). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220715053146.1291891-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftestsAndrii Nakryiko3-32/+16
Convert few selftest that used plain SEC("kprobe") with arch-specific syscall wrapper prefix to ksyscall/kretsyscall and corresponding BPF_KSYSCALL macro. test_probe_user.c is especially benefiting from this simplification. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19libbpf: add ksyscall/kretsyscall sections support for syscall kprobesAndrii Nakryiko4-9/+157
Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding kretsyscall variants (for return kprobes) to allow users to kprobe syscall functions in kernel. These special sections allow to ignore complexities and differences between kernel versions and host architectures when it comes to syscall wrapper and corresponding __<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf itself doesn't rely on /proc/config.gz for detecting this, see BPF_KSYSCALL patch for how it's done internally). Combined with the use of BPF_KSYSCALL() macro, this allows to just specify intended syscall name and expected input arguments and leave dealing with all the variations to libbpf. In addition to SEC("ksyscall+") and SEC("kretsyscall+") add bpf_program__attach_ksyscall() API which allows to specify syscall name at runtime and provide associated BPF cookie value. At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not handle all the calling convention quirks for mmap(), clone() and compat syscalls. It also only attaches to "native" syscall interfaces. If host system supports compat syscalls or defines 32-bit syscalls in 64-bit kernel, such syscall interfaces won't be attached to by libbpf. These limitations may or may not change in the future. Therefore it is recommended to use SEC("kprobe") for these syscalls or if working with compat and 32-bit interfaces is required. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALLAndrii Nakryiko2-13/+40
Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to match libbpf's SEC("ksyscall") section name, added in next patch) to use __kconfig variable to determine how to properly fetch syscall arguments. Instead of relying on hard-coded knowledge of whether kernel's architecture uses syscall wrapper or not (which only reflects the latest kernel versions, but is not necessarily true for older kernels and won't necessarily hold for later kernel versions on some particular host architecture), determine this at runtime by attempting to create perf_event (with fallback to kprobe event creation through tracefs on legacy kernels, just like kprobe attachment code is doing) for kernel function that would correspond to bpf() syscall on a system that has CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try '__x64_sys_bpf'). If host kernel uses syscall wrapper, syscall kernel function's first argument is a pointer to struct pt_regs that then contains syscall arguments. In such case we need to use bpf_probe_read_kernel() to fetch actual arguments (which we do through BPF_CORE_READ() macro) from inner pt_regs. But if the kernel doesn't use syscall wrapper approach, input arguments can be read from struct pt_regs directly with no probe reading. All this feature detection is done without requiring /proc/config.gz existence and parsing, and BPF-side helper code uses newly added LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with user-side feature detection of libbpf. BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf")) and SEC("ksyscall") program added in the next patch (which are the same kprobe program with added benefit of libbpf determining correct kernel function name automatically). Kretprobe and kretsyscall (added in next patch) programs don't need BPF_KSYSCALL as they don't provide access to input arguments. Normal BPF_KRETPROBE is completely sufficient and is recommended. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19selftests/bpf: add test of __weak unknown virtual __kconfig externAndrii Nakryiko2-10/+10
Exercise libbpf's logic for unknown __weak virtual __kconfig externs. USDT selftests are already excercising non-weak known virtual extern already (LINUX_HAS_BPF_COOKIE), so no need to add explicit tests for it. Tested-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19libbpf: generalize virtual __kconfig externs and use it for USDTAndrii Nakryiko2-45/+66
Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION. LINUX_KERNEL_VERSION isn't coming from /proc/kconfig.gz and is intead customly filled out by libbpf. This patch generalizes this approach to support more such virtual __kconfig externs. One such extern added in this patch is LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in usdt.bpf.h instead of using CO-RE-based enum detection approach for detecting bpf_get_attach_cookie() BPF helper. This allows to remove otherwise not needed CO-RE dependency and keeps user-space and BPF-side parts of libbpf's USDT support strictly in sync in terms of their feature detection. We'll use similar approach for syscall wrapper detection for BPF_KSYSCALL() BPF-side macro in follow up patch. Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values and LINUX_ for virtual libbpf-backed externs. In the future we might extend the set of prefixes that are supported. This can be done without any breaking changes, as currently any __kconfig extern with unrecognized name is rejected. For LINUX_xxx externs we support the normal "weak rule": if libbpf doesn't recognize given LINUX_xxx extern but such extern is marked as __weak, it is not rejected and defaults to zero. This follows CONFIG_xxx handling logic and will allow BPF applications to opportunistically use newer libbpf virtual externs without breaking on older libbpf versions unnecessarily. Tested-by: Alan Maguire <alan.maguire@oracle.com> Reviewed-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-19tools headers UAPI: Sync linux/kvm.h with the kernel sourcesPaolo Bonzini1-1/+2
Silence this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-19KVM: selftests: Fix target thread to be migrated in rseq_testGavin Shan1-3/+5
In rseq_test, there are two threads, which are vCPU thread and migration worker separately. Unfortunately, the test has the wrong PID passed to sched_setaffinity() in the migration worker. It forces migration on the migration worker because zeroed PID represents the calling thread, which is the migration worker itself. It means the vCPU thread is never enforced to migration and it can migrate at any time, which eventually leads to failure as the following logs show. host# uname -r 5.19.0-rc6-gavin+ host# # cat /proc/cpuinfo | grep processor | tail -n 1 processor : 223 host# pwd /home/gavin/sandbox/linux.main/tools/testing/selftests/kvm host# for i in `seq 1 100`; do \ echo "--------> $i"; ./rseq_test; done --------> 1 --------> 2 --------> 3 --------> 4 --------> 5 --------> 6 ==== Test Assertion Failure ==== rseq_test.c:265: rseq_cpu == cpu pid=3925 tid=3925 errno=4 - Interrupted system call 1 0x0000000000401963: main at rseq_test.c:265 (discriminator 2) 2 0x0000ffffb044affb: ?? ??:0 3 0x0000ffffb044b0c7: ?? ??:0 4 0x0000000000401a6f: _start at ??:? rseq CPU = 4, sched CPU = 27 Fix the issue by passing correct parameter, TID of the vCPU thread, to sched_setaffinity() in the migration worker. Fixes: 61e52f1630f5 ("KVM: selftests: Add a test for KVM_RUN+rseq to detect task migration bugs") Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Message-Id: <20220719020830.3479482-1-gshan@redhat.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-07-18random: remove CONFIG_ARCH_RANDOMJason A. Donenfeld1-1/+0
When RDRAND was introduced, there was much discussion on whether it should be trusted and how the kernel should handle that. Initially, two mechanisms cropped up, CONFIG_ARCH_RANDOM, a compile time switch, and "nordrand", a boot-time switch. Later the thinking evolved. With a properly designed RNG, using RDRAND values alone won't harm anything, even if the outputs are malicious. Rather, the issue is whether those values are being *trusted* to be good or not. And so a new set of options were introduced as the real ones that people use -- CONFIG_RANDOM_TRUST_CPU and "random.trust_cpu". With these options, RDRAND is used, but it's not always credited. So in the worst case, it does nothing, and in the best case, maybe it helps. Along the way, CONFIG_ARCH_RANDOM's meaning got sort of pulled into the center and became something certain platforms force-select. The old options don't really help with much, and it's a bit odd to have special handling for these instructions when the kernel can deal fine with the existence or untrusted existence or broken existence or non-existence of that CPU capability. Simplify the situation by removing CONFIG_ARCH_RANDOM and using the ordinary asm-generic fallback pattern instead, keeping the two options that are actually used. For now it leaves "nordrand" for now, as the removal of that will take a different route. Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-07-17perf trace: Fix SIGSEGV when processing syscall argsNaveen N. Rao1-0/+2
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to process a perf.data file created with 'perf trace record -p': #0 0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492 #1 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492 #2 syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486 #3 0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973 #4 syscall__scnprintf_args <snip> at builtin-trace.c:2041 #5 0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319 That points to the below code in tools/perf/builtin-trace.c: /* * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible * arguments, even if the syscall being handled, say "openat", uses only 4 arguments * this breaks syscall__augmented_args() check for augmented args, as we calculate * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file, * so when handling, say the openat syscall, we end up getting 6 args for the * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly * thinking that the extra 2 u64 args are the augmented filename, so just check * here and avoid using augmented syscalls when the evsel is the raw_syscalls one. */ if (evsel != trace->syscalls.events.sys_enter) augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size); As the comment points out, we should not be trying to augment the args for raw_syscalls. However, when processing a perf.data file, we are not initializing those properly. Fix the same. Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17perf tests: Fix Convert perf time to TSC test for hybridAdrian Hunter1-14/+4
The test does not always correctly determine the number of events for hybrids, nor allow for more than 1 evsel when parsing. Fix by iterating the events actually created and getting the correct evsel for the events processed. Fixes: d9da6f70eb235110 ("perf tests: Support 'Convert perf time to TSC' test for hybrid") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20220713123459.24145-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17perf tests: Stop Convert perf time to TSC test opening events twiceAdrian Hunter1-3/+6
Do not call evlist__open() twice. Fixes: 5bb017d4b97a0f13 ("perf test: Fix error message for test case 71 on s390, where it is not supported") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20220713123459.24145-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+4
To pick up the changes from these csets: 4ad3278df6fe2b08 ("x86/speculation: Disable RRSBA behavior") d7caac991feeef1b ("x86/cpu/amd: Add Spectral Chicken") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ Just silences this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/YtQTm9wsB3hxQWvy@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17tools headers cpufeatures: Sync with the kernel sourcesArnaldo Carvalho de Melo2-3/+30
To pick the changes from: f43b9876e857c739 ("x86/retbleed: Add fine grained Kconfig knobs") a149180fbcf336e9 ("x86: Add magic AMD return-thunk") 15e67227c49a5783 ("x86: Undo return-thunk damage") 369ae6ffc41a3c11 ("x86/retpoline: Cleanup some #ifdefery") 4ad3278df6fe2b08 x86/speculation: Disable RRSBA behavior 26aae8ccbc197223 x86/cpu/amd: Enumerate BTC_NO 9756bba28470722d x86/speculation: Fill RSB on vmexit for IBRS 3ebc170068885b6f x86/bugs: Add retbleed=ibpb 2dbb887e875b1de3 x86/entry: Add kernel IBRS implementation 6b80b59b35557065 x86/bugs: Report AMD retbleed vulnerability a149180fbcf336e9 x86: Add magic AMD return-thunk 15e67227c49a5783 x86: Undo return-thunk damage a883d624aed463c8 x86/cpufeatures: Move RETPOLINE flags to word 11 51802186158c74a0 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug This only causes these perf files to be rebuilt: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h' diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h' diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org Link: https://lore.kernel.org/lkml/YtQM40VmiLTkPND2@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17tools headers UAPI: Sync linux/kvm.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+1
To pick the changes in: 1b870fa5573e260b ("kvm: stats: tell userspace which values are boolean") That just rebuilds perf, as these patches don't add any new KVM ioctl to be harvested for the the 'perf trace' ioctl syscall argument beautifiers. This is also by now used by tools/testing/selftests/kvm/, a simple test build succeeded. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YtQLDvQrBhJNl3n5@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-16selftests: net: arp_ndisc_untracked_subnets: test for arp_accept and ↵Jaehee Park2-0/+309
accept_untracked_na ipv4 arp_accept has a new option '2' to create new neighbor entries only if the src ip is in the same subnet as an address configured on the interface that received the garp message. This selftest tests all options in arp_accept. ipv6 has a sysctl endpoint, accept_untracked_na, that defines the behavior for accepting untracked neighbor advertisements. A new option similar to that of arp_accept for learning only from the same subnet is added to accept_untracked_na. This selftest tests this new feature. Signed-off-by: Jaehee Park <jhpark1013@gmail.com> Suggested-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-15libbpf: perfbuf: Add API to get the ring bufferJon Doron3-0/+33
Add support for writing a custom event reader, by exposing the ring buffer. With the new API perf_buffer__buffer() you will get access to the raw mmaped()'ed per-cpu underlying memory of the ring buffer. This region contains both the perf buffer data and header (struct perf_event_mmap_page), which manages the ring buffer state (head/tail positions, when accessing the head/tail position it's important to take into consideration SMP). With this type of low level access one can implement different types of consumers here are few simple examples where this API helps with: 1. perf_event_read_simple is allocating using malloc, perhaps you want to handle the wrap-around in some other way. 2. Since perf buf is per-cpu then the order of the events is not guarnteed, for example: Given 3 events where each event has a timestamp t0 < t1 < t2, and the events are spread on more than 1 CPU, then we can end up with the following state in the ring buf: CPU[0] => [t0, t2] CPU[1] => [t1] When you consume the events from CPU[0], you could know there is a t1 missing, (assuming there are no drops, and your event data contains a sequential index). So now one can simply do the following, for CPU[0], you can store the address of t0 and t2 in an array (without moving the tail, so there data is not perished) then move on the CPU[1] and set the address of t1 in the same array. So you end up with something like: void **arr[] = [&t0, &t1, &t2], now you can consume it orderely and move the tails as you process in order. 3. Assuming there are multiple CPUs and we want to start draining the messages from them, then we can "pick" with which one to start with according to the remaining free space in the ring buffer. Signed-off-by: Jon Doron <jond@wiz.io> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
2022-07-15tools: runqslower: Build and use lightweight bootstrap version of bpftoolPu Lehui1-4/+3
tools/runqslower use bpftool for vmlinux.h, skeleton, and static linking only. So we can use lightweight bootstrap version of bpftool to handle these, and it will be faster. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220714024612.944071-3-pulehui@huawei.com
2022-07-15LSM: SafeSetID: add setgroups() testing to selftestMicah Morton1-0/+69
Selftest already has support for testing UID and GID transitions. Signed-off-by: Micah Morton <mortonm@chromium.org>
2022-07-15LSM: SafeSetID: add GID testing to selftestMicah Morton1-0/+131
GID security policies were added back in v5.10, update the selftest to reflect this. Signed-off-by: Micah Morton <mortonm@chromium.org>
2022-07-15LSM: SafeSetID: selftest cleanup and prepare for GIDsMicah Morton2-44/+51
Add some notes on how to run the test, update the policy file paths to reflect recent upstream changes, prepare test for adding GID testing. Signed-off-by: Micah Morton <mortonm@chromium.org>
2022-07-15LSM: SafeSetID: fix userns bug in selftestMicah Morton1-1/+1
Not sure how this bug got in here but its been there since the original merge. I think I tested the code on a system that wouldn't let me clone() with CLONE_NEWUSER flag set so had to comment out these test_userns invocations. Trying to map UID 0 inside the userns to UID 0 outside will never work, even with CAP_SETUID. The code is supposed to test whether we can map UID 0 in the userns to the UID of the parent process (the one with CAP_SETUID that is writing the /proc/[pid]/uid_map file). Signed-off-by: Micah Morton <mortonm@chromium.org>
2022-07-15pm-graph v5.9Todd Brandt4-186/+360
bootgraph: - fix parsing of /proc/version to be much more flexible - check kernel version to disallow ftrace on anything older than 4.10 sleepgraph: - include fix to bugzilla 212761 in case it regresses - fix for -proc bug: https://github.com/intel/pm-graph/pull/20 - add -debugtiming arg to get timestamps on prints - allow use of the netfix tool hosted in the github repo - read s0ix data from pmc_core for better debug - include more system data in the output log - Do a better job testing input files useability - flag more error data from dmesg in the timeline - pre-parse the trace log to fix any ordering issues - add new parser to process dmesg only timelines - remove superflous sleep(5) in multitest mode config/custom-timeline-functions.cfg: - change some names to keep up to date README: - new version, small wording changes Signed-off-by: Todd Brandt <todd.e.brandt@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-07-15selftests/bpf: Do not attach kprobe_multi bench to bpf_dispatcher_xdp_funcJiri Olsa1-0/+2
Alexei reported crash by running test_progs -j on system with 32 cpus. It turned out the kprobe_multi bench test that attaches all ftrace-able functions will race with bpf_dispatcher_update, that calls bpf_arch_text_poke on bpf_dispatcher_xdp_func, which is ftrace-able function. Ftrace is not aware of this update so this will cause ftrace_bug with: WARNING: CPU: 6 PID: 1985 at arch/x86/kernel/ftrace.c:94 ftrace_verify_code+0x27/0x50 ... ftrace_replace_code+0xa3/0x170 ftrace_modify_all_code+0xbd/0x150 ftrace_startup_enable+0x3f/0x50 ftrace_startup+0x98/0xf0 register_ftrace_function+0x20/0x60 register_fprobe_ips+0xbb/0xd0 bpf_kprobe_multi_link_attach+0x179/0x430 __sys_bpf+0x18a1/0x2440 ... ------------[ ftrace bug ]------------ ftrace failed to modify [<ffffffff818d9380>] bpf_dispatcher_xdp_func+0x0/0x10 actual: ffffffe9:7b:ffffff9c:77:1e Setting ftrace call site to call ftrace function It looks like we need some way to hide some functions from ftrace, but meanwhile we workaround this by skipping bpf_dispatcher_xdp_func from kprobe_multi bench test. Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220714082316.479181-1-jolsa@kernel.org
2022-07-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski19-39/+511
include/net/sock.h 310731e2f161 ("net: Fix data-races around sysctl_mem.") e70f3c701276 ("Revert "net: set SK_MEM_QUANTUM to 4096"") https://lore.kernel.org/all/20220711120211.7c8b7cba@canb.auug.org.au/ net/ipv4/fib_semantics.c 747c14307214 ("ip: fix dflt addr selection for connected nexthop") d62607c3fe45 ("net: rename reference+tracking helpers") net/tls/tls.h include/net/tls.h 3d8c51b25a23 ("net/tls: Check for errors in tls_device_init") 587903142308 ("tls: create an internal header") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-14selftests/landlock: drop deprecated headers dependencyGuillaume Tucker1-7/+2
The khdr make target has been removed, so drop it from the landlock Makefile dependencies as well as related include paths that are standard for headers in the kernel tree. Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Reported-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: clocksource-switch: adapt to kselftest frameworkWolfram Sang1-3/+5
So we have proper counters at the end of a test. We also print the kselftest header at the end of the test, so we don't mix with the output of the child process. There is only this one test anyhow. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: clocksource-switch: add 'runtime' command line parameterWolfram Sang1-3/+8
So the user can decide how long the test should run. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: clocksource-switch: add command line switch to skip ↵Wolfram Sang1-12/+28
sanity check The sanity check takes a while. If you do repeated checks when debugging, this is time consuming. Add a parameter to skip it. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: clocksource-switch: sort includesWolfram Sang1-5/+5
It is easier to check if you need to add an include if the existing ones are sorted. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: clocksource-switch: fix passing errors from childWolfram Sang1-3/+3
The return value from system() is a waitpid-style integer. Do not return it directly because with the implicit masking in exit() it will always return 0. Access it with appropriate macros to really pass on errors. Fixes: 7290ce1423c3 ("selftests/timers: Add clocksource-switch test from timetest suite") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: inconsistency-check: adapt to kselftest frameworkWolfram Sang1-14/+18
So we have proper counters at the end of a test, e.g.: # Totals: pass:11 fail:0 xfail:0 xpass:0 skip:1 error:0 Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: nanosleep: adapt to kselftest frameworkWolfram Sang1-7/+11
So we have proper counters at the end of a test, e.g.: # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:8 error:0 Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: fix declarations of main()Wolfram Sang4-4/+4
Mixing up argc/argv went unnoticed because they were not used. Still, this is worth fixing. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14selftests: timers: valid-adjtimex: build fix for newer toolchainsWolfram Sang1-1/+1
Toolchains with an include file 'sys/timex.h' based on 3.18 will have a 'clock_adjtime' definition added, so it can't be static in the code: valid-adjtimex.c:43:12: error: static declaration of ‘clock_adjtime’ follows non-static declaration Fixes: e03a58c320e1 ("kselftests: timers: Add adjtimex SETOFFSET validity tests") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: John Stultz <jstultz@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-14Merge tag 'net-5.19-rc7' of ↵Linus Torvalds8-13/+138
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, bpf and wireless. Still no major regressions, the release continues to be calm. An uptick of fixes this time around due to trivial data race fixes and patches flowing down from subtrees. There has been a few driver fixes (particularly a few fixes for false positives due to 66e4c8d95008 which went into -next in May!) that make me worry the wide testing is not exactly fully through. So "calm" but not "let's just cut the final ASAP" vibes over here. Current release - regressions: - wifi: rtw88: fix write to const table of channel parameters Current release - new code bugs: - mac80211: add gfp_t arg to ieeee80211_obss_color_collision_notify - mlx5: - TC, allow offload from uplink to other PF's VF - Lag, decouple FDB selection and shared FDB - Lag, correct get the port select mode str - bnxt_en: fix and simplify XDP transmit path - r8152: fix accessing unset transport header Previous releases - regressions: - conntrack: fix crash due to confirmed bit load reordering (after atomic -> refcount conversion) - stmmac: dwc-qos: disable split header for Tegra194 Previous releases - always broken: - mlx5e: ring the TX doorbell on DMA errors - bpf: make sure mac_header was set before using it - mac80211: do not wake queues on a vif that is being stopped - mac80211: fix queue selection for mesh/OCB interfaces - ip: fix dflt addr selection for connected nexthop - seg6: fix skb checksums for SRH encapsulation/insertion - xdp: fix spurious packet loss in generic XDP TX path - bunch of sysctl data race fixes - nf_log: incorrect offset to network header Misc: - bpf: add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs" * tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) nfp: flower: configure tunnel neighbour on cmsg rx net/tls: Check for errors in tls_device_init MAINTAINERS: Add an additional maintainer to the AMD XGBE driver xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue selftests/net: test nexthop without gw ip: fix dflt addr selection for connected nexthop net: atlantic: remove aq_nic_deinit() when resume net: atlantic: remove deep parameter on suspend/resume functions sfc: fix kernel panic when creating VF seg6: bpf: fix skb checksum in bpf_push_seg6_encap() seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors seg6: fix skb checksum evaluation in SRH encapsulation/insertion sfc: fix use after free when disabling sriov net: sunhme: output link status with a single print. r8152: fix accessing unset transport header net: stmmac: fix leaks in probe net: ftgmac100: Hold reference returned by of_get_child_by_name() nexthop: Fix data-races around nexthop_compat_mode. ipv4: Fix data-races around sysctl_ip_dynaddr. tcp: Fix a data-race around sysctl_tcp_ecn_fallback. ...
2022-07-14selftests/net: test nexthop without gwNicolas Dichtel2-1/+120
This test implement the scenario described in the commit "ip: fix dflt addr selection for connected nexthop". The test configures a nexthop object with an output device only (no gateway address) and a route that uses this nexthop. The goal is to check if the kernel selects a valid source address. Link: https://lore.kernel.org/netdev/20220712095545.10947-1-nicolas.dichtel@6wind.com/ Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20220713114853.29406-2-nicolas.dichtel@6wind.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-14selftests/bpf: Return true/false (not 1/0) from bool functionsLinkui Xiao1-15/+15
Return boolean values ("true" or "false") instead of 1 or 0 from bool functions. This fixes the following warnings from coccicheck: tools/testing/selftests/bpf/progs/test_xdp_noinline.c:407:9-10: WARNING: return of 0/1 in function 'decap_v4' with return type bool tools/testing/selftests/bpf/progs/test_xdp_noinline.c:389:9-10: WARNING: return of 0/1 in function 'decap_v6' with return type bool tools/testing/selftests/bpf/progs/test_xdp_noinline.c:290:9-10: WARNING: return of 0/1 in function 'encap_v6' with return type bool tools/testing/selftests/bpf/progs/test_xdp_noinline.c:264:9-10: WARNING: return of 0/1 in function 'parse_tcp' with return type bool tools/testing/selftests/bpf/progs/test_xdp_noinline.c:242:9-10: WARNING: return of 0/1 in function 'parse_udp' with return type bool Generated by: scripts/coccinelle/misc/boolreturn.cocci Suggested-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Linkui Xiao <xiaolinkui@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20220714015647.25074-1-xiaolinkui@kylinos.cn
2022-07-14libbpf: Fix the name of a reused mapAnquan Wu1-2/+7
BPF map name is limited to BPF_OBJ_NAME_LEN. A map name is defined as being longer than BPF_OBJ_NAME_LEN, it will be truncated to BPF_OBJ_NAME_LEN when a userspace program calls libbpf to create the map. A pinned map also generates a path in the /sys. If the previous program wanted to reuse the map, it can not get bpf_map by name, because the name of the map is only partially the same as the name which get from pinned path. The syscall information below show that map name "process_pinned_map" is truncated to "process_pinned_". bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map", bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory) bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=4,max_entries=1024, map_flags=0, inner_map_fd=0, map_name="process_pinned_",map_ifindex=0, btf_fd=3, btf_key_type_id=6, btf_value_type_id=10,btf_vmlinux_value_type_id=0}, 72) = 4 This patch check that if the name of pinned map are the same as the actual name for the first (BPF_OBJ_NAME_LEN - 1), bpf map still uses the name which is included in bpf object. Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse") Signed-off-by: Anquan Wu <leiqi96@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM
2022-07-13libbpf: Error out when binary_path is NULL for uprobe and USDTHengqi Chen1-6/+7
binary_path is a required non-null parameter for bpf_program__attach_usdt and bpf_program__attach_uprobe_opts. Check it against NULL to prevent coredump on strchr. Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com
2022-07-13selftests: mptcp: add MPC backup testsPaolo Abeni1-0/+30
Add a couple of test-cases covering the newly introduced features - priority update for the MPC subflow. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-13selftests/bpf: add a ksym iter subtestAlan Maguire3-0/+97
add subtest verifying BPF ksym iter behaviour. The BPF ksym iter program shows an example of dumping a format different to /proc/kallsyms. It adds KIND and MAX_SIZE fields which represent the kind of symbol (core kernel, module, ftrace, bpf, or kprobe) and the maximum size the symbol can be. The latter is calculated from the difference between current symbol value and the next symbol value. The key benefit for this iterator will likely be supporting in-kernel data-gathering rather than dumping symbol details to userspace and parsing the results. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/1657629105-7812-3-git-send-email-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-12selftests: tls: add test for NoPad getsockoptJakub Kicinski1-0/+51
Make sure setsockopt / getsockopt behave as expected. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-12selftest: net: add tun to .gitignoreJakub Kicinski1-0/+1
Add missing .gitignore entry. Fixes: 839b92fede7b ("selftest: tun: add test for NAPI dismantle") Link: https://lore.kernel.org/r/20220709024141.321683-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-12Merge tag 'x86_bugs_retbleed' of ↵Linus Torvalds11-24/+372
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 retbleed fixes from Borislav Petkov: "Just when you thought that all the speculation bugs were addressed and solved and the nightmare is complete, here's the next one: speculating after RET instructions and leaking privileged information using the now pretty much classical covert channels. It is called RETBleed and the mitigation effort and controlling functionality has been modelled similar to what already existing mitigations provide" * tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits) x86/speculation: Disable RRSBA behavior x86/kexec: Disable RET on kexec x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry x86/bugs: Add Cannon lake to RETBleed affected CPU list x86/retbleed: Add fine grained Kconfig knobs x86/cpu/amd: Enumerate BTC_NO x86/common: Stamp out the stepping madness KVM: VMX: Prevent RSB underflow before vmenter x86/speculation: Fill RSB on vmexit for IBRS KVM: VMX: Fix IBRS handling after vmexit KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS KVM: VMX: Convert launched argument to flags KVM: VMX: Flatten __vmx_vcpu_run() objtool: Re-add UNWIND_HINT_{SAVE_RESTORE} x86/speculation: Remove x86_spec_ctrl_mask x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit x86/speculation: Fix SPEC_CTRL write on SMT state change x86/speculation: Fix firmware entry SPEC_CTRL handling x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n ...
2022-07-12selftests: drop KSFT_KHDR_INSTALL make targetGuillaume Tucker2-39/+0
Drop the KSFT_KHDR_INSTALL make target now that all use-cases have been removed from the other kselftest Makefiles. Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-12selftests: stop using KSFT_KHDR_INSTALLGuillaume Tucker11-13/+1
Stop using the KSFT_KHDR_INSTALL flag as installing the kernel headers from the kselftest Makefile is causing some issues. Instead, rely on the headers to be installed directly by the top-level Makefile "headers_install" make target prior to building kselftest. Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-12selftests: drop khdr make targetGuillaume Tucker1-25/+2
Drop the "khdr" make target as it fails when the build directory is a sub-directory of the source tree. Rely on the "headers_install" target to have been run first instead. For example, here's a typical error this patch is addressing: $ make O=build -j32 kselftest-gen_tar make[1]: Entering directory '/home/kernelci/linux/build' make --no-builtin-rules INSTALL_HDR_PATH=/home/kernelci/linux/build/usr \ ARCH=x86 -C ../../.. headers_install make[3]: Entering directory '/home/kernelci/linux' Makefile:1022: ../scripts/Makefile.extrawarn: No such file or directory The source directory is determined in the top-level Makefile as ".." relatively to the "build" directory, but then the kselftest Makefile switches to "-C ../../.." so "../scripts" then points one level higher than the source tree e.g. "linux/../scripts" - which fails obviously. There is no other use-case in the kernel tree where a sub-directory Makefile tries to call a top-level make target, and it appears this isn't really a valid thing to do. Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com> Tested-by: Anders Roxell <anders.roxell@linaro.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-12selftest: Taint kernel when test module loadedDavid Gow1-0/+4
Make any kselftest test module (using the kselftest_module framework) taint the kernel with TAINT_TEST on module load. Also mark the module as a test module using MODULE_INFO(test, "Y") so that other tools can tell this is a test module. We can't rely solely on this, though, as these test modules are also often built-in. Finally, update the kselftest documentation to mention that the kernel should be tainted, and how to do so manually (as below). Note that several selftests use kernel modules which are not based on the kselftest_module framework, and so will not automatically taint the kernel. This can be done in two ways: - Moving the module to the tools/testing directory. All modules under this directory will taint the kernel. - Adding the 'test' module property with: MODULE_INFO(test, "Y") Similarly, selftests which do not load modules into the kernel generally should not taint the kernel (or possibly should only do so on failure), as it's assumed that testing from user-space should be safe. Regardless, they can write to /proc/sys/kernel/tainted if required. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Brendan Higgins <brendanhiggins@google.com> Signed-off-by: David Gow <davidgow@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>