path: root/tools
2022-03-10  selftests: mptcp: join: avoid backquotes  [Matthieu Baerts, 1 file, -32/+34]

As explained on ShellCheck's wiki [1], it is recommended to avoid backquotes `...` in favour of parentheses $(...):

> Backtick command substitution `...` is legacy syntax with several issues.
>
> - It has a series of undefined behaviors related to quoting in POSIX.
> - It imposes a custom escaping mode with surprising results.
> - It's exceptionally hard to nest.
>
> $(...) command substitution has none of these problems, and is therefore strongly encouraged.

[1] https://www.shellcheck.net/wiki/SC2006

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
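To illustrate the nesting point with a generic example (not taken from the script itself): backticks need backslash escaping to nest, while $(...) nests naturally:

  # hard to read, and the inner quoting rules are surprising
  outer=`basename \`dirname "$path"\``

  # equivalent, but nests and quotes predictably
  outer=$(basename "$(dirname "$path")")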
2022-03-10  selftests: mptcp: join: clarify local/global vars  [Matthieu Baerts, 1 file, -52/+83]

Some vars are redefined in different places. Best to avoid this classical Bash pitfall where variables are accidentally overridden by other functions because the proper scope has not been defined.

Most issues are with loops: typically 'i' is used in for-loops, but calling a function from a for-loop that also does a for-loop with the same non-local 'i' variable causes trouble, because the first 'i' will be assigned to another value. To prevent such issues, the iterator variable is now declared as local just before the loop. If it is always done like this, issues are avoided.

To distinguish between local and non-local variables, all non-local ones are defined at the beginning of the script. The others are now defined with the "local" keyword.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
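A minimal sketch of the pitfall (function names are only illustrative):

  inner() { for i in a b c; do :; done; }
  outer() { for i in 1 2 3; do inner; echo "$i"; done; }
  outer   # prints "c c c" instead of "1 2 3": inner() clobbered the caller's 'i'

  # Fix: declare the iterator as local just before the loop
  inner() { local i; for i in a b c; do :; done; }
  outer   # now prints "1 2 3"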
2022-03-10  selftests: mptcp: join: helper to filter TCP  [Matthieu Baerts, 1 file, -3/+12]

This is more readable and reduces duplicated commands. This might also be useful to add v6 support and switch to nftables.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
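A sketch of the kind of wrapper this refers to; the function name and the exact iptables rule are illustrative, not the script's actual helper:

  filter_tcp_from() {
      local ns="${1}" src="${2}" action="${3}"

      ip netns exec "${ns}" iptables -A INPUT -s "${src}" -p tcp -j "${action}"
  }

  # one short call per test instead of a long, repeated iptables command line
  filter_tcp_from "${ns1}" 10.0.1.2 REJECT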
2022-03-10  selftests: mptcp: join: list failure at the end  [Matthieu Baerts, 1 file, -30/+55]

With ~100 tests, it helps to have this summary at the end, so there is no need to scroll to find which one has failed. It is especially interesting when looking at the output produced by the CI, where the kernel logs from the serial console are mixed together.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
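A minimal sketch of such a summary (variable names are illustrative):

  failed_tests=()

  record_result() {
      if [ "${1}" -ne 0 ]; then
          failed_tests+=("${TEST_NAME}")
      fi
  }

  # at the very end of the run
  if [ ${#failed_tests[@]} -gt 0 ]; then
      echo "Failed tests:"
      printf '\t%s\n' "${failed_tests[@]}"
  fi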
2022-03-10  selftests: mptcp: join: alt. to exec specific tests  [Matthieu Baerts, 1 file, -214/+232]

Running a specific test by giving the ID is often what we want: the CI reports an issue with the Nth test, and it is reproducible with:

  ./mptcp_join.sh N

But this might not work when there is a need to find which commit has introduced a regression making a test unstable: failing from time to time. Indeed, a specific test is not attached to one ID: the ID is in fact a counter. It means the same test can have a different ID if other tests have been added or removed before this unstable one.

Remembering the current test can also help with listing failed tests at the end.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10  selftests: mptcp: join: option to execute specific tests  [Matthieu Baerts, 1 file, -752/+877]

Often, only one specific test needs to be run. There are options to run subgroups of tests, but when only one test fails, there is no need to run the whole subgroup. So far, the solution was to edit the script and comment out the tests that are not needed, but that is not ideal.

Now, it is possible to run one specific test by giving the IDs of the tests that are going to be validated, e.g.

  ./mptcp_join.sh 36 37

This is cleaner and saves time.

Technically, the reset* functions now return 0 if the test can be executed. This naturally creates sections per test in the code, which also helps to understand what exactly a test is doing.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
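A sketch of the pattern (variable and function names are illustrative, not the script's exact code): each test starts by calling reset(), which bumps a counter and tells the caller whether that test was selected on the command line:

  reset() {
      TEST_COUNT=$((TEST_COUNT + 1))
      TEST_NAME="${1}"

      # skip unless no IDs were given, or this test's ID was listed
      if [ -n "${only_tests_ids}" ] &&
         ! echo "${only_tests_ids}" | grep -qw "${TEST_COUNT}"; then
          return 1
      fi

      # ... the usual per-test environment setup goes here ...
      return 0
  }

  subflows_tests() {
      reset "single subflow" || return
      # ... the body of this one test ...
  }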
2022-03-10  selftests: mptcp: join: reset failing links  [Matthieu Baerts, 1 file, -2/+6]

Best to always reset this env var before each test to avoid surprising behaviour depending on the order in which tests are run. Clearly setting it for the last "failing links" test is also needed when only this test is executed.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10  selftests: mptcp: join: define tests groups once  [Matthieu Baerts, 1 file, -94/+47]

When adding a new tests group, it has to be defined in multiple places:

  - in the all_tests() function
  - in the 'usage()' function
  - in the getopts: short option + what to do when the option is used

Because it is easy to forget one of them, it is useful to only have to define them once.

Note: only using an associative array would simplify the code, but its entries are stored in a hashtable and iterating over the different items doesn't give the same order as the one used in the declaration of the array. Because we want to run these tests in the same order as before, a "simple" array is used first.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
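A sketch of the idea (structure and names are illustrative): keep a single table of "letter;function;description" entries and derive the usage text, the getopts string and the default run list from it:

  all_tests=(
      "f;subflows_tests;subflows"
      "s;signal_address_tests;signal address"
      "a;add_addr_timeout_tests;add address timeout"
  )

  usage() {
      local entry letter func desc
      for entry in "${all_tests[@]}"; do
          IFS=';' read -r letter func desc <<< "${entry}"
          echo "  -${letter}  ${desc} (${func})"
      done
  }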
2022-03-10  selftests: mptcp: drop msg argument of chk_csum_nr  [Geliang Tang, 1 file, -14/+12]

This patch dropped the msg argument of chk_csum_nr, to unify chk_csum_nr with other chk_*_nr functions.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10  bpftool: Restore support for BPF offload-enabled feature probing  [Niklas Söderlund, 1 file, -13/+139]

Commit 1a56c18e6c2e4e74 ("bpftool: Stop supporting BPF offload-enabled feature probing") removed the support to probe for BPF offload features. This is still something that is useful for NFP NICs, which can support offloading of BPF programs.

The reason for the dropped support was that libbpf starting with v1.0 would drop support for passing the ifindex to the BPF prog/map/helper feature probing APIs. In order to keep this useful feature for NFP, restore the functionality by moving it directly into bpftool.

The code restored is a simplified version of the code that existed in libbpf, which supported passing the ifindex. The simplification is that it only targets the cases where an ifindex is given, and calls into libbpf for the cases where it's not.

Before restoring support for probing offload features:

  # bpftool feature probe dev ens4np0
  Scanning system call availability...
  bpf() syscall is available
  Scanning eBPF program types...
  Scanning eBPF map types...
  Scanning eBPF helper functions...
  eBPF helpers supported for program type sched_cls:
  eBPF helpers supported for program type xdp:
  Scanning miscellaneous eBPF features...
  Large program size limit is NOT available
  Bounded loop support is NOT available
  ISA extension v2 is NOT available
  ISA extension v3 is NOT available

With support for probing offload features restored:

  # bpftool feature probe dev ens4np0
  Scanning system call availability...
  bpf() syscall is available
  Scanning eBPF program types...
  eBPF program_type sched_cls is available
  eBPF program_type xdp is available
  Scanning eBPF map types...
  eBPF map_type hash is available
  eBPF map_type array is available
  Scanning eBPF helper functions...
  eBPF helpers supported for program type sched_cls:
    - bpf_map_lookup_elem
    - bpf_get_prandom_u32
    - bpf_perf_event_output
  eBPF helpers supported for program type xdp:
    - bpf_map_lookup_elem
    - bpf_get_prandom_u32
    - bpf_perf_event_output
    - bpf_xdp_adjust_head
    - bpf_xdp_adjust_tail
  Scanning miscellaneous eBPF features...
  Large program size limit is NOT available
  Bounded loop support is NOT available
  ISA extension v2 is NOT available
  ISA extension v3 is NOT available

Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220310121846.921256-1-niklas.soderlund@corigine.com
2022-03-10  selftests: pmtu.sh: Kill nettest processes launched in subshell.  [Guillaume Nault, 1 file, -2/+12]

When using "run_cmd <command> &", then "$!" refers to the PID of the subshell used to run <command>, not the command itself. Therefore nettest_pids actually doesn't contain the list of the nettest commands running in the background. So cleanup() can't kill them and the nettest processes run until completion (fortunately they have a 5s timeout).

Fix this by defining a new command for running processes in the background, for which "$!" really refers to the PID of the command run.

Also, double quote variables on the modified lines, to avoid shellcheck warnings.

Fixes: ece1278a9b81 ("selftests: net: add ESP-in-UDP PMTU test")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
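The difference, sketched with simplified wrappers (the names follow the description above, but the bodies are illustrative):

  run_cmd() { "$@"; }                 # logging/error-handling wrapper, run in the foreground
  run_cmd_bg() { "$@" 2>/dev/null & }

  run_cmd sleep 30 &                  # $! is the PID of the subshell running run_cmd
  run_cmd_bg sleep 30                 # $! is the PID of 'sleep' itself
  pids="${pids} $!"                   # so cleanup() can actually kill the background command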
2022-03-10  selftests: pmtu.sh: Kill tcpdump processes launched by subshell.  [Guillaume Nault, 1 file, -2/+5]

The cleanup() function takes care of killing processes launched by the test functions. It relies on variables like ${tcpdump_pids} to get the relevant PIDs. But tests are run in their own subshell, so updated *_pids values are invisible to other shells. Therefore cleanup() never sees any process to kill:

  $ ./tools/testing/selftests/net/pmtu.sh -t pmtu_ipv4_exception
  TEST: ipv4: PMTU exceptions                      [ OK ]
  TEST: ipv4: PMTU exceptions - nexthop objects    [ OK ]
  $ pgrep -af tcpdump
  6084 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
  6085 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
  6086 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
  6087 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
  6088 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
  6089 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
  6090 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
  6091 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap
  6228 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
  6229 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
  6230 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
  6231 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
  6232 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
  6233 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
  6234 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
  6235 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap

Fix this by running cleanup() in the context of the test subshell. Now that each test cleans the environment after completion, there's no need for calling cleanup() again when the next test starts. So let's drop it from the setup() function. This is okay because cleanup() is also called when pmtu.sh starts, so even the first test starts in a clean environment.

Also, use tcpdump's immediate mode. Otherwise it might not have time to process buffered packets, resulting in missing packets or even empty pcap files for short tests.

Note: PAUSE_ON_FAIL is still evaluated before cleanup(), so one can still inspect the test environment upon failure when using -p.

Fixes: a92a0a7b8e7c ("selftests: pmtu: Simplify cleanup and namespace names")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
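Roughly what the two changes look like (structure illustrative, not the script's exact code):

  cleanup() {
      kill ${tcpdump_pids} 2>/dev/null
  }

  some_test() {
      # --immediate-mode: don't buffer captured packets, so even short tests
      # get them written to the pcap file
      tcpdump --immediate-mode -s 0 -i lo -w test.pcap &
      tcpdump_pids="${tcpdump_pids} $!"
      # ... the actual checks ...
  }

  run_test() {
      (
          "$1"
          ret=$?
          cleanup      # runs in the same subshell, so it still sees tcpdump_pids
          exit $ret
      )
  }

  run_test some_test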
2022-03-10  selftests/bpf: Add selftest for XDP_REDIRECT in BPF_PROG_RUN  [Toke Høiland-Jørgensen, 2 files, -0/+277]

This adds a selftest for the XDP_REDIRECT facility in BPF_PROG_RUN, that redirects packets into a veth and counts them using an XDP program on the other side of the veth pair and a TC program on the local side of the veth.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-6-toke@redhat.com
2022-03-10  selftests/bpf: Move open_netns() and close_netns() into network_helpers.c  [Toke Høiland-Jørgensen, 3 files, -89/+95]

These will also be used by the xdp_do_redirect test being added in the next commit.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-5-toke@redhat.com
2022-03-10  libbpf: Support batch_size option to bpf_prog_test_run  [Toke Høiland-Jørgensen, 2 files, -1/+3]

Add support for setting the new batch_size parameter to BPF_PROG_TEST_RUN to libbpf; just add it as an option and pass it through to the kernel.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-4-toke@redhat.com
2022-03-10  bpf: Add "live packet" mode for XDP in BPF_PROG_RUN  [Toke Høiland-Jørgensen, 1 file, -0/+3]

This adds support for running XDP programs through BPF_PROG_RUN in a mode that enables live packet processing of the resulting frames. Previous uses of BPF_PROG_RUN for XDP returned the XDP program return code and the modified packet data to userspace, which is useful for unit testing of XDP programs.

The existing BPF_PROG_RUN for XDP allows userspace to set the ingress ifindex and RXQ number as part of the context object being passed to the kernel. This patch reuses that code, but adds a new mode with different semantics, which can be selected with the new BPF_F_TEST_XDP_LIVE_FRAMES flag.

When running BPF_PROG_RUN in this mode, the XDP program return codes will be honoured: returning XDP_PASS will result in the frame being injected into the networking stack as if it came from the selected networking interface, while returning XDP_TX and XDP_REDIRECT will result in the frame being transmitted out that interface. XDP_TX is translated into an XDP_REDIRECT operation to the same interface, since the real XDP_TX action is only possible from within the network drivers themselves, not from the process context where BPF_PROG_RUN is executed.

Internally, this new mode of operation creates a page pool instance while setting up the test run, and feeds pages from that into the XDP program. The setup cost of this is amortised over the number of repetitions specified by userspace.

To support the performance testing use case, we further optimise the setup step so that all pages in the pool are pre-initialised with the packet data, and pre-computed context and xdp_frame objects stored at the start of each page. This makes it possible to entirely avoid touching the page content on each XDP program invocation, and enables sending up to 9 Mpps/core on my test box.

Because the data pages are recycled by the page pool, and the test runner doesn't re-initialise them for each run, subsequent invocations of the XDP program will see the packet data in the state it was after the last time it ran on that particular page. This means that an XDP program that modifies the packet before redirecting it has to be careful about which assumptions it makes about the packet content, but that is only an issue for the most naively written programs.

Enabling the new flag is only allowed when not setting ctx_out and data_out in the test specification, since using it means frames will be redirected somewhere else, so they can't be returned.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220309105346.100053-2-toke@redhat.com
2022-03-09  tools/power/x86/amd_pstate_tracer: Add tracer tool for AMD P-state  [Jinzhou Su, 1 file, -0/+354]

The Intel P-state tracer is a useful tool to tune and debug the Intel P-state driver. The AMD P-state tracer imports the Intel P-state tracer module and can be used to analyze the performance of the AMD P-state driver. Currently, CPU frequency, load and desired perf can be traced.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-09  tools/power/x86/intel_pstate_tracer: make tracer as a module  [Jinzhou Su, 1 file, -131/+129]

Make intel_pstate_tracer a module so that other trace events can import it to analyze their trace data.

Signed-off-by: Jinzhou Su <Jinzhou.Su@amd.com>
Acked-by: Doug Smythies <dsmythies@telus.net>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-09  selftests: mptcp: add implicit endpoint test case  [Paolo Abeni, 2 files, -1/+126]

Ensure implicit endpoints are created when expected and that user-space can update them.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09  mptcp: introduce implicit endpoints  [Paolo Abeni, 1 file, -2/+2]

In some edge scenarios, an MPTCP subflow can use a local address mapped by an "implicit" endpoint created by the in-kernel path manager. Such an endpoint's presence can be confusing, as its creation is hard to track, and it prevents later endpoint creation from user-space using the same address.

Define a new endpoint flag to mark implicit endpoints and allow user-space to replace implicit endpoints with user-provided data at endpoint creation time.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09  mptcp: more careful RM_ADDR generation  [Paolo Abeni, 1 file, -6/+36]

The in-kernel MPTCP path manager, when processing the MPTCP_PM_CMD_FLUSH_ADDR command, generates RM_ADDR events for each known local address. While that is allowed by the RFC, it makes the exact number of RM_ADDR events generated unpredictable when both ends flush the PM addresses.

This change restricts RM_ADDR generation to previously explicitly announced addresses, and adjusts the expected results in a bunch of related self-tests.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09  selftests: mptcp: Rename wait function  [Mat Martineau, 1 file, -2/+2]

The "selftests: mptcp: improve 'fair usage on close' stability" commit changed that self test to check the TcpAttemptFails MIB instead of looking for TW sockets. The associated bash function wasn't renamed in that commit because of the merge conflicts it would cause, so this commit updates the function name as Paolo originally intended.

Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09  selftests: mptcp: join: allow running -cCi  [Matthieu Baerts, 1 file, -39/+28]

Without this patch, no tests would be run when launching:

  mptcp_join.sh -cCi

in any order, or with a combination of two of these letters.

The recommended way with getopt is to first parse all options and then act. This allows some actions to be handled with priority, e.g. displaying the help menu and stopping. It also allows global variables changing the behaviour of this selftest -- like the ones behind the -cCi options -- to be set before running the different tests. By doing that, we can also avoid long and unreadable regexes.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
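The parse-first, act-later structure, roughly (the option letters are the ones mentioned above; the variable names and the final actions are illustrative):

  usage() { echo "usage: $0 [-cCih]"; }

  # parse everything first; the -cCi letters only set global variables
  while getopts "cCih" opt; do
      case "${opt}" in
          c) opt_c=1 ;;
          C) opt_C=1 ;;
          i) opt_i=1 ;;
          h) usage; exit 0 ;;
      esac
  done
  shift $((OPTIND - 1))

  # ...and only now act on them, in a well-defined order, before any test runs
  echo "options: c=${opt_c:-0} C=${opt_C:-0} i=${opt_i:-0}"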
2022-03-09  Improve stability of find_vma BPF test  [Mykola Lysenko, 1 file, -9/+19]

Remove an unneeded sleep and increase the length of the dummy CPU-intensive computation to guarantee test process execution. Also, complete the aforementioned computation as soon as the test success criteria are met.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220308200449.1757478-4-mykolal@fb.com
2022-03-09  Improve send_signal BPF test stability  [Mykola Lysenko, 2 files, -8/+11]

Substitute the sleep with a dummy CPU-intensive computation, and finish that computation as soon as the signal is delivered to the test process. Also make the BPF code only execute when the PID global variable is set.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220308200449.1757478-3-mykolal@fb.com
2022-03-09  Improve perf related BPF tests (sample_freq issue)  [Mykola Lysenko, 4 files, -5/+5]

The Linux kernel may automatically reduce the kernel.perf_event_max_sample_rate value when running tests in parallel on slow systems, and it checks against this limit when opening a perf event with the freq=1 parameter set. The lower bound is 1000.

This patch reduces the sample_freq value to 1000 in all BPF tests that use sample_freq, to ensure they can always open the perf event.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220308200449.1757478-2-mykolal@fb.com
2022-03-09  tools: Fix unavoidable GCC call in Clang builds  [Adrian Ratiu, 1 file, -0/+4]

In ChromeOS and Gentoo we catch any unwanted mixed Clang/LLVM and GCC/binutils usage via toolchain wrappers which fail builds. This has revealed that GCC is called unconditionally in Clang configured builds to populate GCC_TOOLCHAIN_DIR.

Allow the user to override CLANG_CROSS_FLAGS to avoid the GCC call - in our case we set the var directly in the ebuild recipe.

In theory Clang could be able to autodetect these settings so this logic could be removed entirely, but in practice, as the commit cebdb7374577 ("tools: Help cross-building with clang") mentions, this does not always work, so giving distributions more control to specify their flags & sysroot is beneficial.

Suggested-by: Manoj Gupta <manojgupta@chromium.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/lkml/87czjk4osi.fsf@ryzen9.i-did-not-set--mail-host-address--so-tickle-me
Link: https://lore.kernel.org/bpf/20220308121428.81735-1-adrian.ratiu@collabora.com
2022-03-08  KVM: selftests: Add test to populate a VM with the max possible guest mem  [Sean Christopherson, 3 files, -0/+294]

Add a selftest that enables populating a VM with the maximum amount of guest memory allowed by the underlying architecture. Abuse KVM's memslots by mapping a single host memory region into multiple memslots so that the selftest doesn't require a system with terabytes of RAM.

Default to 512gb of guest memory, which isn't all that interesting, but should work on all MMUs and doesn't take an exorbitant amount of memory or time. E.g. testing with ~64tb of guest memory takes the better part of an hour, and requires 200gb of memory for KVM's page tables when using 4kb pages.

To inflict maximum abuse on KVM's MMU, default to 4kb pages (or whatever the not-hugepage size is) in the backing store (memfd). Use memfd for the host backing store to ensure that hugepages are guaranteed when requested, and to give the user explicit control of the size of hugepage being tested.

By default, spin up as many vCPUs as there are available to the selftest, and distribute the work of dirtying each 4kb chunk of memory across all vCPUs. Dirtying guest memory forces KVM to populate its page tables, and also forces KVM to write back accessed/dirty information to struct page when the guest memory is freed.

On x86, perform two passes with a MMU context reset between each pass to coerce KVM into dropping all references to the MMU root, e.g. to emulate a vCPU dropping the last reference. Perform both passes and all rendezvous on all architectures in the hope that arm64 and s390x can gain similar shenanigans in the future.

Measure and report the duration of each operation, which is helpful not only to verify the test is working as intended, but also to easily evaluate the performance differences between different page sizes.

Provide command line options to limit the amount of guest memory, set the size of each slot (i.e. of the host memory region), set the number of vCPUs, and to enable usage of hugepages.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220226001546.360188-29-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-08  KVM: selftests: Define cpu_relax() helpers for s390 and x86  [Sean Christopherson, 2 files, -0/+13]

Add cpu_relax() for s390 and x86 for use in arch-agnostic tests. arm64 already defines its own version.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220226001546.360188-28-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-08  KVM: selftests: Split out helper to allocate guest mem via memfd  [Sean Christopherson, 2 files, -18/+25]

Extract the code for allocating guest memory via memfd out of vm_userspace_mem_region_add() and into a new helper, kvm_memfd_alloc(). A future selftest to populate a guest with the maximum amount of guest memory will abuse KVM's memslots to alias guest memory regions to a single memfd-backed host region, i.e. needs to back a guest with memfd memory without a 1:1 association between a memslot and a memfd instance.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220226001546.360188-27-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-08  KVM: selftests: Move raw KVM_SET_USER_MEMORY_REGION helper to utils  [Sean Christopherson, 3 files, -27/+36]

Move set_memory_region_test's KVM_SET_USER_MEMORY_REGION helper to KVM's utils so that it can be used by other tests. Provide a raw version as well as an assert-success version to reduce the amount of boilerplate code needed for basic usage.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220226001546.360188-26-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-03-08  selftests/bpf: Make test_lwt_ip_encap more stable and faster  [Felix Maurer, 1 file, -1/+9]

In test_lwt_ip_encap, the ingress IPv6 encap test failed from time to time. The failure occurred when an IPv4 ping through the IPv6 GRE encapsulation did not receive a reply within the timeout. The IPv4 ping and the IPv6 ping in the test used different timeouts (1 sec for IPv4 and 6 sec for IPv6), probably taking into account that IPv6 might need longer to successfully complete. However, when IPv4 pings (with the short timeout) are encapsulated into the IPv6 tunnel, the delays of IPv6 apply.

The actual reason for the long delays with IPv6 was that the IPv6 neighbor discovery sometimes did not complete in time. This was caused by the outgoing interface only having a tentative link local address, i.e., not having completed DAD for that lladdr. The ND was successfully retried after 1 sec but that was too late for the ping timeout.

The IPv6 addresses for the test were already added with nodad. However, for the lladdrs, DAD was still performed. We now disable DAD in the test netns completely and just assume that the two lladdrs on each veth pair do not collide. This removes all the delays for IPv6 traffic in the test.

Without the delays, we can now also reduce the delay of the IPv6 ping to 1 sec. This makes the whole test complete faster because we don't need to wait for the excessive timeout for each IPv6 ping that is supposed to fail.

Fixes: 0fde56e4385b0 ("selftests: bpf: add test_lwt_ip_encap selftest")
Signed-off-by: Felix Maurer <fmaurer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/4987d549d48b4e316cd5b3936de69c8d4bc75a4f.1646305899.git.fmaurer@redhat.com
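Disabling DAD for a whole test namespace boils down to a couple of sysctls, roughly like this (the namespace name is illustrative):

  ip netns exec testns sysctl -qw net.ipv6.conf.all.accept_dad=0
  ip netns exec testns sysctl -qw net.ipv6.conf.default.accept_dad=0
  # link-local addresses are then usable immediately instead of sitting
  # in the "tentative" state until duplicate address detection finishes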
2022-03-08  turbostat: fix PC6 displaying on some systems  [Artem Bityutskiy, 1 file, -1/+1]

'MSR_PKG_CST_CONFIG_CONTROL' encodes the deepest allowed package C-state limit, and turbostat decodes it.

Before this patch: turbostat does not recognize value "3" on Ice Lake Xeon (ICX) and Sapphire Rapids Xeon (SPR), treats it as "unknown", and does not display any package C-states in the results table.

After this patch: turbostat recognizes value 3 on ICX and SPR, treats it as "PC6", and correctly displays package C-states in the results table.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08  Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"  [Florian Westphal, 1 file, -3/+2]

This reverts commit 878aed8db324bec64f3c3f956e64d5ae7375a5de.

This change breaks existing setups where conntrack is used with asymmetric paths. In these cases, the NAT transformation occurs on the syn-ack instead of the syn:

  1. SYN x:12345 -> y:443        // sent by initiator, received by responder
  2. SYNACK y:443 -> x:12345     // First packet seen by conntrack, as sent by responder
  3. tuple_force_port_remap() gets called, sees: 'tcp from 443 to port 12345 NAT'
     -> pick a new source port, initiator receives
  4. SYNACK y:$RANDOM -> x:12345 // connection is never established

While it's possible to avoid the breakage with NOTRACK rules, a kernel update should not break working setups.

An alternative to the revert is to augment conntrack to tag mid-stream connections plus more code in the nat core to skip NAT for such connections; however, this leads to more interaction/integration between conntrack and NAT. Therefore, revert; users will need to add explicit nat rules to avoid port shadowing.

Link: https://lore.kernel.org/netfilter-devel/20220302105908.GA5852@breakpoint.cc/#R
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2051413
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-03-08  selftests: tpm: add async space test with nonexisting handle  [Tadeusz Struk, 1 file, -0/+16]

Add a test for /dev/tpmrm0 in async mode that checks if the code handles invalid handles correctly.

Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <linux-integrity@vger.kernel.org>
Cc: <linux-kselftest@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Tadeusz Struk <tstruk@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-03-08  selftests: tpm2: Determine available PCR bank  [Stefan Berger, 2 files, -8/+52]

Determine an available PCR bank to be used by a test case by querying the capability TPM2_GET_CAP. The TPM2 returns TPML_PCR_SELECTIONS, which contains an array of TPMS_PCR_SELECTIONs indicating available PCR banks and the bitmasks that show which PCRs are enabled in each bank. Collect the data in a dictionary. From the dictionary, determine the PCR bank that has the PCRs enabled that the test needs.

This avoids test failures with TPM2s that either do not have a SHA-1 bank or whose SHA-1 bank is disabled.

Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Tested-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2022-03-08  bpf/docs: Update list of architectures supported.  [KP Singh, 1 file, -1/+1]

vmtest.sh also supports s390x now.

Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220307133048.1287644-2-kpsingh@kernel.org
2022-03-08  bpf/docs: Update vmtest docs for static linking  [KP Singh, 1 file, -0/+8]

Dynamic linking when compiling on the host can cause issues when the libc version does not match the one in the VM image. Update the docs to explain how to avoid this by linking statically.

Before:

  ./vmtest.sh -- ./test_progs -t test_ima
  ./test_progs: /usr/lib/libc.so.6: version `GLIBC_2.33' not found (required by ./test_progs)

After:

  LDLIBS=-static ./vmtest.sh -- ./test_progs -t test_ima
  test_ima:OK
  Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

Reported-by: "Geyslan G. Bem" <geyslan@gmail.com>
Signed-off-by: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220307133048.1287644-1-kpsingh@kernel.org
2022-03-08  libbpf: Fix array_size.cocci warning  [Guo Zhengkui, 2 files, -3/+4]

Fix the following coccicheck warning:

  tools/lib/bpf/bpf.c:114:31-32: WARNING: Use ARRAY_SIZE
  tools/lib/bpf/xsk.c:484:34-35: WARNING: Use ARRAY_SIZE
  tools/lib/bpf/xsk.c:485:35-36: WARNING: Use ARRAY_SIZE

It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220306023426.19324-1-guozhengkui@vivo.com
2022-03-08  libbpf: Unmap rings when umem deleted  [lic121, 1 file, -0/+11]

xsk_umem__create() does mmap for fill/comp rings, but xsk_umem__delete() doesn't do the unmap. This works fine for regular cases, because xsk_socket__delete() does unmap for the rings. But for the case that xsk_socket__create_shared() fails, umem rings are not unmapped.

fill_save/comp_save are checked to determine if the rings have already been unmapped by xsk. If fill_save and comp_save are NULL, it means that the rings have already been used by xsk; then they are supposed to be unmapped by xsk_socket__delete(). Otherwise, xsk_umem__delete() does the unmap.

Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices")
Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220301132623.GA19995@vscode.7~
2022-03-08  Merge tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  [Linus Torvalds, 1 file, -1/+1]

Pull x86 spectre fixes from Borislav Petkov:

 - Mitigate Spectre v2-type Branch History Buffer attacks on machines which support eIBRS, i.e., the hardware-assisted speculation restriction after it has been shown that such machines are vulnerable even with the hardware mitigation.

 - Do not use the default LFENCE-based Spectre v2 mitigation on AMD as it is insufficient to mitigate such attacks. Instead, switch to retpolines on all AMD by default.

 - Update the docs and add some warnings for the obviously vulnerable cmdline configurations.

* tag 'x86_bugs_for_v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Warn about eIBRS + LFENCE + Unprivileged eBPF + SMT
  x86/speculation: Warn about Spectre v2 LFENCE mitigation
  x86/speculation: Update link to AMD speculation whitepaper
  x86/speculation: Use generic retpoline by default on AMD
  x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting
  Documentation/hw-vuln: Update spectre doc
  x86/speculation: Add eIBRS + Retpoline options
  x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE
2022-03-08  kselftest/arm64: Log the PIDs of the parent and child in sve-ptrace  [Mark Brown, 1 file, -0/+2]

If the test triggers a problem it may well result in a log message from the kernel such as a WARN() or BUG(). If these include a PID it can help with debugging to know if it was the parent or child process that triggered the issue; since the test is just creating a new thread, the process name will be the same either way. Print the PIDs of the parent and child on startup so users have this information to hand should it be needed.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220303192817.2732509-1-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
2022-03-07  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost  [Linus Torvalds, 2 files, -0/+4]

Pull virtio fixes from Michael Tsirkin:
 "Some last minute fixes that took a while to get ready. Not regressions, but they look safe and seem to be worth to have"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  tools/virtio: handle fallout from folio work
  tools/virtio: fix virtio_test execution
  vhost: remove avail_event arg from vhost_update_avail_event()
  virtio: drop default for virtio-mem
  vdpa: fix use-after-free on vp_vdpa_remove
  virtio-blk: Remove BUG_ON() in virtio_queue_rq()
  virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zero
  vhost: fix hung thread due to erroneous iotlb entries
  vduse: Fix returning wrong type in vduse_domain_alloc_iova()
  vdpa/mlx5: add validation for VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET command
  vdpa/mlx5: should verify CTRL_VQ feature exists for MQ
  vdpa: factor out vdpa_set_features_unlocked for vdpa internal use
  virtio_console: break out of buf poll on remove
  virtio: document virtio_reset_device
  virtio: acknowledge all features before access
  virtio: unexport virtio_finalize_features
2022-03-07  nds32: Remove the architecture  [Alan Kao, 5 files, -37/+0]

The nds32 architecture, also known as AndeStar V3, is a custom 32-bit RISC target designed by Andes Technologies. Support was added to the kernel in 2016 as the replacement RISC-V based V5 processors were already announced, and maintained by (current or former) Andes employees.

As explained by Alan Kao, new customers are now all using RISC-V, and all known nds32 users are already on longterm stable kernels provided by Andes, with no development work going into mainline support any more.

While the port is still in a reasonably good shape, it only gets worse over time without active maintainers, so it seems best to remove it before it becomes unusable. As always, if it turns out that there are mainline users after all, and they volunteer to maintain the port in the future, the removal can be reverted.

Link: https://lore.kernel.org/linux-mm/YhdWNLUhk+x9RAzU@yamatobi.andestech.com/
Link: https://lore.kernel.org/lkml/20220302065213.82702-1-alankao@andestech.com/
Link: https://www.andestech.com/en/products-solutions/andestar-architecture/
Signed-off-by: Alan Kao <alankao@andestech.com>
[arnd: rewrite changelog to provide more background]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-03-07  selftests: net: fix array_size.cocci warning  [Guo Zhengkui, 1 file, -1/+1]

Fix the following coccicheck warning:

  tools/testing/selftests/net/reuseport_bpf_numa.c:89:28-29: WARNING: Use ARRAY_SIZE.

It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-06  tools/virtio: handle fallout from folio work  [Michael S. Tsirkin, 1 file, -0/+3]

just add a stub

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-06  tools/virtio: fix virtio_test execution  [Stefano Garzarella, 1 file, -0/+1]

virtio_test hangs on __vring_new_virtqueue() because `vqs_list_lock` is not initialized. Let's initialize it in vdev_info_init().

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220118150631.167015-1-sgarzare@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-03-06  selftests/bpf: Add a test for btf_type_tag "percpu"  [Hao Luo, 3 files, -29/+215]

Add test for percpu btf_type_tag. Similar to the "user" tag, we test the following cases:

 1. __percpu struct field.
 2. __percpu as function parameter.
 3. per_cpu_ptr() accepts dynamically allocated __percpu memory.

Because the test for "user" and the test for "percpu" are very similar, a little bit of refactoring has been done in btf_tag.c. Basically, both tests share the same function for loading vmlinux and module btf.

Example output from log:

  > ./test_progs -v -t btf_tag
  libbpf: prog 'test_percpu1': BPF program load failed: Permission denied
  libbpf: prog 'test_percpu1': -- BEGIN PROG LOAD LOG --
  ...
  ; g = arg->a;
  1: (61) r1 = *(u32 *)(r1 +0)
  R1 is ptr_bpf_testmod_btf_type_tag_1 access percpu memory: off=0
  ...
  test_btf_type_tag_mod_percpu:PASS:btf_type_tag_percpu 0 nsec
  #26/6 btf_tag/btf_type_tag_percpu_mod1:OK

  libbpf: prog 'test_percpu2': BPF program load failed: Permission denied
  libbpf: prog 'test_percpu2': -- BEGIN PROG LOAD LOG --
  ...
  ; g = arg->p->a;
  2: (61) r1 = *(u32 *)(r1 +0)
  R1 is ptr_bpf_testmod_btf_type_tag_1 access percpu memory: off=0
  ...
  test_btf_type_tag_mod_percpu:PASS:btf_type_tag_percpu 0 nsec
  #26/7 btf_tag/btf_type_tag_percpu_mod2:OK

  libbpf: prog 'test_percpu_load': BPF program load failed: Permission denied
  libbpf: prog 'test_percpu_load': -- BEGIN PROG LOAD LOG --
  ...
  ; g = (__u64)cgrp->rstat_cpu->updated_children;
  2: (79) r1 = *(u64 *)(r1 +48)
  R1 is ptr_cgroup_rstat_cpu access percpu memory: off=48
  ...
  test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu_load 0 nsec
  #26/8 btf_tag/btf_type_tag_percpu_vmlinux_load:OK

  load_btfs:PASS:could not load vmlinux BTF 0 nsec
  test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu 0 nsec
  test_btf_type_tag_vmlinux_percpu:PASS:btf_type_tag_percpu_helper 0 nsec
  #26/9 btf_tag/btf_type_tag_percpu_vmlinux_helper:OK

Signed-off-by: Hao Luo <haoluo@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220304191657.981240-5-haoluo@google.com
2022-03-06  selftests/bpf: Add tests for kfunc register offset checks  [Kumar Kartikeya Dwivedi, 1 file, -0/+83]

Include a few verifier selftests that test against the problems being fixed by previous commits, i.e. release kfunc always require PTR_TO_BTF_ID fixed and var_off to be 0, and negative offset is not permitted and returns a helpful error message.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220304224645.3677453-9-memxor@gmail.com
2022-03-06  bpf: Disallow negative offset in check_ptr_off_reg  [Kumar Kartikeya Dwivedi, 2 files, -5/+5]

check_ptr_off_reg only allows a fixed offset to be set for PTR_TO_BTF_ID, where reg->off < 0 doesn't make sense. This would shift the pointer backwards and fail later in btf_struct_ids_match or btf_struct_walk due to an out of bounds access (since the offset is interpreted as unsigned).

Improve the verifier by rejecting this case with a better error message for BPF helpers and kfuncs, by putting a check inside the check_func_arg_reg_off function. Also, update the existing verifier selftests to work with the new error string.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220304224645.3677453-4-memxor@gmail.com