summaryrefslogtreecommitdiff
path: root/tools/testing/selftests
AgeCommit message (Collapse)AuthorFilesLines
2022-09-02selftests: net: dsa: symlink the tc_actions.sh testVladimir Oltean3-1/+4
This has been validated on the Ocelot/Felix switch family (NXP LS1028A) and should be relevant to any switch driver that offloads the tc-flower and/or tc-matchall actions trap, drop, accept, mirred, for which DSA has operations. TEST: gact drop and ok (skip_hw) [ OK ] TEST: mirred egress flower redirect (skip_hw) [ OK ] TEST: mirred egress flower mirror (skip_hw) [ OK ] TEST: mirred egress matchall mirror (skip_hw) [ OK ] TEST: mirred_egress_to_ingress (skip_hw) [ OK ] TEST: gact drop and ok (skip_sw) [ OK ] TEST: mirred egress flower redirect (skip_sw) [ OK ] TEST: mirred egress flower mirror (skip_sw) [ OK ] TEST: mirred egress matchall mirror (skip_sw) [ OK ] TEST: trap (skip_sw) [ OK ] TEST: mirred_egress_to_ingress (skip_sw) [ OK ] Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20220831170839.931184-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-02Merge tag 'kvm-s390-master-6.0-1' of ↵Paolo Bonzini11-176/+343
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD PCI interpretation compile fixes
2022-09-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-28/+54
tools/testing/selftests/net/.gitignore sort the net-next version and use it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01selftests/bpf: Test concurrent updates on bpf_task_storage_busyHou Tao2-0/+161
Under full preemptible kernel, task local storage lookup operations on the same CPU may update per-cpu bpf_task_storage_busy concurrently. If the update of bpf_task_storage_busy is not preemption safe, the final value of bpf_task_storage_busy may become not-zero forever and bpf_task_storage_trylock() will always fail. So add a test case to ensure the update of bpf_task_storage_busy is preemption safe. Will skip the test case when CONFIG_PREEMPT is disabled, and it can only reproduce the problem probabilistically. By increasing TASK_STORAGE_MAP_NR_LOOP and running it under ARM64 VM with 4-cpus, it takes about four rounds to reproduce: > test_maps is modified to only run test_task_storage_map_stress_lookup() $ export TASK_STORAGE_MAP_NR_THREAD=256 $ export TASK_STORAGE_MAP_NR_LOOP=81920 $ export TASK_STORAGE_MAP_PIN_CPU=1 $ time ./test_maps test_task_storage_map_stress_lookup(135):FAIL:bad bpf_task_storage_busy got -2 real 0m24.743s user 0m6.772s sys 0m17.966s Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-5-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01selftests/bpf: Move sys_pidfd_open() into task_local_storage_helpers.hHou Tao3-18/+20
sys_pidfd_open() is defined twice in both test_bprm_opts.c and test_local_storage.c, so move it to a common header file. And it will be used in map_tests as well. Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/r/20220901061938.3789460-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-09-01selftests/net: return back io_uring zc send testsPavel Begunkov2-79/+41
Enable io_uring zerocopy send tests back and fix them up to follow the new inteface. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/c8e5018c516093bdad0b6e19f2f9847dea17e4d2.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-01selftests/net: temporarily disable io_uring zc testPavel Begunkov1-0/+9
We're going to change API, to avoid build problems with a couple of following commits, disable io_uring testing. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/12b7507223df04fbd12aa05fc0cb544b51d7ed79.1662027856.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-09-01net-next: Fix IP_UNICAST_IF option behavior for connected socketsRichard Gobert2-2/+44
The IP_UNICAST_IF socket option is used to set the outgoing interface for outbound packets. The IP_UNICAST_IF socket option was added as it was needed by the Wine project, since no other existing option (SO_BINDTODEVICE socket option, IP_PKTINFO socket option or the bind function) provided the needed characteristics needed by the IP_UNICAST_IF socket option. [1] The IP_UNICAST_IF socket option works well for unconnected sockets, that is, the interface specified by the IP_UNICAST_IF socket option is taken into consideration in the route lookup process when a packet is being sent. However, for connected sockets, the outbound interface is chosen when connecting the socket, and in the route lookup process which is done when a packet is being sent, the interface specified by the IP_UNICAST_IF socket option is being ignored. This inconsistent behavior was reported and discussed in an issue opened on systemd's GitHub project [2]. Also, a bug report was submitted in the kernel's bugzilla [3]. To understand the problem in more detail, we can look at what happens for UDP packets over IPv4 (The same analysis was done separately in the referenced systemd issue). When a UDP packet is sent the udp_sendmsg function gets called and the following happens: 1. The oif member of the struct ipcm_cookie ipc (which stores the output interface of the packet) is initialized by the ipcm_init_sk function to inet->sk.sk_bound_dev_if (the device set by the SO_BINDTODEVICE socket option). 2. If the IP_PKTINFO socket option was set, the oif member gets overridden by the call to the ip_cmsg_send function. 3. If no output interface was selected yet, the interface specified by the IP_UNICAST_IF socket option is used. 4. If the socket is connected and no destination address is specified in the send function, the struct ipcm_cookie ipc is not taken into consideration and the cached route, that was calculated in the connect function is being used. Thus, for a connected socket, the IP_UNICAST_IF sockopt isn't taken into consideration. This patch corrects the behavior of the IP_UNICAST_IF socket option for connect()ed sockets by taking into consideration the IP_UNICAST_IF sockopt when connecting the socket. In order to avoid reconnecting the socket, this option is still ignored when applied on an already connected socket until connect() is called again by the Richard Gobert. Change the __ip4_datagram_connect function, which is called during socket connection, to take into consideration the interface set by the IP_UNICAST_IF socket option, in a similar way to what is done in the udp_sendmsg function. [1] https://lore.kernel.org/netdev/1328685717.4736.4.camel@edumazet-laptop/T/ [2] https://github.com/systemd/systemd/issues/11935#issuecomment-618691018 [3] https://bugzilla.kernel.org/show_bug.cgi?id=210255 Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220829111554.GA1771@debian Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01selftests/bpf: Add test cases for htab updateHou Tao3-0/+156
One test demonstrates the reentrancy of hash map update on the same bucket should fail, and another one shows concureently updates of the same hash map bucket should succeed and not fail due to the reentrancy checking for bucket lock. There is no trampoline support on s390x, so move htab_update to denylist. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220831042629.130006-4-houtao@huaweicloud.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-08-31selftest/bpf: Ensure no module loading in bpf_setsockopt(TCP_CONGESTION)Martin KaFai Lau1-0/+4
This patch adds a test to ensure bpf_setsockopt(TCP_CONGESTION, "not_exist") will not trigger the kernel module autoload. Before the fix: [ 40.535829] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274 [...] [ 40.552134] tcp_ca_find_autoload.constprop.0+0xcb/0x200 [ 40.552689] tcp_set_congestion_control+0x99/0x7b0 [ 40.553203] do_tcp_setsockopt+0x3ed/0x2240 [...] [ 40.556041] __bpf_setsockopt+0x124/0x640 Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231953.792412-1-martin.lau@linux.dev
2022-08-31selftests: net: sort .gitignore fileAxel Rasmussen1-25/+25
This is the result of `sort tools/testing/selftests/net/.gitignore`, but preserving the comment at the top. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Link: https://lore.kernel.org/r/20220829184748.1535580-1-axelrasmussen@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31selftests/xsk: Add missing close() on netns fdMaciej Fijalkowski1-0/+4
Commit 1034b03e54ac ("selftests: xsk: Simplify cleanup of ifobjects") removed close on netns fd, which is not correct, so let us restore it. Fixes: 1034b03e54ac ("selftests: xsk: Simplify cleanup of ifobjects") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20220830133905.9945-1-maciej.fijalkowski@intel.com
2022-08-31selftests/nolibc: Avoid generated files being committedFernanda Ma'rouf1-0/+4
After running the nolibc tests, the "git status" is not clean because the generated files are not ignored. Create a `.gitignore` inside the selftests/nolibc directory to ignore them. Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org> Cc: Fernanda Ma'rouf <fernandafmr2@gmail.com> Signed-off-by: Fernanda Ma'rouf <fernandafmr12@gnuweeb.org> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "help" targetWilly Tarreau1-1/+26
It presents the supported targets, and becomes the default target to save the user from having to read the makefile. The "all" target was placed after it and now points to "run" to do everything since it's no longer the default one. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: "sysroot" target installs a local copy of the sysrootWilly Tarreau1-2/+11
It's not convenient to rely on a sysroot built in another directory, especially when running cross-compilation tests, where one has to switch back and forth between directories. Let's make it possible to install the sysroot directly in the test directory. It's not big and even benefits from being copied by arch so that it's easier to switch between archs if needed. The new "sysroot" target does this, it just calls "headers_standalone" from nolibc to install the sysroot right here. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "run" target to start the kernel in QEMUWilly Tarreau1-0/+33
The "run" target will build the kernel and start it in QEMU. The "rerun" target will not have the kernel dependency and will just try to start QEMU. The QEMU architecture used to start the kernel is derived from the configured ARCH. This might need to be improved for archs which include different variants under the same name (mips vs mipsel, +/-64, riscv32 vs riscv64). This could be tested for i386, x86, arm, arm64, mips and riscv (the later two reporting issues on some tests). It is possible to pass a test specification for nolibc-test in the TEST variable, which will be passed as-is as NOLIBC_TEST. On success, the number of successful tests is printed. On failure, failed lines are individually printed. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "defconfig" targetWilly Tarreau1-0/+12
While most archs will work fine with "make defconfig", not all will do, and it's not always easy to remember the most suitable choice to use for a specific architecture. This adds a "defconfig" target to the Makefile so that one may easily run "make -C ... defconfig" and make sure to clean and rebuild a fresh config. This is *not* used by default because we want to preserve the user's config by default. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a "kernel" target to build the kernel with the initramfsWilly Tarreau1-0/+13
The "kernel" target rebuilds the kernel with the current config for the selected arch, with an initramfs containing the nolibc-test utility. Since image names depend on the architecture, the currently supported ones are referenced and resolved based on the architecture. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: support glibc as wellWilly Tarreau1-2/+45
Adding support for glibc can be useful to distinguish between bugs in nolibc and bugs in the kernel when a syscall reports an unusual value. It's not that much work and should not affect the long term maintainability of the tests. The necessary changes can essentially be summed up like this: - set _GNU_SOURCE a the top to access some definitions - many includes added when we know we don't come from nolibc (missing the stdio include guard) - disable gettid() which is not exposed by glibc - disable gettimeofday's support of bad pointers since these crash in glibc - add a simple itoa() for errorname(); strerror() is too verbose (no way to get short messages). strerrorname_np() was added in modern glibc (2.32) to do exactly this but that 's too recent to be usable as the default fallback. - use the standard ioperm() definition. May be we need to implement ioperm() in nolibc if that's useful. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: condition some tests on /proc existenceWilly Tarreau1-5/+9
If /proc is not available (program run inside a chroot or without sufficient permissions), it's better to disable the associated tests. Some will be preserved like the ones which check for a failure to create some entries there since they're still supposed to fail. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: recreate and populate /dev and /proc if missingWilly Tarreau1-0/+56
Most of the time the program will be run alone in an initramfs. There is no value in requiring the user to populate /dev and /proc for such tests, we can do it ourselves, and it participates to the tests at the same time. What's done here is that when called as init (getpid()==1) we check if /dev exists or create it, if /dev/console and /dev/null exists, otherwise we try to mount a devtmpfs there, and if it fails we fall back to mknod. The console is reopened if stdout was closed. Finally /proc is created and mounted if /proc/self cannot be found. This is sufficient for most tests. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: on x86, support exiting with isa-debug-exitWilly Tarreau1-0/+9
QEMU, when started with "-device isa-debug-exit -no-reboot" will exit with status code 2N+1 when N is written to 0x501. This is particularly convenient for automated tests but this is not portable. As such we only enable this on x86_64 when pid==1. In addition, this requires an ioperm() call but in order not to have to define arch-specific syscalls we just perform the syscall by hand there. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: exit with poweroff on success when getpid() == 1Willy Tarreau1-0/+14
The idea is to ease automated testing under qemu. If the test succeeds while running as PID 1, indicating the system was booted with init=/test, let's just power off so that qemu can exit with a successful code. In other situations it will exit and provoke a panic, which may be caught for example with CONFIG_PVPANIC. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add a few tests for some libc functionsWilly Tarreau1-0/+35
The test series called "stdlib" covers some libc functions (string, stdlib etc). By default they are automatically run after "syscall" but may be requested in argument or in variable NOLIBC_TEST. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: implement a few tests for various syscallsWilly Tarreau1-0/+110
This adds 63 tests covering about 34 syscalls. Both successes and failures are tested. Two tests fail when run as unprivileged user (link_dir which returns EACCESS instead of EPERM, and chroot which returns EPERM). One test (execve("/")) expects to fail on EACCESS, but needs to have valid arguments otherwise the kernel will log a message. And a few tests require /proc to be mounted. The code is not pretty since all tests are one-liners, sometimes resulting in long lines, especially when using compount statements to preset a line, but it's convenient and doesn't obfuscate the code, which is important to understand what failed. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: support a test definition formatWilly Tarreau1-0/+91
It now becomes possible to pass a string either in argv[1] or in the NOLIBC_TEST environment variable (the former having precedence), to specify which tests to run. The format is: testname[:range]*[,testname...] Where a range is either a single value or the min and max numbers of the test IDs in a sequence, delimited by a dash. Multiple ranges are possible. This should provide enough flexibility to focus on certain failing parts just by playing with the boot command line in a boot loader or in qemu depending on what is accessible. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31selftests/nolibc: add basic infrastructure to ease creation of nolibc testsWilly Tarreau2-0/+438
This creates a "nolibc" selftest that intends to test various parts of the nolibc component, both in terms of build and execution for a given architecture. The aim is for it to be as simple to run as a kernel build, by just passing the compiler (for the build) and the ARCH (for kernel and execution). It brings a basic squeleton made of a single C file that will ease testing and error reporting. The code will be arranged so that it remains easy to add basic tests for syscalls or library calls that may rely on a condition to be executed, and whose result is compared to a value or to an error with a specific errno value. Tests will just use a relative line number in switch/case statements as an index, saving the user from having to maintain arrays and complicated functions which can often just be one-liners. MAINTAINERS was updated. Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-08-31netfilter: remove nf_conntrack_helper sysctl and modparam togglesPablo Neira Ayuso1-10/+26
__nf_ct_try_assign_helper() remains in place but it now requires a template to configure the helper. A toggle to disable automatic helper assignment was added by: a9006892643a ("netfilter: nf_ct_helper: allow to disable automatic helper assignment") in 2012 to address the issues described in "Secure use of iptables and connection tracking helpers". Automatic conntrack helper assignment was disabled by: 3bb398d925ec ("netfilter: nf_ct_helper: disable automatic helper assignment") back in 2016. This patch removes the sysctl and modparam toggles, users now have to rely on explicit conntrack helper configuration via ruleset. Update tools/testing/selftests/netfilter/nft_conntrack_helper.sh to check that auto-assignment does not happen anymore. Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-29selftests/bpf: Fix connect4_prog tcp/socket header type conflictJames Hilliard1-2/+3
There is a potential for us to hit a type conflict when including netinet/tcp.h and sys/socket.h, we can replace both of these includes with linux/tcp.h and bpf_tcp_helpers.h to avoid this conflict. Fixes errors like the below when compiling with gcc BPF backend: In file included from /usr/include/netinet/tcp.h:91, from progs/connect4_prog.c:11: /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:34:23: error: conflicting types for 'int8_t'; have 'char' 34 | typedef __INT8_TYPE__ int8_t; | ^~~~~~ In file included from /usr/include/x86_64-linux-gnu/sys/types.h:155, from /usr/include/x86_64-linux-gnu/bits/socket.h:29, from /usr/include/x86_64-linux-gnu/sys/socket.h:33, from progs/connect4_prog.c:10: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:24:18: note: previous declaration of 'int8_t' with type 'int8_t' {aka 'signed char'} 24 | typedef __int8_t int8_t; | ^~~~~~ /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:43:24: error: conflicting types for 'int64_t'; have 'long int' 43 | typedef __INT64_TYPE__ int64_t; | ^~~~~~~ /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous declaration of 'int64_t' with type 'int64_t' {aka 'long long int'} 27 | typedef __int64_t int64_t; | ^~~~~~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220829154710.3870139-1-james.hilliard1@gmail.com
2022-08-29selftests/bpf: Fix bind{4,6} tcp/socket header type conflictJames Hilliard2-4/+0
There is a potential for us to hit a type conflict when including netinet/tcp.h with sys/socket.h, we can remove these as they are not actually needed. Fixes errors like the below when compiling with gcc BPF backend: In file included from /usr/include/netinet/tcp.h:91, from progs/bind4_prog.c:10: /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:34:23: error: conflicting types for 'int8_t'; have 'char' 34 | typedef __INT8_TYPE__ int8_t; | ^~~~~~ In file included from /usr/include/x86_64-linux-gnu/sys/types.h:155, from /usr/include/x86_64-linux-gnu/bits/socket.h:29, from /usr/include/x86_64-linux-gnu/sys/socket.h:33, from progs/bind4_prog.c:9: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:24:18: note: previous declaration of 'int8_t' with type 'int8_t' {aka 'signed char'} 24 | typedef __int8_t int8_t; | ^~~~~~ /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:43:24: error: conflicting types for 'int64_t'; have 'long int' 43 | typedef __INT64_TYPE__ int64_t; | ^~~~~~~ /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: note: previous declaration of 'int64_t' with type 'int64_t' {aka 'long long int'} 27 | typedef __int64_t int64_t; | ^~~~~~~ make: *** [Makefile:537: /home/buildroot/bpf-next/tools/testing/selftests/bpf/bpf_gcc/bind4_prog.o] Error 1 Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220826052925.980431-1-james.hilliard1@gmail.com
2022-08-27selftests/bpf: Declare subprog_noise as static in tailcall_bpf2bpf4James Hilliard1-1/+1
Due to bpf_map_lookup_elem being declared static we need to also declare subprog_noise as static. Fixes the following error: progs/tailcall_bpf2bpf4.c:26:9: error: 'bpf_map_lookup_elem' is static but used in inline function 'subprog_noise' which is not static [-Werror] 26 | bpf_map_lookup_elem(&nop_table, &key); | ^~~~~~~~~~~~~~~~~~~ Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20220826035141.737919-1-james.hilliard1@gmail.com
2022-08-27selftests/bpf: fix type conflict in test_tc_dtimeJames Hilliard1-1/+0
The sys/socket.h header isn't required to build test_tc_dtime and may cause a type conflict. Fixes the following error: In file included from /usr/include/x86_64-linux-gnu/sys/types.h:155, from /usr/include/x86_64-linux-gnu/bits/socket.h:29, from /usr/include/x86_64-linux-gnu/sys/socket.h:33, from progs/test_tc_dtime.c:18: /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:24:18: error: conflicting types for 'int8_t'; have '__int8_t' {aka 'signed char'} 24 | typedef __int8_t int8_t; | ^~~~~~ In file included from progs/test_tc_dtime.c:5: /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:34:23: note: previous declaration of 'int8_t' with type 'int8_t' {aka 'char'} 34 | typedef __INT8_TYPE__ int8_t; | ^~~~~~ /usr/include/x86_64-linux-gnu/bits/stdint-intn.h:27:19: error: conflicting types for 'int64_t'; have '__int64_t' {aka 'long long int'} 27 | typedef __int64_t int64_t; | ^~~~~~~ /home/buildroot/opt/cross/lib/gcc/bpf/13.0.0/include/stdint.h:43:24: note: previous declaration of 'int64_t' with type 'int64_t' {aka 'long int'} 43 | typedef __INT64_TYPE__ int64_t; | ^~~~~~~ make: *** [Makefile:537: /home/buildroot/bpf-next/tools/testing/selftests/bpf/bpf_gcc/test_tc_dtime.o] Error 1 Signed-off-by: James Hilliard <james.hilliard1@gmail.com> Link: https://lore.kernel.org/r/20220826050703.869571-1-james.hilliard1@gmail.com Signed-off-by: Martin KaFai Lau <kafai@fb.com>
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-0/+26
Daniel borkmann says: ==================== The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 14 day(s) which contain a total of 13 files changed, 61 insertions(+), 24 deletions(-). The main changes are: 1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi. 2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger. 3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu. 4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui. 5) Fix range tracking for array poke descriptors, from Daniel Borkmann. 6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson. 7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian. 8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima. 9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-26selftests/powerpc: Add a test for execute-only memoryNicholas Miehlbradt2-1/+233
This selftest is designed to cover execute-only protections on the Radix MMU but will also work with Hash. The tests are based on those found in pkey_exec_test with modifications to use the generic mprotect() instead of the pkey variants. Signed-off-by: Nicholas Miehlbradt <nicholas@linux.ibm.com> Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220817050640.406017-2-ruscur@russell.cc
2022-08-26bpf: Add CGROUP prefix to cgroup_iter_orderHao Luo3-7/+7
bpf_cgroup_iter_order is globally visible but the entries do not have CGROUP prefix. As requested by Andrii, put a CGROUP in the names in bpf_cgroup_iter_order. This patch fixes two previous commits: one introduced the API and the other uses the API in bpf selftest (that is, the selftest cgroup_hierarchical_stats). I tested this patch via the following command: test_progs -t cgroup,iter,btf_dump Fixes: d4ccaf58a847 ("bpf: Introduce cgroup iter") Fixes: 88886309d2e8 ("selftests/bpf: add a selftest for cgroup hierarchical stats collection") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220825223936.1865810-1-haoluo@google.com Signed-off-by: Martin KaFai Lau <kafai@fb.com>
2022-08-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski9-4/+131
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c 21234e3a84c7 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") c7eafc5ed068 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer") https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/ https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-26Merge tag 'net-6.0-rc3' of ↵Linus Torvalds5-0/+90
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter (with one broken Fixes tag). Current release - new code bugs: - dsa: don't dereference NULL extack in dsa_slave_changeupper() - dpaa: fix <1G ethernet on LS1046ARDB - neigh: don't call kfree_skb() under spin_lock_irqsave() Previous releases - regressions: - r8152: fix the RX FIFO settings when suspending - dsa: microchip: keep compatibility with device tree blobs with no phy-mode - Revert "net: macsec: update SCI upon MAC address change." - Revert "xfrm: update SA curlft.use_time", comply with RFC 2367 Previous releases - always broken: - netfilter: conntrack: work around exceeded TCP receive window - ipsec: fix a null pointer dereference of dst->dev on a metadata dst in xfrm_lookup_with_ifid - moxa: get rid of asymmetry in DMA mapping/unmapping - dsa: microchip: make learning configurable and keep it off while standalone - ice: xsk: prohibit usage of non-balanced queue id - rxrpc: fix locking in rxrpc's sendmsg Misc: - another chunk of sysctl data race silencing" * tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits) net: lantiq_xrx200: restore buffer if memory allocation failed net: lantiq_xrx200: fix lock under memory pressure net: lantiq_xrx200: confirm skb is allocated before using net: stmmac: work around sporadic tx issue on link-up ionic: VF initial random MAC address if no assigned mac ionic: fix up issues with handling EAGAIN on FW cmds ionic: clear broken state on generation change rxrpc: Fix locking in rxrpc's sendmsg net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2 MAINTAINERS: rectify file entry in BONDING DRIVER i40e: Fix incorrect address type for IPv6 flow rules ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter net: Fix a data-race around sysctl_somaxconn. net: Fix a data-race around netdev_unregister_timeout_secs. net: Fix a data-race around gro_normal_batch. net: Fix data-races around sysctl_devconf_inherit_init_net. net: Fix data-races around sysctl_fb_tunnels_only_for_init_net. net: Fix a data-race around netdev_budget_usecs. net: Fix data-races around sysctl_max_skb_frags. net: Fix a data-race around netdev_budget. ...
2022-08-25selftests/net: fix reinitialization of TEST_PROGS in net self tests.Adel Abouchaev1-1/+1
Assinging will drop all previous tests. Fixes: b690842d12fd ("selftests/net: test l2 tunnel TOS/TTL inheriting") Signed-off-by: Adel Abouchaev <adel.abushaev@gmail.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/r/20220824184351.3759862-1-adel.abushaev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25selftests/bpf: Add regression test for pruning fixKumar Kartikeya Dwivedi1-0/+25
Add a test to ensure we do mark_chain_precision for the argument type ARG_CONST_ALLOC_SIZE_OR_ZERO. For other argument types, this was already done, but propagation for missing for this case. Without the fix, this test case loads successfully. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220823185500.467-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/bpf: add a selftest for cgroup hierarchical stats collectionYosry Ahmed3-0/+584
Add a selftest that tests the whole workflow for collecting, aggregating (flushing), and displaying cgroup hierarchical stats. TL;DR: - Userspace program creates a cgroup hierarchy and induces memcg reclaim in parts of it. - Whenever reclaim happens, vmscan_start and vmscan_end update per-cgroup percpu readings, and tell rstat which (cgroup, cpu) pairs have updates. - When userspace tries to read the stats, vmscan_dump calls rstat to flush the stats, and outputs the stats in text format to userspace (similar to cgroupfs stats). - rstat calls vmscan_flush once for every (cgroup, cpu) pair that has updates, vmscan_flush aggregates cpu readings and propagates updates to parents. - Userspace program makes sure the stats are aggregated and read correctly. Detailed explanation: - The test loads tracing bpf programs, vmscan_start and vmscan_end, to measure the latency of cgroup reclaim. Per-cgroup readings are stored in percpu maps for efficiency. When a cgroup reading is updated on a cpu, cgroup_rstat_updated(cgroup, cpu) is called to add the cgroup to the rstat updated tree on that cpu. - A cgroup_iter program, vmscan_dump, is loaded and pinned to a file, for each cgroup. Reading this file invokes the program, which calls cgroup_rstat_flush(cgroup) to ask rstat to propagate the updates for all cpus and cgroups that have updates in this cgroup's subtree. Afterwards, the stats are exposed to the user. vmscan_dump returns 1 to terminate iteration early, so that we only expose stats for one cgroup per read. - An ftrace program, vmscan_flush, is also loaded and attached to bpf_rstat_flush. When rstat flushing is ongoing, vmscan_flush is invoked once for each (cgroup, cpu) pair that has updates. cgroups are popped from the rstat tree in a bottom-up fashion, so calls will always be made for cgroups that have updates before their parents. The program aggregates percpu readings to a total per-cgroup reading, and also propagates them to the parent cgroup. After rstat flushing is over, all cgroups will have correct updated hierarchical readings (including all cpus and all their descendants). - Finally, the test creates a cgroup hierarchy and induces memcg reclaim in parts of it, and makes sure that the stats collection, aggregation, and reading workflow works as expected. Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-6-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/bpf: extend cgroup helpersYosry Ahmed2-47/+174
This patch extends bpf selft cgroup_helpers [ID] n various ways: - Add enable_controllers() that allows tests to enable all or a subset of controllers for a specific cgroup. - Add join_cgroup_parent(). The cgroup workdir is based on the pid, therefore a spawned child cannot join the same cgroup hierarchy of the test through join_cgroup(). join_cgroup_parent() is used in child processes to join a cgroup under the parent's workdir. - Add write_cgroup_file() and write_cgroup_file_parent() (similar to join_cgroup_parent() above). - Add get_root_cgroup() for tests that need to do checks on root cgroup. - Distinguish relative and absolute cgroup paths in function arguments. Now relative paths are called relative_path, and absolute paths are called cgroup_path. Signed-off-by: Yosry Ahmed <yosryahmed@google.com> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-5-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/bpf: Test cgroup_iter.Hao Luo4-1/+271
Add a selftest for cgroup_iter. The selftest creates a mini cgroup tree of the following structure: ROOT (working cgroup) | PARENT / \ CHILD1 CHILD2 and tests the following scenarios: - invalid cgroup fd. - pre-order walk over descendants from PARENT. - post-order walk over descendants from PARENT. - walk of ancestors from PARENT. - process only a single object (i.e. PARENT). - early termination. Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-3-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25bpf: Introduce cgroup iterHao Luo1-2/+2
Cgroup_iter is a type of bpf_iter. It walks over cgroups in four modes: - walking a cgroup's descendants in pre-order. - walking a cgroup's descendants in post-order. - walking a cgroup's ancestors. - process only the given cgroup. When attaching cgroup_iter, one can set a cgroup to the iter_link created from attaching. This cgroup is passed as a file descriptor or cgroup id and serves as the starting point of the walk. If no cgroup is specified, the starting point will be the root cgroup v2. For walking descendants, one can specify the order: either pre-order or post-order. For walking ancestors, the walk starts at the specified cgroup and ends at the root. One can also terminate the walk early by returning 1 from the iter program. Note that because walking cgroup hierarchy holds cgroup_mutex, the iter program is called with cgroup_mutex held. Currently only one session is supported, which means, depending on the volume of data bpf program intends to send to user space, the number of cgroups that can be walked is limited. For example, given the current buffer size is 8 * PAGE_SIZE, if the program sends 64B data for each cgroup, assuming PAGE_SIZE is 4kb, the total number of cgroups that can be walked is 512. This is a limitation of cgroup_iter. If the output data is larger than the kernel buffer size, after all data in the kernel buffer is consumed by user space, the subsequent read() syscall will signal EOPNOTSUPP. In order to work around, the user may have to update their program to reduce the volume of data sent to output. For example, skip some uninteresting cgroups. In future, we may extend bpf_iter flags to allow customizing buffer size. Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Hao Luo <haoluo@google.com> Link: https://lore.kernel.org/r/20220824233117.1312810-2-haoluo@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/net: Add sk_bind_sendto_listen and sk_connect_zero_addrJoanne Koong4-0/+146
This patch adds 2 new tests: sk_bind_sendto_listen and sk_connect_zero_addr. The sk_bind_sendto_listen test exercises the path where a socket's rcv saddr changes after it has been added to the binding tables, and then a listen() on the socket is invoked. The listen() should succeed. The sk_bind_sendto_listen test is copied over from one of syzbot's tests: https://syzkaller.appspot.com/x/repro.c?x=1673a38df00000 The sk_connect_zero_addr test exercises the path where the socket was never previously added to the binding tables and it gets assigned a saddr upon a connect() to address 0. Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25selftests/net: Add test for timing a bind request to a port with a populated ↵Joanne Koong4-1/+215
bhash entry This test populates the bhash table for a given port with MAX_THREADS * MAX_CONNECTIONS sockets, and then times how long a bind request on the port takes. When populating the bhash table, we create the sockets and then bind the sockets to the same address and port (SO_REUSEADDR and SO_REUSEPORT are set). When timing how long a bind on the port takes, we bind on a different address without SO_REUSEPORT set. We do not set SO_REUSEPORT because we are interested in the case where the bind request does not go through the tb->fastreuseport path, which is fragile (eg tb->fastreuseport path does not work if binding with a different uid). To run the script: Usage: ./bind_bhash.sh [-6 | -4] [-p port] [-a address] 6: use ipv6 4: use ipv4 port: Port number address: ip address Without any arguments, ./bind_bhash.sh defaults to ipv6 using ip address "2001:0db8:0:f101::1" on port 443. On my local machine, I see: ipv4: before - 0.002317 seconds with bhash2 - 0.000020 seconds ipv6: before - 0.002431 seconds with bhash2 - 0.000021 seconds Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25selftests/bpf: Fix wrong size passed to bpf_setsockopt()Yang Yingliang1-3/+7
sizeof(new_cc) is not real memory size that new_cc points to; introduce a new_cc_len to store the size and then pass it to bpf_setsockopt(). Fixes: 31123c0360e0 ("selftests/bpf: bpf_setsockopt tests") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220824013907.380448-1-yangyingliang@huawei.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/bpf: Add cb_refs test to s390x deny listDaniel Müller1-0/+1
The cb_refs BPF selftest is failing execution on s390x machines. This is a newly added test that requires a feature not presently supported on this architecture. Denylist the test for this architecture. Fixes: 3cf7e7d8685c ("selftests/bpf: Add tests for reference state fixes for callbacks") Signed-off-by: Daniel Müller <deso@posteo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220824163906.1186832-1-deso@posteo.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-25selftests/bpf: Add tests for reference state fixes for callbacksKumar Kartikeya Dwivedi2-0/+164
These are regression tests to ensure we don't end up in invalid runtime state for helpers that execute callbacks multiple times. It exercises the fixes to verifier callback handling for reference state in previous patches. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220823013226.24988-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-24selftests/bpf: Make sure bpf_{g,s}et_retval is exposed everywhereStanislav Fomichev4-0/+90
For each hook, have a simple bpf_set_retval(bpf_get_retval) program and make sure it loads for the hooks we want. The exceptions are the hooks which don't propagate the error to the callers: - sockops - recvmsg - getpeername - getsockname - cg_skb ingress and egress Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20220823222555.523590-6-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-23bpf, selftests: Test BPF_FLOW_DISSECTOR_CONTINUEShmulik Ladkani3-0/+44
The dissector program returns BPF_FLOW_DISSECTOR_CONTINUE (and avoids setting skb->flow_keys or last_dissection map) in case it encounters IP packets whose (outer) source address is 127.0.0.127. Additional test is added to prog_tests/flow_dissector.c which sets this address as test's pkk.iph.saddr, with the expected retval of BPF_FLOW_DISSECTOR_CONTINUE. Also, legacy test_flow_dissector.sh was similarly augmented. Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Stanislav Fomichev <sdf@google.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20220821113519.116765-5-shmulik.ladkani@gmail.com