kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-12-02	selftests/bpf: Sort config in alphabetic order	Song Liu	1	-1/+1
	Move CONFIG_VSOCKETS up, so the CONFIGs are in alphabetic order. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20231129234417.856536-5-song@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-01	selftests/bpf: Fix erroneous bitmask operation	Jeroen van Ingen Schenau	1	-2/+2
	xdp_synproxy_kern.c is a BPF program that generates SYN cookies on allowed TCP ports and sends SYNACKs to clients, accelerating synproxy iptables module. Fix the bitmask operation when checking the status of an existing conntrack entry within tcp_lookup() function. Do not AND with the bit position number, but with the bitmask value to check whether the entry found has the IPS_CONFIRMED flag set. Fixes: fb5cd0ce70d4 ("selftests/bpf: Add selftests for raw syncookie helpers") Signed-off-by: Jeroen van Ingen Schenau <jeroen.vaningenschenau@novoserve.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Minh Le Hoang <minh.lehoang@novoserve.com> Link: https://lore.kernel.org/xdp-newbies/CAAi1gX7owA+Tcxq-titC-h-KPM7Ri-6ZhTNMhrnPq5gmYYwKow@mail.gmail.com/T/#u Link: https://lore.kernel.org/bpf/20231130120353.3084-1-jeroen.vaningenschenau@novoserve.com
2023-12-01	Merge tag 'for-netdev' of ↵	Jakub Kicinski	16	-72/+627
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-11-30 We've added 30 non-merge commits during the last 7 day(s) which contain a total of 58 files changed, 1598 insertions(+), 154 deletions(-). The main changes are: 1) Add initial TX metadata implementation for AF_XDP with support in mlx5 and stmmac drivers. Two types of offloads are supported right now, that is, TX timestamp and TX checksum offload, from Stanislav Fomichev with stmmac implementation from Song Yoong Siang. 2) Change BPF verifier logic to validate global subprograms lazily instead of unconditionally before the main program, so they can be guarded using BPF CO-RE techniques, from Andrii Nakryiko. 3) Add BPF link_info support for uprobe multi link along with bpftool integration for the latter, from Jiri Olsa. 4) Use pkg-config in BPF selftests to determine ld flags which is in particular needed for linking statically, from Akihiko Odaki. 5) Fix a few BPF selftest failures to adapt to the upcoming LLVM18, from Yonghong Song. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (30 commits) bpf/tests: Remove duplicate JSGT tests selftests/bpf: Add TX side to xdp_hw_metadata selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP selftests/bpf: Add TX side to xdp_metadata selftests/bpf: Add csum helpers selftests/xsk: Support tx_metadata_len xsk: Add option to calculate TX checksum in SW xsk: Validate xsk_tx_metadata flags xsk: Document tx_metadata_len layout net: stmmac: Add Tx HWTS support to XDP ZC net/mlx5e: Implement AF_XDP TX timestamp and checksum offload tools: ynl: Print xsk-features from the sample xsk: Add TX timestamp and TX checksum offload support xsk: Support tx_metadata_len selftests/bpf: Use pkg-config for libelf selftests/bpf: Override PKG_CONFIG for static builds selftests/bpf: Choose pkg-config for the target bpftool: Add support to display uprobe_multi links selftests/bpf: Add link_info test for uprobe_multi link selftests/bpf: Use bpf_link__destroy in fill_link_info tests ... ==================== Conflicts: Documentation/netlink/specs/netdev.yaml: 839ff60df3ab ("net: page_pool: add nlspec for basic access to page pools") 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") https://lore.kernel.org/all/20231201094705.1ee3cab8@canb.auug.org.au/ While at it also regen, tree is dirty after: 48eb03dd2630 ("xsk: Add TX timestamp and TX checksum offload support") looks like code wasn't re-rendered after "render-max" was removed. Link: https://lore.kernel.org/r/20231130145708.32573-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-11/+47
	Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30	bpf, sockmap: Add af_unix test with both sockets in map	John Fastabend	2	-11/+47
	This adds a test where both pairs of a af_unix paired socket are put into a BPF map. This ensures that when we tear down the af_unix pair we don't have any issues on sockmap side with ordering and reference counting. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20231129012557.95371-3-john.fastabend@gmail.com
2023-11-30	selftests/bpf: Add TX side to xdp_hw_metadata	Stanislav Fomichev	1	-4/+160
	When we get a packet on port 9091, we swap src/dst and send it out. At this point we also request the timestamp and checksum offloads. Checksum offload is verified by looking at the tcpdump on the other side. The tool prints pseudo-header csum and the final one it expects. The final checksum actually matches the incoming packets checksum because we only flip the src/dst and don't change the payload. Some other related changes: - switched to zerocopy mode by default; new flag can be used to force old behavior - request fixed tx_metadata_len headroom - some other small fixes (umem size, fill idx+i, etc) mvbz3:~# ./xdp_hw_metadata eth3 ... xsk_ring_cons__peek: 1 0x19546f8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 rx_hash: 0x80B7EA8B with RSS type:0x2A rx_timestamp: 1697580171852147395 (sec:1697580171.8521) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to User RX-time sec:0.2797 (279673.082 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User RX-time sec:0.0001 (121.430 usec) 0x19546f8: ping-pong with csum=3b8e (want d862) csum_start=54 csum_offset=6 0x19546f8: complete tx idx=0 addr=8 tx_timestamp: 1697580172056756493 (sec:1697580172.0568) HW TX-complete-time: 1697580172056756493 (sec:1697580172.0568), delta to User TX-complete-time sec:0.0852 (85175.537 usec) XDP RX-time: 1697580172131699047 (sec:1697580172.1317), delta to User TX-complete-time sec:0.0102 (10232.983 usec) HW RX-time: 1697580171852147395 (sec:1697580171.8521), delta to HW TX-complete-time sec:0.2046 (204609.098 usec) 0x19546f8: complete rx idx=128 addr=80100 mvbz4:~# nc -Nu -q1 ${MVBZ3_LINK_LOCAL_IP}%eth3 9091 mvbz4:~# tcpdump -vvx -i eth3 udp tcpdump: listening on eth3, link-type EN10MB (Ethernet), snapshot length 262144 bytes 12:26:09.301074 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1087.55807 > fe80::1270:fdff:fe48:1077.9091: [bad udp cksum 0x3b8e -> 0xde7e!] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1087 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1077 d9ff 2383 000b 3b8e 0x0030: 7864 70 12:26:09.301976 IP6 (flowlabel 0x35fa5, hlim 127, next-header UDP (17) payload length: 11) fe80::1270:fdff:fe48:1077.9091 > fe80::1270:fdff:fe48:1087.55807: [udp sum ok] UDP, length 3 0x0000: 6003 5fa5 000b 117f fe80 0000 0000 0000 0x0010: 1270 fdff fe48 1077 fe80 0000 0000 0000 0x0020: 1270 fdff fe48 1087 2383 d9ff 000b de7e 0x0030: 7864 70 Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-14-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-30	selftests/bpf: Convert xdp_hw_metadata to XDP_USE_NEED_WAKEUP	Stanislav Fomichev	1	-24/+49
	This is the recommended way to run AF_XDP, so let's use it in the test. Also, some unrelated changes to now blow up the log too much: - change default mode to zerocopy and add -c to use copy mode - small fixes for the flags/sizes/prints - add print_tstamp_delta to print timestamp + reference Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-13-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-30	selftests/bpf: Add TX side to xdp_metadata	Stanislav Fomichev	1	-4/+29
	Request TX timestamp and make sure it's not empty. Request TX checksum offload (SW-only) and make sure it's resolved to the correct one. Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-12-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-30	selftests/bpf: Add csum helpers	Stanislav Fomichev	1	-0/+43
	Checksum helpers will be used to calculate pseudo-header checksum in AF_XDP metadata selftests. The helpers are mirroring existing kernel ones: - csum_tcpudp_magic : IPv4 pseudo header csum - csum_ipv6_magic : IPv6 pseudo header csum - csum_fold : fold csum and do one's complement Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-11-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-30	selftests/xsk: Support tx_metadata_len	Stanislav Fomichev	2	-0/+4
	Add new config field and propagate to UMEM registration setsockopt. Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127190319.1190813-10-sdf@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-29	selftests/bpf: Use pkg-config for libelf	Akihiko Odaki	1	-2/+6
	When linking statically, libraries may require other dependencies to be included to ld flags. In particular, libelf may require libzstd. Use pkg-config to determine such dependencies. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231125084253.85025-4-akihiko.odaki@daynix.com
2023-11-29	selftests/bpf: Override PKG_CONFIG for static builds	Akihiko Odaki	1	-1/+1
	A library may need to depend on additional archive files for static builds so pkg-config should be instructed to list them. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231125084253.85025-3-akihiko.odaki@daynix.com
2023-11-29	selftests/bpf: Choose pkg-config for the target	Akihiko Odaki	1	-3/+3
	pkg-config is used to build sign-file executable. It should use the library for the target instead of the host as it is called during tests. Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20231125084253.85025-2-akihiko.odaki@daynix.com
2023-11-29	selftests/bpf: Add link_info test for uprobe_multi link	Jiri Olsa	2	-0/+204
	Adding fill_link_info test for uprobe_multi link. Setting up uprobes with bogus ref_ctr_offsets and cookie values to test all the bpf_link_info::uprobe_multi fields. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20231125193130.834322-6-jolsa@kernel.org
2023-11-29	selftests/bpf: Use bpf_link__destroy in fill_link_info tests	Jiri Olsa	1	-21/+23
	The fill_link_info test keeps skeleton open and just creates various links. We are wrongly calling bpf_link__detach after each test to close them, we need to call bpf_link__destroy. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/bpf/20231125193130.834322-5-jolsa@kernel.org
2023-11-29	libbpf: Add st_type argument to elf_resolve_syms_offsets function	Jiri Olsa	1	-1/+1
	We need to get offsets for static variables in following changes, so making elf_resolve_syms_offsets to take st_type value as argument and passing it to elf_sym_iter_new. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/bpf/20231125193130.834322-2-jolsa@kernel.org
2023-11-28	selftests/bpf: update test_offload to use new orphaned property	Stanislav Fomichev	1	-9/+6
	- filter orphaned programs by default - when trying to query orphaned program, don't expect bpftool failure Cc: netdev@vger.kernel.org Signed-off-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231127182057.1081138-2-sdf@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-27	bpf: Fix a few selftest failures due to llvm18 change	Yonghong Song	1	-0/+1
	With latest upstream llvm18, the following test cases failed: $ ./test_progs -j #13/2 bpf_cookie/multi_kprobe_link_api:FAIL #13/3 bpf_cookie/multi_kprobe_attach_api:FAIL #13 bpf_cookie:FAIL #77 fentry_fexit:FAIL #78/1 fentry_test/fentry:FAIL #78 fentry_test:FAIL #82/1 fexit_test/fexit:FAIL #82 fexit_test:FAIL #112/1 kprobe_multi_test/skel_api:FAIL #112/2 kprobe_multi_test/link_api_addrs:FAIL [...] #112 kprobe_multi_test:FAIL #356/17 test_global_funcs/global_func17:FAIL #356 test_global_funcs:FAIL Further analysis shows llvm upstream patch [1] is responsible for the above failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c, without [1], the asm code is: 0000000000000400 <bpf_fentry_test7>: 400: f3 0f 1e fa endbr64 404: e8 00 00 00 00 callq 0x409 <bpf_fentry_test7+0x9> 409: 48 89 f8 movq %rdi, %rax 40c: c3 retq 40d: 0f 1f 00 nopl (%rax) ... and with [1], the asm code is: 0000000000005d20 <bpf_fentry_test7.specialized.1>: 5d20: e8 00 00 00 00 callq 0x5d25 <bpf_fentry_test7.specialized.1+0x5> 5d25: c3 retq ... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7> and this caused test failures for #13/#77 etc. except #356. For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog looks like: 0000000000000000 <global_func17>: 0: b4 00 00 00 2a 00 00 00 w0 = 0x2a 1: 95 00 00 00 00 00 00 00 exit ... which passed verification while the test itself expects a verification failure. Let us add 'barrier_var' style asm code in both places to prevent function specialization which caused selftests failure. [1] https://github.com/llvm/llvm-project/pull/72903 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231127050342.1945270-1-yonghong.song@linux.dev
2023-11-24	selftests/bpf: Add lazy global subprog validation tests	Andrii Nakryiko	2	-0/+94
	Add a few test that validate BPF verifier's lazy approach to validating global subprogs. We check that global subprogs that are called transitively through another global subprog is validated. We also check that invalid global subprog is not validated, if it's not called from the main program. And we also check that main program is always validated first, before any of the subprogs. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231124035937.403208-4-andrii@kernel.org
2023-11-24	bpf: Validate global subprogs lazily	Andrii Nakryiko	2	-4/+4
	Slightly change BPF verifier logic around eagerness and order of global subprog validation. Instead of going over every global subprog eagerly and validating it before main (entry) BPF program is verified, turn it around. Validate main program first, mark subprogs that were called from main program for later verification, but otherwise assume it is valid. Afterwards, go over marked global subprogs and validate those, potentially marking some more global functions as being called. Continue this process until all (transitively) callable global subprogs are validated. It's a BFS traversal at its heart and will always converge. This is an important change because it allows to feature-gate some subprograms that might not be verifiable on some older kernel, depending on supported set of features. E.g., at some point, global functions were allowed to accept a pointer to memory, which size is identified by user-provided type. Unfortunately, older kernels don't support this feature. With BPF CO-RE approach, the natural way would be to still compile BPF object file once and guard calls to this global subprog with some CO-RE check or using .rodata variables. That's what people do to guard usage of new helpers or kfuncs, and any other new BPF-side feature that might be missing on old kernels. That's currently impossible to do with global subprogs, unfortunately, because they are eagerly and unconditionally validated. This patch set aims to change this, so that in the future when global funcs gain new features, those can be guarded using BPF CO-RE techniques in the same fashion as any other new kernel feature. Two selftests had to be adjusted in sync with these changes. test_global_func12 relied on eager global subprog validation failing before main program failure is detected (unknown return value). Fix by making sure that main program is always valid. verifier_subprog_precision's parent_stack_slot_precise subtest relied on verifier checkpointing heuristic to do a checkpoint at instruction #5, but that's no longer true because we don't have enough jumps validated before reaching insn #5 due to global subprogs being validated later. Other than that, no changes, as one would expect. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20231124035937.403208-3-andrii@kernel.org
2023-11-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	9	-208/+617
	Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/ice/ice_main.c c9663f79cd82 ("ice: adjust switchdev rebuild path") 7758017911a4 ("ice: restore timestamp configuration after device reset") https://lore.kernel.org/all/20231121211259.3348630-1-anthony.l.nguyen@intel.com/ Adjacent changes: kernel/bpf/verifier.c bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations") 5f99f312bd3b ("bpf: add register bounds sanity checks and sanitization") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-22	Merge tag 'for-netdev' of ↵	Jakub Kicinski	38	-362/+3004
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-11-21 We've added 85 non-merge commits during the last 12 day(s) which contain a total of 63 files changed, 4464 insertions(+), 1484 deletions(-). The main changes are: 1) Huge batch of verifier changes to improve BPF register bounds logic and range support along with a large test suite, and verifier log improvements, all from Andrii Nakryiko. 2) Add a new kfunc which acquires the associated cgroup of a task within a specific cgroup v1 hierarchy where the latter is identified by its id, from Yafang Shao. 3) Extend verifier to allow bpf_refcount_acquire() of a map value field obtained via direct load which is a use-case needed in sched_ext, from Dave Marchevsky. 4) Fix bpf_get_task_stack() helper to add the correct crosstask check for the get_perf_callchain(), from Jordan Rome. 5) Fix BPF task_iter internals where lockless usage of next_thread() was wrong. The rework also simplifies the code, from Oleg Nesterov. 6) Fix uninitialized tail padding via LIBBPF_OPTS_RESET, and another fix for certain BPF UAPI structs to fix verifier failures seen in bpf_dynptr usage, from Yonghong Song. 7) Add BPF selftest fixes for map_percpu_stats flakes due to per-CPU BPF memory allocator not being able to allocate per-CPU pointer successfully, from Hou Tao. 8) Add prep work around dynptr and string handling for kfuncs which is later going to be used by file verification via BPF LSM and fsverity, from Song Liu. 9) Improve BPF selftests to update multiple prog_tests to use ASSERT_* macros, from Yuran Pereira. 10) Optimize LPM trie lookup to check prefixlen before walking the trie, from Florian Lehner. 11) Consolidate virtio/9p configs from BPF selftests in config.vm file given they are needed consistently across archs, from Manu Bretelle. 12) Small BPF verifier refactor to remove register_is_const(), from Shung-Hsi Yu. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits) selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca selftests/bpf: reduce verboseness of reg_bounds selftest logs bpf: bpf_iter_task_next: use next_task(kit->task) rather than next_task(kit->pos) bpf: bpf_iter_task_next: use __next_thread() rather than next_thread() bpf: task_group_seq_get_next: use __next_thread() rather than next_thread() bpf: emit frameno for PTR_TO_STACK regs if it differs from current one bpf: smarter verifier log number printing logic bpf: omit default off=0 and imm=0 in register state log bpf: emit map name in register state if applicable and available bpf: print spilled register state in stack slot bpf: extract register state printing bpf: move verifier state printing code to kernel/bpf/log.c bpf: move verbose_linfo() into kernel/bpf/log.c bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS bpf: Remove test for MOVSX32 with offset=32 selftests/bpf: add iter test requiring range x range logic veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag ... ==================== Link: https://lore.kernel.org/r/20231122000500.28126-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21	selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in vmlinux	Yuran Pereira	1	-8/+8
	vmlinux.c uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB6563ED1023A2A3AEF30BDA5DE8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
2023-11-21	selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_obj_id	Yuran Pereira	1	-131/+73
	bpf_obj_id uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB65639AA3A10B4BBAA79952C7E8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
2023-11-21	selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bind_perm	Yuran Pereira	1	-4/+2
	bind_perm uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB656314F467E075A106CA02BFE8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
2023-11-21	selftests/bpf: Replaces the usage of CHECK calls for ASSERTs in bpf_tcp_ca	Yuran Pereira	1	-26/+22
	bpf_tcp_ca uses the `CHECK` calls even though the use of ASSERT_ series of macros is preferred in the bpf selftests. This patch replaces all `CHECK` calls for equivalent `ASSERT_` macro calls. Signed-off-by: Yuran Pereira <yuran.pereira@hotmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/GV1PR10MB6563F180C0F2BB4F6CFA5130E8BBA@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM
2023-11-21	selftests/bpf: check if max number of bpf_loop iterations is tracked	Eduard Zingerman	1	-0/+75
	Check that even if bpf_loop() callback simulation does not converge to a specific state, verification could proceed via "brute force" simulation of maximal number of callback calls. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-12-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	bpf: keep track of max number of bpf_loop callback iterations	Eduard Zingerman	1	-10/+25
	In some cases verifier can't infer convergence of the bpf_loop() iteration. E.g. for the following program: static int cb(__u32 idx, struct num_context* ctx) { ctx->i++; return 0; } SEC("?raw_tp") int prog(void *_) { struct num_context ctx = { .i = 0 }; __u8 choice_arr[2] = { 0, 1 }; bpf_loop(2, cb, &ctx, 0); return choice_arr[ctx.i]; } Each 'cb' simulation would eventually return to 'prog' and reach 'return choice_arr[ctx.i]' statement. At which point ctx.i would be marked precise, thus forcing verifier to track multitude of separate states with {.i=0}, {.i=1}, ... at bpf_loop() callback entry. This commit allows "brute force" handling for such cases by limiting number of callback body simulations using 'umax' value of the first bpf_loop() parameter. For this, extend bpf_func_state with 'callback_depth' field. Increment this field when callback visiting state is pushed to states traversal stack. For frame #N it's 'callback_depth' field counts how many times callback with frame depth N+1 had been executed. Use bpf_func_state specifically to allow independent tracking of callback depths when multiple nested bpf_loop() calls are present. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-11-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	selftests/bpf: test widening for iterating callbacks	Eduard Zingerman	1	-0/+20
	A test case to verify that imprecise scalars widening is applied to callback entering state, when callback call is simulated repeatedly. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-10-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	selftests/bpf: tests for iterating callbacks	Eduard Zingerman	2	-0/+149
	A set of test cases to check behavior of callback handling logic, check if verifier catches the following situations: - program not safe on second callback iteration; - program not safe on zero callback iterations; - infinite loop inside a callback. Verify that callback logic works for bpf_loop, bpf_for_each_map_elem, bpf_user_ringbuf_drain, bpf_find_vma. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-8-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	bpf: verify callbacks as if they are called unknown number of times	Eduard Zingerman	3	-14/+60
	Prior to this patch callbacks were handled as regular function calls, execution of callback body was modeled exactly once. This patch updates callbacks handling logic as follows: - introduces a function push_callback_call() that schedules callback body verification in env->head stack; - updates prepare_func_exit() to reschedule callback body verification upon BPF_EXIT; - as calls to bpf_*_iter_next(), calls to callback invoking functions are marked as checkpoints; - is_state_visited() is updated to stop callback based iteration when some identical parent state is found. Paths with callback function invoked zero times are now verified first, which leads to necessity to modify some selftests: - the following negative tests required adding release/unlock/drop calls to avoid previously masked unrelated error reports: - cb_refs.c:underflow_prog - exceptions_fail.c:reject_rbtree_add_throw - exceptions_fail.c:reject_with_cp_reference - the following precision tracking selftests needed change in expected log trace: - verifier_subprog_precision.c:callback_result_precise (note: r0 precision is no longer propagated inside callback and I think this is a correct behavior) - verifier_subprog_precision.c:parent_callee_saved_reg_precise_with_callback - verifier_subprog_precision.c:parent_stack_slot_precise_with_callback Reported-by: Andrew Werner <awerner32@gmail.com> Closes: https://lore.kernel.org/bpf/CA+vRuzPChFNXmouzGG+wsy=6eMcfr1mFG0F3g7rbg-sedGKW3w@mail.gmail.com/ Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-7-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	selftests/bpf: fix bpf_loop_bench for new callback verification scheme	Eduard Zingerman	1	-5/+8
	This is a preparatory change. A follow-up patch "bpf: verify callbacks as if they are called unknown number of times" changes logic for callbacks handling. While previously callbacks were verified as a single function call, new scheme takes into account that callbacks could be executed unknown number of times. This has dire implications for bpf_loop_bench: SEC("fentry/" SYS_PREFIX "sys_getpgid") int benchmark(void *ctx) { for (int i = 0; i < 1000; i++) { bpf_loop(nr_loops, empty_callback, NULL, 0); __sync_add_and_fetch(&hits, nr_loops); } return 0; } W/o callbacks change verifier sees it as a 1000 calls to empty_callback(). However, with callbacks change things become exponential: - i=0: state exploring empty_callback is scheduled with i=0 (a); - i=1: state exploring empty_callback is scheduled with i=1; ... - i=999: state exploring empty_callback is scheduled with i=999; - state (a) is popped from stack; - i=1: state exploring empty_callback is scheduled with i=1; ... Avoid this issue by rewriting outer loop as bpf_loop(). Unfortunately, this adds a function call to a loop at runtime, which negatively affects performance: throughput latency before: 149.919 ± 0.168 M ops/s, 6.670 ns/op after : 137.040 ± 0.187 M ops/s, 7.297 ns/op Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	selftests/bpf: track string payload offset as scalar in strobemeta	Eduard Zingerman	1	-30/+48
	This change prepares strobemeta for update in callbacks verification logic. To allow bpf_loop() verification converge when multiple callback iterations are considered: - track offset inside strobemeta_payload->payload directly as scalar value; - at each iteration make sure that remaining strobemeta_payload->payload capacity is sufficient for execution of read_{map,str}_var functions; - make sure that offset is tracked as unbound scalar between iterations, otherwise verifier won't be able infer that bpf_loop callback reaches identical states. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-21	selftests/bpf: track tcp payload offset as scalar in xdp_synproxy	Eduard Zingerman	1	-32/+52
	This change prepares syncookie_{tc,xdp} for update in callbakcs verification logic. To allow bpf_loop() verification converge when multiple callback itreations are considered: - track offset inside TCP payload explicitly, not as a part of the pointer; - make sure that offset does not exceed MAX_PACKET_OFF enforced by verifier; - make sure that offset is tracked as unbound scalar between iterations, otherwise verifier won't be able infer that bpf_loop callback reaches identical states. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231121020701.26440-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-20	selftests/bpf: reduce verboseness of reg_bounds selftest logs	Andrii Nakryiko	1	-11/+21
	Reduce verboseness of test_progs' output in reg_bounds set of tests with two changes. First, instead of each different operator (<, <=, >, ...) being it's own subtest, combine all different ops for the same (x, y, init_t, cond_t) values into single subtest. Instead of getting 6 subtests, we get one generic one, e.g.: #192/53 reg_bounds_crafted/(s64)[0xffffffffffffffff; 0] (s64)<op> 0xffffffff00000000:OK Second, for random generated test cases, treat all of them as a single test to eliminate very verbose output with random values in them. So now we'll just get one line per each combination of (init_t, cond_t), instead of 6 x 25 = 150 subtests before this change: #225 reg_bounds_rand_consts_s32_s32:OK Given we reduce verboseness so much, it makes sense to do a bit more random testing, so we also bump default number of random tests to 100, up from 25. This doesn't increase runtime significantly, especially in parallelized mode. With all the above changes we still make sure that we have all the information necessary for reproducing test case if it happens to fail. That includes reporting random seed and specific operator that is failing. Those will only be printed to console if related test/subtest fails, so it doesn't have any added verboseness implications. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231120180452.145849-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-20	selftests/bpf: Add netkit to tc_redirect selftest	Daniel Borkmann	1	-0/+52
	Extend the existing tc_redirect selftest to also cover netkit devices for exercising the bpf_redirect_peer() code paths, so that we have both veth as well as netkit covered, all tests still pass after this change. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20231114004220.6495-9-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	selftests/bpf: De-veth-ize the tc_redirect test case	Daniel Borkmann	1	-126/+137
	No functional changes to the test case, but just renaming various functions, variables, etc, to remove veth part of their name for making it more generic and reusable later on (e.g. for netkit). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20231114004220.6495-8-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-18	bpf: smarter verifier log number printing logic	Andrii Nakryiko	2	-35/+50
	Instead of always printing numbers as either decimals (and in some cases, like for "imm=%llx", in hexadecimals), decide the form based on actual values. For numbers in a reasonably small range (currently, [0, U16_MAX] for unsigned values, and [S16_MIN, S16_MAX] for signed ones), emit them as decimals. In all other cases, even for signed values, emit them in hexadecimals. For large values hex form is often times way more useful: it's easier to see an exact difference between 0xffffffff80000000 and 0xffffffff7fffffff, than between 18446744071562067966 and 18446744071562067967, as one particular example. Small values representing small pointer offsets or application constants, on the other hand, are way more useful to be represented in decimal notation. Adjust reg_bounds register state parsing logic to take into account this change. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231118034623.3320920-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-18	bpf: omit default off=0 and imm=0 in register state log	Andrii Nakryiko	4	-35/+35
	Simplify BPF verifier log further by omitting default (and frequently irrelevant) off=0 and imm=0 parts for non-SCALAR_VALUE registers. As can be seen from fixed tests, this is often a visual noise for PTR_TO_CTX register and even for PTR_TO_PACKET registers. Omitting default values follows the rest of register state logic: we omit default values to keep verifier log succinct and to highlight interesting state that deviates from default one. E.g., we do the same for var_off, when it's unknown, which gives no additional information. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231118034623.3320920-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-18	bpf: emit map name in register state if applicable and available	Andrii Nakryiko	1	-5/+5
	In complicated real-world applications, whenever debugging some verification error through verifier log, it often would be very useful to see map name for PTR_TO_MAP_VALUE register. Usually this needs to be inferred from key/value sizes and maybe trying to guess C code location, but it's not always clear. Given verifier has the name, and it's never too long, let's just emit it for ptr_to_map_key, ptr_to_map_value, and const_ptr_to_map registers. We reshuffle the order a bit, so that map name, key size, and value size appear before offset and immediate values, which seems like a more logical order. Current output: R1_w=map_ptr(map=array_map,ks=4,vs=8,off=0,imm=0) But we'll get rid of useless off=0 and imm=0 parts in the next patch. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231118034623.3320920-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-17	bpf: rename BPF_F_TEST_SANITY_STRICT to BPF_F_TEST_REG_INVARIANTS	Andrii Nakryiko	8	-18/+17
	Rename verifier internal flag BPF_F_TEST_SANITY_STRICT to more neutral BPF_F_TEST_REG_INVARIANTS. This is a follow up to [0]. A few selftests and veristat need to be adjusted in the same patch as well. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231112010609.848406-5-andrii@kernel.org/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231117171404.225508-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-16	Merge tag 'net-6.7-rc2' of ↵	Linus Torvalds	6	-17/+127
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from BPF and netfilter. Current release - regressions: - core: fix undefined behavior in netdev name allocation - bpf: do not allocate percpu memory at init stage - netfilter: nf_tables: split async and sync catchall in two functions - mptcp: fix possible NULL pointer dereference on close Current release - new code bugs: - eth: ice: dpll: fix initial lock status of dpll Previous releases - regressions: - bpf: fix precision backtracking instruction iteration - af_unix: fix use-after-free in unix_stream_read_actor() - tipc: fix kernel-infoleak due to uninitialized TLV value - eth: bonding: stop the device in bond_setup_by_slave() - eth: mlx5: - fix double free of encap_header - avoid referencing skb after free-ing in drop path - eth: hns3: fix VF reset - eth: mvneta: fix calls to page_pool_get_stats Previous releases - always broken: - core: set SOCK_RCU_FREE before inserting socket into hashtable - bpf: fix control-flow graph checking in privileged mode - eth: ppp: limit MRU to 64K - eth: stmmac: avoid rx queue overrun - eth: icssg-prueth: fix error cleanup on failing initialization - eth: hns3: fix out-of-bounds access may occur when coalesce info is read via debugfs - eth: cortina: handle large frames Misc: - selftests: gso: support CONFIG_MAX_SKB_FRAGS up to 45" * tag 'net-6.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (78 commits) macvlan: Don't propagate promisc change to lower dev in passthru net: sched: do not offload flows with a helper in act_ct net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors net/mlx5e: Check return value of snprintf writing to fw_version buffer net/mlx5e: Reduce the size of icosq_str net/mlx5: Increase size of irq name buffer net/mlx5e: Update doorbell for port timestamping CQ before the software counter net/mlx5e: Track xmit submission to PTP WQ after populating metadata map net/mlx5e: Avoid referencing skb after free-ing in drop path of mlx5e_sq_xmit_wqe net/mlx5e: Don't modify the peer sent-to-vport rules for IPSec offload net/mlx5e: Fix pedit endianness net/mlx5e: fix double free of encap_header in update funcs net/mlx5e: fix double free of encap_header net/mlx5: Decouple PHC .adjtime and .adjphase implementations net/mlx5: DR, Allow old devices to use multi destination FTE net/mlx5: Free used cpus mask when an IRQ is released Revert "net/mlx5: DR, Supporting inline WQE when possible" bpf: Do not allocate percpu memory at init stage net: Fix undefined behavior in netdev name allocation dt-bindings: net: ethernet-controller: Fix formatting error ...
2023-11-15	selftests/bpf: add iter test requiring range x range logic	Andrii Nakryiko	1	-0/+22
	Add a simple verifier test that requires deriving reg bounds for one register from another register that's not a constant. This is a realistic example of iterating elements of an array with fixed maximum number of elements, but smaller actual number of elements. This small example was an original motivation for doing this whole patch set in the first place, yes. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231112010609.848406-14-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	veristat: add ability to set BPF_F_TEST_SANITY_STRICT flag with -r flag	Andrii Nakryiko	2	-3/+11
	Add a new flag -r (--test-sanity), similar to -t (--test-states), to add extra BPF program flags when loading BPF programs. This allows to use veristat to easily catch sanity violations in production BPF programs. reg_bounds tests are also enforcing BPF_F_TEST_SANITY_STRICT flag now. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231112010609.848406-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	selftests/bpf: set BPF_F_TEST_SANITY_SCRIPT by default	Andrii Nakryiko	6	-13/+33
	Make sure to set BPF_F_TEST_SANITY_STRICT program flag by default across most verifier tests (and a bunch of others that set custom prog flags). There are currently two tests that do fail validation, if enforced strictly: verifier_bounds/crossing_64_bit_signed_boundary_2 and verifier_bounds/crossing_32_bit_signed_boundary_2. To accommodate them, we teach test_loader a flag negation: __flag(!<flagname>) will clear specified flag, allowing easy opt-out. We apply __flag(!BPF_F_TEST_SANITY_STRICT) to these to tests. Also sprinkle BPF_F_TEST_SANITY_STRICT everywhere where we already set test-only BPF_F_TEST_RND_HI32 flag, for completeness. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231112010609.848406-12-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	selftests/bpf: add randomized reg_bounds tests	Andrii Nakryiko	1	-7/+159
	Add random cases generation to reg_bounds.c and run them without SLOW_TESTS=1 to increase a chance of BPF CI catching latent issues. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231112010609.848406-11-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	selftests/bpf: add range x range test to reg_bounds	Andrii Nakryiko	1	-0/+86
	Now that verifier supports range vs range bounds adjustments, validate that by checking each generated range against every other generated range, across all supported operators (everything by JSET). We also add few cases that were problematic during development either for verifier or for selftest's range tracking implementation. Note that we utilize the same trick with splitting everything into multiple independent parallelizable tests, but init_t and cond_t. This brings down verification time in parallel mode from more than 8 hours down to less that 1.5 hours. 106 million cases were successfully validate for range vs range logic, in addition to about 7 million range vs const cases, added in earlier patch. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231112010609.848406-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	selftests/bpf: adjust OP_EQ/OP_NE handling to use subranges for branch taken	Andrii Nakryiko	1	-4/+26
	Similar to kernel-side BPF verifier logic enhancements, use 32-bit subrange knowledge for is_branch_taken() logic in reg_bounds selftests. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20231112010609.848406-9-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-15	selftests/bpf: BPF register range bounds tester	Andrii Nakryiko	1	-0/+1838
	Add test to validate BPF verifier's register range bounds tracking logic. The main bulk is a lot of auto-generated tests based on a small set of seed values for lower and upper 32 bits of full 64-bit values. Currently we validate only range vs const comparisons, but the idea is to start validating range over range comparisons in subsequent patch set. When setting up initial register ranges we treat registers as one of u64/s64/u32/s32 numeric types, and then independently perform conditional comparisons based on a potentially different u64/s64/u32/s32 types. This tests lots of tricky cases of deriving bounds information across different numeric domains. Given there are lots of auto-generated cases, we guard them behind SLOW_TESTS=1 envvar requirement, and skip them altogether otherwise. With current full set of upper/lower seed value, all supported comparison operators and all the combinations of u64/s64/u32/s32 number domains, we get about 7.7 million tests, which run in about 35 minutes on my local qemu instance without parallelization. But we also split those tests by init/cond numeric types, which allows to rely on test_progs's parallelization of tests with `-j` option, getting run time down to about 5 minutes on 8 cores. It's still something that shouldn't be run during normal test_progs run. But we can run it a reasonable time, and so perhaps a nightly CI test run (once we have it) would be a good option for this. We also add a small set of tricky conditions that came up during development and triggered various bugs or corner cases in either selftest's reimplementation of range bounds logic or in verifier's logic itself. These are fast enough to be run as part of normal test_progs test run and are great for a quick sanity checking. Let's take a look at test output to understand what's going on: $ sudo ./test_progs -t reg_bounds_crafted #191/1 reg_bounds_crafted/(u64)[0; 0xffffffff] (u64)< 0:OK ... #191/115 reg_bounds_crafted/(u64)[0; 0x17fffffff] (s32)< 0:OK ... #191/137 reg_bounds_crafted/(u64)[0xffffffff; 0x100000000] (u64)== 0:OK Each test case is uniquely and fully described by this generated string. E.g.: "(u64)[0; 0x17fffffff] (s32)< 0". This means that we initialize a register (R6) in such a way that verifier knows that it can have a value in [(u64)0; (u64)0x17fffffff] range. Another register (R7) is also set up as u64, but this time a constant (zero in this case). They then are compared using 32-bit signed < operation. Resulting TRUE/FALSE branches are evaluated (including cases where it's known that one of the branches will never be taken, in which case we validate that verifier also determines this as a dead code). Test validates that verifier's final register state matches expected state based on selftest's own reg_state logic, implemented from scratch for cross-checking purposes. These test names can be conveniently used for further debugging, and if -vv verboseness is requested we can get a corresponding verifier log (with mark_precise logs filtered out as irrelevant and distracting). Example below is slightly redacted for brevity, omitting irrelevant register output in some places, marked with [...]. $ sudo ./test_progs -a 'reg_bounds_crafted/(u32)[0; U32_MAX] (s32)< -1' -vv ... VERIFIER LOG: ======================== func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (05) goto pc+2 3: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 4: (bc) w6 = w0 ; R0_w=scalar() R6_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 5: (85) call bpf_get_current_pid_tgid#14 ; R0_w=scalar() 6: (bc) w7 = w0 ; R0_w=scalar() R7_w=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 7: (b4) w1 = 0 ; R1_w=0 8: (b4) w2 = -1 ; R2=4294967295 9: (ae) if w6 < w1 goto pc-9 9: R1=0 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 10: (2e) if w6 > w2 goto pc-10 10: R2=4294967295 R6=scalar(smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 11: (b4) w1 = -1 ; R1_w=4294967295 12: (b4) w2 = -1 ; R2_w=4294967295 13: (ae) if w7 < w1 goto pc-13 ; R1_w=4294967295 R7=4294967295 14: (2e) if w7 > w2 goto pc-14 14: R2_w=4294967295 R7=4294967295 15: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,var_off=(0x0; 0xffffffff)) 16: (bc) w0 = w7 ; [...] R7=4294967295 17: (ce) if w6 s< w7 goto pc+3 ; R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) R7=4294967295 18: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=0,smax=umax=4294967295,smin32=-1,var_off=(0x0; 0xffffffff)) 19: (bc) w0 = w7 ; [...] R7=4294967295 20: (95) exit from 17 to 21: [...] 21: (bc) w0 = w6 ; [...] R6=scalar(id=1,smin=umin=umin32=2147483648,smax=umax=umax32=4294967294,smax32=-2,var_off=(0x80000000; 0x7fffffff)) 22: (bc) w0 = w7 ; [...] R7=4294967295 23: (95) exit from 13 to 1: [...] 1: [...] 1: (b7) r0 = 0 ; R0_w=0 2: (95) exit processed 24 insns (limit 1000000) max_states_per_insn 0 total_states 2 peak_states 2 mark_read 1 ===================== Verifier log above is for `(u32)[0; U32_MAX] (s32)< -1` use cases, where u32 range is used for initialization, followed by signed < operator. Note how we use w6/w7 in this case for register initialization (it would be R6/R7 for 64-bit types) and then `if w6 s< w7` for comparison at instruction #17. It will be `if R6 < R7` for 64-bit unsigned comparison. Above example gives a good impression of the overall structure of a BPF programs generated for reg_bounds tests. In the future, this "framework" can be extended to test not just conditional jumps, but also arithmetic operations. Adding randomized testing is another possibility. Some implementation notes. We basically have our own generics-like operations on numbers, where all the numbers are stored in u64, but how they are interpreted is passed as runtime argument enum num_t. Further, `struct range` represents a bounds range, and those are collected together into a minimal `struct reg_state`, which collects range bounds across all four numberical domains: u64, s64, u32, s64. Based on these primitives and `enum op` representing possible conditional operation (<, <=, >, >=, ==, !=), there is a set of generic helpers to perform "range arithmetics", which is used to maintain struct reg_state. We simulate what verifier will do for reg bounds of R6 and R7 registers using these range and reg_state primitives. Simulated information is used to determine branch taken conclusion and expected exact register state across all four number domains. Implementation of "range arithmetics" is more generic than what verifier is currently performing: it allows range over range comparisons and adjustments. This is the intended end goal of this patch set overall and verifier logic is enhanced in subsequent patches in this series to handle range vs range operations, at which point selftests are extended to validate these conditions as well. For now it's range vs const cases only. Note that tests are split into multiple groups by their numeric types for initialization of ranges and for comparison operation. This allows to use test_progs's -j parallelization to speed up tests, as we now have 16 groups of parallel running tests. Overall reduction of running time that allows is pretty good, we go down from more than 30 minutes to slightly less than 5 minutes running time. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com> Link: https://lore.kernel.org/r/20231112010609.848406-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-11-14	selftests/bpf: Add selftests for cgroup1 hierarchy	Yafang Shao	2	-0/+229
	Add selftests for cgroup1 hierarchy. The result as follows, $ tools/testing/selftests/bpf/test_progs --name=cgroup1_hierarchy #36/1 cgroup1_hierarchy/test_cgroup1_hierarchy:OK #36/2 cgroup1_hierarchy/test_root_cgid:OK #36/3 cgroup1_hierarchy/test_invalid_level:OK #36/4 cgroup1_hierarchy/test_invalid_cgid:OK #36/5 cgroup1_hierarchy/test_invalid_hid:OK #36/6 cgroup1_hierarchy/test_invalid_cgrp_name:OK #36/7 cgroup1_hierarchy/test_invalid_cgrp_name2:OK #36/8 cgroup1_hierarchy/test_sleepable_prog:OK #36 cgroup1_hierarchy:OK Summary: 1/8 PASSED, 0 SKIPPED, 0 FAILED Besides, I also did some stress test similar to the patch #2 in this series, as follows (with CONFIG_PROVE_RCU_LIST enabled): - Continuously mounting and unmounting named cgroups in some tasks, for example: cgrp_name=$1 while true do mount -t cgroup -o none,name=$cgrp_name none /$cgrp_name umount /$cgrp_name done - Continuously run this selftest concurrently, while true; do ./test_progs --name=cgroup1_hierarchy; done They can ran successfully without any RCU warnings in dmesg. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20231111090034.4248-7-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>