kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-07-28	selftests/bpf: Test ldsx with more complex cases	Yonghong Song	3	-1/+265
	The following ldsx cases are tested: - signed readonly map value - read/write map value - probed memory - not-narrowed ctx field access - narrowed ctx field access. Without previous proper verifier/git handling, the test will fail. If cpuv4 is not supported either by compiler or by jit, the test will be skipped. # ./test_progs -t ldsx_insn #113/1 ldsx_insn/map_val and probed_memory:SKIP #113/2 ldsx_insn/ctx_member_sign_ext:SKIP #113/3 ldsx_insn/ctx_member_narrow_sign_ext:SKIP #113 ldsx_insn:SKIP Summary: 1/0 PASSED, 3 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011336.3723434-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add unit tests for new gotol insn	Yonghong Song	2	-0/+46
	Add unit tests for gotol insn. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011329.3721881-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add unit tests for new sdiv/smod insns	Yonghong Song	2	-0/+783
	Add unit tests for sdiv/smod insns. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011321.3720500-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add unit tests for new bswap insns	Yonghong Song	2	-0/+61
	Add unit tests for bswap insns. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011314.3720109-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add unit tests for new sign-extension mov insns	Yonghong Song	2	-0/+215
	Add unit tests for movsx insns. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011309.3719295-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add unit tests for new sign-extension load insns	Yonghong Song	2	-0/+133
	Add unit tests for new ldsx insns. The test includes sign-extension with a single value or with a value range. If cpuv4 is not supported due to (1) older compiler, e.g., less than clang version 18, or (2) test runner test_progs and test_progs-no_alu32 which tests cpu v2 and v3, or (3) non-x86_64 arch not supporting new insns in jit yet, a dummy program is added with below output: #318/1 verifier_ldsx/cpuv4 is not supported by compiler or jit, use a dummy test:OK #318 verifier_ldsx:OK to indicate the test passed with a dummy test instead of actually testing cpuv4. I am using a dummy prog to avoid changing the verifier testing infrastructure. Once clang 18 is widely available and other architectures support cpuv4, at least for CI run, the dummy program can be removed. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011304.3719139-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Add a cpuv4 test runner for cpu=v4 testing	Yonghong Song	2	-4/+26
	Similar to no-alu32 runner, if clang compiler supports -mcpu=v4, a cpuv4 runner is created to test bpf programs compiled with -mcpu=v4. The following are some num-of-insn statistics for each newer instructions based on existing selftests, excluding subsequent cpuv4 insn specific tests. insn pattern # of instructions reg = (s8)reg 4 reg = (s16)reg 4 reg = (s32)reg 144 reg = (s8 )(reg + off) 13 reg = (s16 )(reg + off) 14 reg = (s32 )(reg + off) 15215 reg = bswap16 reg 142 reg = bswap32 reg 38 reg = bswap64 reg 14 reg s/= reg 0 reg s%= reg 0 gotol <offset> 58 Note that in llvm -mcpu=v4 implementation, the compiler is a little bit conservative about generating 'gotol' insn (32-bit branch offset) as it didn't precise count the number of insns (e.g., some insns are debug insns, etc.). Compared to old 'goto' insn, newer 'gotol' insn should have comparable verification states to 'goto' insn. With current patch set, all selftests passed with -mcpu=v4 when running test_progs-cpuv4 binary. The -mcpu=v3 and -mcpu=v2 run are also successful. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011250.3718252-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-28	selftests/bpf: Fix a test_verifier failure	Yonghong Song	1	-3/+3
	The following test_verifier subtest failed due to new encoding for BSWAP. $ ./test_verifier ... #99/u invalid 64-bit BPF_END FAIL Unexpected success to load! verification time 215 usec stack depth 0 processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 #99/p invalid 64-bit BPF_END FAIL Unexpected success to load! verification time 198 usec stack depth 0 processed 3 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0 Tighten the test so it still reports a failure. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230728011244.3717464-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-26	selftests/xsk: Fix spelling mistake "querrying" -> "querying"	Colin Ian King	1	-1/+1
	There is a spelling mistake in an error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20230720104815.123146-1-colin.i.king@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-07-26	selftests/bpf: Test that SO_REUSEPORT can be used with sk_assign helper	Daniel Borkmann	3	-0/+344
	We use two programs to check that the new reuseport logic is executed appropriately. The first is a TC clsact program which bpf_sk_assigns the skb to a UDP or TCP socket created by user space. Since the test communicates via lo we see both directions of packets in the eBPF. Traffic ingressing to the reuseport socket is identified by looking at the destination port. For TCP, we additionally need to make sure that we only assign the initial SYN packets towards our listening socket. The network stack then creates a request socket which transitions to ESTABLISHED after the 3WHS. The second is a reuseport program which shares the fact that it has been executed with user space. This tells us that the delayed lookup mechanism is working. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Lorenz Bauer <lmb@isovalent.com> Cc: Joe Stringer <joe@cilium.io> Link: https://lore.kernel.org/r/20230720-so-reuseport-v6-8-7021b683cdae@isovalent.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-07-21	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-2/+23
	Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-19	selftests/bpf: Add mprog API tests for BPF tcx links	Daniel Borkmann	1	-0/+1583
	Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] #225 tc_links_after:OK #226 tc_links_append:OK #227 tc_links_basic:OK #228 tc_links_before:OK #229 tc_links_chain_classic:OK #230 tc_links_dev_cleanup:OK #231 tc_links_invalid:OK #232 tc_links_prepend:OK #233 tc_links_replace:OK #234 tc_links_revision:OK Summary: 10/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-9-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/bpf: Add mprog API tests for BPF tcx opts	Daniel Borkmann	3	-0/+2351
	Add a big batch of test coverage to assert all aspects of the tcx opts attach, detach and query API: # ./vmtest.sh -- ./test_progs -t tc_opts [...] #238 tc_opts_after:OK #239 tc_opts_append:OK #240 tc_opts_basic:OK #241 tc_opts_before:OK #242 tc_opts_chain_classic:OK #243 tc_opts_demixed:OK #244 tc_opts_detach:OK #245 tc_opts_detach_after:OK #246 tc_opts_detach_before:OK #247 tc_opts_dev_cleanup:OK #248 tc_opts_invalid:OK #249 tc_opts_mixed:OK #250 tc_opts_prepend:OK #251 tc_opts_replace:OK #252 tc_opts_revision:OK Summary: 15/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/r/20230719140858.13224-8-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: reset NIC settings to default after running test suite	Maciej Fijalkowski	2	-0/+12
	Currently, when running ZC test suite, after finishing first run of test suite and then switching to busy-poll tests within xskxceiver, such errors are observed: libbpf: Kernel error message: ice: MTU is too large for linear frames and XDP prog does not support frags 1..26 libbpf: Kernel error message: Native and generic XDP can't be active at the same time Error attaching XDP program not ok 1 [xskxceiver.c:xsk_reattach_xdp:1568]: ERROR: 17/"File exists" this is because test suite ends with 9k MTU and native xdp program being loaded. Busy-poll tests start non-multi-buffer tests for generic mode. To fix this, let us introduce bash function that will reset NIC settings to default (e.g. 1500 MTU and no xdp progs loaded) so that test suite can continue without interrupts. It also means that after busy-poll tests NIC will have those default settings, whereas right now it is left with 9k MTU and xdp prog loaded in native mode. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-25-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: add test for too many frags	Magnus Karlsson	2	-2/+55
	Add a test that will exercise maximum number of supported fragments. This number depends on mode of the test - for SKB and DRV it will be 18 whereas for ZC this is defined by a value from NETDEV_A_DEV_XDP_ZC_MAX_SEGS netlink attribute. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # made use of new netlink attribute Link: https://lore.kernel.org/r/20230719132421.584801-24-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: add metadata copy test for multi-buff	Magnus Karlsson	3	-2/+8
	Enable the already existing metadata copy test to also run in multi-buffer mode with 9K packets. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-23-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: add invalid descriptor test for multi-buffer	Magnus Karlsson	2	-41/+151
	Add a test that produces lots of nasty descriptors testing the corner cases of the descriptor validation. Some of these descriptors are valid and some are not as indicated by the valid flag. For a description of all the test combinations, please see the code. To stress the API, we need to be able to generate combinations of descriptors that make little sense. A new verbatim mode is introduced for the packet_stream to accomplish this. In this mode, all packets in the packet_stream are sent as is. We do not try to chop them up into frames that are of the right size that we know are going to work as we would normally do. The packets are just written into the Tx ring even if we know they make no sense. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> # adjusted valid flags for frags Link: https://lore.kernel.org/r/20230719132421.584801-22-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: add unaligned mode test for multi-buffer	Magnus Karlsson	2	-0/+16
	Add a test for multi-buffer AF_XDP when using unaligned mode. The test sends 4096 9K-buffers. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-21-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: add basic multi-buffer test	Magnus Karlsson	5	-3/+213
	Add the first basic multi-buffer test that sends a stream of 9K packets and validates that they are received at the other end. In order to enable sending and receiving multi-buffer packets, code that sets the MTU is introduced as well as modifications to the XDP programs so that they signal that they are multi-buffer enabled. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-20-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/xsk: transmit and receive multi-buffer packets	Magnus Karlsson	2	-34/+136
	Add the ability to send and receive packets that are larger than the size of a umem frame, using the AF_XDP /XDP multi-buffer support. There are three pieces of code that need to be changed to achieve this: the Rx path, the Tx path, and the validation logic. Both the Rx path and Tx could only deal with a single fragment per packet. The Tx path is extended with a new function called pkt_nb_frags() that can be used to retrieve the number of fragments a packet will consume. We then create these many fragments in a loop and fill the N-1 first ones to the max size limit to use the buffer space efficiently, and the Nth one with whatever data that is left. This goes on until we have filled in at the most BATCH_SIZE worth of descriptors and fragments. If we detect that the next packet would lead to BATCH_SIZE number of fragments sent being exceeded, we do not send this packet and finish the batch. This packet is instead sent in the next iteration of BATCH_SIZE fragments. For Rx, we loop over all fragments we receive as usual, but for every descriptor that we receive we call a new validation function called is_frag_valid() to validate the consistency of this fragment. The code then checks if the packet continues in the next frame. If so, it loops over the next packet and performs the same validation. once we have received the last fragment of the packet we also call the function is_pkt_valid() to validate the packet as a whole. If we get to the end of the batch and we are not at the end of the current packet, we back out the partial packet and end the loop. Once we get into the receive loop next time, we start over from the beginning of that packet. This so the code becomes simpler at the cost of some performance. The validation function is_frag_valid() checks that the sequence and packet numbers are correct at the start and end of each fragment. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20230719132421.584801-19-maciej.fijalkowski@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	bpf: allow any program to use the bpf_map_sum_elem_count kfunc	Anton Protopopov	1	-0/+5
	Register the bpf_map_sum_elem_count func for all programs, and update the map_ptr subtest of the test_progs test to test the new functionality. The usage is allowed as long as the pointer to the map is trusted (when using tracing programs) or is a const pointer to map, as in the following example: struct { __uint(type, BPF_MAP_TYPE_HASH); ... } hash SEC(".maps"); ... static inline int some_bpf_prog(void) { struct bpf_map map = (struct bpf_map )&hash; __s64 count; count = bpf_map_sum_elem_count(map); ... } Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Link: https://lore.kernel.org/r/20230719092952.41202-5-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled	Dave Marchevsky	1	-24/+0
	The test added in previous patch will fail with bpf_refcount_acquire disabled. Until all races are fixed and bpf_refcount_acquire is re-enabled on bpf-next, disable the test so CI doesn't complain. Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Link: https://lore.kernel.org/r/20230718083813.3416104-6-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/bpf: Add rbtree test exercising race which 'owner' field prevents	Dave Marchevsky	2	-1/+121
	This patch adds a runnable version of one of the races described by Kumar in [0]. Specifically, this interleaving: (rbtree1 and list head protected by lock1, rbtree2 protected by lock2) Prog A Prog B ====================================== n = bpf_obj_new(...) m = bpf_refcount_acquire(n) kptr_xchg(map, m) m = kptr_xchg(map, NULL) lock(lock2) bpf_rbtree_add(rbtree2, m->r, less) unlock(lock2) lock(lock1) bpf_list_push_back(head, n->l) /* make n non-owning ref / bpf_rbtree_remove(rbtree1, n->r) unlock(lock1) The above interleaving, the node's struct bpf_rb_node r can be used to add it to either rbtree1 or rbtree2, which are protected by different locks. If the node has been added to rbtree2, we should not be allowed to remove it while holding rbtree1's lock. Before changes in the previous patch in this series, the rbtree_remove in the second part of Prog A would succeed as the verifier has no way of knowing which tree owns a particular node at verification time. The addition of 'owner' field results in bpf_rbtree_remove correctly failing. The test added in this patch splits "Prog A" above into two separate BPF programs - A1 and A2 - and uses a second mapval + kptr_xchg to pass n from A1 to A2 similarly to the pass from A1 to B. If the test is run without the fix applied, the remove will succeed. Kumar's example had the two programs running on separate CPUs. This patch doesn't do this as it's not necessary to exercise the broken behavior / validate fixed behavior. [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230718083813.3416104-5-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	bpf: Add 'owner' field to bpf_{list,rb}_node	Dave Marchevsky	1	-39/+39
	As described by Kumar in [0], in shared ownership scenarios it is necessary to do runtime tracking of {rb,list} node ownership - and synchronize updates using this ownership information - in order to prevent races. This patch adds an 'owner' field to struct bpf_list_node and bpf_rb_node to implement such runtime tracking. The owner field is a void * that describes the ownership state of a node. It can have the following values: NULL - the node is not owned by any data structure BPF_PTR_POISON - the node is in the process of being added to a data structure ptr_to_root - the pointee is a data structure 'root' (bpf_rb_root / bpf_list_head) which owns this node The field is initially NULL (set by bpf_obj_init_field default behavior) and transitions states in the following sequence: Insertion: NULL -> BPF_PTR_POISON -> ptr_to_root Removal: ptr_to_root -> NULL Before a node has been successfully inserted, it is not protected by any root's lock, and therefore two programs can attempt to add the same node to different roots simultaneously. For this reason the intermediate BPF_PTR_POISON state is necessary. For removal, the node is protected by some root's lock so this intermediate hop isn't necessary. Note that bpf_list_pop_{front,back} helpers don't need to check owner before removing as the node-to-be-removed is not passed in as input and is instead taken directly from the list. Do the check anyways and WARN_ON_ONCE in this unexpected scenario. Selftest changes in this patch are entirely mechanical: some BTF tests have hardcoded struct sizes for structs that contain bpf_{list,rb}_node fields, those were adjusted to account for the new sizes. Selftest additions to validate the owner field are added in a further patch in the series. [0]: https://lore.kernel.org/bpf/d7hyspcow5wtjcmw4fugdgyp3fwhljwuscp3xyut5qnwivyeru@ysdq543otzv2 Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Suggested-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230718083813.3416104-4-davemarchevsky@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-19	selftests/bpf: Add more tests for check_max_stack_depth bug	Kumar Kartikeya Dwivedi	1	-2/+23
	Another test which now exercies the path of the verifier where it will explore call chains rooted at the async callback. Without the prior fixes, this program loads successfully, which is incorrect. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230717161530.1238-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-14	Merge tag 'for-netdev' of ↵	Jakub Kicinski	45	-31/+2407
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2023-07-13 We've added 67 non-merge commits during the last 15 day(s) which contain a total of 106 files changed, 4444 insertions(+), 619 deletions(-). The main changes are: 1) Fix bpftool build in presence of stale vmlinux.h, from Alexander Lobakin. 2) Introduce bpf_me_mcache_free_rcu() and fix OOM under stress, from Alexei Starovoitov. 3) Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling, from Andrii Nakryiko. 4) Introduce bpf map element count, from Anton Protopopov. 5) Check skb ownership against full socket, from Kui-Feng Lee. 6) Support for up to 12 arguments in BPF trampoline, from Menglong Dong. 7) Export rcu_request_urgent_qs_task, from Paul E. McKenney. 8) Fix BTF walking of unions, from Yafang Shao. 9) Extend link_info for kprobe_multi and perf_event links, from Yafang Shao. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (67 commits) selftests/bpf: Add selftest for PTR_UNTRUSTED bpf: Fix an error in verifying a field in a union selftests/bpf: Add selftests for nested_trust bpf: Fix an error around PTR_UNTRUSTED selftests/bpf: add testcase for TRACING with 6+ arguments bpf, x86: allow function arguments up to 12 for TRACING bpf, x86: save/restore regs with BPF_DW size bpftool: Use "fallthrough;" keyword instead of comments bpf: Add object leak check. bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu. bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu(). selftests/bpf: Improve test coverage of bpf_mem_alloc. rcu: Export rcu_request_urgent_qs_task() bpf: Allow reuse from waiting_for_gp_ttrace list. bpf: Add a hint to allocated objects. bpf: Change bpf_mem_cache draining process. bpf: Further refactor alloc_bulk(). bpf: Factor out inc/dec of active flag into helpers. bpf: Refactor alloc_bulk(). bpf: Let free_all() return the number of freed elements. ... ==================== Link: https://lore.kernel.org/r/20230714020910.80794-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-07-14	selftests/bpf: Add selftest for PTR_UNTRUSTED	Yafang Shao	2	-0/+65
	Add a new selftest to check the PTR_UNTRUSTED condition. Below is the result, #160 ptr_untrusted:OK Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230713025642.27477-5-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-14	selftests/bpf: Add selftests for nested_trust	Yafang Shao	2	-0/+31
	Add selftests for nested_strust to check whehter PTR_UNTRUSTED is cleared as expected, the result as follows: #141/1 nested_trust/test_read_cpumask:OK #141/2 nested_trust/test_skb_field:OK <<<< #141/3 nested_trust/test_invalid_nested_user_cpus:OK #141/4 nested_trust/test_invalid_nested_offset:OK #141/5 nested_trust/test_invalid_skb_field:OK <<<< #141 nested_trust:OK The #141/2 and #141/5 are newly added. Signed-off-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20230713025642.27477-3-laoar.shao@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-14	selftests/bpf: add testcase for TRACING with 6+ arguments	Menglong Dong	12	-15/+332
	Add fentry_many_args.c and fexit_many_args.c to test the fentry/fexit with 7/11 arguments. As this feature is not supported by arm64 yet, we disable these testcases for arm64 in DENYLIST.aarch64. We can combine them with fentry_test.c/fexit_test.c when arm64 is supported too. Correspondingly, add bpf_testmod_fentry_test7() and bpf_testmod_fentry_test11() to bpf_testmod.c Meanwhile, add bpf_modify_return_test2() to test_run.c to test the MODIFY_RETURN with 7 arguments. Add bpf_testmod_test_struct_arg_7/bpf_testmod_test_struct_arg_7 in bpf_testmod.c to test the struct in the arguments. And the testcases passed on x86_64: ./test_progs -t fexit Summary: 5/14 PASSED, 0 SKIPPED, 0 FAILED ./test_progs -t fentry Summary: 3/2 PASSED, 0 SKIPPED, 0 FAILED ./test_progs -t modify_return Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED ./test_progs -t tracing_struct Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Menglong Dong <imagedong@tencent.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20230713040738.1789742-4-imagedong@tencent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-13	selftests/bpf: Improve test coverage of bpf_mem_alloc.	Alexei Starovoitov	1	-1/+1
	bpf_obj_new() calls bpf_mem_alloc(), but doing alloc/free of 8 elements is not triggering watermark conditions in bpf_mem_alloc. Increase to 200 elements to make sure alloc_bulk/free_bulk is exercised. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/bpf/20230706033447.54696-12-alexei.starovoitov@gmail.com
2023-07-12	selftests/bpf: extend existing map resize tests for per-cpu use case	Andrii Nakryiko	2	-5/+17
	Add a per-cpu array resizing use case and demonstrate how bpf_get_smp_processor_id() can be used to directly access proper data with no extra checks. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230711232400.1658562-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-07	selftests/bpf: Correct two typos	Lu Hongfei	2	-2/+2
	When wrapping code, use ';' better than using ',' which is more in line with the coding habits of most engineers. Signed-off-by: Lu Hongfei <luhongfei@vivo.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Hou Tao <houtao1@huawei.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230707081253.34638-1-luhongfei@vivo.com
2023-07-06	selftests/bpf: Bump and validate MAX_SYMS	Björn Töpel	1	-1/+4
	BPF tests that load /proc/kallsyms, e.g. bpf_cookie, will perform a buffer overrun if the number of syms on the system is larger than MAX_SYMS. Bump the MAX_SYMS to 400000, and add a runtime check that bails out if the maximum is reached. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230706142228.1128452-1-bjorn@kernel.org
2023-07-06	selftests/bpf: test map percpu stats	Anton Protopopov	2	-0/+471
	Add a new map test, map_percpu_stats.c, which is checking the correctness of map's percpu elements counters. For supported maps the test upserts a number of elements, checks the correctness of the counters, then deletes all the elements and checks again that the counters sum drops down to zero. The following map types are tested: * BPF_MAP_TYPE_HASH, BPF_F_NO_PREALLOC * BPF_MAP_TYPE_PERCPU_HASH, BPF_F_NO_PREALLOC * BPF_MAP_TYPE_HASH, * BPF_MAP_TYPE_PERCPU_HASH, * BPF_MAP_TYPE_LRU_HASH * BPF_MAP_TYPE_LRU_PERCPU_HASH * BPF_MAP_TYPE_LRU_HASH, BPF_F_NO_COMMON_LRU * BPF_MAP_TYPE_LRU_PERCPU_HASH, BPF_F_NO_COMMON_LRU * BPF_MAP_TYPE_HASH_OF_MAPS Signed-off-by: Anton Protopopov <aspsk@isovalent.com> Acked-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230706133932.45883-6-aspsk@isovalent.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-06	selftests/bpf: Add selftest for check_stack_max_depth bug	Kumar Kartikeya Dwivedi	2	-0/+49
	Use the bpf_timer_set_callback helper to mark timer_cb as an async callback, and put a direct call to timer_cb in the main subprog. As the check_stack_max_depth happens after the do_check pass, the order does not matter. Without the previous fix, the test passes successfully. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20230705144730.235802-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-06	selftests/bpf: Add benchmark for bpf memory allocator	Hou Tao	5	-0/+502
	The benchmark could be used to compare the performance of hash map operations and the memory usage between different flavors of bpf memory allocator (e.g., no bpf ma vs bpf ma vs reuse-after-gp bpf ma). It also could be used to check the performance improvement or the memory saving provided by optimization. The benchmark creates a non-preallocated hash map which uses bpf memory allocator and shows the operation performance and the memory usage of the hash map under different use cases: (1) overwrite Each CPU overwrites nonoverlapping part of hash map. When each CPU completes overwriting of 64 elements in hash map, it increases the op_count. (2) batch_add_batch_del Each CPU adds then deletes nonoverlapping part of hash map in batch. When each CPU adds and deletes 64 elements in hash map, it increases the op_count twice. (3) add_del_on_diff_cpu Each two-CPUs pair adds and deletes nonoverlapping part of map cooperatively. When each CPU adds or deletes 64 elements in hash map, it will increase the op_count. The following is the benchmark results when comparing between different flavors of bpf memory allocator. These tests are conducted on a KVM guest with 8 CPUs and 16 GB memory. The command line below is used to do all the following benchmarks: ./bench htab-mem --use-case $name ${OPTS} -w3 -d10 -a -p8 These results show that preallocated hash map has both better performance and smaller memory footprint. (1) non-preallocated + no bpf memory allocator (v6.0.19) use kmalloc() + call_rcu overwrite per-prod-op: 11.24 ± 0.07k/s, avg mem: 82.64 ± 26.32MiB, peak mem: 119.18MiB batch_add_batch_del per-prod-op: 18.45 ± 0.10k/s, avg mem: 50.47 ± 14.51MiB, peak mem: 94.96MiB add_del_on_diff_cpu per-prod-op: 14.50 ± 0.03k/s, avg mem: 4.64 ± 0.73MiB, peak mem: 7.20MiB (2) preallocated OPTS=--preallocated overwrite per-prod-op: 191.42 ± 0.09k/s, avg mem: 1.24 ± 0.00MiB, peak mem: 1.49MiB batch_add_batch_del per-prod-op: 221.83 ± 0.17k/s, avg mem: 1.23 ± 0.00MiB, peak mem: 1.49MiB add_del_on_diff_cpu per-prod-op: 39.66 ± 0.31k/s, avg mem: 1.47 ± 0.13MiB, peak mem: 1.75MiB (3) normal bpf memory allocator overwrite per-prod-op: 126.59 ± 0.02k/s, avg mem: 2.26 ± 0.00MiB, peak mem: 2.74MiB batch_add_batch_del per-prod-op: 83.37 ± 0.20k/s, avg mem: 2.14 ± 0.17MiB, peak mem: 2.74MiB add_del_on_diff_cpu per-prod-op: 21.25 ± 0.24k/s, avg mem: 17.50 ± 3.32MiB, peak mem: 28.87MiB Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230704025039.938914-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-07-05	selftests/bpf: Honor $(O) when figuring out paths	Björn Töpel	1	-0/+4
	When building the kselftests out-of-tree, e.g. ... \| make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \ \| O=/tmp/kselftest headers \| make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- \ \| O=/tmp/kselftest HOSTCC=gcc FORMAT= \ \| SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" \ \| -C tools/testing/selftests gen_tar ... the kselftest build would not pick up the correct GENDIR path, and therefore not including autoconf.h. Correct that by taking $(O) into consideration when figuring out the GENDIR path. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230705113926.751791-3-bjorn@kernel.org
2023-07-05	selftests/bpf: Add F_NEEDS_EFFICIENT_UNALIGNED_ACCESS to some tests	Björn Töpel	5	-1/+14
	Some verifier tests were missing F_NEEDS_EFFICIENT_UNALIGNED_ACCESS, which made the test fail. Add the flag where needed. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230705113926.751791-2-bjorn@kernel.org
2023-06-30	selftests/bpf: Add bpf_program__attach_netfilter helper test	Florian Westphal	2	-0/+100
	Call bpf_program__attach_netfilter() with different protocol/hook/priority combinations. Test fails if supposedly-illegal attachments work (e.g., bogus protocol family, illegal priority and so on) or if a should-work attachment fails. Expected output: ./test_progs -t netfilter_link_attach #145/1 netfilter_link_attach/allzero:OK #145/2 netfilter_link_attach/invalid-pf:OK #145/3 netfilter_link_attach/invalid-hooknum:OK #145/4 netfilter_link_attach/invalid-priority-min:OK #145/5 netfilter_link_attach/invalid-priority-max:OK #145/6 netfilter_link_attach/invalid-flags:OK #145/7 netfilter_link_attach/invalid-inet-not-supported:OK #145/8 netfilter_link_attach/attach ipv4:OK #145/9 netfilter_link_attach/attach ipv6:OK Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Daniel Xu <dxu@dxuuu.xyz> Link: https://lore.kernel.org/bpf/20230628152738.22765-3-fw@strlen.de
2023-06-30	selftests/bpf: Verify that the cgroup_skb filters receive expected packets.	Kui-Feng Lee	5	-0/+832
	This test case includes four scenarios: 1. Connect to the server from outside the cgroup and close the connection from outside the cgroup. 2. Connect to the server from outside the cgroup and close the connection from inside the cgroup. 3. Connect to the server from inside the cgroup and close the connection from outside the cgroup. 4. Connect to the server from inside the cgroup and close the connection from inside the cgroup. The test case is to verify that cgroup_skb/{egress, ingress} filters receive expected packets including SYN, SYN/ACK, ACK, FIN, and FIN/ACK. Signed-off-by: Kui-Feng Lee <kuifeng@meta.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230624014600.576756-3-kuifeng@meta.com
2023-06-30	selftests/bpf: Add test to exercise typedef walking	Stanislav Fomichev	2	-0/+25
	Add new bpf_fentry_test_sinfo with skb_shared_info argument and try to access frags. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20230626212522.2414485-2-sdf@google.com
2023-06-30	selftests/bpf: Fix bpf_nf failure upon test rerun	Daniel Borkmann	1	-2/+3
	Alexei reported: After fast forwarding bpf-next today bpf_nf test started to fail when run twice: $ ./test_progs -t bpf_nf #17 bpf_nf:OK Summary: 1/10 PASSED, 0 SKIPPED, 0 FAILED $ ./test_progs -t bpf_nf All error logs: test_bpf_nf_ct:PASS:test_bpf_nf__open_and_load 0 nsec test_bpf_nf_ct:PASS:iptables-legacy -t raw -A PREROUTING -j CONNMARK --set-mark 42/0 0 nsec (network_helpers.c:102: errno: Address already in use) Failed to bind socket test_bpf_nf_ct:FAIL:start_server unexpected start_server: actual -1 < expected 0 #17/1 bpf_nf/xdp-ct:FAIL test_bpf_nf_ct:PASS:test_bpf_nf__open_and_load 0 nsec test_bpf_nf_ct:PASS:iptables-legacy -t raw -A PREROUTING -j CONNMARK --set-mark 42/0 0 nsec (network_helpers.c:102: errno: Address already in use) Failed to bind socket test_bpf_nf_ct:FAIL:start_server unexpected start_server: actual -1 < expected 0 #17/2 bpf_nf/tc-bpf-ct:FAIL #17 bpf_nf:FAIL Summary: 0/8 PASSED, 0 SKIPPED, 1 FAILED I was able to locally reproduce as well. Rearrange the connection teardown so that the client closes its connection first so that we don't need to linger in TCP time-wait. Fixes: e81fbd4c1ba7 ("selftests/bpf: Add existing connection bpf_*_ct_lookup() test") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/CAADnVQ+0dnDq_v_vH1EfkacbfGnHANaon7zsw10pMb-D9FS0Pw@mail.gmail.com Link: https://lore.kernel.org/bpf/20230626131942.5100-1-daniel@iogearbox.net
2023-06-29	bpf: Replace deprecated -target with --target= for Clang	Fangrui Song	2	-4/+4
	The -target option has been deprecated since clang 3.4 in 2013. Therefore, use the preferred --target=bpf form instead. This also matches how we use --target= in scripts/Makefile.clang. Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Quentin Monnet <quentin@isovalent.com> Link: https://github.com/llvm/llvm-project/commit/274b6f0c87a6a1798de0a68135afc7f95def6277 Link: https://lore.kernel.org/bpf/20230624001856.1903733-1-maskray@google.com
2023-06-29	Merge tag 'net-next-6.5' of ↵	Linus Torvalds	94	-1156/+5265
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Jakub Kicinski: "WiFi 7 and sendpage changes are the biggest pieces of work for this release. The latter will definitely require fixes but I think that we got it to a reasonable point. Core: - Rework the sendpage & splice implementations Instead of feeding data into sockets page by page extend sendmsg handlers to support taking a reference on the data, controlled by a new flag called MSG_SPLICE_PAGES Rework the handling of unexpected-end-of-file to invoke an additional callback instead of trying to predict what the right combination of MORE/NOTLAST flags is Remove the MSG_SENDPAGE_NOTLAST flag completely - Implement SCM_PIDFD, a new type of CMSG type analogous to SCM_CREDENTIALS, but it contains pidfd instead of plain pid - Enable socket busy polling with CONFIG_RT - Improve reliability and efficiency of reporting for ref_tracker - Auto-generate a user space C library for various Netlink families Protocols: - Allow TCP to shrink the advertised window when necessary, prevent sk_rcvbuf auto-tuning from growing the window all the way up to tcp_rmem[2] - Use per-VMA locking for "page-flipping" TCP receive zerocopy - Prepare TCP for device-to-device data transfers, by making sure that payloads are always attached to skbs as page frags - Make the backoff time for the first N TCP SYN retransmissions linear. Exponential backoff is unnecessarily conservative - Create a new MPTCP getsockopt to retrieve all info (MPTCP_FULL_INFO) - Avoid waking up applications using TLS sockets until we have a full record - Allow using kernel memory for protocol ioctl callbacks, paving the way to issuing ioctls over io_uring - Add nolocalbypass option to VxLAN, forcing packets to be fully encapsulated even if they are destined for a local IP address - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure in-kernel ECMP implementation (e.g. Open vSwitch) select the same link for all packets. Support L4 symmetric hashing in Open vSwitch - PPPoE: make number of hash bits configurable - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client (ipconfig) - Add layer 2 miss indication and filtering, allowing higher layers (e.g. ACL filters) to make forwarding decisions based on whether packet matched forwarding state in lower devices (bridge) - Support matching on Connectivity Fault Management (CFM) packets - Hide the "link becomes ready" IPv6 messages by demoting their printk level to debug - HSR: don't enable promiscuous mode if device offloads the proto - Support active scanning in IEEE 802.15.4 - Continue work on Multi-Link Operation for WiFi 7 BPF: - Add precision propagation for subprogs and callbacks. This allows maintaining verification efficiency when subprograms are used, or in fact passing the verifier at all for complex programs, especially those using open-coded iterators - Improve BPF's {g,s}setsockopt() length handling. Previously BPF assumed the length is always equal to the amount of written data. But some protos allow passing a NULL buffer to discover what the output buffer should be, without writing anything - Accept dynptr memory as memory arguments passed to helpers - Add routing table ID to bpf_fib_lookup BPF helper - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark maps as read-only) - Show target_{obj,btf}_id in tracing link fdinfo - Addition of several new kfuncs (most of the names are self-explanatory): - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(), bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size() and bpf_dynptr_clone(). - bpf_task_under_cgroup() - bpf_sock_destroy() - force closing sockets - bpf_cpumask_first_and(), rework bpf_cpumask_any() kfuncs Netfilter: - Relax set/map validation checks in nf_tables. Allow checking presence of an entry in a map without using the value - Increase ip_vs_conn_tab_bits range for 64BIT builds - Allow updating size of a set - Improve NAT tuple selection when connection is closing Driver API: - Integrate netdev with LED subsystem, to allow configuring HW "offloaded" blinking of LEDs based on link state and activity (i.e. packets coming in and out) - Support configuring rate selection pins of SFP modules - Factor Clause 73 auto-negotiation code out of the drivers, provide common helper routines - Add more fool-proof helpers for managing lifetime of MDIO devices associated with the PCS layer - Allow drivers to report advanced statistics related to Time Aware scheduler offload (taprio) - Allow opting out of VF statistics in link dump, to allow more VFs to fit into the message - Split devlink instance and devlink port operations New hardware / drivers: - Ethernet: - Synopsys EMAC4 IP support (stmmac) - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches - Marvell 88E6250 7 port switches - Microchip LAN8650/1 Rev.B0 PHYs - MediaTek MT7981/MT7988 built-in 1GE PHY driver - WiFi: - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps - Realtek RTL8723DS (SDIO variant) - Realtek RTL8851BE - CAN: - Fintek F81604 Drivers: - Ethernet NICs: - Intel (100G, ice): - support dynamic interrupt allocation - use meta data match instead of VF MAC addr on slow-path - nVidia/Mellanox: - extend link aggregation to handle 4, rather than just 2 ports - spawn sub-functions without any features by default - OcteonTX2: - support HTB (Tx scheduling/QoS) offload - make RSS hash generation configurable - support selecting Rx queue using TC filters - Wangxun (ngbe/txgbe): - add basic Tx/Rx packet offloads - add phylink support (SFP/PCS control) - Freescale/NXP (enetc): - report TAPRIO packet statistics - Solarflare/AMD: - support matching on IP ToS and UDP source port of outer header - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6 - add devlink dev info support for EF10 - Virtual NICs: - Microsoft vNIC: - size the Rx indirection table based on requested configuration - support VLAN tagging - Amazon vNIC: - try to reuse Rx buffers if not fully consumed, useful for ARM servers running with 16kB pages - Google vNIC: - support TCP segmentation of >64kB frames - Ethernet embedded switches: - Marvell (mv88e6xxx): - enable USXGMII (88E6191X) - Microchip: - lan966x: add support for Egress Stage 0 ACL engine - lan966x: support mapping packet priority to internal switch priority (based on PCP or DSCP) - Ethernet PHYs: - Broadcom PHYs: - support for Wake-on-LAN for BCM54210E/B50212E - report LPI counter - Microsemi PHYs: support RGMII delay configuration (VSC85xx) - Micrel PHYs: receive timestamp in the frame (LAN8841) - Realtek PHYs: support optional external PHY clock - Altera TSE PCS: merge the driver into Lynx PCS which it is a variant of - CAN: Kvaser PCIEcan: - support packet timestamping - WiFi: - Intel (iwlwifi): - major update for new firmware and Multi-Link Operation (MLO) - configuration rework to drop test devices and split the different families - support for segmented PNVM images and power tables - new vendor entries for PPAG (platform antenna gain) feature - Qualcomm 802.11ax (ath11k): - Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID Advertisement (EMA) support in AP mode - support factory test mode - RealTek (rtw89): - add RSSI based antenna diversity - support U-NII-4 channels on 5 GHz band - RealTek (rtl8xxxu): - AP mode support for 8188f - support USB RX aggregation for the newer chips" tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits) net: scm: introduce and use scm_recv_unix helper af_unix: Skip SCM_PIDFD if scm->pid is NULL. net: lan743x: Simplify comparison netlink: Add __sock_i_ino() for __netlink_diag_dump(). net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses Revert "af_unix: Call scm_recv() only after scm_set_cred()." phylink: ReST-ify the phylink_pcs_neg_mode() kdoc libceph: Partially revert changes to support MSG_SPLICE_PAGES net: phy: mscc: fix packet loss due to RGMII delays net: mana: use vmalloc_array and vcalloc net: enetc: use vmalloc_array and vcalloc ionic: use vmalloc_array and vcalloc pds_core: use vmalloc_array and vcalloc gve: use vmalloc_array and vcalloc octeon_ep: use vmalloc_array and vcalloc net: usb: qmi_wwan: add u-blox 0x1312 composition perf trace: fix MSG_SPLICE_PAGES build error ipvlan: Fix return value of ipvlan_queue_xmit() netfilter: nf_tables: fix underflow in chain reference counter netfilter: nf_tables: unbind non-anonymous set if rule construction fails ...
2023-06-29	Merge tag 'v6.5-rc1-modules-next' of ↵	Linus Torvalds	1	-3/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux Pull module updates from Luis Chamberlain: "The changes queued up for modules are pretty tame, mostly code removal of moving of code. Only two minor functional changes are made, the only one which stands out is Sebastian Andrzej Siewior's simplification of module reference counting by removing preempt_disable() and that has been tested on linux-next for well over a month without no regressions. I'm now, I guess, also a kitchen sink for some kallsyms changes" [ There was a mis-communication about the concurrent module load changes that I had expected to come through Luis despite me authoring the patch. So some of the module updates were left hanging in the email ether, and I just committed them separately. It's my bad - I should have made it more clear that I expected my own patches to come through the module tree too. Now they missed linux-next, but hopefully that won't cause any issues - Linus ] * tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: kallsyms: make kallsyms_show_value() as generic function kallsyms: move kallsyms_show_value() out of kallsyms.c kallsyms: remove unsed API lookup_symbol_attrs kallsyms: remove unused arch_get_kallsym() helper module: Remove preempt_disable() from module reference counting.
2023-06-25	Merge tag 'for-netdev' of ↵	Jakub Kicinski	33	-174/+1313
	https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-06-23 We've added 49 non-merge commits during the last 24 day(s) which contain a total of 70 files changed, 1935 insertions(+), 442 deletions(-). The main changes are: 1) Extend bpf_fib_lookup helper to allow passing the route table ID, from Louis DeLosSantos. 2) Fix regsafe() in verifier to call check_ids() for scalar registers, from Eduard Zingerman. 3) Extend the set of cpumask kfuncs with bpf_cpumask_first_and() and a rework of bpf_cpumask_any() kfuncs. Additionally, add selftests, from David Vernet. 4) Fix socket lookup BPF helpers for tc/XDP to respect VRF bindings, from Gilad Sever. 5) Change bpf_link_put() to use workqueue unconditionally to fix it under PREEMPT_RT, from Sebastian Andrzej Siewior. 6) Follow-ups to address issues in the bpf_refcount shared ownership implementation, from Dave Marchevsky. 7) A few general refactorings to BPF map and program creation permissions checks which were part of the BPF token series, from Andrii Nakryiko. 8) Various fixes for benchmark framework and add a new benchmark for BPF memory allocator to BPF selftests, from Hou Tao. 9) Documentation improvements around iterators and trusted pointers, from Anton Protopopov. 10) Small cleanup in verifier to improve allocated object check, from Daniel T. Lee. 11) Improve performance of bpf_xdp_pointer() by avoiding access to shared_info when XDP packet does not have frags, from Jesper Dangaard Brouer. 12) Silence a harmless syzbot-reported warning in btf_type_id_size(), from Yonghong Song. 13) Remove duplicate bpfilter_umh_cleanup in favor of umd_cleanup_helper, from Jarkko Sakkinen. 14) Fix BPF selftests build for resolve_btfids under custom HOSTCFLAGS, from Viktor Malik. tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits) bpf, docs: Document existing macros instead of deprecated bpf, docs: BPF Iterator Document selftests/bpf: Fix compilation failure for prog vrf_socket_lookup selftests/bpf: Add vrf_socket_lookup tests bpf: Fix bpf socket lookup from tc/xdp to respect socket VRF bindings bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC hookpoint bpf: Factor out socket lookup functions for the TC hookpoint. selftests/bpf: Set the default value of consumer_cnt as 0 selftests/bpf: Ensure that next_cpu() returns a valid CPU number selftests/bpf: Output the correct error code for pthread APIs selftests/bpf: Use producer_cnt to allocate local counter array xsk: Remove unused inline function xsk_buff_discard() bpf: Keep BPF_PROG_LOAD permission checks clear of validations bpf: Centralize permissions checks for all BPF map types bpf: Inline map creation logic in map_create() function bpf: Move unprivileged checks into map_create() and bpf_prog_load() bpf: Remove in_atomic() from bpf_link_put(). selftests/bpf: Verify that check_ids() is used for scalars in regsafe() bpf: Verify scalar ids mapping in regsafe() using check_ids() selftests/bpf: Check if mark_chain_precision() follows scalar ids ... ==================== Link: https://lore.kernel.org/r/20230623211256.8409-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	3	-0/+159
	Cross-merge networking fixes after downstream PR. Conflicts: tools/testing/selftests/net/fcnal-test.sh d7a2fc1437f7 ("selftests: net: fcnal-test: check if FIPS mode is enabled") dd017c72dde6 ("selftests: fcnal: Test SO_DONTROUTE on TCP sockets.") https://lore.kernel.org/all/5007b52c-dd16-dbf6-8d64-b9701bfa498b@tessares.net/ https://lore.kernel.org/all/20230619105427.4a0df9b3@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-22	selftests/bpf: Fix compilation failure for prog vrf_socket_lookup	Yonghong Song	1	-2/+3
	When building the latest kernel/selftest with clang17 compiler: make LLVM=1 -j <== for kernel make -C tools/testing/selftests/bpf LLVM=1 -j <== for selftest I hit the following compilation error: [...] In file included from progs/vrf_socket_lookup.c:3: In file included from /usr/include/linux/ip.h:21: In file included from /usr/include/asm/byteorder.h:5: In file included from /usr/include/linux/byteorder/little_endian.h:13: /usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline' 136 \| static __always_inline unsigned long __swab(const unsigned long y) \| ^ /usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline' 171 \| static __always_inline __u16 __swab16p(const __u16 p) \| ^ /usr/include/linux/swab.h:171:29: error: expected ';' after top level declarator 171 \| static __always_inline __u16 __swab16p(const __u16 p) \| ^ [...] Basically, with header files in my local host which is based on 5.12 kernel, __always_inline is not defined and this caused compilation failure. Since __always_inline is defined in bpf_helpers.h, let us move bpf_helpers.h to an early position which fixed the problem. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230622061921.816772-1-yhs@fb.com
2023-06-22	selftests/bpf: Add vrf_socket_lookup tests	Gilad Sever	2	-0/+400
	Verify that socket lookup via TC/XDP with all BPF APIs is VRF aware. Signed-off-by: Gilad Sever <gilad9366@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eyal Birger <eyal.birger@gmail.com> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230621104211.301902-5-gilad9366@gmail.com
2023-06-19	selftests/bpf: Set the default value of consumer_cnt as 0	Hou Tao	14	-128/+35
	Considering that only bench_ringbufs.c supports consumer, just set the default value of consumer_cnt as 0. After that, update the validity check of consumer_cnt, remove unused consumer_thread code snippets and set consumer_cnt as 1 in run_bench_ringbufs.sh accordingly. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20230613080921.1623219-5-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>