Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 7015843afcaf68c132784c89528dfddc0005e483 ]
Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT
configs. In this particular case, the base VM is AMD with 166 cpus, and I
run selftests with regular qemu on top of that and indeed send_signal test
failed. I also tried with an Intel box with 80 cpus and there is no issue.
The main qemu command line includes:
-enable-kvm -smp 16 -cpu host
The failure log looks like:
$ ./test_progs -t send_signal
[ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225]
[ 48.503622] Modules linked in: bpf_testmod(O)
[ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69
[ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
[ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290
[ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb
[ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246
[ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0
[ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000
[ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000
[ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280
[ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000
[ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0
[ 48.537538] Call Trace:
[ 48.537538] <IRQ>
[ 48.537538] ? watchdog_timer_fn+0x1cd/0x250
[ 48.539590] ? lockup_detector_update_enable+0x50/0x50
[ 48.539590] ? __hrtimer_run_queues+0xff/0x280
[ 48.542520] ? hrtimer_interrupt+0x103/0x230
[ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140
[ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90
[ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 48.547612] ? handle_softirqs+0x71/0x290
[ 48.547612] irq_exit_rcu+0x63/0x80
[ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90
[ 48.552521] </IRQ>
[ 48.553529] <TASK>
[ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20
[ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260
[ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74
[ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282
[ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620
[ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00
[ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000
[ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000
[ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00
[ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260
[ 48.576523] __schedule+0x364/0xac0
[ 48.577535] schedule+0x2e/0x110
[ 48.578555] pipe_read+0x301/0x400
[ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30
[ 48.579589] vfs_read+0x2b3/0x2f0
[ 48.579589] ksys_read+0x8b/0xc0
[ 48.583590] do_syscall_64+0x3d/0xc0
[ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53
[ 48.586525] RIP: 0033:0x7f2f28703fa1
[ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0
[ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1
[ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006
[ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000
[ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0
[ 48.605527] </TASK>
In the test, two processes are communicating through pipe. Further debugging
with strace found that the above splat is triggered as read() syscall could
not receive the data even if the corresponding write() syscall in another
process successfully wrote data into the pipe.
The failed subtest is "send_signal_perf". The corresponding perf event has
sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every
overflow event will trigger a call to the BPF program. So I suspect this may
overwhelm the system. So I increased the sample_period to 100,000 and the test
passed. The sample_period 10,000 still has the test failed.
In other parts of selftest, e.g., [1], sample_freq is used instead. So I
decided to use sample_freq = 1,000 since the test can pass as well.
[1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 517125f6749402e579f715519147145944f12ad9 upstream.
Revert commit 90dc946059b7 ("selftests/bpf: DENYLIST.aarch64: Remove
fexit_sleep") again. The fix in 19d3c179a377 ("bpf, arm64: Fix trampoline
for BPF_TRAMP_F_CALL_ORIG") does not address all of the issues and BPF
CI is still hanging and timing out:
https://github.com/kernel-patches/bpf/actions/runs/9905842936/job/27366435436
[...]
#89/11 fexit_bpf2bpf/func_replace_global_func:OK
#89/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
#89/13 fexit_bpf2bpf/func_replace_progmap:OK
#89 fexit_bpf2bpf:OK
Error: The operation was canceled.
Thus more investigation work & fixing is needed before the test can be put
in place again.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/bpf/20240705145009.32340-1-puranjay@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 189f1a976e426011e6a5588f1d3ceedf71fe2965 ]
For all these years libbpf's BTF dumper has been emitting not strictly
valid syntax for function prototypes that have no input arguments.
Instead of `int (*blah)()` we should emit `int (*blah)(void)`.
This is not normally a problem, but it manifests when we get kfuncs in
vmlinux.h that have no input arguments. Due to compiler internal
specifics, we get no BTF information for such kfuncs, if they are not
declared with proper `(void)`.
The fix is trivial. We also need to adjust a few ancient tests that
happily assumed `()` is correct.
Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e1ef78dce9b7b0fa7f9d88bb3554441d74d33b34 ]
On ARM64 the stack pointer should be aligned at a 16 byte boundary or
the SPAlignmentFault can occur. The fexit_sleep selftest allocates the
stack for the child process as a character array, this is not guaranteed
to be aligned at 16 bytes.
Because of the SPAlignmentFault, the child process is killed before it
can do the nanosleep call and hence fentry_cnt remains as 0. This causes
the main thread to hang on the following line:
while (READ_ONCE(fexit_skel->bss->fentry_cnt) != 2);
Fix this by allocating the stack using mmap() as described in the
example in the man page of clone().
Remove the fexit_sleep test from the DENYLIST of arm64.
Fixes: eddbe8e65214 ("selftest/bpf: Add a test to check trampoline freeing logic.")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240715173327.8657-1-puranjay@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 52b49ec1b2c78deb258596c3b231201445ef5380 ]
If bpf_object__load() fails in test_xdp_adjust_frags_tail_grow(), "obj"
opened before this should be closed. So use "goto out" to close it instead
of using "return" here.
Fixes: 110221081aac ("bpf: selftests: update xdp_adjust_tail selftest to include xdp frags")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f282a1ed2d0e3fb38cceefec8e81cabb69cab260.1720615848.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eef0532e900c20a6760da829e82dac3ee18688c5 ]
Run bpf_tcp_ca selftests (./test_progs -t bpf_tcp_ca) on a Loongarch
platform, some "Segmentation fault" errors occur:
'''
test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/1 bpf_tcp_ca/dctcp:FAIL
test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/2 bpf_tcp_ca/cubic:FAIL
test_dctcp_fallback:PASS:dctcp_skel 0 nsec
test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
#29/4 bpf_tcp_ca/dctcp_fallback:FAIL
test_write_sk_pacing:PASS:open_and_load 0 nsec
test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
#29/6 bpf_tcp_ca/write_sk_pacing:FAIL
test_update_ca:PASS:open 0 nsec
test_update_ca:FAIL:attach_struct_ops unexpected error: -524
settcpca:FAIL:setsockopt unexpected setsockopt: \
actual -1 == expected -1
(network_helpers.c:99: errno: No such file or directory) \
Failed to call post_socket_cb
start_test:FAIL:start_server_str unexpected start_server_str: \
actual -1 == expected -1
test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: \
actual 0 <= expected 0
#29/9 bpf_tcp_ca/update_ca:FAIL
#29 bpf_tcp_ca:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x28)[0x5555567ed91c]
linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0]
./test_progs(bpf_link__update_map+0x80)[0x555556824a78]
./test_progs(+0x94d68)[0x5555564c4d68]
./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88]
./test_progs(+0x3bde54)[0x5555567ede54]
./test_progs(main+0x61c)[0x5555567efd54]
/usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208]
/usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c]
./test_progs(_start+0x48)[0x55555646bca8]
Segmentation fault
'''
This is because BPF trampoline is not implemented on Loongarch yet,
"link" returned by bpf_map__attach_struct_ops() is NULL. test_progs
crashs when this NULL link passes to bpf_link__update_map(). This
patch adds NULL checks for all links in bpf_tcp_ca to fix these errors.
If "link" is NULL, goto the newly added label "out" to destroy the skel.
v2:
- use "goto out" instead of "return" as Eduard suggested.
Fixes: 06da9f3bd641 ("selftests/bpf: Test switching TCP Congestion Control algorithms.")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/b4c841492bd4ed97964e4e61e92827ce51bf1dc9.1720615848.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit adae187ebedcd95d02f045bc37dfecfd5b29434b ]
In the error path when update_lookup_map() fails in drop_on_reuseport in
prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed.
This patch fixes this by using "goto close_srv1" lable instead of "detach"
to close "server1" in this case.
Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 717d6313bba1b3179f0bf1026aaec6b7e26f484e ]
This reverts [1] and changes return value for bpf_session_cookie
in bpf selftests. Having long * might lead to problems on 32-bit
architectures.
Fixes: 2b8dd87332cd ("bpf: Make bpf_session_cookie() kfunc return long *")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240619081624.1620152-1-jolsa@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit de1b5ea789dc28066cc8dc634b6825bd6148f38b ]
The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's
necessary to check if it is positive before accumulating it to bytes_recvd.
Fixes: 16962b2404ac ("bpf: sockmap, add selftests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6c8d7598dfed759bf1d9d0322b4c2b42eb7252d8 ]
bpf_prog5 and bpf_prog7 are removed from progs/test_sockmap_kern.h in
commit d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests"),
now there are only 9 progs in it, not 11:
SEC("sk_skb1")
int bpf_prog1(struct __sk_buff *skb)
SEC("sk_skb2")
int bpf_prog2(struct __sk_buff *skb)
SEC("sk_skb3")
int bpf_prog3(struct __sk_buff *skb)
SEC("sockops")
int bpf_sockmap(struct bpf_sock_ops *skops)
SEC("sk_msg1")
int bpf_prog4(struct sk_msg_md *msg)
SEC("sk_msg2")
int bpf_prog6(struct sk_msg_md *msg)
SEC("sk_msg3")
int bpf_prog8(struct sk_msg_md *msg)
SEC("sk_msg4")
int bpf_prog9(struct sk_msg_md *msg)
SEC("sk_msg5")
int bpf_prog10(struct sk_msg_md *msg)
This patch updates the array sizes of prog_fd[], prog_attach_type[] and
prog_type[] from 11 to 9 accordingly.
Fixes: d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/9c10d9f974f07fcb354a43a8eca67acb2fafc587.1715926605.git.tanggeliang@kylinos.cn
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Add a selftest that tries to trigger a situation where two timer callbacks
are attempting to cancel each other's timer. By running them continuously,
we hit a condition where both run in parallel and cancel each other.
Without the fix in the previous patch, this would cause a lockup as
hrtimer_cancel on either side will wait for forward progress from the
callback.
Ensure that this situation leads to a EDEADLK error.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240711052709.2148616-1-memxor@gmail.com
|
|
Add a test case which replaces an active ingress qdisc while keeping the
miniq in-tact during the transition period to the new clsact qdisc.
# ./vmtest.sh -- ./test_progs -t tc_link
[...]
./test_progs -t tc_link
[ 3.412871] bpf_testmod: loading out-of-tree module taints kernel.
[ 3.413343] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
#332 tc_links_after:OK
#333 tc_links_append:OK
#334 tc_links_basic:OK
#335 tc_links_before:OK
#336 tc_links_chain_classic:OK
#337 tc_links_chain_mixed:OK
#338 tc_links_dev_chain0:OK
#339 tc_links_dev_cleanup:OK
#340 tc_links_dev_mixed:OK
#341 tc_links_ingress:OK
#342 tc_links_invalid:OK
#343 tc_links_prepend:OK
#344 tc_links_replace:OK
#345 tc_links_revision:OK
Summary: 14/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240708133130.11609-2-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add few tests with may_goto and negative offset.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240619235355.85031-2-alexei.starovoitov@gmail.com
|
|
Add test coverage for reservations beyond the ring buffer size in order
to validate that bpf_ringbuf_reserve() rejects the request with NULL, all
other ring buffer tests keep passing as well:
# ./vmtest.sh -- ./test_progs -t ringbuf
[...]
./test_progs -t ringbuf
[ 1.165434] bpf_testmod: loading out-of-tree module taints kernel.
[ 1.165825] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
[ 1.284001] tsc: Refined TSC clocksource calibration: 3407.982 MHz
[ 1.286871] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc34e357, max_idle_ns: 440795379773 ns
[ 1.289555] clocksource: Switched to clocksource tsc
#274/1 ringbuf/ringbuf:OK
#274/2 ringbuf/ringbuf_n:OK
#274/3 ringbuf/ringbuf_map_key:OK
#274/4 ringbuf/ringbuf_write:OK
#274 ringbuf:OK
#275 ringbuf_multi:OK
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
[ Test fixups for getting BPF CI back to work ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240621140828.18238-2-daniel@iogearbox.net
|
|
Add few tests with may_goto and jumps to the 1st insn.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240619011859.79334-2-alexei.starovoitov@gmail.com
|
|
Add three unit tests in verifier_movsx.c to cover
cases where missed var_off setting can cause
unexpected verification success or failure.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240615174637.3995589-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a test case for the jmp32/k fix to ensure selftests have coverage.
Before fix:
# ./vmtest.sh -- ./test_progs -t verifier_or_jmp32_k
[...]
./test_progs -t verifier_or_jmp32_k
tester_init:PASS:tester_log_buf 0 nsec
process_subtest:PASS:obj_open_mem 0 nsec
process_subtest:PASS:specs_alloc 0 nsec
run_subtest:PASS:obj_open_mem 0 nsec
run_subtest:FAIL:unexpected_load_success unexpected success: 0
#492/1 verifier_or_jmp32_k/or_jmp32_k: bit ops + branch on unknown value:FAIL
#492 verifier_or_jmp32_k:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
After fix:
# ./vmtest.sh -- ./test_progs -t verifier_or_jmp32_k
[...]
./test_progs -t verifier_or_jmp32_k
#492/1 verifier_or_jmp32_k/or_jmp32_k: bit ops + branch on unknown value:OK
#492 verifier_or_jmp32_k:OK
Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20240613115310.25383-3-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Recent kernel change ([0]) changed inet_csk_accept() prototype. Adapt
progs/test_sk_storage_tracing.c to take that into account.
[0] 92ef0fd55ac8 ("net: change proto and proto_ops accept type")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240528223218.3445297-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Verifier enforces that only certain program types can mutate sock{map,hash}
maps, that is update it or delete from it. Add test coverage for these
checks so we don't regress.
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20240527-sockmap-verify-deletes-v1-3-944b372f2101@cloudflare.com
|
|
Add a test case to assert that the skb->pkt_type which was set from the BPF
program is retained from the netkit xmit side to the peer's device at tcx
ingress location.
# ./vmtest.sh -- ./test_progs -t netkit
[...]
./test_progs -t netkit
[ 1.140780] bpf_testmod: loading out-of-tree module taints kernel.
[ 1.141127] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
[ 1.284601] tsc: Refined TSC clocksource calibration: 3408.006 MHz
[ 1.286672] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd9b189d, max_idle_ns: 440795225691 ns
[ 1.290384] clocksource: Switched to clocksource tsc
#345 tc_netkit_basic:OK
#346 tc_netkit_device:OK
#347 tc_netkit_multi_links:OK
#348 tc_netkit_multi_opts:OK
#349 tc_netkit_neigh_links:OK
#350 tc_netkit_pkt_type:OK
Summary: 6/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240524163619.26001-4-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This adds simple tests around setting MAC addresses in the different
netkit modes.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240524163619.26001-3-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Validate libbpf's USDT-over-multi-uprobe logic by adding USDTs to
existing multi-uprobe tests. This checks correct libbpf fallback to
singular uprobes (when run on older kernels with buggy PID filtering).
We reuse already established child process and child thread testing
infrastructure, so additions are minimal. These test fail on either
older kernels or older version of libbpf that doesn't detect PID
filtering problems.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240521163401.3005045-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Extend existing multi-uprobe tests to test that PID filtering works
correctly. We already have child *process* tests, but we need also child
*thread* tests. This patch adds spawn_thread() helper to start child
thread, wait for it to be ready, and then instruct it to trigger desired
uprobes.
Additionally, we extend BPF-side code to track thread ID, not just
process ID. Also we detect whether extraneous triggerings with
unexpected process IDs happened, and validate that none of that happened
in practice.
These changes prove that fixed PID filtering logic for multi-uprobe
works as expected. These tests fail on old kernels.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20240521163401.3005045-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Current implementation of PID filtering logic for multi-uprobes in
uprobe_prog_run() is filtering down to exact *thread*, while the intent
for PID filtering it to filter by *process* instead. The check in
uprobe_prog_run() also differs from the analogous one in
uprobe_multi_link_filter() for some reason. The latter is correct,
checking task->mm, not the task itself.
Fix the check in uprobe_prog_run() to perform the same task->mm check.
While doing this, we also update get_pid_task() use to use PIDTYPE_TGID
type of lookup, given the intent is to get a representative task of an
entire process. This doesn't change behavior, but seems more logical. It
would hold task group leader task now, not any random thread task.
Last but not least, given multi-uprobe support is half-broken due to
this PID filtering logic (depending on whether PID filtering is
important or not), we need to make it easy for user space consumers
(including libbpf) to easily detect whether PID filtering logic was
already fixed.
We do it here by adding an early check on passed pid parameter. If it's
negative (and so has no chance of being a valid PID), we return -EINVAL.
Previous behavior would eventually return -ESRCH ("No process found"),
given there can't be any process with negative PID. This subtle change
won't make any practical change in behavior, but will allow applications
to detect PID filtering fixes easily. Libbpf fixes take advantage of
this in the next patch.
Cc: stable@vger.kernel.org
Acked-by: Jiri Olsa <jolsa@kernel.org>
Fixes: b733eeade420 ("bpf: Add pid filter support for uprobe_multi link")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240521163401.3005045-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The btf_dump test fails:
test_btf_dump_struct_data:FAIL:file_operations unexpected file_operations: actual '(struct file_operations){
.owner = (struct module *)0xffffffffffffffff,
.fop_flags = (fop_flags_t)4294967295,
.llseek = (loff_t (*)(struct f' != expected '(struct file_operations){
.owner = (struct module *)0xffffffffffffffff,
.llseek = (loff_t (*)(struct file *, loff_t, int))0xffffffffffffffff,'
The "fop_flags" is a recent addition to the struct file_operations in
commit 210a03c9d51a ("fs: claw back a few FMODE_* bits")
This patch changes the test_btf_dump_struct_data() to reflect
this change.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240516164310.2481460-1-martin.lau@linux.dev
|
|
name change
After commit 4c3e509ea9f2 ("sched/balancing: Rename load_balance() => sched_balance_rq()"),
the load_balance kernel function is renamed to sched_balance_rq.
This patch adjusts the fentry program in test_access_variable_array.c
to reflect this kernel function name change.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240516170140.2689430-1-martin.lau@linux.dev
|
|
Add test cases validating usage of PERCPU_ARRAY and PERCPU_HASH maps as
inner maps.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240515062440.846086-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Complete rework of garbage collection of AF_UNIX sockets.
AF_UNIX is prone to forming reference count cycles due to fd
passing functionality. New method based on Tarjan's Strongly
Connected Components algorithm should be both faster and remove a
lot of workarounds we accumulated over the years.
- Add TCP fraglist GRO support, allowing chaining multiple TCP
packets and forwarding them together. Useful for small switches /
routers which lack basic checksum offload in some scenarios (e.g.
PPPoE).
- Support using SMP threads for handling packet backlog i.e. packet
processing from software interfaces and old drivers which don't use
NAPI. This helps move the processing out of the softirq jumble.
- Continue work of converting from rtnl lock to RCU protection.
Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
address labels, netdev threaded NAPI sysfs files, bonding driver's
sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
of the link information available via rtnetlink.
- Small optimizations from Eric to UDP wake up handling, memory
accounting, RPS/RFS implementation, TCP packet sizing etc.
- Allow direct page recycling in the bulk API used by XDP, for +2%
PPS.
- Support peek with an offset on TCP sockets.
- Add MPTCP APIs for querying last time packets were received/sent/acked
and whether MPTCP "upgrade" succeeded on a TCP socket.
- Add intra-node communication shortcut to improve SMC performance.
- Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
driver.
- Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
- Add reset reasons for tracing what caused a TCP reset to be sent.
- Introduce direction attribute for xfrm (IPSec) states. State can be
used either for input or output packet processing.
Things we sprinkled into general kernel code:
- Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
This required touch-ups and renaming of a few existing users.
- Add Endian-dependent __counted_by_{le,be} annotations.
- Make building selftests "quieter" by printing summaries like
"CC object.o" rather than full commands with all the arguments.
Netfilter:
- Use GFP_KERNEL to clone elements, to deal better with OOM
situations and avoid failures in the .commit step.
BPF:
- Add eBPF JIT for ARCv2 CPUs.
- Support attaching kprobe BPF programs through kprobe_multi link in
a session mode, meaning, a BPF program is attached to both function
entry and return, the entry program can decide if the return
program gets executed and the entry program can share u64 cookie
value with return program. "Session mode" is a common use-case for
tetragon and bpftrace.
- Add the ability to specify and retrieve BPF cookie for raw
tracepoint programs in order to ease migration from classic to raw
tracepoints.
- Add an internal-only BPF per-CPU instruction for resolving per-CPU
memory addresses and implement support in x86, ARM64 and RISC-V
JITs. This allows inlining functions which need to access per-CPU
state.
- Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
atomics in bpf_arena which can be JITed as a single x86
instruction. Support BPF arena on ARM64.
- Add a new bpf_wq API for deferring events and refactor
process-context bpf_timer code to keep common code where possible.
- Harden the BPF verifier's and/or/xor value tracking.
- Introduce crypto kfuncs to let BPF programs call kernel crypto
APIs.
- Support bpf_tail_call_static() helper for BPF programs with GCC 13.
- Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
program to have code sections where preemption is disabled.
Driver API:
- Skip software TC processing completely if all installed rules are
marked as HW-only, instead of checking the HW-only flag rule by
rule.
- Add support for configuring PoE (Power over Ethernet), similar to
the already existing support for PoDL (Power over Data Line)
config.
- Initial bits of a queue control API, for now allowing a single
queue to be reset without disturbing packet flow to other queues.
- Common (ethtool) statistics for hardware timestamping.
Tests and tooling:
- Remove the need to create a config file to run the net forwarding
tests so that a naive "make run_tests" can exercise them.
- Define a method of writing tests which require an external endpoint
to communicate with (to send/receive data towards the test
machine). Add a few such tests.
- Create a shared code library for writing Python tests. Expose the
YAML Netlink library from tools/ to the tests for easy Netlink
access.
- Move netfilter tests under net/, extend them, separate performance
tests from correctness tests, and iron out issues found by running
them "on every commit".
- Refactor BPF selftests to use common network helpers.
- Further work filling in YAML definitions of Netlink messages for:
nftables, team driver, bonding interfaces, vlan interfaces, VF
info, TC u32 mark, TC police action.
- Teach Python YAML Netlink to decode attribute policies.
- Extend the definition of the "indexed array" construct in the specs
to cover arrays of scalars rather than just nests.
- Add hyperlinks between definitions in generated Netlink docs.
Drivers:
- Make sure unsupported flower control flags are rejected by drivers,
and make more drivers report errors directly to the application
rather than dmesg (large number of driver changes from Asbjørn
Sloth Tønnesen).
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support multiple RSS contexts and steering traffic to them
- support XDP metadata
- make page pool allocations more NUMA aware
- Intel (100G, ice, idpf):
- extract datapath code common among Intel drivers into a library
- use fewer resources in switchdev by sharing queues with the PF
- add PFCP filter support
- add Ethernet filter support
- use a spinlock instead of HW lock in PTP clock ops
- support 5 layer Tx scheduler topology
- nVidia/Mellanox:
- 800G link modes and 100G SerDes speeds
- per-queue IRQ coalescing configuration
- Marvell Octeon:
- support offloading TC packet mark action
- Ethernet NICs consumer, embedded and virtual:
- stop lying about skb->truesize in USB Ethernet drivers, it
messes up TCP memory calculations
- Google cloud vNIC:
- support changing ring size via ethtool
- support ring reset using the queue control API
- VirtIO net:
- expose flow hash from RSS to XDP
- per-queue statistics
- add selftests
- Synopsys (stmmac):
- support controllers which require an RX clock signal from the
MII bus to perform their hardware initialization
- TI:
- icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
- icssg_prueth: add SW TX / RX Coalescing based on hrtimers
- cpsw: minimal XDP support
- Renesas (ravb):
- support describing the MDIO bus
- Realtek (r8169):
- add support for RTL8168M
- Microchip Sparx5:
- matchall and flower actions mirred and redirect
- Ethernet switches:
- nVidia/Mellanox:
- improve events processing performance
- Marvell:
- add support for MV88E6250 family internal PHYs
- Microchip:
- add DCB and DSCP mapping support for KSZ switches
- vsc73xx: convert to PHYLINK
- Realtek:
- rtl8226b/rtl8221b: add C45 instances and SerDes switching
- Many driver changes related to PHYLIB and PHYLINK deprecated API
cleanup
- Ethernet PHYs:
- Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
- micrel: lan8814: add support for PPS out and external timestamp trigger
- WiFi:
- Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
drivers. Modern devices can only be configured using nl80211.
- mac80211/cfg80211
- handle color change per link for WiFi 7 Multi-Link Operation
- Intel (iwlwifi):
- don't support puncturing in 5 GHz
- support monitor mode on passive channels
- BZ-W device support
- P2P with HE/EHT support
- re-add support for firmware API 90
- provide channel survey information for Automatic Channel Selection
- MediaTek (mt76):
- mt7921 LED control
- mt7925 EHT radiotap support
- mt7920e PCI support
- Qualcomm (ath11k):
- P2P support for QCA6390, WCN6855 and QCA2066
- support hibernation
- ieee80211-freq-limit Device Tree property support
- Qualcomm (ath12k):
- refactoring in preparation of multi-link support
- suspend and hibernation support
- ACPI support
- debugfs support, including dfs_simulate_radar support
- RealTek:
- rtw88: RTL8723CS SDIO device support
- rtw89: RTL8922AE Wi-Fi 7 PCI device support
- rtw89: complete features of new WiFi 7 chip 8922AE including
BT-coexistence and Wake-on-WLAN
- rtw89: use BIOS ACPI settings to set TX power and channels
- rtl8xxxu: enable Management Frame Protection (MFP) support
- Bluetooth:
- support for Intel BlazarI and Filmore Peak2 (BE201)
- support for MediaTek MT7921S SDIO
- initial support for Intel PCIe BT driver
- remove HCI_AMP support"
* tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
selftests: netfilter: fix packetdrill conntrack testcase
net: gro: fix napi_gro_cb zeroed alignment
Bluetooth: btintel_pcie: Refactor and code cleanup
Bluetooth: btintel_pcie: Fix warning reported by sparse
Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
Bluetooth: btintel_pcie: Fix compiler warnings
Bluetooth: btintel_pcie: Add *setup* function to download firmware
Bluetooth: btintel_pcie: Add support for PCIe transport
Bluetooth: btintel: Export few static functions
Bluetooth: HCI: Remove HCI_AMP support
Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
Bluetooth: qca: Fix error code in qca_read_fw_build_info()
Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
Bluetooth: btintel: Add support for BlazarI
LE Create Connection command timeout increased to 20 secs
dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
Bluetooth: compute LE flow credits based on recvbuf space
Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
...
|
|
Merge in late fixes to prepare for the 6.10 net-next PR.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
- Combine perf and BPF for fast evalution of HW breakpoint
conditions
- Add LBR capture support outside of hardware events
- Trigger IO signals for watermark_wakeup
- Add RAPL support for Intel Arrow Lake and Lunar Lake
- Optimize frequency-throttling
- Miscellaneous cleanups & fixes
* tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too
selftests/perf_events: Test FASYNC with watermark wakeups
perf/ring_buffer: Trigger IO signals for watermark_wakeup
perf: Move perf_event_fasync() to perf_event.h
perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines
selftest/bpf: Test a perf BPF program that suppresses side effects
perf/bpf: Allow a BPF program to suppress all sample side effects
perf/bpf: Remove unneeded uses_default_overflow_handler()
perf/bpf: Call BPF handler directly, not through overflow machinery
perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members
perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL
perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow()
perf/x86/rapl: Add support for Intel Lunar Lake
perf/x86/rapl: Add support for Intel Arrow Lake
perf/core: Reduce PMU access to adjust sample freq
perf/core: Optimize perf_adjust_freq_unthr_context()
perf/x86/amd: Don't reject non-sampling events with configured LBR
perf/x86/amd: Support capturing LBR from software events
perf/x86/amd: Avoid taking branches before disabling LBR
perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
...
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-05-13
We've added 119 non-merge commits during the last 14 day(s) which contain
a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).
The main changes are:
1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.
2) Add BPF range computation improvements to the verifier in particular
around XOR and OR operators, refactoring of checks for range computation
and relaxing MUL range computation so that src_reg can also be an unknown
scalar, from Cupertino Miranda.
3) Add support to attach kprobe BPF programs through kprobe_multi link in
a session mode, meaning, a BPF program is attached to both function entry
and return, the entry program can decide if the return program gets
executed and the entry program can share u64 cookie value with return
program. Session mode is a common use-case for tetragon and bpftrace,
from Jiri Olsa.
4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf
as well as BPF selftest's struct_ops handling, from Andrii Nakryiko.
5) Improvements to BPF selftests in context of BPF gcc backend,
from Jose E. Marchesi & David Faust.
6) Migrate remaining BPF selftest tests from test_sock_addr.c to prog_test-
-style in order to retire the old test, run it in BPF CI and additionally
expand test coverage, from Jordan Rife.
7) Big batch for BPF selftest refactoring in order to remove duplicate code
around common network helpers, from Geliang Tang.
8) Another batch of improvements to BPF selftests to retire obsolete
bpf_tcp_helpers.h as everything is available vmlinux.h,
from Martin KaFai Lau.
9) Fix BPF map tear-down to not walk the map twice on free when both timer
and wq is used, from Benjamin Tissoires.
10) Fix BPF verifier assumptions about socket->sk that it can be non-NULL,
from Alexei Starovoitov.
11) Change BTF build scripts to using --btf_features for pahole v1.26+,
from Alan Maguire.
12) Small improvements to BPF reusing struct_size() and krealloc_array(),
from Andy Shevchenko.
13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions,
from Ilya Leoshkevich.
14) Extend TCP ->cong_control() callback in order to feed in ack and
flag parameters and allow write-access to tp->snd_cwnd_stamp
from BPF program, from Miao Xu.
15) Add support for internal-only per-CPU instructions to inline
bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs,
from Puranjay Mohan.
16) Follow-up to remove the redundant ethtool.h from tooling infrastructure,
from Tushar Vyavahare.
17) Extend libbpf to support "module:<function>" syntax for tracing
programs, from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
bpf: make list_for_each_entry portable
bpf: ignore expected GCC warning in test_global_func10.c
bpf: disable strict aliasing in test_global_func9.c
selftests/bpf: Free strdup memory in xdp_hw_metadata
selftests/bpf: Fix a few tests for GCC related warnings.
bpf: avoid gcc overflow warning in test_xdp_vlan.c
tools: remove redundant ethtool.h from tooling infra
selftests/bpf: Expand ATTACH_REJECT tests
selftests/bpf: Expand getsockname and getpeername tests
sefltests/bpf: Expand sockaddr hook deny tests
selftests/bpf: Expand sockaddr program return value tests
selftests/bpf: Retire test_sock_addr.(c|sh)
selftests/bpf: Remove redundant sendmsg test cases
selftests/bpf: Migrate ATTACH_REJECT test cases
selftests/bpf: Migrate expected_attach_type tests
selftests/bpf: Migrate wildcard destination rewrite test
selftests/bpf: Migrate sendmsg6 v4 mapped address tests
selftests/bpf: Migrate sendmsg deny test cases
selftests/bpf: Migrate WILDCARD_IP test
selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
...
====================
Link: https://lore.kernel.org/r/20240513134114.17575-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
[Changes from V1:
- The __compat_break has been abandoned in favor of
a more readable can_loop macro that can be used anywhere, including
loop conditions.]
The macro list_for_each_entry is defined in bpf_arena_list.h as
follows:
#define list_for_each_entry(pos, head, member) \
for (void * ___tmp = (pos = list_entry_safe((head)->first, \
typeof(*(pos)), member), \
(void *)0); \
pos && ({ ___tmp = (void *)pos->member.next; 1; }); \
cond_break, \
pos = list_entry_safe((void __arena *)___tmp, typeof(*(pos)), member))
The macro cond_break, in turn, expands to a statement expression that
contains a `break' statement. Compound statement expressions, and the
subsequent ability of placing statements in the header of a `for'
loop, are GNU extensions.
Unfortunately, clang implements this GNU extension differently than
GCC:
- In GCC the `break' statement is bound to the containing "breakable"
context in which the defining `for' appears. If there is no such
context, GCC emits a warning: break statement without enclosing `for'
o `switch' statement.
- In clang the `break' statement is bound to the defining `for'. If
the defining `for' is itself inside some breakable construct, then
clang emits a -Wgcc-compat warning.
This patch adds a new macro can_loop to bpf_experimental, that
implements the same logic than cond_break but evaluates to a boolean
expression. The patch also changes all the current instances of usage
of cond_break withing the header of loop accordingly.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Link: https://lore.kernel.org/r/20240511212243.23477-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The BPF selftest global_func10 in progs/test_global_func10.c contains:
struct Small {
long x;
};
struct Big {
long x;
long y;
};
[...]
__noinline int foo(const struct Big *big)
{
if (!big)
return 0;
return bpf_get_prandom_u32() < big->y;
}
[...]
SEC("cgroup_skb/ingress")
__failure __msg("invalid indirect access to stack")
int global_func10(struct __sk_buff *skb)
{
const struct Small small = {.x = skb->len };
return foo((struct Big *)&small) ? 1 : 0;
}
GCC emits a "maybe uninitialized" warning for the code above, because
it knows `foo' accesses `big->y'.
Since the purpose of this selftest is to check that the verifier will
fail on this sort of invalid memory access, this patch just silences
the compiler warning.
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240511212349.23549-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The BPF selftest test_global_func9.c performs type punning and breaks
srict-aliasing rules.
In particular, given:
int global_func9(struct __sk_buff *skb)
{
int result = 0;
[...]
{
const struct C c = {.x = skb->len, .y = skb->family };
result |= foo((const struct S *)&c);
}
}
When building with strict-aliasing enabled (the default) the
initialization of `c' gets optimized away in its entirely:
[... no initialization of `c' ...]
r1 = r10
r1 += -40
call foo
w0 |= w6
Since GCC knows that `foo' accesses s->x, we get a "maybe
uninitialized" warning.
On the other hand, when strict-aliasing is disabled GCC only optimizes
away the store to `.y':
r1 = *(u32 *) (r6+0)
*(u32 *) (r10+-40) = r1 ; This is .x = skb->len in `c'
r1 = r10
r1 += -40
call foo
w0 |= w6
In this case the warning is not emitted, because s-> is initialized.
This patch disables strict aliasing in this test when building with
GCC. clang seems to not optimize this particular code even when
strict aliasing is enabled.
Tested in bpf-next master.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240511212213.23418-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The strdup() function returns a pointer to a new string which is a
duplicate of the string "ifname". Memory for the new string is obtained
with malloc(), and need to be freed with free().
This patch adds this missing "free(saved_hwtstamp_ifname)" in cleanup()
to avoid a potential memory leak in xdp_hw_metadata.c.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/af9bcccb96655e82de5ce2b4510b88c9c8ed5ed0.1715417367.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch corrects a few warnings to allow selftests to compile for
GCC.
-- progs/cpumask_failure.c --
progs/bpf_misc.h:136:22: error: ‘cpumask’ is used uninitialized
[-Werror=uninitialized]
136 | #define __sink(expr) asm volatile("" : "+g"(expr))
| ^~~
progs/cpumask_failure.c:68:9: note: in expansion of macro ‘__sink’
68 | __sink(cpumask);
The macro __sink(cpumask) with the '+' contraint modifier forces the
the compiler to expect a read and write from cpumask. GCC detects
that cpumask is never initialized and reports an error.
This patch removes the spurious non required definitions of cpumask.
-- progs/dynptr_fail.c --
progs/dynptr_fail.c:1444:9: error: ‘ptr1’ may be used uninitialized
[-Werror=maybe-uninitialized]
1444 | bpf_dynptr_clone(&ptr1, &ptr2);
Many of the tests in the file are related to the detection of
uninitialized pointers by the verifier. GCC is able to detect possible
uninitialized values, and reports this as an error.
The patch initializes all of the previous uninitialized structs.
-- progs/test_tunnel_kern.c --
progs/test_tunnel_kern.c:590:9: error: array subscript 1 is outside
array bounds of ‘struct geneve_opt[1]’ [-Werror=array-bounds=]
590 | *(int *) &gopt.opt_data = bpf_htonl(0xdeadbeef);
| ^~~~~~~~~~~~~~~~~~~~~~~
progs/test_tunnel_kern.c:575:27: note: at offset 4 into object ‘gopt’ of
size 4
575 | struct geneve_opt gopt;
This tests accesses beyond the defined data for the struct geneve_opt
which contains as last field "u8 opt_data[0]" which clearly does not get
reserved space (in stack) in the function header. This pattern is
repeated in ip6geneve_set_tunnel and geneve_set_tunnel functions.
GCC is able to see this and emits a warning.
The patch introduces a local struct that allocates enough space to
safely allow the write to opt_data field.
-- progs/jeq_infer_not_null_fail.c --
progs/jeq_infer_not_null_fail.c:21:40: error: array subscript ‘struct
bpf_map[0]’ is partly outside array bounds of ‘struct <anonymous>[1]’
[-Werror=array-bounds=]
21 | struct bpf_map *inner_map = map->inner_map_meta;
| ^~
progs/jeq_infer_not_null_fail.c:14:3: note: object ‘m_hash’ of size 32
14 | } m_hash SEC(".maps");
This example defines m_hash in the context of the compilation unit and
casts it to struct bpf_map which is much smaller than the size of struct
bpf_map. It errors out in GCC when it attempts to access an element that
would be defined in struct bpf_map outsize of the defined limits for
m_hash.
This patch disables the warning through a GCC pragma.
This changes were tested in bpf-next master selftests without any
regressions.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: jose.marchesi@oracle.com
Cc: david.faust@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/r/20240510183850.286661-2-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch fixes an integer overflow warning raised by GCC in
xdp_prognum1 of progs/test_xdp_vlan.c:
GCC-BPF [test_maps] test_xdp_vlan.bpf.o
progs/test_xdp_vlan.c: In function 'xdp_prognum1':
progs/test_xdp_vlan.c:163:25: error: integer overflow in expression
'(short int)(((__builtin_constant_p((int)vlan_hdr->h_vlan_TCI)) != 0
? (int)(short unsigned int)((short int)((int)vlan_hdr->h_vlan_TCI
<< 8 >> 8) << 8 | (short int)((int)vlan_hdr->h_vlan_TCI << 0 >> 8
<< 0)) & 61440 : (int)__builtin_bswap16(vlan_hdr->h_vlan_TCI)
& 61440) << 8 >> 8) << 8' of type 'short int' results in '0' [-Werror=overflow]
163 | bpf_htons((bpf_ntohs(vlan_hdr->h_vlan_TCI) & 0xf000)
| ^~~~~~~~~
The problem lies with the expansion of the bpf_htons macro and the
expression passed into it. The bpf_htons macro (and similarly the
bpf_ntohs macro) expand to a ternary operation using either
__builtin_bswap16 or ___bpf_swab16 to swap the bytes, depending on
whether the expression is constant.
For an expression, with 'value' as a u16, like:
bpf_htons (value & 0xf000)
The entire (value & 0xf000) is 'x' in the expansion of ___bpf_swab16
and we get as one part of the expanded swab16:
((__u16)(value & 0xf000) << 8 >> 8 << 8
This will always evaluate to 0, which is intentional since this
subexpression deals with the byte guaranteed to be 0 by the mask.
However, GCC warns because the precise reason this always evaluates to 0
is an overflow. Specifically, the plain 0xf000 in the expression is a
signed 32-bit integer, which causes 'value' to also be promoted to a
signed 32-bit integer, and the combination of the 8-bit left shift and
down-cast back to __u16 results in a signed overflow (really a 'warning:
overflow in conversion from int to __u16' which is propegated up through
the rest of the expression leading to the ultimate overflow warning
above), which is a valid warning despite being the intended result of
this code.
Clang does not warn on this case, likely because it performs constant
folding later in the compilation process relative to GCC. It seems that
by the time clang does constant folding for this expression, the side of
the ternary with this overflow has already been discarded.
Fortunately, this warning is easily silenced by simply making the 0xf000
mask explicitly unsigned. This has no impact on the result.
Signed-off-by: David Faust <david.faust@oracle.com>
Cc: jose.marchesi@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240508193512.152759-1-david.faust@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This expands coverage for ATTACH_REJECT tests to include connect_unix,
sendmsg_unix, recvmsg*, getsockname*, and getpeername*.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-18-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This expands coverage for getsockname and getpeername hooks to include
getsockname4, getsockname6, getpeername4, and getpeername6.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-17-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch expands test coverage for EPERM tests to include connect and
bind calls and rounds out the coverage for sendmsg by adding tests for
sendmsg_unix.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-16-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch expands verifier coverage for program return values to cover
bind, connect, sendmsg, getsockname, and getpeername hooks. It also
rounds out the recvmsg coverage by adding test cases for recvmsg_unix
hooks.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-15-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fully remove test_sock_addr.c and test_sock_addr.sh, as test coverage
has been fully moved to prog_tests/sock_addr.c.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-14-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove these test cases completely, as the same behavior is already
covered by other sendmsg* test cases in prog_tests/sock_addr.c. This
just rewrites the destination address similar to sendmsg_v4_prog and
sendmsg_v6_prog.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-13-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Migrate test case from bpf/test_sock_addr.c ensuring that program
attachment fails when using an inappropriate attach type.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-12-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Migrates tests from progs/test_sock_addr.c ensuring that programs fail
to load when the expected attach type does not match.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-11-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg
respects when sendmsg6 hooks rewrite the destination IP with the IPv6
wildcard IP, [::].
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-10-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Migrate test case from bpf/test_sock_addr.c ensuring that sendmsg
returns -ENOTSUPP when sending to an IPv4-mapped IPv6 address to
prog_tests/sock_addr.c.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-9-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This set of tests checks that sendmsg calls are rejected (return -EPERM)
when the sendmsg* hook returns 0. Replace those in bpf/test_sock_addr.c
with corresponding tests in prog_tests/sock_addr.c.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-8-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Move wildcard IP sendmsg test case out of bpf/test_sock_addr.c into
prog_tests/sock_addr.c.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-7-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In preparation to move test cases from bpf/test_sock_addr.c that expect
system calls to return ENOTSUPP or EPERM, this patch propagates errno
from relevant system calls up to test_sock_addr() where the result can
be checked.
Signed-off-by: Jordan Rife <jrife@google.com>
Link: https://lore.kernel.org/r/20240510190246.3247730-6-jrife@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|