kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-12-14	wireguard: selftests: load nf_conntrack if not present	Hangbin Liu	1	-0/+1
	[ Upstream commit 0290abc9860917f1ee8b58309c2bbd740a39ee8e ] Some distros may not load nf_conntrack by default, which will cause subsequent nf_conntrack sets to fail. Load this module if it is not already loaded. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> [ Jason: add [[ -e ... ]] check so this works in the qemu harness. ] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20241117212030.629159-4-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests: net: really check for bg process completion	Paolo Abeni	1	-1/+1
	[ Upstream commit 52ed077aa6336dbef83a2d6d21c52d1706fb7f16 ] A recent refactor transformed the check for process completion in a true statement, due to a typo. As a result, the relevant test-case is unable to catch the regression it was supposed to detect. Restore the correct condition. Fixes: 691bb4e49c98 ("selftests: net: avoid just another constant wait") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/0e6f213811f8e93a235307e683af8225cc6277ae.1730828007.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Add push/pop checking for msg_verify_data in test_sockmap	Zijian Zhang	1	-5/+101
	[ Upstream commit 862087c3d36219ed44569666eb263efc97f00c9a ] Add push/pop checking for msg_verify_data in test_sockmap, except for pop/push with cork tests, in these tests the logic will be different. 1. With corking, pop/push might not be invoked in each sendmsg, it makes the layout of the received data difficult 2. It makes it hard to calculate the total_bytes in the recvmsg Temporarily skip the data integrity test for these cases now, added a TODO Fixes: ee9b352ce465 ("selftests/bpf: Fix msg_verify_data in test_sockmap") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20241106222520.527076-5-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap	Zijian Zhang	1	-5/+7
	[ Upstream commit 523dffccbadea0cfd65f1ff04944b864c558c4a8 ] total_bytes in msg_loop_rx should also take push into account, otherwise total_bytes will be a smaller value, which makes the msg_loop_rx end early. Besides, total_bytes has already taken pop into account, so we don't need to subtract some bytes from iov_buf in sendmsg_test. The additional subtraction may make total_bytes a negative number, and msg_loop_rx will just end without checking anything. Fixes: 18d4e900a450 ("bpf: Selftests, improve test_sockmap total bytes counter") Fixes: d69672147faa ("selftests, bpf: Add one test for sockmap with strparser") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20241106222520.527076-4-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests, bpf: Add one test for sockmap with strparser	Liu Jian	1	-3/+30
	[ Upstream commit d69672147faa2a7671c0779fa5b9ad99e4fca4e3 ] Add the test to check sockmap with strparser is working well. Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20211029141216.211899-3-liujian56@huawei.com Stable-dep-of: 523dffccbade ("selftests/bpf: Fix total_bytes in msg_loop_rx in test_sockmap") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Fix SENDPAGE data logic in test_sockmap	Zijian Zhang	1	-7/+11
	[ Upstream commit 4095031463d4e99b534d2cd82035a417295764ae ] In the SENDPAGE test, "opt->iov_length * cnt" size of data will be sent cnt times by sendfile. 1. In push/pop tests, they will be invoked cnt times, for the simplicity of msg_verify_data, change chunk_sz to iov_length 2. Change iov_length in test_send_large from 1024 to 8192. We have pop test where txmsg_start_pop is 4096. 4096 > 1024, an error will be returned. Fixes: 328aa08a081b ("bpf: Selftests, break down test_sockmap into subtests") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20241106222520.527076-3-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Add txmsg_pass to pull/push/pop in test_sockmap	Zijian Zhang	1	-0/+7
	[ Upstream commit 66c54c20408d994be34be2c070fba08472f69eee ] Add txmsg_pass to test_txmsg_pull/push/pop. If txmsg_pass is missing, tx_prog will be NULL, and no program will be attached to the sockmap. As a result, pull/push/pop are never invoked. Fixes: 328aa08a081b ("bpf: Selftests, break down test_sockmap into subtests") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20241106222520.527076-2-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Fix txmsg_redir of test_txmsg_pull in test_sockmap	Zijian Zhang	1	-1/+1
	[ Upstream commit b29e231d66303c12b7b8ac3ac2a057df06b161e8 ] txmsg_redir in "Test pull + redirect" case of test_txmsg_pull should be 1 instead of 0. Fixes: 328aa08a081b ("bpf: Selftests, break down test_sockmap into subtests") Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Link: https://lore.kernel.org/r/20241012203731.1248619-3-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/bpf: Fix msg_verify_data in test_sockmap	Zijian Zhang	1	-10/+20
	[ Upstream commit ee9b352ce4650ffc0d8ca0ac373d7c009c7e561e ] Function msg_verify_data should have context of bytes_cnt and k instead of assuming they are zero. Otherwise, test_sockmap with data integrity test will report some errors. I also fix the logic related to size and index j 1/ 6 sockmap::txmsg test passthrough:FAIL 2/ 6 sockmap::txmsg test redirect:FAIL 7/12 sockmap::txmsg test apply:FAIL 10/11 sockmap::txmsg test push_data:FAIL 11/17 sockmap::txmsg test pull-data:FAIL 12/ 9 sockmap::txmsg test pop-data:FAIL 13/ 1 sockmap::txmsg test push/pop data:FAIL ... Pass: 24 Fail: 52 After applying this patch, some of the errors are solved, but for push, pull and pop, we may need more fixes to msg_verify_data, added a TODO 10/11 sockmap::txmsg test push_data:FAIL 11/17 sockmap::txmsg test pull-data:FAIL 12/ 9 sockmap::txmsg test pop-data:FAIL ... Pass: 37 Fail: 15 Besides, added a custom errno EDATAINTEGRITY for msg_verify_data, we shall not ignore the error in txmsg_cork case. Fixes: 753fb2ee0934 ("bpf: sockmap, add msg_peek tests to test_sockmap") Fixes: 16edddfe3c5d ("selftests/bpf: test_sockmap, check test failure") Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Link: https://lore.kernel.org/r/20241012203731.1248619-2-zijianzhang@bytedance.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/resctrl: Protect against array overrun during iMC config parsing	Reinette Chatre	1	-2/+1
	[ Upstream commit 48ed4e799e8fbebae838dca404a8527763d41191 ] The MBM and MBA tests need to discover the event and umask with which to configure the performance event used to measure read memory bandwidth. This is done by parsing the /sys/bus/event_source/devices/uncore_imc_<imc instance>/events/cas_count_read file for each iMC instance that contains the formatted output: "event=<event>,umask=<umask>" Parsing of cas_count_read contents is done by initializing an array of MAX_TOKENS elements with tokens (deliminated by "=,") from this file. Remove the unnecessary append of a delimiter to the string needing to be parsed. Per the strtok() man page: "delimiter bytes at the start or end of the string are ignored". This has no impact on the token placement within the array. After initialization, the actual event and umask is determined by parsing the tokens directly following the "event" and "umask" tokens respectively. Iterating through the array up to index "i < MAX_TOKENS" but then accessing index "i + 1" risks array overrun during the final iteration. Avoid array overrun by ensuring that the index used within for loop will always be valid. Fixes: 1d3f08687d76 ("selftests/resctrl: Read memory bandwidth from perf IMC counter and from resctrl file system") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	kselftest/arm64: mte: fix printf type warnings about longs	Andre Przywara	1	-2/+2
	[ Upstream commit 96dddb7b9406259baace9a1831e8da155311be6f ] When checking MTE tags, we print some diagnostic messages when the tests fail. Some variables uses there are "longs", however we only use "%x" for the format specifier. Update the format specifiers to "%lx", to match the variable types they are supposed to print. Fixes: f3b2a26ca78d ("kselftest/arm64: Verify mte tag inclusion via prctl") Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240816153251.2833702-9-andre.przywara@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14	selftests/watchdog-test: Fix system accidentally reset after watchdog-test	Li Zhijian	1	-0/+6
	[ Upstream commit dc1308bee1ed03b4d698d77c8bd670d399dcd04d ] When running watchdog-test with 'make run_tests', the watchdog-test will be terminated by a timeout signal(SIGTERM) due to the test timemout. And then, a system reboot would happen due to watchdog not stop. see the dmesg as below: ``` [ 1367.185172] watchdog: watchdog0: watchdog did not stop! ``` Fix it by registering more signals(including SIGTERM) in watchdog-test, where its signal handler will stop the watchdog. After that # timeout 1 ./watchdog-test Watchdog Ticking Away! . Stopping watchdog ticks... Link: https://lore.kernel.org/all/20241029031324.482800-1-lizhijian@fujitsu.com/ Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-08	selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test	Donet Tom	1	-1/+1
	[ Upstream commit 76503e1fa1a53ef041a120825d5ce81c7fe7bdd7 ] The hmm2 double_map test was failing due to an incorrect buffer->mirror size. The buffer->mirror size was 6, while buffer->ptr size was 6 * PAGE_SIZE. The test failed because the kernel's copy_to_user function was attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror. Since the size of buffer->mirror was incorrect, copy_to_user failed. This patch corrects the buffer->mirror size to 6 * PAGE_SIZE. Test Result without this patch ============================== # RUN hmm2.hmm2_device_private.double_map ... # hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0) # double_map: Test terminated by assertion # FAIL hmm2.hmm2_device_private.double_map not ok 53 hmm2.hmm2_device_private.double_map Test Result with this patch =========================== # RUN hmm2.hmm2_device_private.double_map ... # OK hmm2.hmm2_device_private.double_map ok 53 hmm2.hmm2_device_private.double_map Link: https://lkml.kernel.org/r/20240927050752.51066-1-donettom@linux.ibm.com Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM") Signed-off-by: Donet Tom <donettom@linux.ibm.com> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Mark Brown <broonie@kernel.org> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	net: Add l3mdev index to flow struct and avoid oif reset for port devices	David Ahern	1	-1/+1
	[ Upstream commit 40867d74c374b235e14d839f3a77f26684feefe5 ] The fundamental premise of VRF and l3mdev core code is binding a socket to a device (l3mdev or netdev with an L3 domain) to indicate L3 scope. Legacy code resets flowi_oif to the l3mdev losing any original port device binding. Ben (among others) has demonstrated use cases where the original port device binding is important and needs to be retained. This patch handles that by adding a new entry to the common flow struct that can indicate the l3mdev index for later rule and table matching avoiding the need to reset flowi_oif. In addition to allowing more use cases that require port device binds, this patch brings a few datapath simplications: 1. l3mdev_fib_rule_match is only called when walking fib rules and always after l3mdev_update_flow. That allows an optimization to bail early for non-VRF type uses cases when flowi_l3mdev is not set. Also, only that index needs to be checked for the FIB table id. 2. l3mdev_update_flow can be called with flowi_oif set to a l3mdev (e.g., VRF) device. By resetting flowi_oif only for this case the FLOWI_FLAG_SKIP_NH_OIF flag is not longer needed and can be removed, removing several checks in the datapath. The flowi_iif path can be simplified to only be called if the it is not loopback (loopback can not be assigned to an L3 domain) and the l3mdev index is not already set. 3. Avoid another device lookup in the output path when the fib lookup returns a reject failure. Note: 2 functional tests for local traffic with reject fib rules are updated to reflect the new direct failure at FIB lookup time for ping rather than the failure on packet path. The current code fails like this: HINT: Fails since address on vrf device is out of device scope COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1 ping: Warning: source address might be selected on device other than: eth1 PING 172.16.3.1 (172.16.3.1) from 172.16.3.1 eth1: 56(84) bytes of data. --- 172.16.3.1 ping statistics --- 1 packets transmitted, 0 received, 100% packet loss, time 0ms where the test now directly fails: HINT: Fails since address on vrf device is out of device scope COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1 ping: connect: No route to host Signed-off-by: David Ahern <dsahern@kernel.org> Tested-by: Ben Greear <greearb@candelatech.com> Link: https://lore.kernel.org/r/20220314204551.16369-1-dsahern@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 05ef7055debc ("netfilter: fib: check correct rtable in vrf setups") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	ktest.pl: Avoid false positives with grub2 skip regex	Daniel Jordan	1	-1/+1
	[ Upstream commit 2351e8c65404aabc433300b6bf90c7a37e8bbc4d ] Some distros have grub2 config files with the lines if [ x"${feature_menuentry_id}" = xy ]; then menuentry_id_option="--id" else menuentry_id_option="" fi which match the skip regex defined for grub2 in get_grub_index(): $skip = '^\s*menuentry'; These false positives cause the grub number to be higher than it should be, and the wrong kernel can end up booting. Grub documents the menuentry command with whitespace between it and the title, so make the skip regex reflect this. Link: https://lore.kernel.org/20240904175530.84175-1-daniel.m.jordan@oracle.com Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Acked-by: John 'Warthog9' Hawley (Tenstorrent) <warthog9@eaglescrag.net> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/mm: fix charge_reserved_hugetlb.sh test	David Hildenbrand	2	-10/+13
	[ Upstream commit c41a701d18efe6b8aa402efab16edbaba50c9548 ] Currently, running the charge_reserved_hugetlb.sh selftest we can sometimes observe something like: $ ./charge_reserved_hugetlb.sh -cgroup-v2 ... write_result is 0 After write: hugetlb_usage=0 reserved_usage=10485760 killing write_to_hugetlbfs Received 2. Deleting the memory Detach failure: Invalid argument umount: /mnt/huge: target is busy. Both cases are issues in the test. While the unmount error seems to be racy, it will make the test fail: $ ./run_vmtests.sh -t hugetlb ... # [FAIL] not ok 10 charge_reserved_hugetlb.sh -cgroup-v2 # exit=32 The issue is that we are not waiting for the write_to_hugetlbfs process to quit. So it might still have a hugetlbfs file open, about which umount is not happy. Fix that by making "killall" wait for the process to quit. The other error ("Detach failure: Invalid argument") does not seem to result in a test error, but is misleading. Turns out write_to_hugetlbfs.c unconditionally tries to cleanup using shmdt(), even when we only mmap()'ed a hugetlb file. Even worse, shmaddr is never even set for the SHM case. Fix that as well. With this change it seems to work as expected. Link: https://lkml.kernel.org/r/20240821123115.2068812-1-david@redhat.com Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests") Signed-off-by: David Hildenbrand <david@redhat.com> Reported-by: Mario Casquero <mcasquer@redhat.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Tested-by: Mario Casquero <mcasquer@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests: vDSO: fix vDSO symbols lookup for powerpc64	Christophe Leroy	1	-1/+2
	[ Upstream commit ba83b3239e657469709d15dcea5f9b65bf9dbf34 ] On powerpc64, following tests fail locating vDSO functions: ~ # ./vdso_test_abi TAP version 13 1..16 # [vDSO kselftest] VDSO_VERSION: LINUX_2.6.15 # Couldn't find __kernel_gettimeofday ok 1 # SKIP __kernel_gettimeofday # clock_id: CLOCK_REALTIME # Couldn't find __kernel_clock_gettime ok 2 # SKIP __kernel_clock_gettime CLOCK_REALTIME # Couldn't find __kernel_clock_getres ok 3 # SKIP __kernel_clock_getres CLOCK_REALTIME ... # Couldn't find __kernel_time ok 16 # SKIP __kernel_time # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:16 error:0 ~ # ./vdso_test_getrandom __kernel_getrandom is missing! ~ # ./vdso_test_gettimeofday Could not find __kernel_gettimeofday ~ # ./vdso_test_getcpu Could not find __kernel_getcpu On powerpc64, as shown below by readelf, vDSO functions symbols have type NOTYPE, so also accept that type when looking for symbols. $ powerpc64-linux-gnu-readelf -a arch/powerpc/kernel/vdso/vdso64.so.dbg ELF Header: Magic: 7f 45 4c 46 02 02 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, big endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: DYN (Shared object file) Machine: PowerPC64 Version: 0x1 ... Symbol table '.dynsym' contains 12 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000524 84 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 2: 00000000000005f0 36 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 3: 0000000000000578 68 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 4: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.6.15 5: 00000000000006c0 48 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 6: 0000000000000614 172 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 7: 00000000000006f0 84 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 8: 000000000000047c 84 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 9: 0000000000000454 12 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 10: 00000000000004d0 84 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 11: 00000000000005bc 52 NOTYPE GLOBAL DEFAULT 8 __[...]@@LINUX_2.6.15 Symbol table '.symtab' contains 56 entries: Num: Value Size Type Bind Vis Ndx Name ... 45: 0000000000000000 0 OBJECT GLOBAL DEFAULT ABS LINUX_2.6.15 46: 00000000000006c0 48 NOTYPE GLOBAL DEFAULT 8 __kernel_getcpu 47: 0000000000000524 84 NOTYPE GLOBAL DEFAULT 8 __kernel_clock_getres 48: 00000000000005f0 36 NOTYPE GLOBAL DEFAULT 8 __kernel_get_tbfreq 49: 000000000000047c 84 NOTYPE GLOBAL DEFAULT 8 __kernel_gettimeofday 50: 0000000000000614 172 NOTYPE GLOBAL DEFAULT 8 __kernel_sync_dicache 51: 00000000000006f0 84 NOTYPE GLOBAL DEFAULT 8 __kernel_getrandom 52: 0000000000000454 12 NOTYPE GLOBAL DEFAULT 8 __kernel_sigtram[...] 53: 0000000000000578 68 NOTYPE GLOBAL DEFAULT 8 __kernel_time 54: 00000000000004d0 84 NOTYPE GLOBAL DEFAULT 8 __kernel_clock_g[...] 55: 00000000000005bc 52 NOTYPE GLOBAL DEFAULT 8 __kernel_get_sys[...] Fixes: 98eedc3a9dbf ("Document the vDSO and add a reference parser") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests: breakpoints: use remaining time to check if suspend succeed	Yifei Liu	1	-1/+4
	[ Upstream commit c66be905cda24fb782b91053b196bd2e966f95b7 ] step_after_suspend_test fails with device busy error while writing to /sys/power/state to start suspend. The test believes it failed to enter suspend state with $ sudo ./step_after_suspend_test TAP version 13 Bail out! Failed to enter Suspend state However, in the kernel message, I indeed see the system get suspended and then wake up later. [611172.033108] PM: suspend entry (s2idle) [611172.044940] Filesystems sync: 0.006 seconds [611172.052254] Freezing user space processes [611172.059319] Freezing user space processes completed (elapsed 0.001 seconds) [611172.067920] OOM killer disabled. [611172.072465] Freezing remaining freezable tasks [611172.080332] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [611172.089724] printk: Suspending console(s) (use no_console_suspend to debug) [611172.117126] serial 00:03: disabled some other hardware get reconnected [611203.136277] OOM killer enabled. [611203.140637] Restarting tasks ... [611203.141135] usb 1-8.1: USB disconnect, device number 7 [611203.141755] done. [611203.155268] random: crng reseeded on system resumption [611203.162059] PM: suspend exit After investigation, I noticed that for the code block if (write(power_state_fd, "mem", strlen("mem")) != strlen("mem")) ksft_exit_fail_msg("Failed to enter Suspend state\n"); The write will return -1 and errno is set to 16 (device busy). It should be caused by the write function is not successfully returned before the system suspend and the return value get messed when waking up. As a result, It may be better to check the time passed of those few instructions to determine whether the suspend is executed correctly for it is pretty hard to execute those few lines for 5 seconds. The timer to wake up the system is set to expire after 5 seconds and no re-arm. If the timer remaining time is 0 second and 0 nano secomd, it means the timer expired and wake the system up. Otherwise, the system could be considered to enter the suspend state failed if there is any remaining time. After appling this patch, the test would not fail for it believes the system does not go to suspend by mistake. It now could continue to the rest part of the test after suspend. Fixes: bfd092b8c272 ("selftests: breakpoint: add step_after_suspend_test") Reported-by: Sinadin Shan <sinadin.shan@oracle.com> Signed-off-by: Yifei Liu <yifei.l.liu@oracle.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix C++ compile error from missing _Bool type	Tony Ambardar	1	-0/+4
	[ Upstream commit aa95073fd290b5b3e45f067fa22bb25e59e1ff7c ] While building, bpftool makes a skeleton from test_core_extern.c, which itself includes <stdbool.h> and uses the 'bool' type. However, the skeleton test_core_extern.skel.h generated does not include <stdbool.h> or use the 'bool' type, instead using the C-only '_Bool' type. Compiling test_cpp.cpp with g++ 12.3 for mips64el/musl-libc then fails with error: In file included from test_cpp.cpp:9: test_core_extern.skel.h:45:17: error: '_Bool' does not name a type 45 \| _Bool CONFIG_BOOL; \| ^~~~~ This was likely missed previously because glibc uses a GNU extension for <stdbool.h> with C++ (#define _Bool bool), not supported by musl libc. Normally, a C fragment would include <stdbool.h> and use the 'bool' type, and thus cleanly work after import by C++. The ideal fix would be for 'bpftool gen skeleton' to output the correct type/include supporting C++, but in the meantime add a conditional define as above. Fixes: 7c8dce4b1661 ("bpftool: Make skeleton C code compilable with C++ compiler") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/6fc1dd28b8bda49e51e4f610bdc9d22f4455632d.1722244708.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix error compiling test_lru_map.c	Tony Ambardar	1	-1/+2
	[ Upstream commit cacf2a5a78cd1f5f616eae043ebc6f024104b721 ] Although the post-increment in macro 'CPU_SET(next++, &cpuset)' seems safe, the sequencing can raise compile errors, so move the increment outside the macro. This avoids an error seen using gcc 12.3.0 for mips64el/musl-libc: In file included from test_lru_map.c:11: test_lru_map.c: In function 'sched_next_online': test_lru_map.c:129:29: error: operation on 'next' may be undefined [-Werror=sequence-point] 129 \| CPU_SET(next++, &cpuset); \| ^ cc1: all warnings being treated as errors Fixes: 3fbfadce6012 ("bpf: Fix test_lru_sanity5() in test_lru_map.c") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/22993dfb11ccf27925a626b32672fd3324cb76c4.1722244708.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix errors compiling cg_storage_multi.h with musl libc	Tony Ambardar	1	-2/+0
	[ Upstream commit 730561d3c08d4a327cceaabf11365958a1c00cec ] Remove a redundant include of '<asm/types.h>', whose needed definitions are already included (via '<linux/types.h>') in cg_storage_multi_egress_only.c, cg_storage_multi_isolated.c, and cg_storage_multi_shared.c. This avoids redefinition errors seen compiling for mips64el/musl-libc like: In file included from progs/cg_storage_multi_egress_only.c:13: In file included from progs/cg_storage_multi.h:6: In file included from /usr/mips64el-linux-gnuabi64/include/asm/types.h:23: /usr/include/asm-generic/int-l64.h:29:25: error: typedef redefinition with different types ('long' vs 'long long') 29 \| typedef __signed__ long __s64; \| ^ /usr/include/asm-generic/int-ll64.h:30:44: note: previous definition is here 30 \| __extension__ typedef __signed__ long long __s64; \| ^ Fixes: 9e5bd1f7633b ("selftests/bpf: Test CGROUP_STORAGE map can't be used by multiple progs") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/4f4702e9f6115b7f84fea01b2326ca24c6df7ba8.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix compiling tcp_rtt.c with musl-libc	Tony Ambardar	1	-0/+1
	[ Upstream commit 18826fb0b79c3c3cd1fe765d85f9c6f1a902c722 ] The GNU version of 'struct tcp_info' in 'netinet/tcp.h' is not exposed by musl headers unless _GNU_SOURCE is defined. Add this definition to fix errors seen compiling for mips64el/musl-libc: tcp_rtt.c: In function 'wait_for_ack': tcp_rtt.c:24:25: error: storage size of 'info' isn't known 24 \| struct tcp_info info; \| ^~~~ tcp_rtt.c:24:25: error: unused variable 'info' [-Werror=unused-variable] cc1: all warnings being treated as errors Fixes: 1f4f80fed217 ("selftests/bpf: test_progs: convert test_tcp_rtt") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/f2329767b15df206f08a5776d35a47c37da855ae.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix compiling flow_dissector.c with musl-libc	Tony Ambardar	1	-0/+1
	[ Upstream commit 5e4c43bcb85973243d7274e0058b6e8f5810e4f7 ] The GNU version of 'struct tcphdr' has members 'doff', 'source' and 'dest', which are not exposed by musl libc headers unless _GNU_SOURCE is defined. Add this definition to fix errors seen compiling for mips64el/musl-libc: flow_dissector.c:118:30: error: 'struct tcphdr' has no member named 'doff' 118 \| .tcp.doff = 5, \| ^~~~ flow_dissector.c:119:30: error: 'struct tcphdr' has no member named 'source' 119 \| .tcp.source = 80, \| ^~~~~~ flow_dissector.c:120:30: error: 'struct tcphdr' has no member named 'dest' 120 \| .tcp.dest = 8080, \| ^~~~ Fixes: ae173a915785 ("selftests/bpf: support BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/8f7ab21a73f678f9cebd32b26c444a686e57414d.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix compiling kfree_skb.c with musl-libc	Tony Ambardar	1	-0/+1
	[ Upstream commit bae9a5ce7d3a9b3a9e07b31ab9e9c58450e3e9fd ] The GNU version of 'struct tcphdr' with member 'doff' is not exposed by musl headers unless _GNU_SOURCE is defined. Add this definition to fix errors seen compiling for mips64el/musl-libc: In file included from kfree_skb.c:2: kfree_skb.c: In function 'on_sample': kfree_skb.c:45:30: error: 'struct tcphdr' has no member named 'doff' 45 \| if (CHECK(pkt_v6->tcp.doff != 5, "check_tcp", \| ^ Fixes: 580d656d80cf ("selftests/bpf: Add kfree_skb raw_tp test") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/e2d8cedc790959c10d6822a51f01a7a3616bea1b.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix missing ARRAY_SIZE() definition in bench.c	Tony Ambardar	1	-0/+1
	[ Upstream commit d44c93fc2f5a0c47b23fa03d374e45259abd92d2 ] Add a "bpf_util.h" include to avoid the following error seen compiling for mips64el with musl libc: bench.c: In function 'find_benchmark': bench.c:590:25: error: implicit declaration of function 'ARRAY_SIZE' [-Werror=implicit-function-declaration] 590 \| for (i = 0; i < ARRAY_SIZE(benchs); i++) { \| ^~~~~~~~~~ cc1: all warnings being treated as errors Fixes: 8e7c2a023ac0 ("selftests/bpf: Add benchmark runner infrastructure") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/bc4dde77dfcd17a825d8f28f72f3292341966810.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17	selftests/bpf: Fix compile error from rlim_t in sk_storage_map.c	Tony Ambardar	1	-1/+1
	[ Upstream commit d393f9479d4aaab0fa4c3caf513f28685e831f13 ] Cast 'rlim_t' argument to match expected type of printf() format and avoid compile errors seen building for mips64el/musl-libc: In file included from map_tests/sk_storage_map.c:20: map_tests/sk_storage_map.c: In function 'test_sk_storage_map_stress_free': map_tests/sk_storage_map.c:414:56: error: format '%lu' expects argument of type 'long unsigned int', but argument 2 has type 'rlim_t' {aka 'long long unsigned int'} [-Werror=format=] 414 \| CHECK(err, "setrlimit(RLIMIT_NOFILE)", "rlim_new:%lu errno:%d", \| ^~~~~~~~~~~~~~~~~~~~~~~ 415 \| rlim_new.rlim_cur, errno); \| ~~~~~~~~~~~~~~~~~ \| \| \| rlim_t {aka long long unsigned int} ./test_maps.h:12:24: note: in definition of macro 'CHECK' 12 \| printf(format); \ \| ^~~~~~ map_tests/sk_storage_map.c:414:68: note: format string is defined here 414 \| CHECK(err, "setrlimit(RLIMIT_NOFILE)", "rlim_new:%lu errno:%d", \| ~~^ \| \| \| long unsigned int \| %llu cc1: all warnings being treated as errors Fixes: 51a0e301a563 ("bpf: Add BPF_MAP_TYPE_SK_STORAGE test to test_maps") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/1e00a1fa7acf91b4ca135c4102dc796d518bad86.1721713597.git.tony.ambardar@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12	kselftests: dmabuf-heaps: Ensure the driver name is null-terminated	Zenghui Yu	1	-1/+3
	[ Upstream commit 291e4baf70019f17a81b7b47aeb186b27d222159 ] Even if a vgem device is configured in, we will skip the import_vgem_fd() test almost every time. TAP version 13 1..11 # Testing heap: system # ======================================= # Testing allocation and importing: ok 1 # SKIP Could not open vgem -1 The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver version information but leave the name field a non-null-terminated string. Terminate it properly to actually test against the vgem device. While at it, let's check the length of the driver name is exactly 4 bytes and return early otherwise (in case there is a name like "vgemfoo" that gets converted to "vgem\0" unexpectedly). Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20240729024604.2046-1-yuzenghui@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04	tc-testing: don't access non-existent variable on exception	Simon Horman	1	-1/+0
	[ Upstream commit a0c9fe5eecc97680323ee83780ea3eaf440ba1b7 ] Since commit 255c1c7279ab ("tc-testing: Allow test cases to be skipped") the variable test_ordinal doesn't exist in call_pre_case(). So it should not be accessed when an exception occurs. This resolves the following splat: ... During handling of the above exception, another exception occurred: Traceback (most recent call last): File ".../tdc.py", line 1028, in <module> main() File ".../tdc.py", line 1022, in main set_operation_mode(pm, parser, args, remaining) File ".../tdc.py", line 966, in set_operation_mode catresults = test_runner_serial(pm, args, alltests) File ".../tdc.py", line 642, in test_runner_serial (index, tsr) = test_runner(pm, args, alltests) File ".../tdc.py", line 536, in test_runner res = run_one_test(pm, args, index, tidx) File ".../tdc.py", line 419, in run_one_test pm.call_pre_case(tidx) File ".../tdc.py", line 146, in call_pre_case print('test_ordinal is {}'.format(test_ordinal)) NameError: name 'test_ordinal' is not defined Fixes: 255c1c7279ab ("tc-testing: Allow test cases to be skipped") Signed-off-by: Simon Horman <horms@kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20240815-tdc-test-ordinal-v1-1-0255c122a427@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04	fix bitmap corruption on close_range() with CLOSE_RANGE_UNSHARE	Al Viro	1	-0/+35
	commit 9a2fa1472083580b6c66bdaf291f591e1170123a upstream. copy_fd_bitmaps(new, old, count) is expected to copy the first count/BITS_PER_LONG bits from old->full_fds_bits[] and fill the rest with zeroes. What it does is copying enough words (BITS_TO_LONGS(count/BITS_PER_LONG)), then memsets the rest. That works fine, if all bits past the cutoff point are clear. Otherwise we are risking garbage from the last word we'd copied. For most of the callers that is true - expand_fdtable() has count equal to old->max_fds, so there's no open descriptors past count, let alone fully occupied words in ->open_fds[], which is what bits in ->full_fds_bits[] correspond to. The other caller (dup_fd()) passes sane_fdtable_size(old_fdt, max_fds), which is the smallest multiple of BITS_PER_LONG that covers all opened descriptors below max_fds. In the common case (copying on fork()) max_fds is ~0U, so all opened descriptors will be below it and we are fine, by the same reasons why the call in expand_fdtable() is safe. Unfortunately, there is a case where max_fds is less than that and where we might, indeed, end up with junk in ->full_fds_bits[] - close_range(from, to, CLOSE_RANGE_UNSHARE) with * descriptor table being currently shared * 'to' being above the current capacity of descriptor table * 'from' being just under some chunk of opened descriptors. In that case we end up with observably wrong behaviour - e.g. spawn a child with CLONE_FILES, get all descriptors in range 0..127 open, then close_range(64, ~0U, CLOSE_RANGE_UNSHARE) and watch dup(0) ending up with descriptor #128, despite #64 being observably not open. The minimally invasive fix would be to deal with that in dup_fd(). If this proves to add measurable overhead, we can go that way, but let's try to fix copy_fd_bitmaps() first. * new helper: bitmap_copy_and_expand(to, from, bits_to_copy, size). * make copy_fd_bitmaps() take the bitmap size in words, rather than bits; it's 'count' argument is always a multiple of BITS_PER_LONG, so we are not losing any information, and that way we can use the same helper for all three bitmaps - compiler will see that count is a multiple of BITS_PER_LONG for the large ones, so it'll generate plain memcpy()+memset(). Reproducer added to tools/testing/selftests/core/close_range_test.c Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19	selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT	Yonghong Song	1	-1/+2
	[ Upstream commit 7015843afcaf68c132784c89528dfddc0005e483 ] Alexei reported that send_signal test may fail with nested CONFIG_PARAVIRT configs. In this particular case, the base VM is AMD with 166 cpus, and I run selftests with regular qemu on top of that and indeed send_signal test failed. I also tried with an Intel box with 80 cpus and there is no issue. The main qemu command line includes: -enable-kvm -smp 16 -cpu host The failure log looks like: $ ./test_progs -t send_signal [ 48.501588] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [test_progs:2225] [ 48.503622] Modules linked in: bpf_testmod(O) [ 48.503622] CPU: 9 PID: 2225 Comm: test_progs Tainted: G O 6.9.0-08561-g2c1713a8f1c9-dirty #69 [ 48.507629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014 [ 48.511635] RIP: 0010:handle_softirqs+0x71/0x290 [ 48.511635] Code: [...] 10 0a 00 00 00 31 c0 65 66 89 05 d5 f4 fa 7e fb bb ff ff ff ff <49> c7 c2 cb [ 48.518527] RSP: 0018:ffffc90000310fa0 EFLAGS: 00000246 [ 48.519579] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 00000000000006e0 [ 48.522526] RDX: 0000000000000006 RSI: ffff88810791ae80 RDI: 0000000000000000 [ 48.523587] RBP: ffffc90000fabc88 R08: 00000005a0af4f7f R09: 0000000000000000 [ 48.525525] R10: 0000000561d2f29c R11: 0000000000006534 R12: 0000000000000280 [ 48.528525] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 48.528525] FS: 00007f2f2885cd00(0000) GS:ffff888237c40000(0000) knlGS:0000000000000000 [ 48.531600] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.535520] CR2: 00007f2f287059f0 CR3: 0000000106a28002 CR4: 00000000003706f0 [ 48.537538] Call Trace: [ 48.537538] <IRQ> [ 48.537538] ? watchdog_timer_fn+0x1cd/0x250 [ 48.539590] ? lockup_detector_update_enable+0x50/0x50 [ 48.539590] ? __hrtimer_run_queues+0xff/0x280 [ 48.542520] ? hrtimer_interrupt+0x103/0x230 [ 48.544524] ? __sysvec_apic_timer_interrupt+0x4f/0x140 [ 48.545522] ? sysvec_apic_timer_interrupt+0x3a/0x90 [ 48.547612] ? asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.547612] ? handle_softirqs+0x71/0x290 [ 48.547612] irq_exit_rcu+0x63/0x80 [ 48.551585] sysvec_apic_timer_interrupt+0x75/0x90 [ 48.552521] </IRQ> [ 48.553529] <TASK> [ 48.553529] asm_sysvec_apic_timer_interrupt+0x1a/0x20 [ 48.555609] RIP: 0010:finish_task_switch.isra.0+0x90/0x260 [ 48.556526] Code: [...] 9f 58 0a 00 00 48 85 db 0f 85 89 01 00 00 4c 89 ff e8 53 d9 bd 00 fb 66 90 <4d> 85 ed 74 [ 48.562524] RSP: 0018:ffffc90000fabd38 EFLAGS: 00000282 [ 48.563589] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83385620 [ 48.563589] RDX: ffff888237c73ae4 RSI: 0000000000000000 RDI: ffff888237c6fd00 [ 48.568521] RBP: ffffc90000fabd68 R08: 0000000000000000 R09: 0000000000000000 [ 48.569528] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8881009d0000 [ 48.573525] R13: ffff8881024e5400 R14: ffff88810791ae80 R15: ffff888237c6fd00 [ 48.575614] ? finish_task_switch.isra.0+0x8d/0x260 [ 48.576523] __schedule+0x364/0xac0 [ 48.577535] schedule+0x2e/0x110 [ 48.578555] pipe_read+0x301/0x400 [ 48.579589] ? destroy_sched_domains_rcu+0x30/0x30 [ 48.579589] vfs_read+0x2b3/0x2f0 [ 48.579589] ksys_read+0x8b/0xc0 [ 48.583590] do_syscall_64+0x3d/0xc0 [ 48.583590] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 48.586525] RIP: 0033:0x7f2f28703fa1 [ 48.587592] Code: [...] 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 80 3d c5 23 14 00 00 74 13 31 c0 0f 05 <48> 3d 00 f0 [ 48.593534] RSP: 002b:00007ffd90f8cf88 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 48.595589] RAX: ffffffffffffffda RBX: 00007ffd90f8d5e8 RCX: 00007f2f28703fa1 [ 48.595589] RDX: 0000000000000001 RSI: 00007ffd90f8cfb0 RDI: 0000000000000006 [ 48.599592] RBP: 00007ffd90f8d2f0 R08: 0000000000000064 R09: 0000000000000000 [ 48.602527] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 48.603589] R13: 00007ffd90f8d608 R14: 00007f2f288d8000 R15: 0000000000f6bdb0 [ 48.605527] </TASK> In the test, two processes are communicating through pipe. Further debugging with strace found that the above splat is triggered as read() syscall could not receive the data even if the corresponding write() syscall in another process successfully wrote data into the pipe. The failed subtest is "send_signal_perf". The corresponding perf event has sample_period 1 and config PERF_COUNT_SW_CPU_CLOCK. sample_period 1 means every overflow event will trigger a call to the BPF program. So I suspect this may overwhelm the system. So I increased the sample_period to 100,000 and the test passed. The sample_period 10,000 still has the test failed. In other parts of selftest, e.g., [1], sample_freq is used instead. So I decided to use sample_freq = 1,000 since the test can pass as well. [1] https://lore.kernel.org/bpf/20240604070700.3032142-1-song@kernel.org/ Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240605201203.2603846-1-yonghong.song@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	libbpf: Fix no-args func prototype BTF dumping syntax	Andrii Nakryiko	2	-4/+4
	[ Upstream commit 189f1a976e426011e6a5588f1d3ceedf71fe2965 ] For all these years libbpf's BTF dumper has been emitting not strictly valid syntax for function prototypes that have no input arguments. Instead of `int (blah)()` we should emit `int (blah)(void)`. This is not normally a problem, but it manifests when we get kfuncs in vmlinux.h that have no input arguments. Due to compiler internal specifics, we get no BTF information for such kfuncs, if they are not declared with proper `(void)`. The fix is trivial. We also need to adjust a few ancient tests that happily assumed `()` is correct. Fixes: 351131b51c7a ("libbpf: add btf_dump API for BTF-to-C conversion") Reported-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	selftests/sigaltstack: Fix ppc64 GCC build	Michael Ellerman	1	-1/+1
	commit 17c743b9da9e0d073ff19fd5313f521744514939 upstream. Building the sigaltstack test with GCC on 64-bit powerpc errors with: gcc -Wall sas.c -o /home/michael/linux/.build/kselftest/sigaltstack/sas In file included from sas.c:23: current_stack_pointer.h:22:2: error: #error "implement current_stack_pointer equivalent" 22 \| #error "implement current_stack_pointer equivalent" \| ^~~~~ sas.c: In function ‘my_usr1’: sas.c:50:13: error: ‘sp’ undeclared (first use in this function); did you mean ‘p’? 50 \| if (sp < (unsigned long)sstack \|\| \| ^~ This happens because GCC doesn't define __ppc__ for 64-bit builds, only 32-bit builds. Instead use __powerpc__ to detect powerpc builds, which is defined by clang and GCC for 64-bit and 32-bit builds. Fixes: 05107edc9101 ("selftests: sigaltstack: fix -Wuninitialized") Cc: stable@vger.kernel.org # v6.3+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240520062647.688667-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19	selftests: forwarding: devlink_lib: Wait for udev events after reloading	Amit Cohen	1	-0/+2
	[ Upstream commit f67a90a0c8f5b3d0acc18f10650d90fec44775f9 ] Lately, an additional locking was added by commit c0a40097f0bc ("drivers: core: synchronize really_probe() and dev_uevent()"). The locking protects dev_uevent() calling. This function is used to send messages from the kernel to user space. Uevent messages notify user space about changes in device states, such as when a device is added, removed, or changed. These messages are used by udev (or other similar user-space tools) to apply device-specific rules. After reloading devlink instance, udev events should be processed. This locking causes a short delay of udev events handling. One example for useful udev rule is renaming ports. 'forwading.config' can be configured to use names after udev rules are applied. Some tests run devlink_reload() and immediately use the updated names. This worked before the above mentioned commit was pushed, but now the delay of uevent messages causes that devlink_reload() returns before udev events are handled and tests fail. Adjust devlink_reload() to not assume that udev events are already processed when devlink reload is done, instead, wait for udev events to ensure they are processed before returning from the function. Without this patch: TESTS='rif_mac_profile' ./resource_scale.sh TEST: 'rif_mac_profile' 4 [ OK ] sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory Cannot find device "swp1" Cannot find device "swp2" TEST: setup_wait_dev (: Interface swp1 does not come up.) [FAIL] With this patch: $ TESTS='rif_mac_profile' ./resource_scale.sh TEST: 'rif_mac_profile' 4 [ OK ] TEST: 'rif_mac_profile' overflow 5 [ OK ] This is relevant not only for this test. Fixes: bc7cbb1e9f4c ("selftests: forwarding: Add devlink_lib.sh") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/89367666e04b38a8993027f1526801ca327ab96a.1720709333.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	selftests/bpf: Close fd in error path in drop_on_reuseport	Geliang Tang	1	-1/+1
	[ Upstream commit adae187ebedcd95d02f045bc37dfecfd5b29434b ] In the error path when update_lookup_map() fails in drop_on_reuseport in prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed. This patch fixes this by using "goto close_srv1" lable instead of "detach" to close "server1" in this case. Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point") Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	mlxsw: spectrum_acl: Fix ACL scale regression and firmware errors	Ido Schimmel	1	-4/+51
	[ Upstream commit 75d8d7a63065b18df9555dbaab0b42d4c6f20943 ] ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and newer ASICs can share the same mask if their masks only differ in up to 8 consecutive bits. For example, consider the following filters: # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop # tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop The second filter can use the same mask as the first (dst_ip/24) with a delta of 1 bit. However, the above only works because the two filters have different values in the common unmasked part (dst_ip/24). When entries have the same value in the common unmasked part they create undesired collisions in the device since many entries now have the same key. This leads to firmware errors such as [1] and to a reduced scale. Fix by adjusting the hash table key to only include the value in the common unmasked part. That is, without including the delta bits. That way the driver will detect the collision during filter insertion and spill the filter into the circuit TCAM (C-TCAM). Add a test case that fails without the fix and adjust existing cases that check C-TCAM spillage according to the above limitation. [1] mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available)) Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP") Reported-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Amit Cohen <amcohen@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	selftests/bpf: Check length of recv in test_sockmap	Geliang Tang	1	-1/+2
	[ Upstream commit de1b5ea789dc28066cc8dc634b6825bd6148f38b ] The value of recv in msg_loop may be negative, like EWOULDBLOCK, so it's necessary to check if it is positive before accumulating it to bytes_recvd. Fixes: 16962b2404ac ("bpf: sockmap, add selftests") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/5172563f7c7b2a2e953cef02e89fc34664a7b190.1716446893.git.tanggeliang@kylinos.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19	selftests/bpf: Fix prog numbers in test_sockmap	Geliang Tang	1	-5/+1
	[ Upstream commit 6c8d7598dfed759bf1d9d0322b4c2b42eb7252d8 ] bpf_prog5 and bpf_prog7 are removed from progs/test_sockmap_kern.h in commit d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests"), now there are only 9 progs in it, not 11: SEC("sk_skb1") int bpf_prog1(struct __sk_buff skb) SEC("sk_skb2") int bpf_prog2(struct __sk_buff skb) SEC("sk_skb3") int bpf_prog3(struct __sk_buff skb) SEC("sockops") int bpf_sockmap(struct bpf_sock_ops skops) SEC("sk_msg1") int bpf_prog4(struct sk_msg_md msg) SEC("sk_msg2") int bpf_prog6(struct sk_msg_md msg) SEC("sk_msg3") int bpf_prog8(struct sk_msg_md msg) SEC("sk_msg4") int bpf_prog9(struct sk_msg_md msg) SEC("sk_msg5") int bpf_prog10(struct sk_msg_md *msg) This patch updates the array sizes of prog_fd[], prog_attach_type[] and prog_type[] from 11 to 9 accordingly. Fixes: d79a32129b21 ("bpf: Selftests, remove prints from sockmap tests") Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/9c10d9f974f07fcb354a43a8eca67acb2fafc587.1715926605.git.tanggeliang@kylinos.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-27	selftests/vDSO: fix clang build errors and warnings	John Hubbard	2	-7/+27
	[ Upstream commit 73810cd45b99c6c418e1c6a487b52c1e74edb20d ] When building with clang, via: make LLVM=1 -C tools/testing/selftests ...there are several warnings, and an error. This fixes all of those and allows these tests to run and pass. 1. Fix linker error (undefined reference to memcpy) by providing a local version of memcpy. 2. clang complains about using this form: if (g = h & 0xf0000000) ...so factor out the assignment into a separate step. 3. The code is passing a signed const char* to elf_hash(), which expects a const unsigned char *. There are several callers, so fix this at the source by allowing the function to accept a signed argument, and then converting to unsigned operations, once inside the function. 4. clang doesn't have __attribute__((externally_visible)) and generates a warning to that effect. Fortunately, gcc 12 and gcc 13 do not seem to require that attribute in order to build, run and pass tests here, so remove it. Reviewed-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Edward Liaw <edliaw@google.com> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Tested-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-27	selftests/openat2: Fix build warnings on ppc64	Michael Ellerman	1	-0/+1
	[ Upstream commit 84b6df4c49a1cc2854a16937acd5fd3e6315d083 ] Fix warnings like: openat2_test.c: In function ‘test_openat2_flags’: openat2_test.c:303:73: warning: format ‘%llX’ expects argument of type ‘long long unsigned int’, but argument 5 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] By switching to unsigned long long for u64 for ppc64 builds. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18	bpf: Allow reads from uninit stack	Eduard Zingerman	8	-128/+332
	commit 6715df8d5d24655b9fd368e904028112b54c7de1 upstream. This commits updates the following functions to allow reads from uninitialized stack locations when env->allow_uninit_stack option is enabled: - check_stack_read_fixed_off() - check_stack_range_initialized(), called from: - check_stack_read_var_off() - check_helper_mem_access() Such change allows to relax logic in stacksafe() to treat STACK_MISC and STACK_INVALID in a same way and make the following stack slot configurations equivalent: \| Cached state \| Current state \| \| stack slot \| stack slot \| \|------------------+------------------\| \| STACK_INVALID or \| STACK_INVALID or \| \| STACK_MISC \| STACK_SPILL or \| \| \| STACK_MISC or \| \| \| STACK_ZERO or \| \| \| STACK_DYNPTR \| This leads to significant verification speed gains (see below). The idea was suggested by Andrii Nakryiko [1] and initial patch was created by Alexei Starovoitov [2]. Currently the env->allow_uninit_stack is allowed for programs loaded by users with CAP_PERFMON or CAP_SYS_ADMIN capabilities. A number of test cases from verifier/.c were expecting uninitialized stack access to be an error. These test cases were updated to execute in unprivileged mode (thus preserving the tests). The test progs/test_global_func10.c expected "invalid indirect read from stack" error message because of the access to uninitialized memory region. This error is no longer possible in privileged mode. The test is updated to provoke an error "invalid indirect access to stack" because of access to invalid stack address (such error is not verified by progs/test_global_func.c series of tests). The following tests had to be removed because these can't be made unprivileged: - verifier/sock.c: - "sk_storage_get(map, skb->sk, &stack_value, 1): partially init stack_value" BPF_PROG_TYPE_SCHED_CLS programs are not executed in unprivileged mode. - verifier/var_off.c: - "indirect variable-offset stack access, max_off+size > max_initialized" - "indirect variable-offset stack access, uninitialized" These tests verify that access to uninitialized stack values is detected when stack offset is not a constant. However, variable stack access is prohibited in unprivileged mode, thus these tests are no longer valid. * * * Here is veristat log comparing this patch with current master on a set of selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg and cilium BPF binaries (see [3]): $ ./veristat -e file,prog,states -C -f 'states_pct<-30' master.log current.log File Program States (A) States (B) States (DIFF) -------------------------- -------------------------- ---------- ---------- ---------------- bpf_host.o tail_handle_ipv6_from_host 349 244 -105 (-30.09%) bpf_host.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%) bpf_lxc.o tail_handle_nat_fwd_ipv4 1320 895 -425 (-32.20%) bpf_sock.o cil_sock4_connect 70 48 -22 (-31.43%) bpf_sock.o cil_sock4_sendmsg 68 46 -22 (-32.35%) bpf_xdp.o tail_handle_nat_fwd_ipv4 1554 803 -751 (-48.33%) bpf_xdp.o tail_lb_ipv4 6457 2473 -3984 (-61.70%) bpf_xdp.o tail_lb_ipv6 7249 3908 -3341 (-46.09%) pyperf600_bpf_loop.bpf.o on_event 287 145 -142 (-49.48%) strobemeta.bpf.o on_event 15915 4772 -11143 (-70.02%) strobemeta_nounroll2.bpf.o on_event 17087 3820 -13267 (-77.64%) xdp_synproxy_kern.bpf.o syncookie_tc 21271 6635 -14636 (-68.81%) xdp_synproxy_kern.bpf.o syncookie_xdp 23122 6024 -17098 (-73.95%) -------------------------- -------------------------- ---------- ---------- ---------------- Note: I limited selection by states_pct<-30%. Inspection of differences in pyperf600_bpf_loop behavior shows that the following patch for the test removes almost all differences: - a/tools/testing/selftests/bpf/progs/pyperf.h + b/tools/testing/selftests/bpf/progs/pyperf.h @ -266,8 +266,8 @ int __on_event(struct bpf_raw_tracepoint_args ctx) } if (event->pthread_match \|\| !pidData->use_tls) { - void frame_ptr; - FrameData frame; + void* frame_ptr = 0; + FrameData frame = {}; Symbol sym = {}; int cur_cpu = bpf_get_smp_processor_id(); W/o this patch the difference comes from the following pattern (for different variables): static bool get_frame_data(... FrameData frame ...) { ... bpf_probe_read_user(&frame->f_code, ...); if (!frame->f_code) return false; ... bpf_probe_read_user(&frame->co_name, ...); if (frame->co_name) ...; } int __on_event(struct bpf_raw_tracepoint_args ctx) { FrameData frame; ... get_frame_data(... &frame ...) // indirectly via a bpf_loop & callback ... } SEC("raw_tracepoint/kfree_skb") int on_event(struct bpf_raw_tracepoint_args* ctx) { ... ret \|= __on_event(ctx); ret \|= __on_event(ctx); ... } With regards to value `frame->co_name` the following is important: - Because of the conditional `if (!frame->f_code)` each call to __on_event() produces two states, one with `frame->co_name` marked as STACK_MISC, another with it as is (and marked STACK_INVALID on a first call). - The call to bpf_probe_read_user() does not mark stack slots corresponding to `&frame->co_name` as REG_LIVE_WRITTEN but it marks these slots as BPF_MISC, this happens because of the following loop in the check_helper_call(): for (i = 0; i < meta.access_size; i++) { err = check_mem_access(env, insn_idx, meta.regno, i, BPF_B, BPF_WRITE, -1, false); if (err) return err; } Note the size of the write, it is a one byte write for each byte touched by a helper. The BPF_B write does not lead to write marks for the target stack slot. - Which means that w/o this patch when second __on_event() call is verified `if (frame->co_name)` will propagate read marks first to a stack slot with STACK_MISC marks and second to a stack slot with STACK_INVALID marks and these states would be considered different. [1] https://lore.kernel.org/bpf/CAEf4BzY3e+ZuC6HUa8dCiUovQRg2SzEk7M-dSkqNZyn=xEmnPA@mail.gmail.com/ [2] https://lore.kernel.org/bpf/CAADnVQKs2i1iuZ5SUGuJtxWVfGYR9kDgYKhq3rNV+kBLQCu7rA@mail.gmail.com/ [3] git@github.com:anakryiko/cilium.git Suggested-by: Andrii Nakryiko <andrii@kernel.org> Co-developed-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20230219200427.606541-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-18	selftests: make order checking verbose in msg_zerocopy selftest	Zijian Zhang	1	-1/+1
	[ Upstream commit 7d6d8f0c8b700c9493f2839abccb6d29028b4219 ] We find that when lock debugging is on, notifications may not come in order. Thus, we have order checking outputs managed by cfg_verbose, to avoid too many outputs in this case. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240701225349.3395580-3-zijianzhang@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-18	selftests: fix OOM in msg_zerocopy selftest	Zijian Zhang	1	-1/+11
	[ Upstream commit af2b7e5b741aaae9ffbba2c660def434e07aa241 ] In selftests/net/msg_zerocopy.c, it has a while loop keeps calling sendmsg on a socket with MSG_ZEROCOPY flag, and it will recv the notifications until the socket is not writable. Typically, it will start the receiving process after around 30+ sendmsgs. However, as the introduction of commit dfa2f0483360 ("tcp: get rid of sysctl_tcp_adv_win_scale"), the sender is always writable and does not get any chance to run recv notifications. The selftest always exits with OUT_OF_MEMORY because the memory used by opt_skb exceeds the net.core.optmem_max. Meanwhile, it could be set to a different value to trigger OOM on older kernels too. Thus, we introduce "cfg_notification_limit" to force sender to receive notifications after some number of sendmsgs. Fixes: 07b65c5b31ce ("test: add msg_zerocopy test") Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com> Signed-off-by: Xiaochun Lu <xiaochun.lu@bytedance.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20240701225349.3395580-2-zijianzhang@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	kselftest: arm64: Add a null pointer check	Kunwu Chan	1	-0/+4
	[ Upstream commit 80164282b3620a3cb73de6ffda5592743e448d0e ] There is a 'malloc' call, which can be unsuccessful. This patch will add the malloc failure checking to avoid possible null dereference and give more information about test fail reasons. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240423082102.2018886-1-chentao@kylinos.cn Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	selftests/bpf: Fix flaky test btf_map_in_map/lookup_update	Yonghong Song	1	-25/+1
	[ Upstream commit 14bb1e8c8d4ad5d9d2febb7d19c70a3cf536e1e5 ] Recently, I frequently hit the following test failure: [root@arch-fb-vm1 bpf]# ./test_progs -n 33/1 test_lookup_update:PASS:skel_open 0 nsec [...] test_lookup_update:PASS:sync_rcu 0 nsec test_lookup_update:FAIL:map1_leak inner_map1 leaked! #33/1 btf_map_in_map/lookup_update:FAIL #33 btf_map_in_map:FAIL In the test, after map is closed and then after two rcu grace periods, it is assumed that map_id is not available to user space. But the above assumption cannot be guaranteed. After zero or one or two rcu grace periods in different siturations, the actual freeing-map-work is put into a workqueue. Later on, when the work is dequeued, the map will be actually freed. See bpf_map_put() in kernel/bpf/syscall.c. By using workqueue, there is no ganrantee that map will be actually freed after a couple of rcu grace periods. This patch removed such map leak detection and then the test can pass consistently. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20240322061353.632136-1-yonghong.song@linux.dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	selftests/bpf: Prevent client connect before server bind in test_tc_tunnel.sh	Alessandro Carminati (Red Hat)	1	-1/+12
	[ Upstream commit f803bcf9208a2540acb4c32bdc3616673169f490 ] In some systems, the netcat server can incur in delay to start listening. When this happens, the test can randomly fail in various points. This is an example error message: # ip gre none gso # encap 192.168.1.1 to 192.168.1.2, type gre, mac none len 2000 # test basic connectivity # Ncat: Connection refused. The issue stems from a race condition between the netcat client and server. The test author had addressed this problem by implementing a sleep, which I have removed in this patch. This patch introduces a function capable of sleeping for up to two seconds. However, it can terminate the waiting period early if the port is reported to be listening. Signed-off-by: Alessandro Carminati (Red Hat) <alessandro.carminati@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240314105911.213411-1-alessandro.carminati@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	selftests/mm: compaction_test: fix bogus test success on Aarch64	Dev Jain	1	-5/+11
	[ Upstream commit d4202e66a4b1fe6968f17f9f09bbc30d08f028a1 ] Patch series "Fixes for compaction_test", v2. The compaction_test memory selftest introduces fragmentation in memory and then tries to allocate as many hugepages as possible. This series addresses some problems. On Aarch64, if nr_hugepages == 0, then the test trivially succeeds since compaction_index becomes 0, which is less than 3, due to no division by zero exception being raised. We fix that by checking for division by zero. Secondly, correctly set the number of hugepages to zero before trying to set a large number of them. Now, consider a situation in which, at the start of the test, a non-zero number of hugepages have been already set (while running the entire selftests/mm suite, or manually by the admin). The test operates on 80% of memory to avoid OOM-killer invocation, and because some memory is already blocked by hugepages, it would increase the chance of OOM-killing. Also, since mem_free used in check_compaction() is the value before we set nr_hugepages to zero, the chance that the compaction_index will be small is very high if the preset nr_hugepages was high, leading to a bogus test success. This patch (of 3): Currently, if at runtime we are not able to allocate a huge page, the test will trivially pass on Aarch64 due to no exception being raised on division by zero while computing compaction_index. Fix that by checking for nr_hugepages == 0. Anyways, in general, avoid a division by zero by exiting the program beforehand. While at it, fix a typo, and handle the case where the number of hugepages may overflow an integer. Link: https://lkml.kernel.org/r/20240521074358.675031-1-dev.jain@arm.com Link: https://lkml.kernel.org/r/20240521074358.675031-2-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05	selftests/mm: conform test to TAP format output	Muhammad Usama Anjum	1	-47/+44
	[ Upstream commit 9a21701edc41465de56f97914741bfb7bfc2517d ] Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Link: https://lkml.kernel.org/r/20240101083614.1076768-1-usama.anjum@collabora.com Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Stable-dep-of: d4202e66a4b1 ("selftests/mm: compaction_test: fix bogus test success on Aarch64") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	selftests/mm: compaction_test: fix incorrect write of zero to nr_hugepages	Dev Jain	1	-0/+2
	[ Upstream commit 9ad665ef55eaad1ead1406a58a34f615a7c18b5e ] Currently, the test tries to set nr_hugepages to zero, but that is not actually done because the file offset is not reset after read(). Fix that using lseek(). Link: https://lkml.kernel.org/r/20240521074358.675031-3-dev.jain@arm.com Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory") Signed-off-by: Dev Jain <dev.jain@arm.com> Cc: <stable@vger.kernel.org> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05	tracing/selftests: Fix kprobe event name test for .isra. functions	Steven Rostedt (Google)	1	-1/+2
	commit 23a4b108accc29a6125ed14de4a044689ffeda78 upstream. The kprobe_eventname.tc test checks if a function with .isra. can have a kprobe attached to it. It loops through the kallsyms file for all the functions that have the .isra. name, and checks if it exists in the available_filter_functions file, and if it does, it uses it to attach a kprobe to it. The issue is that kprobes can not attach to functions that are listed more than once in available_filter_functions. With the latest kernel, the function that is found is: rapl_event_update.isra.0 # grep rapl_event_update.isra.0 /sys/kernel/tracing/available_filter_functions rapl_event_update.isra.0 rapl_event_update.isra.0 It is listed twice. This causes the attached kprobe to it to fail which in turn fails the test. Instead of just picking the function function that is found in available_filter_functions, pick the first one that is listed only once in available_filter_functions. Cc: stable@vger.kernel.org Fixes: 604e3548236d ("selftests/ftrace: Select an existing function in kprobe_eventname test") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16	selftests/kcmp: remove unused open mode	Edward Liaw	1	-1/+1
	[ Upstream commit eb59a58113717df04b8a8229befd8ab1e5dbf86e ] Android bionic warns that open modes are ignored if O_CREAT or O_TMPFILE aren't specified. The permissions for the file are set above: fd1 = open(kpath, O_RDWR \| O_CREAT \| O_TRUNC, 0644); Link: https://lkml.kernel.org/r/20240429234610.191144-1-edliaw@google.com Fixes: d97b46a64674 ("syscalls, x86: add __NR_kcmp syscall") Signed-off-by: Edward Liaw <edliaw@google.com> Reviewed-by: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>