Age | Commit message (Collapse) | Author | Files | Lines |
|
Two subtests use the test_in_netns() function to run the test in a
dedicated network namespace. This can now be done directly through the
test_progs framework with a test name starting with 'ns_'.
Replace the use of test_in_netns() by test_ns_* calls.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-4-14504db136b7@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Tests are serialized because they all use the loopback interface.
Replace the 'serial_test_' prefixes with 'test_ns_' to benefit from the
new test_prog feature which creates a dedicated namespace for each test,
allowing them to run in parallel.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-3-14504db136b7@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Some tests are serialized to prevent interference with others.
Open a dedicated network namespace when a test name starts with 'ns_' to
allow more test parallelization. Use the test name as namespace name to
avoid conflict between namespaces.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-2-14504db136b7@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Next patch will add a new feature to test_prog to run tests in a
dedicated namespace if the test name starts with 'ns_'. Here the test
name already starts with 'ns_' and creates some namespaces which would
conflict with the new feature.
Rename the test to avoid this conflict.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Link: https://lore.kernel.org/r/20250219-b4-tc_links-v2-1-14504db136b7@bootlin.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add a simple test to verify that the new "override_creds" option works.
Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-4-46af55e4ceda@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a simple test to verify that the new "override_creds" option works.
Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-3-46af55e4ceda@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a new set of helpers that will be used in follow-up patches.
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Add a simple test to verify that the new "override_creds" option works.
Link: https://lore.kernel.org/r/20250219-work-overlayfs-v3-2-46af55e4ceda@kernel.org
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
A test case with ridiculously deep bpf_for() nesting and
a conditional update of a stack location.
Consider the innermost loop structure:
1: bpf_for(o, 0, 10)
2: if (unlikely(bpf_get_prandom_u32()))
3: buf[0] = 42;
4: <exit>
Assuming that verifier.c:clean_live_states() operates w/o change from
the previous patch (e.g. as on current master) verification would
proceed as follows:
- at (1) state {buf[0]=?,o=drained}:
- checkpoint
- push visit to (2) for later
- at (4) {buf[0]=?,o=drained}
- pop (2) {buf[0]=?,o=active}, push visit to (3) for later
- at (1) {buf[0]=?,o=active}
- checkpoint
- push visit to (2) for later
- at (4) {buf[0]=?,o=drained}
- pop (2) {buf[0]=?,o=active}, push visit to (3) for later
- at (1) {buf[0]=?,o=active}:
- checkpoint reached, checkpoint's branch count becomes 0
- checkpoint is processed by clean_live_states() and
becomes {o=active}
- pop (3) {buf[0]=42,o=active}
- at (1), {buf[0]=42,o=active}
- checkpoint
- push visit to (2) for later
- at (4) {buf[0]=42,o=drained}
- pop (2) {buf[0]=42,o=active}, push visit to (3) for later
- at (1) {buf[0]=42,o=active}, checkpoint reached
- pop (3) {buf[0]=42,o=active}
- at (1) {buf[0]=42,o=active}:
- checkpoint reached, checkpoint's branch count becomes 0
- checkpoint is processed by clean_live_states() and
becomes {o=active}
- ...
Note how clean_live_states() converted the checkpoint
{buf[0]=42,o=active} to {o=active} and it can no longer be matched
against {buf[0]=<any>,o=active}, because iterator based states
are compared using stacksafe(... RANGE_WITHIN), that requires
stack slots to have same types. At the same time there are
still states {buf[0]=42,o=active} pushed to DFS stack.
This behaviour becomes exacerbated with multiple nesting levels,
here are veristat results:
- nesting level 1: 69 insns
- nesting level 2: 258 insns
- nesting level 3: 900 insns
- nesting level 4: 4754 insns
- nesting level 5: 35944 insns
- nesting level 6: 312558 insns
- nesting level 7: 1M limit
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250215110411.3236773-5-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A somewhat cumbersome test case sensitive to correct copying of
bpf_verifier_state->loop_entry fields in
verifier.c:copy_verifier_state().
W/o the fix from a previous commit the program is accepted as safe.
1: /* poison block */
2: if (random() != 24) { // assume false branch is placed first
3: i = iter_new();
4: while (iter_next(i));
5: iter_destroy(i);
6: return;
7: }
8:
9: /* dfs_depth block */
10: for (i = 10; i > 0; i--);
11:
12: /* main block */
13: i = iter_new(); // fp[-16]
14: b = -24; // r8
15: for (;;) {
16: if (iter_next(i))
17: break;
18: if (random() == 77) { // assume false branch is placed first
19: *(u64 *)(r10 + b) = 7; // this is not safe when b == -25
20: iter_destroy(i);
21: return;
22: }
23: if (random() == 42) { // assume false branch is placed first
24: b = -25;
25: }
26: }
27: iter_destroy(i);
The goal of this example is to:
(a) poison env->cur_state->loop_entry with a state S,
such that S->branches == 0;
(b) set state S as a loop_entry for all checkpoints in
/* main block */, thus forcing NOT_EXACT states comparisons;
(c) exploit incorrect loop_entry set for checkpoint at line 18
by first creating a checkpoint with b == -24 and then
pruning the state with b == -25 using that checkpoint.
The /* poison block */ is responsible for goal (a).
It forces verifier to first validate some unrelated iterator based
loop, which leads to an update_loop_entry() call in is_state_visited(),
which places checkpoint created at line 4 as env->cur_state->loop_entry.
Starting from line 8, the branch count for that checkpoint is 0.
The /* dfs_depth block */ is responsible for goal (b).
It abuses the fact that update_loop_entry(cur, hdr) only updates
cur->loop_entry when hdr->dfs_depth <= cur->dfs_depth.
After line 12 every state has dfs_depth bigger then dfs_depth of
poisoned env->cur_state->loop_entry. Thus the above condition is never
true for lines 12-27.
The /* main block */ is responsible for goal (c).
Verification proceeds as follows:
- checkpoint {b=-24,i=active} created at line 16;
- jump 18->23 is verified first, jump to 19 pushed to stack;
- jump 23->26 is verified first, jump to 24 pushed to stack;
- checkpoint {b=-24,i=active} created at line 15;
- current state is pruned by checkpoint created at line 16,
this sets branches count for checkpoint at line 15 to 0;
- jump to 24 is popped from stack;
- line 16 is reached in state {b=-25,i=active};
- this is pruned by a previous checkpoint {b=-24,i=active}:
- checkpoint's loop_entry is poisoned and has branch count of 0,
hence states are compared using NOT_EXACT rules;
- b is not marked precise yet.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20250215110411.3236773-3-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix few spelling mistakes in net selftests
Signed-off-by: Chandra Mohan Sundar <chandru.dav@gmail.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250217141520.81033-1-chandru.dav@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Iterating through array of maps may encounter non existing keys. The
batch operation should not fail on when this happens.
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/r/9007237b9606dc2ee44465a4447fe46e13f3bea6.1739171594.git.yan@cloudflare.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The tests done by test_xdp_redirect_multi.sh are now fully covered by
the CI through test_xdp_veth.c.
Remove test_xdp_redirect_multi.sh
Remove xdp_redirect_multi.c that was used by the script to load and
attach the BPF programs.
Remove their entries in the Makefile
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-6-fd0d39fca6e6@bootlin.com
|
|
XDP programs loaded on egress is tested by test_xdp_redirect_multi.sh
but not by the test_progs framework.
Add a test case in test_xdp_veth.c to test the XDP program on egress.
Use the same BPF program than test_xdp_redirect_multi.sh that replaces
the source MAC address by one provided through a BPF map.
Use a BPF program that stores the source MAC of received packets in a
map to check the test results.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-5-fd0d39fca6e6@bootlin.com
|
|
XDP redirections with BPF_F_BROADCAST and BPF_F_EXCLUDE_INGRESS flags
are tested by test_xdp_redirect_multi.sh but not within the test_progs
framework.
Add a broadcast test case in test_xdp_veth.c to test them.
Use the same BPF programs than the one used by
test_xdp_redirect_multi.sh.
Use a BPF map to select the broadcast flags.
Use a BPF map with an entry per veth to check whether packets are
received or not
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-4-fd0d39fca6e6@bootlin.com
|
|
Broadcasting flags are hardcoded for each kind for protocol.
Create a redirect_flags map that allows to select the broadcasting flags
to use in the bpf_redirect_map(). The protocol ID is used as a key.
Set the old hardcoded values as default if the map isn't filled by the
BPF caller.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-3-fd0d39fca6e6@bootlin.com
|
|
Tests use the root network namespace, so they aren't fully independent
of each other. For instance, the index of the created veth interfaces
is incremented every time a new test is launched.
Wrap the network topology in a network namespace to ensure full
isolation. Use the append_tid() helper to ensure the uniqueness of this
namespace's name during parallel runs.
Remove the use of the append_tid() on the veth names as they now belong
to an already unique namespace.
Simplify cleanup_network() by directly deleting the namespaces
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-2-fd0d39fca6e6@bootlin.com
|
|
The network configuration is defined by a table of struct
veth_configuration. This isn't convenient if we want to add a network
configuration that isn't linked to a veth pair.
Create a struct net_configuration that holds the veth_configuration
table to ease adding new configuration attributes in upcoming patch.
Signed-off-by: Bastien Curutchet (eBPF Foundation) <bastien.curutchet@bootlin.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250212-redirect-multi-v5-1-fd0d39fca6e6@bootlin.com
|
|
Following a similar rationale as commit e4835f1da425f ("kunit: tool:
Build compile_commands.json"), make a common developer tool available by
default for KUnit users.
Compared to compile_commands.json, there is a little more work to be
done to build the GDB scripts. Is it enough to affect development cycle
duration? Unscientific evaluation:
rm -rf .kunit; time tools/testing/kunit/kunit.py build --kunitconfig ./lib/kunit/.kunitconfig --jobs 96
Without this patch it took 14.77s, with this patch it took 14.83. So,
although `make scripts_gdb` is pretty slow, presumably most of that is
just the overhead of running Kbuild at all, actually building the
scripts is approximately free.
Note also, to actually get the GDB scripts the user needs to enable
CONFIG_SCRIPTS_GDB, but building the scripts_gdb target without that is
still harmless.
Link: https://lore.kernel.org/r/20250121-kunit-gdb-v1-1-faedfd0653ef@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Q is exported from Makefile.include so it is not necessary to manually
set it.
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <qmo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Benjamin Tissoires <bentiss@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Mykola Lysenko <mykolal@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-2-07de4482a581@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Changes to MC remote need to be reflected in actual group memberships.
Add a test to verify that it is the case.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Instead of inlining equivalents, use lib.sh-provided primitives.
Use defer to manage vx lifetime.
This will make it easier to extend the test in the next patch.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This helper could be useful to more than just forwarding tests.
Move it upstairs and port over to log_test_skip().
Split the function into two parts: the bit that actually checks and
reports skip, which is in a new function check_command(). And a bit
that exits the test script if the check fails. This allows users
consistent checking behavior while giving an option to bail out from
a single test without bailing out of the whole script.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Verify that for a connectible AF_VSOCK socket, merely having a transport
assigned is insufficient; socket must be connected for the sockmap to
accept.
This does not test datagram vsocks. Even though it hardly matters. VMCI is
the only transport that features VSOCK_TRANSPORT_F_DGRAM, but it has an
unimplemented vsock_transport::readskb() callback, making it unsupported by
BPF/sockmap.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Commit 515745445e92 ("selftest/bpf: Add test for vsock removal from sockmap
on close()") added test that checked if proto::close() callback was invoked
on AF_VSOCK socket release. I.e. it verified that a close()d vsock does
indeed get removed from the sockmap.
It was done simply by creating a socket pair and attempting to replace a
close()d one with its peer. Since, due to a recent change, sockmap does not
allow updating index with a non-established connectible vsock, redo it with
a freshly established one.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When testing if we should try to compact memory or drop caches before we
run the THP or HugeTLB tests we use | as an or operator. This doesn't
work since run_vmtests.sh is written in shell where this is used to pipe
the output of the first argument into the second. Instead use the shell's
-o operator.
Link: https://lkml.kernel.org/r/20250212-kselftest-mm-no-hugepages-v1-1-44702f538522@kernel.org
Fixes: b433ffa8dbac ("selftests: mm: perform some system cleanup before using hugepages")
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Nico Pache <npache@redhat.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Test struct_ops programs returning referenced kptr. When the return type
of a struct_ops operator is pointer to struct, the verifier should
only allow programs that return a scalar NULL or a non-local kptr with the
correct type in its unmodified form.
Signed-off-by: Amery Hung <amery.hung@bytedance.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250217190640.1748177-6-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test referenced kptr acquired through struct_ops argument tagged with
"__ref". The success case checks whether 1) a reference to the correct
type is acquired, and 2) the referenced kptr argument can be accessed in
multiple paths as long as it hasn't been released. In the fail cases,
we first confirm that a referenced kptr acquried through a struct_ops
argument is not allowed to be leaked. Then, we make sure this new
referenced kptr acquiring mechanism does not accidentally allow referenced
kptrs to flow into global subprograms through their arguments.
Signed-off-by: Amery Hung <amery.hung@bytedance.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20250217190640.1748177-4-ameryhung@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Test that queues which are used for AF_XDP have the xsk nest attribute.
The attribute is currently empty, but its existence means the AF_XDP is
being used for the queue. Enable CONFIG_XDP_SOCKETS for
selftests/drivers/net tests, as well.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250214211255.14194-4-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce tests to verify the correct functionality of the SO_RCVMARK and
SO_RCVPRIORITY socket options.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250214205828.48503-1-annaemesenyiri@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch fixes a grammatical error in a test log message
in reuseaddr_ports_exhausted.c for better clarity.
Signed-off-by: Pranav Tyagi <pranav.tyagi03@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250213152612.4434-1-pranav.tyagi03@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions
are used in mm selftests for memory protection keys for ppc target.
Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com>
Suggested-by: Kevin Brodsky <kevin.brodsky@arm.com>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Link: https://lore.kernel.org/r/20250113170619.484698-4-yury.khrustalev@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Replace literal 0 with macro PKEY_UNRESTRICTED where pkey_*() functions
are used in mm selftests for memory protection keys.
Signed-off-by: Yury Khrustalev <yury.khrustalev@arm.com>
Suggested-by: Joey Gouly <joey.gouly@arm.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20250113170619.484698-3-yury.khrustalev@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add a selftest for io_uring zero copy Rx. This test cannot run locally
and requires a remote host to be configured in net.config. The remote
host must have hardware support for zero copy Rx as listed in the
documentation page. The test will restore the NIC config back to before
the test and is idempotent.
liburing is required to compile the test and be installed on the remote
host running the test.
Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250215000947.789731-12-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc
Merge networking zerocopy receive tree, to get the prep patches for
the io_uring rx zc support.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (63 commits)
net: add helpers for setting a memory provider on an rx queue
net: page_pool: add memory provider helpers
net: prepare for non devmem TCP memory providers
net: page_pool: add a mp hook to unregister_netdevice*
net: page_pool: add callback for mp info printing
netdev: add io_uring memory provider info
net: page_pool: create hooks for custom memory providers
net: generalise net_iov chunk owners
net: prefix devmem specific helpers
net: page_pool: don't cast mp param to devmem
tools: ynl: add all headers to makefile deps
eth: fbnic: set IFF_UNICAST_FLT to avoid enabling promiscuous mode when adding unicast addrs
eth: fbnic: add MAC address TCAM to debugfs
tools: ynl-gen: support limits using definitions
tools: ynl-gen: don't output external constants
net/mlx5e: Avoid WARN_ON when configuring MQPRIO with HTB offload enabled
net/mlx5e: Remove unused mlx5e_tc_flow_action struct
net/mlx5: Remove stray semicolon in LAG port selection table creation
net/mlx5e: Support FEC settings for 200G per lane link modes
net/mlx5: Add support for 200Gbps per lane link modes
...
|
|
We need the faux_device changes in here for future work.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Large set of fixes for vector handling, especially in the
interactions between host and guest state.
This fixes a number of bugs affecting actual deployments, and
greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland
for dealing with this thankless task.
- Fix an ugly race between vcpu and vgic creation/init, resulting in
unexpected behaviours
- Fix use of kernel VAs at EL2 when emulating timers with nVHE
- Small set of pKVM improvements and cleanups
x86:
- Fix broken SNP support with KVM module built-in, ensuring the PSP
module is initialized before KVM even when the module
infrastructure cannot be used to order initcalls
- Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being
emulated by KVM to fix a NULL pointer dereference
- Enter guest mode (L2) from KVM's perspective before initializing
the vCPU's nested NPT MMU so that the MMU is properly tagged for
L2, not L1
- Load the guest's DR6 outside of the innermost .vcpu_run() loop, as
the guest's value may be stale if a VM-Exit is handled in the
fastpath"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
x86/sev: Fix broken SNP support with KVM module built-in
KVM: SVM: Ensure PSP module is initialized if KVM module is built-in
crypto: ccp: Add external API interface for PSP module initialization
KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic()
KVM: arm64: timer: Drop warning on failed interrupt signalling
KVM: arm64: Fix alignment of kvm_hyp_memcache allocations
KVM: arm64: Convert timer offset VA when accessed in HYP code
KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp()
KVM: arm64: Eagerly switch ZCR_EL{1,2}
KVM: arm64: Mark some header functions as inline
KVM: arm64: Refactor exit handlers
KVM: arm64: Refactor CPTR trap deactivation
KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN
KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN
KVM: arm64: Remove host FPSIMD saving for non-protected KVM
KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state
KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
KVM: nSVM: Enter guest mode before initializing nested NPT MMU
KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC
KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper
...
|
|
The driver for the 8250 console is not used, as no port is found.
Instead the prom0 bootconsole is used the whole time.
The prom driver translates '\n' to '\r\n' before handing of the message
off to the firmware. The firmware performs the same translation again.
In the final output produced by QEMU each line ends with '\r\r\n'.
This breaks the kunit parser, which can only handle '\r\n' and '\n'.
Use the Zilog console instead. It works correctly, is the one documented
by the QEMU manual and also saves a bit of codesize:
Before=4051011, After=4023326, chg -0.68%
Observed on QEMU 9.2.0.
Link: https://lore.kernel.org/r/20250214-kunit-qemu-sparc-console-v1-1-ba1dfdf8f0b1@linutronix.de
Fixes: 87c9c1631788 ("kunit: tool: add support for QEMU")
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
As noted in [0], SeaBIOS (QEMU default) makes a mess of the terminal,
qboot does not.
It turns out this is actually useful with kunit.py, since the user is
exposed to this issue if they set --raw_output=all.
qboot is also faster than SeaBIOS, but it's is marginal for this
usecase.
[0] https://lore.kernel.org/all/CA+i-1C0wYb-gZ8Mwh3WSVpbk-LF-Uo+njVbASJPe1WXDURoV7A@mail.gmail.com/
Both SeaBIOS and qboot are x86-specific.
Link: https://lore.kernel.org/r/20250124-kunit-qboot-v1-1-815e4d4c6f7c@google.com
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent}
pointer. This is a preparation to access kernfs_node::parent under RCU
and ensure that the pointer remains stable under the RCU lifetime
guarantees.
For a complete path, as it is done in kernfs_path_from_node(), the
kernfs_rename_lock is still required in order to obtain a stable parent
relationship while computing the relevant node depth. This must not
change while the nodes are inspected in order to build the path.
If the kernfs user never moves the nodes (changes the parent) then the
kernfs_rename_lock is not required and the RCU guarantees are
sufficient. This "restriction" can be set with
KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required.
Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU
access and use RCU accessor while accessing the node.
Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can
not change.
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Add a simple repro for the issue of miscalculating LDX/STX/ST CO-RE
relocation size adjustment when the CO-RE relocation target type is an
ARRAY.
We need to make sure that compiler generates LDX/STX/ST instruction with
CO-RE relocation against entire ARRAY type, not ARRAY's element. With
the code pattern in selftest, we get this:
59: 61 71 00 00 00 00 00 00 w1 = *(u32 *)(r7 + 0x0)
00000000000001d8: CO-RE <byte_off> [5] struct core_reloc_arrays::a (0:0)
Where offset of `int a[5]` is embedded (through CO-RE relocation) into memory
load instruction itself.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250207014809.1573841-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added test cases to ensure that programs with stack sizes exceeding 512
bytes are restricted in non-JITed mode, and can be executed normally in
JITed mode, even with stack sizes exceeding 512 bytes due to the presence
of may_goto instructions.
Test result:
echo "0" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:SKIP
stack size 512 with may_goto without jit:OK
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED
echo "1" > /proc/sys/net/core/bpf_jit_enable
./test_progs -t verifier_stack_ptr
...
stack size 512 with may_goto with jit:OK
stack size 512 with may_goto without jit:SKIP
...
Summary: 1/27 PASSED, 25 SKIPPED, 0 FAILED
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-4-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In some cases, the verification logic under the interpreter and JIT
differs, such as may_goto, and the test program behaves differently under
different runtime modes, requiring separate verification logic for each
result.
Introduce __load_if_JITed and __load_if_no_JITed annotation for tests.
Signed-off-by: Jiayuan Chen <mrpre@163.com>
Link: https://lore.kernel.org/r/20250214091823.46042-3-mrpre@163.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When working on OpenRISC support for restartable sequences I noticed
and fixed these two issues with the riscv support bits.
1 The 'inc' argument to RSEQ_ASM_OP_R_DEREF_ADDV was being implicitly
passed to the macro. Fix this by adding 'inc' to the list of macro
arguments.
2 The inline asm input constraints for 'inc' and 'off' use "er", The
riscv gcc port does not have an "e" constraint, this looks to be
copied from the x86 port. Fix this by just using an "r" constraint.
I have compile tested this only for riscv. However, the same fixes I
use in the OpenRISC rseq selftests and everything passes with no issues.
Fixes: 171586a6ab66 ("selftests/rseq: riscv: Template memory ordering and percpu access mode")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20250114170721.3613280-1-shorne@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:
- Fix lock imbalance in a corner case of dispatch_to_local_dsq()
- Migration disabled tasks were confusing some BPF schedulers and its
handling had a bug. Fix it and simplify the default behavior by
dispatching them automatically
- ops.tick(), ops.disable() and ops.exit_task() were incorrectly
disallowing kfuncs that require the task argument to be the rq
operation is currently operating on and thus is rq-locked.
Allow them.
- Fix autogroup migration handling bug which was occasionally
triggering a warning in the cgroup migration path
- tools/sched_ext, selftest and other misc updates
* tag 'sched_ext-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
sched_ext: Use SCX_CALL_OP_TASK in task_tick_scx
sched_ext: Fix the incorrect bpf_list kfunc API in common.bpf.h.
sched_ext: selftests: Fix grammar in tests description
sched_ext: Fix incorrect assumption about migration disabled tasks in task_can_run_on_remote_rq()
sched_ext: Fix migration disabled handling in targeted dispatches
sched_ext: Implement auto local dispatching of migration disabled tasks
sched_ext: Fix incorrect time delta calculation in time_delta()
sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
sched_ext: selftests/dsp_local_on: Fix selftest on UP systems
tools/sched_ext: Add helper to check task migration state
sched_ext: Fix incorrect autogroup migration detection
sched_ext: selftests/dsp_local_on: Fix sporadic failures
selftests/sched_ext: Fix enum resolution
sched_ext: Include task weight in the error state dump
sched_ext: Fixes typos in comments
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
- Fix a race window where a newly forked task could escape cgroup.kill
- Remove incorrectly included steal time from cpu.stat::usage_usec
- Minor update in selftest
* tag 'cgroup-for-6.14-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Remove steal time from usage_usec
selftests/cgroup: use bash in test_cpuset_v1_hp.sh
cgroup: fix race between fork and cgroup.kill
|
|
Now that the binary stats cache infrastructure is largely scope agnostic,
add support for vCPU-scoped stats. Like VM stats, open and cache the
stats FD when the vCPU is created so that it's guaranteed to be valid when
vcpu_get_stats() is invoked.
Account for the extra per-vCPU file descriptor in kvm_set_files_rlimit(),
so that tests that create large VMs don't run afoul of resource limits.
To sanity check that the infrastructure actually works, and to get a bit
of bonus coverage, add an assert in x86's xapic_ipi_test to verify that
the number of HLTs executed by the test matches the number of HLT exits
observed by KVM.
Tested-by: Manali Shukla <Manali.Shukla@amd.com>
Link: https://lore.kernel.org/r/20250111005049.1247555-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the max vCPUs test's RLIMIT_NOFILE adjustments to common code, and
use the new helper to adjust the resource limit for non-barebones VMs by
default. x86's recalc_apic_map_test creates 512 vCPUs, and a future
change will open the binary stats fd for all vCPUs, which will put the
recalc APIC test above some distros' default limit of 1024.
Link: https://lore.kernel.org/r/20250111005049.1247555-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Get and cache a VM's binary stats FD when the VM is opened, as opposed to
waiting until the stats are first used. Opening the stats FD outside of
__vm_get_stat() will allow converting it to a scope-agnostic helper.
Note, this doesn't interfere with kvm_binary_stats_test's testcase that
verifies a stats FD can be used after its own VM's FD is closed, as the
cached FD is also closed during kvm_vm_free().
Link: https://lore.kernel.org/r/20250111005049.1247555-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add a struct and helpers to manage the binary stats cache, which is
currently used only for VM-scoped stats. This will allow expanding the
selftests infrastructure to provide support for vCPU-scoped binary stats,
which, except for the ioctl to get the stats FD are identical to VM-scoped
stats.
Defer converting __vm_get_stat() to a scope-agnostic helper to a future
patch, as getting the stats FD from KVM needs to be moved elsewhere
before it can be made completely scope-agnostic.
Link: https://lore.kernel.org/r/20250111005049.1247555-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|