Age | Commit message (Collapse) | Author | Files | Lines |
|
This change actually defines the (initial) metadata layout
that should be used by AF_XDP userspace (xsk_tx_metadata).
The first field is flags which requests appropriate offloads,
followed by the offload-specific fields. The supported per-device
offloads are exported via netlink (new xsk-flags).
The offloads themselves are still implemented in a bit of a
framework-y fashion that's left from my initial kfunc attempt.
I'm introducing new xsk_tx_metadata_ops which drivers are
supposed to implement. The drivers are also supposed
to call xsk_tx_metadata_request/xsk_tx_metadata_complete in
the right places. Since xsk_tx_metadata_{request,_complete}
are static inline, we don't incur any extra overhead doing
indirect calls.
The benefit of this scheme is as follows:
- keeps all metadata layout parsing away from driver code
- makes it easy to grep and see which drivers implement what
- don't need any extra flags to maintain to keep track of what
offloads are implemented; if the callback is implemented - the offload
is supported (used by netlink reporting code)
Two offloads are defined right now:
1. XDP_TXMD_FLAGS_CHECKSUM: skb-style csum_start+csum_offset
2. XDP_TXMD_FLAGS_TIMESTAMP: writes TX timestamp back into metadata
area upon completion (tx_timestamp field)
XDP_TXMD_FLAGS_TIMESTAMP is also implemented for XDP_COPY mode: it writes
SW timestamp from the skb destructor (note I'm reusing hwtstamps to pass
metadata pointer).
The struct is forward-compatible and can be extended in the future
by appending more fields.
Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231127190319.1190813-3-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
For zerocopy mode, tx_desc->addr can point to an arbitrary offset
and carry some TX metadata in the headroom. For copy mode, there
is no way currently to populate skb metadata.
Introduce new tx_metadata_len umem config option that indicates how many
bytes to treat as metadata. Metadata bytes come prior to tx_desc address
(same as in RX case).
The size of the metadata has mostly the same constraints as XDP:
- less than 256 bytes
- 8-byte aligned (compared to 4-byte alignment on xdp, due to 8-byte
timestamp in the completion)
- non-zero
This data is not interpreted in any way right now.
Reviewed-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20231127190319.1190813-2-sdf@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove x86's mmio_warning_test, as it is unnecessarily complex (there's no
reason to fork, spawn threads, initialize srand(), etc..), unnecessarily
restrictive (triggering triple fault is not unique to Intel CPUs without
unrestricted guest), and provides no meaningful coverage beyond what
basic fuzzing can achieve (running a vCPU with garbage is fuzzing's bread
and butter).
That the test has *all* of the above flaws is not coincidental, as the
code was copy+pasted almost verbatim from the syzkaller reproducer that
originally found the KVM bug (which has long since been fixed).
Cc: Michal Luczaj <mhal@rbox.co>
Link: https://groups.google.com/g/syzkaller/c/lHfau8E3SOE
Link: https://lore.kernel.org/r/20230815220030.560372-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add yet another macro to the VM/vCPU ioctl() framework to detect when an
ioctl() failed because KVM killed/bugged the VM, i.e. when there was
nothing wrong with the ioctl() itself. If KVM kills a VM, e.g. by way of
a failed KVM_BUG_ON(), all subsequent VM and vCPU ioctl()s will fail with
-EIO, which can be quite misleading and ultimately waste user/developer
time.
Use KVM_CHECK_EXTENSION on KVM_CAP_USER_MEMORY to detect if the VM is
dead and/or bug, as KVM doesn't provide a dedicated ioctl(). Using a
heuristic is obviously less than ideal, but practically speaking the logic
is bulletproof barring a KVM change, and any such change would arguably
break userspace, e.g. if KVM returns something other than -EIO.
Without the detection, tearing down a bugged VM yields a cryptic failure
when deleting memslots:
==== Test Assertion Failure ====
lib/kvm_util.c:689: !ret
pid=45131 tid=45131 errno=5 - Input/output error
1 0x00000000004036c3: __vm_mem_region_delete at kvm_util.c:689
2 0x00000000004042f0: kvm_vm_free at kvm_util.c:724 (discriminator 12)
3 0x0000000000402929: race_sync_regs at sync_regs_test.c:193
4 0x0000000000401cab: main at sync_regs_test.c:334 (discriminator 6)
5 0x0000000000416f13: __libc_start_call_main at libc-start.o:?
6 0x000000000041855f: __libc_start_main_impl at ??:?
7 0x0000000000401d40: _start at ??:?
KVM_SET_USER_MEMORY_REGION failed, rc: -1 errno: 5 (Input/output error)
Which morphs into a more pointed error message with the detection:
==== Test Assertion Failure ====
lib/kvm_util.c:689: false
pid=80347 tid=80347 errno=5 - Input/output error
1 0x00000000004039ab: __vm_mem_region_delete at kvm_util.c:689 (discriminator 5)
2 0x0000000000404660: kvm_vm_free at kvm_util.c:724 (discriminator 12)
3 0x0000000000402ac9: race_sync_regs at sync_regs_test.c:193
4 0x0000000000401cb7: main at sync_regs_test.c:334 (discriminator 6)
5 0x0000000000418263: __libc_start_call_main at libc-start.o:?
6 0x00000000004198af: __libc_start_main_impl at ??:?
7 0x0000000000401d90: _start at ??:?
KVM killed/bugged the VM, check the kernel log for clues
Suggested-by: Michal Luczaj <mhal@rbox.co>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Colton Lewis <coltonlewis@google.com>
Link: https://lore.kernel.org/r/20231108010953.560824-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop _kvm_ioctl(), _vm_ioctl(), and _vcpu_ioctl(), as they are no longer
used by anything other than the no-underscores variants (and may have
never been used directly). The single-underscore variants were never
intended to be a "feature", they were a stopgap of sorts to ease the
conversion to pretty printing ioctl() names when reporting errors.
Opportunistically add a comment explaining when to use __KVM_IOCTL_ERROR()
versus KVM_IOCTL_ERROR(). The single-underscore macros were subtly
ensuring that the name of the ioctl() was printed on error, i.e. it's all
too easy to overlook the fact that using __KVM_IOCTL_ERROR() is
intentional.
Link: https://lore.kernel.org/r/20231108010953.560824-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Using -MD without -MP causes build failures when a header file is deleted
or moved. With -MP, the compiler will emit phony targets for the header
files it lists as dependencies, and the Makefiles won't refuse to attempt
to rebuild a C unit which no longer includes the deleted header.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lore.kernel.org/r/9fc8b5395321abbfcaf5d78477a9a7cd350b08e4.camel@infradead.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
struct ynl_req_state carries reply-related info from generated code
into generic YNL code. While we don't need reply info to execute
a request without a reply, we still need to pass in the struct, because
it's also where we get the pointer to struct ynl_sock from. Passing NULL
results in crashes if kernel returns an error or an unexpected reply.
Fixes: dc0956c98f11 ("tools: ynl-gen: move the response reading logic into YNL")
Link: https://lore.kernel.org/r/20231126225858.2144136-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When linking statically, libraries may require other dependencies to be
included to ld flags. In particular, libelf may require libzstd. Use
pkg-config to determine such dependencies.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231125084253.85025-4-akihiko.odaki@daynix.com
|
|
A library may need to depend on additional archive files for static
builds so pkg-config should be instructed to list them.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231125084253.85025-3-akihiko.odaki@daynix.com
|
|
pkg-config is used to build sign-file executable. It should use the
library for the target instead of the host as it is called during tests.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231125084253.85025-2-akihiko.odaki@daynix.com
|
|
Adding support to display details for uprobe_multi links,
both plain:
# bpftool link -p
...
24: uprobe_multi prog 126
uprobe.multi path /home/jolsa/bpf/test_progs func_cnt 3 pid 4143
offset ref_ctr_offset cookies
0xd1f88 0xf5d5a8 0xdead
0xd1f8f 0xf5d5aa 0xbeef
0xd1f96 0xf5d5ac 0xcafe
and json:
# bpftool link -p
[{
...
},{
"id": 24,
"type": "uprobe_multi",
"prog_id": 126,
"retprobe": false,
"path": "/home/jolsa/bpf/test_progs",
"func_cnt": 3,
"pid": 4143,
"funcs": [{
"offset": 860040,
"ref_ctr_offset": 16111016,
"cookie": 57005
},{
"offset": 860047,
"ref_ctr_offset": 16111018,
"cookie": 48879
},{
"offset": 860054,
"ref_ctr_offset": 16111020,
"cookie": 51966
}
]
}
]
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20231125193130.834322-7-jolsa@kernel.org
|
|
Adding fill_link_info test for uprobe_multi link.
Setting up uprobes with bogus ref_ctr_offsets and cookie values
to test all the bpf_link_info::uprobe_multi fields.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20231125193130.834322-6-jolsa@kernel.org
|
|
The fill_link_info test keeps skeleton open and just creates
various links. We are wrongly calling bpf_link__detach after
each test to close them, we need to call bpf_link__destroy.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20231125193130.834322-5-jolsa@kernel.org
|
|
Adding support to get uprobe_link details through bpf_link_info
interface.
Adding new struct uprobe_multi to struct bpf_link_info to carry
the uprobe_multi link details.
The uprobe_multi.count is passed from user space to denote size
of array fields (offsets/ref_ctr_offsets/cookies). The actual
array size is stored back to uprobe_multi.count (allowing user
to find out the actual array size) and array fields are populated
up to the user passed size.
All the non-array fields (path/count/flags/pid) are always set.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20231125193130.834322-4-jolsa@kernel.org
|
|
We need to get offsets for static variables in following changes,
so making elf_resolve_syms_offsets to take st_type value as argument
and passing it to elf_sym_iter_new.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20231125193130.834322-2-jolsa@kernel.org
|
|
Pass MAGIC_TOKEN to __TEST_REQUIRE() when printing the help message about
needing to pass a magic value to manually run the NX hugepages test,
otherwise the help message will contain garbage.
In file included from x86_64/nx_huge_pages_test.c:15:
x86_64/nx_huge_pages_test.c: In function ‘main’:
include/test_util.h:40:32: error: format ‘%d’ expects a matching ‘int’ argument [-Werror=format=]
40 | ksft_exit_skip("- " fmt "\n", ##__VA_ARGS__); \
| ^~~~
x86_64/nx_huge_pages_test.c:259:9: note: in expansion of macro ‘__TEST_REQUIRE’
259 | __TEST_REQUIRE(token == MAGIC_TOKEN,
| ^~~~~~~~~~~~~~
Signed-off-by: angquan yu <angquan21@gmail.com>
Link: https://lore.kernel.org/r/20231128221105.63093-1-angquan21@gmail.com
[sean: rewrite shortlog+changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The root-only cpuset.cpus.isolated control file shows the current set
of isolated CPUs in isolated partitions. This control file is currently
exposed only with the cgroup_debug boot command line option which also
adds the ".__DEBUG__." prefix. This is actually a useful control file if
users want to find out which CPUs are currently in an isolated state by
the cpuset controller. Remove CFTYPE_DEBUG flag for this control file and
make it available by default without any prefix.
The test_cpuset_prs.sh test script and the cgroup-v2.rst documentation
file are also updated accordingly. Minor code change is also made in
test_cpuset_prs.sh to avoid false test failure when running on debug
kernel.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
|
|
Regenerate the tools/ code after netdev spec changes.
Add sample to query page-pool info in a concise fashion:
$ ./page-pool
eth0[2] page pools: 10 (zombies: 0)
refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove this leftover from the times we pre-allocated everything
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231124154248.315470-6-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cleanup net namespaces and other resources if we get a SIGINT (Ctrl-C).
As user visible resources are allocated on a per test basis, it's only
required to catch this condition when (possibly) running tests.
So far calling post_suite is enough to free up anything that might
linger.
A missing keyword replacement for nsPlugin is also included.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231124154248.315470-5-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As suggested by Simon, prefix the functions that operate on iproute2
commands in contrast with the "nl" netlink prefix.
Cc: Simon Horman <horms@kernel.org>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231124154248.315470-4-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This operation is redundant and it's not stabilizing nor waiting
for anything.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231124154248.315470-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As tdc only tests loading/deleting and anything more complicated is
better left to the ebpf test suite, provide a pre-compiled version of
'action.c' and don't bother compiling it in kselftests or on the fly
at all.
Cc: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231124154248.315470-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Same init_rng() in both tests. The function reads /dev/urandom to
initialize srand(). In case of failure, it falls back onto the
entropy in the uninitialized variable. Not sure if this is on purpose.
But failure reading urandom should be rare, so just fail hard. While
at it, convert to getrandom(). Which man 4 random suggests is simpler
and more robust.
mptcp_inq.c:525:6:
mptcp_connect.c:1131:6:
error: variable 'foo' is used uninitialized
whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp")
Fixes: b51880568f20 ("selftests: mptcp: add inq test case")
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: Willem de Bruijn <willemb@google.com>
----
When input is randomized because this is expected to meaningfully
explore edge cases, should we also add
1. logging the random seed to stdout and
2. adding a command line argument to replay from a specific seed
I can do this in net-next, if authors find it useful in this case.
Reviewed-by: Matthieu Baerts <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20231124171645.1011043-5-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove an unused variable.
diag_uid.c:151:24:
error: unused variable 'udr'
[-Werror,-Wunused-variable]
Fixes: ac011361bd4f ("af_unix: Add test for sock_diag and UDIAG_SHOW_UID.")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231124171645.1011043-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Signedness of char is signed on x86_64, but unsigned on arm64.
Fix the warning building cmsg_sender.c on signed platforms or
forced with -fsigned-char:
msg_sender.c:455:12:
error: implicit conversion from 'int' to 'char'
changes value from 128 to -128
[-Werror,-Wconstant-conversion]
buf[0] = ICMPV6_ECHO_REQUEST;
constant ICMPV6_ECHO_REQUEST is 128.
Link: https://lwn.net/Articles/911914
Fixes: de17e305a810 ("selftests: net: cmsg_sender: support icmp and raw sockets")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231124171645.1011043-3-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a small compiler warning.
nr_process must be a signed long: it is assigned a signed long by
strtol() and is compared against LONG_MIN and LONG_MAX.
ipsec.c:2280:65:
error: result of comparison of constant -9223372036854775808
with expression of type 'unsigned int' is always false
[-Werror,-Wtautological-constant-out-of-range-compare]
if ((errno == ERANGE && (nr_process == LONG_MAX || nr_process == LONG_MIN))
Fixes: bc2652b7ae1e ("selftest/net/xfrm: Add test for ipsec tunnel")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com>
Link: https://lore.kernel.org/r/20231124171645.1011043-2-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
- filter orphaned programs by default
- when trying to query orphaned program, don't expect bpftool failure
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231127182057.1081138-2-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Commit ef01f4e25c17 ("bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD
and PERF_BPF_EVENT_PROG_UNLOAD") stopped removing program's id from
idr when the offloaded/bound netdev goes away. I was supposed to
take a look and check in [0], but apparently I did not.
Martin points out it might be useful to keep it that way for
observability sake, but we at least need to mark those programs as
unusable.
Mark those programs as 'orphaned' and keep printing the list when
we encounter ENODEV.
0: unspec tag 0000000000000000
xlated 0B not jited memlock 4096B orphaned
[0]: https://lore.kernel.org/all/CAKH8qBtyR20ZWAc11z1-6pGb3Hd47AQUTbE_cfoktG59TqaJ7Q@mail.gmail.com/
v3:
* use two spaces for " orphaned" (Quentin)
Cc: netdev@vger.kernel.org
Fixes: ef01f4e25c17 ("bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD")
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20231127182057.1081138-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Fixes: dd6cad2dcb58 ("testing: nvdimm: make struct class structures constant")
Signed-off-by: Yi Zhang <yi.zhang@redhat.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231127040026.362729-1-yi.zhang@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Improve granularity of format selection for S32/U32 formats by adding
constants representing 20, 24 and MAX most significant bits.
The MAX means the maximum number of significant bits which can
the physical format hold. For 32-bit formats, MAX is related
to 32 bits. For 8-bit formats, MAX is related to 8 bits etc.
As there is only one user currently (format S32_LE), subformat is
represented by a simple u32 and stores flags only for that one user
alone. The approach of subformat being part of struct snd_pcm_hardware
is a compromise between ALSA and ASoC allowing for
hw_params-intersection code to be alloc/free-less while not adding any
new responsibilities to ASoC runtime structures.
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Co-developed-by: Cezary Rojewski <cezary.rojewski@intel.com>
Signed-off-by: Cezary Rojewski <cezary.rojewski@intel.com>
Link: https://lore.kernel.org/r/20231117120610.1755254-2-cezary.rojewski@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Add support for VM_MODE_P52V48_4K and VM_MODE_P52V48_16K guest modes by
using the FEAT_LPA2 pte format for stage1, when FEAT_LPA2 is available.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-13-ryan.roberts@arm.com
|
|
We are about to add 52 bit PA guest modes for 4K and 16K pages when the
system supports LPA2. In preparation beef up the logic that parses mmfr0
to also tell us what the maximum supported PA size is for each page
size. Max PA size = 0 implies the page size is not supported at all.
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20231127111737.1897081-12-ryan.roberts@arm.com
|
|
With latest upstream llvm18, the following test cases failed:
$ ./test_progs -j
#13/2 bpf_cookie/multi_kprobe_link_api:FAIL
#13/3 bpf_cookie/multi_kprobe_attach_api:FAIL
#13 bpf_cookie:FAIL
#77 fentry_fexit:FAIL
#78/1 fentry_test/fentry:FAIL
#78 fentry_test:FAIL
#82/1 fexit_test/fexit:FAIL
#82 fexit_test:FAIL
#112/1 kprobe_multi_test/skel_api:FAIL
#112/2 kprobe_multi_test/link_api_addrs:FAIL
[...]
#112 kprobe_multi_test:FAIL
#356/17 test_global_funcs/global_func17:FAIL
#356 test_global_funcs:FAIL
Further analysis shows llvm upstream patch [1] is responsible for the above
failures. For example, for function bpf_fentry_test7() in net/bpf/test_run.c,
without [1], the asm code is:
0000000000000400 <bpf_fentry_test7>:
400: f3 0f 1e fa endbr64
404: e8 00 00 00 00 callq 0x409 <bpf_fentry_test7+0x9>
409: 48 89 f8 movq %rdi, %rax
40c: c3 retq
40d: 0f 1f 00 nopl (%rax)
... and with [1], the asm code is:
0000000000005d20 <bpf_fentry_test7.specialized.1>:
5d20: e8 00 00 00 00 callq 0x5d25 <bpf_fentry_test7.specialized.1+0x5>
5d25: c3 retq
... and <bpf_fentry_test7.specialized.1> is called instead of <bpf_fentry_test7>
and this caused test failures for #13/#77 etc. except #356.
For test case #356/17, with [1] (progs/test_global_func17.c)), the main prog
looks like:
0000000000000000 <global_func17>:
0: b4 00 00 00 2a 00 00 00 w0 = 0x2a
1: 95 00 00 00 00 00 00 00 exit
... which passed verification while the test itself expects a verification
failure.
Let us add 'barrier_var' style asm code in both places to prevent function
specialization which caused selftests failure.
[1] https://github.com/llvm/llvm-project/pull/72903
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231127050342.1945270-1-yonghong.song@linux.dev
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fixes from Helge Deller:
"This patchset fixes and enforces correct section alignments for the
ex_table, altinstructions, parisc_unwind, jump_table and bug_table
which are created by inline assembly.
Due to not being correctly aligned at link & load time they can
trigger unnecessarily the kernel unaligned exception handler at
runtime. While at it, I switched the bug table to use relative
addresses which reduces the size of the table by half on 64-bit.
We still had the ENOSYM and EREMOTERELEASE errno symbols as left-overs
from HP-UX, which now trigger build-issues with glibc. We can simply
remove them.
Most of the patches are tagged for stable kernel series.
Summary:
- Drop HP-UX ENOSYM and EREMOTERELEASE return codes to avoid glibc
build issues
- Fix section alignments for ex_table, altinstructions, parisc unwind
table, jump_table and bug_table
- Reduce size of bug_table on 64-bit kernel by using relative
pointers"
* tag 'parisc-for-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Reduce size of the bug_table on 64-bit kernel by half
parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes
parisc: Use natural CPU alignment for bug_table
parisc: Ensure 32-bit alignment on parisc unwind section
parisc: Mark lock_aligned variables 16-byte aligned on SMP
parisc: Mark jump_table naturally aligned
parisc: Mark altinstructions read-only and 32-bit aligned
parisc: Mark ex_table entries 32-bit aligned in uaccess.h
parisc: Mark ex_table entries 32-bit aligned in assembly.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Fix "rodata=on" not disabling "rodata=full" on arm64
- Add arm64 make dependency between vmlinuz.efi and Image, leading to
occasional build failures previously (with parallel building)
- Add newline to the output formatting of the za-fork kselftest
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: add dependency between vmlinuz.efi and Image
kselftest/arm64: Fix output formatting for za-fork
arm64: mm: Fix "rodata=on" when CONFIG_RODATA_FULL_DEFAULT_ENABLED=y
|
|
Those return codes are only defined for the parisc architecture and
are leftovers from when we wanted to be HP-UX compatible.
They are not returned by any Linux kernel syscall but do trigger
problems with the glibc strerrorname_np() and strerror() functions as
reported in glibc issue #31080.
There is no need to keep them, so simply remove them.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: Bruno Haible <bruno@clisp.org>
Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080
Cc: stable@vger.kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a syntax error in the sleepgraph utility which causes it to exit
early on every invocation (David Woodhouse)"
* tag 'pm-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: tools: Fix sleepgraph syntax error
|
|
The enum name used for id-to-str table does not handle
the enum-name override in the spec correctly.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If a new family is ever added with a dash in the name
the C codegen will break. Make sure we use the "safe"
form of the name consistently.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
32bit builds generate the following warning when we use a u32-max
in range validation:
warning: decimal constant 4294967295 is between LONG_MAX and ULONG_MAX. For C99 that means long long, C90 compilers are very likely to produce unsigned long (and a warning) here
The range values are u64, slap ULL/LL on all of them just
to avoid such noise.
There's currently no code using full range validation, but
it will matter in the upcoming page-pool introspection.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a few test that validate BPF verifier's lazy approach to validating
global subprogs.
We check that global subprogs that are called transitively through
another global subprog is validated.
We also check that invalid global subprog is not validated, if it's not
called from the main program.
And we also check that main program is always validated first, before
any of the subprogs.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231124035937.403208-4-andrii@kernel.org
|
|
Slightly change BPF verifier logic around eagerness and order of global
subprog validation. Instead of going over every global subprog eagerly
and validating it before main (entry) BPF program is verified, turn it
around. Validate main program first, mark subprogs that were called from
main program for later verification, but otherwise assume it is valid.
Afterwards, go over marked global subprogs and validate those,
potentially marking some more global functions as being called. Continue
this process until all (transitively) callable global subprogs are
validated. It's a BFS traversal at its heart and will always converge.
This is an important change because it allows to feature-gate some
subprograms that might not be verifiable on some older kernel, depending
on supported set of features.
E.g., at some point, global functions were allowed to accept a pointer
to memory, which size is identified by user-provided type.
Unfortunately, older kernels don't support this feature. With BPF CO-RE
approach, the natural way would be to still compile BPF object file once
and guard calls to this global subprog with some CO-RE check or using
.rodata variables. That's what people do to guard usage of new helpers
or kfuncs, and any other new BPF-side feature that might be missing on
old kernels.
That's currently impossible to do with global subprogs, unfortunately,
because they are eagerly and unconditionally validated. This patch set
aims to change this, so that in the future when global funcs gain new
features, those can be guarded using BPF CO-RE techniques in the same
fashion as any other new kernel feature.
Two selftests had to be adjusted in sync with these changes.
test_global_func12 relied on eager global subprog validation failing
before main program failure is detected (unknown return value). Fix by
making sure that main program is always valid.
verifier_subprog_precision's parent_stack_slot_precise subtest relied on
verifier checkpointing heuristic to do a checkpoint at instruction #5,
but that's no longer true because we don't have enough jumps validated
before reaching insn #5 due to global subprogs being validated later.
Other than that, no changes, as one would expect.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231124035937.403208-3-andrii@kernel.org
|
|
This is a simple script that parses the Netlink YAML spec files
(Documentation/netlink/specs/), and generates RST files to be rendered
in the Network -> Netlink Specification documentation page.
Create a python script that is invoked during 'make htmldocs', reads the
YAML specs input file and generate the correspondent RST file.
Create a new Documentation/networking/netlink_spec index page, and
reference each Netlink RST file that was processed above in this main
index.rst file.
In case of any exception during the parsing, dump the error and skip
the file.
Do not regenerate the RST files if the input files (YAML) were not
changed in-between invocations.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
----
Changelog:
V3:
* Do not regenerate the RST files if the input files were not
changed. In order to do it, a few things changed:
- Rely on Makefile more to find what changed, and trigger
individual file processing
- The script parses file by file now (instead of batches)
- Create a new option to generate the index file
V2:
* Moved the logic from a sphinx extension to a external script
* Adjust some formatting as suggested by Donald Hunter and Jakub
* Auto generating all the rsts instead of having stubs
* Handling error gracefully
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Bump libbpf.map to v1.4.0 to start a new libbpf version cycle.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231123000439.12025-1-eddyz87@gmail.com
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/intel/ice/ice_main.c
c9663f79cd82 ("ice: adjust switchdev rebuild path")
7758017911a4 ("ice: restore timestamp configuration after device reset")
https://lore.kernel.org/all/20231121211259.3348630-1-anthony.l.nguyen@intel.com/
Adjacent changes:
kernel/bpf/verifier.c
bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations")
5f99f312bd3b ("bpf: add register bounds sanity checks and sanitization")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf.
Current release - regressions:
- Revert "net: r8169: Disable multicast filter for RTL8168H and
RTL8107E"
- kselftest: rtnetlink: fix ip route command typo
Current release - new code bugs:
- s390/ism: make sure ism driver implies smc protocol in kconfig
- two build fixes for tools/net
Previous releases - regressions:
- rxrpc: couple of ACK/PING/RTT handling fixes
Previous releases - always broken:
- bpf: verify bpf_loop() callbacks as if they are called unknown
number of times
- improve stability of auto-bonding with Hyper-V
- account BPF-neigh-redirected traffic in interface statistics
Misc:
- net: fill in some more MODULE_DESCRIPTION()s"
* tag 'net-6.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
tools: ynl: fix duplicate op name in devlink
tools: ynl: fix header path for nfsd
net: ipa: fix one GSI register field width
tls: fix NULL deref on tls_sw_splice_eof() with empty record
net: axienet: Fix check for partial TX checksum
vsock/test: fix SEQPACKET message bounds test
i40e: Fix adding unsupported cloud filters
ice: restore timestamp configuration after device reset
ice: unify logic for programming PFINT_TSYN_MSK
ice: remove ptp_tx ring parameter flag
amd-xgbe: propagate the correct speed and duplex status
amd-xgbe: handle the corner-case during tx completion
amd-xgbe: handle corner-case during sfp hotplug
net: veth: fix ethtool stats reporting
octeontx2-pf: Fix ntuple rule creation to direct packet to VF with higher Rx queue than its PF
net: usb: qmi_wwan: claim interface 4 for ZTE MF290
Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"
net/smc: avoid data corruption caused by decline
nfc: virtual_ncidev: Add variable to check if ndev is running
dpll: Fix potential msg memleak when genlmsg_put_reply failed
...
|
|
We don't support CRUD-inspired message types in YNL too well.
One aspect that currently trips us up is the fact that single
message ID can be used in multiple commands (as the response).
This leads to duplicate entries in the id-to-string tables:
devlink-user.c:19:34: warning: initialized field overwritten [-Woverride-init]
19 | [DEVLINK_CMD_PORT_NEW] = "port-new",
| ^~~~~~~~~~
devlink-user.c:19:34: note: (near initialization for ‘devlink_op_strmap[7]’)
Fixes tag points at where the code was generated, the "real" problem
is that the code generator does not support CRUD.
Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Link: https://lore.kernel.org/r/20231123030558.1611831-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The makefile dependency is trying to include the wrong header:
<command-line>: fatal error: ../../../../include/uapi//linux/nfsd.h: No such file or directory
The guard also looks wrong.
Fixes: f14122b2c2ac ("tools: ynl: Add source files for nfsd netlink protocol")
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20231123030624.1611925-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tune message length calculation to make this test work on machines
where 'getpagesize()' returns >32KB. Now maximum message length is not
hardcoded (on machines above it was smaller than 'getpagesize()' return
value, thus we get negative value and test fails), but calculated at
runtime and always bigger than 'getpagesize()' result. Reproduced on
aarch64 with 64KB page size.
Fixes: 5c338112e48a ("test/vsock: rework message bounds test")
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reported-by: Bogdan Marcynkov <bmarcynk@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231121211642.163474-1-avkrasnov@salutedevices.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|