Age | Commit message (Collapse) | Author | Files | Lines |
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-02-29
We've added 119 non-merge commits during the last 32 day(s) which contain
a total of 150 files changed, 3589 insertions(+), 995 deletions(-).
The main changes are:
1) Extend the BPF verifier to enable static subprog calls in spin lock
critical sections, from Kumar Kartikeya Dwivedi.
2) Fix confusing and incorrect inference of PTR_TO_CTX argument type
in BPF global subprogs, from Andrii Nakryiko.
3) Larger batch of riscv BPF JIT improvements and enabling inlining
of the bpf_kptr_xchg() for RV64, from Pu Lehui.
4) Allow skeleton users to change the values of the fields in struct_ops
maps at runtime, from Kui-Feng Lee.
5) Extend the verifier's capabilities of tracking scalars when they
are spilled to stack, especially when the spill or fill is narrowing,
from Maxim Mikityanskiy & Eduard Zingerman.
6) Various BPF selftest improvements to fix errors under gcc BPF backend,
from Jose E. Marchesi.
7) Avoid module loading failure when the module trying to register
a struct_ops has its BTF section stripped, from Geliang Tang.
8) Annotate all kfuncs in .BTF_ids section which eventually allows
for automatic kfunc prototype generation from bpftool, from Daniel Xu.
9) Several updates to the instruction-set.rst IETF standardization
document, from Dave Thaler.
10) Shrink the size of struct bpf_map resp. bpf_array,
from Alexei Starovoitov.
11) Initial small subset of BPF verifier prepwork for sleepable bpf_timer,
from Benjamin Tissoires.
12) Fix bpftool to be more portable to musl libc by using POSIX's
basename(), from Arnaldo Carvalho de Melo.
13) Add libbpf support to gcc in CORE macro definitions,
from Cupertino Miranda.
14) Remove a duplicate type check in perf_event_bpf_event,
from Florian Lehner.
15) Fix bpf_spin_{un,}lock BPF helpers to actually annotate them
with notrace correctly, from Yonghong Song.
16) Replace the deprecated bpf_lpm_trie_key 0-length array with flexible
array to fix build warnings, from Kees Cook.
17) Fix resolve_btfids cross-compilation to non host-native endianness,
from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
selftests/bpf: Test if shadow types work correctly.
bpftool: Add an example for struct_ops map and shadow type.
bpftool: Generated shadow variables for struct_ops maps.
libbpf: Convert st_ops->data to shadow type.
libbpf: Set btf_value_type_id of struct bpf_map for struct_ops.
bpf: Replace bpf_lpm_trie_key 0-length array with flexible array
bpf, arm64: use bpf_prog_pack for memory management
arm64: patching: implement text_poke API
bpf, arm64: support exceptions
arm64: stacktrace: Implement arch_bpf_stack_walk() for the BPF JIT
bpf: add is_async_callback_calling_insn() helper
bpf: introduce in_sleepable() helper
bpf: allow more maps in sleepable bpf programs
selftests/bpf: Test case for lacking CFI stub functions.
bpf: Check cfi_stubs before registering a struct_ops type.
bpf: Clarify batch lookup/lookup_and_delete semantics
bpf, docs: specify which BPF_ABS and BPF_IND fields were zero
bpf, docs: Fix typos in instruction-set.rst
selftests/bpf: update tcp_custom_syncookie to use scalar packet offset
bpf: Shrink size of struct bpf_map/bpf_array.
...
====================
Link: https://lore.kernel.org/r/20240301001625.8800-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace deprecated 0-length array in struct bpf_lpm_trie_key with
flexible array. Found with GCC 13:
../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=]
207 | *(__be16 *)&key->data[i]);
| ^~~~~~~~~~~~~
../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16'
102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
| ^
../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu'
97 | #define be16_to_cpu __be16_to_cpu
| ^~~~~~~~~~~~~
../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu'
206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i]
^
| ^~~~~~~~~~~
In file included from ../include/linux/bpf.h:7:
../include/uapi/linux/bpf.h:82:17: note: while referencing 'data'
82 | __u8 data[0]; /* Arbitrary size */
| ^~~~
And found at run-time under CONFIG_FORTIFY_SOURCE:
UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49
index 0 is out of range for type '__u8 [*]'
Changing struct bpf_lpm_trie_key is difficult since has been used by
userspace. For example, in Cilium:
struct egress_gw_policy_key {
struct bpf_lpm_trie_key lpm_key;
__u32 saddr;
__u32 daddr;
};
While direct references to the "data" member haven't been found, there
are static initializers what include the final member. For example,
the "{}" here:
struct egress_gw_policy_key in_key = {
.lpm_key = { 32 + 24, {} },
.saddr = CLIENT_IP,
.daddr = EXTERNAL_SVC_IP & 0Xffffff,
};
To avoid the build time and run time warnings seen with a 0-sized
trailing array for struct bpf_lpm_trie_key, introduce a new struct
that correctly uses a flexible array for the trailing bytes,
struct bpf_lpm_trie_key_u8. As part of this, include the "header"
portion (which is just the "prefixlen" member), so it can be used
by anything building a bpf_lpr_trie_key that has trailing members that
aren't a u8 flexible array (like the self-test[1]), which is named
struct bpf_lpm_trie_key_hdr.
Unfortunately, C++ refuses to parse the __struct_group() helper, so
it is not possible to define struct bpf_lpm_trie_key_hdr directly in
struct bpf_lpm_trie_key_u8, so we must open-code the union directly.
Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out,
and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment
to the UAPI header directing folks to the two new options.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Closes: https://paste.debian.net/hidden/ca500597/
Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1]
Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org
|
|
We've had issues with gcc and 'asm goto' before, and we created a
'asm_volatile_goto()' macro for that in the past: see commits
3f0116c3238a ("compiler/gcc4: Add quirk for 'asm goto' miscompilation
bug") and a9f180345f53 ("compiler/gcc4: Make quirk for
asm_volatile_goto() unconditional").
Then, much later, we ended up removing the workaround in commit
43c249ea0b1e ("compiler-gcc.h: remove ancient workaround for gcc PR
58670") because we no longer supported building the kernel with the
affected gcc versions, but we left the macro uses around.
Now, Sean Christopherson reports a new version of a very similar
problem, which is fixed by re-applying that ancient workaround. But the
problem in question is limited to only the 'asm goto with outputs'
cases, so instead of re-introducing the old workaround as-is, let's
rename and limit the workaround to just that much less common case.
It looks like there are at least two separate issues that all hit in
this area:
(a) some versions of gcc don't mark the asm goto as 'volatile' when it
has outputs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98619
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110420
which is easy to work around by just adding the 'volatile' by hand.
(b) Internal compiler errors:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110422
which are worked around by adding the extra empty 'asm' as a
barrier, as in the original workaround.
but the problem Sean sees may be a third thing since it involves bad
code generation (not an ICE) even with the manually added 'volatile'.
but the same old workaround works for this case, even if this feels a
bit like voodoo programming and may only be hiding the issue.
Reported-and-tested-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently %ld format specifiers are being used for unsigned long
values. Fix this by using %lu instead. Cleans up cppcheck warnings:
warning: %ld in format string (no. 1) requires 'long' but the argument
type is 'unsigned long'. [invalidPrintfArgType_sint]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/bpf/20231219152307.368921-1-colin.i.king@gmail.com
|
|
samples/bpf build its own bpftool boostrap to generate vmlinux.h as well
as some BPF objects. This is a redundant step if bpftool has been
already built, so update samples/bpf/Makefile such that it accepts a
path to bpftool passed via the BPFTOOL variable. The approach is
practically the same as tools/testing/selftests/bpf/Makefile uses.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/bd746954ac271b02468d8d951ff9f11e655d485b.1698213811.git.vmalik@redhat.com
|
|
samples/bpf/Makefile passes LDFLAGS=$(TPROGS_LDFLAGS) to libbpf build
without surrounding quotes, which may cause compilation errors when
passing custom TPROGS_USER_LDFLAGS.
For example:
$ make -C samples/bpf/ TPROGS_USER_LDFLAGS="-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec"
make: Entering directory './samples/bpf'
make -C ../../ M=./samples/bpf BPF_SAMPLES_PATH=./samples/bpf
make[1]: Entering directory '.'
make -C ./samples/bpf/../../tools/lib/bpf RM='rm -rf' EXTRA_CFLAGS="-Wall -O2 -Wmissing-prototypes -Wstrict-prototypes -I./usr/include -I./tools/testing/selftests/bpf/ -I./samples/bpf/libbpf/include -I./tools/include -I./tools/perf -I./tools/lib -DHAVE_ATTR_TEST=0" \
LDFLAGS=-Wl,--as-needed -specs=/usr/lib/gcc/x86_64-redhat-linux/13/libsanitizer.spec srctree=./samples/bpf/../../ \
O= OUTPUT=./samples/bpf/libbpf/ DESTDIR=./samples/bpf/libbpf prefix= \
./samples/bpf/libbpf/libbpf.a install_headers
make: invalid option -- 'c'
make: invalid option -- '='
make: invalid option -- '/'
make: invalid option -- 'u'
make: invalid option -- '/'
[...]
Fix the error by properly quoting $(TPROGS_LDFLAGS).
Suggested-by: Donald Zickus <dzickus@redhat.com>
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/c690de6671cc6c983d32a566d33fd7eabd18b526.1698213811.git.vmalik@redhat.com
|
|
Currently, it is not possible to specify custom flags when building
samples/bpf. The flags are defined in TPROGS_CFLAGS/TPROGS_LDFLAGS
variables, however, when trying to override those from the make command,
compilation fails.
For example, when trying to build with PIE:
$ make -C samples/bpf TPROGS_CFLAGS="-fpie" TPROGS_LDFLAGS="-pie"
This is because samples/bpf/Makefile updates these variables, especially
appends include paths to TPROGS_CFLAGS and these updates are overridden
by setting the variables from the make command.
This patch introduces variables TPROGS_USER_CFLAGS/TPROGS_USER_LDFLAGS
for this purpose, which can be set from the make command and their
values are propagated to TPROGS_CFLAGS/TPROGS_LDFLAGS.
Signed-off-by: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/2d81100b830a71f0e72329cc7781edaefab75f62.1698213811.git.vmalik@redhat.com
|
|
This modification doesn't change behaviour of the syscall_tp
But such code is often used as a reference so it should be
correct anyway
Signed-off-by: Denys Zagorui <dzagorui@cisco.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20231019113521.4103825-1-dzagorui@cisco.com
|
|
The sanitizer flag, which is supported by both clang and gcc, would make
it easier to debug array index out-of-bounds problems in these programs.
Make the Makfile smarter to detect ubsan support from the compiler and
add the '-fsanitize=bounds' accordingly.
Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20230927045030.224548-2-ruowenq2@illinois.edu
|
|
Commit 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint
to syscall_tp sample") added two more eBPF programs to support the
openat2() syscall. However, it did not increase the size of the array
that holds the corresponding bpf_links. This leads to an out-of-bound
access on that array in the bpf_object__for_each_program loop and could
corrupt other variables on the stack. On our testing QEMU, it corrupts
the map1_fds array and causes the sample to fail:
# ./syscall_tp
prog #0: map ids 4 5
verify map:4 val: 5
map_lookup failed: Bad file descriptor
Dynamically allocate the array based on the number of programs reported
by libbpf to prevent similar inconsistencies in the future
Fixes: 06744f24696e ("samples/bpf: Add openat2() enter/exit tracepoint to syscall_tp sample")
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Link: https://lore.kernel.org/r/20230917214220.637721-4-jinghao7@illinois.edu
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The variable name num_progs causes confusion because that variable
really controls the number of rounds the test should be executed.
Rename num_progs into nr_tests for the sake of clarity.
Signed-off-by: Jinghao Jia <jinghao@linux.ibm.com>
Signed-off-by: Ruowen Qin <ruowenq2@illinois.edu>
Signed-off-by: Jinghao Jia <jinghao7@illinois.edu>
Link: https://lore.kernel.org/r/20230917214220.637721-3-jinghao7@illinois.edu
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Static ksyms often have problems because the number of symbols exceeds the
MAX_SYMS limit. Like changing the MAX_SYMS from 300000 to 400000 in
commit e76a014334a6("selftests/bpf: Bump and validate MAX_SYMS") solves
the problem somewhat, but it's not the perfect way.
This commit uses dynamic memory allocation, which completely solves the
problem caused by the limitation of the number of kallsyms. At the same
time, add APIs:
load_kallsyms_local()
ksym_search_local()
ksym_get_addr_local()
free_kallsyms_local()
There are used to solve the problem of selftests/bpf updating kallsyms
after attach new symbols during testmod testing.
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/tencent_C9BDA68F9221F21BE4081566A55D66A9700A@qq.com
|
|
To help users find the XDP utilities, add a note to the README about the
new location and the conversion documentation in the commit messages.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-8-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Remove no longer present XDP utilities from .gitignore. Apart from the
recently removed XDP utilities this also includes the previously removed
xdpsock and xsk utilities.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-7-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The functionality of this utility is covered by the xdpdump utility in
xdp-tools.
There's a slight difference in usage as the xdpdump utility's main focus is
to dump packets before or after they are processed by an existing XDP
program. However, xdpdump also has the --load-xdp-program switch, which
will make it attach its own program if no existing program is loaded. With
this, xdp_sample_pkts usage can be converted as:
xdp_sample_pkts eth0
--> xdpdump --load-xdp-program eth0
To get roughly equivalent behaviour.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-6-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The functionality of these utilities have been incorporated into the
xdp-bench utility in xdp-tools.
Equivalent functionality is:
xdp1 eth0
--> xdp-bench drop -p parse-ip -l load-bytes eth0
xdp2 eth0
--> xdp-bench drop -p swap-macs eth0
Note that there's a slight difference in behaviour of those examples: the
swap-macs operation of xdp-bench doesn't use the bpf_xdp_load_bytes()
helper to load the packet data, whereas the xdp2 utility did so
unconditionally. For the parse-ip action the use of bpf_xdp_load_bytes()
can be selected by the '-l load-bytes' switch, with the difference that the
xdp-bench utility will perform two separate calls to the helper, one to
load the ethernet header and another to load the IP header; where the xdp1
utility only performed one call always loading 60 bytes of data.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-5-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The functionality of this utility has been incorporated into the xdp-bench
utility in xdp-tools, by way of the --rxq-stats argument to the 'drop',
'pass' and 'tx' commands of xdp-bench.
Some examples of how to convert xdp_rxq_info invocations into equivalent
xdp-bench commands:
xdp_rxq_info -d eth0
--> xdp-bench pass --rxq-stats eth0
xdp_rxq_info -d eth0 -a XDP_DROP -m
--> xdp-bench drop --rxq-stats -p swap-macs eth0
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-4-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
These utilities have all been ported to xdp-tools as functions of the
xdp-bench utility. The four different utilities in samples are incorporated
as separate subcommands to xdp-bench, with most of the command line
parameters left intact, except that mandatory arguments are always
positional in xdp-bench. For full usage details see the --help output of
each command, or the xdp-bench man page.
Some examples of how to convert usage to xdp-bench are:
xdp_redirect eth0 eth1
--> xdp-bench redirect eth0 eth1
xdp_redirect_map eth0 eth1
--> xdp-bench redirect-map eth0 eth1
xdp_redirect_map_multi eth0 eth1 eth2 eth3
--> xdp-bench redirect-multi eth0 eth1 eth2 eth3
xdp_redirect_cpu -d eth0 -c 0 -c 1
--> xdp-bench redirect-cpu -c 0 -c 1 eth0
xdp_redirect_cpu -d eth0 -c 0 -c 1 -r eth1
--> xdp-bench redirect-cpu -c 0 -c 1 eth0 -r redirect -D eth1
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-3-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This utility has been ported as-is to xdp-tools as 'xdp-monitor'. The only
difference in usage between the samples and xdp-tools versions is that the
'-v' command line parameter has been changed to '-e' in the xdp-tools
version for consistency with the other utilities.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230824102255.1561885-2-toke@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With the introduction of kprobe.multi, it is now possible to attach
multiple kprobes to a single BPF program without the need for multiple
definitions. Additionally, this method supports wildcard-based
matching, allowing for further simplification of BPF programs. In here,
an asterisk (*) wildcard is used to map to all symbols relevant to
spin_{lock|unlock}.
Furthermore, since kprobe.multi handles symbol matching, this commit
eliminates the need for the previous logic of reading the ksym table to
verify the existence of symbols.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-10-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit refactors the syscall tracing programs by adopting the
BPF_KSYSCALL macro. This change aims to enhance the clarity and
simplicity of the BPF programs by reducing the complexity of argument
parsing from pt_regs.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-9-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In the commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup
potential deadlock"), a potential deadlock issue was addressed, which
resulted in *_map_lookup_elem not triggering BPF programs.
(prior to lookup, bpf_disable_instrumentation() is used)
To resolve the broken map lookup probe using "htab_map_lookup_elem",
this commit introduces an alternative approach. Instead, it utilize
"bpf_map_copy_value" and apply a filter specifically for the hash table
with map_type.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Fixes: 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock")
Link: https://lore.kernel.org/r/20230818090119.477441-8-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Recently, a new tracepoint for the block layer, specifically the
block_io_start/done tracepoints, was introduced in commit 5a80bd075f3b
("block: introduce block_io_start/block_io_done tracepoints").
Previously, the kprobe entry used for this purpose was quite unstable
and inherently broke relevant probes [1]. Now that a stable tracepoint
is available, this commit replaces the bio latency check with it.
One of the changes made during this replacement is the key used for the
hash table. Since 'struct request' cannot be used as a hash key, the
approach taken follows that which was implemented in bcc/biolatency [2].
(uses dev:sector for the key)
[1]: https://github.com/iovisor/bcc/issues/4261
[2]: https://github.com/iovisor/bcc/pull/4691
Fixes: 450b7879e345 ("block: move blk_account_io_{start,done} to blk-mq.c")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-7-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The existing tracing programs have been developed for a considerable
period of time and, as a result, do not properly incorporate the
features of the current libbpf, such as CO-RE. This is evident in
frequent usage of functions like PT_REGS* and the persistence of "hack"
methods using underscore-style bpf_probe_read_kernel from the past.
These programs are far behind the current level of libbpf and can
potentially confuse users. Therefore, this commit aims to convert the
outdated BPF programs to be more CO-RE centric.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-6-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, multiple kprobe programs are suffering from symbol mismatch
due to compiler optimization. These optimizations might induce
additional suffix to the symbol name such as '.isra' or '.constprop'.
# egrep ' finish_task_switch| __netif_receive_skb_core' /proc/kallsyms
ffffffff81135e50 t finish_task_switch.isra.0
ffffffff81dd36d0 t __netif_receive_skb_core.constprop.0
ffffffff8205cc0e t finish_task_switch.isra.0.cold
ffffffff820b1aba t __netif_receive_skb_core.constprop.0.cold
To avoid this, this commit replaces the original kprobe section to
kprobe.multi in order to match symbol with wildcard characters. Here,
asterisk is used for avoiding symbol mismatch.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-5-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, BPF programs typically have a suffix of .bpf.c. However,
some programs still utilize a mixture of _kern.c suffix alongside the
naming convention. In order to achieve consistency in the naming of
these programs, this commit unifies the inconsistency in the naming
convention of BPF kernel programs.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit replaces separate headers with a single vmlinux.h to
tracing programs. Thanks to that, we no longer need to define the
argument structure for tracing programs directly. For example, argument
for the sched_switch tracpepoint (sched_switch_args) can be replaced
with the vmlinux.h provided trace_event_raw_sched_switch.
Additional defines have been added to the BPF program either directly
or through the inclusion of net_shared.h. Defined values are
PERF_MAX_STACK_DEPTH, IFNAMSIZ constants and __stringify() macro. This
change enables the BPF program to access internal structures with BTF
generated "vmlinux.h" header.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-3-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, compiling the bpf programs will result the warning with the
ignored attribute as follows. This commit fixes the warning by adding
cf-protection option.
In file included from ./arch/x86/include/asm/linkage.h:6:
./arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr u64 ibt_save(bool disable);
^
./arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
^
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230818090119.477441-2-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Update samples/bpf/README.rst to add pahole to the build dependencies
list. Add the reference to "Documentation/process/changes.rst" for
minimum version required so that the version required will not be
outdated in the future.
Signed-off-by: Anh Tuan Phan <tuananhlfc@gmail.com>
Link: https://lore.kernel.org/r/aecaf7a2-9100-cd5b-5cf4-91e5dbb2c90d@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
__NR_open never exist on AArch64.
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_C6AD4AD72BEFE813228FC188905F96C6A506@qq.com
|
|
The -target option has been deprecated since clang 3.4 in 2013. Therefore, use
the preferred --target=bpf form instead. This also matches how we use --target=
in scripts/Makefile.clang.
Signed-off-by: Fangrui Song <maskray@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://github.com/llvm/llvm-project/commit/274b6f0c87a6a1798de0a68135afc7f95def6277
Link: https://lore.kernel.org/bpf/20230624001856.1903733-1-maskray@google.com
|
|
Default samples/pktgen scripts send 60 byte packets as hardware adds
4-bytes FCS checksum, which fulfils minimum Ethernet 64 bytes frame
size.
XDP layer will not necessary have access to the 4-bytes FCS checksum.
This leads to bpf_xdp_load_bytes() failing as it tries to copy 64-bytes
from an XDP packet that only have 60-bytes available.
Fixes: 772251742262 ("samples/bpf: fixup some tools to be able to support xdp multibuffer")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/bpf/168545704139.2996228.2516528552939485216.stgit@firesoul
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
net/ipv4/raw.c
3632679d9e4f ("ipv{4,6}/raw: fix output xfrm lookup wrt protocol")
c85be08fc4fa ("raw: Stop using RTO_ONLINK.")
https://lore.kernel.org/all/20230525110037.2b532b83@canb.auug.org.au/
Adjacent changes:
drivers/net/ethernet/freescale/fec_main.c
9025944fddfe ("net: fec: add dma_wmb to ensure correct descriptor values")
144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__fallthrough is now not supported. Instead of renaming it to
now-canonical ([0]) fallthrough pseudo-keyword, just get rid of it and
equate 'h' case to default case, as both emit usage information and
succeed.
[0] https://www.kernel.org/doc/html/latest/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230516001718.317177-1-andrii@kernel.org
|
|
Using sizeof(nv) or strlen(nv)+1 is correct.
Fixes: c890063e4404 ("bpf: sample BPF_SOCKET_OPS_BASE_RTT program")
Signed-off-by: Pengcheng Yang <yangpc@wangsu.com>
Link: https://lore.kernel.org/r/1683276658-2860-1-git-send-email-yangpc@wangsu.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Macro PAGE_OFFSET(0xffff880000000000) in sampleip_user.c is inaccurate,
for example, in aarch64 architecture, this value depends on the
CONFIG_ARM64_VA_BITS compilation configuration, this value defaults to 48,
the corresponding PAGE_OFFSET is 0xffff800000000000, if we use the value
defined in sampleip_user.c, then all KSYMs obtained by sampleip are (user)
Symbol error due to PAGE_OFFSET error:
$ sudo ./sampleip 1
Sampling at 99 Hertz for 1 seconds. Ctrl-C also ends.
ADDR KSYM COUNT
0xffff80000810ceb8 (user) 1
0xffffb28ec880 (user) 1
0xffff8000080c82b8 (user) 1
0xffffb23fed24 (user) 1
0xffffb28944fc (user) 1
0xffff8000084628bc (user) 1
0xffffb2a935c0 (user) 1
0xffff80000844677c (user) 1
0xffff80000857a3a4 (user) 1
...
A few examples of addresses in the CONFIG_ARM64_VA_BITS=48 environment in
the aarch64 environment:
$ sudo head /proc/kallsyms
ffff8000080a0000 T _text
ffff8000080b0000 t gic_handle_irq
ffff8000080b0000 T _stext
ffff8000080b0000 T __irqentry_text_start
ffff8000080b00b0 t gic_handle_irq
ffff8000080b0230 t gic_handle_irq
ffff8000080b03b4 T __irqentry_text_end
ffff8000080b03b8 T __softirqentry_text_start
ffff8000080b03c0 T __do_softirq
ffff8000080b0718 T __entry_text_start
We just need to replace the PAGE_OFFSET with the address _text in
/proc/kallsyms to solve this problem:
$ sudo ./sampleip 1
Sampling at 99 Hertz for 1 seconds. Ctrl-C also ends.
ADDR KSYM COUNT
0xffffb2892ab0 (user) 1
0xffffb2b1edfc (user) 1
0xffff800008462834 __arm64_sys_ppoll 1
0xffff8000084b87f4 eventfd_read 1
0xffffb28e6788 (user) 1
0xffff8000081e96d8 rcu_all_qs 1
0xffffb2ada878 (user) 1
...
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Link: https://lore.kernel.org/r/tencent_A0E82E0BEE925285F8156D540731DF805F05@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Fix fout being fopen'ed but then not subsequently fclose'd. In the affected
branch, fout is otherwise going out of scope.
Signed-off-by: Hao Zeng <zenghao@kylinos.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230411084349.1999628-1-zenghao@kylinos.cn
|
|
The canonical location for the tracefs filesystem is at /sys/kernel/tracing.
But, from Documentation/trace/ftrace.rst:
Before 4.1, all ftrace tracing control files were within the debugfs
file system, which is typically located at /sys/kernel/debug/tracing.
For backward compatibility, when mounting the debugfs file system,
the tracefs file system will be automatically mounted at:
/sys/kernel/debug/tracing
Many comments and samples in the bpf code still refer to this older
debugfs path, so let's update them to avoid confusion. There are a few
spots where the bpf code explicitly checks both tracefs and debugfs
(tools/bpf/bpftool/tracelog.c and tools/lib/api/fs/fs.c) and I've left
those alone so that the tools can continue to work with both paths.
Signed-off-by: Ross Zwisler <zwisler@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20230313205628.1058720-2-zwisler@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use the new type-safe wrappers around bpf_obj_get_info_by_fd().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230214231221.249277-5-iii@linux.ibm.com
|
|
Commit fe3300897cbf("samples: bpf: fix syscall_tp due to unused syscall")
added openat() syscall tracepoints. This patch adds support for
openat2() as well.
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/tencent_9381CB1A158ED7ADD12C4406034E21A3AC07@qq.com
|
|
This commit changes the _kern suffix to .bpf with the BPF test programs.
With this modification, test programs will inherit the benefit of the
new CLANG-BPF compile target.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-11-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit applies vmlinux.h to BPF functionality testing program.
Macros that were not defined despite migration to "vmlinux.h" were
defined separately in individual files.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-10-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This commit applies "net_shared.h" to BPF programs to remove existing
network related header dependencies. Also, this commit removes
unnecessary headers before applying "vmlinux.h" to the BPF programs.
Mostly, endianness conversion function has been applied to the source.
In addition, several macros have been defined to fulfill the INET,
TC-related constants.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-9-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, many programs under sample/bpf often include individual
macros by directly including the header under "linux/" rather than
using the "vmlinux.h" header.
However, there are some problems with migrating to "vmlinux.h" because
there is no definition for utility functions such as endianness
conversion (ntohs/htons). Fortunately, the xdp_sample program already
has a function that can be replaced to solve this problem.
Therefore, this commit attempts to separate these functions into a file
called net_shared.h to make them universally available. Additionally,
this file includes network-related macros that are not defined in
"vmlinux.h". (inspired by 'selftests' bpf_tracing_net.h)
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-8-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With libbpf 1.0 release, support for legacy BPF map declaration syntax
had been dropped. If you run a program using legacy BPF in the latest
libbpf, the following error will be output.
libbpf: map 'lwt_len_hist_map' (legacy): legacy map definitions are deprecated, use BTF-defined maps instead
libbpf: Use of BPF_ANNOTATE_KV_PAIR is deprecated, use BTF-defined maps in .maps section instead
This commit replaces legacy map with the BTF-defined map.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-7-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test_overhead bpf program is designed to compare performance
between tracepoint and kprobe. Initially it used task_rename and
urandom_read tracepoint.
However, commit 14c174633f34 ("random: remove unused tracepoints")
removed urandom_read tracepoint, and for this reason the test_overhead
got broken.
This commit introduces new microbenchmark using fib_table_lookup.
This microbenchmark sends UDP packets to localhost in order to invoke
fib_table_lookup.
In a nutshell:
fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
addr.sin_addr.s_addr = inet_addr(DUMMY_IP);
addr.sin_port = htons(DUMMY_PORT);
for() {
sendto(fd, buf, strlen(buf), 0,
(struct sockaddr *)&addr, sizeof(addr));
}
on 4 cpus in parallel:
lookup per sec
base (no tracepoints, no kprobes) 381k
with kprobe at fib_table_lookup() 325k
with tracepoint at fib:fib_table_lookup 330k
with raw_tracepoint at fib:fib_table_lookup 365k
Fixes: 14c174633f34 ("random: remove unused tracepoints")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-6-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, executing test_cgrp2_sock2 fails due to wrong section
header. This 'cgroup/sock1' style section is previously used at
'samples/bpf_load' (deprecated) BPF loader. Because this style isn't
supported in libbpf, this commit fixes this problem by correcting the
section header.
$ sudo ./test_cgrp2_sock2.sh
libbpf: prog 'bpf_prog1': missing BPF prog type, check ELF section name 'cgroup/sock1'
libbpf: prog 'bpf_prog1': failed to load: -22
libbpf: failed to load object './sock_flags_kern.o'
ERROR: loading BPF object file failed
In addition, this BPF program filters ping packets by comparing whether
the socket type uses SOCK_RAW. However, after the ICMP socket[1] was
developed, ping sends ICMP packets using SOCK_DGRAM. Therefore, in this
commit, the packet filtering is changed to use SOCK_DGRAM instead of
SOCK_RAW.
$ strace --trace socket ping -6 -c1 -w1 ::1
socket(AF_INET6, SOCK_DGRAM, IPPROTO_ICMPV6) = 3
[1]: https://lwn.net/Articles/422330/
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-5-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The test_lwt_bpf is a script that tests the functionality of BPF through
the output of the ftrace with bpf_trace_printk. Currently, this program
is not operating normally for several reasons.
First of all, this test script can't parse the ftrace results properly.
GNU sed tries to be as greedy as possible when attempting pattern
matching. Due to this, cutting metadata (such as timestamp) from the
log entry of ftrace doesn't work properly, and also desired log isn't
extracted properly. To make sed stripping clearer, 'nocontext-info'
option with the ftrace has been used to remove metadata from the log.
Also, instead of using unclear pattern matching, this commit specifies
an explicit parse pattern.
Also, unlike before when this test was introduced, the way
bpf_trace_printk behaves has changed[1]. The previous bpf_trace_printk
had to always have '\n' in order to print newline, but now that the
bpf_trace_printk call includes newline by default, so '\n' is no longer
needed.
Lastly with the lwt ENCAP_BPF out, the context information with the
sk_buff protocol is preserved. Therefore, this commit changes the
previous test result from 'protocol 0' to 'protocol 8', which means
ETH_P_IP.
[1]: commit ac5a72ea5c89 ("bpf: Use dedicated bpf_trace_printk event instead of trace_printk()")
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-4-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, some test scripts are experiencing minor errors related to
executing tests.
$ sudo ./test_cgrp2_sock.sh
./test_cgrp2_sock.sh: 22: test_cgrp2_sock: not found
This problem occurs because the path to the execution target is not
properly specified. Therefore, this commit solves this problem by
specifying a relative path to its executables. This commit also makes
a concise refactoring of hard-coded BPF program names.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-3-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, a few of BPF tests use ipv6 functionality. The problem here
is that if ipv6 is disabled, these tests will fail, and even if the
test fails, it will not tell you why it failed.
$ sudo ./test_cgrp2_sock2.sh
RTNETLINK answers: Permission denied
In order to fix this, this commit ensures ipv6 is enabled prior to
running tests.
Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com>
Link: https://lore.kernel.org/r/20230115071613.125791-2-danieltimlee@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|