Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit.
RISCV saves the pointer to the CPU's task_struct in the TP (thread
pointer) register. This makes it trivial to get the CPU's processor id.
As thread_info is the first member of task_struct, we can read the
processor id from TP + offsetof(struct thread_info, cpu).
RISCV64 JIT output for `call bpf_get_smp_processor_id`
======================================================
      Before                      After
     --------                    -------
  auipc   t1,0x848c           ld      a5,32(tp)
  jalr    604(t1)
  mv      a5,a0
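The offset-based read described above can be modelled in plain C. This is only a sketch: the two structs are hypothetical, simplified stand-ins (the real kernel layouts differ), and cpu_from_tp() is an illustrative name, not a kernel function. It shows that when thread_info is the first member, a single load from the base pointer plus offsetof(..., cpu) is enough.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; the real kernel layouts differ. */
struct thread_info_model {
	unsigned long flags;
	int preempt_count;
	int cpu;
};

struct task_struct_model {
	struct thread_info_model thread_info;	/* must stay the first member */
	int other_state;
};

/* Read the CPU id the way the inlined load does: base pointer + field offset. */
static int cpu_from_tp(const void *tp)
{
	int cpu;

	memcpy(&cpu, (const char *)tp + offsetof(struct thread_info_model, cpu),
	       sizeof(cpu));
	return cpu;
}
```

Because thread_info sits at offset 0 of the task structure, a pointer to the task can be passed directly as the base.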
Benchmark using [1] on Qemu.
./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc
+---------------+------------------+------------------+--------------+
| Name | Before | After | % change |
|---------------+------------------+------------------+--------------|
| glob-arr-inc | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s | + 24.04% |
| arr-inc | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s | + 23.56% |
| hash-inc | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s | + 32.18% |
+---------------+------------------+------------------+--------------+
NOTE: This benchmark includes changes from this patch and the previous
patch that implemented the per-cpu insn.
[1] https://github.com/anakryiko/linux/commit/8dec900975ef
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use it directly. It will only be used for
internal inlining optimizations for now between BPF verifier and BPF
JITs.
RISC-V uses generic per-cpu implementation where the offsets for CPUs
are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
the address of the task_struct in TP register. The first element in
task_struct is struct thread_info, and we can get the cpu number by
reading from the TP register + offsetof(struct thread_info, cpu).
Once we have the cpu number in a register we read the offset for that
cpu from the address &__per_cpu_offset + (cpu_number << 3). Then we add this
offset to the destination register.
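As a rough user-space model of that address computation (assuming 8-byte table entries, which is why the shift is 3), the sketch below uses a hypothetical percpu_resolve() helper and a plain array in place of __per_cpu_offset[]. It is not the JIT's emitted code, only the arithmetic it performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Resolve a per-CPU offset to an absolute address.
 * `offsets` models __per_cpu_offset[]; entries are 8 bytes wide, hence "<< 3".
 */
static uint64_t percpu_resolve(uint64_t percpu_off, const uint64_t *offsets,
			       unsigned int cpu)
{
	uint64_t cpu_off;

	/* Byte address of the table slot: base + (cpu << 3). */
	memcpy(&cpu_off, (const char *)offsets + ((size_t)cpu << 3),
	       sizeof(cpu_off));

	/* Add the CPU's offset to the value already in the "destination". */
	return percpu_off + cpu_off;
}
```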
To measure the improvement from this change, the benchmark in [1] was
used on Qemu:
Before:
glob-arr-inc : 1.127 ± 0.013M/s
arr-inc : 1.121 ± 0.004M/s
hash-inc : 0.681 ± 0.052M/s
After:
glob-arr-inc : 1.138 ± 0.011M/s
arr-inc : 1.366 ± 0.006M/s
hash-inc : 0.676 ± 0.001M/s
[1] https://github.com/anakryiko/linux/commit/8dec900975ef
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This adds eBPF JIT support for the 32-bit ARCv2 processors. The
implementation is qualified by running the BPF tests on a Synopsys HSDK
board with "ARC HS38 v2.1c at 500 MHz" as the 4-core CPU.
The test_bpf.ko reports 2-10 fold improvements in execution time of its
tests. For instance:
test_bpf: #33 tcpdump port 22 jited:0 704 1766 2104 PASS
test_bpf: #33 tcpdump port 22 jited:1 120 224 260 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:0 238 PASS
test_bpf: #141 ALU_DIV_X: 4294967295 / 4294967295 = 1 jited:1 23 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:0 2034681 PASS
test_bpf: #776 JMP32_JGE_K: all ... magnitudes jited:1 1020022 PASS
Deployment and structure
------------------------
The related code is added to "arch/arc/net":
- bpf_jit.h -- The interface that a back-end translator must provide
- bpf_jit_core.c -- Knows how to handle the input eBPF byte stream
- bpf_jit_arcv2.c -- The back-end code that knows the translation logic
The bpf_int_jit_compile() at the end of bpf_jit_core.c is the entrance
to the whole process. Normally, the translation is done in one pass,
namely the "normal pass". In case some relocations are not known during
this pass, some data (arc_jit_data) is allocated for the next pass to
come. This possible next (and last) pass is called the "extra pass".
  1. Normal pass            # The necessary pass
    1a. Dry run             # Get the whole JIT length, epilogue offset, etc.
    1b. Emit phase          # Allocate memory and start emitting instructions
  2. Extra pass             # Only needed if there are relocations to be fixed
    2a. Patch relocations
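The dry-run/emit split of the normal pass can be illustrated with a small, self-contained sketch. Everything here is hypothetical (struct jit_ctx_model, emit_u32(), jit_two_phase()) and the "translation" simply copies 32-bit words; the point is only that the same walk runs twice, first to measure, then to write.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical JIT context: a NULL buffer means "dry run, only count bytes". */
struct jit_ctx_model {
	uint8_t *buf;
	size_t len;
};

static void emit_u32(struct jit_ctx_model *ctx, uint32_t insn)
{
	if (ctx->buf)
		memcpy(ctx->buf + ctx->len, &insn, sizeof(insn));
	ctx->len += sizeof(insn);
}

/* One walk over the input; used for both the dry run and the emit phase. */
static size_t translate(struct jit_ctx_model *ctx, const uint32_t *prog, size_t n)
{
	ctx->len = 0;
	for (size_t i = 0; i < n; i++)
		emit_u32(ctx, prog[i]);
	return ctx->len;
}

static uint8_t *jit_two_phase(const uint32_t *prog, size_t n, size_t *out_len)
{
	struct jit_ctx_model ctx = { NULL, 0 };
	size_t need = translate(&ctx, prog, n);	/* 1a. dry run: get the length */

	ctx.buf = malloc(need ? need : 1);	/* 1b. allocate, then emit */
	if (!ctx.buf)
		return NULL;
	translate(&ctx, prog, n);
	*out_len = ctx.len;
	return ctx.buf;
}
```

The caller owns the returned buffer and releases it with free().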
Support status
--------------
The JIT compiler supports BPF instructions up to "cpu=v4". However, it
does not yet provide support for:
- Tail calls
- Atomic operations
- 64-bit division/remainder
- BPF_PROBE_MEM* (exception table)
The result of "test_bpf" test suite on an HSDK board is:
hsdk-lnx# insmod test_bpf.ko test_suite=test_bpf
test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
All the failing test cases are due to the ones that were not JIT'ed.
Categorically, they can be represented as:
.-----------.------------.-------------.
| test type |   opcodes  | # of cases  |
|-----------+------------+-------------|
| atomic    | 0xC3, 0xDB |         149 |
| div64     | 0x37, 0x3F |          22 |
| mod64     | 0x97, 0x9F |          15 |
`-----------^------------+-------------|
                         | (total) 186 |
                         `-------------'
Setup: build config
-------------------
The following configs must be set to have a working JIT test:
CONFIG_BPF_JIT=y
CONFIG_BPF_JIT_ALWAYS_ON=y
CONFIG_TEST_BPF=m
The following options are not necessary for the tests module,
but are good to have:
CONFIG_DEBUG_INFO=y # prerequisite for below
CONFIG_DEBUG_INFO_BTF=y # so bpftool can generate vmlinux.h
CONFIG_FTRACE=y #
CONFIG_BPF_SYSCALL=y # all these options lead to
CONFIG_KPROBE_EVENTS=y # having CONFIG_BPF_EVENTS=y
CONFIG_PERF_EVENTS=y #
Some BPF programs provide data through /sys/kernel/debug:
CONFIG_DEBUG_FS=y
arc# mount -t debugfs debugfs /sys/kernel/debug
Setup: elfutils
---------------
The libdw.{so,a} library that is used by pahole for processing
the final binary must come from elfutils 0.189 or newer.
Support for ARCv2 [1] was added in that version.
[1]
https://sourceware.org/git/?p=elfutils.git;a=commit;h=de3d46b3e7
Setup: pahole
-------------
The line below in linux/scripts/Makefile.btf must be commented out:
pahole-flags-$(call test-ge, $(pahole-ver), 121) += --btf_gen_floats
Or else, the build will fail:
$ make V=1
...
BTF .btf.vmlinux.bin.o
pahole -J --btf_gen_floats \
-j --lang_exclude=rust \
--skip_encoding_btf_inconsistent_proto \
--btf_gen_optimized .tmp_vmlinux.btf
Complex, interval and imaginary float types are not supported
Encountered error while encoding BTF.
...
BTFIDS vmlinux
./tools/bpf/resolve_btfids/resolve_btfids vmlinux
libbpf: failed to find '.BTF' ELF section in vmlinux
FAILED: load BTF from vmlinux: No data available
This is due to the fact that the ARC toolchains generate
"complex float" DIE entries in libgcc and at the moment, pahole
can't handle such entries.
Running the tests
-----------------
host$ scp /bld/linux/lib/test_bpf.ko arc:
arc # sysctl net.core.bpf_jit_enable=1
arc # insmod test_bpf.ko test_suite=test_bpf
...
test_bpf: #1048 Staggered jumps: JMP32_JSLE_X jited:1 697811 PASS
test_bpf: Summary: 863 PASSED, 186 FAILED, [851/851 JIT'ed]
Acknowledgments
---------------
- Claudiu Zissulescu for his unwavering support
- Yuriy Kolerov for testing and troubleshooting
- Vladimir Isaev for the pahole workaround
- Sergey Matyukevich for paving the road by adding the interpreter support
Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
Link: https://lore.kernel.org/r/20240430145604.38592-1-list+bpf@vahedi.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The btf_features list can be used for pahole v1.26 and later -
it is useful because if a feature is not yet implemented it will
not exit with a failure message. This will allow us to add feature
requests to the pahole options without having to check pahole versions
in future; if the version of pahole supports the feature it will be
added.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240507135514.490467-1-alan.maguire@oracle.com
|
|
Geliang Tang says:
====================
From: Geliang Tang <tanggeliang@kylinos.cn>
This patchset adds post_socket_cb pointer into
struct network_helper_opts to make start_server_addr() helper
more flexible. With these modifications, much duplicate code
can be dropped.
Patches 1-3 address Martin's comments in the previous series.
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The arguments "addr" and "len" of run_test() have been dropped. This makes
the function get_port() useless. Drop it from test_tcp_check_syncookie_user.c.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/a9b5c8064ab4cbf0f68886fe0e4706428b8d0d47.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
This patch uses the public helper connect_to_fd() exported in
network_helpers.h instead of the locally defined function
connect_to_server() in test_tcp_check_syncookie_user.c. This avoids
duplicate code.
The arguments "addr" and "len" of run_test() then become useless, so drop
them too.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/e0ae6b790ac0abc7193aadfb2660c8c9eb0fe1f0.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
This patch uses the public helper connect_to_fd() exported in
network_helpers.h instead of the locally defined function
connect_to_server() in prog_tests/sockopt_inherit.c. This avoids
duplicate code.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/71db79127cc160b0643fd9a12c70ae019ae076a1.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Include network_helpers.h in test_tcp_check_syncookie_user.c and use the
public helper start_server_addr() in it instead of the locally defined
function start_server(). This avoids duplicate code.
Add two helpers v6only_true() and v6only_false() to set IPV6_V6ONLY
sockopt to true or false, set them to post_socket_cb pointer of struct
network_helper_opts, and pass it to start_server_setsockopt().
In order to use functions defined in network_helpers.c, Makefile needs
to be updated too.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/e0c5324f5da84f453f47543536e70f126eaa8678.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Include network_helpers.h in prog_tests/sockopt_inherit.c and use the
public helper start_server_addr() instead of the locally defined function
start_server(). This avoids duplicate code.
Add a helper custom_cb() to set the SOL_CUSTOM sockopt in a loop, set it
to the post_socket_cb pointer of struct network_helper_opts, and pass it
to start_server_addr().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/687af66f743a0bf15cdba372c5f71fe64863219e.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
__start_server() sets SO_REUSEPORT through setsockopt() when the parameter
'reuseport' is set. This patch makes it more flexible by adding a function
pointer post_socket_cb into struct network_helper_opts. The
'const struct post_socket_opts *cb_opts' argument of post_socket_cb is
reserved for future extension.
The 'reuseport' parameter can be dropped.
Now the original start_reuseport_server() can be implemented by setting a
newly defined reuseport_cb() function pointer to the post_socket_cb field of
struct network_helper_opts.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/470cb82f209f055fc7fb39c66c6b090b5b7ed2b2.1714907662.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Martin KaFai Lau says:
====================
selftests/bpf: Retire bpf_tcp_helpers.h
From: Martin KaFai Lau <martin.lau@kernel.org>
The earlier commit 8e6d9ae2e09f ("selftests/bpf: Use bpf_tracing.h instead of bpf_tcp_helpers.h")
removed the bpf_tcp_helpers.h usages from the non networking tests.
This patch set is a continuation of this effort to retire
the bpf_tcp_helpers.h from the networking tests (mostly tcp-cc related).
The main usage of the bpf_tcp_helpers.h is the partial kernel
socket definitions (e.g. sock, tcp_sock). New fields keep being added
to those partial socket definitions even though everything is available
in vmlinux.h. The recent bpf_cc_cubic.c test tried to extend
bpf_tcp_helpers.h but eventually used vmlinux.h instead. To avoid
this unnecessary detour for new tests and have one consistent way
of using the kernel sockets, this patch set retires the bpf_tcp_helpers.h
usages and consolidates the tests to use vmlinux.h instead.
====================
Link: https://lore.kernel.org/r/20240509175026.3423614-1-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The previous patches have consolidated the tests to use
bpf_tracing_net.h (i.e. vmlinux.h) instead of bpf_tcp_helpers.h.
This patch can finally retire the bpf_tcp_helpers.h from
the repository.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-11-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The patch removes the remaining bpf_tcp_helpers.h usages in the
non tcp-cc networking tests. It either replaces the header with
bpf_tracing_net.h or just removes it because the test is not actually
using any kernel sockets. For the latter, the missing macro (mainly
SOL_TCP) is defined locally.
An exception is the test_sock_fields which is testing
the "struct bpf_sock" type instead of the kernel sock type.
Whenever "vmlinux.h" is used instead, it hits a verifier
error on doing arithmetic on the sock_common pointer:
; return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1); @ test_sock_fields.c:54
21: (61) r2 = *(u32 *)(r1 +28) ; R1_w=sock_common() R2_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff))
22: (56) if w2 != 0x0 goto pc-6 ; R2_w=0
23: (b7) r3 = 28 ; R3_w=28
24: (bf) r2 = r1 ; R1_w=sock_common() R2_w=sock_common()
25: (0f) r2 += r3
R2 pointer arithmetic on sock_common prohibited
Hence, instead of including bpf_tracing_net.h, the test_sock_fields test
defines a tcp_sock with one lsndtime field in it.
Another highlight is, in sockopt_qos_to_cc.c, the tcp_cc_eq()
is replaced by bpf_strncmp(). tcp_cc_eq() was a workaround
in bpf_tcp_helpers.h before bpf_strncmp had been added.
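To illustrate why a bounded string compare can replace a hand-rolled loop, here is a user-space sketch. Both functions are hypothetical: cc_eq_loop() only resembles the retired helper in spirit, strncmp() stands in for bpf_strncmp() (whose argument order differs), and CA_NAME_MAX is assumed to mirror TCP_CA_NAME_MAX.

```c
#include <stdbool.h>
#include <string.h>

#define CA_NAME_MAX 16	/* assumed to mirror TCP_CA_NAME_MAX */

/* Hand-rolled comparison, similar in spirit to the retired tcp_cc_eq(). */
static bool cc_eq_loop(const char *a, const char *b)
{
	for (int i = 0; i < CA_NAME_MAX; i++) {
		if (a[i] != b[i])
			return false;
		if (!a[i])
			break;
	}
	return true;
}

/* Same check with a bounded library compare; strncmp() stands in for bpf_strncmp(). */
static bool cc_eq_strncmp(const char *a, const char *b)
{
	return strncmp(a, b, CA_NAME_MAX) == 0;
}
```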
The SOL_IPV6 addition to bpf_tracing_net.h is needed by the
test_tcpbpf_kern test.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-10-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch removes the final few bpf_tcp_helpers.h usages
in some misc bpf tcp-cc tests and replaces them with
bpf_tracing_net.h (i.e. vmlinux.h).
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-9-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_dctcp.
This makes it possible to retire bpf_tcp_helpers.h and consolidate
the tcp-cc tests to vmlinux.h.
The min/max macros are now duplicated between bpf_dctcp and bpf_cubic.
This could be further refactored in the future.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-8-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_cubic.
This makes it possible to retire bpf_tcp_helpers.h and consolidate
the tcp-cc tests to vmlinux.h.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-7-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The "struct bictcp" and "struct dctcp" are private to the bpf prog
and they are stored in the private buffer in inet_csk(sk)->icsk_ca_priv.
Hence, there is no bpf CO-RE required.
The same struct names exist in vmlinux.h. To reuse vmlinux.h,
they need to be renamed so that the bpf prog logic is
immune to the kernel tcp-cc changes.
This patch adds a "bpf_" prefix to them.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-6-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The BPF_STRUCT_OPS usages need to be removed from the tcp-cc tests
because the macro is defined in bpf_tcp_helpers.h, which is going to be
retired.
While at it, this patch consolidates all tcp-cc struct_ops programs to
use the SEC("struct_ops") + BPF_PROG().
It also removes the unnecessary __always_inline usages from the
tcp-cc tests.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-5-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch removes the individual tcp_sk implementations from the
tcp-cc tests. The tcp_sk() implementation from the bpf_tracing_net.h
is reused instead.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch adds a few tcp related helper functions to bpf_tracing_net.h.
They will be useful for both tcp-cc and network tracing related
bpf progs. They already exist in bpf_tcp_helpers.h. This change
is needed to retire the bpf_tcp_helpers.h and consolidate all tests
to vmlinux.h (i.e. bpf_tracing_net.h).
Some of the helpers (tcp_sk and inet_csk) are also defined in
bpf_cc_cubic.c and they are removed. While at it, remove
the vmlinux.h from bpf_cc_cubic.c. bpf_tracing_net.h (which has
vmlinux.h after this patch) is enough and will be consistent
with the other tcp-cc tests in the later patches.
The other TCP_* macro additions will be needed for the bpf_dctcp
changes in the later patch.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-3-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch removes the bpf_tracing_net.h usage from the networking tests,
fib_lookup and test_lwt_redirect. Instead of using the (copied) macro
TC_ACT_SHOT and ETH_HLEN from bpf_tracing_net.h, they can directly
use the ones defined in the network header files under linux/.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240509175026.3423614-2-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
[Changes from V1:
- Use a default branch in the switch statement to initialize `val'.]
GCC warns that `val' may be used uninitialized in the
BPF_CORE_READ_BITFIELD macro, defined in bpf_core_read.h as:
[...]
unsigned long long val; \
[...] \
switch (__CORE_RELO(s, field, BYTE_SIZE)) { \
case 1: val = *(const unsigned char *)p; break; \
case 2: val = *(const unsigned short *)p; break; \
case 4: val = *(const unsigned int *)p; break; \
case 8: val = *(const unsigned long long *)p; break; \
} \
[...]
val; \
} \
This patch adds a default entry in the switch statement that sets
`val' to zero in order to avoid the warning, and to keep random values
from being used in case __builtin_preserve_field_info returns unexpected
values for BPF_FIELD_BYTE_SIZE.
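The effect of the added default branch can be seen in a standalone sketch. read_sized() is a hypothetical function, not the macro itself; it takes the byte size as an ordinary argument instead of a CO-RE relocation, and returns 0 for any size the switch does not expect.

```c
/* Load `byte_size` bytes from p; unexpected sizes yield 0 instead of garbage. */
static unsigned long long read_sized(const void *p, unsigned int byte_size)
{
	unsigned long long val;

	switch (byte_size) {
	case 1: val = *(const unsigned char *)p; break;
	case 2: val = *(const unsigned short *)p; break;
	case 4: val = *(const unsigned int *)p; break;
	case 8: val = *(const unsigned long long *)p; break;
	default: val = 0; break;	/* the branch the patch adds */
	}
	return val;
}
```

With the default branch present, every path through the switch assigns `val', so the compiler no longer has a path on which it is read uninitialized.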
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240508101313.16662-1-jose.marchesi@oracle.com
|
|
This little patch is a follow-up to:
https://lore.kernel.org/bpf/20240507095011.15867-1-jose.marchesi@oracle.com/T/#u
The temporary workaround of passing -DBPF_NO_PRESERVE_ACCESS_INDEX
when building with GCC triggers a redefinition preprocessor error when
building progs/skb_pkt_end.c. This patch adds a guard to avoid
redefinition.
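A guard of this kind generally has the following shape. The sketch assumes the macro may already have been supplied on the command line with -D; the helper function is hypothetical and exists only so the effect can be observed in a plain C program.

```c
/* Define the macro only if the command line (-D...) has not already done so. */
#ifndef BPF_NO_PRESERVE_ACCESS_INDEX
#define BPF_NO_PRESERVE_ACCESS_INDEX
#endif

/* Hypothetical helper: reports whether the macro is visible at this point. */
static int no_preserve_access_index_is_set(void)
{
#ifdef BPF_NO_PRESERVE_ACCESS_INDEX
	return 1;
#else
	return 0;
#endif
}
```

Without the #ifndef, a -D definition (which expands to 1) followed by an empty #define is a redefinition with a different value, which the preprocessor reports.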
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240508110332.17332-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
[Changes from V2:
- no-strict-aliasing is only applied when building with GCC.
- cpumask_failure.c is excluded, as it doesn't use __imm_insn.]
The __imm_insn macro is defined in bpf_misc.h as:
#define __imm_insn(name, expr) [name]"i"(*(long *)&(expr))
This may lead to type-punning and strict aliasing rules violations in
its typical usage where the address of a struct bpf_insn is passed as
expr, like in:
__imm_insn(st_mem,
BPF_ST_MEM(BPF_W, BPF_REG_1, offsetof(struct __sk_buff, mark), 42))
Where:
#define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
((struct bpf_insn) { \
.code = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM, \
.dst_reg = DST, \
.src_reg = 0, \
.off = OFF, \
.imm = IMM })
In all the actual instances of this in the BPF selftests the value is
fed to a volatile asm statement as soon as it gets read from memory,
and thus it is unlikely that breaking the strict aliasing rules leads to
misguided optimizations.
However, GCC detects the potential problem (indirectly) by issuing a
warning stating that a temporary <Uxxxxxx> is used uninitialized,
where the temporary corresponds to the memory read by *(long *).
This patch adds -fno-strict-aliasing to the compilation flags of the
particular selftests that do type punning via __imm_insn, only for
GCC.
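For readers unfamiliar with the aliasing issue, the sketch below contrasts the punned read with a byte copy. It is not what the patch does: the asm "i" operand needs a constant expression, so memcpy() cannot simply be dropped into __imm_insn. struct insn_model and insn_bits() are hypothetical and only illustrate which access pattern the strict aliasing rules object to.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte instruction record, loosely modelled on struct bpf_insn. */
struct insn_model {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

_Static_assert(sizeof(struct insn_model) == sizeof(uint64_t),
	       "model must be exactly 8 bytes");

/*
 * The pattern GCC complains about reads the object through an unrelated
 * type:  *(long *)&insn.  Copying the bytes yields the same bits without
 * relying on -fno-strict-aliasing.
 */
static uint64_t insn_bits(struct insn_model insn)
{
	uint64_t bits;

	memcpy(&bits, &insn, sizeof(bits));
	return bits;
}
```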
Tested in master bpf-next.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240508103551.14955-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
[Changes from V1:
- The warning to disable is -Wmaybe-uninitialized, not -Wuninitialized.
- This warning is only supported in GCC.]
The BPF selftest verifier_global_subprogs.c contains code that
purposely performs out-of-bounds memory accesses, to check whether
the kernel verifier is able to catch them. For example:
__noinline int global_unsupp(const int *mem)
{
if (!mem)
return 0;
return mem[100]; /* BOOM */
}
With -O1 and higher and no inlining, GCC notices this fact and emits a
"maybe uninitialized" warning. This is by design. Note that the
emission of these warnings is highly dependent on the precise
optimizations that are performed.
This patch adds a compiler pragma to verifier_global_subprogs.c to
ignore these warnings.
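A pragma of this kind might look like the sketch below. The exact guard used in the selftest is an assumption; the changelog above only states that the warning is -Wmaybe-uninitialized and that it is GCC-only. The function is a hypothetical, in-bounds variant of global_unsupp() so that it can run in user space.

```c
/* Silence -Wmaybe-uninitialized for this file only, and only under GCC. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

/* Same shape as global_unsupp(), but with an in-bounds index. */
static int first_or_zero(const int *mem)
{
	if (!mem)
		return 0;
	return mem[0];
}
```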
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: david.faust@oracle.com
Cc: cupertino.miranda@oracle.com
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240507184756.1772-1-jose.marchesi@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When LSE atomics are available, BPF atomic instructions are implemented
as single ARM64 atomic instructions, therefore it is easy to enable
these in bpf_arena using the currently available exception handling
setup.
LL_SC atomics use loops and therefore would need more work to enable in
bpf_arena.
Enable LSE atomics based instructions in bpf_arena and use the
bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE
atomics are not available.
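The decision made by the callback can be summarised as a small truth table. The sketch below is a user-space model with hypothetical names; the real bpf_jit_supports_insn() inspects the BPF opcode and queries the CPU capability rather than taking booleans.

```c
#include <stdbool.h>

/* Hypothetical instruction classes; the real callback looks at the opcode. */
enum insn_kind_model { INSN_OTHER, INSN_ATOMIC };

/* Reject only atomics that target the arena when LSE atomics are missing. */
static bool jit_supports_insn_model(enum insn_kind_model kind, bool in_arena,
				    bool have_lse)
{
	if (!in_arena)
		return true;
	if (kind == INSN_ATOMIC && !have_lse)
		return false;
	return true;
}
```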
All atomics and arena_atomics selftests are passing:
[root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics
#3/1 arena_atomics/add:OK
#3/2 arena_atomics/sub:OK
#3/3 arena_atomics/and:OK
#3/4 arena_atomics/or:OK
#3/5 arena_atomics/xor:OK
#3/6 arena_atomics/cmpxchg:OK
#3/7 arena_atomics/xchg:OK
#3 arena_atomics:OK
#10/1 atomics/add:OK
#10/2 atomics/sub:OK
#10/3 atomics/and:OK
#10/4 atomics/or:OK
#10/5 atomics/xor:OK
#10/6 atomics/cmpxchg:OK
#10/7 atomics/xchg:OK
#10 atomics:OK
Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20240426161116.441-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Andrii Nakryiko says:
====================
Fix yet another case of mishandling SEC("struct_ops") programs that were
nulled out programmatically through BPF skeleton by the user.
While at it, add some improvements around detecting and reporting errors,
specifically a common case of declaring SEC("struct_ops") program, but
forgetting to actually make use of it by setting it as a callback
implementation in SEC(".struct_ops") variable (i.e., map) declaration.
A bunch of new selftests are added as well.
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Drive-by clean up, we shouldn't use meaningless "test_" prefix for
subtest names.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-8-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a simple test that validates that libbpf will reject isolated
struct_ops program early with helpful warning message.
Also validate that explicit use of such BPF program through BPF skeleton
after BPF object is open won't trigger any warnings.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-7-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Extend libbpf's pre-load checks for BPF programs, detecting more typical
conditions that are destined to cause BPF program failure. This is an
opportunity to provide more helpful and actionable error messages to
users, instead of a potentially very confusing BPF verifier log and/or
error.
In this case, we detect struct_ops BPF program that was not referenced
anywhere, but still attempted to be loaded (according to libbpf logic).
Suggest that the program might need to be used in some struct_ops
variable. User will get a message of the following kind:
libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it?
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-6-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
strerror_r(), used from the libbpf-specific libbpf_strerror_r() wrapper, is
documented to return errors in two different ways, depending on glibc
version. Take that into account when handling strerror_r()'s own errors,
which happens when we pass some non-standard (internal) kernel error to
it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is
quite confusing. Now for the same situation we'll see a bit less
visually scary "unknown error (-524)".
At least we won't confuse user with irrelevant EINVAL (22).
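The handling of both return conventions can be sketched without calling strerror_r() itself. finish_strerror() below is a hypothetical helper, not libbpf's function: it takes the return value of an XSI-style strerror_r() call and the errno observed right after it, and assumes the two conventions are "-1 with errno set" and "positive error number returned directly".

```c
#include <errno.h>
#include <stdio.h>

/*
 * Turn the result of an XSI-style strerror_r() call into the final message.
 * `err` is the original (possibly negative) error code, `ret` the call's
 * return value and `saved_errno` the errno observed right after the call.
 */
static char *finish_strerror(int err, int ret, int saved_errno,
			     char *dst, size_t len)
{
	if (ret == -1)		/* one convention: -1 and errno set */
		ret = saved_errno;
	if (ret == EINVAL)	/* error number not recognized */
		snprintf(dst, len, "unknown error (%d)", err < 0 ? err : -err);
	else if (ret)		/* any other failure of strerror_r() itself */
		snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
	return dst;		/* ret == 0: dst already holds the message */
}
```

Normalising the return value first means the EINVAL case is recognised under either convention, so the irrelevant "22" never reaches the user.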
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-5-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a test which tests the case that was just fixed. Kernel has full
type information about callback, but user explicitly nulls out the
reference to declaratively set BPF program reference.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-4-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
There is yet another corner case where user can set STRUCT_OPS program
reference in STRUCT_OPS map to NULL, but libbpf will fail to disable
autoload for such BPF program. This time it's the case of "new" kernel
which has type information about callback field, but user explicitly
nulled-out program reference from user-space after opening BPF object.
Fix, hopefully, the last remaining unhandled case.
Fixes: 0737df6de946 ("libbpf: better fix for handling nulled-out struct_ops program")
Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-3-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
libbpf ensures that BPF program references set in map->st_ops->progs[i]
during open phase are always valid STRUCT_OPS programs. This is done in
bpf_object__collect_st_ops_relos(). So there is no need to double-check
that in bpf_map__init_kern_struct_ops().
Simplify the code by removing unnecessary check. Also, we avoid using
local prog variable to keep code similar to the upcoming fix, which adds
similar logic in another part of bpf_map__init_kern_struct_ops().
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240507001335.1445325-2-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Cupertino Miranda says:
====================
Fix number of arguments in test
Hi everyone,
This is a new version based on comments.
Regards,
Cupertino
Changes from v1:
- Comment with gcc-bpf replaced by bpf_gcc.
- Used pragma GCC optimize to disable GCC optimization in test.
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
====================
Link: https://lore.kernel.org/r/20240507122220.207820-1-cupertino.miranda@oracle.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
The test_xdp_noinline.c contains 2 functions that use more than 5
arguments. This patch collapses the last 2 arguments into an array.
Also, in GCC the ipa_sra optimization increases the number of arguments
used in function encap_v4. This patch disables the optimization for that
particular file.
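The argument-collapsing idea, reduced to a toy example: sum_packed() is a hypothetical function, unrelated to the real test code, that would otherwise need six scalar parameters. Packing the last two into an array brings the parameter count down to five.

```c
/*
 * Before (six parameters, one more than allowed):
 *   int sum6(int a, int b, int c, int d, int x, int y);
 *
 * After: the last two values travel together in one array parameter.
 */
static int sum_packed(int a, int b, int c, int d, const int tail[2])
{
	return a + b + c + d + tail[0] + tail[1];
}
```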
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240507122220.207820-3-cupertino.miranda@oracle.com
|
|
This patch adds support to specify CFLAGS per source file and per test
runner.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240507122220.207820-2-cupertino.miranda@oracle.com
|
|
The vmlinux.h file generated by bpftool makes use of compiler pragmas
in order to install the CO-RE preserve_access_index in all the struct
types derived from the BTF info:
#ifndef __VMLINUX_H__
#define __VMLINUX_H__
#ifndef BPF_NO_PRESERVE_ACCESS_INDEX
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
#endif
[... type definitions generated from kernel BTF ... ]
#ifndef BPF_NO_PRESERVE_ACCESS_INDEX
#pragma clang attribute pop
#endif
The `clang attribute push/pop' pragmas are specific to clang/llvm and
are not supported by GCC.
At the moment the BTF dumping services in libbpf do not support
discriminating between types dumped because they are directly referred
to and types dumped because they are dependencies. A suitable API is
being worked on now. See [1] and [2].
In the interim, this patch changes the selftests/bpf Makefile so it
passes -DBPF_NO_PRESERVE_ACCESS_INDEX to GCC when it builds the
selftests. This workaround is temporary, and may have an impact on
the results of the GCC-built tests.
[1] https://lore.kernel.org/bpf/20240503111836.25275-1-jose.marchesi@oracle.com/T/#u
[2] https://lore.kernel.org/bpf/20240504205510.24785-1-jose.marchesi@oracle.com/T/#u
Tested in bpf-next master.
No regressions.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240507095011.15867-1-jose.marchesi@oracle.com
|
|
Jose E. Marchesi says:
====================
bpf: avoid `attribute ignored' warnings in GCC
These two patches avoid warnings (turned into errors) when building
the BPF selftests with GCC.
[Changes from V1:
- As requested by reviewer, an additional patch has been added in
order to remove __hidden from the `private' macro in
cpumask_common.h.
- Typo bening -> benign fixed in the commit message of the second
patch.]
====================
Link: https://lore.kernel.org/r/20240507074227.4523-1-jose.marchesi@oracle.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
This patch modifies selftests/bpf/Makefile to pass -Wno-attributes to
GCC. This is because of the following attributes which are ignored:
- btf_decl_tag
- btf_type_tag
There are many of these. At the moment none of these are
recognized/handled by gcc-bpf.
We are aware that btf_decl_tag is necessary for some of the
selftest harness to communicate test failure/success. Support for
it is in progress in GCC upstream:
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650482.html
However, the GCC master branch is not yet open, so the series
above (currently under review upstream) won't be able to make it
there until 14.1 gets released, probably mid next week.
As for btf_type_tag, more extensive work will be needed in GCC
upstream to support it in both BTF and DWARF. We have a WIP big
patch for that, but that is not needed to compile/build the
selftests.
- used
There are SEC macros defined in the selftests as:
#define SEC(N) __attribute__((section(N),used))
The SEC macro is used for both functions and global variables.
According to the GCC documentation the `used' attribute is really only
meaningful for functions, and it warns when the attribute is used
for other global objects, like for example ctl_array in
test_xdp_noinline.c.
Ignoring this is benign.
- align_value
In progs/test_cls_redirect.c:127 there is:
typedef uint8_t *net_ptr __attribute__((align_value(8)));
GCC warns that it is ignoring this attribute, because it is not
implemented by GCC.
I think ignoring this attribute in GCC is benign, because according
to the clang documentation [1] its purpose seems to be merely
declarative and doesn't seem to translate into extra checks at
run-time, only to perhaps better optimized code ("runtime behavior
is undefined if the pointed memory object is not aligned to the
specified alignment").
[1] https://clang.llvm.org/docs/AttributeReference.html#align-value
Tested in bpf-next master.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240507074227.4523-3-jose.marchesi@oracle.com
|
|
An object defined as `static' defaults to hidden visibility. If
additionally the visibility("hidden") compiler attribute is applied to
the declaration of the object, GCC warns that the attribute gets
ignored.
This patch removes the only instance of this problem among the BPF
selftests.
Tested in bpf-next master.
Signed-off-by: Jose E. Marchesi <jose.marchesi@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240507074227.4523-2-jose.marchesi@oracle.com
|
|
As the comments in "struct vm_fault" describe:
".address" : 'Faulting virtual address - masked'
".real_address" : 'Faulting virtual address - unmasked'
The link [1] said: "Whatever the routes, all architectures end up to the
invocation of handle_mm_fault() which, in turn, (likely) ends up calling
__handle_mm_fault() to carry out the actual work of allocating the page
tables."
__handle_mm_fault() does address assignment:
.address = address & PAGE_MASK,
.real_address = address,
This is the debug dump from running `./test_progs -a "*arena*"`:
[ 69.767494] arena fault: vmf->address = 10000001d000, vmf->real_address = 10000001d008
[ 69.767496] arena fault: vmf->address = 10000001c000, vmf->real_address = 10000001c008
[ 69.767499] arena fault: vmf->address = 10000001b000, vmf->real_address = 10000001b008
[ 69.767501] arena fault: vmf->address = 10000001a000, vmf->real_address = 10000001a008
[ 69.767504] arena fault: vmf->address = 100000019000, vmf->real_address = 100000019008
[ 69.769388] arena fault: vmf->address = 10000001e000, vmf->real_address = 10000001e1e8
So we can use the value of 'vmf->address' to do BPF arena kernel address
space cast directly.
[1] https://docs.kernel.org/mm/page_tables.html
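The masking shown above can be reproduced in user space. This is a
sketch under the assumption of a 4096-byte page; SKETCH_PAGE_SIZE,
SKETCH_PAGE_MASK and masked_fault_address() are illustrative names, not
kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; the kernel's PAGE_SIZE is arch-dependent. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Mirrors the '.address = address & PAGE_MASK' assignment quoted above. */
static uint64_t masked_fault_address(uint64_t real_address)
{
	/* Clearing the low bits only works for power-of-two page sizes. */
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return real_address & SKETCH_PAGE_MASK;
}
```

Feeding in a real_address from the dump, such as 0x10000001d008, gives
back the page-aligned value that the dump reports as vmf->address.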
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Link: https://lore.kernel.org/r/20240507063358.8048-1-haiyue.wang@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Cupertino Miranda says:
====================
bpf/verifier: range computation improvements
Hi everyone,
This is what I hope to be the last version. :)
Regards,
Cupertino
Changes from v1:
- Reordered patches in the series.
- Fix refactor to be accurate with original code.
- Fixed other mentioned small problems.
Changes from v2:
- Added a patch to replace mark_reg_unknown for __mark_reg_unknown in
the context of range computation.
- Reverted implementation of refactor to v1 which used a simpler
boolean return value in check function.
- Further relaxed MUL to allow it to still compute a range when neither
of its registers is a known value.
- Simplified tests based on Eduard's example.
- Added messages in selftest commits.
Changes from v3:
- Improved commit message of patch nr 1.
- Coding style fixes.
- Improve XOR and OR tests.
- Made function calls to pass struct bpf_reg_state pointer instead.
- Improved final code as a last patch.
Changes from v4:
- Merged patch nr 7 in 2.
====================
Link: https://lore.kernel.org/r/20240506141849.185293-1-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added a test for bound computation in MUL when non constant
values are used and both registers have bounded ranges.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Link: https://lore.kernel.org/r/20240506141849.185293-7-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
MUL instruction required that src_reg would be a known value (i.e.
src_reg would be a const value). The condition in this case can be
relaxed, since the range computation algorithm used in current code
already supports a proper range computation for any valid range value on
its operands.
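Why a constant src_reg is not needed can be seen from a simplified
unsigned range multiplication. This is a sketch, not the verifier's
code: struct mul_range and range_mul() are hypothetical names, and the
32-bit cap on both upper bounds is an assumption made here so that the
64-bit product cannot overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Inclusive unsigned range [min, max]; illustrative type only. */
struct mul_range {
	uint64_t min;
	uint64_t max;
};

/*
 * Multiply two unsigned ranges. Returns false when an upper bound is
 * too large for the product to be guaranteed to fit in 64 bits; the
 * caller then has to treat the result as unbounded.
 */
static bool range_mul(struct mul_range a, struct mul_range b,
		      struct mul_range *out)
{
	assert(a.min <= a.max && b.min <= b.max);
	if (a.max > UINT32_MAX || b.max > UINT32_MAX)
		return false;
	/* Unsigned multiplication is monotonic in both operands. */
	out->min = a.min * b.min;
	out->max = a.max * b.max;
	return true;
}
```

With operands in [2, 3] and [4, 5], neither of which is a single known
value, the sketch still yields the bounded result [8, 15].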
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Link: https://lore.kernel.org/r/20240506141849.185293-6-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Added a test for bound computation in XOR and OR when non constant
values are used and both registers have bounded ranges.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/r/20240506141849.185293-5-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Range for XOR and OR operators would not be attempted unless src_reg
would resolve to a single value, i.e. a known constant value.
This condition is unnecessary, and the following XOR/OR operator
handling could compute a possibly better range.
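As a rough illustration of why a non-constant src_reg still allows a
useful bound, the sketch below derives unsigned bounds for OR and XOR
from two ranges. It is deliberately cruder than the bit-level tracking
the verifier relies on; struct bit_range, fill_low_bits(), range_or()
and range_xor() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive unsigned range [min, max]; illustrative type only. */
struct bit_range {
	uint64_t min;
	uint64_t max;
};

/* Set every bit below the highest set bit of x (e.g. 5 -> 7). */
static uint64_t fill_low_bits(uint64_t x)
{
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	x |= x >> 32;
	return x;
}

static struct bit_range range_or(struct bit_range a, struct bit_range b)
{
	struct bit_range r;

	assert(a.min <= a.max && b.min <= b.max);
	/* OR never clears bits, so the result is at least the larger minimum. */
	r.min = a.min > b.min ? a.min : b.min;
	/* No bit above the highest bit of either maximum can be set. */
	r.max = fill_low_bits(a.max | b.max);
	return r;
}

static struct bit_range range_xor(struct bit_range a, struct bit_range b)
{
	struct bit_range r;

	assert(a.min <= a.max && b.min <= b.max);
	/* XOR can cancel bits, so only zero is a safe lower bound here. */
	r.min = 0;
	r.max = fill_low_bits(a.max | b.max);
	return r;
}
```

For operands in [1, 3] and [4, 5], this gives [4, 7] for OR and [0, 7]
for XOR, even though neither operand is a known constant.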
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/r/20240506141849.185293-4-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Split range computation checks into their own function, isolating the
pessimistic range set for dst_reg and the failing return to a single point.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
bpf/verifier: improve code after range computation recent changes.
Link: https://lore.kernel.org/r/20240506141849.185293-3-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In order to further simplify the code in adjust_scalar_min_max_vals all
the calls to mark_reg_unknown are replaced by __mark_reg_unknown.
static void mark_reg_unknown(struct bpf_verifier_env *env,
struct bpf_reg_state *regs, u32 regno)
{
if (WARN_ON(regno >= MAX_BPF_REG)) {
... mark all regs not init ...
return;
}
__mark_reg_unknown(env, regs + regno);
}
The 'regno >= MAX_BPF_REG' does not apply to
adjust_scalar_min_max_vals(), because it is only called from the
following stack:
- check_alu_op
- adjust_reg_min_max_vals
- adjust_scalar_min_max_vals
The check_alu_op() does check_reg_arg() which verifies that both src and
dst register numbers are within bounds.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: David Faust <david.faust@oracle.com>
Cc: Jose Marchesi <jose.marchesi@oracle.com>
Cc: Elena Zannoni <elena.zannoni@oracle.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Link: https://lore.kernel.org/r/20240506141849.185293-2-cupertino.miranda@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|