summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-08-22libbpf: Add bpf_link_create support for multi uprobesJiri Olsa2-1/+21
Adding new uprobe_multi struct to bpf_link_create_opts object to pass multiple uprobe data to link_create attr uapi. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-14-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_resolve_pattern_offsets functionJiri Olsa3-1/+67
Adding elf_resolve_pattern_offsets function that looks up offsets for symbols specified by pattern argument. The 'pattern' argument allows wildcards (*?' supported). Offsets are returned in allocated array together with its size and needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-13-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_resolve_syms_offsets functionJiri Olsa2-0/+112
Adding elf_resolve_syms_offsets function that looks up offsets for symbols specified in syms array argument. Offsets are returned in allocated array with the 'cnt' size, that needs to be released by the caller. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-12-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf symbol iteratorJiri Olsa1-64/+115
Adding elf symbol iterator object (and some functions) that follow open-coded iterator pattern and some functions to ease up iterating elf object symbols. The idea is to iterate single symbol section with: struct elf_sym_iter iter; struct elf_sym *sym; if (elf_sym_iter_new(&iter, elf, binary_path, SHT_DYNSYM)) goto error; while ((sym = elf_sym_iter_next(&iter))) { ... } I considered opening the elf inside the iterator and iterate all symbol sections, but then it gets more complicated wrt user checks for when the next section is processed. Plus side is the we don't need 'exit' function, because caller/user is in charge of that. The returned iterated symbol object from elf_sym_iter_next function is placed inside the struct elf_sym_iter, so no extra allocation or argument is needed. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-11-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add elf_open/elf_close functionsJiri Olsa3-42/+57
Adding elf_open/elf_close functions and using it in elf_find_func_offset_from_file function. It will be used in following changes to save some common code. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-10-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Move elf_find_func_offset* functions to elf objectJiri Olsa4-186/+202
Adding new elf object that will contain elf related functions. There's no functional change. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-9-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22libbpf: Add uprobe_multi attach type and link namesJiri Olsa1-0/+2
Adding new uprobe_multi attach type and link names, so the functions can resolve the new values. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230809083440.3209381-8-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Add bpf_get_func_ip helper support for uprobe linkJiri Olsa1-3/+30
Adding support for bpf_get_func_ip helper being called from ebpf program attached by uprobe_multi link. It returns the ip of the uprobe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-7-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Add pid filter support for uprobe_multi linkJiri Olsa4-1/+36
Adding support to specify pid for uprobe_multi link and the uprobes are created only for task with given pid value. Using the consumer.filter filter callback for that, so the task gets filtered during the uprobe installation. We still need to check the task during runtime in the uprobe handler, because the handler could get executed if there's another system wide consumer on the same uprobe (thanks Oleg for the insight). Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Add cookies support for uprobe_multi linkJiri Olsa4-5/+44
Adding support to specify cookies array for uprobe_multi link. The cookies array share indexes and length with other uprobe_multi arrays (offsets/ref_ctr_offsets). The cookies[i] value defines cookie for i-the uprobe and will be returned by bpf_get_attach_cookie helper when called from ebpf program hooked to that specific uprobe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Add multi uprobe linkJiri Olsa5-3/+282
Adding new multi uprobe link that allows to attach bpf program to multiple uprobes. Uprobes to attach are specified via new link_create uprobe_multi union: struct { __aligned_u64 path; __aligned_u64 offsets; __aligned_u64 ref_ctr_offsets; __u32 cnt; __u32 flags; } uprobe_multi; Uprobes are defined for single binary specified in path and multiple calling sites specified in offsets array with optional reference counters specified in ref_ctr_offsets array. All specified arrays have length of 'cnt'. The 'flags' supports single bit for now that marks the uprobe as return probe. Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Add attach_type checks under bpf_prog_attach_check_attach_typeJiri Olsa1-68/+52
Add extra attach_type checks from link_create under bpf_prog_attach_check_attach_type. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-3-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf: Switch BPF_F_KPROBE_MULTI_RETURN macro to enumJiri Olsa2-2/+6
Switching BPF_F_KPROBE_MULTI_RETURN macro to anonymous enum, so it'd show up in vmlinux.h. There's not functional change compared to having this as macro. Acked-by: Yafang Shao <laoar.shao@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230809083440.3209381-2-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22Merge branch 'samples-bpf-make-bpf-programs-more-libbpf-aware'Alexei Starovoitov21-171/+117
Daniel T. Lee says: ==================== samples/bpf: make BPF programs more libbpf aware The existing tracing programs have been developed for a considerable period of time and, as a result, do not properly incorporate the features of the current libbpf, such as CO-RE. This is evident in frequent usage of functions like PT_REGS* and the persistence of "hack" methods using underscore-style bpf_probe_read_kernel from the past. These programs are far behind the current level of libbpf and can potentially confuse users. The kernel has undergone significant changes, and some of these changes have broken these programs, but on the other hand, more robust APIs have been developed for increased stableness. To list some of the kernel changes that this patch set is focusing on, - symbol mismatch occurs due to compiler optimization [1] - inline of blk_account_io* breaks BPF kprobe program [2] - new tracepoints for the block_io_start/done are introduced [3] - map lookup probes can't be triggered (bpf_disable_instrumentation)[4] - BPF_KSYSCALL has been introduced to simplify argument fetching [5] - convert to vmlinux.h and use tp argument structure within it - make tracing programs to be more CO-RE centric In this regard, this patch set aims not only to integrate the latest features of libbpf into BPF programs but also to reduce confusion and clarify the BPF programs. This will help with the potential confusion among users and make the programs more intutitive. [1]: https://github.com/iovisor/bcc/issues/1754 [2]: https://github.com/iovisor/bcc/issues/4261 [3]: commit 5a80bd075f3b ("block: introduce block_io_start/block_io_done tracepoints") [4]: commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock") [5]: commit 6f5d467d55f0 ("libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL") ==================== Link: https://lore.kernel.org/r/20230818090119.477441-1-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: simplify spintest with kprobe.multiDaniel T. Lee2-29/+10
With the introduction of kprobe.multi, it is now possible to attach multiple kprobes to a single BPF program without the need for multiple definitions. Additionally, this method supports wildcard-based matching, allowing for further simplification of BPF programs. In here, an asterisk (*) wildcard is used to map to all symbols relevant to spin_{lock|unlock}. Furthermore, since kprobe.multi handles symbol matching, this commit eliminates the need for the previous logic of reading the ksym table to verify the existence of symbols. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-10-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: refactor syscall tracing programs using BPF_KSYSCALL macroDaniel T. Lee1-7/+3
This commit refactors the syscall tracing programs by adopting the BPF_KSYSCALL macro. This change aims to enhance the clarity and simplicity of the BPF programs by reducing the complexity of argument parsing from pt_regs. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-9-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: fix broken map lookup probeDaniel T. Lee1-2/+15
In the commit 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock"), a potential deadlock issue was addressed, which resulted in *_map_lookup_elem not triggering BPF programs. (prior to lookup, bpf_disable_instrumentation() is used) To resolve the broken map lookup probe using "htab_map_lookup_elem", this commit introduces an alternative approach. Instead, it utilize "bpf_map_copy_value" and apply a filter specifically for the hash table with map_type. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Fixes: 7c4cd051add3 ("bpf: Fix syscall's stackmap lookup potential deadlock") Link: https://lore.kernel.org/r/20230818090119.477441-8-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: fix bio latency check with tracepointDaniel T. Lee1-12/+24
Recently, a new tracepoint for the block layer, specifically the block_io_start/done tracepoints, was introduced in commit 5a80bd075f3b ("block: introduce block_io_start/block_io_done tracepoints"). Previously, the kprobe entry used for this purpose was quite unstable and inherently broke relevant probes [1]. Now that a stable tracepoint is available, this commit replaces the bio latency check with it. One of the changes made during this replacement is the key used for the hash table. Since 'struct request' cannot be used as a hash key, the approach taken follows that which was implemented in bcc/biolatency [2]. (uses dev:sector for the key) [1]: https://github.com/iovisor/bcc/issues/4261 [2]: https://github.com/iovisor/bcc/pull/4691 Fixes: 450b7879e345 ("block: move blk_account_io_{start,done} to blk-mq.c") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-7-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: make tracing programs to be more CO-RE centricDaniel T. Lee4-40/+20
The existing tracing programs have been developed for a considerable period of time and, as a result, do not properly incorporate the features of the current libbpf, such as CO-RE. This is evident in frequent usage of functions like PT_REGS* and the persistence of "hack" methods using underscore-style bpf_probe_read_kernel from the past. These programs are far behind the current level of libbpf and can potentially confuse users. Therefore, this commit aims to convert the outdated BPF programs to be more CO-RE centric. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-6-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: fix symbol mismatch by compiler optimizationDaniel T. Lee2-2/+3
Currently, multiple kprobe programs are suffering from symbol mismatch due to compiler optimization. These optimizations might induce additional suffix to the symbol name such as '.isra' or '.constprop'. # egrep ' finish_task_switch| __netif_receive_skb_core' /proc/kallsyms ffffffff81135e50 t finish_task_switch.isra.0 ffffffff81dd36d0 t __netif_receive_skb_core.constprop.0 ffffffff8205cc0e t finish_task_switch.isra.0.cold ffffffff820b1aba t __netif_receive_skb_core.constprop.0.cold To avoid this, this commit replaces the original kprobe section to kprobe.multi in order to match symbol with wildcard characters. Here, asterisk is used for avoiding symbol mismatch. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-5-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: unify bpf program suffix to .bpf with tracing programsDaniel T. Lee17-17/+17
Currently, BPF programs typically have a suffix of .bpf.c. However, some programs still utilize a mixture of _kern.c suffix alongside the naming convention. In order to achieve consistency in the naming of these programs, this commit unifies the inconsistency in the naming convention of BPF kernel programs. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-4-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: convert to vmlinux.h with tracing programsDaniel T. Lee10-62/+25
This commit replaces separate headers with a single vmlinux.h to tracing programs. Thanks to that, we no longer need to define the argument structure for tracing programs directly. For example, argument for the sched_switch tracpepoint (sched_switch_args) can be replaced with the vmlinux.h provided trace_event_raw_sched_switch. Additional defines have been added to the BPF program either directly or through the inclusion of net_shared.h. Defined values are PERF_MAX_STACK_DEPTH, IFNAMSIZ constants and __stringify() macro. This change enables the BPF program to access internal structures with BTF generated "vmlinux.h" header. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-3-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22samples/bpf: fix warning with ignored-attributesDaniel T. Lee1-1/+1
Currently, compiling the bpf programs will result the warning with the ignored attribute as follows. This commit fixes the warning by adding cf-protection option. In file included from ./arch/x86/include/asm/linkage.h:6: ./arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes] extern __noendbr u64 ibt_save(bool disable); ^ ./arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr' ^ Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Link: https://lore.kernel.org/r/20230818090119.477441-2-danieltimlee@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22Merge branch 'remove-unnecessary-synchronizations-in-cpumap'Alexei Starovoitov1-78/+35
Hou Tao says: ==================== Remove unnecessary synchronizations in cpumap From: Hou Tao <houtao1@huawei.com> Hi, This is the formal patchset to remove unnecessary synchronizations in cpu-map after address comments and collect Rvb tags from Toke Høiland-Jørgensen (Big thanks to Toke). Patch #1 removes the unnecessary rcu_barrier() when freeing bpf_cpu_map_entry and replaces it by queue_rcu_work(). Patch #2 removes the unnecessary call_rcu() and queue_work() when destroying cpu-map and does the freeing directly. Test the patchset by using xdp_redirect_cpu and virtio-net. Both xdp-mode and skb-mode have been exercised and no issues were reported. As ususal, comments and suggestions are always welcome. Change Log: v1: * address comments from Toke Høiland-Jørgensen * add Rvb tags from Toke Høiland-Jørgensen * update outdated comment in cpu_map_delete_elem() RFC: https://lore.kernel.org/bpf/20230728023030.1906124-1-houtao@huaweicloud.com ==================== Link: https://lore.kernel.org/r/20230816045959.358059-1-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf, cpumask: Clean up bpf_cpu_map_entry directly in cpu_map_freeHou Tao1-9/+8
After synchronous_rcu(), both the dettached XDP program and xdp_do_flush() are completed, and the only user of bpf_cpu_map_entry will be cpu_map_kthread_run(), so instead of calling __cpu_map_entry_replace() to stop kthread and cleanup entry after a RCU grace period, do these things directly. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230816045959.358059-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-22bpf, cpumap: Use queue_rcu_work() to remove unnecessary rcu_barrier()Hou Tao1-69/+27
As for now __cpu_map_entry_replace() uses call_rcu() to wait for the inflight xdp program to exit the RCU read critical section, and then launch kworker cpu_map_kthread_stop() to call kthread_stop() to flush all pending xdp frames or skbs. But it is unnecessary to use rcu_barrier() in cpu_map_kthread_stop() to wait for the completion of __cpu_map_entry_free(), because rcu_barrier() will wait for all pending RCU callbacks and cpu_map_kthread_stop() only needs to wait for the completion of a specific __cpu_map_entry_free(). So use queue_rcu_work() to replace call_rcu(), schedule_work() and rcu_barrier(). queue_rcu_work() will queue a __cpu_map_entry_free() kworker after a RCU grace period. Because __cpu_map_entry_free() is running in a kworker context, so it is OK to do all of these freeing procedures include kthread_stop() in it. After the update, there is no need to do reference-counting for bpf_cpu_map_entry, because bpf_cpu_map_entry is freed directly in __cpu_map_entry_free(), so just remove it. Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/r/20230816045959.358059-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-08-18selftests/bpf: Fix a selftest compilation errorYonghong Song1-1/+1
When building the kernel and selftest with clang compiler (llvm17 or llvm18), I hit the following compilation failure: In file included from progs/test_lwt_redirect.c:3: In file included from /usr/include/linux/ip.h:21: In file included from /usr/include/asm/byteorder.h:5: In file included from /usr/include/linux/byteorder/little_endian.h:13: /usr/include/linux/swab.h:136:8: error: unknown type name '__always_inline' 136 | static __always_inline unsigned long __swab(const unsigned long y) | ^ /usr/include/linux/swab.h:171:8: error: unknown type name '__always_inline' 171 | static __always_inline __u16 __swab16p(const __u16 *p) ... bpf_helpers.h file provided a definition for __always_inline. Putting 'ip.h' after 'bpf_helpers.h' fixed the issue. Fixes: 43a7c3ef8a15 ("selftests/bpf: Add lwt_xmit tests for BPF_REDIRECT") Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20230818174312.1883381-1-yonghong.song@linux.dev Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-08-18selftests/bpf: Add CO-RE relocs kfunc flavors testsDave Marchevsky2-0/+53
This patch adds selftests that exercise kfunc flavor relocation functionality added in the previous patch. The actual kfunc defined in kernel/bpf/helpers.c is: struct task_struct *bpf_task_acquire(struct task_struct *p) The following relocation behaviors are checked: struct task_struct *bpf_task_acquire___one(struct task_struct *name) * Should succeed despite differing param name struct task_struct *bpf_task_acquire___two(struct task_struct *p, void *ctx) * Should fail because there is no two-param bpf_task_acquire struct task_struct *bpf_task_acquire___three(void *ctx) * Should fail because, despite vmlinux's bpf_task_acquire having one param, the types don't match Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20230817225353.2570845-2-davemarchevsky@fb.com
2023-08-18libbpf: Support triple-underscore flavors for kfunc relocationDave Marchevsky1-1/+19
The function signature of kfuncs can change at any time due to their intentional lack of stability guarantees. As kfuncs become more widely used, BPF program writers will need facilities to support calling different versions of a kfunc from a single BPF object. Consider this simplified example based on a real scenario we ran into at Meta: /* initial kfunc signature */ int some_kfunc(void *ptr) /* Oops, we need to add some flag to modify behavior. No problem, change the kfunc. flags = 0 retains original behavior */ int some_kfunc(void *ptr, long flags) If the initial version of the kfunc is deployed on some portion of the fleet and the new version on the rest, a fleetwide service that uses some_kfunc will currently need to load different BPF programs depending on which some_kfunc is available. Luckily CO-RE provides a facility to solve a very similar problem, struct definition changes, by allowing program writers to declare my_struct___old and my_struct___new, with ___suffix being considered a 'flavor' of the non-suffixed name and being ignored by bpf_core_type_exists and similar calls. This patch extends the 'flavor' facility to the kfunc extern relocation process. BPF program writers can now declare extern int some_kfunc___old(void *ptr) extern int some_kfunc___new(void *ptr, int flags) then test which version of the kfunc exists with bpf_ksym_exists. Relocation and verifier's dead code elimination will work in concert as expected, allowing this pattern: if (bpf_ksym_exists(some_kfunc___old)) some_kfunc___old(ptr); else some_kfunc___new(ptr, 0); Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230817225353.2570845-1-davemarchevsky@fb.com
2023-08-18bpf/tests: Enhance output on error and fix typosHelge Deller1-5/+7
If a testcase returns a wrong (unexpected) value, print the expected and returned value in hex notation in addition to the decimal notation. This is very useful in tests which bit-shift hex values left or right and helped me a lot while developing the JIT compiler for the hppa architecture. Additionally fix two typos: dowrd -> dword, tall calls -> tail calls. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/ZN6ZAAVoWZpsD1Jf@p100
2023-08-18selftests/bpf: Add lwt_xmit tests for BPF_REROUTEYan Zhai3-0/+299
There is no lwt test case for BPF_REROUTE yet. Add test cases for both normal and abnormal situations. The abnormal situation is set up with an fq qdisc on the reroute target device. Without proper fixes, overflow this qdisc queue limit (to trigger a drop) would panic the kernel. Signed-off-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/62c8ddc1e924269dcf80d2e8af1a1e632cee0b3a.1692326837.git.yan@cloudflare.com
2023-08-18selftests/bpf: Add lwt_xmit tests for BPF_REDIRECTYan Zhai4-0/+560
There is no lwt_xmit test case for BPF_REDIRECT yet. Add test cases for both normal and abnormal situations. For abnormal test cases, devices are set down or have its carrier set down. Without proper fixes, BPF_REDIRECT to either ingress or egress of such device would panic the kernel. Signed-off-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/96bf435243641939d9c9da329fab29cb45f7df22.1692326837.git.yan@cloudflare.com
2023-08-18lwt: Check LWTUNNEL_XMIT_CONTINUE strictlyYan Zhai3-3/+6
LWTUNNEL_XMIT_CONTINUE is implicitly assumed in ip(6)_finish_output2, such that any positive return value from a xmit hook could cause unexpected continue behavior, despite that related skb may have been freed. This could be error-prone for future xmit hook ops. One of the possible errors is to return statuses of dst_output directly. To make the code safer, redefine LWTUNNEL_XMIT_CONTINUE value to distinguish from dst_output statuses and check the continue condition explicitly. Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure") Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/96b939b85eda00e8df4f7c080f770970a4c5f698.1692326837.git.yan@cloudflare.com
2023-08-18lwt: Fix return values of BPF xmit opsYan Zhai1-4/+3
BPF encap ops can return different types of positive values, such like NET_RX_DROP, NET_XMIT_CN, NETDEV_TX_BUSY, and so on, from function skb_do_redirect and bpf_lwt_xmit_reroute. At the xmit hook, such return values would be treated implicitly as LWTUNNEL_XMIT_CONTINUE in ip(6)_finish_output2. When this happens, skbs that have been freed would continue to the neighbor subsystem, causing use-after-free bug and kernel crashes. To fix the incorrect behavior, skb_do_redirect return values can be simply discarded, the same as tc-egress behavior. On the other hand, bpf_lwt_xmit_reroute returns useful errors to local senders, e.g. PMTU information. Thus convert its return values to avoid the conflict with LWTUNNEL_XMIT_CONTINUE. Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure") Reported-by: Jordan Griege <jgriege@cloudflare.com> Suggested-by: Martin KaFai Lau <martin.lau@linux.dev> Suggested-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/0d2b878186cfe215fec6b45769c1cd0591d3628d.1692326837.git.yan@cloudflare.com
2023-08-18selftests/bpf: Enable cpu v4 tests for arm64Xu Kuohai6-6/+6
Enable CPU v4 instruction tests for arm64. Below are the test results from BPF test_progs selftests: # ./test_progs -t ldsx_insn,verifier_sdiv,verifier_movsx,verifier_ldsx,verifier_gotol,verifier_bswap #115/1 ldsx_insn/map_val and probed_memory:OK #115/2 ldsx_insn/ctx_member_sign_ext:OK #115/3 ldsx_insn/ctx_member_narrow_sign_ext:OK #115 ldsx_insn:OK #302/1 verifier_bswap/BSWAP, 16:OK #302/2 verifier_bswap/BSWAP, 16 @unpriv:OK #302/3 verifier_bswap/BSWAP, 32:OK #302/4 verifier_bswap/BSWAP, 32 @unpriv:OK #302/5 verifier_bswap/BSWAP, 64:OK #302/6 verifier_bswap/BSWAP, 64 @unpriv:OK #302 verifier_bswap:OK #316/1 verifier_gotol/gotol, small_imm:OK #316/2 verifier_gotol/gotol, small_imm @unpriv:OK #316 verifier_gotol:OK #324/1 verifier_ldsx/LDSX, S8:OK #324/2 verifier_ldsx/LDSX, S8 @unpriv:OK #324/3 verifier_ldsx/LDSX, S16:OK #324/4 verifier_ldsx/LDSX, S16 @unpriv:OK #324/5 verifier_ldsx/LDSX, S32:OK #324/6 verifier_ldsx/LDSX, S32 @unpriv:OK #324/7 verifier_ldsx/LDSX, S8 range checking, privileged:OK #324/8 verifier_ldsx/LDSX, S16 range checking:OK #324/9 verifier_ldsx/LDSX, S16 range checking @unpriv:OK #324/10 verifier_ldsx/LDSX, S32 range checking:OK #324/11 verifier_ldsx/LDSX, S32 range checking @unpriv:OK #324 verifier_ldsx:OK #335/1 verifier_movsx/MOV32SX, S8:OK #335/2 verifier_movsx/MOV32SX, S8 @unpriv:OK #335/3 verifier_movsx/MOV32SX, S16:OK #335/4 verifier_movsx/MOV32SX, S16 @unpriv:OK #335/5 verifier_movsx/MOV64SX, S8:OK #335/6 verifier_movsx/MOV64SX, S8 @unpriv:OK #335/7 verifier_movsx/MOV64SX, S16:OK #335/8 verifier_movsx/MOV64SX, S16 @unpriv:OK #335/9 verifier_movsx/MOV64SX, S32:OK #335/10 verifier_movsx/MOV64SX, S32 @unpriv:OK #335/11 verifier_movsx/MOV32SX, S8, range_check:OK #335/12 verifier_movsx/MOV32SX, S8, range_check @unpriv:OK #335/13 verifier_movsx/MOV32SX, S16, range_check:OK #335/14 verifier_movsx/MOV32SX, S16, range_check @unpriv:OK #335/15 verifier_movsx/MOV32SX, S16, range_check 2:OK #335/16 verifier_movsx/MOV32SX, S16, range_check 2 @unpriv:OK #335/17 verifier_movsx/MOV64SX, S8, range_check:OK #335/18 verifier_movsx/MOV64SX, S8, range_check @unpriv:OK #335/19 verifier_movsx/MOV64SX, S16, range_check:OK #335/20 verifier_movsx/MOV64SX, S16, range_check @unpriv:OK #335/21 verifier_movsx/MOV64SX, S32, range_check:OK #335/22 verifier_movsx/MOV64SX, S32, range_check @unpriv:OK #335/23 verifier_movsx/MOV64SX, S16, R10 Sign Extension:OK #335/24 verifier_movsx/MOV64SX, S16, R10 Sign Extension @unpriv:OK #335 verifier_movsx:OK #347/1 verifier_sdiv/SDIV32, non-zero imm divisor, check 1:OK #347/2 verifier_sdiv/SDIV32, non-zero imm divisor, check 1 @unpriv:OK #347/3 verifier_sdiv/SDIV32, non-zero imm divisor, check 2:OK #347/4 verifier_sdiv/SDIV32, non-zero imm divisor, check 2 @unpriv:OK #347/5 verifier_sdiv/SDIV32, non-zero imm divisor, check 3:OK #347/6 verifier_sdiv/SDIV32, non-zero imm divisor, check 3 @unpriv:OK #347/7 verifier_sdiv/SDIV32, non-zero imm divisor, check 4:OK #347/8 verifier_sdiv/SDIV32, non-zero imm divisor, check 4 @unpriv:OK #347/9 verifier_sdiv/SDIV32, non-zero imm divisor, check 5:OK #347/10 verifier_sdiv/SDIV32, non-zero imm divisor, check 5 @unpriv:OK #347/11 verifier_sdiv/SDIV32, non-zero imm divisor, check 6:OK #347/12 verifier_sdiv/SDIV32, non-zero imm divisor, check 6 @unpriv:OK #347/13 verifier_sdiv/SDIV32, non-zero imm divisor, check 7:OK #347/14 verifier_sdiv/SDIV32, non-zero imm divisor, check 7 @unpriv:OK #347/15 verifier_sdiv/SDIV32, non-zero imm divisor, check 8:OK #347/16 verifier_sdiv/SDIV32, non-zero imm divisor, check 8 @unpriv:OK #347/17 verifier_sdiv/SDIV32, non-zero reg divisor, check 1:OK #347/18 verifier_sdiv/SDIV32, non-zero reg divisor, check 1 @unpriv:OK #347/19 verifier_sdiv/SDIV32, non-zero reg divisor, check 2:OK #347/20 verifier_sdiv/SDIV32, non-zero reg divisor, check 2 @unpriv:OK #347/21 verifier_sdiv/SDIV32, non-zero reg divisor, check 3:OK #347/22 verifier_sdiv/SDIV32, non-zero reg divisor, check 3 @unpriv:OK #347/23 verifier_sdiv/SDIV32, non-zero reg divisor, check 4:OK #347/24 verifier_sdiv/SDIV32, non-zero reg divisor, check 4 @unpriv:OK #347/25 verifier_sdiv/SDIV32, non-zero reg divisor, check 5:OK #347/26 verifier_sdiv/SDIV32, non-zero reg divisor, check 5 @unpriv:OK #347/27 verifier_sdiv/SDIV32, non-zero reg divisor, check 6:OK #347/28 verifier_sdiv/SDIV32, non-zero reg divisor, check 6 @unpriv:OK #347/29 verifier_sdiv/SDIV32, non-zero reg divisor, check 7:OK #347/30 verifier_sdiv/SDIV32, non-zero reg divisor, check 7 @unpriv:OK #347/31 verifier_sdiv/SDIV32, non-zero reg divisor, check 8:OK #347/32 verifier_sdiv/SDIV32, non-zero reg divisor, check 8 @unpriv:OK #347/33 verifier_sdiv/SDIV64, non-zero imm divisor, check 1:OK #347/34 verifier_sdiv/SDIV64, non-zero imm divisor, check 1 @unpriv:OK #347/35 verifier_sdiv/SDIV64, non-zero imm divisor, check 2:OK #347/36 verifier_sdiv/SDIV64, non-zero imm divisor, check 2 @unpriv:OK #347/37 verifier_sdiv/SDIV64, non-zero imm divisor, check 3:OK #347/38 verifier_sdiv/SDIV64, non-zero imm divisor, check 3 @unpriv:OK #347/39 verifier_sdiv/SDIV64, non-zero imm divisor, check 4:OK #347/40 verifier_sdiv/SDIV64, non-zero imm divisor, check 4 @unpriv:OK #347/41 verifier_sdiv/SDIV64, non-zero imm divisor, check 5:OK #347/42 verifier_sdiv/SDIV64, non-zero imm divisor, check 5 @unpriv:OK #347/43 verifier_sdiv/SDIV64, non-zero imm divisor, check 6:OK #347/44 verifier_sdiv/SDIV64, non-zero imm divisor, check 6 @unpriv:OK #347/45 verifier_sdiv/SDIV64, non-zero reg divisor, check 1:OK #347/46 verifier_sdiv/SDIV64, non-zero reg divisor, check 1 @unpriv:OK #347/47 verifier_sdiv/SDIV64, non-zero reg divisor, check 2:OK #347/48 verifier_sdiv/SDIV64, non-zero reg divisor, check 2 @unpriv:OK #347/49 verifier_sdiv/SDIV64, non-zero reg divisor, check 3:OK #347/50 verifier_sdiv/SDIV64, non-zero reg divisor, check 3 @unpriv:OK #347/51 verifier_sdiv/SDIV64, non-zero reg divisor, check 4:OK #347/52 verifier_sdiv/SDIV64, non-zero reg divisor, check 4 @unpriv:OK #347/53 verifier_sdiv/SDIV64, non-zero reg divisor, check 5:OK #347/54 verifier_sdiv/SDIV64, non-zero reg divisor, check 5 @unpriv:OK #347/55 verifier_sdiv/SDIV64, non-zero reg divisor, check 6:OK #347/56 verifier_sdiv/SDIV64, non-zero reg divisor, check 6 @unpriv:OK #347/57 verifier_sdiv/SMOD32, non-zero imm divisor, check 1:OK #347/58 verifier_sdiv/SMOD32, non-zero imm divisor, check 1 @unpriv:OK #347/59 verifier_sdiv/SMOD32, non-zero imm divisor, check 2:OK #347/60 verifier_sdiv/SMOD32, non-zero imm divisor, check 2 @unpriv:OK #347/61 verifier_sdiv/SMOD32, non-zero imm divisor, check 3:OK #347/62 verifier_sdiv/SMOD32, non-zero imm divisor, check 3 @unpriv:OK #347/63 verifier_sdiv/SMOD32, non-zero imm divisor, check 4:OK #347/64 verifier_sdiv/SMOD32, non-zero imm divisor, check 4 @unpriv:OK #347/65 verifier_sdiv/SMOD32, non-zero imm divisor, check 5:OK #347/66 verifier_sdiv/SMOD32, non-zero imm divisor, check 5 @unpriv:OK #347/67 verifier_sdiv/SMOD32, non-zero imm divisor, check 6:OK #347/68 verifier_sdiv/SMOD32, non-zero imm divisor, check 6 @unpriv:OK #347/69 verifier_sdiv/SMOD32, non-zero reg divisor, check 1:OK #347/70 verifier_sdiv/SMOD32, non-zero reg divisor, check 1 @unpriv:OK #347/71 verifier_sdiv/SMOD32, non-zero reg divisor, check 2:OK #347/72 verifier_sdiv/SMOD32, non-zero reg divisor, check 2 @unpriv:OK #347/73 verifier_sdiv/SMOD32, non-zero reg divisor, check 3:OK #347/74 verifier_sdiv/SMOD32, non-zero reg divisor, check 3 @unpriv:OK #347/75 verifier_sdiv/SMOD32, non-zero reg divisor, check 4:OK #347/76 verifier_sdiv/SMOD32, non-zero reg divisor, check 4 @unpriv:OK #347/77 verifier_sdiv/SMOD32, non-zero reg divisor, check 5:OK #347/78 verifier_sdiv/SMOD32, non-zero reg divisor, check 5 @unpriv:OK #347/79 verifier_sdiv/SMOD32, non-zero reg divisor, check 6:OK #347/80 verifier_sdiv/SMOD32, non-zero reg divisor, check 6 @unpriv:OK #347/81 verifier_sdiv/SMOD64, non-zero imm divisor, check 1:OK #347/82 verifier_sdiv/SMOD64, non-zero imm divisor, check 1 @unpriv:OK #347/83 verifier_sdiv/SMOD64, non-zero imm divisor, check 2:OK #347/84 verifier_sdiv/SMOD64, non-zero imm divisor, check 2 @unpriv:OK #347/85 verifier_sdiv/SMOD64, non-zero imm divisor, check 3:OK #347/86 verifier_sdiv/SMOD64, non-zero imm divisor, check 3 @unpriv:OK #347/87 verifier_sdiv/SMOD64, non-zero imm divisor, check 4:OK #347/88 verifier_sdiv/SMOD64, non-zero imm divisor, check 4 @unpriv:OK #347/89 verifier_sdiv/SMOD64, non-zero imm divisor, check 5:OK #347/90 verifier_sdiv/SMOD64, non-zero imm divisor, check 5 @unpriv:OK #347/91 verifier_sdiv/SMOD64, non-zero imm divisor, check 6:OK #347/92 verifier_sdiv/SMOD64, non-zero imm divisor, check 6 @unpriv:OK #347/93 verifier_sdiv/SMOD64, non-zero imm divisor, check 7:OK #347/94 verifier_sdiv/SMOD64, non-zero imm divisor, check 7 @unpriv:OK #347/95 verifier_sdiv/SMOD64, non-zero imm divisor, check 8:OK #347/96 verifier_sdiv/SMOD64, non-zero imm divisor, check 8 @unpriv:OK #347/97 verifier_sdiv/SMOD64, non-zero reg divisor, check 1:OK #347/98 verifier_sdiv/SMOD64, non-zero reg divisor, check 1 @unpriv:OK #347/99 verifier_sdiv/SMOD64, non-zero reg divisor, check 2:OK #347/100 verifier_sdiv/SMOD64, non-zero reg divisor, check 2 @unpriv:OK #347/101 verifier_sdiv/SMOD64, non-zero reg divisor, check 3:OK #347/102 verifier_sdiv/SMOD64, non-zero reg divisor, check 3 @unpriv:OK #347/103 verifier_sdiv/SMOD64, non-zero reg divisor, check 4:OK #347/104 verifier_sdiv/SMOD64, non-zero reg divisor, check 4 @unpriv:OK #347/105 verifier_sdiv/SMOD64, non-zero reg divisor, check 5:OK #347/106 verifier_sdiv/SMOD64, non-zero reg divisor, check 5 @unpriv:OK #347/107 verifier_sdiv/SMOD64, non-zero reg divisor, check 6:OK #347/108 verifier_sdiv/SMOD64, non-zero reg divisor, check 6 @unpriv:OK #347/109 verifier_sdiv/SMOD64, non-zero reg divisor, check 7:OK #347/110 verifier_sdiv/SMOD64, non-zero reg divisor, check 7 @unpriv:OK #347/111 verifier_sdiv/SMOD64, non-zero reg divisor, check 8:OK #347/112 verifier_sdiv/SMOD64, non-zero reg divisor, check 8 @unpriv:OK #347/113 verifier_sdiv/SDIV32, zero divisor:OK #347/114 verifier_sdiv/SDIV32, zero divisor @unpriv:OK #347/115 verifier_sdiv/SDIV64, zero divisor:OK #347/116 verifier_sdiv/SDIV64, zero divisor @unpriv:OK #347/117 verifier_sdiv/SMOD32, zero divisor:OK #347/118 verifier_sdiv/SMOD32, zero divisor @unpriv:OK #347/119 verifier_sdiv/SMOD64, zero divisor:OK #347/120 verifier_sdiv/SMOD64, zero divisor @unpriv:OK #347 verifier_sdiv:OK Summary: 6/166 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-8-xukuohai@huaweicloud.com
2023-08-18bpf, arm64: Support signed div/mod instructionsXu Kuohai2-4/+17
Add JIT for signed div/mod instructions. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-7-xukuohai@huaweicloud.com
2023-08-18bpf, arm64: Support 32-bit offset jmp instructionXu Kuohai1-1/+5
Add support for 32-bit offset jmp instructions. Given the arm64 direct jump range is +-128MB, which is large enough for BPF prog, jumps beyond this range are not supported. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-6-xukuohai@huaweicloud.com
2023-08-18bpf, arm64: Support unconditional bswapXu Kuohai1-2/+3
Add JIT support for unconditional bswap instructions. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-5-xukuohai@huaweicloud.com
2023-08-18bpf, arm64: Support sign-extension mov instructionsXu Kuohai2-1/+19
Add JIT support for BPF sign-extension mov instructions with arm64 SXTB/SXTH/SXTW instructions. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-4-xukuohai@huaweicloud.com
2023-08-18bpf, arm64: Support sign-extension load instructionsXu Kuohai2-8/+43
Add JIT support for sign-extension load instructions. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-3-xukuohai@huaweicloud.com
2023-08-18arm64: insn: Add encoders for LDRSB/LDRSH/LDRSWXu Kuohai2-0/+10
To support BPF sign-extend load instructions, add encoders for LDRSB/LDRSH/LDRSW. LDRSB/LDRSH/LDRSW (immediate) is encoded as follows: 3 2 2 2 2 1 0 0 0 7 6 4 2 0 5 0 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sz|1 1 1|0|0 1|opc| imm12 | Rn | Rt | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ LDRSB/LDRSH/LDRSW (register) is encoded as follows: 3 2 2 2 2 2 1 1 1 1 0 0 0 7 6 4 2 1 6 3 2 0 5 0 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | sz|1 1 1|0|0 0|opc|1| Rm | opt |S|1 0| Rn | Rt | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ where: - sz indicates whether 8-bit, 16-bit or 32-bit data is to be loaded - opc opc[1] (bit 23) is always 1 and opc[0] == 1 indicates regsize is 32-bit. Since BPF signed load instructions always exend the sign bit to bit 63 regardless of whether it loads an 8-bit, 16-bit or 32-bit data. So only 64-bit register size is required. That is, it's sufficient to set field opc fixed to 0x2. - opt Indicates whether to sign extend the offset register Rm and the effective bits of Rm. We set opt to 0x7 (SXTX) since we'll use Rm as a sgined 64-bit value in BPF. - S Optional only when opt field is 0x3 (LSL) In short, the above fields are encoded to the values listed below. sz opc opt S LDRSB (immediate) 0x0 0x2 na na LDRSH (immediate) 0x1 0x2 na na LDRSW (immediate) 0x2 0x2 na na LDRSB (register) 0x0 0x2 0x7 0 LDRSH (register) 0x1 0x2 0x7 0 LDRSW (register) 0x2 0x2 0x7 0 Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Florent Revest <revest@chromium.org> Acked-by: Florent Revest <revest@chromium.org> Link: https://lore.kernel.org/bpf/20230815154158.717901-2-xukuohai@huaweicloud.com
2023-08-18Merge branch 'netconsole-enable-compile-time-configuration'Jakub Kicinski2-20/+50
Breno Leitao says: ==================== netconsole: Enable compile time configuration Enable netconsole features to be set at compilation time. Create two Kconfig options that allow users to set extended logs and release prepending features at compilation time. The first patch de-duplicates the initialization code, and the second patch adds the support in the de-duplicated code, avoiding touching two different functions with the same change. ==================== Link: https://lore.kernel.org/r/20230811093158.1678322-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18netconsole: Enable compile time configurationBreno Leitao2-0/+27
Enable netconsole features to be set at compilation time. Create two Kconfig options that allow users to set extended logs and release prepending features at compilation time. Right now, the user needs to pass command line parameters to netconsole, such as "+"/"r" to enable extended logs and version prepending features. With these two options, the user could set the default values for the features at compile time, and don't need to pass it in the command line to get them enabled, simplifying the command line. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230811093158.1678322-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18netconsole: Create a allocation helperBreno Leitao1-20/+23
De-duplicate the initialization and allocation code for struct netconsole_target. The same allocation and initialization code is duplicated in two different places in the netconsole subsystem, when the netconsole target is initialized by command line parameters (alloc_param_target()), and dynamically by sysfs (make_netconsole_target()). Create a helper function, and call it from the two different functions. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230811093158.1678322-2-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18net: mdio: fix -Wvoid-pointer-to-enum-cast warningJustin Stitt1-1/+1
When building with clang 18 I see the following warning: | drivers/net/mdio/mdio-xgene.c:338:13: warning: cast to smaller integer | type 'enum xgene_mdio_id' from 'const void *' [-Wvoid-pointer-to-enum-cast] | 338 | mdio_id = (enum xgene_mdio_id)of_id->data; This is due to the fact that `of_id->data` is a void* while `enum xgene_mdio_id` has the size of an int. This leads to truncation and possible data loss. Link: https://github.com/ClangBuiltLinux/linux/issues/1910 Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/20230815-void-drivers-net-mdio-mdio-xgene-v1-1-5304342e0659@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18Merge branch 'netem-use-a-seeded-prng-for-loss-and-corruption-events'Jakub Kicinski2-15/+35
François Michel says: ==================== netem: use a seeded PRNG for loss and corruption events In order to reproduce bugs or performance evaluation of network protocols and applications, it is useful to have reproducible test suites and tools. This patch adds a way to specify a PRNG seed through the TCA_NETEM_PRNG_SEED attribute for generating netem loss and corruption events. Initializing the qdisc with the same seed leads to the exact same loss and corruption patterns. If no seed is explicitly specified, the qdisc generates a random seed using get_random_u64(). This patch can be and has been tested using tc from the following iproute2-next fork: https://github.com/francoismichel/iproute2-next For instance, setting the seed 42424242 on the loopback with a loss rate of 10% will systematically drop the 5th, 12th and 24th packet when sending 25 packets. ==================== Link: https://lore.kernel.org/r/20230815092348.1449179-1-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18netem: use seeded PRNG for correlated loss eventsFrançois Michel1-10/+12
Use prandom_u32_state() instead of get_random_u32() to generate the correlated loss events of netem. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-4-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18netem: use a seeded PRNG for generating random lossesFrançois Michel1-5/+6
Use prandom_u32_state() instead of get_random_u32() to generate the random loss events of netem. The state of the prng is part of the prng attribute of struct netem_sched_data. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-3-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18netem: add prng attribute to netem_sched_dataFrançois Michel2-0/+17
Add prng attribute to struct netem_sched_data and allows setting the seed of the PRNG through netlink using the new TCA_NETEM_PRNG_SEED attribute. The PRNG attribute is not actually used yet. Signed-off-by: François Michel <francois.michel@uclouvain.be> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Link: https://lore.kernel.org/r/20230815092348.1449179-2-francois.michel@uclouvain.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18net: ena: Use pci_dev_id() to simplify the codeJialin Zhang1-1/+1
PCI core API pci_dev_id() can be used to get the BDF number for a pci device. We don't need to compose it manually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Jialin Zhang <zhangjialin11@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/20230815024248.3519068-1-zhangjialin11@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>