summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-03-05bpf: Simplify the calculation of variablesJiapeng Chong1-1/+1
Fix the following coccicheck warnings: ./tools/bpf/bpf_dbg.c:1201:55-57: WARNING !A || A && B is equivalent to !A || B. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/1614756035-111280-1-git-send-email-jiapeng.chong@linux.alibaba.com
2021-03-05selftests: bpf: Don't run sk_lookup in verifier testsLorenz Bauer2-2/+3
sk_lookup doesn't allow setting data_in for bpf_prog_run. This doesn't play well with the verifier tests, since they always set a 64 byte input buffer. Allow not running verifier tests by setting bpf_test.runs to a negative value and don't run the ctx access case for sk_lookup. We have dedicated ctx access tests so skipping here doesn't reduce coverage. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-6-lmb@cloudflare.com
2021-03-05selftests: bpf: Check that PROG_TEST_RUN repeats as requestedLorenz Bauer1-9/+42
Extend a simple prog_run test to check that PROG_TEST_RUN adheres to the requested repetitions. Convert it to use BPF skeleton. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-5-lmb@cloudflare.com
2021-03-05selftests: bpf: Convert sk_lookup ctx access tests to PROG_TEST_RUNLorenz Bauer2-36/+109
Convert the selftests for sk_lookup narrow context access to use PROG_TEST_RUN instead of creating actual sockets. This ensures that ctx is populated correctly when using PROG_TEST_RUN. Assert concrete values since we now control remote_ip and remote_port. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-4-lmb@cloudflare.com
2021-03-05bpf: Add PROG_TEST_RUN support for sk_lookup programsLorenz Bauer1-1/+4
Allow to pass sk_lookup programs to PROG_TEST_RUN. User space provides the full bpf_sk_lookup struct as context. Since the context includes a socket pointer that can't be exposed to user space we define that PROG_TEST_RUN returns the cookie of the selected socket or zero in place of the socket pointer. We don't support testing programs that select a reuseport socket, since this would mean running another (unrelated) BPF program from the sk_lookup test handler. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210303101816.36774-3-lmb@cloudflare.com
2021-03-05tools: Sync uapi bpf.h header with latest changesJoe Stringer1-1/+711
Synchronize the header after all of the recent changes. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-16-joe@cilium.io
2021-03-05selftests/bpf: Test syscall command parsingJoe Stringer2-2/+13
Add building of the bpf(2) syscall commands documentation as part of the docs building step in the build. This allows us to pick up on potential parse errors from the docs generator script as part of selftests. The generated manual pages here are not intended for distribution, they are just a fragment that can be integrated into the other static text of bpf(2) to form the full manual page. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-14-joe@cilium.io
2021-03-05selftests/bpf: Templatize man page generationJoe Stringer1-15/+25
Previously, the Makefile here was only targeting a single manual page so it just hardcoded a bunch of individual rules to specifically handle build, clean, install, uninstall for that particular page. Upcoming commits will generate manual pages for an additional section, so this commit prepares the makefile first by converting the existing targets into an evaluated set of targets based on the manual page name and section. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-13-joe@cilium.io
2021-03-05tools/bpf: Remove bpf-helpers from bpftool docsJoe Stringer7-49/+50
This logic is used for validating the manual pages from selftests, so move the infra under tools/testing/selftests/bpf/ and rely on selftests for validation rather than tying it into the bpftool build. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-12-joe@cilium.io
2021-03-05scripts/bpf: Abstract eBPF API target parameterJoe Stringer4-4/+4
Abstract out the target parameter so that upcoming commits, more than just the existing "helpers" target can be called to generate specific portions of docs from the eBPF UAPI headers. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20210302171947.2268128-10-joe@cilium.io
2021-03-05selftests/bpf: Add BTF_KIND_FLOAT to the existing deduplication testsIlya Leoshkevich1-12/+31
Check that floats don't interfere with struct deduplication, that they are not merged with another kinds and that floats of different sizes are not merged with each other. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226202256.116518-9-iii@linux.ibm.com
2021-03-05selftest/bpf: Add BTF_KIND_FLOAT testsIlya Leoshkevich3-0/+138
Test the good variants as well as the potential malformed ones. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226202256.116518-8-iii@linux.ibm.com
2021-03-05selftests/bpf: Use the 25th bit in the "invalid BTF_INFO" testIlya Leoshkevich1-1/+1
The bit being checked by this test is no longer reserved after introducing BTF_KIND_FLOAT, so use the next one instead. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-6-iii@linux.ibm.com
2021-03-05tools/bpftool: Add BTF_KIND_FLOAT supportIlya Leoshkevich2-0/+9
Only dumping support needs to be adjusted, the code structure follows that of BTF_KIND_INT. Example plain and JSON outputs: [4] FLOAT 'float' size=4 {"id":4,"kind":"FLOAT","name":"float","size":4} Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-5-iii@linux.ibm.com
2021-03-05libbpf: Add BTF_KIND_FLOAT supportIlya Leoshkevich6-1/+94
The logic follows that of BTF_KIND_INT most of the time. Sanitization replaces BTF_KIND_FLOATs with equally-sized empty BTF_KIND_STRUCTs on older kernels, for example, the following: [4] FLOAT 'float' size=4 becomes the following: [4] STRUCT '(anon)' size=4 vlen=0 With dwarves patch [1] and this patch, the older kernels, which were failing with the floating-point-related errors, will now start working correctly. [1] https://github.com/iii-i/dwarves/commit/btf-kind-float-v2 Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226202256.116518-4-iii@linux.ibm.com
2021-03-05libbpf: Fix whitespace in btf_add_composite() commentIlya Leoshkevich1-1/+1
Remove trailing space. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-3-iii@linux.ibm.com
2021-03-05bpf: Add BTF_KIND_FLOAT to uapiIlya Leoshkevich1-2/+3
Add a new kind value and expand the kind bitfield. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210226202256.116518-2-iii@linux.ibm.com
2021-03-04selftests/bpf: Add a verifier scale test with unknown bounded loopYonghong Song3-0/+139
The original bcc pull request https://github.com/iovisor/bcc/pull/3270 exposed a verifier failure with Clang 12/13 while Clang 4 works fine. Further investigation exposed two issues: Issue 1: LLVM may generate code which uses less refined value. The issue is fixed in LLVM patch: https://reviews.llvm.org/D97479 Issue 2: Spills with initial value 0 are marked as precise which makes later state pruning less effective. This is my rough initial analysis and further investigation is needed to find how to improve verifier pruning in such cases. With the above LLVM patch, for the new loop6.c test, which has smaller loop bound compared to original test, I got: $ test_progs -s -n 10/16 ... stack depth 64 processed 390735 insns (limit 1000000) max_states_per_insn 87 total_states 8658 peak_states 964 mark_read 6 #10/16 loop6.o:OK Use the original loop bound, i.e., commenting out "#define WORKAROUND", I got: $ test_progs -s -n 10/16 ... BPF program is too large. Processed 1000001 insn stack depth 64 processed 1000001 insns (limit 1000000) max_states_per_insn 91 total_states 23176 peak_states 5069 mark_read 6 ... #10/16 loop6.o:FAIL The purpose of this patch is to provide a regression test for the above LLVM fix and also provide a test case for further analyzing the verifier pruning issue. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Zhenwei Pi <pizhenwei@bytedance.com> Link: https://lore.kernel.org/bpf/20210226223810.236472-1-yhs@fb.com
2021-03-03tools/runqslower: Allow substituting custom vmlinux.h for the buildAndrii Nakryiko1-0/+4
Just like was done for bpftool and selftests in ec23eb705620 ("tools/bpftool: Allow substituting custom vmlinux.h for the build") and ca4db6389d61 ("selftests/bpf: Allow substituting custom vmlinux.h for selftests build"), allow to provide pre-generated vmlinux.h for runqslower build. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210303004010.653954-1-andrii@kernel.org
2021-02-27tools, bpf_asm: Exit non-zero on errorsIan Denhardt1-4/+4
... so callers can correctly detect failure. Signed-off-by: Ian Denhardt <ian@zenhack.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/b0bea780bc292f29e7b389dd062f20adc2a2d634.1614201868.git.ian@zenhack.net
2021-02-27tools, bpf_asm: Hard error on out of range jumpsIan Denhardt1-2/+4
Per discussion at [0] this was originally introduced as a warning due to concerns about breaking existing code, but a hard error probably makes more sense, especially given that concerns about breakage were only speculation. [0] https://lore.kernel.org/bpf/c964892195a6b91d20a67691448567ef528ffa6d.camel@linux.ibm.com/T/#t Signed-off-by: Ian Denhardt <ian@zenhack.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/a6b6c7516f5d559049d669968e953b4a8d7adea3.1614201868.git.ian@zenhack.net
2021-02-27selftests/bpf: Add arraymap test for bpf_for_each_map_elem() helperYonghong Song2-0/+118
A test is added for arraymap and percpu arraymap. The test also exercises the early return for the helper which does not traverse all elements. $ ./test_progs -n 45 #45/1 hash_map:OK #45/2 array_map:OK #45 for_each:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204934.3885756-1-yhs@fb.com
2021-02-27selftests/bpf: Add hashmap test for bpf_for_each_map_elem() helperYonghong Song3-0/+179
A test case is added for hashmap and percpu hashmap. The test also exercises nested bpf_for_each_map_elem() calls like bpf_prog: bpf_for_each_map_elem(func1) func1: bpf_for_each_map_elem(func2) func2: $ ./test_progs -n 45 #45/1 hash_map:OK #45 for_each:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204933.3885657-1-yhs@fb.com
2021-02-27bpftool: Print subprog address properlyYonghong Song1-0/+3
With later hashmap example, using bpftool xlated output may look like: int dump_task(struct bpf_iter__task * ctx): ; struct task_struct *task = ctx->task; 0: (79) r2 = *(u64 *)(r1 +8) ; if (task == (void *)0 || called > 0) ... 19: (18) r2 = subprog[+17] 30: (18) r2 = subprog[+25] ... 36: (95) exit __u64 check_hash_elem(struct bpf_map * map, __u32 * key, __u64 * val, struct callback_ctx * data): ; struct bpf_iter__task *ctx = data->ctx; 37: (79) r5 = *(u64 *)(r4 +0) ... 55: (95) exit __u64 check_percpu_elem(struct bpf_map * map, __u32 * key, __u64 * val, void * unused): ; check_percpu_elem(struct bpf_map *map, __u32 *key, __u64 *val, void *unused) 56: (bf) r6 = r3 ... 83: (18) r2 = subprog[-47] Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204931.3885458-1-yhs@fb.com
2021-02-27libbpf: Support subprog address relocationYonghong Song1-3/+61
A new relocation RELO_SUBPROG_ADDR is added to capture subprog addresses loaded with ld_imm64 insns. Such ld_imm64 insns are marked with BPF_PSEUDO_FUNC and will be passed to kernel. For bpf_for_each_map_elem() case, kernel will check that the to-be-used subprog address must be a static function and replace it with proper actual jited func address. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204930.3885367-1-yhs@fb.com
2021-02-27libbpf: Move function is_ldimm64() earlier in libbpf.cYonghong Song1-6/+6
Move function is_ldimm64() close to the beginning of libbpf.c, so it can be reused by later code and the next patch as well. There is no functionality change. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204929.3885295-1-yhs@fb.com
2021-02-27bpf: Add bpf_for_each_map_elem() helperYonghong Song1-0/+38
The bpf_for_each_map_elem() helper is introduced which iterates all map elements with a callback function. The helper signature looks like long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags) and for each map element, the callback_fn will be called. For example, like hashmap, the callback signature may look like long callback_fn(map, key, val, callback_ctx) There are two known use cases for this. One is from upstream ([1]) where a for_each_map_elem helper may help implement a timeout mechanism in a more generic way. Another is from our internal discussion for a firewall use case where a map contains all the rules. The packet data can be compared to all these rules to decide allow or deny the packet. For array maps, users can already use a bounded loop to traverse elements. Using this helper can avoid using bounded loop. For other type of maps (e.g., hash maps) where bounded loop is hard or impossible to use, this helper provides a convenient way to operate on all elements. For callback_fn, besides map and map element, a callback_ctx, allocated on caller stack, is also passed to the callback function. This callback_ctx argument can provide additional input and allow to write to caller stack for output. If the callback_fn returns 0, the helper will iterate through next element if available. If the callback_fn returns 1, the helper will stop iterating and returns to the bpf program. Other return values are not used for now. Currently, this helper is only available with jit. It is possible to make it work with interpreter with so effort but I leave it as the future work. [1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/ Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com
2021-02-27selftests/bpf: Copy extras in out-of-srctree buildsIlya Leoshkevich1-3/+4
Building selftests in a separate directory like this: make O="$BUILD" -C tools/testing/selftests/bpf and then running: cd "$BUILD" && ./test_progs -t btf causes all the non-flavored btf_dump_test_case_*.c tests to fail, because these files are not copied to where test_progs expects to find them. Fix by not skipping EXT-COPY when the original $(OUTPUT) is not empty (lib.mk sets it to $(shell pwd) in that case) and using rsync instead of cp: cp fails because e.g. urandom_read is being copied into itself, and rsync simply skips such cases. rsync is already used by kselftests and therefore is not a new dependency. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210224111445.102342-1-iii@linux.ibm.com
2021-02-27selftests/bpf: Propagate error code of the command to vmtest.shKP Singh1-7/+19
When vmtest.sh ran a command in a VM, it did not record or propagate the error code of the command. This made the script less "script-able". The script now saves the error code of the said command in a file in the VM, copies the file back to the host and (when available) uses this error code instead of its own. Signed-off-by: KP Singh <kpsingh@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210225161947.1778590-1-kpsingh@kernel.org
2021-02-26sock_map: Rename skb_parser and skb_verdictCong Wang2-6/+6
These two eBPF programs are tied to BPF_SK_SKB_STREAM_PARSER and BPF_SK_SKB_STREAM_VERDICT, rename them to reflect the fact they are only used for TCP. And save the name 'skb_verdict' for general use later. Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Lorenz Bauer <lmb@cloudflare.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jakub Sitnicki <jakub@cloudflare.com> Link: https://lore.kernel.org/bpf/20210223184934.6054-6-xiyou.wangcong@gmail.com
2021-02-26bpf: Remove blank line in bpf helper description commentHangbin Liu1-1/+0
Commit 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") added an extra blank line in bpf helper description. This will make bpf_helpers_doc.py stop building bpf_helper_defs.h immediately after bpf_check_mtu(), which will affect future added functions. Fixes: 34b2021cc616 ("bpf: Add BPF-helper for MTU checking") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/bpf/20210223131457.1378978-1-liuhangbin@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2021-02-26selftests/bpf: Introduce xsk statistics testsCiara Loftus2-14/+136
This commit introduces a range of tests to the xsk testsuite for validating xsk statistics. A new test type called 'stats' is added. Within it there are four sub-tests. Each test configures a scenario which should trigger the given error statistic. The test passes if the statistic is successfully incremented. The four statistics for which tests have been created are: 1. rx dropped Increase the UMEM frame headroom to a value which results in insufficient space in the rx buffer for both the packet and the headroom. 2. tx invalid Set the 'len' field of tx descriptors to an invalid value (umem frame size + 1). 3. rx ring full Reduce the size of the RX ring to a fraction of the fill ring size. 4. fill queue empty Do not populate the fill queue and then try to receive pkts. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210223162304.7450-5-ciara.loftus@intel.com
2021-02-26selftests/bpf: Restructure xsk selftestsCiara Loftus4-228/+159
Prior to this commit individual xsk tests were launched from the shell script 'test_xsk.sh'. When adding a new test type, two new test configurations had to be added to this file - one for each of the supported XDP 'modes' (skb or drv). Should zero copy support be added to the xsk selftest framework in the future, three new test configurations would need to be added for each new test type. Each new test type also typically requires new CLI arguments for the xdpxceiver program. This commit aims to reduce the overhead of adding new tests, by launching the test configurations from within the xdpxceiver program itself, using simple loops. Every test is run every time the C program is executed. Many of the CLI arguments can be removed as a result. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210223162304.7450-4-ciara.loftus@intel.com
2021-02-26selftests/bpf: Expose and rename debug argumentCiara Loftus3-5/+14
Launching xdpxceiver with -D enables what was formerly know as 'debug' mode. Rename this mode to 'dump-pkts' as it better describes the behavior enabled by the option. New usage: ./xdpxceiver .. -D or ./xdpxceiver .. --dump-pkts Also make it possible to pass this flag to the app via the test_xsk.sh shell script like so: ./test_xsk.sh -D Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210223162304.7450-3-ciara.loftus@intel.com
2021-02-26selftest/bpf: Make xsk tests less verboseMagnus Karlsson4-30/+50
Make the xsk tests less verbose by only printing the essentials. Currently, it is hard to see if the tests passed or not due to all the printouts. Move the extra printouts to a verbose option, if further debugging is needed when a problem arises. To run the xsk tests with verbose output: ./test_xsk.sh -v Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20210223162304.7450-2-ciara.loftus@intel.com
2021-02-26bpf: runqslower: Use task local storageSong Liu1-12/+21
Replace hashtab with task local storage in runqslower. This improves the performance of these BPF programs. The following table summarizes average runtime of these programs, in nanoseconds: task-local hash-prealloc hash-no-prealloc handle__sched_wakeup 125 340 3124 handle__sched_wakeup_new 2812 1510 2998 handle__sched_switch 151 208 991 Note that, task local storage gives better performance than hashtab for handle__sched_wakeup and handle__sched_switch. On the other hand, for handle__sched_wakeup_new, task local storage is slower than hashtab with prealloc. This is because handle__sched_wakeup_new accesses the data for the first time, so it has to allocate the data for task local storage. Once the initial allocation is done, subsequent accesses, as those in handle__sched_wakeup, are much faster with task local storage. If we disable hashtab prealloc, task local storage is much faster for all 3 functions. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210225234319.336131-7-songliubraving@fb.com
2021-02-26bpf: runqslower: Prefer using local vmlimux to generate vmlinux.hSong Liu1-1/+4
Update the Makefile to prefer using $(O)/vmlinux, $(KBUILD_OUTPUT)/vmlinux (for selftests) or ../../../vmlinux. These two files should have latest definitions for vmlinux.h. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210225234319.336131-6-songliubraving@fb.com
2021-02-26selftests/bpf: Test deadlock from recursive bpf_task_storage_[get|delete]Song Liu2-0/+93
Add a test with recursive bpf_task_storage_[get|delete] from fentry programs on bpf_local_storage_lookup and bpf_local_storage_update. Without proper deadlock prevent mechanism, this test would cause deadlock. Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210225234319.336131-5-songliubraving@fb.com
2021-02-26selftests/bpf: Add non-BPF_LSM test for task local storageSong Liu3-0/+165
Task local storage is enabled for tracing programs. Add two tests for task local storage without CONFIG_BPF_LSM. The first test stores a value in sys_enter and read it back in sys_exit. The second test checks whether the kernel allows allocating task local storage in exit_creds() (which it should not). Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210225234319.336131-4-songliubraving@fb.com
2021-02-24bpf: Add kernel/modules BTF presence checks to bpftool feature commandGrant Seltzer1-0/+4
This adds both the CONFIG_DEBUG_INFO_BTF and CONFIG_DEBUG_INFO_BTF_MODULES kernel compile option to output of the bpftool feature command. This is relevant for developers that want to account for data structure definition differences between kernels. Signed-off-by: Grant Seltzer <grantseltzer@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210222195846.155483-1-grantseltzer@gmail.com
2021-02-21Merge tag 'sched-core-2021-02-17' of ↵Linus Torvalds4-65/+267
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core scheduler updates: - Add CONFIG_PREEMPT_DYNAMIC: this in its current form adds the preempt=none/voluntary/full boot options (default: full), to allow distros to build a PREEMPT kernel but fall back to close to PREEMPT_VOLUNTARY (or PREEMPT_NONE) runtime scheduling behavior via a boot time selection. There's also the /debug/sched_debug switch to do this runtime. This feature is implemented via runtime patching (a new variant of static calls). The scope of the runtime patching can be best reviewed by looking at the sched_dynamic_update() function in kernel/sched/core.c. ( Note that the dynamic none/voluntary mode isn't 100% identical, for example preempt-RCU is available in all cases, plus the preempt count is maintained in all models, which has runtime overhead even with the code patching. ) The PREEMPT_VOLUNTARY/PREEMPT_NONE models, used by the vast majority of distributions, are supposed to be unaffected. - Fix ignored rescheduling after rcu_eqs_enter(). This is a bug that was found via rcutorture triggering a hang. The bug is that rcu_idle_enter() may wake up a NOCB kthread, but this happens after the last generic need_resched() check. Some cpuidle drivers fix it by chance but many others don't. In true 2020 fashion the original bug fix has grown into a 5-patch scheduler/RCU fix series plus another 16 RCU patches to address the underlying issue of missed preemption events. These are the initial fixes that should fix current incarnations of the bug. - Clean up rbtree usage in the scheduler, by providing & using the following consistent set of rbtree APIs: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two - Improve the SMP/NUMA load-balancer: scan for an idle sibling in a single pass. This is a 4-commit series where each commit improves one aspect of the idle sibling scan logic. - Improve the cpufreq cooling driver by getting the effective CPU utilization metrics from the scheduler - Improve the fair scheduler's active load-balancing logic by reducing the number of active LB attempts & lengthen the load-balancing interval. This improves stress-ng mmapfork performance. - Fix CFS's estimated utilization (util_est) calculation bug that can result in too high utilization values Misc updates & fixes: - Fix the HRTICK reprogramming & optimization feature - Fix SCHED_SOFTIRQ raising race & warning in the CPU offlining code - Reduce dl_add_task_root_domain() overhead - Fix uprobes refcount bug - Process pending softirqs in flush_smp_call_function_from_idle() - Clean up task priority related defines, remove *USER_*PRIO and USER_PRIO() - Simplify the sched_init_numa() deduplication sort - Documentation updates - Fix EAS bug in update_misfit_status(), which degraded the quality of energy-balancing - Smaller cleanups" * tag 'sched-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) sched,x86: Allow !PREEMPT_DYNAMIC entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point entry: Explicitly flush pending rcuog wakeup before last rescheduling point rcu/nocb: Trigger self-IPI on late deferred wake up before user resume rcu/nocb: Perform deferred wake up before last idle's need_resched() check rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers sched/features: Distinguish between NORMAL and DEADLINE hrtick sched/features: Fix hrtick reprogramming sched/deadline: Reduce rq lock contention in dl_add_task_root_domain() uprobes: (Re)add missing get_uprobe() in __find_uprobe() smp: Process pending softirqs in flush_smp_call_function_from_idle() sched: Harden PREEMPT_DYNAMIC static_call: Allow module use without exposing static_call_key sched: Add /debug/sched_preempt preempt/dynamic: Support dynamic preempt with preempt= boot option preempt/dynamic: Provide irqentry_exit_cond_resched() static call preempt/dynamic: Provide preempt_schedule[_notrace]() static calls preempt/dynamic: Provide cond_resched() and might_resched() static calls preempt: Introduce CONFIG_PREEMPT_DYNAMIC static_call: Provide DEFINE_STATIC_CALL_RET0() ...
2021-02-21Merge tag 'locking-core-2021-02-17' of ↵Linus Torvalds34-138/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Core locking primitives updates: - Remove mutex_trylock_recursive() from the API - no users left - Simplify + constify the futex code a bit Lockdep updates: - Teach lockdep about local_lock_t - Add CONFIG_DEBUG_IRQFLAGS=y debug config option to check for potentially unsafe IRQ mask restoration patterns. (I.e. calling raw_local_irq_restore() with IRQs enabled.) - Add wait context self-tests - Fix graph lock corner case corrupting internal data structures - Fix noinstr annotations LKMM updates: - Simplify the litmus tests - Documentation fixes KCSAN updates: - Re-enable KCSAN instrumentation in lib/random32.c Misc fixes: - Don't branch-trace static label APIs - DocBook fix - Remove stale leftover empty file" * tag 'locking-core-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) checkpatch: Don't check for mutex_trylock_recursive() locking/mutex: Kill mutex_trylock_recursive() s390: Use arch_local_irq_{save,restore}() in early boot code lockdep: Noinstr annotate warn_bogus_irq_restore() locking/lockdep: Avoid unmatched unlock locking/rwsem: Remove empty rwsem.h locking/rtmutex: Add missing kernel-doc markup futex: Remove unneeded gotos futex: Change utime parameter to be 'const ... *' lockdep: report broken irq restoration jump_label: Do not profile branch annotations locking: Add Reviewers locking/selftests: Add local_lock inversion tests locking/lockdep: Exclude local_lock_t from IRQ inversions locking/lockdep: Clean up check_redundant() a bit locking/lockdep: Add a skip() function to __bfs() locking/lockdep: Mark local_lock_t locking/selftests: More granular debug_locks_verbose lockdep/selftest: Add wait context selftests tools/memory-model: Fix typo in klitmus7 compatibility table ...
2021-02-21Merge tag 'core-rcu-2021-02-17' of ↵Linus Torvalds15-80/+758
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "These are the latest RCU updates for v5.12: - Documentation updates. - Miscellaneous fixes. - kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return addresses to more easily locate bugs. This has a couple of RCU-related commits, but is mostly MM. Was pulled in with akpm's agreement. - Per-callback-batch tracking of numbers of callbacks, which enables better debugging information and smarter reactions to large numbers of callbacks. - The first round of changes to allow CPUs to be runtime switched from and to callback-offloaded state. - CONFIG_PREEMPT_RT-related changes. - RCU CPU stall warning updates. - Addition of polling grace-period APIs for SRCU. - Torture-test and torture-test scripting updates, including a "torture everything" script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale. Plus does an allmodconfig build. - nolibc fixes for the torture tests" * tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits) percpu_ref: Dump mem_dump_obj() info upon reference-count underflow rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback mm: Make mem_obj_dump() vmalloc() dumps include start and length mm: Make mem_dump_obj() handle vmalloc() memory mm: Make mem_dump_obj() handle NULL and zero-sized pointers mm: Add mem_dump_obj() to print source of memory block tools/rcutorture: Fix position of -lgcc in mkinitrd.sh tools/nolibc: Fix position of -lgcc in the documented example tools/nolibc: Emit detailed error for missing alternate syscall number definitions tools/nolibc: Remove incorrect definitions of __ARCH_WANT_* tools/nolibc: Get timeval, timespec and timezone from linux/time.h tools/nolibc: Implement poll() based on ppoll() tools/nolibc: Implement fork() based on clone() tools/nolibc: Make getpgrp() fall back to getpgid(0) tools/nolibc: Make dup2() rely on dup3() when available tools/nolibc: Add the definition for dup() rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01 torture: Maintain torture-specific set of CPUs-online books torture: Clean up after torture-test CPU hotplugging rcutorture: Make object_debug also double call_rcu() heap object ...
2021-02-21Merge tag 'acpi-5.12-rc1' of ↵Linus Torvalds10-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These update the ACPICA code in the kernel to upstream revision 20210105, fix and clean up the handling of device properties, add support for setting global profile of the platform, clean up device enumeration, the CPPC library, the APEI support and more, update the documentation, consolidate the printing of messages in several places and make assorted janitorial changes. Specifics: - Update ACPICA code in the kernel to upstream revision 20201113 with changes as follows: * Remove the MTMR (Mid-Timer) table (Al Stone). * Remove the VRTC table (Al Stone). * Add type casts for string functions (Bob Moore). * Update all copyrights to 2021 (Bob Moore). * Fix exception code class checks (Maximilian Luz). * Clean up exception code class checks (Maximilian Luz). * Fix -Wfallthrough (Nick Desaulniers). - Add support for setting and reading global profile of the platform along with documentation (Mark Pearson, Hans de Goede, Jiaxun Yang). - Fix fwnode properties matching and clean up the code handling device properties and its documentation (Rafael Wysocki, Andy Shevchenko). - Clean up ACPI-based device enumeration code (Rafael Wysocki). - Clean up the CPPC support library code (Ionela Voinescu). - Clean up the APEI support code (Yang Li, Yazen Ghannam). - Update GPIO-related properties documentation (Flavio Suligoi). - Consolidate and clean up the printing of messages in several places (Rafael Wysocki). - Fix error code path in configfs handling code (Qinglang Miao). - Use DEVICE_ATTR_<RW|RO|WO> macros where applicable (Dwaipayan Ray). - Replace tests for !ACPI_FAILURE with tests for ACPI_SUCCESS in multiple places (Bjorn Helgaas)" * tag 'acpi-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (44 commits) ACPI: property: Satisfy kernel doc validator (part 2) ACPI: property: Satisfy kernel doc validator (part 1) ACPI: property: Make acpi_node_prop_read() static ACPI: property: Remove dead code ACPI: property: Fix fwnode string properties matching ACPI: OSL: Clean up printing messages ACPI: OSL: Rework acpi_check_resource_conflict() ACPI: APEI: ERST: remove unneeded semicolon ACPI: thermal: Clean up printing messages ACPI: video: Clean up printing messages ACPI: button: Clean up printing messages ACPI: battery: Clean up printing messages ACPI: AC: Clean up printing messages ACPI: bus: Drop ACPI_BUS_COMPONENT which is not used any more ACPI: utils: Clean up printing messages ACPI: scan: Clean up printing messages ACPI: bus: Clean up printing messages ACPI: PM: Clean up printing messages ACPI: power: Clean up printing messages ACPI: APEI: Add is_generic_error() to identify GHES sources ...
2021-02-21Merge tag 'pm-5.12-rc1' of ↵Linus Torvalds7-59/+62
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These add a new power capping facility allowing aggregate power constraints to be applied to sets of devices in a distributed manner, add a new CPU ID to the RAPL power capping driver and improve it, drop a cpufreq driver belonging to a platform that is not supported any more, drop two redundant cpufreq driver flags, update cpufreq drivers (intel_pstate, brcmstb-avs, qcom-hw), update the operating performance points (OPP) framework (code cleanups, new helpers, devfreq-related modifications), clean up devfreq, extend the PM clock layer, update the cpupower utility and make assorted janitorial changes. Specifics: - Add new power capping facility called DTPM (Dynamic Thermal Power Management), based on the existing power capping framework, to allow aggregate power constraints to be applied to sets of devices in a distributed manner, along with a CPU backend driver based on the Energy Model (Daniel Lezcano, Dan Carpenter, Colin Ian King). - Add AlderLake Mobile support to the Intel RAPL power capping driver and make it use the topology interface when laying out the system topology (Zhang Rui, Yunfeng Ye). - Drop the cpufreq tango driver belonging to a platform that is not supported any more (Arnd Bergmann). - Drop the redundant CPUFREQ_STICKY and CPUFREQ_PM_NO_WARN cpufreq driver flags (Viresh Kumar). - Update cpufreq drivers: * Fix max CPU frequency discovery in the intel_pstate driver and make janitorial changes in it (Chen Yu, Rafael Wysocki, Nigel Christian). * Fix resource leaks in the brcmstb-avs-cpufreq driver (Christophe JAILLET). * Make the tegra20 driver use the resource-managed API (Dmitry Osipenko). * Enable boost support in the qcom-hw driver (Shawn Guo). - Update the operating performance points (OPP) framework: * Clean up the OPP core (Dmitry Osipenko, Viresh Kumar). * Extend the OPP API by adding new helpers to it (Dmitry Osipenko, Viresh Kumar). * Allow required OPPs to be used for devfreq devices and update the devfreq governor code accordingly (Saravana Kannan). * Prepare the framework for introducing new dev_pm_opp_set_opp() helper (Viresh Kumar). * Drop dev_pm_opp_set_bw() and update related drivers (Viresh Kumar). * Allow lazy linking of required-OPPs (Viresh Kumar). - Simplify and clean up devfreq somewhat (Lukasz Luba, Yang Li, Pierre Kuo). - Update the generic power domains (genpd) framework: * Use device's next wakeup to determine domain idle state (Lina Iyer). * Improve initialization and debug (Dmitry Osipenko). * Simplify computations (Abaci Team). - Make janitorial changes in the core code handling system sleep and PM-runtime (Bhaskar Chowdhury, Bjorn Helgaas, Rikard Falkeborn, Zqiang). - Update the MAINTAINERS entry for the exynos cpuidle driver and drop DEBUG definition from intel_idle (Krzysztof Kozlowski, Tom Rix). - Extend the PM clock layer to cover clocks that must sleep (Nicolas Pitre). - Update the cpupower utility: * Update cpupower command, add support for AMD family 0x19 and clean up the code to remove many of the family checks to make future family updates easier (Nathan Fontenot, Robert Richter). * Add Makefile dependencies for install targets to allow building cpupower in parallel rather than serially (Ivan Babrou). - Make janitorial changes in power management Kconfig (Lukasz Luba)" * tag 'pm-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (89 commits) MAINTAINERS: cpuidle: exynos: include header in file pattern powercap: intel_rapl: Use topology interface in rapl_init_domains() powercap: intel_rapl: Use topology interface in rapl_add_package() PM: sleep: Constify static struct attribute_group PM: Kconfig: remove unneeded "default n" options PM: EM: update Kconfig description and drop "default n" option cpufreq: Remove unused flag CPUFREQ_PM_NO_WARN cpufreq: Remove CPUFREQ_STICKY flag PM / devfreq: Add required OPPs support to passive governor PM / devfreq: Cache OPP table reference in devfreq OPP: Add function to look up required OPP's for a given OPP PM / devfreq: rk3399_dmc: Remove unneeded semicolon opp: Replace ENOTSUPP with EOPNOTSUPP opp: Fix "foo * bar" should be "foo *bar" opp: Don't ignore clk_get() errors other than -ENOENT opp: Update bandwidth requirements based on scaling up/down opp: Allow lazy-linking of required-opps opp: Remove dev_pm_opp_set_bw() devfreq: tegra30: Migrate to dev_pm_opp_set_opp() drm: msm: Migrate to dev_pm_opp_set_opp() ...
2021-02-21Merge tag 'x86_cpu_for_v5.12' of ↵Linus Torvalds2-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPUID cleanup from Borislav Petkov: "Assign a dedicated feature word to a CPUID leaf which is widely used" * tag 'x86_cpu_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpufeatures: Assign dedicated feature word for CPUID_0x8000001F[EAX]
2021-02-21Merge tag 'x86_misc_for_v5.12' of ↵Linus Torvalds1-20/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 misc updates from Borislav Petkov: - Complete the MSR write filtering by applying it to the MSR ioctl interface too. - Other misc small fixups. * tag 'x86_misc_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/MSR: Filter MSR writes through X86_IOC_WRMSR_REGS ioctl too selftests/fpu: Fix debugfs_simple_attr.cocci warning selftests/x86: Use __builtin_ia32_read/writeeflags x86/reboot: Add Zotac ZBOX CI327 nano PCI reboot quirk
2021-02-17static_call: Allow module use without exposing static_call_keyJosh Poimboeuf2-4/+40
When exporting static_call_key; with EXPORT_STATIC_CALL*(), the module can use static_call_update() to change the function called. This is not desirable in general. Not exporting static_call_key however also disallows usage of static_call(), since objtool needs the key to construct the static_call_site. Solve this by allowing objtool to create the static_call_site using the trampoline address when it builds a module and cannot find the static_call_key symbol. The module loader will then try and map the trampole back to a key before it constructs the normal sites list. Doing this requires a trampoline -> key associsation, so add another magic section that keeps those. Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210127231837.ifddpn7rhwdaepiu@treble
2021-02-17static_call: Pull some static_call declarations to the type headersPeter Zijlstra1-0/+27
Some static call declarations are going to be needed on low level header files. Move the necessary material to the dedicated static call types header to avoid inclusion dependency hell. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lkml.kernel.org/r/20210118141223.123667-4-frederic@kernel.org
2021-02-17rbtree: Add generic add and find helpersPeter Zijlstra2-63/+202
I've always been bothered by the endless (fragile) boilerplate for rbtree, and I recently wrote some rbtree helpers for objtool and figured I should lift them into the kernel and use them more widely. Provide: partial-order; less() based: - rb_add(): add a new entry to the rbtree - rb_add_cached(): like rb_add(), but for a rb_root_cached total-order; cmp() based: - rb_find(): find an entry in an rbtree - rb_find_add(): find an entry, and add if not found - rb_find_first(): find the first (leftmost) matching entry - rb_next_match(): continue from rb_find_first() - rb_for_each(): iterate a sub-tree using the previous two Inlining and constant propagation should see the compiler inline the whole thing, including the various compare functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Michel Lespinasse <walken@google.com> Acked-by: Davidlohr Bueso <dbueso@suse.de>