diff options
author | Eduard Zingerman <eddyz87@gmail.com> | 2022-06-21 02:53:42 +0300 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2022-06-21 03:40:51 +0300 |
commit | 1ade23711971b0eececf0d7fedc29d3c1d2fce01 (patch) | |
tree | 8e32aa6e6109037ea5020965f237ae3821b17999 /tools | |
parent | 7a42008ca5c700819e4b3003025e5e1695fd1f86 (diff) | |
download | linux-1ade23711971b0eececf0d7fedc29d3c1d2fce01.tar.xz |
bpf: Inline calls to bpf_loop when callback is known
Calls to `bpf_loop` are replaced with direct loops to avoid
indirection. E.g. the following:
bpf_loop(10, foo, NULL, 0);
Is replaced by equivalent of the following:
for (int i = 0; i < 10; ++i)
foo(i, NULL);
This transformation could be applied when:
- callback is known and does not change during program execution;
- flags passed to `bpf_loop` are always zero.
Inlining logic works as follows:
- During execution simulation function `update_loop_inline_state`
tracks the following information for each `bpf_loop` call
instruction:
- is callback known and constant?
- are flags constant and zero?
- Function `optimize_bpf_loop` increases stack depth for functions
where `bpf_loop` calls can be inlined and invokes `inline_bpf_loop`
to apply the inlining. The additional stack space is used to spill
registers R6, R7 and R8. These registers are used as loop counter,
loop maximal bound and callback context parameter;
Measurements using `benchs/run_bench_bpf_loop.sh` inside QEMU / KVM on
i7-4710HQ CPU show a drop in latency from 14 ns/op to 2 ns/op.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220620235344.569325-4-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'tools')
0 files changed, 0 insertions, 0 deletions