diff options
| author | Alexei Starovoitov <ast@kernel.org> | 2026-05-13 19:27:32 +0300 |
|---|---|---|
| committer | Alexei Starovoitov <ast@kernel.org> | 2026-05-13 19:27:33 +0300 |
| commit | cd59fa185a031417d2699c68172676f4671153d8 (patch) | |
| tree | 6b399bb886ef8e68945a5dfd3b5b08009bbca7b6 /include/linux | |
| parent | 6318f11d53a3caabea5337ccf83612e52d1bd55a (diff) | |
| parent | 90e43f1b47535cc7aceef3add1a61ba3260b7aee (diff) | |
| download | linux-cd59fa185a031417d2699c68172676f4671153d8.tar.xz | |
Merge branch 'bpf-support-stack-arguments-for-bpf-functions-and-kfuncs'
Yonghong Song says:
====================
bpf: Support stack arguments for BPF functions and kfuncs
Currently, bpf function calls and kfunc's are limited by 5 reg-level
parameters. For function calls with more than 5 parameters,
developers can use always inlining or pass a struct pointer
after packing more parameters in that struct although it may have
some inconvenience. But there is no workaround for kfunc if more
than 5 parameters is needed.
This patch set lifts the 5-argument limit by introducing stack-based
argument passing for BPF functions and kfunc's, coordinated with
compiler support in LLVM [1]. The compiler emits loads/stores through
a new bpf register r11 (BPF_REG_PARAMS), to pass arguments beyond
the 5th, keeping the stack arg area separate from the r10-based program
stack. The current maximum number of arguments is capped at
MAX_BPF_FUNC_ARGS (12), which is sufficient for the vast majority of
use cases.
All kfunc/bpf-function arguments are caller saved, including stack
arguments. For register arguments (r1-r5), the verifier already marks
them as clobbered after each call. For stack arguments, the verifier
invalidates all outgoing stack arg slots immediately after a call,
requiring the compiler to re-store them before any subsequent call.
This follows the native calling convention where all function
parameters are caller saved.
The x86_64 JIT translates r11-relative accesses to RBP-relative
native instructions. Each function's stack allocation is extended
by 'max_outgoing' bytes to hold the outgoing arg area below the
callee-saved registers. This makes implementation easier as the r10
can be reused for stack argument access. At both BPF-to-BPF and kfunc
calls, outgoing args are pushed onto the expected calling convention
locations directly. The incoming parameters can directly get the value
from caller.
Global subprogs and freplace progs with >5 args are not yet supported.
Only x86_64 and arm64 are supported for now. Same selftests are tested
by both x86_64 and arm64. Please see each individual patch for details.
[1] https://github.com/llvm/llvm-project/pull/189060
Changelogs:
v3 -> v4:
- v3: https://lore.kernel.org/bpf/20260511053301.1878610-1-yonghong.song@linux.dev/
- Added no_stack_arg_load comparison in func_states_equal() to ensure
correctness of pruning.
- Shrink bpf_jmp_history_entry.flags to 4bit to match the number of flags.
- Instead of passing bpf_subprog_info to JIT, use prog->aux->func_idx to
find corresponding bpf_subprog_info from 'env'.
- For patch 'bpf: Reject stack arguments if tail call reachable', use stack_arg_cnt
instead of just incoming stack arg cnt.
- Tighten invalidate_outgoing_stack_args() for kfunc/helper/bpf-to-bpf calls.
- Disable private stack in verifier for x86_64 instead of in JIT.
v2 -> v3:
- v2: https://lore.kernel.org/bpf/20260507212942.1122000-1-yonghong.song@linux.dev/
- In do_check_common() and for main prog, if btf does not match with actual
parameter, the verification will continue and will ignore arg_cnt. Make
arg_cnt=1 explictly to prevent any incoming stack arguments.
- Remove the loop which clear current frame stack slot and set the upper level frame
stack slot. This is not needed unless there is a bug. Add a verifier_bug
if the bug happens.
- For liveness, avoid r11 based load/stores mixing with r10 based stack tracking.
Also, print out stack arguments properly.
- Pass bpf_subprog_info the JIT so we can avoid copy bpf_subprog_info fields to
bpf_prog_aux.
- Fix the missed allocation free for test infra BTF fixup.
- Remove selftest result for precision backtracking test since the result would
be change (two possible output).
v1 -> v2:
- v1: https://lore.kernel.org/bpf/20260424171433.2034470-1-yonghong.song@linux.dev/
- Several refactoring (convert bpf_get_spilled_reg macro to static inline func,
Remove copy_register_state(), Refactor jmp history, Refactor record_call_access(), etc),
suggested by Eduard.
- Use incoming_stack_arg_cnt/stack_arg_cnt instead of incoming_stack_arg_depth/stack_arg_depth,
suggested by Eduard.
- Fix a stack arg pruning bug, from Eduard.
- Fix a bug for precision marking and backtracking, basically callee needs to get the
stack arg value from callers, helped from Eduard.
- Set sub->arg_cnt earlier in btf_prepare_func_args(), this will avoid having
incoming_stack_arg_cnt in bpf_subprog_info.
- Do stack-arg liveness analysis together with r10 based liveness analysis,
suggested by Eduard.
- Fix a few tests to ensure that r11-based loads cannot be ahead of r11-based stores,
and r11-based loads cannot be after kfunc/helper/bpf-function.
====================
Link: https://patch.msgid.link/20260513044949.2382019-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/bpf.h | 1 | ||||
| -rw-r--r-- | include/linux/bpf_verifier.h | 98 | ||||
| -rw-r--r-- | include/linux/filter.h | 22 |
3 files changed, 93 insertions, 28 deletions
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 9e16e91647d3..242f9597d9ab 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1548,6 +1548,7 @@ void bpf_jit_uncharge_modmem(u32 size); bool bpf_prog_has_trampoline(const struct bpf_prog *prog); bool bpf_insn_is_indirect_target(const struct bpf_verifier_env *env, const struct bpf_prog *prog, int insn_idx); +u16 bpf_out_stack_arg_cnt(const struct bpf_verifier_env *env, const struct bpf_prog *prog); #else static inline int bpf_trampoline_link_prog(struct bpf_tramp_link *link, struct bpf_trampoline *tr, diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index c15a4c26a43b..6f12fc40b682 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -402,6 +402,7 @@ struct bpf_func_state { bool in_callback_fn; bool in_async_callback_fn; bool in_exception_callback_fn; + bool no_stack_arg_load; /* For callback calling functions that limit number of possible * callback executions (e.g. bpf_loop) keeps track of current * simulated iteration number. @@ -427,46 +428,48 @@ struct bpf_func_state { * `stack`. allocated_stack is always a multiple of BPF_REG_SIZE. */ int allocated_stack; + + u16 out_stack_arg_cnt; /* Number of outgoing on-stack argument slots */ + struct bpf_reg_state *stack_arg_regs; /* Outgoing on-stack arguments */ }; #define MAX_CALL_FRAMES 8 -/* instruction history flags, used in bpf_jmp_history_entry.flags field */ +/* instruction history flags, used in bpf_jmp_history_entry.flags field. + * Frame number and SPI are stored in dedicated fields of bpf_jmp_history_entry. + */ enum { - /* instruction references stack slot through PTR_TO_STACK register; - * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) - * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, - * 8 bytes per slot, so slot index (spi) is [0, 63]) - */ - INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ + INSN_F_STACK_ACCESS = BIT(0), - INSN_F_SPI_MASK = 0x3f, /* 6 bits */ - INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ + INSN_F_DST_REG_STACK = BIT(1), /* dst_reg is PTR_TO_STACK */ + INSN_F_SRC_REG_STACK = BIT(2), /* src_reg is PTR_TO_STACK */ - INSN_F_STACK_ACCESS = BIT(9), - - INSN_F_DST_REG_STACK = BIT(10), /* dst_reg is PTR_TO_STACK */ - INSN_F_SRC_REG_STACK = BIT(11), /* src_reg is PTR_TO_STACK */ - /* total 12 bits are used now. */ + INSN_F_STACK_ARG_ACCESS = BIT(3), }; -static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES); -static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8); - struct bpf_jmp_history_entry { - u32 idx; /* insn idx can't be bigger than 1 million */ + u32 idx : 20; + u32 frame : 3; /* stack access frame number */ + u32 spi : 6; /* stack slot index (0..63) */ + u32 : 3; u32 prev_idx : 20; /* special INSN_F_xxx flags */ - u32 flags : 12; + u32 flags : 4; + u32 : 8; /* additional registers that need precision tracking when this * jump is backtracked, vector of six 10-bit records */ u64 linked_regs; }; -/* Maximum number of register states that can exist at once */ -#define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES) +static_assert(MAX_CALL_FRAMES <= (1 << 3)); +static_assert(MAX_BPF_STACK / 8 <= (1 << 6)); + +/* Maximum number of bpf_reg_state objects that can exist at once */ +#define MAX_STACK_ARG_SLOTS (MAX_BPF_FUNC_ARGS - MAX_BPF_FUNC_REG_ARGS) +#define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE + \ + MAX_STACK_ARG_SLOTS) * MAX_CALL_FRAMES) struct bpf_verifier_state { /* call stack tracking */ struct bpf_func_state *frame[MAX_CALL_FRAMES]; @@ -552,10 +555,23 @@ struct bpf_verifier_state { u32 may_goto_depth; }; -#define bpf_get_spilled_reg(slot, frame, mask) \ - (((slot < frame->allocated_stack / BPF_REG_SIZE) && \ - ((1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & (mask))) \ - ? &frame->stack[slot].spilled_ptr : NULL) +static inline struct bpf_reg_state * +bpf_get_spilled_reg(int slot, struct bpf_func_state *frame, u32 mask) +{ + if (slot < frame->allocated_stack / BPF_REG_SIZE && + (1 << frame->stack[slot].slot_type[BPF_REG_SIZE - 1]) & mask) + return &frame->stack[slot].spilled_ptr; + return NULL; +} + +static inline struct bpf_reg_state * +bpf_get_spilled_stack_arg(int slot, struct bpf_func_state *frame) +{ + if (slot < frame->out_stack_arg_cnt && + frame->stack_arg_regs[slot].type != NOT_INIT) + return &frame->stack_arg_regs[slot]; + return NULL; +} /* Iterate over 'frame', setting 'reg' to either NULL or a spilled register. */ #define bpf_for_each_spilled_reg(iter, frame, reg, mask) \ @@ -563,6 +579,12 @@ struct bpf_verifier_state { iter < frame->allocated_stack / BPF_REG_SIZE; \ iter++, reg = bpf_get_spilled_reg(iter, frame, mask)) +/* Iterate over 'frame', setting 'reg' to either NULL or a spilled stack arg. */ +#define bpf_for_each_spilled_stack_arg(iter, frame, reg) \ + for (iter = 0, reg = bpf_get_spilled_stack_arg(iter, frame); \ + iter < frame->out_stack_arg_cnt; \ + iter++, reg = bpf_get_spilled_stack_arg(iter, frame)) + #define bpf_for_each_reg_in_vstate_mask(__vst, __state, __reg, __mask, __expr) \ ({ \ struct bpf_verifier_state *___vstate = __vst; \ @@ -580,6 +602,11 @@ struct bpf_verifier_state { continue; \ (void)(__expr); \ } \ + bpf_for_each_spilled_stack_arg(___j, __state, __reg) { \ + if (!__reg) \ + continue; \ + (void)(__expr); \ + } \ } \ }) @@ -811,12 +838,21 @@ struct bpf_subprog_info { bool keep_fastcall_stack: 1; bool changes_pkt_data: 1; bool might_sleep: 1; - u8 arg_cnt:3; + u8 arg_cnt:4; enum priv_stack_mode priv_stack_mode; - struct bpf_subprog_arg_info args[MAX_BPF_FUNC_REG_ARGS]; + struct bpf_subprog_arg_info args[MAX_BPF_FUNC_ARGS]; + u16 stack_arg_cnt; /* incoming + max outgoing */ + u16 max_out_stack_arg_cnt; }; +static inline u16 bpf_in_stack_arg_cnt(const struct bpf_subprog_info *sub) +{ + if (sub->arg_cnt > MAX_BPF_FUNC_REG_ARGS) + return sub->arg_cnt - MAX_BPF_FUNC_REG_ARGS; + return 0; +} + struct bpf_verifier_env; struct backtrack_state { @@ -824,6 +860,7 @@ struct backtrack_state { u32 frame; u32 reg_masks[MAX_CALL_FRAMES]; u64 stack_masks[MAX_CALL_FRAMES]; + u8 stack_arg_masks[MAX_CALL_FRAMES]; }; struct bpf_id_pair { @@ -1159,7 +1196,7 @@ struct list_head *bpf_explored_state(struct bpf_verifier_env *env, int idx); void bpf_free_verifier_state(struct bpf_verifier_state *state, bool free_self); void bpf_free_backedges(struct bpf_scc_visit *visit); int bpf_push_jmp_history(struct bpf_verifier_env *env, struct bpf_verifier_state *cur, - int insn_flags, u64 linked_regs); + int insn_flags, int spi, int frame, u64 linked_regs); void bpf_bt_sync_linked_regs(struct backtrack_state *bt, struct bpf_jmp_history_entry *hist); void bpf_mark_reg_not_init(const struct bpf_verifier_env *env, struct bpf_reg_state *reg); @@ -1222,6 +1259,11 @@ static inline void bpf_bt_set_frame_slot(struct backtrack_state *bt, u32 frame, bt->stack_masks[frame] |= 1ull << slot; } +static inline void bt_set_frame_stack_arg_slot(struct backtrack_state *bt, u32 frame, u32 slot) +{ + bt->stack_arg_masks[frame] |= 1 << slot; +} + static inline bool bt_is_frame_reg_set(struct backtrack_state *bt, u32 frame, u32 reg) { return bt->reg_masks[frame] & (1 << reg); diff --git a/include/linux/filter.h b/include/linux/filter.h index b77d0b06db6e..a515a9769078 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -749,6 +749,27 @@ static inline u32 bpf_prog_run_pin_on_cpu(const struct bpf_prog *prog, return ret; } +static inline bool is_stack_arg_ldx(const struct bpf_insn *insn) +{ + return insn->code == (BPF_LDX | BPF_MEM | BPF_DW) && + insn->src_reg == BPF_REG_PARAMS && + insn->off > 0 && insn->off % 8 == 0; +} + +static inline bool is_stack_arg_st(const struct bpf_insn *insn) +{ + return insn->code == (BPF_ST | BPF_MEM | BPF_DW) && + insn->dst_reg == BPF_REG_PARAMS && + insn->off < 0 && insn->off % 8 == 0; +} + +static inline bool is_stack_arg_stx(const struct bpf_insn *insn) +{ + return insn->code == (BPF_STX | BPF_MEM | BPF_DW) && + insn->dst_reg == BPF_REG_PARAMS && + insn->off < 0 && insn->off % 8 == 0; +} + #define BPF_SKB_CB_LEN QDISC_CB_PRIV_LEN struct bpf_skb_data_end { @@ -1163,6 +1184,7 @@ bool bpf_jit_inlines_helper_call(s32 imm); bool bpf_jit_supports_subprog_tailcalls(void); bool bpf_jit_supports_percpu_insn(void); bool bpf_jit_supports_kfunc_call(void); +bool bpf_jit_supports_stack_args(void); bool bpf_jit_supports_far_kfunc_call(void); bool bpf_jit_supports_exceptions(void); bool bpf_jit_supports_ptr_xchg(void); |
