diff options
author | Alexei Starovoitov <ast@kernel.org> | 2019-04-27 05:05:44 +0300 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2019-04-27 05:05:44 +0300 |
commit | 3745dc24aa7a6a45771a6b441194b857ac615cfc (patch) | |
tree | 9fe461c7f01adec95c4f66d9a4748afa3348643b /kernel/bpf/verifier.c | |
parent | 34b8ab091f9ef57a2bb3c8c8359a0a03a8abf2f9 (diff) | |
parent | e950e843367d7990b9d7ea964e3c33876d477c4b (diff) | |
download | linux-3745dc24aa7a6a45771a6b441194b857ac615cfc.tar.xz |
Merge branch 'writeable-bpf-tracepoints'
Matt Mullins says:
====================
This adds an opt-in interface for tracepoints to expose a writable context to
BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE programs that are attached, while
supporting read-only access from existing BPF_PROG_TYPE_RAW_TRACEPOINT
programs, as well as from non-BPF-based tracepoints.
The initial motivation is to support tracing that can be observed from the
remote end of an NBD socket, e.g. by adding flags to the struct nbd_request
header. Earlier attempts included adding an NBD-specific tracepoint fd, but in
code review, I was recommended to implement it more generically -- as a result,
this patchset is far simpler than my initial try.
v4->v5:
* rebased onto bpf-next/master and fixed merge conflicts
* "tools: sync bpf.h" also syncs comments that have previously changed
in bpf-next
v3->v4:
* fixed a silly copy/paste typo in include/trace/events/bpf_test_run.h
(_TRACE_NBD_H -> _TRACE_BPF_TEST_RUN_H)
* fixed incorrect/misleading wording in patch 1's commit message,
since the pointer cannot be directly dereferenced in a
BPF_PROG_TYPE_RAW_TRACEPOINT
* cleaned up the error message wording if the prog_tests fail
* Addressed feedback from Yonghong
* reject non-pointer-sized accesses to the buffer pointer
* use sizeof(struct nbd_request) as one-byte-past-the-end in
raw_tp_writable_reject_nbd_invalid.c
* use BPF_MOV64_IMM instead of BPF_LD_IMM64
v2->v3:
* Andrew addressed Josef's comments:
* C-style commenting in nbd.c
* Collapsed identical events into a single DECLARE_EVENT_CLASS.
This saves about 2kB of kernel text
v1->v2:
* add selftests
* sync tools/include/uapi/linux/bpf.h
* reject variable offset into the buffer
* add string representation of PTR_TO_TP_BUFFER to reg_type_str
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'kernel/bpf/verifier.c')
-rw-r--r-- | kernel/bpf/verifier.c | 31 |
1 files changed, 31 insertions, 0 deletions
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 423f242a5efb..2ef442c62c0e 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -405,6 +405,7 @@ static const char * const reg_type_str[] = { [PTR_TO_SOCK_COMMON_OR_NULL] = "sock_common_or_null", [PTR_TO_TCP_SOCK] = "tcp_sock", [PTR_TO_TCP_SOCK_OR_NULL] = "tcp_sock_or_null", + [PTR_TO_TP_BUFFER] = "tp_buffer", }; static char slot_type_char[] = { @@ -1993,6 +1994,32 @@ static int check_ctx_reg(struct bpf_verifier_env *env, return 0; } +static int check_tp_buffer_access(struct bpf_verifier_env *env, + const struct bpf_reg_state *reg, + int regno, int off, int size) +{ + if (off < 0) { + verbose(env, + "R%d invalid tracepoint buffer access: off=%d, size=%d", + regno, off, size); + return -EACCES; + } + if (!tnum_is_const(reg->var_off) || reg->var_off.value) { + char tn_buf[48]; + + tnum_strn(tn_buf, sizeof(tn_buf), reg->var_off); + verbose(env, + "R%d invalid variable buffer offset: off=%d, var_off=%s", + regno, off, tn_buf); + return -EACCES; + } + if (off + size > env->prog->aux->max_tp_access) + env->prog->aux->max_tp_access = off + size; + + return 0; +} + + /* truncate register to smaller size (in bytes) * must be called with size < BPF_REG_SIZE */ @@ -2137,6 +2164,10 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn err = check_sock_access(env, insn_idx, regno, off, size, t); if (!err && value_regno >= 0) mark_reg_unknown(env, regs, value_regno); + } else if (reg->type == PTR_TO_TP_BUFFER) { + err = check_tp_buffer_access(env, reg, regno, off, size); + if (!err && t == BPF_READ && value_regno >= 0) + mark_reg_unknown(env, regs, value_regno); } else { verbose(env, "R%d invalid mem access '%s'\n", regno, reg_type_str[reg->type]); |