diff options
author | Andrii Nakryiko <andrii@kernel.org> | 2022-04-04 03:55:46 +0300 |
---|---|---|
committer | Andrii Nakryiko <andrii@kernel.org> | 2022-04-04 05:56:01 +0300 |
commit | 88d01a57113a361c1e62abaeed0563745d637888 (patch) | |
tree | d7a5856aa921c106e1faf8f8c33cba8450292298 /tools/lib/bpf | |
parent | 85bf1f51691c078e77f524a7addf37faa6635a6e (diff) | |
parent | 579c3196b21800538e0cfd512a05619ee80fb19a (diff) | |
download | linux-88d01a57113a361c1e62abaeed0563745d637888.tar.xz |
Merge branch 'libbpf: name-based u[ret]probe attach'
Alan Maguire says:
====================
This patch series focuses on supporting name-based attach - similar
to that supported for kprobes - for uprobe BPF programs.
Currently attach for such probes is done by determining the offset
manually, so the aim is to try and mimic the simplicity of kprobe
attach, making use of uprobe opts to specify a name string.
Patch 1 supports expansion of the binary_path argument used for
bpf_program__attach_uprobe_opts(), allowing it to determine paths
for programs and shared objects automatically, allowing for
specification of "libc.so.6" rather than the full path
"/usr/lib64/libc.so.6".
Patch 2 adds the "func_name" option to allow uprobe attach by
name; the mechanics are described there.
Having name-based support allows us to support auto-attach for
uprobes; patch 3 adds auto-attach support while attempting
to handle backwards-compatibility issues that arise. The format
supported is
u[ret]probe/binary_path:[raw_offset|function[+offset]]
For example, to attach to libc malloc:
SEC("uprobe//usr/lib64/libc.so.6:malloc")
..or, making use of the path computation mechanisms introduced in patch 1
SEC("uprobe/libc.so.6:malloc")
Finally patch 4 add tests to the attach_probe selftests covering
attach by name, with patch 5 covering skeleton auto-attach.
Changes since v4 [1]:
- replaced strtok_r() usage with copying segments from static char *; avoids
unneeded string allocation (Andrii, patch 1)
- switched to using access() instead of stat() when checking path-resolved
binary (Andrii, patch 1)
- removed computation of .plt offset for instrumenting shared library calls
within binaries. Firstly it proved too brittle, and secondly it was somewhat
unintuitive in that this form of instrumentation did not support function+offset
as the "local function in binary" and "shared library function in shared library"
cases did. We can still instrument library calls, just need to do it in the
library .so (patch 2)
- added binary path logging in cases where it was missing (Andrii, patch 2)
- avoid strlen() calcuation in checking name match (Andrii, patch 2)
- reword comments for func_name option (Andrii, patch 2)
- tightened SEC() name validation to support "u[ret]probe" and fail on other
permutations that do not support auto-attach (i.e. have u[ret]probe/binary_path:func
format (Andrii, patch 3)
- fixed selftests to fail independently rather than skip remainder on failure
(Andrii, patches 4,5)
Changes since v3 [2]:
- reworked variable naming to fit better with libbpf conventions
(Andrii, patch 2)
- use quoted binary path in log messages (Andrii, patch 2)
- added path determination mechanisms using LD_LIBRARY_PATH/PATH and
standard locations (patch 1, Andrii)
- changed section lookup to be type+name (if name is specified) to
simplify use cases (patch 2, Andrii)
- fixed .plt lookup scheme to match symbol table entries with .plt
index via the .rela.plt table; also fix the incorrect assumption
that the code in the .plt that does library linking is the same
size as .plt entries (it just happens to be on x86_64)
- aligned with pluggable section support such that uprobe SEC() names
that do not conform to auto-attach format do not cause skeleton load
failure (patch 3, Andrii)
- no longer need to look up absolute path to libraries used by test_progs
since we have mechanism to determine path automatically
- replaced CHECK()s with ASSERT*()s for attach_probe test (Andrii, patch 4)
- added auto-attach selftests also (Andrii, patch 5)
Changes since RFC [3]:
- used "long" for addresses instead of ssize_t (Andrii, patch 1).
- used gelf_ interfaces to avoid assumptions about 64-bit
binaries (Andrii, patch 1)
- clarified string matching in symbol table lookups
(Andrii, patch 1)
- added support for specification of shared object functions
in a non-shared object binary. This approach instruments
the Procedure Linking Table (PLT) - malloc@PLT.
- changed logic in symbol search to check dynamic symbol table
first, then fall back to symbol table (Andrii, patch 1).
- modified auto-attach string to require "/" separator prior
to path prefix i.e. uprobe//path/to/binary (Andrii, patch 2)
- modified auto-attach string to use ':' separator (Andrii,
patch 2)
- modified auto-attach to support raw offset (Andrii, patch 2)
- modified skeleton attach to interpret -ESRCH errors as
a non-fatal "unable to auto-attach" (Andrii suggested
-EOPNOTSUPP but my concern was it might collide with other
instances where that value is returned and reflects a
failure to attach a to-be-expected attachment rather than
skip a program that does not present an auto-attachable
section name. Admittedly -EOPNOTSUPP seems a more natural
value here).
- moved library path retrieval code to trace_helpers (Andrii,
patch 3)
[1] https://lore.kernel.org/bpf/1647000658-16149-1-git-send-email-alan.maguire@oracle.com/
[2] https://lore.kernel.org/bpf/1643645554-28723-1-git-send-email-alan.maguire@oracle.com/
[3] https://lore.kernel.org/bpf/1642678950-19584-1-git-send-email-alan.maguire@oracle.com/
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Diffstat (limited to 'tools/lib/bpf')
-rw-r--r-- | tools/lib/bpf/libbpf.c | 332 | ||||
-rw-r--r-- | tools/lib/bpf/libbpf.h | 10 |
2 files changed, 338 insertions, 4 deletions
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 809fe209cdcc..6d2be53e4ba9 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -8630,6 +8630,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log } static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link); +static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link); static int attach_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link); static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link); static int attach_trace(const struct bpf_program *prog, long cookie, struct bpf_link **link); @@ -8642,9 +8643,9 @@ static const struct bpf_sec_def section_defs[] = { SEC_DEF("sk_reuseport/migrate", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT_OR_MIGRATE, SEC_ATTACHABLE | SEC_SLOPPY_PFX), SEC_DEF("sk_reuseport", SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX), SEC_DEF("kprobe/", KPROBE, 0, SEC_NONE, attach_kprobe), - SEC_DEF("uprobe/", KPROBE, 0, SEC_NONE), + SEC_DEF("uprobe+", KPROBE, 0, SEC_NONE, attach_uprobe), SEC_DEF("kretprobe/", KPROBE, 0, SEC_NONE, attach_kprobe), - SEC_DEF("uretprobe/", KPROBE, 0, SEC_NONE), + SEC_DEF("uretprobe+", KPROBE, 0, SEC_NONE, attach_uprobe), SEC_DEF("kprobe.multi/", KPROBE, BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi), SEC_DEF("kretprobe.multi/", KPROBE, BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi), SEC_DEF("tc", SCHED_CLS, 0, SEC_NONE), @@ -10517,6 +10518,235 @@ static int perf_event_uprobe_open_legacy(const char *probe_name, bool retprobe, return pfd; } +/* uprobes deal in relative offsets; subtract the base address associated with + * the mapped binary. See Documentation/trace/uprobetracer.rst for more + * details. + */ +static long elf_find_relative_offset(const char *filename, Elf *elf, long addr) +{ + size_t n; + int i; + + if (elf_getphdrnum(elf, &n)) { + pr_warn("elf: failed to find program headers for '%s': %s\n", filename, + elf_errmsg(-1)); + return -ENOENT; + } + + for (i = 0; i < n; i++) { + int seg_start, seg_end, seg_offset; + GElf_Phdr phdr; + + if (!gelf_getphdr(elf, i, &phdr)) { + pr_warn("elf: failed to get program header %d from '%s': %s\n", i, filename, + elf_errmsg(-1)); + return -ENOENT; + } + if (phdr.p_type != PT_LOAD || !(phdr.p_flags & PF_X)) + continue; + + seg_start = phdr.p_vaddr; + seg_end = seg_start + phdr.p_memsz; + seg_offset = phdr.p_offset; + if (addr >= seg_start && addr < seg_end) + return addr - seg_start + seg_offset; + } + pr_warn("elf: failed to find prog header containing 0x%lx in '%s'\n", addr, filename); + return -ENOENT; +} + +/* Return next ELF section of sh_type after scn, or first of that type if scn is NULL. */ +static Elf_Scn *elf_find_next_scn_by_type(Elf *elf, int sh_type, Elf_Scn *scn) +{ + while ((scn = elf_nextscn(elf, scn)) != NULL) { + GElf_Shdr sh; + + if (!gelf_getshdr(scn, &sh)) + continue; + if (sh.sh_type == sh_type) + return scn; + } + return NULL; +} + +/* Find offset of function name in object specified by path. "name" matches + * symbol name or name@@LIB for library functions. + */ +static long elf_find_func_offset(const char *binary_path, const char *name) +{ + int fd, i, sh_types[2] = { SHT_DYNSYM, SHT_SYMTAB }; + bool is_shared_lib, is_name_qualified; + char errmsg[STRERR_BUFSIZE]; + long ret = -ENOENT; + size_t name_len; + GElf_Ehdr ehdr; + Elf *elf; + + fd = open(binary_path, O_RDONLY | O_CLOEXEC); + if (fd < 0) { + ret = -errno; + pr_warn("failed to open %s: %s\n", binary_path, + libbpf_strerror_r(ret, errmsg, sizeof(errmsg))); + return ret; + } + elf = elf_begin(fd, ELF_C_READ_MMAP, NULL); + if (!elf) { + pr_warn("elf: could not read elf from %s: %s\n", binary_path, elf_errmsg(-1)); + close(fd); + return -LIBBPF_ERRNO__FORMAT; + } + if (!gelf_getehdr(elf, &ehdr)) { + pr_warn("elf: failed to get ehdr from %s: %s\n", binary_path, elf_errmsg(-1)); + ret = -LIBBPF_ERRNO__FORMAT; + goto out; + } + /* for shared lib case, we do not need to calculate relative offset */ + is_shared_lib = ehdr.e_type == ET_DYN; + + name_len = strlen(name); + /* Does name specify "@@LIB"? */ + is_name_qualified = strstr(name, "@@") != NULL; + + /* Search SHT_DYNSYM, SHT_SYMTAB for symbol. This search order is used because if + * a binary is stripped, it may only have SHT_DYNSYM, and a fully-statically + * linked binary may not have SHT_DYMSYM, so absence of a section should not be + * reported as a warning/error. + */ + for (i = 0; i < ARRAY_SIZE(sh_types); i++) { + size_t nr_syms, strtabidx, idx; + Elf_Data *symbols = NULL; + Elf_Scn *scn = NULL; + int last_bind = -1; + const char *sname; + GElf_Shdr sh; + + scn = elf_find_next_scn_by_type(elf, sh_types[i], NULL); + if (!scn) { + pr_debug("elf: failed to find symbol table ELF sections in '%s'\n", + binary_path); + continue; + } + if (!gelf_getshdr(scn, &sh)) + continue; + strtabidx = sh.sh_link; + symbols = elf_getdata(scn, 0); + if (!symbols) { + pr_warn("elf: failed to get symbols for symtab section in '%s': %s\n", + binary_path, elf_errmsg(-1)); + ret = -LIBBPF_ERRNO__FORMAT; + goto out; + } + nr_syms = symbols->d_size / sh.sh_entsize; + + for (idx = 0; idx < nr_syms; idx++) { + int curr_bind; + GElf_Sym sym; + + if (!gelf_getsym(symbols, idx, &sym)) + continue; + + if (GELF_ST_TYPE(sym.st_info) != STT_FUNC) + continue; + + sname = elf_strptr(elf, strtabidx, sym.st_name); + if (!sname) + continue; + + curr_bind = GELF_ST_BIND(sym.st_info); + + /* User can specify func, func@@LIB or func@@LIB_VERSION. */ + if (strncmp(sname, name, name_len) != 0) + continue; + /* ...but we don't want a search for "foo" to match 'foo2" also, so any + * additional characters in sname should be of the form "@@LIB". + */ + if (!is_name_qualified && sname[name_len] != '\0' && sname[name_len] != '@') + continue; + + if (ret >= 0) { + /* handle multiple matches */ + if (last_bind != STB_WEAK && curr_bind != STB_WEAK) { + /* Only accept one non-weak bind. */ + pr_warn("elf: ambiguous match for '%s', '%s' in '%s'\n", + sname, name, binary_path); + ret = -LIBBPF_ERRNO__FORMAT; + goto out; + } else if (curr_bind == STB_WEAK) { + /* already have a non-weak bind, and + * this is a weak bind, so ignore. + */ + continue; + } + } + ret = sym.st_value; + last_bind = curr_bind; + } + /* For binaries that are not shared libraries, we need relative offset */ + if (ret > 0 && !is_shared_lib) + ret = elf_find_relative_offset(binary_path, elf, ret); + if (ret > 0) + break; + } + + if (ret > 0) { + pr_debug("elf: symbol address match for '%s' in '%s': 0x%lx\n", name, binary_path, + ret); + } else { + if (ret == 0) { + pr_warn("elf: '%s' is 0 in symtab for '%s': %s\n", name, binary_path, + is_shared_lib ? "should not be 0 in a shared library" : + "try using shared library path instead"); + ret = -ENOENT; + } else { + pr_warn("elf: failed to find symbol '%s' in '%s'\n", name, binary_path); + } + } +out: + elf_end(elf); + close(fd); + return ret; +} + +/* Get full path to program/shared library. */ +static int resolve_full_path(const char *file, char *result, size_t result_sz) +{ + const char *search_paths[2]; + int i; + + if (strstr(file, ".so")) { + search_paths[0] = getenv("LD_LIBRARY_PATH"); + search_paths[1] = "/usr/lib64:/usr/lib"; + } else { + search_paths[0] = getenv("PATH"); + search_paths[1] = "/usr/bin:/usr/sbin"; + } + + for (i = 0; i < ARRAY_SIZE(search_paths); i++) { + const char *s; + + if (!search_paths[i]) + continue; + for (s = search_paths[i]; s != NULL; s = strchr(s, ':')) { + char *next_path; + int seg_len; + + if (s[0] == ':') + s++; + next_path = strchr(s, ':'); + seg_len = next_path ? next_path - s : strlen(s); + if (!seg_len) + continue; + snprintf(result, result_sz, "%.*s/%s", seg_len, s, file); + /* ensure it is an executable file/link */ + if (access(result, R_OK | X_OK) < 0) + continue; + pr_debug("resolved '%s' to '%s'\n", file, result); + return 0; + } + } + return -ENOENT; +} + LIBBPF_API struct bpf_link * bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid, const char *binary_path, size_t func_offset, @@ -10524,10 +10754,12 @@ bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid, { DECLARE_LIBBPF_OPTS(bpf_perf_event_opts, pe_opts); char errmsg[STRERR_BUFSIZE], *legacy_probe = NULL; + char full_binary_path[PATH_MAX]; struct bpf_link *link; size_t ref_ctr_off; int pfd, err; bool retprobe, legacy; + const char *func_name; if (!OPTS_VALID(opts, bpf_uprobe_opts)) return libbpf_err_ptr(-EINVAL); @@ -10536,12 +10768,37 @@ bpf_program__attach_uprobe_opts(const struct bpf_program *prog, pid_t pid, ref_ctr_off = OPTS_GET(opts, ref_ctr_offset, 0); pe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0); + if (binary_path && !strchr(binary_path, '/')) { + err = resolve_full_path(binary_path, full_binary_path, + sizeof(full_binary_path)); + if (err) { + pr_warn("prog '%s': failed to resolve full path for '%s'\n", + prog->name, binary_path); + return libbpf_err_ptr(err); + } + binary_path = full_binary_path; + } + func_name = OPTS_GET(opts, func_name, NULL); + if (func_name) { + long sym_off; + + if (!binary_path) { + pr_warn("prog '%s': name-based attach requires binary_path\n", + prog->name); + return libbpf_err_ptr(-EINVAL); + } + sym_off = elf_find_func_offset(binary_path, func_name); + if (sym_off < 0) + return libbpf_err_ptr(sym_off); + func_offset += sym_off; + } + legacy = determine_uprobe_perf_type() < 0; if (!legacy) { pfd = perf_event_open_probe(true /* uprobe */, retprobe, binary_path, func_offset, pid, ref_ctr_off); } else { - char probe_name[512]; + char probe_name[PATH_MAX + 64]; if (ref_ctr_off) return libbpf_err_ptr(-EINVAL); @@ -10589,6 +10846,75 @@ err_out: } +/* Format of u[ret]probe section definition supporting auto-attach: + * u[ret]probe/binary:function[+offset] + * + * binary can be an absolute/relative path or a filename; the latter is resolved to a + * full binary path via bpf_program__attach_uprobe_opts. + * + * Specifying uprobe+ ensures we carry out strict matching; either "uprobe" must be + * specified (and auto-attach is not possible) or the above format is specified for + * auto-attach. + */ +static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link) +{ + DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, opts); + char *func, *probe_name, *func_end; + char *func_name, binary_path[512]; + unsigned long long raw_offset; + size_t offset = 0; + int n; + + *link = NULL; + + opts.retprobe = str_has_pfx(prog->sec_name, "uretprobe"); + if (opts.retprobe) + probe_name = prog->sec_name + sizeof("uretprobe") - 1; + else + probe_name = prog->sec_name + sizeof("uprobe") - 1; + if (probe_name[0] == '/') + probe_name++; + + /* handle SEC("u[ret]probe") - format is valid, but auto-attach is impossible. */ + if (strlen(probe_name) == 0) + return 0; + + snprintf(binary_path, sizeof(binary_path), "%s", probe_name); + /* ':' should be prior to function+offset */ + func_name = strrchr(binary_path, ':'); + if (!func_name) { + pr_warn("section '%s' missing ':function[+offset]' specification\n", + prog->sec_name); + return -EINVAL; + } + func_name[0] = '\0'; + func_name++; + n = sscanf(func_name, "%m[a-zA-Z0-9_.]+%li", &func, &offset); + if (n < 1) { + pr_warn("uprobe name '%s' is invalid\n", func_name); + return -EINVAL; + } + if (opts.retprobe && offset != 0) { + free(func); + pr_warn("uretprobes do not support offset specification\n"); + return -EINVAL; + } + + /* Is func a raw address? */ + errno = 0; + raw_offset = strtoull(func, &func_end, 0); + if (!errno && !*func_end) { + free(func); + func = NULL; + offset = (size_t)raw_offset; + } + opts.func_name = func; + + *link = bpf_program__attach_uprobe_opts(prog, -1, binary_path, offset, &opts); + free(func); + return libbpf_get_error(*link); +} + struct bpf_link *bpf_program__attach_uprobe(const struct bpf_program *prog, bool retprobe, pid_t pid, const char *binary_path, diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 05dde85e19a6..28cd2062d0df 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -459,9 +459,17 @@ struct bpf_uprobe_opts { __u64 bpf_cookie; /* uprobe is return probe, invoked at function return time */ bool retprobe; + /* Function name to attach to. Could be an unqualified ("abc") or library-qualified + * "abc@LIBXYZ" name. To specify function entry, func_name should be set while + * func_offset argument to bpf_prog__attach_uprobe_opts() should be 0. To trace an + * offset within a function, specify func_name and use func_offset argument to specify + * offset within the function. Shared library functions must specify the shared library + * binary_path. + */ + const char *func_name; size_t :0; }; -#define bpf_uprobe_opts__last_field retprobe +#define bpf_uprobe_opts__last_field func_name /** * @brief **bpf_program__attach_uprobe()** attaches a BPF program |