author     Ingo Molnar <mingo@kernel.org>    2015-11-23 11:13:48 +0300
committer  Ingo Molnar <mingo@kernel.org>    2015-11-23 11:45:53 +0300
commit     8c2accc8ca0be9cd8119ca439038243dfc8fcd0d (patch)
tree       ec57994aef7e5c4a63d7f36f798c3e8a2139066c /tools/lib
parent     90eec103b96e30401c0b846045bf8a1c7159b6da (diff)
parent     2c6caff2b26fde8f3f87183f8c97f2cebfdbcb98 (diff)
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Allow BPF scriptlets to specify arguments to be fetched using
  DWARF info, using a prologue generated at compile/build time (He Kuang, Wang Nan)
- Allow attaching BPF scriptlets to module symbols (Wang Nan)
- Allow attaching BPF scriptlets to userspace code using uprobe (Wang Nan)
- BPF programs can now specify 'perf probe' tunables via their section names,
  separating key=val values using semicolons (Wang Nan). A hypothetical sketch
  combining the last three items follows this list.
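To make the last three items concrete: a scriptlet's section name doubles as its
'perf probe' definition. The sketch below is illustrative only; the exact key
names (module=, exec=) and the DWARF argument fetching shown are assumptions
based on the descriptions above, not text taken from this pull request.

  /* Hypothetical scriptlet; the module= and exec= key names are assumed. */
  #define SEC(NAME) __attribute__((section(NAME), used))
  struct pt_regs;

  /* Attach to a symbol inside a kernel module (key name assumed) */
  SEC("module=i915;func=i915_gem_do_execbuffer")
  int probe_module(struct pt_regs *ctx)
  {
          return 1;
  }

  /* Attach a uprobe to userspace code, fetching malloc's 'size' via DWARF
   * (key name and argument syntax assumed) */
  SEC("exec=/usr/lib64/libc.so.6;func=malloc size")
  int probe_user(struct pt_regs *ctx, int err, long size)
  {
          return size > 1024;
  }

  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;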
Testing some of these new BPF features:
Use case: get callchains when receiving SSL packets, filter them in the
kernel, at an arbitrary place.
# cat ssl.bpf.c
#define SEC(NAME) __attribute__((section(NAME), used))
struct pt_regs;
SEC("func=__inet_lookup_established hnum")
int func(struct pt_regs *ctx, int err, unsigned short port)
{
        return err == 0 && port == 443;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
#
# perf record -a -g -e ssl.bpf.c
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
# perf script | head -30
swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
1163ffa start_kernel ([kernel.vmlinux].init.text)
11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)
qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
430a br_handle_frame_finish ([bridge])
48bc br_handle_frame ([bridge])
855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
#
Use the various 'perf probe' options to list functions and to see which
variables can be collected at any given point; experiment first by collecting
without a filter, then add one, and use it together with 'perf trace' and
'perf top', with or without callchains. If it explodes, please tell us!
- Introduce a new callchain mode: "folded", that lists per-line
  representations of all callchains for a given histogram entry, facilitating
  'perf report' output processing by other tools, such as Brendan Gregg's
  flamegraph tools (Namhyung Kim)
E.g.:
# perf report | grep -v ^# | head
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
|
---cpu_startup_entry
|
|--12.07%--start_secondary
|
--6.30%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
#
Becomes, in "folded" mode:
# perf report -g folded | grep -v ^# | head -5
18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
12.07% cpu_startup_entry;start_secondary
6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
11.23% call_cpuidle;cpu_startup_entry;start_secondary
5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
#
The user can also select one of "count", "period" or "percent" as the first column.
Infrastructure changes:
- Fix multiple leaks found with Valgrind and a refcount
debugger (Masami Hiramatsu)
- Add further 'perf test' entries for BPF and LLVM (Wang Nan)
- Improve 'perf test' to support subtests, so that the series of tests
  performed in the LLVM and BPF main tests appear in the default 'perf test'
  output (Wang Nan)
- Move memdup() from tools/perf to tools/lib/string.c (Arnaldo Carvalho de Melo)
- Adopt strtobool() from the kernel into tools/lib/ (Wang Nan)
- Fix selftests_install tools/ Makefile rule (Kevin Hilman)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'tools/lib')
-rw-r--r--  tools/lib/bpf/libbpf.c  | 146
-rw-r--r--  tools/lib/bpf/libbpf.h  |  64
-rw-r--r--  tools/lib/string.c      |  62
3 files changed, 263 insertions, 9 deletions
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index e176bad19bcb..e3f4c3379f14 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -152,7 +152,11 @@ struct bpf_program {
         } *reloc_desc;
         int nr_reloc;
 
-        int fd;
+        struct {
+                int nr;
+                int *fds;
+        } instances;
+        bpf_program_prep_t preprocessor;
 
         struct bpf_object *obj;
         void *priv;
@@ -206,10 +210,25 @@ struct bpf_object {
 
 static void bpf_program__unload(struct bpf_program *prog)
 {
+        int i;
+
         if (!prog)
                 return;
 
-        zclose(prog->fd);
+        /*
+         * If the object is opened but the program was never loaded,
+         * it is possible that prog->instances.nr == -1.
+         */
+        if (prog->instances.nr > 0) {
+                for (i = 0; i < prog->instances.nr; i++)
+                        zclose(prog->instances.fds[i]);
+        } else if (prog->instances.nr != -1) {
+                pr_warning("Internal error: instances.nr is %d\n",
+                           prog->instances.nr);
+        }
+
+        prog->instances.nr = -1;
+        zfree(&prog->instances.fds);
 }
 
 static void bpf_program__exit(struct bpf_program *prog)
@@ -260,7 +279,8 @@ bpf_program__init(void *data, size_t size, char *name, int idx,
         memcpy(prog->insns, data,
                prog->insns_cnt * sizeof(struct bpf_insn));
         prog->idx = idx;
-        prog->fd = -1;
+        prog->instances.fds = NULL;
+        prog->instances.nr = -1;
 
         return 0;
 errout:
@@ -860,13 +880,73 @@ static int
 bpf_program__load(struct bpf_program *prog,
                   char *license, u32 kern_version)
 {
-        int err, fd;
+        int err = 0, fd, i;
 
-        err = load_program(prog->insns, prog->insns_cnt,
-                           license, kern_version, &fd);
-        if (!err)
-                prog->fd = fd;
+        if (prog->instances.nr < 0 || !prog->instances.fds) {
+                if (prog->preprocessor) {
+                        pr_warning("Internal error: can't load program '%s'\n",
+                                   prog->section_name);
+                        return -LIBBPF_ERRNO__INTERNAL;
+                }
+                prog->instances.fds = malloc(sizeof(int));
+                if (!prog->instances.fds) {
+                        pr_warning("Not enough memory for BPF fds\n");
+                        return -ENOMEM;
+                }
+                prog->instances.nr = 1;
+                prog->instances.fds[0] = -1;
+        }
+
+        if (!prog->preprocessor) {
+                if (prog->instances.nr != 1) {
+                        pr_warning("Program '%s' is inconsistent: nr(%d) != 1\n",
+                                   prog->section_name, prog->instances.nr);
+                }
+                err = load_program(prog->insns, prog->insns_cnt,
+                                   license, kern_version, &fd);
+                if (!err)
+                        prog->instances.fds[0] = fd;
+                goto out;
+        }
+
+        for (i = 0; i < prog->instances.nr; i++) {
+                struct bpf_prog_prep_result result;
+                bpf_program_prep_t preprocessor = prog->preprocessor;
+
+                bzero(&result, sizeof(result));
+                err = preprocessor(prog, i, prog->insns,
+                                   prog->insns_cnt, &result);
+                if (err) {
+                        pr_warning("Preprocessing the %dth instance of program '%s' failed\n",
+                                   i, prog->section_name);
+                        goto out;
+                }
+
+                if (!result.new_insn_ptr || !result.new_insn_cnt) {
+                        pr_debug("Skip loading the %dth instance of program '%s'\n",
+                                 i, prog->section_name);
+                        prog->instances.fds[i] = -1;
+                        if (result.pfd)
+                                *result.pfd = -1;
+                        continue;
+                }
+
+                err = load_program(result.new_insn_ptr,
+                                   result.new_insn_cnt,
+                                   license, kern_version, &fd);
+
+                if (err) {
+                        pr_warning("Loading the %dth instance of program '%s' failed\n",
+                                   i, prog->section_name);
+                        goto out;
+                }
+
+                if (result.pfd)
+                        *result.pfd = fd;
+                prog->instances.fds[i] = fd;
+        }
+out:
 
         if (err)
                 pr_warning("failed to load program '%s'\n",
                            prog->section_name);
@@ -1121,5 +1201,53 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy)
 
 int bpf_program__fd(struct bpf_program *prog)
 {
-        return prog->fd;
+        return bpf_program__nth_fd(prog, 0);
+}
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instances,
+                          bpf_program_prep_t prep)
+{
+        int *instances_fds;
+
+        if (nr_instances <= 0 || !prep)
+                return -EINVAL;
+
+        if (prog->instances.nr > 0 || prog->instances.fds) {
+                pr_warning("Can't set pre-processor after loading\n");
+                return -EINVAL;
+        }
+
+        instances_fds = malloc(sizeof(int) * nr_instances);
+        if (!instances_fds) {
+                pr_warning("alloc memory failed for fds\n");
+                return -ENOMEM;
+        }
+
+        /* fill all fd with -1 */
+        memset(instances_fds, -1, sizeof(int) * nr_instances);
+
+        prog->instances.nr = nr_instances;
+        prog->instances.fds = instances_fds;
+        prog->preprocessor = prep;
+        return 0;
+}
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n)
+{
+        int fd;
+
+        if (n >= prog->instances.nr || n < 0) {
+                pr_warning("Can't get the %dth fd from program %s: only %d instances\n",
+                           n, prog->section_name, prog->instances.nr);
+                return -EINVAL;
+        }
+
+        fd = prog->instances.fds[n];
+        if (fd < 0) {
+                pr_warning("%dth instance of program '%s' is invalid\n",
+                           n, prog->section_name);
+                return -ENOENT;
+        }
+
+        return fd;
 }
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index c9a9aef2806c..949df4b346cf 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -88,6 +88,70 @@ const char *bpf_program__title(struct bpf_program *prog, bool needs_copy);
 
 int bpf_program__fd(struct bpf_program *prog);
 
+struct bpf_insn;
+
+/*
+ * Libbpf allows callers to adjust BPF programs before being loaded
+ * into kernel. One program in an object file can be transform into
+ * multiple variants to be attached to different code.
+ *
+ * bpf_program_prep_t, bpf_program__set_prep and bpf_program__nth_fd
+ * are APIs for this propose.
+ *
+ * - bpf_program_prep_t:
+ *   It defines 'preprocessor', which is a caller defined function
+ *   passed to libbpf through bpf_program__set_prep(), and will be
+ *   called before program is loaded. The processor should adjust
+ *   the program one time for each instances according to the number
+ *   passed to it.
+ *
+ * - bpf_program__set_prep:
+ *   Attachs a preprocessor to a BPF program. The number of instances
+ *   whould be created is also passed through this function.
+ *
+ * - bpf_program__nth_fd:
+ *   After the program is loaded, get resuling fds from bpf program for
+ *   each instances.
+ *
+ * If bpf_program__set_prep() is not used, the program whould be loaded
+ * without adjustment during bpf_object__load(). The program has only
+ * one instance. In this case bpf_program__fd(prog) is equal to
+ * bpf_program__nth_fd(prog, 0).
+ */
+
+struct bpf_prog_prep_result {
+        /*
+         * If not NULL, load new instruction array.
+         * If set to NULL, don't load this instance.
+         */
+        struct bpf_insn *new_insn_ptr;
+        int new_insn_cnt;
+
+        /* If not NULL, result fd is set to it */
+        int *pfd;
+};
+
+/*
+ * Parameters of bpf_program_prep_t:
+ *  - prog:      The bpf_program being loaded.
+ *  - n:         Index of instance being generated.
+ *  - insns:     BPF instructions array.
+ *  - insns_cnt: Number of instructions in insns.
+ *  - res:       Output parameter, result of transformation.
+ *
+ * Return value:
+ *  - Zero:      pre-processing success.
+ *  - Non-zero:  pre-processing, stop loading.
+ */
+typedef int (*bpf_program_prep_t)(struct bpf_program *prog, int n,
+                                  struct bpf_insn *insns, int insns_cnt,
+                                  struct bpf_prog_prep_result *res);
+
+int bpf_program__set_prep(struct bpf_program *prog, int nr_instance,
+                          bpf_program_prep_t prep);
+
+int bpf_program__nth_fd(struct bpf_program *prog, int n);
+
 /*
  * We don't need __attribute__((packed)) now since it is
  * unnecessary for 'bpf_map_def' because they are all aligned.
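The block comment added to libbpf.h above documents the new preprocessor API
(bpf_program__set_prep(), bpf_program_prep_t, bpf_program__nth_fd()). Below is a
minimal sketch of a caller, assuming an already-opened bpf_object; the names
drive_prep()/load_with_two_instances() and the "load each instance unchanged"
strategy are invented for illustration, not part of this commit.

  /* Hypothetical caller of the new instance/preprocessor API. */
  #include "libbpf.h"

  static int drive_prep(struct bpf_program *prog, int n,
                        struct bpf_insn *insns, int insns_cnt,
                        struct bpf_prog_prep_result *res)
  {
          /* Trivial preprocessor: load every instance with the original
           * instructions.  A real user (e.g. perf) would rewrite 'insns'
           * differently for each instance 'n' here. */
          res->new_insn_ptr = insns;
          res->new_insn_cnt = insns_cnt;
          res->pfd = NULL;
          return 0;
  }

  static int load_with_two_instances(struct bpf_object *obj)
  {
          struct bpf_program *prog;
          int err, fd;

          bpf_object__for_each_program(prog, obj) {
                  /* Must be called before bpf_object__load() */
                  err = bpf_program__set_prep(prog, 2, drive_prep);
                  if (err)
                          return err;
          }

          err = bpf_object__load(obj);
          if (err)
                  return err;

          bpf_object__for_each_program(prog, obj) {
                  fd = bpf_program__nth_fd(prog, 1); /* fd of the 2nd instance */
                  if (fd < 0)
                          return fd;
          }
          return 0;
  }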
diff --git a/tools/lib/string.c b/tools/lib/string.c
new file mode 100644
index 000000000000..065e54f42d8f
--- /dev/null
+++ b/tools/lib/string.c
@@ -0,0 +1,62 @@
+/*
+ *  linux/tools/lib/string.c
+ *
+ *  Copied from linux/lib/string.c, where it is:
+ *
+ *  Copyright (C) 1991, 1992  Linus Torvalds
+ *
+ *  More specifically, the first copied function was strtobool, which
+ *  was introduced by:
+ *
+ *  d0f1fed29e6e ("Add a strtobool function matching semantics of existing in kernel equivalents")
+ *  Author: Jonathan Cameron <jic23@cam.ac.uk>
+ */
+
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <linux/string.h>
+
+/**
+ * memdup - duplicate region of memory
+ *
+ * @src: memory region to duplicate
+ * @len: memory region length
+ */
+void *memdup(const void *src, size_t len)
+{
+        void *p = malloc(len);
+
+        if (p)
+                memcpy(p, src, len);
+
+        return p;
+}
+
+/**
+ * strtobool - convert common user inputs into boolean values
+ * @s: input string
+ * @res: result
+ *
+ * This routine returns 0 iff the first character is one of 'Yy1Nn0'.
+ * Otherwise it will return -EINVAL.  Value pointed to by res is
+ * updated upon finding a match.
+ */
+int strtobool(const char *s, bool *res)
+{
+        switch (s[0]) {
+        case 'y':
+        case 'Y':
+        case '1':
+                *res = true;
+                break;
+        case 'n':
+        case 'N':
+        case '0':
+                *res = false;
+                break;
+        default:
+                return -EINVAL;
+        }
+        return 0;
+}
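To round out the tools/lib/string.c addition, here is a small, self-contained
sketch of how memdup() and strtobool() might be used from tools code. The
option-parsing scenario is invented, and it assumes the tools copy of
<linux/string.h> declares both helpers.

  /* Illustrative only; the scenario and helper declarations are assumed. */
  #include <stdio.h>
  #include <stdlib.h>
  #include <stdbool.h>
  #include <linux/string.h>   /* assumed to declare memdup() and strtobool() */

  static bool parse_bool_opt(const char *arg, bool def)
  {
          bool val;

          /* 'y'/'Y'/'1' -> true, 'n'/'N'/'0' -> false, anything else -> default */
          if (strtobool(arg, &val))
                  return def;
          return val;
  }

  int main(void)
  {
          static const int tmpl[] = { 1, 2, 3 };
          int *copy = memdup(tmpl, sizeof(tmpl));  /* caller must free() */

          printf("%d %d\n", parse_bool_opt("Yes", false), copy ? copy[0] : -1);
          free(copy);
          return 0;
  }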