summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-04-12perf evsel: Rename config_callgraph() to config_callchain() and make it publicArnaldo Carvalho de Melo2-6/+8
The rename is for consistency with the parameter name. Make it public for fine grained control of which evsels should have callchains enabled, like, for instance, will be done in the next changesets in 'perf trace', to enable callchains just on the "raw_syscalls:sys_exit" tracepoint. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evlist: Add (reset,set)_sample_bit methodsArnaldo Carvalho de Melo2-0/+29
For fiddling with sample_type fields in all evsels in an evlist. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evsel: Do not use globals in config()Arnaldo Carvalho de Melo15-18/+26
Instead receive a callchain_param pointer to configure callchain aspects, not doing so if NULL is passed. This will allow fine grained control over which evsels in an evlist gets callchains enabled. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf trace: Exclude the kernel part of the callchain leading to a syscallArnaldo Carvalho de Melo2-0/+16
The kernel parts are not that useful: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.065 ( 0.065 ms): usleep/18732 nanosleep(rqtp: 0x7ffc4ee4e200) = 0 syscall_slow_exit_work ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) return_from_SYSCALL_64 ([kernel.kallsyms]) __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # So lets just use perf_event_attr.exclude_callchain_kernel to avoid collecting it in the ring buffer: # trace -m 512 -e nanosleep --call dwarf usleep 1 0.063 ( 0.063 ms): usleep/19212 nanosleep(rqtp: 0x7ffc3df10fb0) = 0 __nanosleep (/usr/lib64/libc-2.22.so) usleep (/usr/lib64/libc-2.22.so) main (/usr/bin/usleep) __libc_start_main (/usr/lib64/libc-2.22.so) _start (/usr/bin/usleep) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-qctu3gqhpim0dfbcp9d86c91@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evsel: Introduce fprintf_callchain() method out of fprintf_sym()Arnaldo Carvalho de Melo3-7/+32
In 'perf trace' we're just interested in printing callchains, and we don't want to use the symbol_conf.use_callchain, so move the callchain part to a new method. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evsel: Rename print_ip() to fprintf_sym()Arnaldo Carvalho de Melo4-43/+39
As it receives a FILE, and its more than just the IP, which can even be requested not to be printed. For consistency with other similar methods in tools/perf/, name it as perf_evsel__fprintf_sym() and make it return the number of bytes printed, just like 'fprintf(3)' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf trace: Add support for printing call chains on sys_exit events.Milian Wolff2-0/+28
Now, one can print the call chain for every encountered sys_exit event, e.g.: $ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep 1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0 syscall_slow_exit_work ([kernel.kallsyms]) syscall_return_slowpath ([kernel.kallsyms]) int_ret_from_sys_call ([kernel.kallsyms]) __nanosleep (/usr/lib/libc-2.23.so) [unknown] (/usr/lib/libQt5Core.so.5.6.0) QThread::sleep (/usr/lib/libQt5Core.so.5.6.0) main (path/to/ex_sleep) __libc_start_main (/usr/lib/libc-2.23.so) _start (path/to/ex_sleep) Note that it is advised to increase the number of mmap pages to prevent event losses when using this new feature. Often, adding `-m 10M` to the `perf trace` invocation is enough. This feature is also available in strace when built with libunwind via `strace -k`. Performance wise, this solution is much better: $ time find path/to/linux &> /dev/null real 0m0.051s user 0m0.013s sys 0m0.037s $ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null real 0m2.624s user 0m1.203s sys 0m1.333s $ time strace -k find path/to/linux &> /dev/null real 0m35.398s user 0m10.403s sys 0m23.173s Note that it is currently not possible to configure the print output. Adding such a feature, similar to what is available in `perf script` via its `--fields` knob can be added later on. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com [ Split from a larger patch, do not print the IP, left align, remove dup call symbol__init(), added man page entry ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evsel: Allow passing a left alignment when printing a symbolArnaldo Carvalho de Melo3-4/+8
For callchains, etc where we want it to align just below the syscall name, for instance, in 'perf trace' Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf evsel: Allow specifying a file to output in perf_evsel__print_ipMilian Wolff3-21/+25
As this function will be used in 'perf trace'. Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/n/tip-8x297v9utnxq77onikevvlse@git.kernel.org [ Split from a larger patch ] Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
2016-04-12perf bpf: Automatically create bpf-output event __bpf_stdout__Wang Nan1-9/+28
This patch removes the need to set a bpf-output event in cmdline. By referencing a map named '__bpf_stdout__', perf automatically creates an event for it. For example: # perf record -e ./test_bpf_trace.c usleep 100000 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (2 samples) ] # perf script usleep 4639 [000] 261895.307826: 0 __bpf_stdout__: ffffffff810eb9a1 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" usleep 4639 [000] 261895.407883: 0 __bpf_stdout__: ffffffff8105d609 ... BPF output: 0000: 52 61 69 73 65 20 61 20 Raise a 0008: 42 50 46 20 65 76 65 6e BPF even 0010: 74 21 00 00 t!.. BPF string: "Raise a BPF event!" perf record -e ./test_bpf_trace.c usleep 100000 equals to: perf record -e bpf-output/no-inherit=1,name=__bpf_stdout__/ \ -e ./test_bpf_trace.c/map:__bpf_stdout__.event=__bpf_stdout__/ \ usleep 100000 Where test_bpf_trace.c is: /************************ BEGIN **************************/ #include <uapi/linux/bpf.h> struct bpf_map_def { unsigned int type; unsigned int key_size; unsigned int value_size; unsigned int max_entries; }; #define SEC(NAME) __attribute__((section(NAME), used)) static u64 (*ktime_get_ns)(void) = (void *)BPF_FUNC_ktime_get_ns; static int (*trace_printk)(const char *fmt, int fmt_size, ...) = (void *)BPF_FUNC_trace_printk; static int (*get_smp_processor_id)(void) = (void *)BPF_FUNC_get_smp_processor_id; static int (*perf_event_output)(void *, struct bpf_map_def *, int, void *, unsigned long) = (void *)BPF_FUNC_perf_event_output; struct bpf_map_def SEC("maps") __bpf_stdout__ = { .type = BPF_MAP_TYPE_PERF_EVENT_ARRAY, .key_size = sizeof(int), .value_size = sizeof(u32), .max_entries = __NR_CPUS__, }; static inline int __attribute__((always_inline)) func(void *ctx, int type) { char output_str[] = "Raise a BPF event!"; char err_str[] = "BAD %d\n"; int err; err = perf_event_output(ctx, &__bpf_stdout__, get_smp_processor_id(), &output_str, sizeof(output_str)); if (err) trace_printk(err_str, sizeof(err_str), err); return 1; } SEC("func_begin=sys_nanosleep") int func_begin(void *ctx) {return func(ctx, 1);} SEC("func_end=sys_nanosleep%return") int func_end(void *ctx) { return func(ctx, 2);} char _license[] SEC("license") = "GPL"; int _version SEC("version") = LINUX_VERSION_CODE; /************************* END ***************************/ Committer note: Testing with 'perf trace': # trace -e nanosleep --ev test_bpf_stdout.c usleep 1 0.007 ( 0.007 ms): usleep/729 nanosleep(rqtp: 0x7ffc5bbc5fe0) ... 0.007 ( ): __bpf_stdout__:Raise a BPF event!..) 0.008 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.069 ( ): __bpf_stdout__:Raise a BPF event!..) 0.070 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.072 ( 0.072 ms): usleep/729 ... [continued]: nanosleep()) = 0 # Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-12perf bpf: Clone bpf stdout events in multiple bpf scriptsWang Nan4-0/+158
This patch allows cloning bpf-output event configuration among multiple bpf scripts. If there exist a map named '__bpf_output__' and not configured using 'map:__bpf_output__.event=', this patch clones the configuration of another '__bpf_stdout__' map. For example, following command: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c usleep 100000 equals to: # perf trace --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \ --ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \ usleep 100000 Signed-off-by: Wang Nan <wangnan0@huawei.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf dwarf: Guard !x86_64 definitions under #ifdef else clauseArnaldo Carvalho de Melo1-4/+4
To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17): CC /tmp/build/perf/arch/x86/util/dwarf-regs.o arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=] static const struct pt_regs_offset x86_32_regoffset_table[] = { ^~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-30/+30
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when parsing tracepoint event definitions, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-6/+6
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case when synthesizing events for pre-existing threads by traversing /proc, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. CC /tmp/build/perf/util/event.o util/event.c: In function '__event__synthesize_thread': util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations] while (!readdir_r(tasks, &dirent, &next) && next) { ^~~~~ In file included from /usr/include/features.h:368:0, from /usr/include/stdint.h:25, from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9, from /git/linux/tools/include/linux/types.h:6, from util/event.c:1: /usr/include/dirent.h:189:12: note: declared here Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf thread_map: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-4/+4
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in thread_map, so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf script: Use readdir() instead of deprecated readdir_r()Arnaldo Carvalho de Melo1-36/+34
The readdir() function is thread safe as long as just one thread uses a DIR, which is the case in 'perf script', so, to avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r(). See: http://man7.org/linux/man-pages/man3/readdir.3.html "However, in modern implementations (including the glibc implementation), concurrent calls to readdir() that specify different directory streams are thread-safe. In cases where multiple threads must read from the same directory stream, using readdir() with external synchronization is still preferable to the use of the deprecated readdir_r(3) function." Noticed while building on a Fedora Rawhide docker container. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf symbols: Adjust symbol for shared objectsWang Nan1-10/+3
He Kuang reported a problem that perf fails to get correct symbol on Android platform in [1]. The problem can be reproduced on normal x86_64 platform. I will describe the reproducing steps in detail at the end of commit message. The reason of this problem is the missing of symbol adjustment for normal shared objects. In most of the cases skipping adjustment is okay. However, when '.text' section have different 'address' and 'offset' the result is wrong. I checked all shared objects in my working platform, only wine dll objects and debug objects (in .debug) have this problem. However, it is common on Android. For example: $ readelf -S ./libsurfaceflinger.so | grep \.text [10] .text PROGBITS 0000000000029030 00012030 This patch enables symbol adjustment for dynamic objects so the symbol address got from elfutils would be adjusted correctly. Now nearly all types of ELF files should adjust symbols. Makes ss->adjust_symbols default to true. Steps to reproduce the problem: $ cat ./Makefile PWD := $(shell pwd) LDFLAGS += "-Wl,-rpath=$(PWD)" CFLAGS += -g main: main.c libbuggy.so libbuggy.so: buggy.c gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@ clean: rm -rf main libbuggy.so *.o $ cat ./buggy.c int fib(int x) { return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2); } $ cat ./main.c #include <stdio.h> extern int fib(int x); int main() { int i; for (i = 0; i < 40; i++) printf("%d\n", fib(i)); return 0; } $ make $ perf record ./main ... $ perf report --stdio # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 14.97% main libbuggy.so [.] 0x000000000000066c 8.68% main libbuggy.so [.] 0x00000000000006aa 8.52% main libbuggy.so [.] fib@plt 7.95% main libbuggy.so [.] 0x0000000000000664 5.94% main libbuggy.so [.] 0x00000000000006a9 5.35% main libbuggy.so [.] 0x0000000000000678 ... The correct result should be (after this patch): # Overhead Command Shared Object Symbol # ........ ....... ................. ............................... # 91.47% main libbuggy.so [.] fib 8.52% main libbuggy.so [.] fib@plt 0.00% main [kernel.kallsyms] [k] kmem_cache_free [1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460024671-64774-3-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf symbols: Record text offset in dso to calculate objdump addressWang Nan1-0/+14
In this patch, the offset of '.text' section is stored into dso and used here to re-calculate address to objdump. In most of the cases, executable code is in '.text' section, so the adjustment made to a symbol in dso__load_sym (using sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to 'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back get objdump address from symbol address (rip). However, it is not true for kernel and kernel module since there could be multiple executable sections with different offset. Exclude kernel for this reason. After this patch, even dso->adjust_symbols is set to true for shared objects, map__rip_2objdump() and map__objdump_2mem() would return correct result, so perf behavior of annotate won't be changed. Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Cody P Schafer <dev@codyps.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kirill Smelkov <kirr@nexedi.com> Cc: Li Zefan <lizefan@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Build syscall table .c header from kernel's syscall_64.tblArnaldo Carvalho de Melo7-5/+456
We used libaudit to map ids to syscall names and vice-versa, but that imposes a delay in supporting new syscalls, having to wait for libaudit to get those new syscalls on its tables. To remove that delay, for x86_64 initially, grab a copy of arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those tables. Syscalls currently not available in audit-libs: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd Hint: try 'perf list syscalls:sys_enter_*' Hint: and: 'man syscalls' # With this patch: # trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd 8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36 8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40 30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096 31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096 31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org [ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Allow generating per-arch syscall table arraysArnaldo Carvalho de Melo2-2/+91
Tools should use a mechanism similar to arch/x86/entry/syscalls/ to generate a header file with the definitions for two variables: static const char *syscalltbl_x86_64[] = { [0] = "read", [1] = "write", <SNIP> [324] = "membarrier", [325] = "mlock2", [326] = "copy_file_range", }; static const int syscalltbl_x86_64_max_id = 326; In a per arch file that should then be included in tools/perf/util/syscalltbl.c. First one will be for x86_64. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf trace: Move syscall table id <-> name routines to separate classArnaldo Carvalho de Melo4-13/+71
We're using libaudit for doing name to id and id to syscall name translations, but that makes 'perf trace' to have to wait for newer libaudit versions supporting recently added syscalls, such as "userfaultfd" at the time of this changeset. We have all the information right there, in the kernel sources, so move this code to a separate place, wrapped behind functions that will progressively use the kernel source files to extract the syscall table for use in 'perf trace'. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf trace: Beautify mode_t argumentsArnaldo Carvalho de Melo2-0/+71
When reading the syscall tracepoint /format file, look for arguments of type "mode_t" and attach a beautifier: [root@jouet ~]# cat ~/bin/tp_with_fields_of_type #!/bin/bash grep -w $1 /sys/kernel/tracing/events/syscalls/*/format | sed -r 's%.*sys_enter_(.*)/format.*%\1%g' | paste -d, -s # tp_with_fields_of_type umode_t chmod,creat,fchmodat,fchmod,mkdirat,mkdir,mknodat,mknod,mq_open,openat,open # Testing it: #define S_ISUID 0004000 #define S_ISGID 0002000 #define S_ISVTX 0001000 #define S_IRWXU 0000700 #define S_IRUSR 0000400 #define S_IWUSR 0000200 #define S_IXUSR 0000100 #define S_IRWXG 0000070 #define S_IRGRP 0000040 #define S_IWGRP 0000020 #define S_IXGRP 0000010 #define S_IRWXO 0000007 #define S_IROTH 0000004 #define S_IWOTH 0000002 #define S_IXOTH 0000001 # for mode in 4000 2000 1000 700 400 200 100 70 40 20 10 7 4 2 1 ; do \ echo -n $mode '->' ; trace --no-inherit -e chmod,fchmodat,fchmod chmod $mode x; \ done 4000 -> 0.338 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID) = 0 2000 -> 0.438 ( 0.015 ms): fchmodat(dfd: CWD, filename: x, mode: ISGID) = 0 1000 -> 0.677 ( 0.040 ms): fchmodat(dfd: CWD, filename: x, mode: ISVTX) = 0 700 -> 0.394 ( 0.013 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXU) = 0 400 -> 0.337 ( 0.010 ms): fchmodat(dfd: CWD, filename: x, mode: IRUSR) = 0 200 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IWUSR) = 0 100 -> 0.249 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXUSR) = 0 70 -> 0.266 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXG) = 0 40 -> 0.329 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRGRP) = 0 20 -> 0.250 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWGRP) = 0 10 -> 0.259 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXGRP) = 0 7 -> 0.249 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXO) = 0 4 -> 0.278 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IROTH) = 0 2 -> 0.276 ( 0.009 ms): fchmodat(dfd: CWD, filename: x, mode: IWOTH) = 0 1 -> 0.250 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: IXOTH) = 0 # # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7777 x 0.258 ( 0.011 ms): fchmodat(dfd: CWD, filename: x, mode: IALLUGO) = 0 # trace --no-inherit -e chmod,fchmodat,fchmod chmod 7770 x 0.258 ( 0.008 ms): fchmodat(dfd: CWD, filename: x, mode: ISUID|ISGID|ISVTX|IRWXU|IRWXG) = 0 # trace --no-inherit -e chmod,fchmodat,fchmod chmod 777 x 0.293 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRWXUGO # Now lets see if check by using the tracepoint for that specific syscall, instead of raw_syscalls:sys_enter as 'trace' does for its strace fu: # trace --no-inherit --ev syscalls:sys_enter_fchmodat -e fchmodat chmod 666 x 0.255 ( ): syscalls:sys_enter_fchmodat:dfd: 0xffffffffffffff9c, filename: 0x55db32a3f0f0, mode: 0x000001b6) 0.268 ( 0.012 ms): fchmodat(dfd: CWD, filename: x, mode: IRUGO|IWUGO ) = 0 # Perfect, 0x1bc == 0666. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-18e8zfgbkj83xo87yoom43kd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf script: Process event update eventsJiri Olsa1-0/+1
Andreas reported following command produces no output: # cat test.py #!/usr/bin/env python def stat__krava(cpu, thread, time, val, ena, run): print "event %s cpu %d, thread %d, time %d, val %d, ena %d, run %d" % \ ("krava", cpu, thread, time, val, ena, run) # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py ^C # The reason is that 'perf script' does not process event update events and will never get the event name update thus the python callback is never called. The fix is just to add already existing callback we use in 'perf stat report'. Committer note: After the patch: # perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py event krava cpu -1, thread -1, time 1000239179, val 1789051, ena 4000690920, run 4000690920 event krava cpu -1, thread -1, time 2000479061, val 2391338, ena 4000879596, run 4000879596 event krava cpu -1, thread -1, time 3000740802, val 1939121, ena 4000977209, run 4000977209 event krava cpu -1, thread -1, time 4001006730, val 2356115, ena 4001000489, run 4001000489 ^C # Reported-by: Andreas Hollmann <hollmann@in.tum.de> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-08perf tools: Add dedicated unwind addr_space member into thread structJiri Olsa2-16/+15
Milian reported issue with thread::priv, which was double booked by perf trace and DWARF unwind code. So using those together is impossible at the moment. Moving DWARF unwind private data into separate variable so perf trace can keep using thread::priv. Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-07perf tools: Introduce trim functionJiri Olsa3-4/+7
To be used in cases for both sides trim. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Andreas Hollmann <hollmann@in.tum.de> Cc: David Ahern <dsahern@gmail.com> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-07perf trace: Beautify pid_t argumentsArnaldo Carvalho de Melo2-53/+75
When reading the syscall tracepoint /format file, look for arguments of type "pid_t" and attach the PID beautifier, that will do a lookup on the threads it knows, i.e. the ones that came from PERF_RECORD_COMM events and add the COMM after the pid in such args: Excerpt of a system wide trace for syscalls with pid_t args: 55602.977 ( 0.006 ms): bash/12122 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0 55603.024 ( 0.004 ms): bash/24347 setpgid(pid: 24347 (bash), pgid: 24347 (bash)) = 0 55691.527 (88.397 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) ... 55692.479 ( 0.952 ms): git/24347 wait4(upid: 24368, stat_addr: 0x7ffe030d5724) ... 55694.549 ( 2.070 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4fc10) = 24369 (pre-commit) 55694.575 ( 0.002 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f650, options: NOHANG) = -1 ECHILD No child processes 55695.934 ( 0.010 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = 24370 (git) 55695.937 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f2d0, options: NOHANG) = -1 ECHILD No child processes 55717.963 ( 0.000 ms): pre-commit/24371 ... [continued]: wait4()) = 24372 55717.978 (21.468 ms): :24371/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) ... 55718.087 ( 0.109 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24373 (tr) 55718.187 ( 0.096 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4f230) = 24374 (wc) 55718.218 ( 0.002 ms): pre-commit/24371 wait4(upid: -1, stat_addr: 0x7ffc94f4eed0, options: NOHANG) = -1 ECHILD No child processes 55718.367 ( 0.005 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = 24371 (pre-commit) 55718.369 ( 0.001 ms): pre-commit/24368 wait4(upid: -1, stat_addr: 0x7ffc94f4f1d0, options: NOHANG) = -1 ECHILD No child processes 55741.021 (49.494 ms): git/24347 ... [continued]: wait4()) = 24368 (pre-commit) 74146.427 (18319.601 ms): git/24347 wait4(upid: 24375 (git), stat_addr: 0x7ffe030d6824) ... 74149.036 ( 0.891 ms): bash/24391 wait4(upid: -1, stat_addr: 0x7ffe0cee0560) = 24393 (sed) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-75yl9hzjhb020iadc81gdj8t@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify set_tid_address, getpid, getppid return valuesArnaldo Carvalho de Melo1-0/+3
Showing the COMM for that return, if available. # trace -e getpid,getppid,set_tid_address 490.007 ( 0.005 ms): sh/8250 getpid(...) = 8250 (sh) 490.014 ( 0.001 ms): sh/8250 getppid(...) = 7886 (make) 491.156 ( 0.004 ms): install/8251 set_tid_address(tidptr: 0x7f204a9d4ad0) = 8251 (install) ^C Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-psbpplqupatom9x4uohbxid5@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Infrastructure to show COMM strings for syscalls returning PIDsArnaldo Carvalho de Melo1-4/+15
Starting with clone, waitid and wait4: # trace -e waitid,wait4 1.385 ( 1.385 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1720, options: UNTRACED|CONTINUED) = 1210 (ls) 1.426 ( 0.002 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee1150, options: NOHANG|UNTRACED|CONTINUED) = 0 3.293 ( 0.604 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee0560 ) = 1214 (sed) 3.342 ( 0.002 ms): bash/1211 wait4(upid: -1, stat_addr: 0x7ffe0cee01d0, options: NOHANG ) = -1 ECHILD No child processes 3.576 ( 0.016 ms): bash/12122 wait4(upid: -1, stat_addr: 0x7ffe0cee0550, options: NOHANG|UNTRACED|CONTINUED) = 1211 (bash) ^C# trace -e clone 0.027 ( 0.000 ms): systemd/1 ... [continued]: clone()) = 1227 (systemd) 0.050 ( 0.000 ms): systemd/1227 ... [continued]: clone()) = 0 ^C[root@jouet ~]# Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-lyf5d3y5j15wikjb6pe6ukoi@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify wait4/waitid 'options' argumentArnaldo Carvalho de Melo2-0/+31
# trace -e waitid,wait4 0.557 ( 0.557 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27336 1.250 ( 0.685 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f449f0) = 27337 1.312 ( 0.002 ms): bash/27335 wait4(upid: -1, stat_addr: 0x7ffd02f44690, options: NOHANG) = -1 ECHILD No child processes 1.550 ( 0.015 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 27335 1.552 ( 0.001 ms): bash/3856 wait4(upid: -1, stat_addr: 0x7ffd02f44990, options: NOHANG|UNTRACED|CONTINUED) = 0 # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf trace: Beautify sched_setscheduler 'policy' argumentArnaldo Carvalho de Melo2-0/+48
$ trace -e sched_setscheduler chrt -f 1 usleep 1 chrt: failed to set pid 0's policy: Operation not permitted 0.005 ( 0.005 ms): chrt/19189 sched_setscheduler(policy: FIFO, param: 0x7ffec5273d70) = -1 EPERM Operation not permitted $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-i5vlo5n5jv0amt8bkyicmdxh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf list: Document event specifications betterAndi Kleen1-1/+106
Document some features for specifying events in the perf list manpage: - Event groups - Leader sampling - How to specify raw PMU events in the new syntax - Global versus per process PMUs. - Access restrictions - Fix Intel SDM URL v2: Lots of new content. address review feedback. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1459810686-15913-1-git-send-email-andi@firstfloor.org [ Add quotes to some keywords, such as "any" ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf tools: Remove superfluous ARCH Makefile includesJiri Olsa2-5/+0
Link: http://lkml.kernel.org/n/tip-yk6brsq3opuotr9by18xlkr8@git.kernel.org Signed-off-by: Jiri Olsa <jolsa@kernel.org>
2016-04-06perf script perl: Do error checking on new backtrace routineArnaldo Carvalho de Melo1-9/+21
This ended up triggering these warnings when building on Ubuntu 12.04.5: util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain': util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value] util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value] cc1: all warnings being treated as errors mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1 Fix it by doing error checking when building the perl data structures related to callchains. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@gmail.com> Fixes: f7380c12ec6c ("perf script perl: Perl scripts now get a backtrace, like the python ones") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf probe: Check if dwarf_getlocations() is availableArnaldo Carvalho de Melo6-0/+38
If not, tell the user that: config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157 And return -ENOTSUPP in die_get_var_range(), failing features that need it, like the one pointed out above. This fixes the build on older systems, such as Ubuntu 12.04.5. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vinson Lee <vlee@freedesktop.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06perf config: Fix build with older toolchain.Vinson Lee1-3/+3
Fix build error on Ubuntu 12.04.5 with GCC 4.6.3. CC util/config.o util/config.c: In function ‘perf_buildid_config’: util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow] Signed-off-by: Vinson Lee <vlee@freedesktop.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 9cb5987c8227 ("perf config: Rework buildid_dir_command_config to perf_buildid_config") Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-06Merge tag 'perf-core-for-mingo-20160401' of ↵Ingo Molnar19-41/+274
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Do not use events that don't have timestamps when setting 'perf trace's base timestamp, fixing up the timestamp column for syscalls (Arnaldo Carvalho de Melo) - Make the 'bpf-output' sample_type be the same as tracepoint's, fixing up 'perf trace's timestamp column for bpf events (Wang Nan) - Fix PMU term format max value calculation (Kan Liang) - Pretty print 'seccomp', 'getrandom' syscalls in 'perf trace' (Arnaldo Carvalho de Melo) Infrastructure changes: - Add support for using TSC as an ARCH timestamp when synthesizing JIT records (Adrian Hunter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-04-02perf bpf: Add sample types for 'bpf-output' eventWang Nan1-1/+2
Before this patch we can see very large time in the events before the 'bpf-output' event. For example: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 18446744073709.551 (18446564645918.480 ms): usleep/4157 nanosleep(rqtp: 0x7ffd3f0dc4e0) ... 18446744073709.551 ( ): evt:Raise a BPF event!..) 179427791.076 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179427791.081 ( ): sched:sched_switch:usleep:4157 [120] S ==> swapper/2:0 [120]) ... We can also see the differences between bpf-output events and breakpoint events: For bpf output event: sample_type IP|TID|RAW|IDENTIFIER For tracepoint events: sample_type IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER This patch fix this differences by adding more sample type for bpf-output events. After this patch: # perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 ... 179877370.878 ( 0.003 ms): usleep/5336 nanosleep(rqtp: 0x7ffff866c450) ... 179877370.878 ( ): evt:Raise a BPF event!..) 179877370.878 ( ): perf_bpf_probe:func_begin:(ffffffff810eb9a0)) 179877370.882 ( ): sched:sched_switch:usleep:5336 [120] S ==> swapper/4:0 [120]) 179877370.945 ( ): evt:Raise a BPF event!..) ... # ./perf trace -vv -T --ev sched:sched_switch \ --ev bpf-output/no-inherit,name=evt/ \ --ev ./test_bpf_trace.c/map:channel.event=evt/ \ usleep 10 2>&1 | grep sample_type sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW sample_type IP|TID|TIME|ID|CPU|PERIOD|RAW The 'IDENTIFIER' info is not required because all events have the same sample_type. Committer notes: Further testing, on top of the changes making 'perf trace' avoid samples from events without PERF_SAMPLE_TIME: Before: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.560 ( 0.001 ms): brk( ) = 0x55e5a1df8000 18446640227439.430 (18446640227438.859 ms): nanosleep(rqtp: 0x7ffc96643370) ... 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.576 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 18446640227439.430 ( ): evt:Raise a BPF event!..) 0.645 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.646 ( 0.076 ms): ... [continued]: nanosleep()) = 0 # After: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 <SNIP> 0.292 ( 0.001 ms): brk( ) = 0x55c7cd6e1000 0.302 ( 0.004 ms): nanosleep(rqtp: 0x7ffedd8bc0f0) ... 0.302 ( ): evt:Raise a BPF event!..) 0.303 ( ): perf_bpf_probe:func_begin:(ffffffff81112460)) 0.397 ( ): evt:Raise a BPF event!..) 0.397 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92)) 0.398 ( 0.100 ms): ... [continued]: nanosleep()) = 0 Signed-off-by: Wang Nan <wangnan0@huawei.com> Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1459517202-42320-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-02perf trace: Don't set the base timestamp using events without PERF_SAMPLE_TIMEArnaldo Carvalho de Melo1-2/+11
This was causing bogus values to be shown at the timestamp column: Before: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 94631143.385 ( 0.001 ms): brk( ) = 0x555555757000 94631143.398 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7ffff7ff6000 94631143.406 ( 0.004 ms): access(filename: 0xf7df9e10, mode: R ) = -1 ENOENT No such file or directory 94631143.412 ( 0.004 ms): open(filename: 0xf7df8761, flags: CLOEXEC) = 3 94631143.415 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fffffffd6b0 ) = 0 94631143.419 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7ffff7fdb000 94631143.420 ( 0.001 ms): close(fd: 3 ) = 0 94631143.432 ( 0.004 ms): open(filename: 0xf7ff6640, flags: CLOEXEC) = 3 <SNIP> After: # trace --ev bpf-output/no-inherit,name=evt/ --ev /home/acme/bpf/test_bpf_trace.c/map:channel.event=evt/ usleep 10 0.022 ( 0.001 ms): brk( ) = 0x55d7668a6000 0.037 ( 0.003 ms): mmap(len: 4096, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS, fd: -1) = 0x7f8fbeb97000 0.123 ( 0.083 ms): access(filename: 0xbe995e10, mode: R ) = -1 ENOENT No such file or directory 0.130 ( 0.004 ms): open(filename: 0xbe994761, flags: CLOEXEC) = 3 0.133 ( 0.002 ms): fstat(fd: 3, statbuf: 0x7fff6487a890 ) = 0 0.138 ( 0.003 ms): mmap(len: 106798, prot: READ, flags: PRIVATE, fd: 3) = 0x7f8fbeb7c000 0.140 ( 0.001 ms): close(fd: 3 ) = 0 0.151 ( 0.004 ms): open(filename: 0xbeb97640, flags: CLOEXEC) = 3 <SNIP> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-p7m8llv81iv55ekxexdp5n57@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-02perf trace: Introduce function to set the base timestampArnaldo Carvalho de Melo1-5/+11
That is used in both live runs, i.e.: # trace ls As when processing events recorded in a perf.data file: # trace -i perf.data Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-901l6yebnzeqg7z8mbaf49xb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-02perf tools: Fix PMU term format max value calculationKan Liang1-7/+6
Currently the max value of format is calculated by the bits number. It relies on the continuity of the format. However, uncore event format is not continuous. E.g. uncore qpi event format can be 0-7,21. If bit 21 is set, there is parsing issues as below. $ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/ event syntax error: '..pi_0/event=0x200002,umask=0x8/' \___ value too big for format, maximum is 511 This patch return the real max value by setting all possible bits to 1. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-02perf intel-pt/bts: Define JITDUMP_USE_ARCH_TIMESTAMPAdrian Hunter2-0/+10
For Intel PT / BTS, define the environment variable that selects TSC timestamps in the jitdump file. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426333-30260-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-04-02perf jit: Add support for using TSC as a timestampAdrian Hunter7-28/+87
Intel PT uses TSC as a timestamp, so add support for using TSC instead of the monotonic clock. Use of TSC is selected by an environment variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file with flag JITDUMP_FLAGS_ARCH_TIMESTAMP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com [ Added the fixup from He Kuang to make it build on other arches, ] [ such as aarch64, to avoid inserting this bisectiong breakage upstream ] Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf tools: Add time conversion eventAdrian Hunter9-0/+75
Intel PT uses the time members from the perf_event_mmap_page to convert between TSC and perf time. Due to a lack of foresight when Intel PT was implemented, those time members were recorded in the (implementation dependent) AUXTRACE_INFO event, the structure of which is generally inaccessible outside of the Intel PT decoder. However now the conversion between TSC and perf time is needed when processing a jitdump file when Intel PT has been used for tracing. So add a user event to record the time members. 'perf record' will synthesize the event if the information is available. And session processing will put a copy of the event on the session so that tools like 'perf inject' can easily access it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Pretty print getrandom() argsArnaldo Carvalho de Melo1-0/+26
# trace -e getrandom 35622.560 ( 0.023 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35622.585 ( 0.006 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35622.594 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35627.395 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 35630.940 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 35718.613 ( 0.015 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35718.629 ( 0.005 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35718.637 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 35719.355 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 35721.042 ( 0.030 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41090.830 ( 0.012 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41090.845 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41090.851 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41091.750 ( 0.010 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 41091.823 ( 0.006 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41122.078 ( 0.053 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41122.129 ( 0.009 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41122.139 ( 0.004 ms): systemd-udevd/631 getrandom(buf: 0x55621e3c18f0, count: 16, flags: NONBLOCK) = 16 41124.492 ( 0.007 ms): libvirtd/1353 getrandom(buf: 0x7f7a1bfa35c0, count: 16, flags: NONBLOCK ) = 16 41124.470 ( 0.013 ms): fwupd/16120 getrandom(buf: 0x7f63243aa5c0, count: 16, flags: NONBLOCK ) = 16 41590.832 ( 0.014 ms): chrome/5957 getrandom(buf: 0x7fabac7b15b0, count: 16, flags: NONBLOCK ) = 16 41590.884 ( 0.004 ms): chrome/5957 getrandom(buf: 0x7fabac7b15c0, count: 16, flags: NONBLOCK ) = 16 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-gca0n1p3aca3depey703ph2q@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Pretty print seccomp() argsArnaldo Carvalho de Melo1-0/+47
E.g: # trace -e seccomp 200.061 (0.009 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC ) = -1 EFAULT Bad address 200.910 (0.121 ms): :2441/2441 seccomp(op: FILTER, flags: TSYNC, uargs: 0x7fff57479fe0) = 0 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t369uckshlwp4evkks4bcoo7@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf trace: Do not process PERF_RECORD_LOST twiceArnaldo Carvalho de Melo1-0/+1
We catch this record to provide a visual indication that events are getting lost, then call the default method to allow extra logging shared with the other tools to take place. This extra logging was done twice because we were continuing to the "default" clause where machine__process_event() will end up calling machine__process_lost_event() again, fix it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wus2zlhw3qo24ye84ewu4aqw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-03-31perf/ring_buffer: Prepare writing into the ring-buffer from the endWang Nan1-6/+36
Convert perf_output_begin() to __perf_output_begin() and make the later function able to write records from the end of the ring-buffer. Following commits will utilize the 'backward' flag. This is the core patch to support writing to the ring-buffer backwards, which will be introduced by upcoming patches to support reading from overwritable ring-buffers. In theory, this patch should not introduce any extra performance overhead since we use always_inline, but it does not hurt to double check that assumption: When CONFIG_OPTIMIZE_INLINING is disabled, the output object is nearly identical to original one. See: http://lkml.kernel.org/g/56F52E83.70409@huawei.com When CONFIG_OPTIMIZE_INLINING is enabled, the resuling object file becomes smaller: $ size kernel/events/ring_buffer.o* text data bss dec hex filename 4641 4 8 4653 122d kernel/events/ring_buffer.o.old 4545 4 8 4557 11cd kernel/events/ring_buffer.o.new Performance testing results: Calling 3000000 times of 'close(-1)', use gettimeofday() to check duration. Use 'perf record -o /dev/null -e raw_syscalls:*' to capture system calls. In ns. Testing environment: CPU : Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz Kernel : v4.5.0 MEAN STDVAR BASE 800214.950 2853.083 PRE 2253846.700 9997.014 POST 2257495.540 8516.293 Where 'BASE' is pure performance without capturing. 'PRE' is test result of pure 'v4.5.0' kernel. 'POST' is test result after this patch. Considering the stdvar, this patch doesn't hurt performance, within noise margin. For testing details, see: http://lkml.kernel.org/g/56F89DCD.1040202@huawei.com Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459147292-239310-4-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31perf/core: Set event's default ::overflow_handler()Wang Nan4-10/+18
Set a default event->overflow_handler in perf_event_alloc() so don't need to check event->overflow_handler in __perf_event_overflow(). Following commits can give a different default overflow_handler. Initial idea comes from Peter: http://lkml.kernel.org/r/20130708121557.GA17211@twins.programming.kicks-ass.net Since the default value of event->overflow_handler is not NULL, existing 'if (!overflow_handler)' checks need to be changed. is_default_overflow_handler() is introduced for this. No extra performance overhead is introduced into the hot path because in the original code we still need to read this handler from memory. A conditional branch is avoided so actually we remove some instructions. Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459147292-239310-3-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31perf/ring_buffer: Introduce new ioctl options to pause and resume the ↵Wang Nan4-1/+34
ring-buffer Add new ioctl() to pause/resume ring-buffer output. In some situations we want to read from the ring-buffer only when we ensure nothing can write to the ring-buffer during reading. Without this patch we have to turn off all events attached to this ring-buffer to achieve this. This patch is a prerequisite to enable overwrite support for the perf ring-buffer support. Following commits will introduce new methods support reading from overwrite ring buffer. Before reading, caller must ensure the ring buffer is frozen, or the reading is unreliable. Signed-off-by: Wang Nan <wangnan0@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <pi3orama@163.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: Zefan Li <lizefan@huawei.com> Link: http://lkml.kernel.org/r/1459147292-239310-2-git-send-email-wangnan0@huawei.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-03-31ftrace/perf: Check sample types only for sampling eventsJiri Olsa1-0/+3
Currently we check sample type for ftrace:function events even if it's not created as a sampling event. That prevents creating ftrace_function event in counting mode. Make sure we check sample types only for sampling events. Before: $ sudo perf stat -e ftrace:function ls ... Performance counter stats for 'ls': <not supported> ftrace:function 0.001983662 seconds time elapsed After: $ sudo perf stat -e ftrace:function ls ... Performance counter stats for 'ls': 44,498 ftrace:function 0.037534722 seconds time elapsed Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1458138873-1553-2-git-send-email-jolsa@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>