summaryrefslogtreecommitdiff
path: root/tools/perf/util/annotate-data.c
AgeCommit message (Collapse)AuthorFilesLines
2024-09-10perf annotate-data: Add pr_debug_scope()Namhyung Kim1-2/+23
The pr_debug_scope() is to print more information about the scope DIE during the instruction tracking so that it can help finding relevant debug info and the source code like inlined functions more easily. $ perf --debug type-profile annotate --data-type ... ----------------------------------------------------------- find data type for 0(reg0, reg12) at set_task_cpu+0xdd CU for kernel/sched/core.c (die:0x1268dae) frame base: cfa=1 fbreg=7 scope: [3/3] (die:12b6d28) [inlined] set_task_rq <<<--- (here) bb: [9f - dd] var [9f] reg3 type='struct task_struct*' size=0x8 (die:0x126aff0) var [9f] reg6 type='unsigned int' size=0x4 (die:0x1268e0d) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240909214251.3033827-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-22perf annotate-data: Copy back variable types after moveNamhyung Kim1-0/+31
In some cases, compilers don't set the location expression in DWARF precisely. For instance, it may assign a variable to a register after copying it from a different register. Then it should use the register for the new type but still uses the old register. This makes hard to track the type information properly. This is an example I found in __tcp_transmit_skb(). The first argument (sk) of this function is a pointer to sock and there's a variable (tp) for tcp_sock. static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, gfp_t gfp_mask, u32 rcv_nxt) { ... struct tcp_sock *tp; BUG_ON(!skb || !tcp_skb_pcount(skb)); tp = tcp_sk(sk); prior_wstamp = tp->tcp_wstamp_ns; tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache); ... So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk. But it turned out to be the same value because tcp_sock embeds sock as the first member. The sk is located in reg5 (RDI) and tp is in reg3 (RBX). The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is 0x750. So you need to use RBX (reg3) to access the fields in the tcp_sock. But the code used RDI (reg5) as it has the same value. $ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750 u64 tcp_wstamp_ns; /* 0x748 0x8 */ u64 tcp_clock_cache; /* 0x750 0x8 */ And this is the disassembly of the part of the function. <__tcp_transmit_skb>: ... 44: mov %rdi, %rbx 47: mov 0x748(%rdi), %rsi 4e: mov 0x750(%rdi), %rax 55: cmp %rax, %rsi Because compiler put the debug info to RBX, it only knows RDI is a pointer to sock and accessing those two fields resulted in error due to offset being beyond the type size. ----------------------------------------------------------- find data type for 0x748(reg5) at __tcp_transmit_skb+0x63 CU for net/ipv4/tcp_output.c (die:0x817f543) frame base: cfa=0 fbreg=6 scope: [1/1] (die:81aac3e) bb: [0 - 30] var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df) var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg1 type='int' size=0x4 (die:0x818059e) var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360) var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c) <<<--- the first argument ('sk' at %RDI) mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6) mov [20] stack canary -> reg0 mov [29] reg0 -> -0x30(stack) stack canary bb: [36 - 3e] mov [36] reg4 -> reg15 type='struct sk_buff*' size=0x8 (die:0x8181360) bb: [44 - 63] mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c) <<<--- calling tcp_sk() var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead) <<<--- new variable ('tp' at %RBX) var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd) mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd) chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size <<<--- access with old variable final result: offset bigger than size While it's a fault in the compiler, we could work around this issue by using the type of new variable when it's copied directly. So I've added copied_from field in the register state to track those direct register to register copies. After that new register gets a new type and the old register still has the same type, it'll update (copy it back) the type of the old register. For example, if we can update type of reg5 at __tcp_transmit_skb+0x47, we can find the target type of the instruction at 0x63 like below: ----------------------------------------------------------- find data type for 0x748(reg5) at __tcp_transmit_skb+0x63 ... bb: [44 - 63] mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c) var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead) var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead) <<<--- here mov [47] 0x748(reg5) -> reg4 type='unsigned long long' size=0x8 (die:0x8180edd) mov [4e] 0x750(reg5) -> reg0 type='unsigned long long' size=0x8 (die:0x8180edd) mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd) chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good! <<<--- new type found by insn track: 0x748(reg5) type-offset=0x748 final result: type='struct tcp_sock' size=0xa98 (die:0x819eeb2) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-22perf annotate-data: Update stack slot for the storeNamhyung Kim1-4/+25
When checking the match variable at the target instruction, it might not have any information if it's a first write to a stack slot. In this case it could spill a register value into the stack so the type info is in the source operand. But currently it's hard to get the operand from the checking function. Let's process the instruction and retry to get the type info from the stack if there's no information already. This is an example of __tcp_transmit_skb(). The instructions are <__tcp_transmit_skb>: 0: nopl 0x0(%rax, %rax, 1) 5: push %rbp 6: mov %rsp, %rbp 9: push %r15 b: push %r14 d: push %r13 f: push %r12 11: push %rbx 12: sub $0x98, %rsp 19: mov %r8d, -0xa8(%rbp) ... It cannot find any variable at -0xa8(%rbp) at this point. ----------------------------------------------------------- find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19 CU for net/ipv4/tcp_output.c (die:0x817f543) frame base: cfa=0 fbreg=6 scope: [1/1] (die:81aac3e) bb: [0 - 19] var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df) var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg1 type='int' size=0x4 (die:0x818059e) var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360) var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c) chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : no type information no type information And it was able to find the type after processing the 'mov' instruction. ----------------------------------------------------------- find data type for -0xa8(reg6) at __tcp_transmit_skb+0x19 CU for net/ipv4/tcp_output.c (die:0x817f543) frame base: cfa=0 fbreg=6 scope: [1/1] (die:81aac3e) bb: [0 - 19] var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df) var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6) var [5] reg1 type='int' size=0x4 (die:0x818059e) var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360) var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c) chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : retry <<<--- here mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6) chk [19] reg6 offset=-0xa8 ok=0 kind=0 fbreg : Good! found by insn track: -0xa8(reg6) type-offset=0 final result: type='unsigned int' size=0x4 (die:0x8180ed6) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240821232628.353177-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-22perf annotate-data: Update debug messagesNamhyung Kim1-10/+35
In check_matching_type(), it'd be easier to display the typename in question if it's available. For example, check out the line starts with 'chk'. ----------------------------------------------------------- find data type for 0x10(reg0) at cpuacct_charge+0x13 CU for kernel/sched/build_utility.c (die:0x137ee0b) frame base: cfa=1 fbreg=7 scope: [3/3] (die:13d9632) bb: [c - 13] var [c] reg5 type='struct task_struct*' size=0x8 (die:0x1381230) mov [c] 0xdf8(reg5) -> reg0 type='struct css_set*' size=0x8 (die:0x1385c56) chk [13] reg0 offset=0x10 ok=1 kind=1 (struct css_set*) : Good! <<<--- here found by insn track: 0x10(reg0) type-offset=0x10 final result: type='struct css_set' size=0x250 (die:0x1385b0e) Another example: ----------------------------------------------------------- find data type for 0x8(reg0) at menu_select+0x279 CU for drivers/cpuidle/governors/menu.c (die:0x7b0fe79) frame base: cfa=1 fbreg=7 scope: [2/2] (die:7b11010) bb: [273 - 277] bb: [279 - 279] chk [279] reg0 offset=0x8 ok=0 kind=0 cfa : no type information scope: [1/2] (die:7b10cbc) bb: [0 - 64] ... mov [26a] imm=0xffffffff -> reg15 bb: [273 - 277] bb: [279 - 279] chk [279] reg0 offset=0x8 ok=1 kind=1 (long long unsigned int) : no/void pointer <<<--- here final result: no/void pointer Also change some places to print negative offsets properly. Before: ----------------------------------------------------------- find data type for 0xffffff40(reg6) at __tcp_transmit_skb+0x58 After: ----------------------------------------------------------- find data type for -0xc0(reg6) at __tcp_transmit_skb+0x58 Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240821232628.353177-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21perf annotate-data: Show offset and size in hexNamhyung Kim1-1/+1
It'd be better to have them in hex to check cacheline alignment. Percent offset size field 100.00 0 0x1c0 struct cfs_rq { 0.00 0 0x10 struct load_weight load { 0.00 0 0x8 long unsigned int weight; 0.00 0x8 0x4 u32 inv_weight; }; 0.00 0x10 0x4 unsigned int nr_running; 14.56 0x14 0x4 unsigned int h_nr_running; 0.00 0x18 0x4 unsigned int idle_nr_running; 0.00 0x1c 0x4 unsigned int idle_h_nr_running; ... Committer notes: Justification from Namhyung when asked about why it would be "better": Cache line sizes are power of 2 so it'd be natural to use hex and check whether an offset is in the same boundary. Also 'perf annotate' shows instruction offsets in hex. > > Maybe this should be selectable? I can add an option and/or a config if you want. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240819233603.54941-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21perf annotate-data: Fix percpu pointer checkNamhyung Kim1-59/+63
In check_matching_type(), it checks the type state of the register in a wrong order. When it's the percpu pointer, it should check the type for the pointer, but it checks the CFA bit first and thought it has no type in the stack slot. This resulted in no type info. ----------------------------------------------------------- find data type for 0x28(reg1) at hrtimer_reprogram+0x88 CU for kernel/time/hrtimer.c (die:0x18f219f) frame base: cfa=1 fbreg=7 ... add [72] percpu 0x24500 -> reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46) bb: [7a - 7e] bb: [80 - 86] (here) bb: [88 - 88] vvv chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information no type information Here, instruction at 0x72 found reg1 has a (percpu) pointer and got the correct type. But when it checks the final result, it wrongly thought it was stack variable because it checks the cfa bit first. After changing the order of state check: ----------------------------------------------------------- find data type for 0x28(reg1) at hrtimer_reprogram+0x88 CU for kernel/time/hrtimer.c (die:0x18f219f) frame base: cfa=1 fbreg=7 ... (here) vvvvvvvvvv chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good! found by insn track: 0x28(reg1) type-offset=0x28 final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21perf annotate-data: Prefer struct/union over base typeNamhyung Kim1-1/+19
Sometimes a compound type can have a single field and the size is the same as the base type. But it's still preferred as struct or union could carry more information than the base type. Also put a slight priority on the typedef for the same reason. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240821065408.285548-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-20perf annotate-data: Set bitfield member offset and size properlyNamhyung Kim1-6/+28
The bitfield members might not have DW_AT_data_member_location. Let's use DW_AT_data_bit_offset to set the member offset correct. Also use DW_AT_bit_size for the name like in a C program. Before: Annotate type: 'struct sk_buff' (1 samples) Percent Offset Size Field - 100.00 0 232 struct sk_buff { + 0.00 0 24 union ; + 0.00 24 8 union ; + 0.00 32 8 union ; 0.00 40 48 char[] cb; + 0.00 88 16 union ; 0.00 104 8 long unsigned int _nfct; 100.00 112 4 unsigned int len; 0.00 116 4 unsigned int data_len; 0.00 120 2 __u16 mac_len; 0.00 122 2 __u16 hdr_len; 0.00 124 2 __u16 queue_mapping; 0.00 126 0 __u8[] __cloned_offset; 0.00 0 1 __u8 cloned; 0.00 0 1 __u8 nohdr; 0.00 0 1 __u8 fclone; 0.00 0 1 __u8 peeked; 0.00 0 1 __u8 head_frag; 0.00 0 1 __u8 pfmemalloc; 0.00 0 1 __u8 pp_recycle; 0.00 127 1 __u8 active_extensions; + 0.00 128 60 union ; 0.00 188 4 sk_buff_data_t tail; 0.00 192 4 sk_buff_data_t end; 0.00 200 8 unsigned char* head; After: Annotate type: 'struct sk_buff' (1 samples) Percent Offset Size Field - 100.00 0 232 struct sk_buff { + 0.00 0 24 union ; + 0.00 24 8 union ; + 0.00 32 8 union ; 0.00 40 48 char[] cb + 0.00 88 16 union ; 0.00 104 8 long unsigned int _nfct; 100.00 112 4 unsigned int len; 0.00 116 4 unsigned int data_len; 0.00 120 2 __u16 mac_len; 0.00 122 2 __u16 hdr_len; 0.00 124 2 __u16 queue_mapping; 0.00 126 0 __u8[] __cloned_offset; 0.00 126 1 __u8 cloned:1; 0.00 126 1 __u8 nohdr:1; 0.00 126 1 __u8 fclone:2; 0.00 126 1 __u8 peeked:1; 0.00 126 1 __u8 head_frag:1; 0.00 126 1 __u8 pfmemalloc:1; 0.00 126 1 __u8 pp_recycle:1; 0.00 127 1 __u8 active_extensions; + 0.00 128 60 union ; 0.00 188 4 sk_buff_data_t tail; 0.00 192 4 sk_buff_data_t end; 0.00 200 8 unsigned char* head; Commiter notes: Collect some data: root@number:~# perf mem record -a --ldlat 5 -- ping -s 8193 -f 192.168.86.1 Memory events are enabled on a subset of CPUs: 16-27 PING 192.168.86.1 (192.168.86.1) 8193(8221) bytes of data. .^C --- 192.168.86.1 ping statistics --- 13881 packets transmitted, 13880 received, 0.00720409% packet loss, time 8664ms rtt min/avg/max/mdev = 0.510/0.599/7.768/0.115 ms, ipg/ewma 0.624/0.593 ms [ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 14.877 MB perf.data (46785 samples) ] root@number:~# root@number:~# perf evlist cpu_atom/mem-loads,ldlat=5/P cpu_atom/mem-stores/P dummy:u root@number:~# perf evlist -v cpu_atom/mem-loads,ldlat=5/P: type: 10 (cpu_atom), size: 136, config: 0x5d0 (mem-loads), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1, { bp_addr, config1 }: 0x7 cpu_atom/mem-stores/P: type: 10 (cpu_atom), size: 136, config: 0x6d0 (mem-stores), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CPU|PERIOD|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, disabled: 1, inherit: 1, freq: 1, precise_ip: 3, sample_id_all: 1 dummy:u: type: 1 (software), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|ADDR|CPU|IDENTIFIER|DATA_SRC|WEIGHT_STRUCT, read_format: ID|LOST, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 root@number:~# Ok, now lets see what changes from before this patch to after it: root@number:~# perf annotate --data-type > /tmp/before Apply the patch, build: root@number:~# perf annotate --data-type > /tmp/after The first hunk of the diff, for a glib data structure, in userspace, look at those bitfields: root@number:~# diff -u10 /tmp/before /tmp/after | head -20 --- /tmp/before 2024-08-20 17:29:58.306765780 -0300 +++ /tmp/after 2024-08-20 17:33:13.210582596 -0300 @@ -163,22 +163,22 @@ Annotate type: 'GHashTable' in /usr/lib64/libglib-2.0.so.0.8000.3 (1 samples): ============================================================================ Percent offset size field 100.00 0 96 GHashTable { 0.00 0 8 gsize size; 0.00 8 4 gint mod; 100.00 12 4 guint mask; 0.00 16 4 guint nnodes; 0.00 20 4 guint noccupied; - 0.00 0 4 guint have_big_keys; - 0.00 0 4 guint have_big_values; + 0.00 24 1 guint have_big_keys:1; + 0.00 24 1 guint have_big_values:1; 0.00 32 8 gpointer keys; 0.00 40 8 guint* hashes; 0.00 48 8 gpointer values; root@number:~# As advertised :-) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240815223823.2402285-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Update type stat at the end of find_data_type_die()Namhyung Kim1-17/+30
After trying all possibilities with DWARF and instruction tracking. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Check variables in every scopeNamhyung Kim1-17/+27
Sometimes it matches a variable in the inner scope but it fails because the actual access can be on a different type. Let's try variables in every scope and choose the best one using is_better_type(). I have an example with update_blocked_averages(), at first it found a variable (__mptr) but it's a void pointer. So it moved on to the upper scope and found another variable (cfs_rq). $ perf --debug type-profile annotate --data-type --stdio ... ----------------------------------------------------------- find data type for 0x140(reg14) at update_blocked_averages+0x2db CU for kernel/sched/fair.c (die:0x12dd892) frame base: cfa=1 fbreg=7 found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer variable location: base=reg14, offset=0x140 type='void*' size=0x8 (die:0x12dd8f9) found "cfs_rq" (die: 0x1301721) in scope=3/4 (die: 0x130171c) type_offset=0x140 variable location: reg14 type='struct cfs_rq' size=0x1c0 (die:0x12e37e5) final type: type='struct cfs_rq' size=0x1c0 (die:0x12e37e5) IIUC the scope is like below: 1: update_blocked_averages 2: __update_blocked_fair 3: for_each_leaf_cfs_rq_safe 4: list_entry -> (container_of) The container_of is implemented like: #define container_of(ptr, type, member) ({ \ void *__mptr = (void *)(ptr); \ static_assert(__same_type(*(ptr), ((type *)0)->member) || \ __same_type(*(ptr), void), \ "pointer type mismatch in container_of()"); \ ((type *)(__mptr - offsetof(type, member))); }) That's why we see the __mptr variable first but it failed since it has no type information. Then for_each_leaf_cfs_rq_safe() is defined as #define for_each_leaf_cfs_rq_safe(rq, cfs_rq, pos) \ list_for_each_entry_safe(cfs_rq, pos, &rq->leaf_cfs_rq_list, \ leaf_cfs_rq_list) Note that the access was 0x140(r14). And the cfs_rq has leaf_cfs_rq_list at the 0x140. So it converts the list_head pointer to a pointer to struct cfs_rq here. $ pahole --hex -C cfs_rq vmlinux | grep 140 struct cfs_rq struct list_head leaf_cfs_rq_list; /* 0x140 0x10 */ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Add is_better_type() helperNamhyung Kim1-10/+51
Sometimes more than one variables are located in the same register or a stack slot. Or it can overwrite existing information with others. I found this is not helpful in some cases so it needs to update the type information from the variable only if it's better. But it's hard to know which one is better, so we needs heuristics. :) As it deals with memory accesses, the location should have a pointer or something similar (like array or reference). So if it had an integer type and a variable is a pointer, we can take the variable's type to resolve the target of the access. If it has a pointer type and a variable with the same location has a different pointer type, it'll take one with bigger target type. This can be useful when the target type embeds a smaller type (like list header or RB-tree node) at the beginning so their location is same. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Add is_pointer_type() helperNamhyung Kim1-9/+14
It treats pointers and arrays in the same way. Let's add the helper and use it when it checks if it needs a pointer. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Change return type of find_data_type_block()Namhyung Kim1-59/+58
So that it can return enum variable_match_type to be propagated to the find_data_type_die(). Also update the debug message to show the result of the check_matching_type(). chk [dd] reg0 offset=0 ok=1 kind=1 : Good! or chk [177] reg4 offset=0x138 ok=0 kind=0 cfa : no type information Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Add variable_state_str()Namhyung Kim1-15/+26
So that it can show a proper debug message in the right place. The check_variable() is used in other places which don't want to print the message. $ perf --debug type-profile annotate --data-type Before: ----------------------------------------------------------- find data type for 0x140(reg14) at update_blocked_averages+0x2db CU for kernel/sched/fair.c (die:0x12dd892) frame base: cfa=1 fbreg=7 no pointer or no type <<<--- removed check variable "__mptr" failed (die: 0x13022f1) variable location: base=reg14, offset=0x140 type='void*' size=0x8 (die:0x12dd8f9) After: ----------------------------------------------------------- find data type for 0x140(reg14) at update_blocked_averages+0x2db CU for kernel/sched/fair.c (die:0x12dd892) frame base: cfa=1 fbreg=7 found "__mptr" (die: 0x13022f1) in scope=4/4 (die: 0x13022e8) failed: no/void pointer <<<--- here variable location: base=reg14, offset=0x140 type='void*' size=0x8 (die:0x12dd8f9) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Add 'enum type_match_result'Namhyung Kim1-11/+25
And let check_variable() return the enum value so that callers can know what was the problem. This will be used by the later patch to update the statistics correctly and print the error message in a right place. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-19perf annotate-data: Fix off-by-one in location range checkNamhyung Kim1-1/+1
The location list will have entries with half-open addressing like [start, end) which means it doesn't include the end address. So it should skip entries at the end address and match to the next entry. An example location list looks like this (from readelf -wo): 00237876 ffffffff8110d32b (base address) 0023787f v000000000000000 v000000000000002 views at 00237868 for: ffffffff8110d32b ffffffff8110d4eb (DW_OP_reg3 (rbx)) <<<--- 1 00237885 v000000000000002 v000000000000000 views at 0023786a for: ffffffff8110d4eb ffffffff8110d50b (DW_OP_reg14 (r14)) <<<--- 2 0023788c v000000000000000 v000000000000001 views at 0023786c for: ffffffff8110d50b ffffffff8110d7c4 (DW_OP_reg3 (rbx)) 00237893 v000000000000000 v000000000000000 views at 0023786e for: ffffffff8110d806 ffffffff8110d854 (DW_OP_reg3 (rbx)) 0023789a v000000000000000 v000000000000000 views at 00237870 for: ffffffff8110d876 ffffffff8110d88e (DW_OP_reg3 (rbx)) The first entry at 0023787f has [8110d32b, 8110d4eb) (omitting the ffffffff at the beginning), and the second one has [8110d4eb, 8110d50b). Fixes: 2bc3cf575a162a2c ("perf annotate-data: Improve debug message with location info") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240816235840.2754937-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-10perf annotate-data: Support --skip-empty optionNamhyung Kim1-22/+22
The --skip-empty option is to hide dummy events in a group. Like other output mode in 'perf report' and 'perf annotate', the data-type profiling output should support the option. Committer testing: With dummy: root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24 Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples): event[0] = cpu_atom/mem-loads,ldlat=30/P event[1] = cpu_atom/mem-stores/P event[2] = dummy:u ============================================================================ Percent offset size field 100.00 100.00 0.00 0 40 pthread_mutex_t { 100.00 100.00 0.00 0 40 struct __pthread_mutex_s __data { 45.21 84.54 0.00 0 4 int __lock; 0.00 0.00 0.00 4 4 unsigned int __count; 0.00 1.83 0.00 8 4 int __owner; 5.19 10.65 0.00 12 4 unsigned int __nusers; 49.61 2.97 0.00 16 4 int __kind; 0.00 0.00 0.00 20 2 short int __spins; 0.00 0.00 0.00 22 2 short int __elision; 0.00 0.00 0.00 24 16 __pthread_list_t __list { 0.00 0.00 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 0.00 0.00 32 8 struct __pthread_internal_list* __next; }; }; 0.00 0.00 0.00 0 0 char[] __size; 45.21 84.54 0.00 0 8 long int __align; }; Skipping it: root@number:~# perf annotate --stdio --group --data-type --skip-empty | head -24 Annotate type: 'pthread_mutex_t' in /usr/lib64/libc.so.6 (50 samples): event[0] = cpu_atom/mem-loads,ldlat=30/P event[1] = cpu_atom/mem-stores/P ============================================================================ Percent offset size field 100.00 100.00 0 40 pthread_mutex_t { 100.00 100.00 0 40 struct __pthread_mutex_s __data { 45.21 84.54 0 4 int __lock; 0.00 0.00 4 4 unsigned int __count; 0.00 1.83 8 4 int __owner; 5.19 10.65 12 4 unsigned int __nusers; 49.61 2.97 16 4 int __kind; 0.00 0.00 20 2 short int __spins; 0.00 0.00 22 2 short int __elision; 0.00 0.00 24 16 __pthread_list_t __list { 0.00 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 0.00 32 8 struct __pthread_internal_list* __next; }; }; 0.00 0.00 0 0 char[] __size; 45.21 84.54 0 8 long int __align; }; Annotate type: 'pthread_mutexattr_t' in /usr/lib64/libc.so.6 (1 samples): root@number:~# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240807061713.1642924-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-08perf annotate-data: Show typedef names properlyNamhyung Kim1-9/+30
The die_get_typename() would resolve typedef and get to the original type. But sometimes the original type is a struct without name and it makes the output confusing and hard to read. This is a diff of perf report -s type before and after the change. New types such as atomic{,64}_t and sigset_t appeared and the portion of unnamed struct was reduced. Also u32, u64 and size_t were splitted from the base types. --- b 2024-08-01 17:02:34.307809952 -0700 +++ a 2024-08-07 14:17:05.245853999 -0700 - 2.40% long unsigned int + 2.26% long unsigned int - 1.56% unsigned int + 1.27% unsigned int - 0.98% struct - 0.79% long long unsigned int + 0.58% long long unsigned int + 0.36% struct + 0.27% atomic64_t + 0.22% u32 + 0.21% u64 + 0.19% atomic_t + 0.13% size_t - 0.08% struct seqcount_spinlock + 0.08% seqcount_spinlock_t + 0.08% sigset_t + 0.08% __poll_t Let's use the typedef name directly and the resolved to get the size of the type. Committer testing: root@x1:~# diff -u before after | head -30 --- before 2024-08-08 09:35:13.917325041 -0300 +++ after 2024-08-08 09:37:35.312257905 -0300 @@ -10,25 +10,27 @@ # ........ ......... # 79.40% (unknown) - 2.28% union 1.96% (stack operation) - 1.24% struct + 1.87% pthread_mutex_t 0.99% u32[] - 0.92% unsigned int 0.77% struct task_struct + 0.75% U32 0.75% struct pcpu_hot 0.63% struct qspinlock + 0.61% atomic_t 0.59% struct list_head - 0.58% int 0.53% struct cfs_rq 0.51% BYTE* - 0.48% unsigned char + 0.48% BYTE 0.48% long unsigned int 0.46% struct rq 0.41% struct worker 0.41% struct memcg_vmstats_percpu + 0.41% pthread_cond_t 0.37% _Bool + 0.36% int root@x1:~# Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240807223129.1738004-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-08perf annotate: Cache debuginfo for data type profilingNamhyung Kim1-13/+2
In find_data_type(), it creates and deletes a debug info whenver it tries to find data type for a sample. This is inefficient and it most likely accesses the same binary again and again. Let's add a single entry cache the debug info structure for the last DSO. Depending on sample data, it usually gives me 2~3x (and sometimes more) speed ups. Note that this will introduce a little difference in the output due to the order of checking stack operations. It used to check the stack ops before checking the availability of debug info but I moved it after the symbol check. So it'll report stack operations in DSOs without debug info as unknown. But I think it's ok and better to have the checking near the caching logic. Committer testing: root@x1:~# perf mem record -a sleep 5s root@x1:~# perf evlist cpu_atom/mem-loads,ldlat=30/P cpu_atom/mem-stores/P dummy:u root@x1:~# diff -u before after --- before 2024-08-08 09:33:53.880780784 -0300 +++ after 2024-08-08 09:35:13.917325041 -0300 @@ -81,8 +81,8 @@ # Overhead Data Type # ........ ......... # - 55.43% (unknown) - 11.61% (stack operation) + 55.56% (unknown) + 11.48% (stack operation) 4.93% struct pcpu_hot 3.26% unsigned int 2.48% struct Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240805234648.1453689-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Update instruction tracking for powerpcAthira Rajeev1-1/+8
Add instruction tracking function "update_insn_state_powerpc" for powerpc. Example sequence in powerpc: ld r10,264(r3) mr r31,r3 <<after some sequence> ld r9,312(r31) Consider ithe sample is pointing to: "ld r9,312(r31)". Here the memory reference is hit at "312(r31)" where 312 is the offset and r31 is the source register. Previous instruction sequence shows that register state of r3 is moved to r31. So to identify the data type for r31 access, the previous instruction ("mr") needs to be tracked and the state type entry has to be updated. Current instruction tracking support in perf tools infrastructure is specific to x86. Patch adds this support for powerpc as well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-12-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Add "update_insn_state" callback function to handle arch ↵Athira Rajeev1-383/+8
specific instruction tracking Add "update_insn_state" callback to "struct arch" to handle instruction tracking. Currently updating instruction state is handled by static function "update_insn_state_x86" which is defined in "annotate-data.c". Make this as a callback for specific arch and move to archs specific file "arch/x86/annotate/instructions.c" . This will help to add helper function for other platforms in file: "arch/<platform>/annotate/instructions.c" and make changes/updates easier. Define callback "update_insn_state" as part of "struct arch", also make some of the debug functions non-static so that it can be referenced from other places. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31perf annotate: Move the data structures related to register type to header fileAthira Rajeev1-52/+1
Data type profiling uses instruction tracking by checking each instruction and updating the register type state in some data structures. This is useful to find the data type in cases when the register state gets transferred from one reg to another. Example, in x86, "mov" instruction and in powerpc, "mr" instruction. Currently these structures are defined in annotate-data.c and instruction tracking is implemented only for x86. Move these data structures to "annotate-data.h" header file so that other arch implementations can use it in arch specific files as well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-11perf annotate-data: Ensure the number of type histogramsNamhyung Kim1-1/+4
Arnaldo reported that there is a case where nr_histograms and histograms don't agree each other. It ended up in a segfault trying to access a NULL histograms array. Let's make sure to update the nr_histograms when the histograms array is changed. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240510210452.2449944-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-07perf annotate: Use zfree() to avoid possibly accessing dangling pointersArnaldo Carvalho de Melo1-8/+9
When freeing a->b it is good practice to set a->b to NULL using zfree(&a->b) so that when we have a bug where a reference to a freed 'a' pointer is kept somewhere, we can more quickly cause a segfault if some code tries to use a->b. This is mostly done but some new cases were introduced recently, convert them to zfree(). Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZjmbHHrjIm5YRIBv@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06perf dso: Add reference count checking and accessor functionsIan Rogers1-9/+9
Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso. The majority of the change is mechanical in nature and not easy to split up. Committer testing: 'perf test' up to this patch shows no regressions. But: util/symbol.c: In function ‘dso__load_bfd_symbols’: util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’ 1683 | dso__set_adjust_symbols(dso); | ^~~~~~~~~~~~~~~~~~~~~~~ In file included from util/symbol.c:21: util/dso.h:268:20: note: declared here 268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val) | ^~~~~~~~~~~~~~~~~~~~~~~ make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1 MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/ make[6]: *** Waiting for unfinished jobs.... This was updated: - symbols__fixup_end(&dso->symbols, false); - symbols__fixup_duplicate(&dso->symbols); - dso->adjust_symbols = 1; + symbols__fixup_end(dso__symbols(dso), false); + symbols__fixup_duplicate(dso__symbols(dso)); + dso__set_adjust_symbols(dso); But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora). Add the missing argument: symbols__fixup_end(dso__symbols(dso), false); symbols__fixup_duplicate(dso__symbols(dso)); - dso__set_adjust_symbols(dso); + dso__set_adjust_symbols(dso, true); Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02perf annotate-data: Check kind of stack variablesNamhyung Kim1-0/+6
I sometimes see ("unknown type") in the result and it was because it didn't check the type of stack variables properly during the instruction tracking. The stack can carry constant values (without type info) and if the target instruction is accessing the stack location, it resulted in the "unknown type". Maybe we could pick one of integer types for the constant, but it doesn't really mean anything useful. Let's just drop the stack slot if it doesn't have a valid type info. Here's an example how it got the unknown type. Note that 0xffffff48 = -0xb8. ----------------------------------------------------------- find data type for 0xffffff48(reg6) at ... CU for ... frame base: cfa=0 fbreg=6 scope: [2/2] (die:11cb97f) bb: [37 - 3a] var [37] reg15 type='int' size=0x4 (die:0x1180633) bb: [40 - 4b] mov [40] imm=0x1 -> reg13 var [45] reg8 type='sigset_t*' size=0x8 (die:0x11a39ee) mov [45] imm=0x1 -> reg2 <--- here reg2 has a constant bb: [215 - 237] mov [218] reg2 -> -0xb8(stack) constant <--- and save it to the stack mov [225] reg13 -> -0xc4(stack) constant call [22f] find_task_by_vgpid call [22f] return -> reg0 type='struct task_struct*' size=0x8 (die:0x11881e8) bb: [5c8 - 5cf] bb: [2fb - 302] mov [2fb] -0xc4(stack) -> reg13 constant bb: [13b - 14d] mov [143] 0xd50(reg3) -> reg5 type='struct task_struct*' size=0x8 (die:0xa31f3c) bb: [153 - 153] chk [153] reg6 offset=0xffffff48 ok=0 kind=0 fbreg <--- access here found by insn track: 0xffffff48(reg6) type-offset=0 type='G<EF>^K<F6><AF>U' size=0 (die:0xffffffffffffffff) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240502060011.1838090-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02perf annotate-data: Handle multi regs in find_data_type_block()Namhyung Kim1-11/+12
The instruction tracking should be the same for the both registers. Just do it once and compare the result with multi regs as with the previous patches. Then we don't need to call find_data_type_block() separately for each reg. Let's remove the 'reg' argument from the relevant functions. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240502060011.1838090-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02perf annotate-data: Check memory access with two registersNamhyung Kim1-5/+39
The following instruction pattern is used to access a global variable. mov $0x231c0, %rax movsql %edi, %rcx mov -0x7dc94ae0(,%rcx,8), %rcx cmpl $0x0, 0xa60(%rcx,%rax,1) <<<--- here The first instruction set the address of the per-cpu variable (here, it is 'runqueues' of type 'struct rq'). The second instruction seems like a cpu number of the per-cpu base. The third instruction get the base offset of per-cpu area for that cpu. The last instruction compares the value of the per-cpu variable at the offset of 0xa60. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240502060011.1838090-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02perf annotate-data: Handle direct global variable accessNamhyung Kim1-12/+15
Like per-cpu base offset array, sometimes it accesses the global variable directly using the offset. Allow this type of instructions as long as it finds a global variable for the address. movslq %edi, %rcx mov -0x7dc94ae0(,%rcx,8), %rcx <<<--- here As %rcx has a valid type (i.e. array index) from the first instruction, it will be checked by the first case in check_matching_type(). But as it's not a pointer type, the match will fail. But in this case, it should check if it accesses the kernel global array variable. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240502060011.1838090-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02perf annotate-data: Collect global variables in advanceNamhyung Kim1-2/+55
Currently it looks up global variables from the current CU using address and name. But it sometimes fails to find a variable as the variable can come from a different CU - but it's still strange it failed to find a declaration for some reason. Anyway, it can collect all global variables from all CU once and then lookup them later on. This slightly improves the success rate of my test data set. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240502060011.1838090-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-16perf annotate-data: Handle RSP if it's not the FB registerNamhyung Kim1-8/+19
In some cases, the stack pointer on x86 (rsp = reg7) is used to point variables on stack but it's not the frame base register. Then it should handle the register like normal registers (IOW not to access the other stack variables using offset calculation) but it should not assume it would have a pointer. Before: ----------------------------------------------------------- find data type for 0x7c(reg7) at tcp_getsockopt+0xb62 CU for net/ipv4/tcp.c (die:0x7b5f516) frame base: cfa=0 fbreg=6 no pointer or no type check variable "zc" failed (die: 0x7b9580a) variable location: base=reg7, offset=0x40 type='struct tcp_zerocopy_receive' size=0x40 (die:0x7b947f4) After: ----------------------------------------------------------- find data type for 0x7c(reg7) at tcp_getsockopt+0xb62 CU for net/ipv4/tcp.c (die:0x7b5f516) frame base: cfa=0 fbreg=6 found "zc" in scope=3/3 (die: 0x7b957fc) type_offset=0x3c variable location: base=reg7, offset=0x40 type='struct tcp_zerocopy_receive' size=0x40 (die:0x7b947f4) Note that the type-offset was properly calculated to 0x3c as the variable starts at 0x40. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240412183310.2518474-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-16perf annotate-data: Improve debug message with location infoNamhyung Kim1-17/+82
To verify it found the correct variable, let's add the location expression to the debug message. $ perf --debug type-profile annotate --data-type ... ----------------------------------------------------------- find data type for 0xaf0(reg15) at schedule+0xeb CU for kernel/sched/core.c (die:0x1180523) frame base: cfa=0 fbreg=6 found "rq" in scope=3/4 (die: 0x11b6a00) type_offset=0xaf0 variable location: reg15 type='struct rq' size=0xfc0 (die:0x11892e2) ----------------------------------------------------------- find data type for 0x7bc(reg3) at tcp_get_info+0x62 CU for net/ipv4/tcp.c (die:0x7b5f516) frame base: cfa=0 fbreg=6 offset: 1980 is bigger than size: 760 check variable "sk" failed (die: 0x7b92b2c) variable location: reg3 type='struct sock' size=0x2f8 (die:0x7b63c3a) ----------------------------------------------------------- ... The first case is fine. It looked up a data type in r15 with offset of 0xaf0 at schedule+0xeb. It found the CU die and the frame base info and the variable "rq" was found in the scope 3/4. Its location is the r15 register and the type size is 0xfc0 which includes 0xaf0. But the second case is not good. It looked up a data type in rbx (reg3) with offset 0x7bc. It found a CU and the frame base which is good so far. And it also found a variable "sk" but the access offset is bigger than the type size (1980 vs. 760 or 0x7bc vs. 0x2f8). The variable has the right location (reg3) but I need to figure out why it accesses beyond what it's supposed to. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240412183310.2518474-2-namhyung@kernel.org [ Fix the build on 32-bit by casting Dwarf_Word to (long) in pr_debug_location() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Add hist_entry__annotate_data_tui()Namhyung Kim1-1/+2
Support data type profiling output on TUI. Testing from Arnaldo: First make sure that the debug information for your workload binaries in embedded in them by building it with '-g' or install the debuginfo packages, since our workload is 'find': root@number:~# type find find is hashed (/usr/bin/find) root@number:~# rpm -qf /usr/bin/find findutils-4.9.0-5.fc39.x86_64 root@number:~# dnf debuginfo-install findutils <SNIP> root@number:~# Then collect some data: root@number:~# echo 1 > /proc/sys/vm/drop_caches root@number:~# perf mem record find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ] root@number:~# Finally do data-type annotation with the following command, that will default, as 'perf report' to the --tui mode, with lines colored to highlight the hotspots, etc. root@number:~# perf annotate --data-type Annotate type: 'struct predicate' (58 samples) Percent Offset Size Field 100.00 0 312 struct predicate { 0.00 0 8 PRED_FUNC pred_func; 0.00 8 8 char* p_name; 0.00 16 4 enum predicate_type p_type; 0.00 20 4 enum predicate_precedence p_prec; 0.00 24 1 _Bool side_effects; 0.00 25 1 _Bool no_default_print; 0.00 26 1 _Bool need_stat; 0.00 27 1 _Bool need_type; 0.00 28 1 _Bool need_inum; 0.00 32 4 enum EvaluationCost p_cost; 0.00 36 4 float est_success_rate; 0.00 40 1 _Bool literal_control_chars; 0.00 41 1 _Bool artificial; 0.00 48 8 char* arg_text; <SNIP> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Add hist_entry__annotate_data_tty()Namhyung Kim1-0/+112
And move the related code into util/annotate-data.c file. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Do not delete non-asm linesNamhyung Kim1-0/+6
For data type profiling, it removed non-instruction lines from the list of annotation lines. It was to simplify the implementation dealing with instructions like to calculate the PC-relative address and to search the shortest path to the target instruction or basic block. But it means that it removes all the comments and debug information in the annotate output like source file name and line numbers. To support both code annotation and data type annotation, it'd be better to keep the non-instruction lines as well. So this change is to skip those lines during the data type profiling and to display them in the normal perf annotate output. No function changes intended (other than having more lines). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405211800.1412920-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Fix global variable lookupNamhyung Kim1-1/+3
The recent change in the global variable handling added a bug to miss setting the return value even if it found a data type. Also add the type name in the debug message. Fixes: 1ebb5e17ef21b492 ("perf annotate-data: Add get_global_var_type()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405211800.1412920-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Do not retry for invalid typesNamhyung Kim1-35/+48
In some cases, it was able to find a type or location info (for per-cpu variable) but cannot match because of invalid offset or missing global information. In those cases, it's meaningless to go to the outer scope and retry because there will be no additional information. Let's change the return type of find_matching_type() and bail out if it returns -1 for the cases. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-24-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Add a cache for global variable typesNamhyung Kim1-3/+104
They are often searched by many different places. Let's add a cache for them to reduce the duplicate DWARF access. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-23-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Add stack canary typeNamhyung Kim1-0/+46
When the stack protector is enabled, compiler would generate code to check stack overflow with a special value called 'stack carary' at runtime. On x86_64, GCC hard-codes the stack canary as %gs:40. While there's a definition of fixed_percpu_data in asm/processor.h, it seems that the header is not included everywhere and many places it cannot find the type info. As it's in the well-known location (at %gs:40), let's add a pseudo stack canary type to handle it specially. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-22-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Handle ADD instructionsNamhyung Kim1-2/+105
There are different patterns for percpu variable access using a constant value added to the base.  2aeb:  mov    -0x7da0f7e0(,%rax,8),%r14 # r14 = __per_cpu_offset[cpu]  2af3:  mov    $0x34740,%rax # rax = address of runqueues * 2afa:  add    %rax,%r14 # r14 = &per_cpu(runqueues, cpu)  2bfd:  cmpl   $0x0,0x10(%r14) # cpu_rq(cpu)->has_blocked_load  2b03:  je     0x2b36 At the first instruction, r14 has the __per_cpu_offset. And then rax has an immediate value and then added to r14 to calculate the address of a per-cpu variable. So it needs to track the immediate values and ADD instructions. Similar but a little different case is to use "this_cpu_off" instead of "__per_cpu_offset" for the current CPU. This time the variable address comes with PC-rel addressing. 89: mov $0x34740,%rax # rax = address of runqueues * 90: add %gs:0x7f015f60(%rip),%rax # 19a78 <this_cpu_off> 98: incl 0xd8c(%rax) # cpu_rq(cpu)->sched_count Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-21-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Support general per-cpu accessNamhyung Kim1-44/+169
This is to support per-cpu variable access often without a matching DWARF entry. For some reason, I cannot find debug info of per-cpu variables sometimes. They have more complex pattern to calculate the address of per-cpu variables like below. 2b7d: mov -0x1e0(%rbp),%rax ; rax = cpu 2b84: mov -0x7da0f7e0(,%rax,8),%rcx ; rcx = __per_cpu_offset[cpu] * 2b8c: mov 0x34870(%rcx),%rax ; *(__per_cpu_offset[cpu] + 0x34870) Let's assume the rax register has a number for a CPU at 2b7d. The next instruction is to get the per-cpu offset' for that cpu. The offset -0x7da0f7e0 is 0xffffffff825f0820 in u64 which is the address of the '__per_cpu_offset' array in my system. So it'd get the actual offset of that CPU's per-cpu region and save it to the rcx register. Then, at 2b8c, accesses using rcx can be handled same as the global variable access. To handle this case, it should check if the offset of the instruction matches to the address of '__per_cpu_offset'. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-20-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Track instructions with a this-cpu variableNamhyung Kim1-1/+30
Like global variables, this per-cpu variables should be tracked correctly. Factor our get_global_var_type() to handle both global and per-cpu (for this cpu) variables in the same manner. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-19-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Handle this-cpu variables in kernelNamhyung Kim1-2/+19
On x86, the kernel gets the current task using the current macro like below: #define current get_current() static __always_inline struct task_struct *get_current(void) { return this_cpu_read_stable(pcpu_hot.current_task); } So it returns the current_task field of struct pcpu_hot which is the first member. On my build, it's located at 0x32940. $ nm vmlinux | grep pcpu_hot 0000000000032940 D pcpu_hot And the current macro generates the instructions like below: mov %gs:0x32940, %rcx So the %gs segment register points to the beginning of the per-cpu region of this cpu and it points the variable with a constant. Let's update the instruction location info to have a segment register and handle %gs in kernel to look up a global variable. Pretend it as a global variable by changing the register number to DWARF_REG_PC. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-18-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Check register state for typeNamhyung Kim1-7/+81
As instruction tracking updates the type state for each register, check the final type info for the target register at the given instruction. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-16-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Implement instruction trackingNamhyung Kim1-14/+209
If it failed to find a variable for the location directly, it might be due to a missing variable in the source code. For example, accessing pointer variables in a chain can result in the case like below: struct foo *foo = ...; int i = foo->bar->baz; The DWARF debug information is created for each variable so it'd have one for 'foo'. But there's no variable for 'foo->bar' and then it cannot know the type of 'bar' and 'baz'. The above source code can be compiled to the follow x86 instructions: mov 0x8(%rax), %rcx mov 0x4(%rcx), %rdx <=== PMU sample mov %rdx, -4(%rbp) Let's say 'foo' is located in the %rax and it has a pointer to struct foo. But perf sample is captured in the second instruction and there is no variable or type info for the %rcx. It'd be great if compiler could generate debug info for %rcx, but we should handle it on our side. So this patch implements the logic to iterate instructions and update the type table for each location. As it already collected a list of scopes including the target instruction, we can use it to construct the type table smartly. +---------------- scope[0] subprogram | | +-------------- scope[1] lexical_block | | | | +------------ scope[2] inlined_subroutine | | | | | | +---------- scope[3] inlined_subroutine | | | | | | | | +-------- scope[4] lexical_block | | | | | | | | | | *** target instruction ... Image the target instruction has 5 scopes, each scope will have its own variables and parameters. Then it can start with the innermost scope (4). So it'd search the shortest path from the start of scope[4] to the target address and build a list of basic blocks. Then it iterates the basic blocks with the variables in the scope and update the table. If it finds a type at the target instruction, then returns it. Otherwise, it moves to the upper scope[3]. Now it'd search the shortest path from the start of scope[3] to the start of scope[4]. Then connect it to the existing basic block list. Then it'd iterate the blocks with variables for both scopes. It can repeat this until it finds a type at the target instruction or reaches to the top scope[0]. As the basic blocks contain the shortest path, it won't worry about branches and can update the table simply. The final check will be done by find_matching_type() in the next patch. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-15-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Handle call instructionsNamhyung Kim1-2/+52
When updating instruction states, the call instruction should play a role since it changes the register states. For simplicity, mark some registers as caller-saved registers (should be arch-dependent), and invalidate them all after a function call. If the function returns something, the designated register (ret_reg) will have the type info. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-14-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Handle global variable accessNamhyung Kim1-4/+42
When updating the instruction states, it also needs to handle global variable accesses. Same as it does for PC-relative addressing, it can look up the type by address (if it's defined in the same file), or by name after finding the symbol by address (for declarations). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-13-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Add get_global_var_type()Namhyung Kim1-14/+48
Accessing global variable is common when it tracks execution later. Factor out the common code into a function for later use. It adds thread and cpumode to struct data_loc_info to find (global) symbols if needed. Also remove var_name as it's retrieved in the helper function. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-12-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Add update_insn_state()Namhyung Kim1-3/+158
The update_insn_state() function is to update the type state table after processing each instruction. For now, it handles MOV (on x86) insn to transfer type info from the source location to the target. The location can be a register or a stack slot. Check carefully when memory reference happens and fetch the type correctly. It basically ignores write to a memory since it doesn't change the type info. One exception is writes to (new) stack slots for register spilling. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-11-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf annotate-data: Maintain variable type infoNamhyung Kim1-0/+173
As it collected basic block and variable information in each scope, it now can build a state table to find matching variable at the location. The struct type_state is to keep the type info saved in each register and stack slot. The update_var_state() updates the table when it finds variables in the current address. It expects die_collect_vars() filled a list of variables with type info and starting address. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240319055115.4063940-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>