summaryrefslogtreecommitdiff
path: root/tools/perf/util/intel-pt-decoder
AgeCommit message (Collapse)AuthorFilesLines
2019-06-11perf intel-pt: Fix sample timestamp wrt non-taken branchesAdrian Hunter1-1/+4
commit 1b6599a9d8e6c9f7e9b0476012383b1777f7fc93 upstream. The sample timestamp is updated to ensure that the timestamp represents the time of the sample and not a branch that the decoder is still walking towards. The sample timestamp is updated when the decoder returns, but the decoder does not return for non-taken branches. Update the sample timestamp then also. Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11perf intel-pt: Fix improved sample timestampAdrian Hunter1-3/+10
commit 61b6e08dc8e3ea80b7485c9b3f875ddd45c8466b upstream. The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. The intel_pt_sample_time() function decides which is which, but was not handling TNT packets exactly correctly. In the case of TNT, the timestamp applies to the first branch, so the decoder must first walk to that branch. That means intel_pt_sample_time() should return true for TNT, and this patch makes that change. However, if the first branch is a non-taken branch (i.e. a 'N'), then intel_pt_sample_time() needs to return false for subsequent taken branches in the same TNT packet. To handle that, introduce a new state INTEL_PT_STATE_TNT_CONT to distinguish the cases. Note that commit 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") was also a stable fix and appears, for example, in v4.4 stable tree as commit a4ebb58fd124 ("perf intel-pt: Improve sample timestamp"). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v4.4+ Fixes: 3f04d98e972b5 ("perf intel-pt: Improve sample timestamp") Link: http://lkml.kernel.org/r/20190510124143.27054-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11perf intel-pt: Fix instructions sampling rateAdrian Hunter1-3/+10
commit 7ba8fa20e26eb3c0c04d747f7fd2223694eac4d5 upstream. The timestamp used to determine if an instruction sample is made, is an estimate based on the number of instructions since the last known timestamp. A consequence is that it might go backwards, which results in extra samples. Change it so that a sample is only made when the timestamp goes forwards. Note this does not affect a sampling period of 0 or sampling periods specified as a count of instructions. Example: Before: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 10 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 8 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ea _dl_cache_libcmp+0xa (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 6 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 14 instructions:u: 7fac71e2d9ff _dl_cache_libcmp+0x1f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 4 instructions:u: 7fac71e2dab2 _dl_cache_libcmp+0xd2 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16423 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222734: 12731 instructions:u: 7fac71e27938 _dl_name_match_p+0x68 (/lib/x86_64-linux-gnu/ld-2.28.so) ... After: $ perf script --itrace=i10us ls 13812 [003] 2167315.222583: 3270 instructions:u: 7fac71e2e494 __GI___tunables_init+0xf4 (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222667: 30902 instructions:u: 7fac71e2da0f _dl_cache_libcmp+0x2f (/lib/x86_64-linux-gnu/ld-2.28.so) ls 13812 [003] 2167315.222728: 16479 instructions:u: 7fac71e2477a _dl_map_object_deps+0x1ba (/lib/x86_64-linux-gnu/ld-2.28.so) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: f4aa081949e7b ("perf tools: Add Intel PT decoder") Link: http://lkml.kernel.org/r/20190510124143.27054-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-03perf intel-pt: Fix TSC slipAdrian Hunter1-12/+8
commit f3b4e06b3bda759afd042d3d5fa86bea8f1fe278 upstream. A TSC packet can slip past MTC packets so that the timestamp appears to go backwards. One estimate is that can be up to about 40 CPU cycles, which is certainly less than 0x1000 TSC ticks, but accept slippage an order of magnitude more to be on the safe side. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Fixes: 79b58424b821c ("perf tools: Add Intel PT support for decoding MTC packets") Link: http://lkml.kernel.org/r/20190325135135.18348-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf intel-pt: Fix overlap calculation for paddingAdrian Hunter1-2/+34
commit 5a99d99e3310a565b0cf63f785b347be9ee0da45 upstream. Auxtrace records might have up to 7 bytes of padding appended. Adjust the overlap accordingly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-23perf intel-pt: Fix CYC timestamp calculation after OVFAdrian Hunter1-1/+0
commit 03997612904866abe7cdcc992784ef65cb3a4b81 upstream. CYC packet timestamp calculation depends upon CBR which was being cleared upon overflow (OVF). That can cause errors due to failing to synchronize with sideband events. Even if a CBR change has been lost, the old CBR is still a better estimate than zero. So remove the clearing of CBR. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03perf intel-pt: Fix packet decoding of CYC packetsAdrian Hunter1-1/+1
commit 621a5a327c1e36ffd7bb567f44a559f64f76358f upstream. Use a 64-bit type so that the cycle count is not limited to 32-bits. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1528371002-8862-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03perf intel-pt: Fix "Unexpected indirect branch" errorAdrian Hunter2-2/+24
commit 9fb523363f6e3984457fee95bb7019395384ffa7 upstream. Some Atom CPUs can produce FUP packets that contain NLIP (next linear instruction pointer) instead of CLIP (current linear instruction pointer). That will result in "Unexpected indirect branch" errors. Fix by comparing IP to NLIP in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03perf intel-pt: Fix MTC timing after overflowAdrian Hunter1-1/+0
commit dd27b87ab5fcf3ea1c060b5e3ab5d31cc78e9f4c upstream. On some platforms, overflows will clear before MTC wraparound, and there is no following TSC/TMA packet. In that case the previous TMA is valid. Since there will be a valid TMA either way, stop setting 'have_tma' to false upon overflow. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-07-03perf intel-pt: Fix decoding to accept CBR between FUP and corresponding TIPAdrian Hunter1-1/+4
commit bd2e49ec48feb1855f7624198849eea4610e2286 upstream. It is possible to have a CBR packet between a FUP packet and corresponding TIP packet. Stop treating it as an error. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1527762225-26024-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24perf intel-pt: Fix timestamp following overflowAdrian Hunter1-0/+1
commit 91d29b288aed3406caf7c454bf2b898c96cfd177 upstream. timestamp_insn_cnt is used to estimate the timestamp based on the number of instructions since the last known timestamp. If the estimate is not accurate enough decoding might not be correctly synchronized with side-band events causing more trace errors. However there are always timestamps following an overflow, so the estimate is not needed and can indeed result in more errors. Suppress the estimate by setting timestamp_insn_cnt to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24perf intel-pt: Fix error recovery from missing TIP packetAdrian Hunter1-0/+1
commit 1c196a6c771c47a2faa63d38d913e03284f73a16 upstream. When a TIP packet is expected but there is a different packet, it is an error. However the unexpected packet might be something important like a TSC packet, so after the error, it is necessary to continue from there, rather than the next packet. That is achieved by setting pkt_step to zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-24perf intel-pt: Fix overlap detection to identify consecutive buffers correctlyAdrian Hunter2-34/+30
commit 117db4b27bf08dba412faf3924ba55fe970c57b8 upstream. Overlap detection was not not updating the buffer's 'consecutive' flag. Marking buffers consecutive has the advantage that decoding begins from the start of the buffer instead of the first PSB. Fix overlap detection to identify consecutive buffers correctly. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-07perf intel-pt: Ensure never to set 'last_ip' when packet 'count' is zeroAdrian Hunter1-3/+5
commit f952eaceb089b691eba7c4e13686e742a8f26bf5 upstream. Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is not updated when a branch target has been suppressed, which is indicated by IPBytes == 0. IPBytes is stored in the packet 'count', so ensure never to set 'last_ip' when packet 'count' is zero. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-07perf intel-pt: Use FUP always when scanning for an IPAdrian Hunter1-8/+4
commit 622b7a47b843c78626f40c1d1aeef8483383fba2 upstream. The decoder will try to use branch packets to find an IP to start decoding or to recover from errors. Currently the FUP packet is used only in the case of an overflow, however there is no reason for that to be a special case. So just use FUP always when scanning for an IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-07perf intel-pt: Fix last_ip usageAdrian Hunter1-2/+11
commit ee14ac0ef6827cd6f9a572cc83dd0191ea17812c upstream. Intel PT uses IP compression based on the last IP. For decoding purposes, 'last IP' is considered to be reset to zero whenever there is a synchronization packet (PSB). The decoder wasn't doing that, and was treating the zero value to mean that there was no last IP, whereas compression can be done against the zero value. Fix by setting last_ip to zero when a PSB is received and keep track of have_last_ip. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-07perf intel-pt: Fix ip compressionAdrian Hunter2-28/+40
commit e1717e0485af4f47fc4da1e979ac817f9ad61b0f upstream. The June 2015 Intel SDM introduced IP Compression types 4 and 6. Refer to section 36.4.2.2 Target IP (TIP) Packet - IP Compression. Existing Intel PT packet decoder did not support type 4, and got type 6 wrong. Because type 3 and type 4 have the same number of bytes, the packet 'count' has been changed from being the number of ip bytes to being the type code. That allows the Intel PT decoder to correctly decide whether to sign-extend or use the last ip. However that also meant the code had to be adjusted in a number of places. Currently hardware is not using the new compression types, so this fix has no effect on existing hardware. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1469005206-3049-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-28perf intel-pt: Clear FUP flag on errorAdrian Hunter1-0/+2
commit 6a558f12dbe85437acbdec5e149ea07b5554eced upstream. Sometimes a FUP packet is associated with a TSX transaction and a flag is set to indicate that. Ensure that flag is cleared on any error condition because at that point the decoder can no longer assume it is correct. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-9-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-28perf intel-pt: Ensure IP is zero when state is INTEL_PT_STATE_NO_IPAdrian Hunter1-0/+1
commit ad7167a8cd174ba7d8c0d0ed8d8410521206d104 upstream. A value of zero is used to indicate that there is no IP. Ensure the value is zero when the state is INTEL_PT_STATE_NO_IP. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-28perf intel-pt: Fix missing stack clearAdrian Hunter1-0/+1
commit 12b7080609097753fd8198cc1daf589be3ec1cca upstream. The return compression stack must be cleared whenever there is a PSB. Fix one case where that was not happening. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-28perf intel-pt: Improve sample timestampAdrian Hunter1-3/+31
commit 3f04d98e972b59706bd43d6cc75efac91f8fba50 upstream. The decoder uses its current timestamp in samples. Usually that is a timestamp that has already passed, but in some cases it is a timestamp for a branch that the decoder is walking towards, and consequently hasn't reached. Improve that situation by using the pkt_state to determine when to use the current or previous timestamp. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-28perf intel-pt: Move decoder error setting into one conditionAdrian Hunter1-4/+7
commit 22c06892332d8916115525145b78e606e9cc6492 upstream. Move decoder error setting into one condition. Cc'ed to stable because later fixes depend on it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1495786658-18063-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-07-15perf intel-pt: Use __fallthroughArnaldo Carvalho de Melo2-0/+7
commit 7ea6856d6f5629d742edc23b8b76e6263371ef45 upstream. To address new warnings emmited by gcc 7, e.g.:: CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o CC /tmp/build/perf/tests/parse-events.o util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc': util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=] if (!(packet->count)) ^ util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here case INTEL_PT_CYC: ^~~~ CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o cc1: all warnings being treated as errors Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-mf0hw789pu9x855us5l32c83@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28perf intel-pt: Fix MTC timestamp calculation for large MTC periodsAdrian Hunter1-0/+36
commit 3bccbe20f6d188ce7b00326e776b745cfd35b10a upstream. The MTC packet provides a 8-bit slice of CTC which is related to TSC by the TMA packet, however the TMA packet only provides the lower 16 bits of CTC. If mtc_shift > 8 then some of the MTC bits are not in the CTC provided by the TMA packet. Fix-up the last_mtc calculated from the TMA packet by copying the missing bits from the current MTC assuming the least difference between the two, and that the current MTC comes after last_mtc. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1475062896-22274-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-28perf intel-pt: Fix estimated timestamps for cycle-accurate modeAdrian Hunter1-0/+2
commit 51ee6481fa8e879cc942bcc1b0af713e158b7a98 upstream. In cycle-accurate mode, timestamps can be calculated from CYC packets. The decoder also estimates timestamps based on the number of instructions since the last timestamp. For that to work in cycle-accurate mode, the instruction count needs to be reset to zero when a timestamp is calculated from a CYC packet, but that wasn't happening, so fix it. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1475062896-22274-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-09-28perf intel-pt: Make logging slightly more efficientAdrian Hunter2-16/+43
Logging is only used for debugging. Use macros to save calling into the functions only to return immediately when logging is not enabled. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-28perf intel-pt: Fix potential loop foreverAdrian Hunter1-2/+2
TSC packets contain only 7 bytes of TSC. The 8th byte is assumed to change so infrequently that its value can be inferred. However the logic must cater for a 7 byte wraparound, which it does by adding 1 to the top byte. The existing code was doing that with a while loop even though the addition should only need to be done once. That logic won't work (will loop forever) if TSC wraps around at the 8th byte. Theoretically that would take at least 10 years, unless something else went wrong. And what else could go wrong. Well, if the chunks of trace data are processed out of order, it will make it look like the 7-byte TSC has gone backwards (i.e. wrapped). If that happens 256 times then stuck in the while loop it will be. Fix that by getting rid of the unnecessary while loop. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1443186956-18718-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04x86/insn: perf tools: Add new xsave instructionsAdrian Hunter1-0/+3
Add xsavec, xsaves and xrstors to the op code map and the perf tools new instructions test. To run the test: $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" 2>&1 | grep 'xsave\|xrst' For information about xsavec, xsaves and xrstors, refer the Intel SDM. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04x86/insn: perf tools: Add new memory protection keys instructionsAdrian Hunter1-1/+1
Add rdpkru and wrpkru to the op code map and the perf tools new instructions test. In the case of the test, only the bytes can be tested at the moment since binutils doesn't support the instructions yet. To run the test: $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" 2>&1 | grep pkru For information about rdpkru and wrpkru, refer the Intel SDM. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04x86/insn: perf tools: Add new memory instructionsAdrian Hunter1-2/+2
Intel Architecture Instruction Set Extensions Programing Reference (Oct 2014) describes 3 new memory instructions, namely clflushopt, clwb and pcommit. Add them to the op code map and the perf tools new instructions test. e.g. $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04x86/insn: perf tools: Add new SHA instructionsAdrian Hunter1-0/+7
Intel SHA Extensions are explained in the Intel Architecture Instruction Set Extensions Programing Reference (Oct 2014). There are 7 new instructions. Add them to the op code map and the perf tools new instructions test. e.g. $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" 2>&1 | grep sha Committer note: 3 lines of details, for the curious: $ perf test -v "x86 ins" 2>&1 | grep sha256msg1 | tail -3 Decoded ok: 0f 38 cc 84 08 78 56 34 12 sha256msg1 0x12345678(%rax,%rcx,1),%xmm0 Decoded ok: 0f 38 cc 84 c8 78 56 34 12 sha256msg1 0x12345678(%rax,%rcx,8),%xmm0 Decoded ok: 44 0f 38 cc bc c8 78 56 34 12 sha256msg1 0x12345678(%rax,%rcx,8),%xmm15 $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04x86/insn: perf tools: Pedantically tweak opcode map for MPX instructionsAdrian Hunter1-2/+6
The MPX instructions are presently not described in the SDM opcode maps, and there are not encoding characters for bnd registers, address method or operand type. So the kernel opcode map is using 'Gv' for bnd registers and 'Ev' for everything else. That is fine because the instruction decoder does not use that information anyway, except as an indication that there is a ModR/M byte. Nevertheless, in some cases the 'Gv' and 'Ev' are the wrong way around, BNDLDX and BNDSTX have 2 operands not 3, and it wouldn't hurt to identify the mandatory prefixes. This has no effect on the decoding of valid instructions, but the addition of the mandatory prefixes will cause some invalid instructions to error out that wouldn't have previously. Note that perf tools has a copy of the instruction decoder and provides a test for new instructions which includes MPX instructions e.g. $ perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ perf test -v "x86 ins" Commiter notes: And to see these MPX instructions specifically: $ perf test -v "x86 ins" 2>&1 | grep bndldx | head -3 Decoded ok: 0f 1a 00 bndldx (%eax),%bnd0 Decoded ok: 0f 1a 05 78 56 34 12 bndldx 0x12345678,%bnd0 Decoded ok: 0f 1a 18 bndldx (%eax),%bnd3 $ perf test -v "x86 ins" 2>&1 | grep bndstx | head -3 Decoded ok: 0f 1b 00 bndstx %bnd0,(%eax) Decoded ok: 0f 1b 05 78 56 34 12 bndstx %bnd0,0x12345678 Decoded ok: 0f 1b 18 bndstx %bnd3,(%eax) $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-4-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-04perf tools: Display build warning if x86 instruction decoder differs from kernelAdrian Hunter1-1/+12
perf tools has a copy of the x86 instruction decoder used by the kernel. The expectation is that the copy will be kept more-or-less in-synch with the kernel version. Consequently it is helpful to know if there are differences. This patch adds a check into the perf tools build so that a diff is done on the sources, and a warning is printed if they are different. Note that the warning is not fatal and the build continues as normal. The check is done as part of building the instruction decoder, so, like a compiler warning, it is not seen unless the instruction decoder has to be re-compiled. e.g. $ make -C tools/perf >/dev/null $ echo "/* blah */" >> tools/perf/util/intel-pt-decoder/inat_types.h $ make -C tools/perf >/dev/null Warning: Intel PT: x86 instruction decoder differs from kernel $ make -C tools/perf >/dev/null $ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1441196131-20632-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-01perf build: Fix Intel PT instruction decoder dependency problemWang Nan1-0/+1
I hit following building error randomly: ... /bin/sh: /path/to/kernel/buildperf/util/intel-pt-decoder/inat-tables.c: No such file or directory ... LINK /path/to/kernel/buildperf/plugin_mac80211.so LINK /path/to/kernel/buildperf/plugin_kmem.so LINK /path/to/kernel/buildperf/plugin_xen.so LINK /path/to/kernel/buildperf/plugin_hrtimer.so In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:25:0: util/intel-pt-decoder/inat.c:24:25: fatal error: inat-tables.c: No such file or directory #include "inat-tables.c" ^ compilation terminated. make[4]: *** [/path/to/kernel/buildperf/util/intel-pt-decoder/intel-pt-insn-decoder.o] Error 1 make[4]: *** Waiting for unfinished jobs.... LINK /path/to/kernel/buildperf/plugin_function.so This is caused by tools/perf/util/intel-pt-decoder/Build that, it tries to generate $(OUTPUT)util/intel-pt-decoder/inat-tables.c atomatically but forget to ensure the existance of $(OUTPUT)util/intel-pt-decoder directory. This patch fixes it by adding $(call rule_mkdir) like other similar rules. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441087005-107540-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31perf tools: Fix build on powerpc broken by pt/btsAdrian Hunter1-0/+3
It is theoretically possible to process perf.data files created on x86 and that contain Intel PT or Intel BTS data, on any other architecture, which is why it is possible for there to be build errors on powerpc caused by pt/bts. The errors were: util/intel-pt-decoder/intel-pt-insn-decoder.c: In function ‘intel_pt_insn_decoder’: util/intel-pt-decoder/intel-pt-insn-decoder.c:138:3: error: switch missing default case [-Werror=switch-default] switch (insn->immediate.nbytes) { ^ cc1: all warnings being treated as errors linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_synth_branch_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:871: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:915: undefined reference to `tsc_to_perf_time' sources/linux-acme.git/tools/perf/util/intel-pt.c:962: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_process_event': sources/linux-acme.git/tools/perf/util/intel-pt.c:1454: undefined reference to `perf_time_to_tsc' Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441046384-28663-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding TRACESTOP packetsAdrian Hunter1-0/+11
A TRACESTOP packet is produced when an Intel PT trace enters a defined region of the address space at which point the tracing stops. This patch just adds decoder support. Support for specifying TRACESTOP regions is left until later. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-25-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding CYC packetsAdrian Hunter1-5/+306
CYC packets provide even finer grain timestamp information than MTC and TSC packets. A CYC packet contains the number of CPU cycles since the last CYC packet. This patch just adds decoder support. The CPU frequency can be related to TSC using the Maximum Non-Turbo Ratio in combination with the CBR (core-to-bus ratio) packet. However more accuracy is achieved by simply interpolating the number of cycles between other timing packets like MTC or TSC. This patch takes the latter approach. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-23-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding MTC packetsAdrian Hunter2-4/+159
MTC packets provide finer grain timestamp information than TSC packets. MTC packets record time using the hardware crystal clock (CTC) which is related to TSC packets using a TMA packet. This patch just adds decoder support. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-21-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Pass Intel PT information for decoding MTC and CYCAdrian Hunter1-0/+3
Record additional information in the AUXTRACE_INFO event in preparation for decoding MTC and CYC packets. Pass the information to the decoder. The AUXTRACE_INFO record can be extended by using the size to indicate the presence of new members. The additional information includes PMU config bit positions and the TSC to CTC (hardware crystal clock) ratio needed to decode MTC packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-20-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add new Intel PT packet definitionsAdrian Hunter3-17/+201
New features have been added to Intel PT which include a number of new packet definitions. This patch adds packet definitions for new packets: TMA, MTC, CYC, VMCS, TRACESTOP and MNT. Also another bit in PIP is defined. This patch only adds support for the definitions. Later patches add support for decoding TMA, MTC, CYC and TRACESTOP which is where those packets are explained. VMCS and the newly defined bit in PIP are used with virtualization which is not supported yet. MNT is a maintenance packet which the decoder should ignore. For details, refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-19-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Fix Intel PT 'instructions' sample periodAdrian Hunter2-0/+4
The period on synthesized 'instructions' samples was being set to a fixed value, whereas the correct value is the number of instructions since the last sample, which is a value that the decoder can provide. So do it that way. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-22perf tools: Fix tarball build broken by pt/btsAdrian Hunter6-6/+35
Fix some include paths and add missing inat_types.h. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/55D77696.60102@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT decoderAdrian Hunter3-1/+1921
Add support for decoding an Intel Processor Trace. Intel PT trace data must be 'decoded' which involves walking the object code and matching the trace data packets. The decoder requests a buffer of binary data via a get_trace() call-back, which it decodes using instruction information which it gets via another call-back walk_insn(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT logAdrian Hunter3-1/+208
Add a facility to log Intel Processor Trace decoding. The log is intended for debugging purposes only. The log file name is "intel_pt.log" and is opened in the current directory. The log contains a record of all packets and instructions decoded and can get very large (10 MB would be a small one). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT instruction decoderAdrian Hunter9-1/+2790
Add support for decoding instructions for Intel Processor Trace. The kernel x86 instruction decoder is copied for this. This essentially provides intel_pt_get_insn() which takes a binary buffer, uses the kernel's x86 instruction decoder to get details of the instruction and then categorizes it for consumption by an Intel PT decoder. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1439450095-30122-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT packet decoderAdrian Hunter3-0/+465
Add support for decoding Intel Processor Trace packets. This essentially provides intel_pt_get_packet() which takes a buffer of binary data and returns the decoded packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>