summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-06-23perf beauty: Update copy of linux/socket.h with the kernel sourcesArnaldo Carvalho de Melo1-2/+0
commit ef83f9efe8461b8fd71eb60b53dbb6a5dd7b39e9 upstream. To pick the changes in: ea6932d70e223e02 ("net: make get_net_ns return error if NET_NS is disabled") That don't result in any changes in the tables generated from that header. This silences this perf build warning: Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h' diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h Cc: Changbin Du <changbin.du@intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23tools headers UAPI: Sync linux/in.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+3
commit 1792a59eab9593de2eae36c40c5a22d70f52c026 upstream. To pick the changes in: 321827477360934d ("icmp: don't send out ICMP messages with a source address of 0.0.0.0") That don't result in any change in tooling, as INADDR_ are not used to generate id->string tables used by 'perf trace'. This addresses this build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h' diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h Cc: David S. Miller <davem@davemloft.net> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23perf metricgroup: Return error code from ↵John Garry1-3/+5
metricgroup__add_metric_sys_event_iter() [ Upstream commit fe7a98b9d9b36e5c8a22d76b67d29721f153f66e ] The error code is not set at all in the sys event iter function. This may lead to an uninitialized value of "ret" in metricgroup__add_metric() when no CPU metric is added. Fix by properly setting the error code. It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as if we have no CPU or sys event metric matching, then "has_match" should be 0 and "ret" is set to -EINVAL. However gcc cannot detect that it may not have been set after the map_for_each_metric() loop for CPU metrics, which is strange. Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs") Signed-off-by: John Garry <john.garry@huawei.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/1623335580-187317-3-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-23perf metricgroup: Fix find_evsel_group() event selectorJohn Garry1-3/+3
[ Upstream commit fc96ec4d5d4155c61cbafd49fb2dd403c899a9f4 ] The following command segfaults on my x86 broadwell: $ ./perf stat -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1 WARNING: grouped events cpus do not match, disabling group: anon group { raw 0x10e } anon group { raw 0x10e } perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed. Aborted (core dumped) The issue shows itself as a use-after-free in evlist__check_cpu_maps(), whereby the leader of an event selector (evsel) has been deleted (yet we still attempt to verify for an evsel). Fundamentally the problem comes from metricgroup__setup_events() -> find_evsel_group(), and has developed from the previous fix attempt in commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time"). The problem now is that the logic in checking if an evsel is in the same group is subtly broken for the "cycles" event. For the "cycles" event, the pmu_name is NULL; however the logic in find_evsel_group() may set an event matched against "cycles" as used, when it should not be. This leads to a condition where an evsel is set, yet its leader is not. Fix the check for evsel pmu_name by not matching evsels when either has a NULL pmu_name. There is still a pre-existing metric issue whereby the ordering of the metrics may break the 'stat' function, as discussed at: https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/ Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a Thinkpad T450S Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/1623335580-187317-2-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-23ipv4: Fix device used for dst_alloc with local routesDavid Ahern1-0/+25
[ Upstream commit b87b04f5019e821c8c6c7761f258402e43500a1f ] Oliver reported a use case where deleting a VRF device can hang waiting for the refcnt to drop to 0. The root cause is that the dst is allocated against the VRF device but cached on the loopback device. The use case (added to the selftests) has an implicit VRF crossing due to the ordering of the FIB rules (lookup local is before the l3mdev rule, but the problem occurs even if the FIB rules are re-ordered with local after l3mdev because the VRF table does not have a default route to terminate the lookup). The end result is is that the FIB lookup returns the loopback device as the nexthop, but the ingress device is in a VRF. The mismatch causes the dst alloc against the VRF device but then cached on the loopback. The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu): pick the dst alloc device based the fib lookup result but with checks that the result has a nexthop device (e.g., not an unreachable or prohibit entry). Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant") Reported-by: Oliver Herms <oliver.peter.herms@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-23selftests: mptcp: enable syncookie only in absence of reordersPaolo Abeni1-3/+8
[ Upstream commit 2395da0e17935ce9158cdfae433962bdb6cbfa67 ] Syncookie validation may fail for OoO packets, causing spurious resets and self-tests failures, so let's force syncookie only for tests iteration with no OoO. Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/198 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-23libbpf: Fixes incorrect rx_ring_setup_doneKev Jackson1-1/+1
[ Upstream commit 11fc79fc9f2e395aa39fa5baccae62767c5d8280 ] When calling xsk_socket__create_shared(), the logic at line 1097 marks a boolean flag true within the xsk_umem structure to track setup progress in order to support multiple calls to the function. However, instead of marking umem->tx_ring_setup_done, the code incorrectly sets umem->rx_ring_setup_done. This leads to improper behaviour when creating and destroying xsk and umem structures. Multiple calls to this function is documented as supported. Fixes: ca7a83e2487a ("libbpf: Only create rx and tx XDP rings when necessary") Signed-off-by: Kev Jackson <foamdino@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/YL4aU4f3Aaik7CN0@linux-dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-16perf session: Correct buffer copying when peeking eventsLeo Yan1-0/+1
[ Upstream commit 197eecb6ecae0b04bd694432f640ff75597fed9c ] When peeking an event, it has a short path and a long path. The short path uses the session pointer "one_mmap_addr" to directly fetch the event; and the long path needs to read out the event header and the following event data from file and fill into the buffer pointer passed through the argument "buf". The issue is in the long path that it copies the event header and event data into the same destination address which pointer "buf", this means the event header is overwritten. We are just lucky to run into the short path in most cases, so we don't hit the issue in the long path. This patch adds the offset "hdr_sz" to the pointer "buf" when copying the event data, so that it can reserve the event header which can be used properly by its caller. Fixes: 5a52f33adf02 ("perf session: Add perf_session__peek_event()") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-16tools/bootconfig: Fix error return code in apply_xbc()Zhen Lei1-0/+1
commit e8ba0b2b64126381643bb50df3556b139a60545a upstream. Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Link: https://lkml.kernel.org/r/20210508034216.2277-1-thunder.leizhen@huawei.com Fixes: a995e6bc0524 ("tools/bootconfig: Fix to check the write failure correctly") Reported-by: Hulk Robot <hulkci@huawei.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16tools/bootconfig: Fix a build error accroding to undefined fallthroughMasami Hiramatsu1-0/+4
commit 824afd55e95c3cb12c55d297a0ae408be1779cc8 upstream. Since the "fallthrough" is defined only in the kernel, building lib/bootconfig.c as a part of user-space tools causes a build error. Add a dummy fallthrough to avoid the build error. Link: https://lkml.kernel.org/r/162087519356.442660.11385099982318160180.stgit@devnote2 Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Fixes: 4c1ca831adb1 ("Revert "lib: Revert use of fallthrough pseudo-keyword in lib/"") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-16bpf, selftests: Adjust few selftest result_unpriv outcomesDaniel Borkmann2-10/+0
[ Upstream commit 1bad6fd52be4ce12d207e2820ceb0f29ab31fc53 ] Given we don't need to simulate the speculative domain for registers with immediates anymore since the verifier uses direct imm-based rewrites instead of having to mask, we can also lift a few cases that were previously rejected. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10wireguard: selftests: make sure rp_filter is disabled on vethcJason A. Donenfeld1-0/+1
commit f8873d11d4121aad35024f9379e431e0c83abead upstream. Some distros may enable strict rp_filter by default, which will prevent vethc from receiving the packets with an unrouteable reverse path address. Reported-by: Hangbin Liu <liuhangbin@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10wireguard: selftests: remove old conntrack kconfig valueJason A. Donenfeld1-1/+0
commit acf2492b51c9a3c4dfb947f4d3477a86d315150f upstream. On recent kernels, this config symbol is no longer used. Reported-by: Rui Salvaterra <rsalvaterra@gmail.com> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10perf probe: Fix NULL pointer dereference in convert_variable_location()Li Huafei2-2/+9
[ Upstream commit 3cb17cce1e76ccc5499915a4d7e095a1ad6bf7ff ] If we just check whether the variable can be converted, 'tvar' should be a null pointer. However, the null pointer check is missing in the 'Constant value' execution path. The following cases can trigger this problem: $ cat test.c #include <stdio.h> void main(void) { int a; const int b = 1; asm volatile("mov %1, %0" : "=r"(a): "i"(b)); printf("a: %d\n", a); } $ gcc test.c -o test -O -g $ sudo ./perf probe -x ./test -L "main" <main@/home/lhf/test.c:0> 0 void main(void) { 2 int a; const int b = 1; asm volatile("mov %1, %0" : "=r"(a): "i"(b)); 6 printf("a: %d\n", a); } $ sudo ./perf probe -x ./test -V "main:6" Segmentation fault The check on 'tvar' is added. If 'tavr' is a null pointer, we return 0 to indicate that the variable can be converted. Now, we can successfully show the variables that can be accessed. $ sudo ./perf probe -x ./test -V "main:6" Available variables at main:6 @<main+13> char* __fmt int a int b However, the variable 'b' cannot be tracked. $ sudo ./perf probe -x ./test -D "main:6 b" Failed to find the location of the 'b' variable at this address. Perhaps it has been optimized out. Use -V with the --range option to show 'b' location range. Error: Failed to add events. This is because __die_find_variable_cb() did not successfully match variable 'b', which has the DW_AT_const_value attribute instead of DW_AT_location. We added support for DW_AT_const_value in __die_find_variable_cb(). With this modification, we can successfully track the variable 'b'. $ sudo ./perf probe -x ./test -D "main:6 b" p:probe_test/main_L6 /home/lhf/test:0x1156 b=\1:s32 Fixes: 66f69b219716 ("perf probe: Support DW_AT_const_value constant value") Signed-off-by: Li Huafei <lihuafei1@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Frank Ch. Eigler <fche@redhat.com> Cc: Jianlin Lv <jianlin.lv@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Zhang Jinhao <zhangjinhao2@huawei.com> http://lore.kernel.org/lkml/20210601092750.169601-1-lihuafei1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03bpftool: Add sock_release help info for cgroup attach/prog load commandLiu Jian5-7/+10
commit a8deba8547e39f26440101164a3bbc2899c5b305 upstream. The help information was not added at the time when the function got added. Fix this and add the missing information to its cli, documentation and bash completion. Fixes: db94cc0b4805 ("bpftool: Add support for BPF_CGROUP_INET_SOCK_RELEASE") Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Quentin Monnet <quentin@isovalent.com> Link: https://lore.kernel.org/bpf/20210525014139.323859-1-liujian56@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03linux/bits.h: fix compilation error with GENMASKRikard Falkeborn2-1/+9
[ Upstream commit f747e6667ebb2ffb8133486c9cd19800d72b0d98 ] GENMASK() has an input check which uses __builtin_choose_expr() to enable a compile time sanity check of its inputs if they are known at compile time. However, it turns out that __builtin_constant_p() does not always return a compile time constant [0]. It was thought this problem was fixed with gcc 4.9 [1], but apparently this is not the case [2]. Switch to use __is_constexpr() instead which always returns a compile time constant, regardless of its inputs. Link: https://lore.kernel.org/lkml/42b4342b-aefc-a16a-0d43-9f9c0d63ba7a@rasmusvillemoes.dk [0] Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449 [1] Link: https://lore.kernel.org/lkml/1ac7bbc2-45d9-26ed-0b33-bf382b8d858b@I-love.SAKURA.ne.jp [2] Link: https://lkml.kernel.org/r/20210511203716.117010-1-rikard.falkeborn@gmail.com Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Yury Norov <yury.norov@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03perf jevents: Fix getting maximum number of fdsFelix Fietkau1-1/+1
commit 75ea44e356b5de8c817f821c9dd68ae329e82add upstream. On some hosts, rlim.rlim_max can be returned as RLIM_INFINITY. By casting it to int, it is interpreted as -1, which will cause get_maxfds to return 0, causing "Invalid argument" errors in nftw() calls. Fix this by casting the second argument of min() to rlim_t instead. Fixes: 80eeb67fe577 ("perf jevents: Program to convert JSON file") Signed-off-by: Felix Fietkau <nbd@nbd.name> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lore.kernel.org/lkml/20210525160758.97829-1-nbd@nbd.name Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf debug: Move debug initialization earlierIan Rogers1-2/+2
commit c59870e2110e1229a6e4b2457aece6ffe8d68d99 upstream. This avoids segfaults during option handlers that use pr_err. For example, "perf --debug nopager list" segfaults before this change. Fixes: 8abceacff87d (perf debug: Add debug_set_file function) Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20210519164447.2672030-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()David Matlack4-10/+16
commit ef4c9f4f654622fa15b7a94a9bd1f19e76bb7feb upstream. vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int, which causes the upper 32-bits of the max_gfn to get truncated. Nobody noticed until now likely because vm_get_max_gfn() is only used as a mechanism to create a memslot in an unused region of the guest physical address space (the top), and the top of the 32-bit physical address space was always good enough. This fix reveals a bug in memslot_modification_stress_test which was trying to create a dummy memslot past the end of guest physical memory. Fix that by moving the dummy memslot lower. Fixes: 52200d0d944e ("KVM: selftests: Remove duplicate guest mode handling") Reviewed-by: Venkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210521173828.1180619-1-dmatlack@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03KVM: X86: Use _BITUL() macro in UAPI headersJoe Richey1-2/+3
commit fb1070d18edb37daf3979662975bc54625a19953 upstream. Replace BIT() in KVM's UPAI header with _BITUL(). BIT() is not defined in the UAPI headers and its usage may cause userspace build errors. Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking") Signed-off-by: Joe Richey <joerichey@google.com> Message-Id: <20210521085849.37676-3-joerichey94@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03net/sched: fq_pie: re-factor fix for fq_pie endless loopDavide Caratti1-4/+4
commit 3a62fed2fd7b6fea96d720e779cafc30dfb3a22e upstream. the patch that fixed an endless loop in_fq_pie_init() was not considering that 65535 is a valid class id. The correct bugfix for this infinite loop is to change 'idx' to become an u32, like Colin proposed in the past [1]. Fix this as follows: - restore 65536 as maximum possible values of 'flows_cnt' - use u32 'idx' when iterating on 'q->flows' - fix the TDC selftest This reverts commit bb2f930d6dd708469a587dc9ed1efe1ef969c0bf. [1] https://lore.kernel.org/netdev/20210407163808.499027-1-colin.king@canonical.com/ CC: Colin Ian King <colin.king@canonical.com> CC: stable@vger.kernel.org Fixes: bb2f930d6dd7 ("net/sched: fix infinite loop in sch_fq_pie") Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf scripts python: exported-sql-viewer.py: Fix warning displayAdrian Hunter1-0/+5
commit f56299a9c998e0bfbd4ab07cafe9eb8444512448 upstream. Deprecation warnings are useful only for the developer, not an end user. Display warnings only when requested using the python -W option. This stops the display of warnings like: tools/perf/scripts/python/exported-sql-viewer.py:5102: DeprecationWarning: an integer is required (got type PySide2.QtCore.Qt.AlignmentFlag). Implicit conversion to integers using __int__ is deprecated, and may be removed in a future version of Python. err = app.exec_() Since the warning can be fixed only in PySide2, we must wait for it to be finally fixed there. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org # v5.3+ Link: http://lore.kernel.org/lkml/20210521092053.25683-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf scripts python: exported-sql-viewer.py: Fix Array TypeErrorAdrian Hunter1-2/+3
commit fd931b2e234a7cc451a7bbb1965d6ce623189158 upstream. The 'Array' class is present in more than one python standard library. In some versions of Python 3, the following error occurs: Traceback (most recent call last): File "tools/perf/scripts/python/exported-sql-viewer.py", line 4702, in <lambda> reports_menu.addAction(CreateAction(label, "Create a new window displaying branch events", lambda a=None,x=dbid: self.NewBranchView(x), self)) File "tools/perf/scripts/python/exported-sql-viewer.py", line 4727, in NewBranchView BranchWindow(self.glb, event_id, ReportVars(), self) File "tools/perf/scripts/python/exported-sql-viewer.py", line 3208, in __init__ self.model = LookupCreateModel(model_name, lambda: BranchModel(glb, event_id, report_vars.where_clause)) File "tools/perf/scripts/python/exported-sql-viewer.py", line 343, in LookupCreateModel model = create_fn() File "tools/perf/scripts/python/exported-sql-viewer.py", line 3208, in <lambda> self.model = LookupCreateModel(model_name, lambda: BranchModel(glb, event_id, report_vars.where_clause)) File "tools/perf/scripts/python/exported-sql-viewer.py", line 3124, in __init__ self.fetcher = SQLFetcher(glb, sql, prep, self.AddSample) File "tools/perf/scripts/python/exported-sql-viewer.py", line 2658, in __init__ self.buffer = Array(c_char, self.buffer_size, lock=False) TypeError: abstract class This apparently happens because Python can be inconsistent about which class of the name 'Array' gets imported. Fix by importing explicitly by name so that only the desired 'Array' gets imported. Fixes: 8392b74b575c3 ("perf scripts python: exported-sql-viewer.py: Add ability to display all the database tables") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210521092053.25683-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf scripts python: exported-sql-viewer.py: Fix copy to clipboard from Top ↵Adrian Hunter1-1/+1
Calls by elapsed Time report commit a6172059758ba1b496ae024cece7d5bdc8d017db upstream. Provide missing argument to prevent following error when copying a selection to the clipboard: Traceback (most recent call last): File "tools/perf/scripts/python/exported-sql-viewer.py", line 4041, in <lambda> menu.addAction(CreateAction("&Copy selection", "Copy to clipboard", lambda: CopyCellsToClipboardHdr(self.view), self.view)) File "tools/perf/scripts/python/exported-sql-viewer.py", line 4021, in CopyCellsToClipboardHdr CopyCellsToClipboard(view, False, True) File "tools/perf/scripts/python/exported-sql-viewer.py", line 4018, in CopyCellsToClipboard view.CopyCellsToClipboard(view, as_csv, with_hdr) File "tools/perf/scripts/python/exported-sql-viewer.py", line 3871, in CopyTableCellsToClipboard val = model.headerData(col, Qt.Horizontal) TypeError: headerData() missing 1 required positional argument: 'role' Fixes: 96c43b9a7ab3b ("perf scripts python: exported-sql-viewer.py: Add copy to clipboard") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210521092053.25683-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf intel-pt: Fix transaction abort handlingAdrian Hunter1-1/+5
commit cb7987837c31b217b28089bbc78922d5c9187869 upstream. When adding support for power events, some handling of FUP packets was unified. That resulted in breaking reporting of TSX aborts, by not considering the associated TIP packet. Fix that. Example: A machine that supports TSX is required. It will have flag "rtm". Kernel parameter tsx=on may be required. # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done rtm Test program: #include <stdio.h> #include <immintrin.h> int main() { int x = 0; if (_xbegin() == _XBEGIN_STARTED) { x = 1; _xabort(1); } else { printf("x = %d\n", x); } return 0; } Compile with -mrtm i.e. gcc -Wall -Wextra -mrtm xabort.c -o xabort Record: perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort Before: # perf script --itrace=be -F+flags,+addr,-period,-event --ns xabort 1478 [007] 92161.431348552: tr strt 0 [unknown] ([unknown]) => 400b6d main+0x0 (/root/xabort) xabort 1478 [007] 92161.431348624: jmp 400b96 main+0x29 (/root/xabort) => 400bae main+0x41 (/root/xabort) xabort 1478 [007] 92161.431348624: return 400bb4 main+0x47 (/root/xabort) => 400b87 main+0x1a (/root/xabort) xabort 1478 [007] 92161.431348637: jcc 400b8a main+0x1d (/root/xabort) => 400b98 main+0x2b (/root/xabort) xabort 1478 [007] 92161.431348644: tr end call 400ba9 main+0x3c (/root/xabort) => 40f690 printf+0x0 (/root/xabort) xabort 1478 [007] 92161.431360859: tr strt 0 [unknown] ([unknown]) => 400bae main+0x41 (/root/xabort) xabort 1478 [007] 92161.431360882: tr end return 400bb4 main+0x47 (/root/xabort) => 401139 __libc_start_main+0x309 (/root/xabort) After: # perf script --itrace=be -F+flags,+addr,-period,-event --ns xabort 1478 [007] 92161.431348552: tr strt 0 [unknown] ([unknown]) => 400b6d main+0x0 (/root/xabort) xabort 1478 [007] 92161.431348624: tx abrt 400b93 main+0x26 (/root/xabort) => 400b87 main+0x1a (/root/xabort) xabort 1478 [007] 92161.431348637: jcc 400b8a main+0x1d (/root/xabort) => 400b98 main+0x2b (/root/xabort) xabort 1478 [007] 92161.431348644: tr end call 400ba9 main+0x3c (/root/xabort) => 40f690 printf+0x0 (/root/xabort) xabort 1478 [007] 92161.431360859: tr strt 0 [unknown] ([unknown]) => 400bae main+0x41 (/root/xabort) xabort 1478 [007] 92161.431360882: tr end return 400bb4 main+0x47 (/root/xabort) => 401139 __libc_start_main+0x309 (/root/xabort) Fixes: a472e65fc490a ("perf intel-pt: Add decoder support for ptwrite and power event packets") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210519074515.9262-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03perf intel-pt: Fix sample instruction bytesAdrian Hunter1-1/+4
commit c954eb72b31a9dc56c99b450253ec5b121add320 upstream. The decoder reports the current instruction if it was decoded. In some cases the current instruction is not decoded, in which case the instruction bytes length must be set to zero. Ensure that is always done. Note perf script can anyway get the instruction bytes for any samples where they are not present. Also note, that there is a redundant "ptq->insn_len = 0" statement which is not removed until a subsequent patch in order to make this patch apply cleanly to stable branches. Example: A machne that supports TSX is required. It will have flag "rtm". Kernel parameter tsx=on may be required. # for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done rtm Test program: #include <stdio.h> #include <immintrin.h> int main() { int x = 0; if (_xbegin() == _XBEGIN_STARTED) { x = 1; _xabort(1); } else { printf("x = %d\n", x); } return 0; } Compile with -mrtm i.e. gcc -Wall -Wextra -mrtm xabort.c -o xabort Record: perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort Before: # perf script --itrace=xe -F+flags,+insn,-period --xed --ns xabort 1478 [007] 92161.431348581: transactions: x 400b81 main+0x14 (/root/xabort) mov $0xffffffff, %eax xabort 1478 [007] 92161.431348624: transactions: tx abrt 400b93 main+0x26 (/root/xabort) mov $0xffffffff, %eax After: # perf script --itrace=xe -F+flags,+insn,-period --xed --ns xabort 1478 [007] 92161.431348581: transactions: x 400b81 main+0x14 (/root/xabort) xbegin 0x6 xabort 1478 [007] 92161.431348624: transactions: tx abrt 400b93 main+0x26 (/root/xabort) xabort $0x1 Fixes: faaa87680b25d ("perf intel-pt/bts: Report instruction bytes and length in sample") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: stable@vger.kernel.org Link: http://lore.kernel.org/lkml/20210519074515.9262-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference ↵Nicholas Piggin1-9/+18
between sc and scv syscalls commit 5665bc35c1ed917ac8fd06cb651317bb47a65b10 upstream. The sc and scv 0 system calls have different ABI conventions, and ptracers need to know which system call type is being used if they want to look at the syscall registers. Document that pt_regs.trap can be used for this, and fix one in-tree user to work with scv 0 syscalls. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: stable@vger.kernel.org # v5.9+ Reported-by: "Dmitry V. Levin" <ldv@altlinux.org> Suggested-by: "Dmitry V. Levin" <ldv@altlinux.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210520111931.2597127-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26tools/testing/selftests/exec: fix link errorYang Yingliang1-3/+3
[ Upstream commit 4d1cd3b2c5c1c32826454de3a18c6183238d47ed ] Fix the link error by adding '-static': gcc -Wall -Wl,-z,max-page-size=0x1000 -pie load_address.c -o /home/yang/linux/tools/testing/selftests/exec/load_address_4096 /usr/bin/ld: /tmp/ccopEGun.o: relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `stderr@@GLIBC_2.17' which may bind externally can not be used when making a shared object; recompile with -fPIC /usr/bin/ld: /tmp/ccopEGun.o(.text+0x158): unresolvable R_AARCH64_ADR_PREL_PG_HI21 relocation against symbol `stderr@@GLIBC_2.17' /usr/bin/ld: final link failed: bad value collect2: error: ld returned 1 exit status make: *** [Makefile:25: tools/testing/selftests/exec/load_address_4096] Error 1 Link: https://lkml.kernel.org/r/20210514092422.2367367-1-yangyingliang@huawei.com Fixes: 206e22f01941 ("tools/testing/selftests: add self-test for verifying load alignment") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Cc: Chris Kennelly <ckennelly@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19perf tools: Fix dynamic libbpf linkJiri Olsa2-0/+8
[ Upstream commit ad1237c30d975535a669746496cbed136aa5a045 ] Justin reported broken build with LIBBPF_DYNAMIC=1. When linking libbpf dynamically we need to use perf's hashmap object, because it's not exported in libbpf.so (only in libbpf.a). Following build is now passing: $ make LIBBPF_DYNAMIC=1 BUILD: Doing 'make -j8' parallel build ... $ ldd perf | grep libbpf libbpf.so.0 => /lib64/libbpf.so.0 (0x00007fa7630db000) Fixes: eee19501926d ("perf tools: Grab a copy of libbpf's hashmap") Reported-by: Justin M. Forbes <jforbes@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210508205020.617984-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19libbpf: Fix signed overflow in ringbuf_process_ringBrendan Jackman1-9/+21
[ Upstream commit 2a30f9440640c418bcfbea9b2b344d268b58e0a2 ] One of our benchmarks running in (Google-internal) CI pushes data through the ringbuf faster htan than userspace is able to consume it. In this case it seems we're actually able to get >INT_MAX entries in a single ring_buffer__consume() call. ASAN detected that cnt overflows in this case. Fix by using 64-bit counter internally and then capping the result to INT_MAX before converting to the int return type. Do the same for the ring_buffer__poll(). Fixes: bf99c936f947 (libbpf: Add BPF ring buffer support) Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19selftests: mlxsw: Fix mausezahn invocation in ERSPAN scale testPetr Machata2-3/+19
[ Upstream commit 1233898ab758cbcf5f6fea10b8dd16a0b2c24fab ] The mirror_gre_scale test creates as many ERSPAN sessions as the underlying chip supports, and tests that they all work. In order to determine that it issues a stream of ICMP packets and checks if they are mirrored as expected. However, the mausezahn invocation missed the -6 flag to identify the use of IPv6 protocol, and was sending ICMP messages over IPv6, as opposed to ICMP6. It also didn't pass an explicit source IP address, which apparently worked at some point in the past, but does not anymore. To fix these issues, extend the function mirror_test() in mirror_lib by detecting the IPv6 protocol addresses, and using a different ICMP scheme. Fix __mirror_gre_test() in the selftest itself to pass a source IP address. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19selftests: mlxsw: Increase the tolerance of backlog buildupPetr Machata1-2/+2
[ Upstream commit dda7f4fa55839baeb72ae040aeaf9ccf89d3e416 ] The intention behind this test is to make sure that qdisc limit is correctly projected to the HW. However, first, due to rounding in the qdisc, and then in the driver, the number cannot actually be accurate. And second, the approach to testing this is to oversubscribe the port with traffic generated on the same switch. The actual backlog size therefore fluctuates. In practice, this test proved to be noisier than the rest, and spuriously fails every now and then. Increase the tolerance to 10 % to avoid these issues. Signed-off-by: Petr Machata <petrm@nvidia.com> Acked-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19selftests: Set CC to clang in lib.mk if LLVM is setYonghong Song1-0/+4
[ Upstream commit 26e6dd1072763cd5696b75994c03982dde952ad9 ] selftests/bpf/Makefile includes lib.mk. With the following command make -j60 LLVM=1 LLVM_IAS=1 <=== compile kernel make -j60 -C tools/testing/selftests/bpf LLVM=1 LLVM_IAS=1 V=1 some files are still compiled with gcc. This patch fixed lib.mk issue which sets CC to gcc in all cases. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210413153413.3027426-1-yhs@fb.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19selftests: mptcp: launch mptcp_connect with timeoutMatthieu Baerts4-33/+72
[ Upstream commit 5888a61cb4e00695075bbacfd86f3fa73af00413 ] 'mptcp_connect' already has a timeout for poll() but in some cases, it is not enough. With "timeout" tool, we will force the command to fail if it doesn't finish on time. Thanks to that, the script will continue and display details about the current state before marking the test as failed. Displaying this state is very important to be able to understand the issue. Best to have our CI reporting the issue than just "the test hanged". Note that in mptcp_connect.sh, we were using a long timeout to validate the fact we cannot create a socket if a sysctl is set. We don't need this timeout. In diag.sh, we want to send signals to mptcp_connect instances that have been started in the netns. But we cannot send this signal to 'timeout' otherwise that will stop the timeout and messages telling us SIGUSR1 has been received will be printed. Instead of trying to find the right PID and storing them in an array, we can simply use the output of 'ip netns pids' which is all the PIDs we want to send signal to. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/160 Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19selftests/powerpc: Fix L1D flushing tests for Power10Russell Currey3-2/+6
[ Upstream commit 3a72c94ebfb1f171eba0715998010678a09ec796 ] The rfi_flush and entry_flush selftests work by using the PM_LD_MISS_L1 perf event to count L1D misses. The value of this event has changed over time: - Power7 uses 0x400f0 - Power8 and Power9 use both 0x400f0 and 0x3e054 - Power10 uses only 0x3e054 Rather than relying on raw values, configure perf to count L1D read misses in the most explicit way available. This fixes the selftests to work on systems without 0x400f0 as PM_LD_MISS_L1, and should change no behaviour for systems that the tests already worked on. The only potential downside is that referring to a specific perf event requires PMU support implemented in the kernel for that platform. Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210223070227.2916871-1-ruscur@russell.cc Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf session: Add swap operation for event TIME_CONVLeo Yan1-1/+14
[ Upstream commit 050ffc449008eeeafc187dec337d9cf1518f89bc ] Since commit d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV"), the event PERF_RECORD_TIME_CONV has extended the data structure for clock parameters. To be backwards-compatible, this patch adds a dedicated swap operation for the event PERF_RECORD_TIME_CONV, based on checking if the event contains field "time_cycles", it can support both for the old and new event formats. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-4-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf jit: Let convert_timestamp() to be backwards-compatibleLeo Yan2-10/+22
[ Upstream commit aa616f5a8a2d22a179d5502ebd85045af66fa656 ] Commit d110162cafc80dad ("perf tsc: Support cap_user_time_short for event TIME_CONV") supports the extended parameters for event TIME_CONV, but it broke the backwards compatibility, so any perf data file with old event format fails to convert timestamp. This patch introduces a helper event_contains() to check if an event contains a specific member or not. For the backwards-compatibility, if the event size confirms the extended parameters are supported in the event TIME_CONV, then copies these parameters. Committer notes: To make this compiler backwards compatible add this patch: - struct perf_tsc_conversion tc = { 0 }; + struct perf_tsc_conversion tc = { .time_shift = 0, }; Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf tools: Change fields type in perf_record_time_convLeo Yan1-2/+3
[ Upstream commit e1d380ea8b00db4bb14d1f513000d4b62aa9d3f0 ] C standard claims "An object declared as type _Bool is large enough to store the values 0 and 1", bool type size can be 1 byte or larger than 1 byte. Thus it's uncertian for bool type size with different compilers. This patch changes the bool type in structure perf_record_time_conv to __u8 type, and pads extra bytes for 8-byte alignment; this can give reliable structure size. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve MacLean <Steve.MacLean@Microsoft.com> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20210428120915.7123-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14bpf: Fix propagation of 32 bit unsigned bounds from 64 bit boundsDaniel Borkmann1-1/+1
[ Upstream commit 10bf4e83167cc68595b85fd73bb91e8f2c086e36 ] Similarly as b02709587ea3 ("bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds."), we also need to fix the propagation of 32 bit unsigned bounds from 64 bit counterparts. That is, really only set the u32_{min,max}_value when /both/ {umin,umax}_value safely fit in 32 bit space. For example, the register with a umin_value == 1 does /not/ imply that u32_min_value is also equal to 1, since umax_value could be much larger than 32 bit subregister can hold, and thus u32_min_value is in the interval [0,1] instead. Before fix, invalid tracking result of R2_w=inv1: [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smin_value=-9223372036854775807,smax_value=9223372032559808513,umin_value=1,umax_value=18446744069414584321,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv1 R10=fp0 [...] After fix, correct tracking result of R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)): [...] 5: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0) R10=fp0 5: (35) if r2 >= 0x1 goto pc+1 [...] // goto path 7: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,umin_value=1) R10=fp0 7: (b6) if w2 <= 0x1 goto pc+1 [...] // goto path 9: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2=inv(id=0,smax_value=9223372032559808513,umax_value=18446744069414584321,var_off=(0x0; 0xffffffff00000001),s32_min_value=0,s32_max_value=1,u32_max_value=1) R10=fp0 9: (bc) w2 = w2 10: R0=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,umax_value=1,var_off=(0x0; 0x1)) R10=fp0 [...] Thus, same issue as in b02709587ea3 holds for unsigned subregister tracking. Also, align __reg64_bound_u32() similarly to __reg64_bound_s32() as done in b02709587ea3 to make them uniform again. Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: Manfred Paul (@_manfp) Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests/bpf: Fix core_reloc test runnerAndrii Nakryiko1-8/+12
[ Upstream commit bede0ebf0be87e9678103486a77f39e0334c6791 ] Fix failed tests checks in core_reloc test runner, which allowed failing tests to pass quietly. Also add extra check to make sure that expected to fail test cases with invalid names are caught as test failure anyway, as this is not an expected failure mode. Also fix mislabeled probed vs direct bitfield test cases. Fixes: 124a892d1c41 ("selftests/bpf: Test TYPE_EXISTS and TYPE_SIZE CO-RE relocations") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-6-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests/bpf: Fix field existence CO-RE reloc testsAndrii Nakryiko9-48/+24
[ Upstream commit 5a30eb23922b52f33222c6729b6b3ff1c37a6c66 ] Negative field existence cases for have a broken assumption that FIELD_EXISTS CO-RE relo will fail for fields that match the name but have incompatible type signature. That's not how CO-RE relocations generally behave. Types and fields that match by name but not by expected type are treated as non-matching candidates and are skipped. Error later is reported if no matching candidate was found. That's what happens for most relocations, but existence relocations (FIELD_EXISTS and TYPE_EXISTS) are more permissive and they are designed to return 0 or 1, depending if a match is found. This allows to handle name-conflicting but incompatible types in BPF code easily. Combined with ___flavor suffixes, it's possible to handle pretty much any structural type changes in kernel within the compiled once BPF source code. So, long story short, negative field existence test cases are invalid in their assumptions, so this patch reworks them into a single consolidated positive case that doesn't match any of the fields. Fixes: c7566a69695c ("selftests/bpf: Add field existence CO-RE relocs tests") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-5-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests/bpf: Fix BPF_CORE_READ_BITFIELD() macroAndrii Nakryiko1-4/+12
[ Upstream commit 0f20615d64ee2ad5e2a133a812382d0c4071589b ] Fix BPF_CORE_READ_BITFIELD() macro used for reading CO-RE-relocatable bitfields. Missing breaks in a switch caused 8-byte reads always. This can confuse libbpf because it does strict checks that memory load size corresponds to the original size of the field, which in this case quite often would be wrong. After fixing that, we run into another problem, which quite subtle, so worth documenting here. The issue is in Clang optimization and CO-RE relocation interactions. Without that asm volatile construct (also known as barrier_var()), Clang will re-order BYTE_OFFSET and BYTE_SIZE relocations and will apply BYTE_OFFSET 4 times for each switch case arm. This will result in the same error from libbpf about mismatch of memory load size and original field size. I.e., if we were reading u32, we'd still have *(u8 *), *(u16 *), *(u32 *), and *(u64 *) memory loads, three of which will fail. Using barrier_var() forces Clang to apply BYTE_OFFSET relocation first (and once) to calculate p, after which value of p is used without relocation in each of switch case arms, doing appropiately-sized memory load. Here's the list of relevant relocations and pieces of generated BPF code before and after this patch for test_core_reloc_bitfields_direct selftests. BEFORE ===== #45: core_reloc: insn #160 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 #46: core_reloc: insn #167 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #47: core_reloc: insn #174 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #48: core_reloc: insn #178 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #49: core_reloc: insn #182 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 157: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 159: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 160: b7 02 00 00 04 00 00 00 r2 = 4 ; BYTE_SIZE relocation here ^^^ 161: 66 02 07 00 03 00 00 00 if w2 s> 3 goto +7 <LBB0_63> 162: 16 02 0d 00 01 00 00 00 if w2 == 1 goto +13 <LBB0_65> 163: 16 02 01 00 02 00 00 00 if w2 == 2 goto +1 <LBB0_66> 164: 05 00 12 00 00 00 00 00 goto +18 <LBB0_69> 0000000000000528 <LBB0_66>: 165: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 167: 69 11 08 00 00 00 00 00 r1 = *(u16 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 168: 05 00 0e 00 00 00 00 00 goto +14 <LBB0_69> 0000000000000548 <LBB0_63>: 169: 16 02 0a 00 04 00 00 00 if w2 == 4 goto +10 <LBB0_67> 170: 16 02 01 00 08 00 00 00 if w2 == 8 goto +1 <LBB0_68> 171: 05 00 0b 00 00 00 00 00 goto +11 <LBB0_69> 0000000000000560 <LBB0_68>: 172: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 174: 79 11 08 00 00 00 00 00 r1 = *(u64 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 175: 05 00 07 00 00 00 00 00 goto +7 <LBB0_69> 0000000000000580 <LBB0_65>: 176: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 178: 71 11 08 00 00 00 00 00 r1 = *(u8 *)(r1 + 8) ; BYTE_OFFSET relo here w/ WRONG size ^^^^^^^^^^^^^^^^ 179: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 00000000000005a0 <LBB0_67>: 180: 18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll 182: 61 11 08 00 00 00 00 00 r1 = *(u32 *)(r1 + 8) ; BYTE_OFFSET relo here w/ RIGHT size ^^^^^^^^^^^^^^^^ 00000000000005b8 <LBB0_69>: 183: 67 01 00 00 20 00 00 00 r1 <<= 32 184: b7 02 00 00 00 00 00 00 r2 = 0 185: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 186: c7 01 00 00 20 00 00 00 r1 s>>= 32 187: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000005e0 <LBB0_71>: 188: 77 01 00 00 20 00 00 00 r1 >>= 32 AFTER ===== #30: core_reloc: insn #132 --> [5] + 0:5: byte_off --> struct core_reloc_bitfields.u32 #31: core_reloc: insn #134 --> [5] + 0:5: byte_sz --> struct core_reloc_bitfields.u32 129: 18 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r2 = 0 ll 131: 7b 12 20 01 00 00 00 00 *(u64 *)(r2 + 288) = r1 132: b7 01 00 00 08 00 00 00 r1 = 8 ; BYTE_OFFSET relo here ^^^ ; no size check for non-memory dereferencing instructions 133: 0f 12 00 00 00 00 00 00 r2 += r1 134: b7 03 00 00 04 00 00 00 r3 = 4 ; BYTE_SIZE relocation here ^^^ 135: 66 03 05 00 03 00 00 00 if w3 s> 3 goto +5 <LBB0_63> 136: 16 03 09 00 01 00 00 00 if w3 == 1 goto +9 <LBB0_65> 137: 16 03 01 00 02 00 00 00 if w3 == 2 goto +1 <LBB0_66> 138: 05 00 0a 00 00 00 00 00 goto +10 <LBB0_69> 0000000000000458 <LBB0_66>: 139: 69 21 00 00 00 00 00 00 r1 = *(u16 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 140: 05 00 08 00 00 00 00 00 goto +8 <LBB0_69> 0000000000000468 <LBB0_63>: 141: 16 03 06 00 04 00 00 00 if w3 == 4 goto +6 <LBB0_67> 142: 16 03 01 00 08 00 00 00 if w3 == 8 goto +1 <LBB0_68> 143: 05 00 05 00 00 00 00 00 goto +5 <LBB0_69> 0000000000000480 <LBB0_68>: 144: 79 21 00 00 00 00 00 00 r1 = *(u64 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 145: 05 00 03 00 00 00 00 00 goto +3 <LBB0_69> 0000000000000490 <LBB0_65>: 146: 71 21 00 00 00 00 00 00 r1 = *(u8 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 147: 05 00 01 00 00 00 00 00 goto +1 <LBB0_69> 00000000000004a0 <LBB0_67>: 148: 61 21 00 00 00 00 00 00 r1 = *(u32 *)(r2 + 0) ; NO CO-RE relocation here ^^^^^^^^^^^^^^^^ 00000000000004a8 <LBB0_69>: 149: 67 01 00 00 20 00 00 00 r1 <<= 32 150: b7 02 00 00 00 00 00 00 r2 = 0 151: 16 02 02 00 00 00 00 00 if w2 == 0 goto +2 <LBB0_71> 152: c7 01 00 00 20 00 00 00 r1 s>>= 32 153: 05 00 01 00 00 00 00 00 goto +1 <LBB0_72> 00000000000004d0 <LBB0_71>: 154: 77 01 00 00 20 00 00 00 r1 >>= 323 Fixes: ee26dade0e3b ("libbpf: Add support for relocatable bitfields") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Lorenz Bauer <lmb@cloudflare.com> Link: https://lore.kernel.org/bpf/20210426192949.416837-4-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests: mlxsw: Remove a redundant if statement in tc_flower_scale testDanielle Ratson1-5/+1
[ Upstream commit 1f1c92139e36223b89d8140f2b72f75e79baf8bd ] Currently, the error return code of the failure condition is lost after using an if statement, so the test doesn't fail when it should. Remove the if statement that separates the condition and the error code check, so the test won't always pass. Fixes: abfce9e062021 ("selftests: mlxsw: Reduce running time using offload indication") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests: mlxsw: Remove a redundant if statement in port_scale testDanielle Ratson1-5/+1
[ Upstream commit b6fc2f212108b3676f54d00a2c38e3bc36753980 ] Currently, the error return code of the failure condition is lost after using an if statement, so the test doesn't fail when it should. Remove the if statement that separates the condition and the error code check, so the test won't always pass. Fixes: 5154b1b826d9b ("selftests: mlxsw: Add a scale test for physical ports") Reported-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests: net: mirror_gre_vlan_bridge_1q: Make an FDB entry staticPetr Machata1-1/+1
[ Upstream commit c8d0260cdd96fdccdef0509c4160e28a1012a5d7 ] The FDB roaming test installs a destination MAC address on the wrong interface of an FDB database and tests whether the mirroring fails, because packets are sent to the wrong port. The test by mistake installs the FDB entry as local. This worked previously, because drivers were notified of local FDB entries in the same way as of static entries. However that has been fixed in the commit 6ab4c3117aec ("net: bridge: don't notify switchdev for local FDB addresses"), and local entries are not notified anymore. As a result, the HW is not reconfigured for the FDB roam, and mirroring keeps working, failing the test. To fix the issue, mark the FDB entry as static. Fixes: 9c7c8a82442c ("selftests: forwarding: mirror_gre_vlan_bridge_1q: Add more tests") Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14libbpf: Initialize the bpf_seq_printf parameters array field by fieldFlorent Revest1-11/+29
[ Upstream commit 83cd92b46484aa8f64cdc0bff8ac6940d1f78519 ] When initializing the __param array with a one liner, if all args are const, the initial array value will be placed in the rodata section but because libbpf does not support relocation in the rodata section, any pointer in this array will stay NULL. Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support") Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210419155243.1632274-5-revest@chromium.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf beauty: Fix fsconfig generatorVitaly Chikunov1-4/+3
[ Upstream commit 2e1daee14e67fbf9b27280b974e2c680a22cabea ] After gnulib update sed stopped matching `[[:space:]]*+' as before, causing the following compilation error: In file included from builtin-trace.c:719: trace/beauty/generated/fsconfig_arrays.c:2:3: error: expected expression before ']' token 2 | [] = "", | ^ trace/beauty/generated/fsconfig_arrays.c:2:3: error: array index in initializer not of integer type trace/beauty/generated/fsconfig_arrays.c:2:3: note: (near initialization for 'fsconfig_cmds') Fix this by correcting the regular expression used in the generator. Also, clean up the script by removing redundant egrep, xargs, and printf invocations. Committer testing: Continues to work: $ cat tools/perf/trace/beauty/fsconfig.sh #!/bin/sh # SPDX-License-Identifier: LGPL-2.1 if [ $# -ne 1 ] ; then linux_header_dir=tools/include/uapi/linux else linux_header_dir=$1 fi linux_mount=${linux_header_dir}/mount.h printf "static const char *fsconfig_cmds[] = {\n" ms='[[:space:]]*' sed -nr "s/^${ms}FSCONFIG_([[:alnum:]_]+)${ms}=${ms}([[:digit:]]+)${ms},.*/\t[\2] = \"\1\",/p" \ ${linux_mount} printf "};\n" $ tools/perf/trace/beauty/fsconfig.sh static const char *fsconfig_cmds[] = { [0] = "SET_FLAG", [1] = "SET_STRING", [2] = "SET_BINARY", [3] = "SET_PATH", [4] = "SET_PATH_EMPTY", [5] = "SET_FD", [6] = "CMD_CREATE", [7] = "CMD_RECONFIGURE", }; $ Fixes: d35293004a5e4 ("perf beauty: Add generator for fsconfig's 'cmd' arg values") Signed-off-by: Vitaly Chikunov <vt@altlinux.org> Co-authored-by: Dmitry V. Levin <ldv@altlinux.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: http://lore.kernel.org/lkml/20210414182723.1670663-1-vt@altlinux.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14perf vendor events amd: Fix broken L2 Cache Hits from L2 HWPF metricSmita Koralahalli4-8/+8
[ Upstream commit 86c2bc3da769124e3e856b6e9457be3667c30919 ] Commit 08ed77e414ab2342 ("perf vendor events amd: Add recommended events") added the hits event "L2 Cache Hits from L2 HWPF" with the same metric expression as the accesses event "L2 Cache Accesses from L2 HWPF": $ perf list --details ... l2_cache_accesses_from_l2_hwpf [L2 Cache Accesses from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] l2_cache_hits_from_l2_hwpf [L2 Cache Hits from L2 HWPF] [l2_pf_hit_l2 + l2_pf_miss_l2_hit_l3 + l2_pf_miss_l2_l3] ... This was wrong and led to counting hits the same as accesses. Section 2.1.15.2 "Performance Measurement" of "PPR for AMD Family 17h Model 31h B0 - 55803 Rev 0.54 - Sep 12, 2019", documents the hits event with EventCode 0x70 which is the same as l2_pf_hit_l2. Fix this, and massage the description for l2_pf_hit_l2 as the hits event is now the duplicate of l2_pf_hit_l2. AMD recommends using the recommended event over other events if the duplicate exists and maintain both for consistency. Hence, l2_cache_hits_from_l2_hwpf should override l2_pf_hit_l2. Before: # perf stat -M l2_cache_accesses_from_l2_hwpf,l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,436 l2_pf_miss_l2_l3 # 11114.00 l2_cache_accesses_from_l2_hwpf # 11114.00 l2_cache_hits_from_l2_hwpf 4,482 l2_pf_hit_l2 5,196 l2_pf_miss_l2_hit_l3 1.001765339 seconds time elapsed After: # perf stat -M l2_cache_accesses_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 1,477 l2_pf_miss_l2_l3 # 10442.00 l2_cache_accesses_from_l2_hwpf 3,978 l2_pf_hit_l2 4,987 l2_pf_miss_l2_hit_l3 1.001491186 seconds time elapsed # perf stat -e l2_cache_hits_from_l2_hwpf sleep 1 Performance counter stats for 'sleep 1': 3,983 l2_cache_hits_from_l2_hwpf 1.001329970 seconds time elapsed Note the difference in performance counter values for the accesses versus the hits after the fix, and the hits event now counting the same as l2_pf_hit_l2. Fixes: 08ed77e414ab ("perf vendor events amd: Add recommended events") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206537 Reviewed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> # On a 3900X Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Liška <mliska@suse.cz> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vijay Thakkar <vijaythakkar@me.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20210406215944.113332-2-Smita.KoralahalliChannabasappa@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14libbpf: Add explicit padding to btf_dump_emit_type_decl_optsKP Singh1-0/+1
[ Upstream commit ea24b19562fe5f72c78319dbb347b701818956d9 ] Similar to https://lore.kernel.org/bpf/20210313210920.1959628-2-andrii@kernel.org/ When DECLARE_LIBBPF_OPTS is used with inline field initialization, e.g: DECLARE_LIBBPF_OPTS(btf_dump_emit_type_decl_opts, opts, .field_name = var_ident, .indent_level = 2, .strip_mods = strip_mods, ); and compiled in debug mode, the compiler generates code which leaves the padding uninitialized and triggers errors within libbpf APIs which require strict zero initialization of OPTS structs. Adding anonymous padding field fixes the issue. Fixes: 9f81654eebe8 ("libbpf: Expose BTF-to-C type declaration emitting API") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210319192117.2310658-1-kpsingh@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14selftests/bpf: Re-generate vmlinux.h and BPF skeletons if bpftool changedAndrii Nakryiko1-2/+3
[ Upstream commit cab62c37be057379a2a17b1b2eacd9dcba1e14dc ] Trigger vmlinux.h and BPF skeletons re-generation if detected that bpftool was re-compiled. Otherwise full `make clean` is required to get updated skeletons, if bpftool is modified. Fixes: acbd06206bbb ("selftests/bpf: Add vmlinux.h selftest exercising tracing of syscalls") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20210318194036.3521577-11-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>