summaryrefslogtreecommitdiff
path: root/tools/include
AgeCommit message (Collapse)AuthorFilesLines
2021-02-12Merge branch 'for-mingo-nolibc' of ↵Ingo Molnar1-38/+115
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull nolibc fixes from Paul E. McKenney. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-02-12bpf: Expose bpf_get_socket_cookie to tracing programsFlorent Revest1-0/+8
This needs a new helper that: - can work in a sleepable context (using sock_gen_cookie) - takes a struct sock pointer and checks that it's not NULL Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
2021-02-12bpf: Be less specific about socket cookies guaranteesFlorent Revest1-4/+4
Since "92acdc58ab11 bpf, net: Rework cookie generator as per-cpu one" socket cookies are not guaranteed to be non-decreasing. The bpf_get_socket_cookie helper descriptions are currently specifying that cookies are non-decreasing but we don't want users to rely on that. Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Florent Revest <revest@chromium.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/bpf/20210210111406.785541-1-revest@chromium.org
2021-02-11tools headers UAPI: Sync linux/prctl.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+3
To pick a new prctl introduced in: 36a6c843fd0d8e02 ("entry: Use different define for selector variable in SUD") That don't result in any changes in tooling: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after Just silences this perf tools build warning: Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-11bpf: Count the number of times recursion was preventedAlexei Starovoitov1-0/+1
Add per-program counter for number of times recursion prevention mechanism was triggered and expose it via show_fdinfo and bpf_prog_info. Teach bpftool to print it. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210210033634.62081-7-alexei.starovoitov@gmail.com
2021-02-10KVM: PPC: Book3S HV: Introduce new capability for 2nd DAWRRavi Bangoria1-0/+1
Introduce KVM_CAP_PPC_DAWR1 which can be used by QEMU to query whether KVM supports 2nd DAWR or not. The capability is by default disabled even when the underlying CPU supports 2nd DAWR. QEMU needs to check and enable it manually to use the feature. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
2021-02-08tools headers uapi: Update tools's copy of linux/perf_event.hKan Liang1-4/+50
To get the changes in these csets: 2a6c6b7d7ad346f0 ("perf/core: Add PERF_SAMPLE_WEIGHT_STRUCT") 61b985e3e775a3a7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids") This cures the following warning during perf's build: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Committer notes: Picked by hand as I had already merged the MMAP buildid patch that also touches perf_event.h and is also only in {acme,tip}/perf/core, not yet upstream. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/1612296553-21962-2-git-send-email-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-02-08module: remove EXPORT_UNUSED_SYMBOL*Christoph Hellwig1-2/+0
EXPORT_UNUSED_SYMBOL* is not actually used anywhere. Remove the unused functionality as we generally just remove unused code anyway. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2021-02-08module: remove EXPORT_SYMBOL_GPL_FUTUREChristoph Hellwig1-1/+0
As far as I can tell this has never been used at all, and certainly not any time recently. Reviewed-by: Miroslav Benes <mbenes@suse.cz> Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2021-02-04Revert "GTP: add support for flow based tunneling API"Jonas Bonn1-1/+0
This reverts commit 9ab7e76aefc97a9aa664accb59d6e8dc5e52514a. This patch was committed without maintainer approval and despite a number of unaddressed concerns from review. There are several issues that impede the acceptance of this patch and that make a reversion of this particular instance of these changes the best way forward: i) the patch contains several logically separate changes that would be better served as smaller patches (for review purposes) ii) functionality like the handling of end markers has been introduced without further explanation iii) symmetry between the handling of GTPv0 and GTPv1 has been unnecessarily broken iv) the patchset produces 'broken' packets when extension headers are included v) there are no available userspace tools to allow for testing this functionality vi) there is an unaddressed Coverity report against the patch concering memory leakage vii) most importantly, the patch contains a large amount of superfluous churn that impedes other ongoing work with this driver This patch will be reworked into a series that aligns with other ongoing work and facilitates review. Signed-off-by: Jonas Bonn <jonas@norrbonn.se> Acked-by: Harald Welte <laforge@gnumonks.org> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-26objtool: Combine UNWIND_HINT_RET_OFFSET and UNWIND_HINT_FUNCJosh Poimboeuf1-1/+4
The ORC metadata generated for UNWIND_HINT_FUNC isn't actually very func-like. With certain usages it can cause stack state mismatches because it doesn't set the return address (CFI_RA). Also, users of UNWIND_HINT_RET_OFFSET no longer need to set a custom return stack offset. Instead they just need to specify a func-like situation, so the current ret_offset code is hacky for no good reason. Solve both problems by simplifying the RET_OFFSET handling and converting it into a more useful UNWIND_HINT_FUNC. If we end up needing the old 'ret_offset' functionality again in the future, we should be able to support it pretty easily with the addition of a custom 'sp_offset' in UNWIND_HINT_FUNC. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/db9d1f5d79dddfbb3725ef6d8ec3477ad199948d.1611263462.git.jpoimboe@redhat.com
2021-01-26objtool: Add asm version of STACK_FRAME_NON_STANDARDJosh Poimboeuf1-0/+8
To be used for adding asm functions to the ignore list. The "aw" is needed to help the ELF section metadata match GCC-created sections. Otherwise the linker creates duplicate sections instead of combining them. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Link: https://lore.kernel.org/r/8faa476f9a5ac89af27944ec184c89f95f3c6c49.1611263462.git.jpoimboe@redhat.com
2021-01-26samples/bpf: Set flag __SANE_USERSPACE_TYPES__ for MIPS to fix build warningsTiezhu Yang1-0/+3
There exists many build warnings when make M=samples/bpf on the Loongson platform, this issue is MIPS related, x86 compiles just fine. Here are some warnings: CC samples/bpf/ibumad_user.o samples/bpf/ibumad_user.c: In function ‘dump_counts’: samples/bpf/ibumad_user.c:46:24: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] printf("0x%02x : %llu\n", key, value); ~~~^ ~~~~~ %lu CC samples/bpf/offwaketime_user.o samples/bpf/offwaketime_user.c: In function ‘print_ksym’: samples/bpf/offwaketime_user.c:34:17: warning: format ‘%llx’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] printf("%s/%llx;", sym->name, addr); ~~~^ ~~~~ %lx samples/bpf/offwaketime_user.c: In function ‘print_stack’: samples/bpf/offwaketime_user.c:68:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘__u64’ {aka ‘long unsigned int’} [-Wformat=] printf(";%s %lld\n", key->waker, count); ~~~^ ~~~~~ %ld MIPS needs __SANE_USERSPACE_TYPES__ before <linux/types.h> to select 'int-ll64.h' in arch/mips/include/uapi/asm/types.h, then it can avoid build warnings when printing __u64 with %llu, %llx or %lld. The header tools/include/linux/types.h defines __SANE_USERSPACE_TYPES__, it seems that we can include <linux/types.h> in the source files which have build warnings, but it has no effect due to actually it includes usr/include/linux/types.h instead of tools/include/linux/types.h, the problem is that "usr/include" is preferred first than "tools/include" in samples/bpf/Makefile, that sounds like a ugly hack to -Itools/include before -Iusr/include. So define __SANE_USERSPACE_TYPES__ for MIPS in samples/bpf/Makefile is proper, if add "TPROGS_CFLAGS += -D__SANE_USERSPACE_TYPES__" in samples/bpf/Makefile, it appears the following error: Auto-detecting system features: ... libelf: [ on ] ... zlib: [ on ] ... bpf: [ OFF ] BPF API too old make[3]: *** [Makefile:293: bpfdep] Error 1 make[2]: *** [Makefile:156: all] Error 2 With #ifndef __SANE_USERSPACE_TYPES__ in tools/include/linux/types.h, the above error has gone and this ifndef change does not hurt other compilations. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/1611551146-14052-1-git-send-email-yangtiezhu@loongson.cn
2021-01-26tools, headers: Sync struct bpf_perf_event_dataFlorian Lehner1-0/+1
Update struct bpf_perf_event_data with the addr field to match the tools headers with the kernel headers. Signed-off-by: Florian Lehner <dev@der-flo.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210123185221.23946-1-dev@der-flo.net
2021-01-24fs: add mount_setattr()Christian Brauner1-1/+3
This implements the missing mount_setattr() syscall. While the new mount api allows to change the properties of a superblock there is currently no way to change the properties of a mount or a mount tree using file descriptors which the new mount api is based on. In addition the old mount api has the restriction that mount options cannot be applied recursively. This hasn't changed since changing mount options on a per-mount basis was implemented in [1] and has been a frequent request not just for convenience but also for security reasons. The legacy mount syscall is unable to accommodate this behavior without introducing a whole new set of flags because MS_REC | MS_REMOUNT | MS_BIND | MS_RDONLY | MS_NOEXEC | [...] only apply the mount option to the topmost mount. Changing MS_REC to apply to the whole mount tree would mean introducing a significant uapi change and would likely cause significant regressions. The new mount_setattr() syscall allows to recursively clear and set mount options in one shot. Multiple calls to change mount options requesting the same changes are idempotent: int mount_setattr(int dfd, const char *path, unsigned flags, struct mount_attr *uattr, size_t usize); Flags to modify path resolution behavior are specified in the @flags argument. Currently, AT_EMPTY_PATH, AT_RECURSIVE, AT_SYMLINK_NOFOLLOW, and AT_NO_AUTOMOUNT are supported. If useful, additional lookup flags to restrict path resolution as introduced with openat2() might be supported in the future. The mount_setattr() syscall can be expected to grow over time and is designed with extensibility in mind. It follows the extensible syscall pattern we have used with other syscalls such as openat2(), clone3(), sched_{set,get}attr(), and others. The set of mount options is passed in the uapi struct mount_attr which currently has the following layout: struct mount_attr { __u64 attr_set; __u64 attr_clr; __u64 propagation; __u64 userns_fd; }; The @attr_set and @attr_clr members are used to clear and set mount options. This way a user can e.g. request that a set of flags is to be raised such as turning mounts readonly by raising MOUNT_ATTR_RDONLY in @attr_set while at the same time requesting that another set of flags is to be lowered such as removing noexec from a mount tree by specifying MOUNT_ATTR_NOEXEC in @attr_clr. Note, since the MOUNT_ATTR_<atime> values are an enum starting from 0, not a bitmap, users wanting to transition to a different atime setting cannot simply specify the atime setting in @attr_set, but must also specify MOUNT_ATTR__ATIME in the @attr_clr field. So we ensure that MOUNT_ATTR__ATIME can't be partially set in @attr_clr and that @attr_set can't have any atime bits set if MOUNT_ATTR__ATIME isn't set in @attr_clr. The @propagation field lets callers specify the propagation type of a mount tree. Propagation is a single property that has four different settings and as such is not really a flag argument but an enum. Specifically, it would be unclear what setting and clearing propagation settings in combination would amount to. The legacy mount() syscall thus forbids the combination of multiple propagation settings too. The goal is to keep the semantics of mount propagation somewhat simple as they are overly complex as it is. The @userns_fd field lets user specify a user namespace whose idmapping becomes the idmapping of the mount. This is implemented and explained in detail in the next patch. [1]: commit 2e4b7fcd9260 ("[PATCH] r/o bind mounts: honor mount writer counts at remount") Link: https://lore.kernel.org/r/20210121131959.646623-35-christian.brauner@ubuntu.com Cc: David Howells <dhowells@redhat.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Cc: linux-api@vger.kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-23sch_htb: Hierarchical QoS hardware offloadMaxim Mikityanskiy1-0/+1
HTB doesn't scale well because of contention on a single lock, and it also consumes CPU. This patch adds support for offloading HTB to hardware that supports hierarchical rate limiting. In the offload mode, HTB passes control commands to the driver using ndo_setup_tc. The driver has to replicate the whole hierarchy of classes and their settings (rate, ceil) in the NIC. Every modification of the HTB tree caused by the admin results in ndo_setup_tc being called. After this setup, the HTB algorithm is done completely in the NIC. An SQ (send queue) is created for every leaf class and attached to the hierarchy, so that the NIC can calculate and obey aggregated rate limits, too. In the future, it can be changed, so that multiple SQs will back a single leaf class. ndo_select_queue is responsible for selecting the right queue that serves the traffic class of each packet. The data path works as follows: a packet is classified by clsact, the driver selects a hardware queue according to its class, and the packet is enqueued into this queue's qdisc. This solution addresses two main problems of scaling HTB: 1. Contention by flow classification. Currently the filters are attached to the HTB instance as follows: # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80 classid 1:10 It's possible to move classification to clsact egress hook, which is thread-safe and lock-free: # tc filter add dev eth0 egress protocol ip flower dst_port 80 action skbedit priority 1:10 This way classification still happens in software, but the lock contention is eliminated, and it happens before selecting the TX queue, allowing the driver to translate the class to the corresponding hardware queue in ndo_select_queue. Note that this is already compatible with non-offloaded HTB and doesn't require changes to the kernel nor iproute2. 2. Contention by handling packets. HTB is not multi-queue, it attaches to a whole net device, and handling of all packets takes the same lock. When HTB is offloaded, it registers itself as a multi-queue qdisc, similarly to mq: HTB is attached to the netdev, and each queue has its own qdisc. Some features of HTB may be not supported by some particular hardware, for example, the maximum number of classes may be limited, the granularity of rate and ceil parameters may be different, etc. - so, the offload is not enabled by default, a new parameter is used to enable it: # tc qdisc replace dev eth0 root handle 1: htb offload Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-21tools/nolibc: Fix position of -lgcc in the documented exampleWilly Tarreau1-1/+1
The documentation header in the nolibc.h file provides an example command line, but it places the -lgcc argument before the source files, which can fail with libgcc.a (e.g. on ARM when uidiv is needed). This commit therefore moves the -lgcc to the end of the command line, hopefully before this example leaks into makefiles. This is a port of nolibc's upstream commit b5e282089223 to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Emit detailed error for missing alternate syscall number ↵Willy Tarreau1-13/+39
definitions Some syscalls can be implemented from different __NR_* variants. For example, sys_dup2() can be implemented based on __NR_dup3 or __NR_dup2. In this case it is useful to mention both alternatives in error messages when neither are detected. This information will help the user search for the right one (e.g __NR_dup3) instead of just the fallback (__NR_dup2) which might not exist on the platform. This is a port of nolibc's upstream commit a21080d2ba41 to the Linux kernel. Suggested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/lkml/20210120145447.GC77728@C02TD0UTHF1T.local/ Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Remove incorrect definitions of __ARCH_WANT_*Willy Tarreau1-8/+0
The __ARCH_WANT_* definitions were added in order to support aarch64 when it was missing some syscall definitions (including __NR_dup2, __NR_fork, and __NR_getpgrp), but these __ARCH_WANT_* definitions were actually wrong because these syscalls do not exist on this platform. Defining these resulted in exposing invalid definitions, resulting in failures on aarch64. The missing syscalls were since implemented based on the newer ones (__NR_dup3, __NR_clone, __NR_getpgid) so these incorrect __ARCH_WANT_* definitions are no longer needed. Thanks to Mark Rutland for spotting this incorrect analysis and explaining why it was wrong. This is a port of nolibc's upstream commit 00b1b0d9b2a4 to the Linux kernel. Reported-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/lkml/20210119153147.GA5083@paulmck-ThinkPad-P72 Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Get timeval, timespec and timezone from linux/time.hWilly Tarreau1-18/+1
The definitions of timeval(), timespec() and timezone() conflict with linux/time.h when building, so this commit takes them directly from linux/time.h. This is a port of nolibc's upstream commit dc45f5426b0c to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Implement poll() based on ppoll()Willy Tarreau1-0/+10
Some architectures like arm64 do not implement poll() and have to use ppoll() instead. This commit therefore makes poll() use ppoll() when available. This is a port of nolibc's upstream commit 800f75c13ede to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Implement fork() based on clone()Willy Tarreau1-0/+10
Some archs such as arm64 do not have fork() and have to use clone() instead. This commit therefore makes fork() use clone() when available. This requires including signal.h to get the definition of SIGCHLD. This is a port of nolibc's upstream commit d2dc42fd6149 to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Make getpgrp() fall back to getpgid(0)Willy Tarreau1-1/+19
The getpgrp() syscall is not implemented on arm64, so this commit instead uses getpgid(0) when getpgrp() is not available. This is a port of nolibc's upstream commit 2379f25073f9 to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Make dup2() rely on dup3() when availableWilly Tarreau1-0/+26
A recent boot failure on 5.4-rc3 on arm64 revealed that sys_dup2() is not available and that only sys_dup3() is implemented. This commit detects this and falls back to sys_dup3() when available. This is a port of nolibc's upstream commit fd5272ec2c66 to the Linux kernel. Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21tools/nolibc: Add the definition for dup()Willy Tarreau1-0/+12
This commit adds the dup() function, which was omitted when sys_dup() was defined. This is a port of nolibc's upstream commit 47cc42a79c92 to the Linux kernel. Fixes: 66b6f755ad45 ("rcutorture: Import a copy of nolibc") Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-21bpf: Remove extra lock_sock for TCP_ZEROCOPY_RECEIVEStanislav Fomichev1-0/+357
Add custom implementation of getsockopt hook for TCP_ZEROCOPY_RECEIVE. We skip generic hooks for TCP_ZEROCOPY_RECEIVE and have a custom call in do_tcp_getsockopt using the on-stack data. This removes 3% overhead for locking/unlocking the socket. Without this patch: 3.38% 0.07% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt | --3.30%--__cgroup_bpf_run_filter_getsockopt | --0.81%--__kmalloc With the patch applied: 0.52% 0.12% tcp_mmap [kernel.kallsyms] [k] __cgroup_bpf_run_filter_getsockopt_kern Note, exporting uapi/tcp.h requires removing netinet/tcp.h from test_progs.h because those headers have confliciting definitions. Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20210115163501.805133-2-sdf@google.com
2021-01-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-5/+2
Conflicts: drivers/net/can/dev.c commit 03f16c5075b2 ("can: dev: can_restart: fix use after free bug") commit 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir") Code move. drivers/net/dsa/b53/b53_common.c commit 8e4052c32d6b ("net: dsa: b53: fix an off by one in checking "vlan->vid"") commit b7a9e0da2d1c ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects") Field rename. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-20Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo2-5/+2
To pick up fixes. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-16GTP: add support for flow based tunneling APIPravin B Shelar1-0/+1
Following patch add support for flow based tunneling API to send and recv GTP tunnel packet over tunnel metadata API. This would allow this device integration with OVS or eBPF using flow based tunneling APIs. Signed-off-by: Pravin B Shelar <pbshelar@fb.com> Link: https://lore.kernel.org/r/20210110070021.26822-1-pbshelar@fb.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-15tools headers: Syncronize linux/build_bug.h with the kernel sourcesArnaldo Carvalho de Melo1-5/+0
To pick up the changes in: 3a176b94609a18f5 ("Revert "kbuild: avoid static_assert for genksyms"") And silence this perf build warning: Warning: Kernel ABI header at 'tools/include/linux/build_bug.h' differs from latest version at 'include/linux/build_bug.h' diff -u tools/include/linux/build_bug.h include/linux/build_bug.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-15tools headers UAPI: Sync kvm.h headers with the kernel sourcesArnaldo Carvalho de Melo1-0/+2
To pick the changes in: 647daca25d24fb6e ("KVM: SVM: Add support for booting APs in an SEV-ES guest") That don't cause any tooling change, just silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-01-15bpf: Add bitwise atomic instructionsBrendan Jackman1-0/+6
This adds instructions for atomic[64]_[fetch_]and atomic[64]_[fetch_]or atomic[64]_[fetch_]xor All these operations are isomorphic enough to implement with the same verifier, interpreter, and x86 JIT code, hence being a single commit. The main interesting thing here is that x86 doesn't directly support the fetch_ version these operations, so we need to generate a CMPXCHG loop in the JIT. This requires the use of two temporary registers, IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-10-jackmanb@google.com
2021-01-15bpf: Add instructions for atomic_[cmp]xchgBrendan Jackman2-1/+5
This adds two atomic opcodes, both of which include the BPF_FETCH flag. XCHG without the BPF_FETCH flag would naturally encode atomic_set. This is not supported because it would be of limited value to userspace (it doesn't imply any barriers). CMPXCHG without BPF_FETCH woulud be an atomic compare-and-write. We don't have such an operation in the kernel so it isn't provided to BPF either. There are two significant design decisions made for the CMPXCHG instruction: - To solve the issue that this operation fundamentally has 3 operands, but we only have two register fields. Therefore the operand we compare against (the kernel's API calls it 'old') is hard-coded to be R0. x86 has similar design (and A64 doesn't have this problem). A potential alternative might be to encode the other operand's register number in the immediate field. - The kernel's atomic_cmpxchg returns the old value, while the C11 userspace APIs return a boolean indicating the comparison result. Which should BPF do? A64 returns the old value. x86 returns the old value in the hard-coded register (and also sets a flag). That means return-old-value is easier to JIT, so that's what we use. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com
2021-01-15bpf: Add BPF_FETCH field / create atomic_fetch_add instructionBrendan Jackman2-0/+4
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC instructions, in order to have the previous value of the atomically-modified memory location loaded into the src register after an atomic op is carried out. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com
2021-01-15bpf: Rename BPF_XADD and prepare to encode other atomics in .immBrendan Jackman2-6/+14
A subsequent patch will add additional atomic operations. These new operations will use the same opcode field as the existing XADD, with the immediate discriminating different operations. In preparation, rename the instruction mode BPF_ATOMIC and start calling the zero immediate BPF_ADD. This is possible (doesn't break existing valid BPF progs) because the immediate field is currently reserved MBZ and BPF_ADD is zero. All uses are removed from the tree but the BPF_XADD definition is kept around to avoid breaking builds for people including kernel headers. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@gmail.com> Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com
2021-01-12bpf: Clarify return value of probe str helpersBrendan Jackman1-5/+5
When the buffer is too small to contain the input string, these helpers return the length of the buffer, not the length of the original string. This tries to make the docs totally clear about that, since "the length of the [copied ]string" could also refer to the length of the input. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20210112123422.2011234-1-jackmanb@google.com
2020-12-28tools headers uapi: Sync tools/include/uapi/linux/perf_event.hJiri Olsa1-5/+37
Syncing tools's uapi with mmap2 build id data changes. Committer notes: I'm taking the tools/ bits, so this will be in fact ahead of the kernel till the bpf/perf-kernel bits are merged. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20201214105457.543111-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-24tools headers UAPI: Sync kvm.h headers with the kernel sourcesArnaldo Carvalho de Melo1-1/+55
To pick the changes in: fb04a1eddb1a65b6 ("KVM: X86: Implement ring-based dirty memory tracking") That result in these change in tooling: $ tools/perf/trace/beauty/kvm_ioctl.sh > before $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h $ cp arch/x86/include/uapi/asm/kvm.h tools/arch/x86/include/uapi/asm/kvm.h $ tools/perf/trace/beauty/kvm_ioctl.sh > after $ diff -u before after --- before 2020-12-21 11:55:45.229737066 -0300 +++ after 2020-12-21 11:55:56.379983393 -0300 @@ -90,6 +90,7 @@ [0xc0] = "CLEAR_DIRTY_LOG", [0xc1] = "GET_SUPPORTED_HV_CPUID", [0xc6] = "X86_SET_MSR_FILTER", + [0xc7] = "RESET_DIRTY_RINGS", [0xe0] = "CREATE_DEVICE", [0xe1] = "SET_DEVICE_ATTR", [0xe2] = "GET_DEVICE_ATTR", $ Now one can use that string in filters when tracing ioctls, etc. And silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h' diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/kvm.h' differs from latest version at 'arch/x86/include/uapi/asm/kvm.h' diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-24tools headers UAPI: Update epoll_pwait2 affected filesArnaldo Carvalho de Melo1-1/+3
To pick the changes from: b0a0c2615f6f199a ("epoll: wire up syscall epoll_pwait2") That addresses these perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h' diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl' diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers UAPI: Update asm-generic/unistd.hArnaldo Carvalho de Melo1-1/+1
Just a comment change, trivial. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers UAPI: Sync linux/prctl.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+5
To pick a new prctl introduced in: 1446e1df9eb183fd ("kernel: Implement selective syscall userspace redirection") That results in: $ tools/perf/trace/beauty/prctl_option.sh > before $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h $ tools/perf/trace/beauty/prctl_option.sh > after $ diff -u before after --- before 2020-12-17 15:00:42.012537367 -0300 +++ after 2020-12-17 15:00:49.832699463 -0300 @@ -53,6 +53,7 @@ [56] = "GET_TAGGED_ADDR_CTRL", [57] = "SET_IO_FLUSHER", [58] = "GET_IO_FLUSHER", + [59] = "SET_SYSCALL_USER_DISPATCH", }; static const char *prctl_set_mm_options[] = { [1] = "START_CODE", $ Now users can do: # perf trace -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH" ^C# # trace -v -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH" New filter for syscalls:sys_enter_prctl: (option==0x3b) && (common_pid != 5519 && common_pid != 3404) ^C# And also when prctl appears in a session, its options will be translated to the string. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers UAPI: Sync linux/fscrypt.h with the kernel sourcesArnaldo Carvalho de Melo1-3/+2
To pick the changes from: 3ceb6543e9cf6ed8 ("fscrypt: remove kernel-internal constants from UAPI header") That don't result in any changes in tooling, just addressing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h' diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers UAPI: Sync linux/const.h with the kernel headersArnaldo Carvalho de Melo1-0/+5
To pick up the changes in: a85cbe6159ffc973 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>") That causes no changes in tooling, just addresses this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h' diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Petr Vorel <petr.vorel@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers: Update linux/ctype.h with the kernel sourcesArnaldo Carvalho de Melo1-5/+12
To pick up the changes in: caabdd0f59a9771e ("ctype.h: remove duplicate isdigit() helper") Addressing this perf build warning: Warning: Kernel ABI header at 'tools/include/linux/ctype.h' differs from latest version at 'include/linux/ctype.h' diff -u tools/include/linux/ctype.h include/linux/ctype.h And we need to continue using the combination of: inline __isdigit() #define isdigit() __isdigit When the __has_builtin() thing isn't available, as it is a builtin in older systems with it as a builtin but with compilers not hacinv __has_builtin(), rendering the __has_builtin() check useless otherwise. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers: Add conditional __has_builtin()Arnaldo Carvalho de Melo1-0/+11
As it'll be used by the ctype.h sync with its kernel source original. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18tools headers: Get tools's linux/compiler.h closer to the kernel'sArnaldo Carvalho de Melo2-3/+11
We're cherry picking stuff from the kernel to allow for the other headers that we keep in sync via tools/perf/check-headers.sh to work, so introduce linux/compiler_types.h and from there get the compiler specific stuff. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17tools headers UAPI: Sync linux/stat.h with the kernel sourcesArnaldo Carvalho de Melo1-3/+6
To pick the changes in: 72d1249e2ffdbc34 ("uapi: fix statx attribute value overlap for DAX & MOUNT_ROOT") That don't cause any change in tooling, just addresses this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/stat.h' differs from latest version at 'include/uapi/linux/stat.h' diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Eric Sandeen <sandeen@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17tools headers: Syncronize linux/build_bug.h with the kernel sourcesArnaldo Carvalho de Melo1-0/+5
To pick up the changes in: 14dc3983b5dff513 ("kbuild: avoid static_assert for genksyms") And silence this perf build warning: Warning: Kernel ABI header at 'tools/include/linux/build_bug.h' differs from latest version at 'include/linux/build_bug.h' diff -u tools/include/linux/build_bug.h include/linux/build_bug.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo4-10/+111
To pick up fixes and check what UAPI headers need to be synched. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-17tools headers UAPI: Update tools's copy of linux/perf_event.hKan Liang1-1/+5
To get the changes in: commit 8d97e71811aa ("perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE") commit 995f088efebe ("perf/core: Add support for PERF_SAMPLE_CODE_PAGE_SIZE") This silences this perf tools build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Link: http://lore.kernel.org/lkml/20201130172803.2676-2-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>