summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-09-28Merge branch 'octeontx2-af-external-ptp-clock'David S. Miller13-79/+292
Hariprasad Kelam says: ==================== Externel ptp clock support Externel ptp support is required in a scenario like connecting a external timing device to the chip for time synchronization. This series of patches adds support to ptp driver to use external clock and enables PTP config in CN10K MAC block (RPM). Currently PTP configuration is left unchanged in FLR handler these patches addresses the same. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> ,
2021-09-28octeontx2-af: Add external ptp input clockYi Guo7-6/+187
PTP hardware block can be configured to utilize the external clock. Also the current ptp timestamp can be captured when external trigger is applied on a gpio pin. These features are required in scenarios like connecting a external timing device to the chip for time synchronization. The timing device provides the clock and trigger(PPS signal) to the PTP block. This patch does the following: 1. configures PTP block to use external clock frequency and timestamp capture on external event. 2. sends PTP_REQ_EXTTS events to kernel ptp phc susbsytem with captured timestamps 3. aligns PPS edge to adjusted ptp clock in the ptp device by setting the PPS_THRESH to the reminder of the last timestamp value captured by external PPS Signed-off-by: Yi Guo <yig@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28octeontx2-af: Use ptp input clock info from firmware dataSubbaraya Sundeep3-51/+33
The input clock frequency of PTP block is figured out from hardware reset block currently. The firmware data already has this info in sclk. Hence simplify ptp driver to use sclk from firmware data. Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28octeontx2-af: cn10k: RPM hardware timestamp configurationHariprasad Kelam7-26/+58
MAC on CN10K support hardware timestamping such that 8 bytes addition header is prepended to incoming packets. This patch does necessary configuration to enable Hardware time stamping upon receiving request from PF netdev interfaces. Timestamp configuration is different on MAC (CGX) Octeontx2 silicon and MAC (RPM) OcteonTX3 CN10k. Based on silicon variant appropriate fn() pointer is called. Refactor MAC specific mbox messages to remove unnecessary gaps in mboxids. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28octeontx2-af: Reset PTP config in FLR handlerHarman Kalra3-0/+18
Upon receiving ptp config request from netdev interface , Octeontx2 MAC block CGX is configured to append timestamp to every incoming packet and NPC config is updated with DMAC offset change. Currently this configuration is not reset in FLR handler. This patch resets the same. Signed-off-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: hns3: fix hclge_dbg_dump_tm_pg() stack usageArnd Bergmann1-4/+24
This function copies strings around between multiple buffers including a large on-stack array that causes a build warning on 32-bit systems: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c: In function 'hclge_dbg_dump_tm_pg': drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_debugfs.c:782:1: error: the frame size of 1424 bytes is larger than 1400 bytes [-Werror=frame-larger-than=] The function can probably be cleaned up a lot, to go back to printing directly into the output buffer, but dynamically allocating the structure is a simpler workaround for now. Fixes: 04d96139ddb3 ("net: hns3: refine function hclge_dbg_dump_tm_pri()") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: mdio: mscc-miim: Fix the mdio controllerHoratiu Vultur1-5/+10
According to the documentation the second resource is optional. But the blamed commit ignores that and if the resource is not there it just fails. This patch reverts that to still allow the second resource to be optional because other SoC have the some MDIO controller and doesn't need to second resource. Fixes: 672a1c394950 ("net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Cai Huoqing <caihuoqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net/tls: support SM4 CCM algorithmTianjia Zhang2-5/+18
The IV of CCM mode has special requirements, this patch supports CCM mode of SM4 algorithm. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28af_unix: Return errno instead of NULL in unix_create1().Kuniyuki Iwashima1-17/+32
unix_create1() returns NULL on error, and the callers assume that it never fails for reasons other than out of memory. So, the callers always return -ENOMEM when unix_create1() fails. However, it also returns NULL when the number of af_unix sockets exceeds twice the limit controlled by sysctl: fs.file-max. In this case, the callers should return -ENFILE like alloc_empty_file(). This patch changes unix_create1() to return the correct error value instead of NULL on error. Out of curiosity, the assumption has been wrong since 1999 due to this change introduced in 2.2.4 [0]. diff -u --recursive --new-file v2.2.3/linux/net/unix/af_unix.c linux/net/unix/af_unix.c --- v2.2.3/linux/net/unix/af_unix.c Tue Jan 19 11:32:53 1999 +++ linux/net/unix/af_unix.c Sun Mar 21 07:22:00 1999 @@ -388,6 +413,9 @@ { struct sock *sk; + if (atomic_read(&unix_nr_socks) >= 2*max_files) + return NULL; + MOD_INC_USE_COUNT; sk = sk_alloc(PF_UNIX, GFP_KERNEL, 1); if (!sk) { [0]: https://cdn.kernel.org/pub/linux/kernel/v2.2/patch-2.2.4.gz Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: udp: annotate data race around udp_sk(sk)->corkflagEric Dumazet2-6/+6
up->corkflag field can be read or written without any lock. Annotate accesses to avoid possible syzbot/KCSAN reports. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: sun: SUNVNET_COMMON should depend on INETRandy Dunlap1-0/+1
When CONFIG_INET is not set, there are failing references to IPv4 functions, so make this driver depend on INET. Fixes these build errors: sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_start_xmit_common': sunvnet_common.c:(.text+0x1a68): undefined reference to `__icmp_send' sparc64-linux-ld: drivers/net/ethernet/sun/sunvnet_common.o: in function `sunvnet_poll_common': sunvnet_common.c:(.text+0x358c): undefined reference to `ip_send_check' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Aaron Young <aaron.young@oracle.com> Cc: Rashmi Narasimhan <rashmi.narasimhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28ionic: fix gathering of debug statsShannon Nelson1-9/+0
Don't print stats for which we haven't reserved space as it can cause nasty memory bashing and related bad behaviors. Fixes: aa620993b1e5 ("ionic: pull per-q stats work out of queue loops") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tDavid S. Miller1-7/+15
nguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-09-27 This series contains updates to e100 driver only. Jake corrects under allocation of register buffer due to incorrect calculations and fixes buffer overrun of register dump. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: ipv6: use ipv6-y directly instead of ipv6-objsMasahiro Yamada1-4/+2
Kbuild supports <modname>-y as well as <modname>-objs. This simplifies the Makefile. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: ipv6: squash $(ipv6-offload) in MakefileMasahiro Yamada1-3/+2
Assign the objects directly to obj-$(CONFIG_INET). Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28dmascc: add CONFIG_VIRT_TO_BUS dependencyArnd Bergmann1-0/+1
Many architectures don't define virt_to_bus() any more, as drivers should be using the dma-mapping interfaces where possible: In file included from drivers/net/hamradio/dmascc.c:27: drivers/net/hamradio/dmascc.c: In function 'tx_on': drivers/net/hamradio/dmascc.c:976:30: error: implicit declaration of function 'virt_to_bus'; did you mean 'virt_to_fix'? [-Werror=implicit-function-declaration] 976 | virt_to_bus(priv->tx_buf[priv->tx_tail]) + n); | ^~~~~~~~~~~ arch/arm/include/asm/dma.h:109:52: note: in definition of macro 'set_dma_addr' 109 | __set_dma_addr(chan, (void *)__bus_to_virt(addr)) | ^~~~ Add the Kconfig dependency to prevent this from being built on architectures without virt_to_bus(). Fixes: bc1abb9e55ce ("dmascc: use proper 'virt_to_bus()' rather than casting to 'int'") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: ks8851: fix link errorArnd Bergmann2-4/+10
An object file cannot be built for both loadable module and built-in use at the same time: arm-linux-gnueabi-ld: drivers/net/ethernet/micrel/ks8851_common.o: in function `ks8851_probe_common': ks8851_common.c:(.text+0xf80): undefined reference to `__this_module' Change the ks8851_common code to be a standalone module instead, and use Makefile logic to ensure this is built-in if at least one of its two users is. Fixes: 797047f875b5 ("net: ks8851: Implement Parallel bus operations") Link: https://lore.kernel.org/netdev/20210125121937.3900988-1-arnd@kernel.org/ Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Marek Vasut <marex@denx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: stmmac: fix off-by-one error in sanity checkArnd Bergmann1-2/+2
My previous patch had an off-by-one error in the added sanity check, the arrays are MTL_MAX_{RX,TX}_QUEUES long, so if that index is that number, it has overflown. The patch silenced the warning anyway because the strings could no longer overlap with the input, but they could still overlap with other fields. Fixes: 3e0d5699a975 ("net: stmmac: fix gcc-10 -Wrestrict warning") Reported-by: Russell King (Oracle) <linux@armlinux.org.uk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28am65-cpsw: avoid null pointer arithmeticArnd Bergmann1-1/+1
clang warns about arithmetic on NULL pointers: drivers/net/ethernet/ti/am65-cpsw-ethtool.c:71:2: error: performing pointer subtraction with a null pointer has undefined behavior [-Werror,-Wnull-pointer-subtraction] AM65_CPSW_REGDUMP_REC(AM65_CPSW_REGDUMP_MOD_NUSS, 0x0, 0x1c), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/ti/am65-cpsw-ethtool.c:64:29: note: expanded from macro 'AM65_CPSW_REGDUMP_REC' .hdr.len = (((u32 *)(end)) - ((u32 *)(start)) + 1) * sizeof(u32) * 2 + \ ^ ~~~~~~~~~~~~~~~~ The expression here is easily changed to a calculation based on integers that is no less readable. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: mac80211: check return value of rhashtable_initMichelleJin1-1/+4
When rhashtable_init() fails, it returns -EINVAL. However, since error return value of rhashtable_init is not checked, it can cause use of uninitialized pointers. So, fix unhandled errors of rhashtable_init. Signed-off-by: MichelleJin <shjy180909@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net: ipv6: check return value of rhashtable_initMichelleJin3-6/+12
When rhashtable_init() fails, it returns -EINVAL. However, since error return value of rhashtable_init is not checked, it can cause use of uninitialized pointers. So, fix unhandled errors of rhashtable_init. Signed-off-by: MichelleJin <shjy180909@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28net/mlx5e: check return value of rhashtable_initMichelleJin1-3/+11
When rhashtable_init() fails, it returns -EINVAL. However, since error return value of rhashtable_init is not checked, it can cause use of uninitialized pointers. So, fix unhandled errors of rhashtable_init. Signed-off-by: MichelleJin <shjy180909@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-28bpf, x86: Fix bpf mapping of atomic fetch implementationJohan Almbladh1-5/+8
Fix the case where the dst register maps to %rax as otherwise this produces an incorrect mapping with the implementation in 981f94c3e921 ("bpf: Add bitwise atomic instructions") as %rax is clobbered given it's part of the cmpxchg as operand. The issue is similar to b29dd96b905f ("bpf, x86: Fix BPF_FETCH atomic and/or/ xor with r0 as src") just that the case of dst register was missed. Before, dst=r0 (%rax) src=r2 (%rsi): [...] c5: mov %rax,%r10 c8: mov 0x0(%rax),%rax <---+ (broken) cc: mov %rax,%r11 | cf: and %rsi,%r11 | d2: lock cmpxchg %r11,0x0(%rax) <---+ d8: jne 0x00000000000000c8 | da: mov %rax,%rsi | dd: mov %r10,%rax | [...] | | After, dst=r0 (%rax) src=r2 (%rsi): | | [...] | da: mov %rax,%r10 | dd: mov 0x0(%r10),%rax <---+ (fixed) e1: mov %rax,%r11 | e4: and %rsi,%r11 | e7: lock cmpxchg %r11,0x0(%r10) <---+ ed: jne 0x00000000000000dd ef: mov %rax,%rsi f2: mov %r10,%rax [...] The remaining combinations were fine as-is though: After, dst=r9 (%r15) src=r0 (%rax): [...] dc: mov %rax,%r10 df: mov 0x0(%r15),%rax e3: mov %rax,%r11 e6: and %r10,%r11 e9: lock cmpxchg %r11,0x0(%r15) ef: jne 0x00000000000000df _ f1: mov %rax,%r10 | (unneeded, but f4: mov %r10,%rax _| not a problem) [...] After, dst=r9 (%r15) src=r4 (%rcx): [...] de: mov %rax,%r10 e1: mov 0x0(%r15),%rax e5: mov %rax,%r11 e8: and %rcx,%r11 eb: lock cmpxchg %r11,0x0(%r15) f1: jne 0x00000000000000e1 f3: mov %rax,%rcx f6: mov %r10,%rax [...] The case of dst == src register is rejected by the verifier and therefore not supported, but x86 JIT also handles this case just fine. After, dst=r0 (%rax) src=r0 (%rax): [...] eb: mov %rax,%r10 ee: mov 0x0(%r10),%rax f2: mov %rax,%r11 f5: and %r10,%r11 f8: lock cmpxchg %r11,0x0(%r10) fe: jne 0x00000000000000ee 100: mov %rax,%r10 103: mov %r10,%rax [...] Fixes: 981f94c3e921 ("bpf: Add bitwise atomic instructions") Reported-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Brendan Jackman <jackmanb@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-09-28ALSA: pcsp: Make hrtimer forwarding more robustThomas Gleixner1-1/+1
The hrtimer callback pcsp_do_timer() prepares rearming of the timer with hrtimer_forward(). hrtimer_forward() is intended to provide a mechanism to forward the expiry time of the hrtimer by a multiple of the period argument so that the expiry time greater than the time provided in the 'now' argument. pcsp_do_timer() invokes hrtimer_forward() with the current timer expiry time as 'now' argument. That's providing a periodic timer expiry, but is not really robust when the timer callback is delayed so that the resulting new expiry time is already in the past which causes the callback to be invoked immediately again. If the timer is delayed then the back to back invocation is not really making it better than skipping the missed periods. Sound is distorted in any case. Use hrtimer_forward_now() which ensures that the next expiry is in the future. This prevents hogging the CPU in the timer expiry code and allows later on to remove hrtimer_forward() from the public interfaces. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: alsa-devel@alsa-project.org Cc: Takashi Iwai <tiwai@suse.com> Cc: Jaroslav Kysela <perex@perex.cz> Link: https://lore.kernel.org/r/20210923153339.623208460@linutronix.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-09-28selftests, bpf: test_lwt_ip_encap: Really disable rp_filterJiri Benc1-5/+8
It's not enough to set net.ipv4.conf.all.rp_filter=0, that does not override a greater rp_filter value on the individual interfaces. We also need to set net.ipv4.conf.default.rp_filter=0 before creating the interfaces. That way, they'll also get their own rp_filter value of zero. Fixes: 0fde56e4385b0 ("selftests: bpf: add test_lwt_ip_encap selftest") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/b1cdd9d469f09ea6e01e9c89a6071c79b7380f89.1632386362.git.jbenc@redhat.com
2021-09-28selftests, bpf: Fix makefile dependencies on libbpfJiri Benc1-1/+2
When building bpf selftest with make -j, I'm randomly getting build failures such as this one: In file included from progs/bpf_flow.c:19: [...]/tools/testing/selftests/bpf/tools/include/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found #include "bpf_helper_defs.h" ^~~~~~~~~~~~~~~~~~~ The file that fails the build varies between runs but it's always in the progs/ subdir. The reason is a missing make dependency on libbpf for the .o files in progs/. There was a dependency before commit 3ac2e20fba07e but that commit removed it to prevent unneeded rebuilds. However, that only works if libbpf has been built already; the 'wildcard' prerequisite does not trigger when there's no bpf_helper_defs.h generated yet. Keep the libbpf as an order-only prerequisite to satisfy both goals. It is always built before the progs/ objects but it does not trigger unnecessary rebuilds by itself. Fixes: 3ac2e20fba07e ("selftests/bpf: BPF object files should depend only on libbpf headers") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/ee84ab66436fba05a197f952af23c98d90eb6243.1632758415.git.jbenc@redhat.com
2021-09-28bpf, test, cgroup: Use sk_{alloc,free} for test casesDaniel Borkmann1-5/+9
BPF test infra has some hacks in place which kzalloc() a socket and perform minimum init via sock_net_set() and sock_init_data(). As a result, the sk's skcd->cgroup is NULL since it didn't go through proper initialization as it would have been the case from sk_alloc(). Rather than re-adding a NULL test in sock_cgroup_ptr() just for this, use sk_{alloc,free}() pair for the test socket. The latter also allows to get rid of the bpf_sk_storage_free() special case. Fixes: 8520e224f547 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode") Fixes: b7a1848e8398 ("bpf: add BPF_PROG_TEST_RUN support for flow dissector") Fixes: 2cb494a36c98 ("bpf: add tests for direct packet access from CGROUP_SKB") Reported-by: syzbot+664b58e9a40fbb2cec71@syzkaller.appspotmail.com Reported-by: syzbot+33f36d0754d4c5c0e102@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: syzbot+664b58e9a40fbb2cec71@syzkaller.appspotmail.com Tested-by: syzbot+33f36d0754d4c5c0e102@syzkaller.appspotmail.com Link: https://lore.kernel.org/bpf/20210927123921.21535-2-daniel@iogearbox.net
2021-09-28bpf, cgroup: Assign cgroup in cgroup_sk_alloc when called from interruptDaniel Borkmann1-5/+12
If cgroup_sk_alloc() is called from interrupt context, then just assign the root cgroup to skcd->cgroup. Prior to commit 8520e224f547 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode") we would just return, and later on in sock_cgroup_ptr(), we were NULL-testing the cgroup in fast-path, and iff indeed NULL returning the root cgroup (v ?: &cgrp_dfl_root.cgrp). Rather than re-adding the NULL-test to the fast-path we can just assign it once from cgroup_sk_alloc() given v1/v2 handling has been simplified. The migration from NULL test with returning &cgrp_dfl_root.cgrp to assigning &cgrp_dfl_root.cgrp directly does /not/ change behavior for callers of sock_cgroup_ptr(). syzkaller was able to trigger a splat in the legacy netrom code base, where the RX handler in nr_rx_frame() calls nr_make_new() which calls sk_alloc() and therefore cgroup_sk_alloc() with in_interrupt() condition. Thus the NULL skcd->cgroup, where it trips over on cgroup_sk_free() side given it expects a non-NULL object. There are a few other candidates aside from netrom which have similar pattern where in their accept-like implementation, they just call to sk_alloc() and thus cgroup_sk_alloc() instead of sk_clone_lock() with the corresponding cgroup_sk_clone() which then inherits the cgroup from the parent socket. None of them are related to core protocols where BPF cgroup programs are running from. However, in future, they should follow to implement a similar inheritance mechanism. Additionally, with a !CONFIG_CGROUP_NET_PRIO and !CONFIG_CGROUP_NET_CLASSID configuration, the same issue was exposed also prior to 8520e224f547 due to commit e876ecc67db8 ("cgroup: memcg: net: do not associate sock with unrelated cgroup") which added the early in_interrupt() return back then. Fixes: 8520e224f547 ("bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode") Fixes: e876ecc67db8 ("cgroup: memcg: net: do not associate sock with unrelated cgroup") Reported-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com Reported-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: syzbot+df709157a4ecaf192b03@syzkaller.appspotmail.com Tested-by: syzbot+533f389d4026d86a2a95@syzkaller.appspotmail.com Acked-by: Tejun Heo <tj@kernel.org> Link: https://lore.kernel.org/bpf/20210927123921.21535-1-daniel@iogearbox.net
2021-09-28libbpf: Fix segfault in static linker for objects without BTFKumar Kartikeya Dwivedi1-1/+7
When a BPF object is compiled without BTF info (without -g), trying to link such objects using bpftool causes a SIGSEGV due to btf__get_nr_types accessing obj->btf which is NULL. Fix this by checking for the NULL pointer, and return error. Reproducer: $ cat a.bpf.c extern int foo(void); int bar(void) { return foo(); } $ cat b.bpf.c int foo(void) { return 0; } $ clang -O2 -target bpf -c a.bpf.c $ clang -O2 -target bpf -c b.bpf.c $ bpftool gen obj out a.bpf.o b.bpf.o Segmentation fault (core dumped) After fix: $ bpftool gen obj out a.bpf.o b.bpf.o libbpf: failed to find BTF info for object 'a.bpf.o' Error: failed to link 'a.bpf.o': Unknown error -22 (-22) Fixes: a46349227cd8 (libbpf: Add linker extern resolution support for functions and global variables) Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210924023725.70228-1-memxor@gmail.com
2021-09-28MAINTAINERS: Add btf headers to BPFDave Marchevsky1-0/+2
BPF folks maintain these and they're not picked up by the current MAINTAINERS entries. Files caught by the added globs: include/linux/btf.h include/linux/btf_ids.h include/uapi/linux/btf.h Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210924193557.3081469-1-davemarchevsky@fb.com
2021-09-28bpf: Exempt CAP_BPF from checks against bpf_jit_limitLorenz Bauer1-1/+1
When introducing CAP_BPF, bpf_jit_charge_modmem() was not changed to treat programs with CAP_BPF as privileged for the purpose of JIT memory allocation. This means that a program without CAP_BPF can block a program with CAP_BPF from loading a program. Fix this by checking bpf_capable() in bpf_jit_charge_modmem(). Fixes: 2c78ee898d8f ("bpf: Implement CAP_BPF") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922111153.19843-1-lmb@cloudflare.com
2021-09-28bpf/tests: Add tail call limit test with external function callJohan Almbladh1-3/+83
This patch adds a tail call limit test where the program also emits a BPF_CALL to an external function prior to the tail call. Mainly testing that JITed programs preserve its internal register state, for example tail call count, across such external calls. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-15-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Fix error in tail call limit testsJohan Almbladh1-10/+27
This patch fixes an error in the tail call limit test that caused the test to fail on for x86-64 JIT. Previously, the register R0 was used to report the total number of tail calls made. However, after a tail call fall-through, the value of the R0 register is undefined. Now, all tail call error path tests instead use context state to store the count. Fixes: 874be05f525e ("bpf, tests: Add tail call test suite") Reported-by: Paul Chaignon <paul@cilium.io> Reported-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/bpf/20210914091842.4186267-14-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add more BPF_END byte order conversion testsJohan Almbladh1-0/+122
This patch adds tests of the high 32 bits of 64-bit BPF_END conversions. It also adds a mirrored set of tests where the source bytes are reversed. The MSB of each byte is now set on the high word instead, possibly affecting sign-extension during conversion in a different way. Mainly for JIT testing. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-13-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Expand branch conversion JIT testJohan Almbladh1-34/+91
This patch expands the branch conversion test introduced by 66e5eb84 ("bpf, tests: Add branch conversion JIT test"). The test now includes a JMP with maximum eBPF offset. This triggers branch conversion for the 64-bit MIPS JIT. Additional variants are also added for cases when the branch is taken or not taken. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-12-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add JMP tests with degenerate conditionalJohan Almbladh1-0/+229
This patch adds a set of tests for JMP and JMP32 operations where the branch decision is know at JIT time. Mainly testing JIT behaviour. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-11-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add JMP tests with small offsetsJohan Almbladh1-0/+71
This patch adds a set of tests for JMP to verify that the JITed jump offset is calculated correctly. We pretend that the verifier has inserted any zero extensions to make the jump-over operations JIT to one instruction each, in order to control the exact JITed jump offset. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-10-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add test case flag for verifier zero-extensionJohan Almbladh1-0/+3
This patch adds a new flag to indicate that the verified did insert zero-extensions, even though the verifier is not being run for any of the tests. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-9-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add exhaustive test of LD_IMM64 immediate magnitudesJohan Almbladh1-0/+63
This patch adds a test for the 64-bit immediate load, a two-instruction operation, to verify correctness for all possible magnitudes of the immediate operand. Mainly intended for JIT testing. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-8-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add staggered JMP and JMP32 testsJohan Almbladh1-0/+829
This patch adds a new type of jump test where the program jumps forwards and backwards with increasing offset. It mainly tests JITs where a relative jump may generate different JITed code depending on the offset size, read MIPS. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-7-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add exhaustive tests of JMP operand magnitudesJohan Almbladh1-0/+779
This patch adds a set of tests for conditional JMP and JMP32 operations to verify correctness for all possible magnitudes of the immediate and register operands. Mainly intended for JIT testing. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-6-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add exhaustive tests of ALU operand magnitudesJohan Almbladh1-0/+772
This patch adds a set of tests for ALU64 and ALU32 arithmetic and bitwise logical operations to verify correctness for all possible magnitudes of the register and immediate operands. Mainly intended for JIT testing. The patch introduces a pattern generator that can be used to drive extensive tests of different kinds of operations. It is parameterized to allow tuning of the operand combinations to test. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-5-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Add exhaustive tests of ALU shift valuesJohan Almbladh1-0/+260
This patch adds a set of tests for ALU64 and ALU32 shift operations to verify correctness for all possible values of the shift value. Mainly intended for JIT testing. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-4-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Reduce memory footprint of test suiteJohan Almbladh1-14/+12
The test suite used to call any fill_helper callbacks to generate eBPF program data for all test cases at once. This caused ballooning memory requirements as more extensive test cases were added. Now the each fill_helper is called before the test is run and the allocated memory released afterwards, before the next test case is processed. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-3-johan.almbladh@anyfinetworks.com
2021-09-28bpf/tests: Allow different number of runs per test caseJohan Almbladh1-0/+4
This patch allows a test cast to specify the number of runs to use. For compatibility with existing test case definitions, the default value 0 is interpreted as MAX_TESTRUNS. A reduced number of runs is useful for complex test programs where 1000 runs may take a very long time. Instead of reducing what is tested, one can instead reduce the number of times the test is run. Signed-off-by: Johan Almbladh <johan.almbladh@anyfinetworks.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210914091842.4186267-2-johan.almbladh@anyfinetworks.com
2021-09-28libbpf: Ignore STT_SECTION symbols in 'maps' sectionToke Høiland-Jørgensen1-2/+3
When parsing legacy map definitions, libbpf would error out when encountering an STT_SECTION symbol. This becomes a problem because some versions of binutils will produce SECTION symbols for every section when processing an ELF file, so BPF files run through 'strip' will end up with such symbols, making libbpf refuse to load them. There's not really any reason why erroring out is strictly necessary, so change libbpf to just ignore SECTION symbols when parsing the ELF. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210927205810.715656-1-toke@redhat.com
2021-09-28Merge branch 'bpf-xsk-rx-batch'Daniel Borkmann10-156/+376
Magnus Karlsson says: ==================== This patch set introduces a batched interface for Rx buffer allocation in AF_XDP buffer pool. Instead of using xsk_buff_alloc(*pool), drivers can now use xsk_buff_alloc_batch(*pool, **xdp_buff_array, max). Instead of returning a pointer to an xdp_buff, it returns the number of xdp_buffs it managed to allocate up to the maximum value of the max parameter in the function call. Pointers to the allocated xdp_buff:s are put in the xdp_buff_array supplied in the call. This could be a SW ring that already exists in the driver or a new structure that the driver has allocated. u32 xsk_buff_alloc_batch(struct xsk_buff_pool *pool, struct xdp_buff **xdp, u32 max); When using this interface, the driver should also use the new interface below to set the relevant fields in the struct xdp_buff. The reason for this is that xsk_buff_alloc_batch() does not fill in the data and data_meta fields for you as is the case with xsk_buff_alloc(). So it is not sufficient to just set data_end (effectively the size) anymore in the driver. The reason for this is performance as explained in detail in the commit message. void xsk_buff_set_size(struct xdp_buff *xdp, u32 size); Patch 6 also optimizes the buffer allocation in the aligned case. In this case, we can skip the reinitialization of most fields in the xdp_buff_xsk struct at allocation time. As the number of elements in the heads array is equal to the number of possible buffers in the umem, we can initialize them once and for all at bind time and then just point to the correct one in the xdp_buff_array that is returned to the driver. No reason to have a stack of free head entries. In the unaligned case, the buffers can reside anywhere in the umem, so this optimization is not possible as we still have to fill in the right information in the xdp_buff every single time one is allocated. I have updated i40e and ice to use this new batched interface. These are the throughput results on my 2.1 GHz Cascade Lake system: Aligned mode: ice: +11% / -9 cycles/pkt i40e: +12% / -9 cycles/pkt Unaligned mode: ice: +1.5% / -1 cycle/pkt i40e: +1% / -1 cycle/pkt For the aligned case, batching provides around 40% of the performance improvement and the aligned optimization the rest, around 60%. Would have expected a ~4% boost for unaligned with this data, but I only get around 1%. Do not know why. Note that memory consumption in aligned mode is also reduced by this patch set. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2021-09-28selftests: xsk: Add frame_headroom testMagnus Karlsson2-11/+44
Add a test for the frame_headroom feature that can be set on the umem. The logic added validates that all offsets in all tests and packets are valid, not just the ones that have a specifically configured frame_headroom. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922075613.12186-14-magnus.karlsson@gmail.com
2021-09-28selftests: xsk: Change interleaving of packets in unaligned modeMagnus Karlsson1-3/+3
Change the interleaving of packets in unaligned mode. With the current buffer addresses in the packet stream, the last buffer in the umem could not be used as a large packet could potentially write over the end of the umem. The kernel correctly threw this buffer address away and refused to use it. This is perfectly fine for all regular packet streams, but the ones used for unaligned mode have every other packet being at some different offset. As we will add checks for correct offsets in the next patch, this needs to be fixed. Just start these page-boundary straddling buffers one page earlier so that the last one is not on the last page of the umem, making all buffers valid. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922075613.12186-13-magnus.karlsson@gmail.com
2021-09-28selftests: xsk: Add single packet testMagnus Karlsson2-0/+14
Add a test where a single packet is sent and received. This might sound like a silly test, but since many of the interfaces in xsk are batched, it is important to be able to validate that we did not break something as fundamental as just receiving single packets, instead of batches of packets at high speed. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20210922075613.12186-12-magnus.karlsson@gmail.com