kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-10-17	vxlan: Handle error of rtnl_register_module().	Kuniyuki Iwashima	3	-12/+15
	[ Upstream commit 78b7b991838a4a6baeaad934addc4db2c5917eb8 ] Since introduced, vxlan_vnifilter_init() has been ignoring the returned value of rtnl_register_module(), which could fail silently. Handling the error allows users to view a module as an all-or-nothing thing in terms of the rtnetlink functionality. This prevents syzkaller from reporting spurious errors from its tests, where OOM often occurs and module is automatically loaded. Let's handle the errors by rtnl_register_many(). Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-21	vxlan: Fix regression when dropping packets due to invalid src addresses	Daniel Borkmann	1	-0/+4
	[ Upstream commit 1cd4bc987abb2823836cbb8f887026011ccddc8a ] Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-16	vxlan: Fix regression when dropping packets due to invalid src addresses	Daniel Borkmann	1	-4/+0
	commit 1cd4bc987abb2823836cbb8f887026011ccddc8a upstream. Commit f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") has recently been added to vxlan mainly in the context of source address snooping/learning so that when it is enabled, an entry in the FDB is not being created for an invalid address for the corresponding tunnel endpoint. Before commit f58f45c1e5b9 vxlan was similarly behaving as geneve in that it passed through whichever macs were set in the L2 header. It turns out that this change in behavior breaks setups, for example, Cilium with netkit in L3 mode for Pods as well as tunnel mode has been passing before the change in f58f45c1e5b9 for both vxlan and geneve. After mentioned change it is only passing for geneve as in case of vxlan packets are dropped due to vxlan_set_mac() returning false as source and destination macs are zero which for E/W traffic via tunnel is totally fine. Fix it by only opting into the is_valid_ether_addr() check in vxlan_set_mac() when in fact source address snooping/learning is actually enabled in vxlan. This is done by moving the check into vxlan_snoop(). With this change, the Cilium connectivity test suite passes again for both tunnel flavors. Fixes: f58f45c1e5b9 ("vxlan: drop packets from invalid src-address") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: David Bauer <mail@david-bauer.net> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Martin KaFai Lau <martin.lau@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net> [ Backport note: vxlan snooping/learning not supported in 6.8 or older, so commit is simply a revert. ] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17	vxlan: Pull inner IP header in vxlan_rcv().	Guillaume Nault	1	-1/+18
	[ Upstream commit f7789419137b18e3847d0cc41afd788c3c00663d ] Ensure the inner IP header is part of skb's linear data before reading its ECN bits. Otherwise we might read garbage. One symptom is the system erroneously logging errors like "vxlan: non-ECT from xxx.xxx.xxx.xxx with TOS=xxxx". Similar bugs have been fixed in geneve, ip_tunnel and ip6_tunnel (see commit 1ca1ba465e55 ("geneve: make sure to pull inner header in geneve_rx()") for example). So let's reuse the same code structure for consistency. Maybe we'll can add a common helper in the future. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/1239c8db54efec341dd6455c77e0380f58923a3c.1714495737.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02	vxlan: drop packets from invalid src-address	David Bauer	1	-0/+4
	[ Upstream commit f58f45c1e5b92975e91754f5407250085a6ae7cf ] The VXLAN driver currently does not check if the inner layer2 source-address is valid. In case source-address snooping/learning is enabled, a entry in the FDB for the invalid address is created with the layer3 address of the tunnel endpoint. If the frame happens to have a non-unicast address set, all this non-unicast traffic is subsequently not flooded to the tunnel network but sent to the learnt host in the FDB. To make matters worse, this FDB entry does not expire. Apply the same filtering for packets as it is done for bridges. This not only drops these invalid packets but avoids them from being learnt into the FDB. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Suggested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	neighbour: annotate lockless accesses to n->nud_state	Eric Dumazet	1	-2/+2
	[ Upstream commit b071af523579df7341cabf0f16fc661125e9a13f ] We have many lockless accesses to n->nud_state. Before adding another one in the following patch, add annotations to readers and writers. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 5baa0433a15e ("neighbour: fix data-races around n->output") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-16	drivers: vxlan: vnifilter: free percpu vni stats on error path	Fedor Pchelkin	1	-3/+8
	commit b1c936e9af5dd08636d568736fc6075ed9d1d529 upstream. In case rhashtable_lookup_insert_fast() fails inside vxlan_vni_add(), the allocated percpu vni stats are not freed on the error path. Introduce vxlan_vni_free() which would work as a nice wrapper to free vxlan_vni_node resources properly. Found by Linux Verification Center (linuxtesting.org). Fixes: 4095e0e1328a ("drivers: vxlan: vnifilter: per vni stats") Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03	vxlan: fix GRO with VXLAN-GPE	Jiri Benc	1	-15/+69
	[ Upstream commit b0b672c4d0957e5897685667fc848132b8bd2d71 ] In VXLAN-GPE, there may not be an Ethernet header following the VXLAN header. But in GRO, the vxlan driver calls eth_gro_receive unconditionally, which means the following header is incorrectly parsed as Ethernet. Introduce GPE specific GRO handling. For better performance, do not check for GPE during GRO but rather install a different set of functions at setup time. Fixes: e1e5314de08ba ("vxlan: implement GPE") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03	vxlan: generalize vxlan_parse_gpe_hdr and remove unused args	Jiri Benc	1	-30/+28
	[ Upstream commit 17a0a64448b568442a101de09575f81ffdc45d15 ] The vxlan_parse_gpe_hdr function extracts the next protocol value from the GPE header and marks GPE bits as parsed. In order to be used in the next patch, split the function into protocol extraction and bit marking. The bit marking is meaningful only in vxlan_rcv; move it directly there. Rename the function to vxlan_parse_gpe_proto to reflect what it now does. Remove unused arguments skb and vxflags. Move the function earlier in the file to allow it to be called from more places in the next patch. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: b0b672c4d095 ("vxlan: fix GRO with VXLAN-GPE") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03	vxlan: calculate correct header length for GPE	Jiri Benc	1	-13/+10
	[ Upstream commit 94d166c5318c6edd1e079df8552233443e909c33 ] VXLAN-GPE does not add an extra inner Ethernet header. Take that into account when calculating header length. This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is cached. In the collect_md mode (which is the only mode that VXLAN-GPE supports), there's no magic auto-setting of the tunnel interface MTU. It can't be, since the destination and thus the underlying interface may be different for each packet. So, the administrator is responsible for setting the correct tunnel interface MTU. Apparently, the administrators are capable enough to calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36). They set the tunnel interface MTU to 1464. If you run a TCP stream over such interface, it's then segmented according to the MTU 1464, i.e. producing 1514 bytes frames. Which is okay, this still fits the lower MTU. However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as the header size and thus incorrectly calculates the frame size to be 1528. This leads to ICMP too big message being generated (locally), PMTU of 1450 to be cached and the TCP stream to be resegmented. The fix is to use the correct actual header size, especially for skb_tunnel_check_pmtu calculation. Fixes: e1e5314de08ba ("vxlan: implement GPE") Signed-off-by: Jiri Benc <jbenc@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12	vxlan: Fix memory leaks in error path	Ido Schimmel	1	-6/+13
	[ Upstream commit 06bf62944144a92d83dd14fd1378d2a288259561 ] The memory allocated by vxlan_vnigroup_init() is not freed in the error path, leading to memory leaks [1]. Fix by calling vxlan_vnigroup_uninit() in the error path. The leaks can be reproduced by annotating gro_cells_init() with ALLOW_ERROR_INJECTION() and then running: # echo "100" > /sys/kernel/debug/fail_function/probability # echo "1" > /sys/kernel/debug/fail_function/times # echo "gro_cells_init" > /sys/kernel/debug/fail_function/inject # printf %#x -12 > /sys/kernel/debug/fail_function/gro_cells_init/retval # ip link add name vxlan0 type vxlan dstport 4789 external vnifilter RTNETLINK answers: Cannot allocate memory [1] unreferenced object 0xffff88810db84a00 (size 512): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): f8 d5 76 0e 81 88 ff ff 01 00 00 00 00 00 00 02 ..v............. 03 00 04 00 48 00 00 00 00 00 00 01 04 00 01 00 ....H........... backtrace: [<ffffffff81a3097a>] kmalloc_trace+0x2a/0x60 [<ffffffff82f049fc>] vxlan_vnigroup_init+0x4c/0x160 [<ffffffff82ecd69e>] vxlan_init+0x1ae/0x280 [<ffffffff836858ca>] register_netdevice+0x57a/0x16d0 [<ffffffff82ef67b7>] __vxlan_dev_create+0x7c7/0xa50 [<ffffffff82ef6ce6>] vxlan_newlink+0xd6/0x130 [<ffffffff836d02ab>] __rtnl_newlink+0x112b/0x18a0 [<ffffffff836d0a8c>] rtnl_newlink+0x6c/0xa0 [<ffffffff836c0ddf>] rtnetlink_rcv_msg+0x43f/0xd40 [<ffffffff83908ce0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839066af>] netlink_unicast+0x53f/0x810 [<ffffffff839072d8>] netlink_sendmsg+0x958/0xe70 [<ffffffff835c319f>] ____sys_sendmsg+0x78f/0xa90 [<ffffffff835cd6da>] ___sys_sendmsg+0x13a/0x1e0 [<ffffffff835cd94c>] __sys_sendmsg+0x11c/0x1f0 [<ffffffff8424da78>] do_syscall_64+0x38/0x80 unreferenced object 0xffff88810e76d5f8 (size 192): comm "ip", pid 330, jiffies 4295010045 (age 66.016s) hex dump (first 32 bytes): 04 00 00 00 00 00 00 00 db e1 4f e7 00 00 00 00 ..........O..... 08 d6 76 0e 81 88 ff ff 08 d6 76 0e 81 88 ff ff ..v.......v..... backtrace: [<ffffffff81a3162e>] __kmalloc_node+0x4e/0x90 [<ffffffff81a0e166>] kvmalloc_node+0xa6/0x1f0 [<ffffffff8276e1a3>] bucket_table_alloc.isra.0+0x83/0x460 [<ffffffff8276f18b>] rhashtable_init+0x43b/0x7c0 [<ffffffff82f04a1c>] vxlan_vnigroup_init+0x6c/0x160 [<ffffffff82ecd69e>] vxlan_init+0x1ae/0x280 [<ffffffff836858ca>] register_netdevice+0x57a/0x16d0 [<ffffffff82ef67b7>] __vxlan_dev_create+0x7c7/0xa50 [<ffffffff82ef6ce6>] vxlan_newlink+0xd6/0x130 [<ffffffff836d02ab>] __rtnl_newlink+0x112b/0x18a0 [<ffffffff836d0a8c>] rtnl_newlink+0x6c/0xa0 [<ffffffff836c0ddf>] rtnetlink_rcv_msg+0x43f/0xd40 [<ffffffff83908ce0>] netlink_rcv_skb+0x170/0x440 [<ffffffff839066af>] netlink_unicast+0x53f/0x810 [<ffffffff839072d8>] netlink_sendmsg+0x958/0xe70 [<ffffffff835c319f>] ____sys_sendmsg+0x78f/0xa90 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-01	net: move from strlcpy with unused retval to strscpy	Wolfram Sang	1	-2/+2
	Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220830201457.7984-1-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25	net: gro: skb_gro_header helper function	Richard Gobert	1	-6/+3
	Introduce a simple helper function to replace a common pattern. When accessing the GRO header, we fetch the pointer from frag0, then test its validity and fetch it from the skb when necessary. This leads to the pattern skb_gro_header_fast -> skb_gro_header_hard -> skb_gro_header_slow recurring many times throughout GRO code. This patch replaces these patterns with a single inlined function call, improving code readability. Signed-off-by: Richard Gobert <richardbgobert@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220823071034.GA56142@debian Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-10	vxlan: do not use RT_TOS for IPv6 flowlabel	Matthias May	1	-1/+1
	According to Guillaume Nault RT_TOS should never be used for IPv6. Quote: RT_TOS() is an old macro used to interprete IPv4 TOS as described in the obsolete RFC 1349. It's conceptually wrong to use it even in IPv4 code, although, given the current state of the code, most of the existing calls have no consequence. But using RT_TOS() in IPv6 code is always a bug: IPv6 never had a "TOS" field to be interpreted the RFC 1349 way. There's no historical compatibility to worry about. Fixes: 1400615d64cf ("vxlan: allow setting ipv6 traffic class") Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Matthias May <matthias.may@westermo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26	vxlan: Use ip_tunnel_key flow flags in route lookups	Paul Chaignon	1	-4/+7
	Use the new ip_tunnel_key field with the flow flags in the IPv4 route lookups for the encapsulated packet. This will be used by the bpf_skb_set_tunnel_key helper in a subsequent commit. Signed-off-by: Paul Chaignon <paul@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/1ffc95c3d60182fd5ec0cf6602083f8f68afe98f.1658759380.git.paul@isovalent.com
2022-06-10	net: adopt u64_stats_t in struct pcpu_sw_netstats	Eric Dumazet	1	-4/+4
	As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type") we should use u64_stats_t and related accessors to avoid load/store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-21	net: vxlan: Fix kernel coding style	Alaa Mohamed	1	-7/+6
	The continuation line does not align with the opening bracket and this patch fix it. Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Link: https://lore.kernel.org/r/20220520003614.6073-1-eng.alaamohamedsoliman.am@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09	net: vxlan: Add extack support to vxlan_fdb_delete	Alaa Mohamed	1	-11/+27
	This patch adds extack msg support to vxlan_fdb_delete and vxlan_fdb_parse. extack is used to propagate meaningful error msgs to the user of vxlan fdb netlink api Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09	rtnetlink: add extack support in fdb del handlers	Alaa Mohamed	1	-1/+2
	Add extack support to .ndo_fdb_del in netdevice.h and all related methods. Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-06	net: add netif_inherit_tso_max()	Jakub Kicinski	1	-2/+1
	To make later patches smaller create a helper for inheriting the TSO limitations of a lower device. The TSO in the name is not an accident, subsequent patches will replace GSO with TSO in more names. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-08	vxlan: fix error return code in vxlan_fdb_append	Hongbin Wang	1	-2/+2
	When kmalloc and dst_cache_init failed, should return ENOMEM rather than ENOBUFS. Signed-off-by: Hongbin Wang <wh_bin@126.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-31	vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices	Eric Dumazet	1	-0/+6
	vxlan_vnifilter_dump_dev() assumes it is called only for vxlan devices. Make sure it is the case. BUG: KASAN: slab-out-of-bounds in vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349 Read of size 4 at addr ffff888060d1ce70 by task syz-executor.3/17662 CPU: 0 PID: 17662 Comm: syz-executor.3 Tainted: G W 5.17.0-syzkaller-12888-g77c9387c0c5b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106 print_address_description.constprop.0.cold+0xeb/0x495 mm/kasan/report.c:313 print_report mm/kasan/report.c:429 [inline] kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491 vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349 vxlan_vnifilter_dump+0x3ff/0x650 drivers/net/vxlan/vxlan_vnifilter.c:428 netlink_dump+0x4b5/0xb70 net/netlink/af_netlink.c:2270 __netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2375 netlink_dump_start include/linux/netlink.h:245 [inline] rtnetlink_rcv_msg+0x70c/0xb80 net/core/rtnetlink.c:5953 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2496 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921 sock_sendmsg_nosec net/socket.c:705 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:725 ____sys_sendmsg+0x6e2/0x800 net/socket.c:2413 ___sys_sendmsg+0xf3/0x170 net/socket.c:2467 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2496 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f87b8e89049 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Link: https://lore.kernel.org/r/20220330194643.2706132-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-12	net: add per-cpu storage and net->core_stats	Eric Dumazet	1	-1/+1
	Before adding yet another possibly contended atomic_long_t, it is time to add per-cpu storage for existing ones: dev->tx_dropped, dev->rx_dropped, and dev->rx_nohandler Because many devices do not have to increment such counters, allocate the per-cpu storage on demand, so that dev_get_stats() does not have to spend considerable time folding zero counters. Note that some drivers have abused these counters which were supposed to be only used by core networking stack. v4: should use per_cpu_ptr() in dev_get_stats() (Jakub) v3: added a READ_ONCE() in netdev_core_stats_alloc() (Paolo) v2: add a missing include (reported by kernel test robot <lkp@intel.com>) Change in netdev_core_stats_alloc() (Jakub) Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: jeffreyji <jeffreyji@google.com> Reviewed-by: Brian Vazquez <brianvv@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20220311051420.2608812-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10	drivers: vxlan: fix returnvar.cocci warning	Guo Zhengkui	1	-2/+1
	Fix the following coccicheck warning: drivers/net/vxlan/vxlan_core.c:2995:5-8: Unneeded variable: "ret". Return "0" on line 3004. Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Acked-by: Roopa Prabhu <roopa@nvidia.com> Link: https://lore.kernel.org/r/20220308134321.29862-1-guozhengkui@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08	vxlan_core: delete unnecessary condition	Dan Carpenter	1	-28/+26
	The previous check handled the "if (!nh)" condition so we know "nh" is non-NULL here. Delete the check and pull the code in one tab. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Roopa Prabhu <roopa@nvidia.com> Link: https://lore.kernel.org/r/20220307125735.GC16710@kili Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-07	tun: vxlan: Use netif_rx().	Sebastian Andrzej Siewior	1	-2/+2
	Since commit baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.") the function netif_rx() can be used in preemptible/thread context as well as in interrupt context. Use netif_rx(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	drivers: vxlan: vnifilter: add support for stats dumping	Nikolay Aleksandrov	1	-6/+86
	Add support for VXLAN vni filter entries' stats dumping Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	drivers: vxlan: vnifilter: per vni stats	Nikolay Aleksandrov	3	-9/+102
	Add per-vni statistics for vni filter mode. Counting Rx/Tx bytes/packets/drops/errors at the appropriate places. This patch changes vxlan_vs_find_vni to also return the vxlan_vni_node in cases where the vni belongs to a vni filtering vxlan device Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan: vni filtering support on collect metadata device	Roopa Prabhu	5	-30/+1121
	This patch adds vnifiltering support to collect metadata device. Motivation: You can only use a single vxlan collect metadata device for a given vxlan udp port in the system today. The vxlan collect metadata device terminates all received vxlan packets. As shown in the below diagram, there are use-cases where you need to support multiple such vxlan devices in independent bridge domains. Each vxlan device must terminate the vni's it is configured for. Example usecase: In a service provider network a service provider typically supports multiple bridge domains with overlapping vlans. One bridge domain per customer. Vlans in each bridge domain are mapped to globally unique vxlan ranges assigned to each customer. vnifiltering support in collect metadata devices terminates only configured vnis. This is similar to vlan filtering in bridge driver. The vni filtering capability is provided by a new flag on collect metadata device. In the below pic: - customer1 is mapped to br1 bridge domain - customer2 is mapped to br2 bridge domain - customer1 vlan 10-11 is mapped to vni 1001-1002 - customer2 vlan 10-11 is mapped to vni 2001-2002 - br1 and br2 are vlan filtering bridges - vxlan1 and vxlan2 are collect metadata devices with vnifiltering enabled ┌──────────────────────────────────────────────────────────────────┐ │ switch │ │ │ │ ┌───────────┐ ┌───────────┐ │ │ │ │ │ │ │ │ │ br1 │ │ br2 │ │ │ └┬─────────┬┘ └──┬───────┬┘ │ │ vlans│ │ vlans │ │ │ │ 10,11│ │ 10,11│ │ │ │ │ vlanvnimap: │ vlanvnimap: │ │ │ 10-1001,11-1002 │ 10-2001,11-2002 │ │ │ │ │ │ │ │ ┌──────┴┐ ┌──┴─────────┐ ┌───┴────┐ │ │ │ │ swp1 │ │vxlan1 │ │ swp2 │ ┌┴─────────────┐ │ │ │ │ │ vnifilter:│ │ │ │vxlan2 │ │ │ └───┬───┘ │ 1001,1002│ └───┬────┘ │ vnifilter: │ │ │ │ └────────────┘ │ │ 2001,2002 │ │ │ │ │ └──────────────┘ │ │ │ │ │ └───────┼──────────────────────────────────┼───────────────────────┘ │ │ │ │ ┌─────┴───────┐ │ │ customer1 │ ┌─────┴──────┐ │ host/VM │ │customer2 │ └─────────────┘ │ host/VM │ └────────────┘ With this implementation, vxlan dst metadata device can be associated with range of vnis. struct vxlan_vni_node is introduced to represent a configured vni. We start with vni and its associated remote_ip in this structure. This structure can be extended to bring in other per vni attributes if there are usecases for it. A vni inherits an attribute from the base vxlan device if there is no per vni attributes defined. struct vxlan_dev gets a new rhashtable for vnis called vxlan_vni_group. vxlan_vnifilter.c implements the necessary netlink api, notifications and helper functions to process and manage lifecycle of vxlan_vni_node. This patch also adds new helper functions in vxlan_multicast.c to handle per vni remote_ip multicast groups which are part of vxlan_vni_group. Fix build problems: Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_multicast: Move multicast helpers to a separate file	Roopa Prabhu	4	-124/+142
	subsequent patches will add more helpers. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_core: add helper vxlan_vni_in_use	Roopa Prabhu	1	-18/+28
	more users in follow up patches Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_core: make multicast helper take rip and ifindex explicitly	Roopa Prabhu	1	-16/+21
	This patch changes multicast helpers to take rip and ifindex as input. This is needed in future patches where rip can come from a pervni structure while the ifindex can come from the vxlan device. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_core: move some fdb helpers to non-static	Roopa Prabhu	2	-19/+39
	This patch moves some fdb helpers to non-static for use in later patches. Ideally, all fdb code could move into its own file vxlan_fdb.c. This can be done as a subsequent patch and is out of scope of this series. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_core: move common declarations to private header file	Roopa Prabhu	2	-79/+100
	This patch moves common structures and global declarations to a shared private headerfile vxlan_private.h. Subsequent patches use this header file as a common header file for additional shared declarations. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan_core: fix build warnings in vxlan_xmit_one	Roopa Prabhu	1	-1/+8
	Fix the below build warnings reported by kernel test robot: - initialize vni in vxlan_xmit_one - wrap label in ipv6 enabled checks in vxlan_xmit_one warnings: static drivers/net/vxlan/vxlan_core.c:2437:14: warning: variable 'label' set but not used [-Wunused-but-set-variable] __be32 vni, label; ^ >> drivers/net/vxlan/vxlan_core.c:2483:7: warning: variable 'vni' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized] Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-01	vxlan: move to its own directory	Roopa Prabhu	2	-0/+4841
	vxlan.c has grown too long. This patch moves it to its own directory. subsequent patches add new functionality in new files. Signed-off-by: Roopa Prabhu <roopa@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>