summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2021-06-15net: inline function get_net_ns_by_fd if NET_NS is disabledChangbin Du1-7/+1
The function get_net_ns_by_fd() could be inlined when NET_NS is not enabled. Signed-off-by: Changbin Du <changbin.du@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14Bluetooth: SMP: Fix crash when receiving new connection when debug is enabledLuiz Augusto von Dentz1-3/+3
When receiving a new connection pchan->conn won't be initialized so the code cannot use bt_dev_dbg as the pointer to hci_dev won't be accessible. Fixes: 2e1614f7d61e4 ("Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2021-06-14net: qrtr: fix OOB Read in qrtr_endpoint_postPavel Skripkin1-1/+1
Syzbot reported slab-out-of-bounds Read in qrtr_endpoint_post. The problem was in wrong _size_ type: if (len != ALIGN(size, 4) + hdrlen) goto err; If size from qrtr_hdr is 4294967293 (0xfffffffd), the result of ALIGN(size, 4) will be 0. In case of len == hdrlen and size == 4294967293 in header this check won't fail and skb_put_data(skb, data + hdrlen, size); will read out of bound from data, which is hdrlen allocated block. Fixes: 194ccc88297a ("net: qrtr: Support decoding incoming v2 packets") Reported-and-tested-by: syzbot+1917d778024161609247@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14ipv4: Fix device used for dst_alloc with local routesDavid Ahern1-1/+14
Oliver reported a use case where deleting a VRF device can hang waiting for the refcnt to drop to 0. The root cause is that the dst is allocated against the VRF device but cached on the loopback device. The use case (added to the selftests) has an implicit VRF crossing due to the ordering of the FIB rules (lookup local is before the l3mdev rule, but the problem occurs even if the FIB rules are re-ordered with local after l3mdev because the VRF table does not have a default route to terminate the lookup). The end result is is that the FIB lookup returns the loopback device as the nexthop, but the ingress device is in a VRF. The mismatch causes the dst alloc against the VRF device but then cached on the loopback. The fix is to bring the trick used for IPv6 (see ip6_rt_get_dev_rcu): pick the dst alloc device based the fib lookup result but with checks that the result has a nexthop device (e.g., not an unreachable or prohibit entry). Fixes: f5a0aab84b74 ("net: ipv4: dst for local input routes should use l3mdev if relevant") Reported-by: Oliver Herms <oliver.peter.herms@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14ethtool: strset: fix message length calculationJakub Kicinski1-0/+2
Outer nest for ETHTOOL_A_STRSET_STRINGSETS is not accounted for. This may result in ETHTOOL_MSG_STRSET_GET producing a warning like: calculated message payload length (684) not sufficient WARNING: CPU: 0 PID: 30967 at net/ethtool/netlink.c:369 ethnl_default_doit+0x87a/0xa20 and a splat. As usually with such warnings three conditions must be met for the warning to trigger: - there must be no skb size rounding up (e.g. reply_size of 684); - string set must be per-device (so that the header gets populated); - the device name must be at least 12 characters long. all in all with current user space it looks like reading priv flags is the only place this could potentially happen. Or with syzbot :) Reported-by: syzbot+59aa77b92d06cd5a54f2@syzkaller.appspotmail.com Fixes: 71921690f974 ("ethtool: provide string sets with STRSET_GET request") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-14sch_cake: revise docs for RFC 8622 LE PHB supportTyson Moore1-6/+6
Commit b8392808eb3fc28e ("sch_cake: add RFC 8622 LE PHB support to CAKE diffserv handling") added the LE mark to the Bulk tin. Update the comments to reflect the change. Signed-off-by: Tyson Moore <tyson@tyson.me> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-12net: make get_net_ns return error if NET_NS is disabledChangbin Du2-13/+12
There is a panic in socket ioctl cmd SIOCGSKNS when NET_NS is not enabled. The reason is that nsfs tries to access ns->ops but the proc_ns_operations is not implemented in this case. [7.670023] Unable to handle kernel NULL pointer dereference at virtual address 00000010 [7.670268] pgd = 32b54000 [7.670544] [00000010] *pgd=00000000 [7.671861] Internal error: Oops: 5 [#1] SMP ARM [7.672315] Modules linked in: [7.672918] CPU: 0 PID: 1 Comm: systemd Not tainted 5.13.0-rc3-00375-g6799d4f2da49 #16 [7.673309] Hardware name: Generic DT based system [7.673642] PC is at nsfs_evict+0x24/0x30 [7.674486] LR is at clear_inode+0x20/0x9c The same to tun SIOCGSKNS command. To fix this problem, we make get_net_ns() return -EINVAL when NET_NS is disabled. Meanwhile move it to right place net/core/net_namespace.c. Signed-off-by: Changbin Du <changbin.du@gmail.com> Fixes: c62cce2caee5 ("net: add an ioctl to get a socket network namespace") Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Christian Brauner <christian.brauner@ubuntu.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11mptcp: fix soft lookup in subflow_error_report()Paolo Abeni2-36/+48
Maxim reported a soft lookup in subflow_error_report(): watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [swapper/0:0] RIP: 0010:native_queued_spin_lock_slowpath RSP: 0018:ffffa859c0003bc0 EFLAGS: 00000202 RAX: 0000000000000101 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff9195c2772d88 RSI: 0000000000000000 RDI: ffff9195c2772d88 RBP: ffff9195c2772d00 R08: 00000000000067b0 R09: c6e31da9eb1e44f4 R10: ffff9195ef379700 R11: ffff9195edb50710 R12: ffff9195c2772d88 R13: ffff9195f500e3d0 R14: ffff9195ef379700 R15: ffff9195ef379700 FS: 0000000000000000(0000) GS:ffff91961f400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000c000407000 CR3: 0000000002988000 CR4: 00000000000006f0 Call Trace: <IRQ> _raw_spin_lock_bh subflow_error_report mptcp_subflow_data_available __mptcp_move_skbs_from_subflow mptcp_data_ready tcp_data_queue tcp_rcv_established tcp_v4_do_rcv tcp_v4_rcv ip_protocol_deliver_rcu ip_local_deliver_finish __netif_receive_skb_one_core netif_receive_skb rtl8139_poll 8139too __napi_poll net_rx_action __do_softirq __irq_exit_rcu common_interrupt </IRQ> The calling function - mptcp_subflow_data_available() - can be invoked from different contexts: - plain ssk socket lock - ssk socket lock + mptcp_data_lock - ssk socket lock + mptcp_data_lock + msk socket lock. Since subflow_error_report() tries to acquire the mptcp_data_lock, the latter two call chains will cause soft lookup. This change addresses the issue moving the error reporting call to outer functions, where the held locks list is known and the we can acquire only the needed one. Reported-by: Maxim Galaganov <max@internet.ru> Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/199 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11mptcp: do not warn on bad input from the networkPaolo Abeni1-5/+5
warn_bad_map() produces a kernel WARN on bad input coming from the network. Use pr_debug() to avoid spamming the system log. Additionally, when the right bound check fails, warn_bad_map() reports the wrong ssn value, let's fix it. Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/107 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11mptcp: wake-up readers only for in sequence dataPaolo Abeni3-36/+21
Currently we rely on the subflow->data_avail field, which is subject to races: ssk1 skb len = 500 DSS(seq=1, len=1000, off=0) # data_avail == MPTCP_SUBFLOW_DATA_AVAIL ssk2 skb len = 500 DSS(seq = 501, len=1000) # data_avail == MPTCP_SUBFLOW_DATA_AVAIL ssk1 skb len = 500 DSS(seq = 1, len=1000, off =500) # still data_avail == MPTCP_SUBFLOW_DATA_AVAIL, # as the skb is covered by a pre-existing map, # which was in-sequence at reception time. Instead we can explicitly check if some has been received in-sequence, propagating the info from __mptcp_move_skbs_from_subflow(). Additionally add the 'ONCE' annotation to the 'data_avail' memory access, as msk will read it outside the subflow socket lock. Fixes: 648ef4b88673 ("mptcp: Implement MPTCP receive path") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11mptcp: try harder to borrow memory from subflow under pressurePaolo Abeni1-4/+6
If the host is under sever memory pressure, and RX forward memory allocation for the msk fails, we try to borrow the required memory from the ingress subflow. The current attempt is a bit flaky: if skb->truesize is less than SK_MEM_QUANTUM, the ssk will not release any memory, and the next schedule will fail again. Instead, directly move the required amount of pages from the ssk to the msk, if available Fixes: 9c3f94e1681b ("mptcp: add missing memory scheduling in the rx path") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-46/+61
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix a crash when stateful expression with its own gc callback is used in a set definition. 2) Skip IPv6 packets from any link-local address in IPv6 fib expression. Add a selftest for this scenario, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11sch_cake: Fix out of bounds when parsing TCP options and headerMaxim Mikityanskiy1-1/+5
The TCP option parser in cake qdisc (cake_get_tcpopt and cake_tcph_may_drop) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). v2 changes: Added doff validation in cake_get_tcphdr to avoid parsing garbage as TCP header. Although it wasn't strictly an out-of-bounds access (memory was allocated), garbage values could be read where CAKE expected the TCP header if doff was smaller than 5. Cc: Young Xiao <92siuyang@gmail.com> Fixes: 8b7138814f29 ("sch_cake: Add optional ACK filter") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11mptcp: Fix out of bounds when parsing TCP optionsMaxim Mikityanskiy1-0/+2
The TCP option parser in mptcp (mptcp_get_options) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). Cc: Young Xiao <92siuyang@gmail.com> Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11netfilter: synproxy: Fix out of bounds when parsing TCP optionsMaxim Mikityanskiy1-0/+5
The TCP option parser in synproxy (synproxy_parse_options) could read one byte out of bounds. When the length is 1, the execution flow gets into the loop, reads one byte of the opcode, and if the opcode is neither TCPOPT_EOL nor TCPOPT_NOP, it reads one more byte, which exceeds the length of 1. This fix is inspired by commit 9609dad263f8 ("ipv4: tcp_input: fix stack out of bounds when parsing TCP options."). v2 changes: Added an early return when length < 0 to avoid calling skb_header_pointer with negative length. Cc: Young Xiao <92siuyang@gmail.com> Fixes: 48b1de4c110a ("netfilter: add SYNPROXY core/target") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11net/packet: annotate data race in packet_sendmsg()Eric Dumazet1-3/+6
There is a known race in packet_sendmsg(), addressed in commit 32d3182cd2cd ("net/packet: fix race in tpacket_snd()") Now we have data_race(), we can use it to avoid a future KCSAN warning, as syzbot loves stressing af_packet sockets :) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11net: bridge: fix vlan tunnel dst refcnt when egressingNikolay Aleksandrov1-2/+2
The egress tunnel code uses dst_clone() and directly sets the result which is wrong because the entry might have 0 refcnt or be already deleted, causing number of problems. It also triggers the WARN_ON() in dst_hold()[1] when a refcnt couldn't be taken. Fix it by using dst_hold_safe() and checking if a reference was actually taken before setting the dst. [1] dmesg WARN_ON log and following refcnt errors WARNING: CPU: 5 PID: 38 at include/net/dst.h:230 br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge] Modules linked in: 8021q garp mrp bridge stp llc bonding ipv6 virtio_net CPU: 5 PID: 38 Comm: ksoftirqd/5 Kdump: loaded Tainted: G W 5.13.0-rc3+ #360 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:br_handle_egress_vlan_tunnel+0x10b/0x134 [bridge] Code: e8 85 bc 01 e1 45 84 f6 74 90 45 31 f6 85 db 48 c7 c7 a0 02 19 a0 41 0f 94 c6 31 c9 31 d2 44 89 f6 e8 64 bc 01 e1 85 db 75 02 <0f> 0b 31 c9 31 d2 44 89 f6 48 c7 c7 70 02 19 a0 e8 4b bc 01 e1 49 RSP: 0018:ffff8881003d39e8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa01902a0 RBP: ffff8881040c6700 R08: 0000000000000000 R09: 0000000000000001 R10: 2ce93d0054fe0d00 R11: 54fe0d00000e0000 R12: ffff888109515000 R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000401 FS: 0000000000000000(0000) GS:ffff88822bf40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f42ba70f030 CR3: 0000000109926000 CR4: 00000000000006e0 Call Trace: br_handle_vlan+0xbc/0xca [bridge] __br_forward+0x23/0x164 [bridge] deliver_clone+0x41/0x48 [bridge] br_handle_frame_finish+0x36f/0x3aa [bridge] ? skb_dst+0x2e/0x38 [bridge] ? br_handle_ingress_vlan_tunnel+0x3e/0x1c8 [bridge] ? br_handle_frame_finish+0x3aa/0x3aa [bridge] br_handle_frame+0x2c3/0x377 [bridge] ? __skb_pull+0x33/0x51 ? vlan_do_receive+0x4f/0x36a ? br_handle_frame_finish+0x3aa/0x3aa [bridge] __netif_receive_skb_core+0x539/0x7c6 ? __list_del_entry_valid+0x16e/0x1c2 __netif_receive_skb_list_core+0x6d/0xd6 netif_receive_skb_list_internal+0x1d9/0x1fa gro_normal_list+0x22/0x3e dev_gro_receive+0x55b/0x600 ? detach_buf_split+0x58/0x140 napi_gro_receive+0x94/0x12e virtnet_poll+0x15d/0x315 [virtio_net] __napi_poll+0x2c/0x1c9 net_rx_action+0xe6/0x1fb __do_softirq+0x115/0x2d8 run_ksoftirqd+0x18/0x20 smpboot_thread_fn+0x183/0x19c ? smpboot_unregister_percpu_thread+0x66/0x66 kthread+0x10a/0x10f ? kthread_mod_delayed_work+0xb6/0xb6 ret_from_fork+0x22/0x30 ---[ end trace 49f61b07f775fd2b ]--- dst_release: dst:00000000c02d677a refcnt:-1 dst_release underflow Cc: stable@vger.kernel.org Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-11net: bridge: fix vlan tunnel dst null pointer dereferenceNikolay Aleksandrov2-16/+26
This patch fixes a tunnel_dst null pointer dereference due to lockless access in the tunnel egress path. When deleting a vlan tunnel the tunnel_dst pointer is set to NULL without waiting a grace period (i.e. while it's still usable) and packets egressing are dereferencing it without checking. Use READ/WRITE_ONCE to annotate the lockless use of tunnel_id, use RCU for accessing tunnel_dst and make sure it is read only once and checked in the egress path. The dst is already properly RCU protected so we don't need to do anything fancy than to make sure tunnel_id and tunnel_dst are read only once and checked in the egress path. Cc: stable@vger.kernel.org Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths") Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10ping: Check return value of function 'ping_queue_rcv_skb'Zheng Yongjun1-5/+7
Function 'ping_queue_rcv_skb' not always return success, which will also return fail. If not check the wrong return value of it, lead to function `ping_rcv` return success. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10skbuff: fix incorrect msg_zerocopy copy notificationsWillem de Bruijn1-1/+3
msg_zerocopy signals if a send operation required copying with a flag in serr->ee.ee_code. This field can be incorrect as of the below commit, as a result of both structs uarg and serr pointing into the same skb->cb[]. uarg->zerocopy must be read before skb->cb[] is reinitialized to hold serr. Similar to other fields len, hi and lo, use a local variable to temporarily hold the value. This was not a problem before, when the value was passed as a function argument. Fixes: 75518851a2a0 ("skbuff: Push status and refcounts into sock_zerocopy_callback") Reported-by: Talal Ahmad <talalahmad@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10net/sched: act_ct: handle DNAT tuple collisionMarcelo Ricardo Leitner1-8/+13
This this the counterpart of 8aa7b526dc0b ("openvswitch: handle DNAT tuple collision") for act_ct. From that commit changelog: """ With multiple DNAT rules it's possible that after destination translation the resulting tuples collide. ... Netfilter handles this case by allocating a null binding for SNAT at egress by default. Perform the same operation in openvswitch for DNAT if no explicit SNAT is requested by the user and allocate a null binding for SNAT for packets in the "original" direction. """ Fixes: 95219afbb980 ("act_ct: support asymmetric conntrack") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10rtnetlink: Fix regression in bridge VLAN configurationIdo Schimmel1-3/+5
Cited commit started returning errors when notification info is not filled by the bridge driver, resulting in the following regression: # ip link add name br1 type bridge vlan_filtering 1 # bridge vlan add dev br1 vid 555 self pvid untagged RTNETLINK answers: Invalid argument As long as the bridge driver does not fill notification info for the bridge device itself, an empty notification should not be considered as an error. This is explained in commit 59ccaaaa49b5 ("bridge: dont send notification when skb->len == 0 in rtnl_bridge_notify"). Fix by removing the error and add a comment to avoid future bugs. Fixes: a8db57c1d285 ("rtnetlink: Fix missing error code in rtnl_bridge_notify()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10Merge tag 'mac80211-for-net-2021-06-09' of ↵David S. Miller11-48/+95
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes berg says: ==================== A fair number of fixes: * fix more fallout from RTNL locking changes * fixes for some of the bugs found by syzbot * drop multicast fragments in mac80211 to align with the spec and what drivers are doing now * fix NULL-ptr deref in radiotap injection ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-10udp: fix race between close() and udp_abort()Paolo Abeni2-0/+13
Kaustubh reported and diagnosed a panic in udp_lib_lookup(). The root cause is udp_abort() racing with close(). Both racing functions acquire the socket lock, but udp{v6}_destroy_sock() release it before performing destructive actions. We can't easily extend the socket lock scope to avoid the race, instead use the SOCK_DEAD flag to prevent udp_abort from doing any action when the critical race happens. Diagnosed-and-tested-by: Kaustubh Pandey <kapandey@codeaurora.org> Fixes: 5d77dca82839 ("net: diag: support SOCK_DESTROY for UDP sockets") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09inet: annotate data race in inet_send_prepare() and inet_dgram_connect()Eric Dumazet1-2/+2
Both functions are known to be racy when reading inet_num as we do not want to grab locks for the common case the socket has been bound already. The race is resolved in inet_autobind() by reading again inet_num under the socket lock. syzbot reported: BUG: KCSAN: data-race in inet_send_prepare / udp_lib_get_port write to 0xffff88812cba150e of 2 bytes by task 24135 on cpu 0: udp_lib_get_port+0x4b2/0xe20 net/ipv4/udp.c:308 udp_v6_get_port+0x5e/0x70 net/ipv6/udp.c:89 inet_autobind net/ipv4/af_inet.c:183 [inline] inet_send_prepare+0xd0/0x210 net/ipv4/af_inet.c:807 inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffff88812cba150e of 2 bytes by task 24132 on cpu 1: inet_send_prepare+0x21/0x210 net/ipv4/af_inet.c:806 inet6_sendmsg+0x29/0x80 net/ipv6/af_inet6.c:639 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x360/0x4d0 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmmsg+0x315/0x4b0 net/socket.c:2490 __do_sys_sendmmsg net/socket.c:2519 [inline] __se_sys_sendmmsg net/socket.c:2516 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2516 do_syscall_64+0x4a/0x90 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x0000 -> 0x9db4 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 24132 Comm: syz-executor.2 Not tainted 5.13.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09net: ethtool: clear heap allocations for ethtool functionAustin Kim1-5/+5
Several ethtool functions leave heap uncleared (potentially) by drivers. This will leave the unused portion of heap unchanged and might copy the full contents back to userspace. Signed-off-by: Austin Kim <austindh.kim@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-09netfilter: nft_fib_ipv6: skip ipv6 packets from any to link-localFlorian Westphal1-4/+18
The ip6tables rpfilter match has an extra check to skip packets with "::" source address. Extend this to ipv6 fib expression. Else ipv6 duplicate address detection packets will fail rpf route check -- lookup returns -ENETUNREACH. While at it, extend the prerouting check to also cover the ingress hook. Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1543 Fixes: f6d0cbcf09c5 ("netfilter: nf_tables: add fib expression") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-09netfilter: nf_tables: initialize set before expression setupPablo Neira Ayuso1-42/+43
nft_set_elem_expr_alloc() needs an initialized set if expression sets on the NFT_EXPR_GC flag. Move set fields initialization before expression setup. [4512935.019450] ================================================================== [4512935.019456] BUG: KASAN: null-ptr-deref in nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019487] Read of size 8 at addr 0000000000000070 by task nft/23532 [4512935.019494] CPU: 1 PID: 23532 Comm: nft Not tainted 5.12.0-rc4+ #48 [...] [4512935.019502] Call Trace: [4512935.019505] dump_stack+0x89/0xb4 [4512935.019512] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019536] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019560] kasan_report.cold.12+0x5f/0xd8 [4512935.019566] ? nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019590] nft_set_elem_expr_alloc+0x84/0xd0 [nf_tables] [4512935.019615] nf_tables_newset+0xc7f/0x1460 [nf_tables] Reported-by: syzbot+ce96ca2b1d0b37c6422d@syzkaller.appspotmail.com Fixes: 65038428b2c6 ("netfilter: nf_tables: allow to specify stateful expression in set definition") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-09mac80211: drop multicast fragmentsJohannes Berg1-6/+3
These are not permitted by the spec, just drop them. Link: https://lore.kernel.org/r/20210609161305.23def022b750.Ibd6dd3cdce573dae262fcdc47f8ac52b883a9c50@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09mac80211: move interface shutdown out of wiphy lockJohannes Berg3-4/+12
When reconfiguration fails, we shut down everything, but we cannot call cfg80211_shutdown_all_interfaces() with the wiphy mutex held. Since cfg80211 now calls it on resume errors, we only need to do likewise for where we call reconfig (whether directly or indirectly), but not under the wiphy lock. Cc: stable@vger.kernel.org Fixes: 2fe8ef106238 ("cfg80211: change netdev registration/unregistration semantics") Link: https://lore.kernel.org/r/20210608113226.78233c80f548.Iecc104aceb89f0568f50e9670a9cb191a1c8887b@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09cfg80211: shut down interfaces on failed resumeJohannes Berg1-0/+4
If resume fails, we should shut down all interfaces as the hardware is probably dead. This was/is already done now in mac80211, but we need to change that due to locking issues, so move it here and do it without the wiphy lock held. Cc: stable@vger.kernel.org Fixes: 2fe8ef106238 ("cfg80211: change netdev registration/unregistration semantics") Link: https://lore.kernel.org/r/20210608113226.d564ca69de7c.I2e3c3e5d410b72a4f63bade4fb075df041b3d92f@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09cfg80211: fix phy80211 symlink creationJohannes Berg1-8/+5
When I moved around the code here, I neglected that we could still call register_netdev() or similar without the wiphy mutex held, which then calls cfg80211_register_wdev() - that's also done from cfg80211_register_netdevice(), but the phy80211 symlink creation was only there. Now, the symlink isn't needed for a *pure* wdev, but a netdev not registered via cfg80211_register_wdev() should still have the symlink, so move the creation to the right place. Cc: stable@vger.kernel.org Fixes: 2fe8ef106238 ("cfg80211: change netdev registration/unregistration semantics") Link: https://lore.kernel.org/r/20210608113226.a5dc4c1e488c.Ia42fe663cefe47b0883af78c98f284c5555bbe5d@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09mac80211: fix 'reset' debugfs lockingJohannes Berg1-1/+3
cfg80211 now calls suspend/resume with the wiphy lock held, and while there's a problem with that needing to be fixed, we should do the same in debugfs. Cc: stable@vger.kernel.org Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Link: https://lore.kernel.org/r/20210608113226.14020430e449.I78e19db0a55a8295a376e15ac4cf77dbb4c6fb51@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-09net: rds: fix memory leak in rds_recvmsgPavel Skripkin1-1/+1
Syzbot reported memory leak in rds. The problem was in unputted refcount in case of error. int rds_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int msg_flags) { ... if (!rds_next_incoming(rs, &inc)) { ... } After this "if" inc refcount incremented and if (rds_cmsg_recv(inc, msg, rs)) { ret = -EFAULT; goto out; } ... out: return ret; } in case of rds_cmsg_recv() fail the refcount won't be decremented. And it's easy to see from ftrace log, that rds_inc_addref() don't have rds_inc_put() pair in rds_recvmsg() after rds_cmsg_recv() 1) | rds_recvmsg() { 1) 3.721 us | rds_inc_addref(); 1) 3.853 us | rds_message_inc_copy_to_user(); 1) + 10.395 us | rds_cmsg_recv(); 1) + 34.260 us | } Fixes: bdbe6fbc6a2f ("RDS: recv.c") Reported-and-tested-by: syzbot+5134cdf021c4ed5aaa5f@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08Merge tag 'batadv-net-pullrequest-20210608' of ↵David S. Miller1-1/+3
git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - Avoid WARN_ON timing related checks, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08net: appletalk: fix the usage of prepositiongushengxian1-1/+1
The preposition "for" should be changed to preposition "of". Signed-off-by: gushengxian <gushengxian@yulong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08net: ipv4: Remove unneed BUG() functionZheng Yongjun2-2/+2
When 'nla_parse_nested_deprecated' failed, it's no need to BUG() here, return -EINVAL is ok. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08net: ipv4: fix memory leak in netlbl_cipsov4_add_stdNanyong Sun1-0/+1
Reported by syzkaller: BUG: memory leak unreferenced object 0xffff888105df7000 (size 64): comm "syz-executor842", pid 360, jiffies 4294824824 (age 22.546s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000e67ed558>] kmalloc include/linux/slab.h:590 [inline] [<00000000e67ed558>] kzalloc include/linux/slab.h:720 [inline] [<00000000e67ed558>] netlbl_cipsov4_add_std net/netlabel/netlabel_cipso_v4.c:145 [inline] [<00000000e67ed558>] netlbl_cipsov4_add+0x390/0x2340 net/netlabel/netlabel_cipso_v4.c:416 [<0000000006040154>] genl_family_rcv_msg_doit.isra.0+0x20e/0x320 net/netlink/genetlink.c:739 [<00000000204d7a1c>] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] [<00000000204d7a1c>] genl_rcv_msg+0x2bf/0x4f0 net/netlink/genetlink.c:800 [<00000000c0d6a995>] netlink_rcv_skb+0x134/0x3d0 net/netlink/af_netlink.c:2504 [<00000000d78b9d2c>] genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 [<000000009733081b>] netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline] [<000000009733081b>] netlink_unicast+0x4a0/0x6a0 net/netlink/af_netlink.c:1340 [<00000000d5fd43b8>] netlink_sendmsg+0x789/0xc70 net/netlink/af_netlink.c:1929 [<000000000a2d1e40>] sock_sendmsg_nosec net/socket.c:654 [inline] [<000000000a2d1e40>] sock_sendmsg+0x139/0x170 net/socket.c:674 [<00000000321d1969>] ____sys_sendmsg+0x658/0x7d0 net/socket.c:2350 [<00000000964e16bc>] ___sys_sendmsg+0xf8/0x170 net/socket.c:2404 [<000000001615e288>] __sys_sendmsg+0xd3/0x190 net/socket.c:2433 [<000000004ee8b6a5>] do_syscall_64+0x37/0x90 arch/x86/entry/common.c:47 [<00000000171c7cee>] entry_SYSCALL_64_after_hwframe+0x44/0xae The memory of doi_def->map.std pointing is allocated in netlbl_cipsov4_add_std, but no place has freed it. It should be freed in cipso_v4_doi_free which frees the cipso DOI resource. Fixes: 96cb8e3313c7a ("[NetLabel]: CIPSOv4 and Unlabeled packet integration") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Nanyong Sun <sunnanyong@huawei.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-08mac80211: fix deadlock in AP/VLAN handlingJohannes Berg1-7/+12
Syzbot reports that when you have AP_VLAN interfaces that are up and close the AP interface they belong to, we get a deadlock. No surprise - since we dev_close() them with the wiphy mutex held, which goes back into the netdev notifier in cfg80211 and tries to acquire the wiphy mutex there. To fix this, we need to do two things: 1) prevent changing iftype while AP_VLANs are up, we can't easily fix this case since cfg80211 already calls us with the wiphy mutex held, but change_interface() is relatively rare in drivers anyway, so changing iftype isn't used much (and userspace has to fall back to down/change/up anyway) 2) pull the dev_close() loop over VLANs out of the wiphy mutex section in the normal stop case Cc: stable@vger.kernel.org Reported-by: syzbot+452ea4fbbef700ff0a56@syzkaller.appspotmail.com Fixes: a05829a7222e ("cfg80211: avoid holding the RTNL when calling the driver") Link: https://lore.kernel.org/r/20210517160322.9b8f356c0222.I392cb0e2fa5a1a94cf2e637555d702c7e512c1ff@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-06-08neighbour: allow NUD_NOARP entries to be forced GCedDavid Ahern1-0/+1
IFF_POINTOPOINT interfaces use NUD_NOARP entries for IPv6. It's possible to fill up the neighbour table with enough entries that it will overflow for valid connections after that. This behaviour is more prevalent after commit 58956317c8de ("neighbor: Improve garbage collection") is applied, as it prevents removal from entries that are not NUD_FAILED, unless they are more than 5s old. Fixes: 58956317c8de (neighbor: Improve garbage collection) Reported-by: Kasper Dupont <kasperd@gjkwv.06.feb.2021.kasperd.net> Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07revert "net: kcm: fix memory leak in kcm_sendmsg"Pavel Skripkin1-5/+0
In commit c47cc304990a ("net: kcm: fix memory leak in kcm_sendmsg") I misunderstood the root case of the memory leak and came up with completely broken fix. So, simply revert this commit to avoid GPF reported by syzbot. Im so sorry for this situation. Fixes: c47cc304990a ("net: kcm: fix memory leak in kcm_sendmsg") Reported-by: syzbot+65badd5e74ec62cb67dc@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-07ethtool: Fix NULL pointer dereference during module EEPROM dumpIdo Schimmel1-1/+1
When get_module_eeprom_by_page() is not implemented by the driver, NULL pointer dereference can occur [1]. Fix by testing if get_module_eeprom_by_page() is implemented instead of get_module_info(). [1] BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] CPU: 0 PID: 251 Comm: ethtool Not tainted 5.13.0-rc3-custom-00940-g3822d0670c9d #989 Call Trace: eeprom_prepare_data+0x101/0x2d0 ethnl_default_doit+0xc2/0x290 genl_family_rcv_msg_doit+0xdc/0x140 genl_rcv_msg+0xd7/0x1d0 netlink_rcv_skb+0x49/0xf0 genl_rcv+0x1f/0x30 netlink_unicast+0x1f6/0x2c0 netlink_sendmsg+0x1f9/0x400 __sys_sendto+0xe1/0x130 __x64_sys_sendto+0x1b/0x20 do_syscall_64+0x3a/0x70 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: c97a31f66ebc ("ethtool: wire in generic SFP module access") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-05Merge tag 'net-5.13-rc5' of ↵Linus Torvalds33-133/+291
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from bpf, wireless, netfilter and wireguard trees. The bpf vs lockdown+audit fix is the most notable. Things haven't slowed down just yet, both in terms of regressions in current release and largish fixes for older code, but we usually see a slowdown only after -rc5. Current release - regressions: - virtio-net: fix page faults and crashes when XDP is enabled - mlx5e: fix HW timestamping with CQE compression, and make sure they are only allowed to coexist with capable devices - stmmac: - fix kernel panic due to NULL pointer dereference of mdio_bus_data - fix double clk unprepare when no PHY device is connected Current release - new code bugs: - mt76: a few fixes for the recent MT7921 devices and runtime power management Previous releases - regressions: - ice: - track AF_XDP ZC enabled queues in bitmap to fix copy mode Tx - fix allowing VF to request more/less queues via virtchnl - correct supported and advertised autoneg by using PHY capabilities - allow all LLDP packets from PF to Tx - kbuild: quote OBJCOPY var to avoid a pahole call break the build Previous releases - always broken: - bpf, lockdown, audit: fix buggy SELinux lockdown permission checks - mt76: address the recent FragAttack vulnerabilities not covered by generic fixes - ipv6: fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions - Bluetooth: - fix the erroneous flush_work() order, to avoid double free - use correct lock to prevent UAF of hdev object - nfc: fix NULL ptr dereference in llcp_sock_getname() after failed connect - ieee802154: multiple fixes to error checking and return values - igb: fix XDP with PTP enabled - intel: add correct exception tracing for XDP - tls: fix use-after-free when TLS offload device goes down and back up - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service - netfilter: nft_ct: skip expectations for confirmed conntrack - mptcp: fix falling back to TCP in presence of out of order packets early in connection lifetime - wireguard: switch from O(n) to a O(1) algorithm for maintaining peers, fixing stalls and a large memory leak in the process Misc: - devlink: correct VIRTUAL port to not have phys_port attributes - Bluetooth: fix VIRTIO_ID_BT assigned number - net: return the correct errno code ENOBUF -> ENOMEM - wireguard: - peer: allocate in kmem_cache saving 25% on peer memory - do not use -O3" * tag 'net-5.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits) cxgb4: avoid link re-train during TC-MQPRIO configuration sch_htb: fix refcount leak in htb_parent_to_leaf_offload wireguard: allowedips: free empty intermediate nodes when removing single node wireguard: allowedips: allocate nodes in kmem_cache wireguard: allowedips: remove nodes in O(1) wireguard: allowedips: initialize list head in selftest wireguard: peer: allocate in kmem_cache wireguard: use synchronize_net rather than synchronize_rcu wireguard: do not use -O3 wireguard: selftests: make sure rp_filter is disabled on vethc wireguard: selftests: remove old conntrack kconfig value virtchnl: Add missing padding to virtchnl_proto_hdrs ice: Allow all LLDP packets from PF to Tx ice: report supported and advertised autoneg using PHY capabilities ice: handle the VF VSI rebuild failure ice: Fix VFR issues for AVF drivers that expect ATQLEN cleared ice: Fix allowing VF to request more/less queues via virtchnl virtio-net: fix for skb_over_panic inside big mode ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions fib: Return the correct errno code ...
2021-06-05sch_htb: fix refcount leak in htb_parent_to_leaf_offloadYunjian Wang1-4/+4
The commit ae81feb7338c ("sch_htb: fix null pointer dereference on a null new_q") fixes a NULL pointer dereference bug, but it is not correct. Because htb_graft_helper properly handles the case when new_q is NULL, and after the previous patch by skipping this call which creates an inconsistency : dev_queue->qdisc will still point to the old qdisc, but cl->parent->leaf.q will point to the new one (which will be noop_qdisc, because new_q was NULL). The code is based on an assumption that these two pointers are the same, so it can lead to refcount leaks. The correct fix is to add a NULL pointer check to protect qdisc_refcount_inc inside htb_parent_to_leaf_offload. Fixes: ae81feb7338c ("sch_htb: fix null pointer dereference on a null new_q") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Suggested-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04Merge tag 'for-net-2021-06-03' of ↵David S. Miller2-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth bluetooth pull request for net: - Fixes UAF and CVE-2021-3564 - Fix VIRTIO_ID_BT to use an unassigned ID - Fix firmware loading on some Intel Controllers Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04Merge tag 'ieee802154-for-davem-2021-06-03' of ↵David S. Miller3-9/+14
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan Stefan Schmidt says: ==================== An update from ieee802154 for your *net* tree. This time we have fixes for the ieee802154 netlink code, as well as a driver fix. Zhen Lei, Wei Yongjun and Yang Li each had a patch to cleanup some return code handling ensuring we actually get a real error code when things fails. Dan Robertson fixed a potential null dereference in our netlink handling. Andy Shevchenko removed of_match_ptr()usage in the mrf24j40 driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptionsCoco Li1-2/+6
Reported by syzbot: HEAD commit: 90c911ad Merge tag 'fixes' of git://git.kernel.org/pub/scm.. git tree: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master dashboard link: https://syzkaller.appspot.com/bug?extid=123aa35098fd3c000eb7 compiler: Debian clang version 11.0.1-2 ================================================================== BUG: KASAN: slab-out-of-bounds in fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline] BUG: KASAN: slab-out-of-bounds in fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732 Read of size 8 at addr ffff8880145c78f8 by task syz-executor.4/17760 CPU: 0 PID: 17760 Comm: syz-executor.4 Not tainted 5.12.0-rc8-syzkaller #0 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x202/0x31e lib/dump_stack.c:120 print_address_description+0x5f/0x3b0 mm/kasan/report.c:232 __kasan_report mm/kasan/report.c:399 [inline] kasan_report+0x15c/0x200 mm/kasan/report.c:416 fib6_nh_get_excptn_bucket net/ipv6/route.c:1604 [inline] fib6_nh_flush_exceptions+0xbd/0x360 net/ipv6/route.c:1732 fib6_nh_release+0x9a/0x430 net/ipv6/route.c:3536 fib6_info_destroy_rcu+0xcb/0x1c0 net/ipv6/ip6_fib.c:174 rcu_do_batch kernel/rcu/tree.c:2559 [inline] rcu_core+0x8f6/0x1450 kernel/rcu/tree.c:2794 __do_softirq+0x372/0x7a6 kernel/softirq.c:345 invoke_softirq kernel/softirq.c:221 [inline] __irq_exit_rcu+0x22c/0x260 kernel/softirq.c:422 irq_exit_rcu+0x5/0x20 kernel/softirq.c:434 sysvec_apic_timer_interrupt+0x91/0xb0 arch/x86/kernel/apic/apic.c:1100 </IRQ> asm_sysvec_apic_timer_interrupt+0x12/0x20 arch/x86/include/asm/idtentry.h:632 RIP: 0010:lock_acquire+0x1f6/0x720 kernel/locking/lockdep.c:5515 Code: f6 84 24 a1 00 00 00 02 0f 85 8d 02 00 00 f7 c3 00 02 00 00 49 bd 00 00 00 00 00 fc ff df 74 01 fb 48 c7 44 24 40 0e 36 e0 45 <4b> c7 44 3d 00 00 00 00 00 4b c7 44 3d 09 00 00 00 00 43 c7 44 3d RSP: 0018:ffffc90009e06560 EFLAGS: 00000206 RAX: 1ffff920013c0cc0 RBX: 0000000000000246 RCX: dffffc0000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90009e066e0 R08: dffffc0000000000 R09: fffffbfff1f992b1 R10: fffffbfff1f992b1 R11: 0000000000000000 R12: 0000000000000000 R13: dffffc0000000000 R14: 0000000000000000 R15: 1ffff920013c0cb4 rcu_lock_acquire+0x2a/0x30 include/linux/rcupdate.h:267 rcu_read_lock include/linux/rcupdate.h:656 [inline] ext4_get_group_info+0xea/0x340 fs/ext4/ext4.h:3231 ext4_mb_prefetch+0x123/0x5d0 fs/ext4/mballoc.c:2212 ext4_mb_regular_allocator+0x8a5/0x28f0 fs/ext4/mballoc.c:2379 ext4_mb_new_blocks+0xc6e/0x24f0 fs/ext4/mballoc.c:4982 ext4_ext_map_blocks+0x2be3/0x7210 fs/ext4/extents.c:4238 ext4_map_blocks+0xab3/0x1cb0 fs/ext4/inode.c:638 ext4_getblk+0x187/0x6c0 fs/ext4/inode.c:848 ext4_bread+0x2a/0x1c0 fs/ext4/inode.c:900 ext4_append+0x1a4/0x360 fs/ext4/namei.c:67 ext4_init_new_dir+0x337/0xa10 fs/ext4/namei.c:2768 ext4_mkdir+0x4b8/0xc00 fs/ext4/namei.c:2814 vfs_mkdir+0x45b/0x640 fs/namei.c:3819 ovl_do_mkdir fs/overlayfs/overlayfs.h:161 [inline] ovl_mkdir_real+0x53/0x1a0 fs/overlayfs/dir.c:146 ovl_create_real+0x280/0x490 fs/overlayfs/dir.c:193 ovl_workdir_create+0x425/0x600 fs/overlayfs/super.c:788 ovl_make_workdir+0xed/0x1140 fs/overlayfs/super.c:1355 ovl_get_workdir fs/overlayfs/super.c:1492 [inline] ovl_fill_super+0x39ee/0x5370 fs/overlayfs/super.c:2035 mount_nodev+0x52/0xe0 fs/super.c:1413 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x86/0x270 fs/super.c:1497 do_new_mount fs/namespace.c:2903 [inline] path_mount+0x196f/0x2be0 fs/namespace.c:3233 do_mount fs/namespace.c:3246 [inline] __do_sys_mount fs/namespace.c:3454 [inline] __se_sys_mount+0x2f9/0x3b0 fs/namespace.c:3431 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x4665f9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f68f2b87188 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 00000000004665f9 RDX: 00000000200000c0 RSI: 0000000020000000 RDI: 000000000040000a RBP: 00000000004bfbb9 R08: 0000000020000100 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000056bf60 R13: 00007ffe19002dff R14: 00007f68f2b87300 R15: 0000000000022000 Allocated by task 17768: kasan_save_stack mm/kasan/common.c:38 [inline] kasan_set_track mm/kasan/common.c:46 [inline] set_alloc_info mm/kasan/common.c:427 [inline] ____kasan_kmalloc+0xc2/0xf0 mm/kasan/common.c:506 kasan_kmalloc include/linux/kasan.h:233 [inline] __kmalloc+0xb4/0x380 mm/slub.c:4055 kmalloc include/linux/slab.h:559 [inline] kzalloc include/linux/slab.h:684 [inline] fib6_info_alloc+0x2c/0xd0 net/ipv6/ip6_fib.c:154 ip6_route_info_create+0x55d/0x1a10 net/ipv6/route.c:3638 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmsg+0x319/0x400 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Last potentially related work creation: kasan_save_stack+0x27/0x50 mm/kasan/common.c:38 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345 __call_rcu kernel/rcu/tree.c:3039 [inline] call_rcu+0x1b1/0xa30 kernel/rcu/tree.c:3114 fib6_info_release include/net/ip6_fib.h:337 [inline] ip6_route_info_create+0x10c4/0x1a10 net/ipv6/route.c:3718 ip6_route_add+0x22/0x120 net/ipv6/route.c:3728 inet6_rtm_newroute+0x2cd/0x2260 net/ipv6/route.c:5352 rtnetlink_rcv_msg+0xb34/0xe70 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x7de/0x9b0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0xaa6/0xe90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg net/socket.c:674 [inline] ____sys_sendmsg+0x5a2/0x900 net/socket.c:2350 ___sys_sendmsg net/socket.c:2404 [inline] __sys_sendmsg+0x319/0x400 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Second to last potentially related work creation: kasan_save_stack+0x27/0x50 mm/kasan/common.c:38 kasan_record_aux_stack+0xee/0x120 mm/kasan/generic.c:345 insert_work+0x54/0x400 kernel/workqueue.c:1331 __queue_work+0x981/0xcc0 kernel/workqueue.c:1497 queue_work_on+0x111/0x200 kernel/workqueue.c:1524 queue_work include/linux/workqueue.h:507 [inline] call_usermodehelper_exec+0x283/0x470 kernel/umh.c:433 kobject_uevent_env+0x1349/0x1730 lib/kobject_uevent.c:617 kvm_uevent_notify_change+0x309/0x3b0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4809 kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:877 [inline] kvm_put_kvm+0x9c/0xd10 arch/x86/kvm/../../../virt/kvm/kvm_main.c:920 kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3120 __fput+0x352/0x7b0 fs/file_table.c:280 task_work_run+0x146/0x1c0 kernel/task_work.c:140 tracehook_notify_resume include/linux/tracehook.h:189 [inline] exit_to_user_mode_loop kernel/entry/common.c:174 [inline] exit_to_user_mode_prepare+0x10b/0x1e0 kernel/entry/common.c:208 __syscall_exit_to_user_mode_work kernel/entry/common.c:290 [inline] syscall_exit_to_user_mode+0x26/0x70 kernel/entry/common.c:301 entry_SYSCALL_64_after_hwframe+0x44/0xae The buggy address belongs to the object at ffff8880145c7800 which belongs to the cache kmalloc-192 of size 192 The buggy address is located 56 bytes to the right of 192-byte region [ffff8880145c7800, ffff8880145c78c0) The buggy address belongs to the page: page:ffffea00005171c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x145c7 flags: 0xfff00000000200(slab) raw: 00fff00000000200 ffffea00006474c0 0000000200000002 ffff888010c41a00 raw: 0000000000000000 0000000080100010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880145c7780: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ffff8880145c7800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8880145c7880: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff8880145c7900: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880145c7980: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ================================================================== In the ip6_route_info_create function, in the case that the nh pointer is not NULL, the fib6_nh in fib6_info has not been allocated. Therefore, when trying to free fib6_info in this error case using fib6_info_release, the function will call fib6_info_destroy_rcu, which it will access fib6_nh_release(f6i->fib6_nh); However, f6i->fib6_nh doesn't have any refcount yet given the lack of allocation causing the reported memory issue above. Therefore, releasing the empty pointer directly instead would be the solution. Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info") Fixes: 706ec91916462 ("ipv6: Fix nexthop refcnt leak when creating ipv6 route info") Signed-off-by: Coco Li <lixiaoyan@google.com> Cc: David Ahern <dsahern@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04fib: Return the correct errno codeZheng Yongjun1-1/+1
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04net: Return the correct errno codeZheng Yongjun1-1/+1
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-04net/x25: Return the correct errno codeZheng Yongjun1-1/+1
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>