summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2020-07-31AX.25: Prevent out-of-bounds read in ax25_sendmsg()Peilin Ye1-1/+2
[ Upstream commit 8885bb0621f01a6c82be60a91e5fc0f6e2f71186 ] Checks on `addr_len` and `usax->sax25_ndigis` are insufficient. ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals to 7 or 8. Fix it. It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since `addr_len` is guaranteed to be less than or equal to `sizeof(struct full_sockaddr_ax25)` Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31AX.25: Fix out-of-bounds read in ax25_connect()Peilin Ye1-1/+3
[ Upstream commit 2f2a7ffad5c6cbf3d438e813cfdc88230e185ba6 ] Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient. ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis` equals to 7 or 8. Fix it. This issue has been reported as a KMSAN uninit-value bug, because in such a case, ax25_connect() reaches into the uninitialized portion of the `struct sockaddr_storage` statically allocated in __sys_connect(). It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because `addr_len` is guaranteed to be less than or equal to `sizeof(struct full_sockaddr_ax25)`. Reported-by: syzbot+c82752228ed975b0a623@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=55ef9d629f3b3d7d70b69558015b63b48d01af66 Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-31mac80211: allow rx of mesh eapol frames with default rx keyMarkus Theil1-0/+26
[ Upstream commit 0b467b63870d9c05c81456aa9bfee894ab2db3b6 ] Without this patch, eapol frames cannot be received in mesh mode, when 802.1X should be used. Initially only a MGTK is defined, which is found and set as rx->key, when there are no other keys set. ieee80211_drop_unencrypted would then drop these eapol frames, as they are data frames without encryption and there exists some rx->key. Fix this by differentiating between mesh eapol frames and other data frames with existing rx->key. Allow mesh mesh eapol frames only if they are for our vif address. With this patch in-place, ieee80211_rx_h_mesh_fwding continues after the ieee80211_drop_unencrypted check and notices, that these eapol frames have to be delivered locally, as they should. Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20200625104214.50319-1-markus.theil@tu-ilmenau.de [small code cleanups] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-22tcp: md5: allow changing MD5 keys in all socket statesEric Dumazet1-4/+1
[ Upstream commit 1ca0fafd73c5268e8fc4b997094b8bb2bfe8deea ] This essentially reverts commit 721230326891 ("tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets") Mathieu reported that many vendors BGP implementations can actually switch TCP MD5 on established flows. Quoting Mathieu : Here is a list of a few network vendors along with their behavior with respect to TCP MD5: - Cisco: Allows for password to be changed, but within the hold-down timer (~180 seconds). - Juniper: When password is initially set on active connection it will reset, but after that any subsequent password changes no network resets. - Nokia: No notes on if they flap the tcp connection or not. - Ericsson/RedBack: Allows for 2 password (old/new) to co-exist until both sides are ok with new passwords. - Meta-Switch: Expects the password to be set before a connection is attempted, but no further info on whether they reset the TCP connection on a change. - Avaya: Disable the neighbor, then set password, then re-enable. - Zebos: Would normally allow the change when socket connected. We can revert my prior change because commit 9424e2e7ad93 ("tcp: md5: fix potential overestimation of TCP option space") removed the leak of 4 kernel bytes to the wire that was the main reason for my patch. While doing my investigations, I found a bug when a MD5 key is changed, leading to these commits that stable teams want to consider before backporting this revert : Commit 6a2febec338d ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()") Commit e6ced831ef11 ("tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers") Fixes: 721230326891 "tcp: md5: reject TCP_MD5SIG or TCP_MD5SIG_EXT on established sockets" Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriersEric Dumazet2-7/+13
[ Upstream commit e6ced831ef11a2a06e8d00aad9d4fc05b610bf38 ] My prior fix went a bit too far, according to Herbert and Mathieu. Since we accept that concurrent TCP MD5 lookups might see inconsistent keys, we can use READ_ONCE()/WRITE_ONCE() instead of smp_rmb()/smp_wmb() Clearing all key->key[] is needed to avoid possible KMSAN reports, if key->keylen is increased. Since tcp_md5_do_add() is not fast path, using __GFP_ZERO to clear all struct tcp_md5sig_key is simpler. data_race() was added in linux-5.8 and will prevent KCSAN reports, this can safely be removed in stable backports, if data_race() is not yet backported. v2: use data_race() both in tcp_md5_hash_key() and tcp_md5_do_add() Fixes: 6a2febec338d ("tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Marco Elver <elver@google.com> Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()Eric Dumazet2-1/+7
[ Upstream commit 6a2febec338df7e7699a52d00b2e1207dcf65b28 ] MD5 keys are read with RCU protection, and tcp_md5_do_add() might update in-place a prior key. Normally, typical RCU updates would allocate a new piece of memory. In this case only key->key and key->keylen might be updated, and we do not care if an incoming packet could see the old key, the new one, or some intermediate value, since changing the key on a live flow is known to be problematic anyway. We only want to make sure that in the case key->keylen is changed, cpus in tcp_md5_hash_key() wont try to use uninitialized data, or crash because key->keylen was read twice to feed sg_init_one() and ahash_request_set_crypt() Fixes: 9ea88a153001 ("tcp: md5: check md5 signature without socket lock") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22tcp: make sure listeners don't initialize congestion-control stateChristoph Paasch2-1/+4
[ Upstream commit ce69e563b325f620863830c246a8698ccea52048 ] syzkaller found its way into setsockopt with TCP_CONGESTION "cdg". tcp_cdg_init() does a kcalloc to store the gradients. As sk_clone_lock just copies all the memory, the allocated pointer will be copied as well, if the app called setsockopt(..., TCP_CONGESTION) on the listener. If now the socket will be destroyed before the congestion-control has properly been initialized (through a call to tcp_init_transfer), we will end up freeing memory that does not belong to that particular socket, opening the door to a double-free: [ 11.413102] ================================================================== [ 11.414181] BUG: KASAN: double-free or invalid-free in tcp_cleanup_congestion_control+0x58/0xd0 [ 11.415329] [ 11.415560] CPU: 3 PID: 4884 Comm: syz-executor.5 Not tainted 5.8.0-rc2 #80 [ 11.416544] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 11.418148] Call Trace: [ 11.418534] <IRQ> [ 11.418834] dump_stack+0x7d/0xb0 [ 11.419297] print_address_description.constprop.0+0x1a/0x210 [ 11.422079] kasan_report_invalid_free+0x51/0x80 [ 11.423433] __kasan_slab_free+0x15e/0x170 [ 11.424761] kfree+0x8c/0x230 [ 11.425157] tcp_cleanup_congestion_control+0x58/0xd0 [ 11.425872] tcp_v4_destroy_sock+0x57/0x5a0 [ 11.426493] inet_csk_destroy_sock+0x153/0x2c0 [ 11.427093] tcp_v4_syn_recv_sock+0xb29/0x1100 [ 11.427731] tcp_get_cookie_sock+0xc3/0x4a0 [ 11.429457] cookie_v4_check+0x13d0/0x2500 [ 11.433189] tcp_v4_do_rcv+0x60e/0x780 [ 11.433727] tcp_v4_rcv+0x2869/0x2e10 [ 11.437143] ip_protocol_deliver_rcu+0x23/0x190 [ 11.437810] ip_local_deliver+0x294/0x350 [ 11.439566] __netif_receive_skb_one_core+0x15d/0x1a0 [ 11.441995] process_backlog+0x1b1/0x6b0 [ 11.443148] net_rx_action+0x37e/0xc40 [ 11.445361] __do_softirq+0x18c/0x61a [ 11.445881] asm_call_on_stack+0x12/0x20 [ 11.446409] </IRQ> [ 11.446716] do_softirq_own_stack+0x34/0x40 [ 11.447259] do_softirq.part.0+0x26/0x30 [ 11.447827] __local_bh_enable_ip+0x46/0x50 [ 11.448406] ip_finish_output2+0x60f/0x1bc0 [ 11.450109] __ip_queue_xmit+0x71c/0x1b60 [ 11.451861] __tcp_transmit_skb+0x1727/0x3bb0 [ 11.453789] tcp_rcv_state_process+0x3070/0x4d3a [ 11.456810] tcp_v4_do_rcv+0x2ad/0x780 [ 11.457995] __release_sock+0x14b/0x2c0 [ 11.458529] release_sock+0x4a/0x170 [ 11.459005] __inet_stream_connect+0x467/0xc80 [ 11.461435] inet_stream_connect+0x4e/0xa0 [ 11.462043] __sys_connect+0x204/0x270 [ 11.465515] __x64_sys_connect+0x6a/0xb0 [ 11.466088] do_syscall_64+0x3e/0x70 [ 11.466617] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.467341] RIP: 0033:0x7f56046dc469 [ 11.467844] Code: Bad RIP value. [ 11.468282] RSP: 002b:00007f5604dccdd8 EFLAGS: 00000246 ORIG_RAX: 000000000000002a [ 11.469326] RAX: ffffffffffffffda RBX: 000000000068bf00 RCX: 00007f56046dc469 [ 11.470379] RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000004 [ 11.471311] RBP: 00000000ffffffff R08: 0000000000000000 R09: 0000000000000000 [ 11.472286] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 11.473341] R13: 000000000041427c R14: 00007f5604dcd5c0 R15: 0000000000000003 [ 11.474321] [ 11.474527] Allocated by task 4884: [ 11.475031] save_stack+0x1b/0x40 [ 11.475548] __kasan_kmalloc.constprop.0+0xc2/0xd0 [ 11.476182] tcp_cdg_init+0xf0/0x150 [ 11.476744] tcp_init_congestion_control+0x9b/0x3a0 [ 11.477435] tcp_set_congestion_control+0x270/0x32f [ 11.478088] do_tcp_setsockopt.isra.0+0x521/0x1a00 [ 11.478744] __sys_setsockopt+0xff/0x1e0 [ 11.479259] __x64_sys_setsockopt+0xb5/0x150 [ 11.479895] do_syscall_64+0x3e/0x70 [ 11.480395] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 11.481097] [ 11.481321] Freed by task 4872: [ 11.481783] save_stack+0x1b/0x40 [ 11.482230] __kasan_slab_free+0x12c/0x170 [ 11.482839] kfree+0x8c/0x230 [ 11.483240] tcp_cleanup_congestion_control+0x58/0xd0 [ 11.483948] tcp_v4_destroy_sock+0x57/0x5a0 [ 11.484502] inet_csk_destroy_sock+0x153/0x2c0 [ 11.485144] tcp_close+0x932/0xfe0 [ 11.485642] inet_release+0xc1/0x1c0 [ 11.486131] __sock_release+0xc0/0x270 [ 11.486697] sock_close+0xc/0x10 [ 11.487145] __fput+0x277/0x780 [ 11.487632] task_work_run+0xeb/0x180 [ 11.488118] __prepare_exit_to_usermode+0x15a/0x160 [ 11.488834] do_syscall_64+0x4a/0x70 [ 11.489326] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Wei Wang fixed a part of these CDG-malloc issues with commit c12014440750 ("tcp: memset ca_priv data to 0 properly"). This patch here fixes the listener-scenario: We make sure that listeners setting the congestion-control through setsockopt won't initialize it (thus CDG never allocates on listeners). For those who use AF_UNSPEC to reuse a socket, tcp_disconnect() is changed to cleanup afterwards. (The issue can be reproduced at least down to v4.4.x.) Cc: Wei Wang <weiwan@google.com> Cc: Eric Dumazet <edumazet@google.com> Fixes: 2b0a8c9eee81 ("tcp: add CDG congestion control") Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22genetlink: remove genl_bindSean Tranchetti1-52/+0
[ Upstream commit 1e82a62fec613844da9e558f3493540a5b7a7b67 ] A potential deadlock can occur during registering or unregistering a new generic netlink family between the main nl_table_lock and the cb_lock where each thread wants the lock held by the other, as demonstrated below. 1) Thread 1 is performing a netlink_bind() operation on a socket. As part of this call, it will call netlink_lock_table(), incrementing the nl_table_users count to 1. 2) Thread 2 is registering (or unregistering) a genl_family via the genl_(un)register_family() API. The cb_lock semaphore will be taken for writing. 3) Thread 1 will call genl_bind() as part of the bind operation to handle subscribing to GENL multicast groups at the request of the user. It will attempt to take the cb_lock semaphore for reading, but it will fail and be scheduled away, waiting for Thread 2 to finish the write. 4) Thread 2 will call netlink_table_grab() during the (un)registration call. However, as Thread 1 has incremented nl_table_users, it will not be able to proceed, and both threads will be stuck waiting for the other. genl_bind() is a noop, unless a genl_family implements the mcast_bind() function to handle setting up family-specific multicast operations. Since no one in-tree uses this functionality as Cong pointed out, simply removing the genl_bind() function will remove the possibility for deadlock, as there is no attempt by Thread 1 above to take the cb_lock semaphore. Fixes: c380d9a7afff ("genetlink: pass multicast bind/unbind to families") Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Johannes Berg <johannes.berg@intel.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22llc: make sure applications use ARPHRD_ETHEREric Dumazet1-3/+7
[ Upstream commit a9b1110162357689a34992d5c925852948e5b9fd ] syzbot was to trigger a bug by tricking AF_LLC with non sensible addr->sllc_arphrd It seems clear LLC requires an Ethernet device. Back in commit abf9d537fea2 ("llc: add support for SO_BINDTODEVICE") Octavian Purdila added possibility for application to use a zero value for sllc_arphrd, convert it to ARPHRD_ETHER to not cause regressions on existing applications. BUG: KASAN: use-after-free in __read_once_size include/linux/compiler.h:199 [inline] BUG: KASAN: use-after-free in list_empty include/linux/list.h:268 [inline] BUG: KASAN: use-after-free in waitqueue_active include/linux/wait.h:126 [inline] BUG: KASAN: use-after-free in wq_has_sleeper include/linux/wait.h:160 [inline] BUG: KASAN: use-after-free in skwq_has_sleeper include/net/sock.h:2092 [inline] BUG: KASAN: use-after-free in sock_def_write_space+0x642/0x670 net/core/sock.c:2813 Read of size 8 at addr ffff88801e0b4078 by task ksoftirqd/3/27 CPU: 3 PID: 27 Comm: ksoftirqd/3 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 __asan_report_load8_noabort+0x14/0x20 mm/kasan/generic_report.c:135 __read_once_size include/linux/compiler.h:199 [inline] list_empty include/linux/list.h:268 [inline] waitqueue_active include/linux/wait.h:126 [inline] wq_has_sleeper include/linux/wait.h:160 [inline] skwq_has_sleeper include/net/sock.h:2092 [inline] sock_def_write_space+0x642/0x670 net/core/sock.c:2813 sock_wfree+0x1e1/0x260 net/core/sock.c:1958 skb_release_head_state+0xeb/0x260 net/core/skbuff.c:652 skb_release_all+0x16/0x60 net/core/skbuff.c:663 __kfree_skb net/core/skbuff.c:679 [inline] consume_skb net/core/skbuff.c:838 [inline] consume_skb+0xfb/0x410 net/core/skbuff.c:832 __dev_kfree_skb_any+0xa4/0xd0 net/core/dev.c:2967 dev_kfree_skb_any include/linux/netdevice.h:3650 [inline] e1000_unmap_and_free_tx_resource.isra.0+0x21b/0x3a0 drivers/net/ethernet/intel/e1000/e1000_main.c:1963 e1000_clean_tx_irq drivers/net/ethernet/intel/e1000/e1000_main.c:3854 [inline] e1000_clean+0x4cc/0x1d10 drivers/net/ethernet/intel/e1000/e1000_main.c:3796 napi_poll net/core/dev.c:6532 [inline] net_rx_action+0x508/0x1120 net/core/dev.c:6600 __do_softirq+0x262/0x98c kernel/softirq.c:292 run_ksoftirqd kernel/softirq.c:603 [inline] run_ksoftirqd+0x8e/0x110 kernel/softirq.c:595 smpboot_thread_fn+0x6a3/0xa40 kernel/smpboot.c:165 kthread+0x361/0x430 kernel/kthread.c:255 ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352 Allocated by task 8247: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] __kasan_kmalloc mm/kasan/common.c:513 [inline] __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486 kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:521 slab_post_alloc_hook mm/slab.h:584 [inline] slab_alloc mm/slab.c:3320 [inline] kmem_cache_alloc+0x121/0x710 mm/slab.c:3484 sock_alloc_inode+0x1c/0x1d0 net/socket.c:240 alloc_inode+0x68/0x1e0 fs/inode.c:230 new_inode_pseudo+0x19/0xf0 fs/inode.c:919 sock_alloc+0x41/0x270 net/socket.c:560 __sock_create+0xc2/0x730 net/socket.c:1384 sock_create net/socket.c:1471 [inline] __sys_socket+0x103/0x220 net/socket.c:1513 __do_sys_socket net/socket.c:1522 [inline] __se_sys_socket net/socket.c:1520 [inline] __ia32_sys_socket+0x73/0xb0 net/socket.c:1520 do_syscall_32_irqs_on arch/x86/entry/common.c:337 [inline] do_fast_syscall_32+0x27b/0xe16 arch/x86/entry/common.c:408 entry_SYSENTER_compat+0x70/0x7f arch/x86/entry/entry_64_compat.S:139 Freed by task 17: save_stack+0x23/0x90 mm/kasan/common.c:72 set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:335 [inline] __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483 __cache_free mm/slab.c:3426 [inline] kmem_cache_free+0x86/0x320 mm/slab.c:3694 sock_free_inode+0x20/0x30 net/socket.c:261 i_callback+0x44/0x80 fs/inode.c:219 __rcu_reclaim kernel/rcu/rcu.h:222 [inline] rcu_do_batch kernel/rcu/tree.c:2183 [inline] rcu_core+0x570/0x1540 kernel/rcu/tree.c:2408 rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2417 __do_softirq+0x262/0x98c kernel/softirq.c:292 The buggy address belongs to the object at ffff88801e0b4000 which belongs to the cache sock_inode_cache of size 1152 The buggy address is located 120 bytes inside of 1152-byte region [ffff88801e0b4000, ffff88801e0b4480) The buggy address belongs to the page: page:ffffea0000782d00 refcount:1 mapcount:0 mapping:ffff88807aa59c40 index:0xffff88801e0b4ffd raw: 00fffe0000000200 ffffea00008e6c88 ffffea0000782d48 ffff88807aa59c40 raw: ffff88801e0b4ffd ffff88801e0b4000 0000000100000003 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88801e0b3f00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc ffff88801e0b3f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88801e0b4000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff88801e0b4080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88801e0b4100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb Fixes: abf9d537fea2 ("llc: add support for SO_BINDTODEVICE") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22l2tp: remove skb_dst_set() from l2tp_xmit_skb()Xin Long1-4/+1
[ Upstream commit 27d53323664c549b5bb2dfaaf6f7ad6e0376a64e ] In the tx path of l2tp, l2tp_xmit_skb() calls skb_dst_set() to set skb's dst. However, it will eventually call inet6_csk_xmit() or ip_queue_xmit() where skb's dst will be overwritten by: skb_dst_set_noref(skb, dst); without releasing the old dst in skb. Then it causes dst/dev refcnt leak: unregister_netdevice: waiting for eth0 to become free. Usage count = 1 This can be reproduced by simply running: # modprobe l2tp_eth && modprobe l2tp_ip # sh ./tools/testing/selftests/net/l2tp.sh So before going to inet6_csk_xmit() or ip_queue_xmit(), skb's dst should be dropped. This patch is to fix it by removing skb_dst_set() from l2tp_xmit_skb() and moving skb_dst_drop() into l2tp_xmit_core(). Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: James Chapman <jchapman@katalix.com> Tested-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsgSabrina Dubroca1-0/+3
[ Upstream commit 5eff06902394425c722f0a44d9545909a8800f79 ] IPv4 ping sockets don't set fl4.fl4_icmp_{type,code}, which leads to incomplete IPsec ACQUIRE messages being sent to userspace. Currently, both raw sockets and IPv6 ping sockets set those fields. Expected output of "ip xfrm monitor": acquire proto esp sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 8 code 0 dev ens4 policy src 10.0.2.15/32 dst 8.8.8.8/32 <snip> Currently with ping sockets: acquire proto esp sel src 10.0.2.15/32 dst 8.8.8.8/32 proto icmp type 0 code 0 dev ens4 policy src 10.0.2.15/32 dst 8.8.8.8/32 <snip> The Libreswan test suite found this problem after Fedora changed the value for the sysctl net.ipv4.ping_group_range. Fixes: c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP socket kind") Reported-by: Paul Wouters <pwouters@redhat.com> Tested-by: Paul Wouters <pwouters@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-09netfilter: nf_conntrack_h323: lost .data_len definition for Q.931/ipv6Vasily Averin1-0/+1
Could you please push this patch into stable@? it fixes memory corruption in kernels v3.5 .. v4.10 Lost .data_len definition leads to write beyond end of struct nf_ct_h323_master. Usually it corrupts following struct nf_conn_nat, however if nat is not loaded it corrupts following slab object. In mainline this problem went away in v4.11, after commit 9f0f3ebeda47 ("netfilter: helpers: remove data_len usage for inkernel helpers") however many stable kernels are still affected. Fixes: 1afc56794e03 ("netfilter: nf_ct_helper: implement variable length helper private data") # v3.5 cc: stable@vger.kernel.org Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30SUNRPC: Properly set the @subbuf parameter of xdr_buf_subsegment()Chuck Lever1-0/+4
commit 89a3c9f5b9f0bcaa9aea3e8b2a616fcaea9aad78 upstream. @subbuf is an output parameter of xdr_buf_subsegment(). A survey of call sites shows that @subbuf is always uninitialized before xdr_buf_segment() is invoked by callers. There are some execution paths through xdr_buf_subsegment() that do not set all of the fields in @subbuf, leaving some pointer fields containing garbage addresses. Subsequent processing of that buffer then results in a page fault. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30sunrpc: fixed rollback in rpc_gssd_dummy_populate()Vasily Averin1-0/+1
commit b7ade38165ca0001c5a3bd5314a314abbbfbb1b7 upstream. __rpc_depopulate(gssd_dentry) was lost on error path cc: stable@vger.kernel.org Fixes: commit 4b9a445e3eeb ("sunrpc: create a new dummy pipe for gssd to hold open") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30netfilter: ipset: fix unaligned atomic accessRussell King1-0/+2
[ Upstream commit 715028460082d07a7ec6fcd87b14b46784346a72 ] When using ip_set with counters and comment, traffic causes the kernel to panic on 32-bit ARM: Alignment trap: not handling instruction e1b82f9f at [<bf01b0dc>] Unhandled fault: alignment exception (0x221) at 0xea08133c PC is at ip_set_match_extensions+0xe0/0x224 [ip_set] The problem occurs when we try to update the 64-bit counters - the faulting address above is not 64-bit aligned. The problem occurs due to the way elements are allocated, for example: set->dsize = ip_set_elem_len(set, tb, 0, 0); map = ip_set_alloc(sizeof(*map) + elements * set->dsize); If the element has a requirement for a member to be 64-bit aligned, and set->dsize is not a multiple of 8, but is a multiple of four, then every odd numbered elements will be misaligned - and hitting an atomic64_add() on that element will cause the kernel to panic. ip_set_elem_len() must return a size that is rounded to the maximum alignment of any extension field stored in the element. This change ensures that is the case. Fixes: 95ad1f4a9358 ("netfilter: ipset: Fix extension alignment") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30mld: fix memory leak in ipv6_mc_destroy_dev()Wang Hai1-0/+1
[ Upstream commit ea2fce88d2fd678ed9d45354ff49b73f1d5615dd ] Commit a84d01647989 ("mld: fix memory leak in mld_del_delrec()") fixed the memory leak of MLD, but missing the ipv6_mc_destroy_dev() path, in which mca_sources are leaked after ma_put(). Using ip6_mc_clear_src() to take care of the missing free. BUG: memory leak unreferenced object 0xffff8881113d3180 (size 64): comm "syz-executor071", pid 389, jiffies 4294887985 (age 17.943s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002cbc483c>] kmalloc include/linux/slab.h:555 [inline] [<000000002cbc483c>] kzalloc include/linux/slab.h:669 [inline] [<000000002cbc483c>] ip6_mc_add1_src net/ipv6/mcast.c:2237 [inline] [<000000002cbc483c>] ip6_mc_add_src+0x7f5/0xbb0 net/ipv6/mcast.c:2357 [<0000000058b8b1ff>] ip6_mc_source+0xe0c/0x1530 net/ipv6/mcast.c:449 [<000000000bfc4fb5>] do_ipv6_setsockopt.isra.12+0x1b2c/0x3b30 net/ipv6/ipv6_sockglue.c:754 [<00000000e4e7a722>] ipv6_setsockopt+0xda/0x150 net/ipv6/ipv6_sockglue.c:950 [<0000000029260d9a>] rawv6_setsockopt+0x45/0x100 net/ipv6/raw.c:1081 [<000000005c1b46f9>] __sys_setsockopt+0x131/0x210 net/socket.c:2132 [<000000008491f7db>] __do_sys_setsockopt net/socket.c:2148 [inline] [<000000008491f7db>] __se_sys_setsockopt net/socket.c:2145 [inline] [<000000008491f7db>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2145 [<00000000c7bc11c5>] do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295 [<000000005fb7a3f3>] entry_SYSCALL_64_after_hwframe+0x49/0xb3 Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Acked-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: core: reduce recursion limit valueTaehee Yoo1-1/+1
[ Upstream commit fb7861d14c8d7edac65b2fcb6e8031cb138457b2 ] In the current code, ->ndo_start_xmit() can be executed recursively only 10 times because of stack memory. But, in the case of the vxlan, 10 recursion limit value results in a stack overflow. In the current code, the nested interface is limited by 8 depth. There is no critical reason that the recursion limitation value should be 10. So, it would be good to be the same value with the limitation value of nesting interface depth. Test commands: ip link add vxlan10 type vxlan vni 10 dstport 4789 srcport 4789 4789 ip link set vxlan10 up ip a a 192.168.10.1/24 dev vxlan10 ip n a 192.168.10.2 dev vxlan10 lladdr fc:22:33:44:55:66 nud permanent for i in {9..0} do let A=$i+1 ip link add vxlan$i type vxlan vni $i dstport 4789 srcport 4789 4789 ip link set vxlan$i up ip a a 192.168.$i.1/24 dev vxlan$i ip n a 192.168.$i.2 dev vxlan$i lladdr fc:22:33:44:55:66 nud permanent bridge fdb add fc:22:33:44:55:66 dev vxlan$A dst 192.168.$i.2 self done hping3 192.168.10.2 -2 -d 60000 Splat looks like: [ 103.814237][ T1127] ============================================================================= [ 103.871955][ T1127] BUG kmalloc-2k (Tainted: G B ): Padding overwritten. 0x00000000897a2e4f-0x000 [ 103.873187][ T1127] ----------------------------------------------------------------------------- [ 103.873187][ T1127] [ 103.874252][ T1127] INFO: Slab 0x000000005cccc724 objects=5 used=5 fp=0x0000000000000000 flags=0x10000000001020 [ 103.881323][ T1127] CPU: 3 PID: 1127 Comm: hping3 Tainted: G B 5.7.0+ #575 [ 103.882131][ T1127] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 103.883006][ T1127] Call Trace: [ 103.883324][ T1127] dump_stack+0x96/0xdb [ 103.883716][ T1127] slab_err+0xad/0xd0 [ 103.884106][ T1127] ? _raw_spin_unlock+0x1f/0x30 [ 103.884620][ T1127] ? get_partial_node.isra.78+0x140/0x360 [ 103.885214][ T1127] slab_pad_check.part.53+0xf7/0x160 [ 103.885769][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.886316][ T1127] check_slab+0x97/0xb0 [ 103.886763][ T1127] alloc_debug_processing+0x84/0x1a0 [ 103.887308][ T1127] ___slab_alloc+0x5a5/0x630 [ 103.887765][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.888265][ T1127] ? lock_downgrade+0x730/0x730 [ 103.888762][ T1127] ? pskb_expand_head+0x110/0xe10 [ 103.889244][ T1127] ? __slab_alloc+0x3e/0x80 [ 103.889675][ T1127] __slab_alloc+0x3e/0x80 [ 103.890108][ T1127] __kmalloc_node_track_caller+0xc7/0x420 [ ... ] Fixes: 11a766ce915f ("net: Increase xmit RECURSION_LIMIT to 10.") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: Do not clear the sock TX queue in sk_set_socket()Tariq Toukan1-0/+2
[ Upstream commit 41b14fb8724d5a4b382a63cb4a1a61880347ccb8 ] Clearing the sock TX queue in sk_set_socket() might cause unexpected out-of-order transmit when called from sock_orphan(), as outstanding packets can pick a different TX queue and bypass the ones already queued. This is undesired in general. More specifically, it breaks the in-order scheduling property guarantee for device-offloaded TLS sockets. Remove the call to sk_tx_queue_clear() in sk_set_socket(), and add it explicitly only where needed. Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: Fix the arp error in some casesguodeqing1-1/+1
[ Upstream commit 5eea3a63ff4aba6a26002e657a6d21934b7e2b96 ] ie., $ ifconfig eth0 6.6.6.6 netmask 255.255.255.0 $ ip rule add from 6.6.6.6 table 6666 $ ip route add 9.9.9.9 via 6.6.6.6 $ ping -I 6.6.6.6 9.9.9.9 PING 9.9.9.9 (9.9.9.9) from 6.6.6.6 : 56(84) bytes of data. 3 packets transmitted, 0 received, 100% packet loss, time 2079ms $ arp Address HWtype HWaddress Flags Mask Iface 6.6.6.6 (incomplete) eth0 The arp request address is error, this is because fib_table_lookup in fib_check_nh lookup the destnation 9.9.9.9 nexthop, the scope of the fib result is RT_SCOPE_LINK,the correct scope is RT_SCOPE_HOST. Here I add a check of whether this is RT_TABLE_MAIN to solve this problem. Fixes: 3bfd847203c6 ("net: Use passed in table for nexthop lookups") Signed-off-by: guodeqing <geffrey.guo@huawei.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30sctp: Don't advertise IPv4 addresses if ipv6only is set on the socketMarcelo Ricardo Leitner3-1/+6
[ Upstream commit 471e39df96b9a4c4ba88a2da9e25a126624d7a9c ] If a socket is set ipv6only, it will still send IPv4 addresses in the INIT and INIT_ACK packets. This potentially misleads the peer into using them, which then would cause association termination. The fix is to not add IPv4 addresses to ipv6only sockets. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Tested-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30tcp: grow window for OOO packets only for SACK flowsEric Dumazet1-2/+10
[ Upstream commit 662051215c758ae8545451628816204ed6cd372d ] Back in 2013, we made a change that broke fast retransmit for non SACK flows. Indeed, for these flows, a sender needs to receive three duplicate ACK before starting fast retransmit. Sending ACK with different receive window do not count. Even if enabling SACK is strongly recommended these days, there still are some cases where it has to be disabled. Not increasing the window seems better than having to rely on RTO. After the fix, following packetdrill test gives : // Initialize connection 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8> +0 < . 1:1(0) ack 1 win 514 +0 accept(3, ..., ...) = 4 +0 < . 1:1001(1000) ack 1 win 514 // Quick ack +0 > . 1:1(0) ack 1001 win 264 +0 < . 2001:3001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264 +0 < . 3001:4001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264 +0 < . 4001:5001(1000) ack 1 win 514 // DUPACK : Normally we should not change the window +0 > . 1:1(0) ack 1001 win 264 +0 < . 1001:2001(1000) ack 1 win 514 // Hole is repaired. +0 > . 1:1(0) ack 5001 win 272 Fixes: 4e4f1fc22681 ("tcp: properly increase rcv_ssthresh for ofo packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30ip6_gre: fix use-after-free in ip6gre_tunnel_lookup()Taehee Yoo1-3/+6
[ Upstream commit dafabb6590cb15f300b77c095d50312e2c7c8e0f ] In the datapath, the ip6gre_tunnel_lookup() is used and it internally uses fallback tunnel device pointer, which is fb_tunnel_dev. This pointer variable should be set to NULL when a fb interface is deleted. But there is no routine to set fb_tunnel_dev pointer to NULL. So, this pointer will be still used after interface is deleted and it eventually results in the use-after-free problem. Test commands: ip netns add A ip netns add B ip link add eth0 type veth peer name eth1 ip link set eth0 netns A ip link set eth1 netns B ip netns exec A ip link set lo up ip netns exec A ip link set eth0 up ip netns exec A ip link add ip6gre1 type ip6gre local fc:0::1 \ remote fc:0::2 ip netns exec A ip -6 a a fc:100::1/64 dev ip6gre1 ip netns exec A ip link set ip6gre1 up ip netns exec A ip -6 a a fc:0::1/64 dev eth0 ip netns exec A ip link set ip6gre0 up ip netns exec B ip link set lo up ip netns exec B ip link set eth1 up ip netns exec B ip link add ip6gre1 type ip6gre local fc:0::2 \ remote fc:0::1 ip netns exec B ip -6 a a fc:100::2/64 dev ip6gre1 ip netns exec B ip link set ip6gre1 up ip netns exec B ip -6 a a fc:0::2/64 dev eth1 ip netns exec B ip link set ip6gre0 up ip netns exec A ping fc:100::2 -s 60000 & ip netns del B Splat looks like: [ 73.087285][ C1] BUG: KASAN: use-after-free in ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.088361][ C1] Read of size 4 at addr ffff888040559218 by task ping/1429 [ 73.089317][ C1] [ 73.089638][ C1] CPU: 1 PID: 1429 Comm: ping Not tainted 5.7.0+ #602 [ 73.090531][ C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 73.091725][ C1] Call Trace: [ 73.092160][ C1] <IRQ> [ 73.092556][ C1] dump_stack+0x96/0xdb [ 73.093122][ C1] print_address_description.constprop.6+0x2cc/0x450 [ 73.094016][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.094894][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.095767][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.096619][ C1] kasan_report+0x154/0x190 [ 73.097209][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.097989][ C1] ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre] [ 73.098750][ C1] ? gre_del_protocol+0x60/0x60 [gre] [ 73.099500][ C1] gre_rcv+0x1c5/0x1450 [ip6_gre] [ 73.100199][ C1] ? ip6gre_header+0xf00/0xf00 [ip6_gre] [ 73.100985][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.101830][ C1] ? ip6_input_finish+0x5/0xf0 [ 73.102483][ C1] ip6_protocol_deliver_rcu+0xcbb/0x1510 [ 73.103296][ C1] ip6_input_finish+0x5b/0xf0 [ 73.103920][ C1] ip6_input+0xcd/0x2c0 [ 73.104473][ C1] ? ip6_input_finish+0xf0/0xf0 [ 73.105115][ C1] ? rcu_read_lock_held+0x90/0xa0 [ 73.105783][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 73.106548][ C1] ipv6_rcv+0x1f1/0x300 [ ... ] Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30tcp_cubic: fix spurious HYSTART_DELAY exit upon drop in min RTTNeal Cardwell1-0/+2
[ Upstream commit b344579ca8478598937215f7005d6c7b84d28aee ] Mirja Kuehlewind reported a bug in Linux TCP CUBIC Hystart, where Hystart HYSTART_DELAY mechanism can exit Slow Start spuriously on an ACK when the minimum rtt of a connection goes down. From inspection it is clear from the existing code that this could happen in an example like the following: o The first 8 RTT samples in a round trip are 150ms, resulting in a curr_rtt of 150ms and a delay_min of 150ms. o The 9th RTT sample is 100ms. The curr_rtt does not change after the first 8 samples, so curr_rtt remains 150ms. But delay_min can be lowered at any time, so delay_min falls to 100ms. The code executes the HYSTART_DELAY comparison between curr_rtt of 150ms and delay_min of 100ms, and the curr_rtt is declared far enough above delay_min to force a (spurious) exit of Slow start. The fix here is simple: allow every RTT sample in a round trip to lower the curr_rtt. Fixes: ae27e98a5152 ("[TCP] CUBIC v2.3") Reported-by: Mirja Kuehlewind <mirja.kuehlewind@ericsson.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30ip_tunnel: fix use-after-free in ip_tunnel_lookup()Taehee Yoo1-6/+8
[ Upstream commit ba61539c6ae57f4146284a5cb4f7b7ed8d42bf45 ] In the datapath, the ip_tunnel_lookup() is used and it internally uses fallback tunnel device pointer, which is fb_tunnel_dev. This pointer variable should be set to NULL when a fb interface is deleted. But there is no routine to set fb_tunnel_dev pointer to NULL. So, this pointer will be still used after interface is deleted and it eventually results in the use-after-free problem. Test commands: ip netns add A ip netns add B ip link add eth0 type veth peer name eth1 ip link set eth0 netns A ip link set eth1 netns B ip netns exec A ip link set lo up ip netns exec A ip link set eth0 up ip netns exec A ip link add gre1 type gre local 10.0.0.1 \ remote 10.0.0.2 ip netns exec A ip link set gre1 up ip netns exec A ip a a 10.0.100.1/24 dev gre1 ip netns exec A ip a a 10.0.0.1/24 dev eth0 ip netns exec B ip link set lo up ip netns exec B ip link set eth1 up ip netns exec B ip link add gre1 type gre local 10.0.0.2 \ remote 10.0.0.1 ip netns exec B ip link set gre1 up ip netns exec B ip a a 10.0.100.2/24 dev gre1 ip netns exec B ip a a 10.0.0.2/24 dev eth1 ip netns exec A hping3 10.0.100.2 -2 --flood -d 60000 & ip netns del B Splat looks like: [ 77.793450][ C3] ================================================================== [ 77.794702][ C3] BUG: KASAN: use-after-free in ip_tunnel_lookup+0xcc4/0xf30 [ 77.795573][ C3] Read of size 4 at addr ffff888060bd9c84 by task hping3/2905 [ 77.796398][ C3] [ 77.796664][ C3] CPU: 3 PID: 2905 Comm: hping3 Not tainted 5.8.0-rc1+ #616 [ 77.797474][ C3] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 77.798453][ C3] Call Trace: [ 77.798815][ C3] <IRQ> [ 77.799142][ C3] dump_stack+0x9d/0xdb [ 77.799605][ C3] print_address_description.constprop.7+0x2cc/0x450 [ 77.800365][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.800908][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.801517][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.802145][ C3] kasan_report+0x154/0x190 [ 77.802821][ C3] ? ip_tunnel_lookup+0xcc4/0xf30 [ 77.803503][ C3] ip_tunnel_lookup+0xcc4/0xf30 [ 77.804165][ C3] __ipgre_rcv+0x1ab/0xaa0 [ip_gre] [ 77.804862][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 77.805621][ C3] gre_rcv+0x304/0x1910 [ip_gre] [ 77.806293][ C3] ? lock_acquire+0x1a9/0x870 [ 77.806925][ C3] ? gre_rcv+0xfe/0x354 [gre] [ 77.807559][ C3] ? erspan_xmit+0x2e60/0x2e60 [ip_gre] [ 77.808305][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 77.809032][ C3] ? rcu_read_lock_held+0x90/0xa0 [ 77.809713][ C3] gre_rcv+0x1b8/0x354 [gre] [ ... ] Suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: fix memleak in register_netdevice()Yang Yingliang1-0/+7
[ Upstream commit 814152a89ed52c722ab92e9fbabcac3cb8a39245 ] I got a memleak report when doing some fuzz test: unreferenced object 0xffff888112584000 (size 13599): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 32 bytes): 74 61 70 30 00 00 00 00 00 00 00 00 00 00 00 00 tap0............ 00 ee d9 19 81 88 ff ff 00 00 00 00 00 00 00 00 ................ backtrace: [<000000002f60ba65>] __kmalloc_node+0x309/0x3a0 [<0000000075b211ec>] kvmalloc_node+0x7f/0xc0 [<00000000d3a97396>] alloc_netdev_mqs+0x76/0xfc0 [<00000000609c3655>] __tun_chr_ioctl+0x1456/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff888111845cc0 (size 8): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 8 bytes): 74 61 70 30 00 88 ff ff tap0.... backtrace: [<000000004c159777>] kstrdup+0x35/0x70 [<00000000d8b496ad>] kstrdup_const+0x3d/0x50 [<00000000494e884a>] kvasprintf_const+0xf1/0x180 [<0000000097880a2b>] kobject_set_name_vargs+0x56/0x140 [<000000008fbdfc7b>] dev_set_name+0xab/0xe0 [<000000005b99e3b4>] netdev_register_kobject+0xc0/0x390 [<00000000602704fe>] register_netdevice+0xb61/0x1250 [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 unreferenced object 0xffff88811886d800 (size 512): comm "ip", pid 3048, jiffies 4294911734 (age 343.491s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff c0 66 3d a3 ff ff ff ff .........f=..... backtrace: [<0000000050315800>] device_add+0x61e/0x1950 [<0000000021008dfb>] netdev_register_kobject+0x17e/0x390 [<00000000602704fe>] register_netdevice+0xb61/0x1250 [<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70 [<000000001127ca24>] ksys_ioctl+0xe5/0x130 [<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0 [<00000000e1023498>] do_syscall_64+0x56/0xa0 [<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 If call_netdevice_notifiers() failed, then rollback_registered() calls netdev_unregister_kobject() which holds the kobject. The reference cannot be put because the netdev won't be add to todo list, so it will leads a memleak, we need put the reference to avoid memleak. Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: sched: export __netdev_watchdog_up()Valentin Longchamp1-0/+1
[ Upstream commit 1a3db27ad9a72d033235b9673653962c02e3486e ] Since the quiesce/activate rework, __netdev_watchdog_up() is directly called in the ucc_geth driver. Unfortunately, this function is not available for modules and thus ucc_geth cannot be built as a module anymore. Fix it by exporting __netdev_watchdog_up(). Since the commit introducing the regression was backported to stable branches, this one should ideally be as well. Fixes: 79dde73cf9bc ("net/ethernet/freescale: rework quiesce/activate for ucc_geth") Signed-off-by: Valentin Longchamp <valentin@longchamp.me> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30l2tp: Allow duplicate session creation with UDPRidge Kennedy1-1/+6
commit 0d0d9a388a858e271bb70e71e99e7fe2a6fd6f64 upstream. In the past it was possible to create multiple L2TPv3 sessions with the same session id as long as the sessions belonged to different tunnels. The resulting sessions had issues when used with IP encapsulated tunnels, but worked fine with UDP encapsulated ones. Some applications began to rely on this behaviour to avoid having to negotiate unique session ids. Some time ago a change was made to require session ids to be unique across all tunnels, breaking the applications making use of this "feature". This change relaxes the duplicate session id check to allow duplicates if both of the colliding sessions belong to UDP encapsulated tunnels. Fixes: dbdbc73b4478 ("l2tp: fix duplicate session creation") Signed-off-by: Ridge Kennedy <ridge.kennedy@alliedtelesis.co.nz> Acked-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: Revert "pkt_sched: fq: use proper locking in fq_dump_stats()"Toke Høiland-Jørgensen1-18/+14
This reverts commit 191cf872190de28a92e1bd2b56d8860e37e07443 which is commit 695b4ec0f0a9cf29deabd3ac075911d58b31f42b upstream. That commit should never have been backported since it relies on a change in locking semantics that was introduced in v4.8 and not backported. Because of this, the backported commit to sch_fq leads to lockups because of the double locking. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30net: core: device_rename: Use rwsem instead of a seqcountAhmed S. Darwish1-22/+18
[ Upstream commit 11d6011c2cf29f7c8181ebde6c8bc0c4d83adcd7 ] Sequence counters write paths are critical sections that must never be preempted, and blocking, even for CONFIG_PREEMPTION=n, is not allowed. Commit 5dbe7c178d3f ("net: fix kernel deadlock with interface rename and netdev name retrieval.") handled a deadlock, observed with CONFIG_PREEMPTION=n, where the devnet_rename seqcount read side was infinitely spinning: it got scheduled after the seqcount write side blocked inside its own critical section. To fix that deadlock, among other issues, the commit added a cond_resched() inside the read side section. While this will get the non-preemptible kernel eventually unstuck, the seqcount reader is fully exhausting its slice just spinning -- until TIF_NEED_RESCHED is set. The fix is also still broken: if the seqcount reader belongs to a real-time scheduling policy, it can spin forever and the kernel will livelock. Disabling preemption over the seqcount write side critical section will not work: inside it are a number of GFP_KERNEL allocations and mutex locking through the drivers/base/ :: device_rename() call chain. >From all the above, replace the seqcount with a rwsem. Fixes: 5dbe7c178d3f (net: fix kernel deadlock with interface rename and netdev name retrieval.) Fixes: 30e6c9fa93cf (net: devnet_rename_seq should be a seqcount) Fixes: c91f6df2db49 (sockopt: Change getsockopt() of SO_BINDTODEVICE to return an interface name) Cc: <stable@vger.kernel.org> Reported-by: kbuild test robot <lkp@intel.com> [ v1 missing up_read() on error exit ] Reported-by: Dan Carpenter <dan.carpenter@oracle.com> [ v1 missing up_read() on error exit ] Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30sched/rt, net: Use CONFIG_PREEMPTION.patchThomas Gleixner1-1/+1
[ Upstream commit 2da2b32fd9346009e9acdb68c570ca8d3966aba7 ] CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT. Both PREEMPT and PREEMPT_RT require the same functionality which today depends on CONFIG_PREEMPT. Update the comment to use CONFIG_PREEMPTION. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: David S. Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20191015191821.11479-22-bigeasy@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-30net: sunrpc: Fix off-by-one issues in 'rpc_ntop6'Fedor Tokarev1-2/+2
[ Upstream commit 118917d696dc59fd3e1741012c2f9db2294bed6f ] Fix off-by-one issues in 'rpc_ntop6': - 'snprintf' returns the number of characters which would have been written if enough space had been available, excluding the terminating null byte. Thus, a return value of 'sizeof(scopebuf)' means that the last character was dropped. - 'strcat' adds a terminating null byte to the string, thus if len == buflen, the null byte is written past the end of the buffer. Signed-off-by: Fedor Tokarev <ftokarev@gmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-20sunrpc: clean up properly in gss_mech_unregister()NeilBrown2-9/+15
commit 24c5efe41c29ee3e55bcf5a1c9f61ca8709622e8 upstream. gss_mech_register() calls svcauth_gss_register_pseudoflavor() for each flavour, but gss_mech_unregister() does not call auth_domain_put(). This is unbalanced and makes it impossible to reload the module. Change svcauth_gss_register_pseudoflavor() to return the registered auth_domain, and save it for later release. Cc: stable@vger.kernel.org (v2.6.12+) Link: https://bugzilla.kernel.org/show_bug.cgi?id=206651 Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-20sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.NeilBrown1-2/+4
commit d47a5dc2888fd1b94adf1553068b8dad76cec96c upstream. There is no valid case for supporting duplicate pseudoflavor registrations. Currently the silent acceptance of such registrations is hiding a bug. The rpcsec_gss_krb5 module registers 2 flavours but does not unregister them, so if you load, unload, reload the module, it will happily continue to use the old registration which now has pointers to the memory were the module was originally loaded. This could lead to unexpected results. So disallow duplicate registrations. Link: https://bugzilla.kernel.org/show_bug.cgi?id=206651 Cc: stable@vger.kernel.org (v2.6.12+) Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-20netfilter: nft_nat: return EOPNOTSUPP if type or flags are not supportedPablo Neira Ayuso1-2/+2
[ Upstream commit 0d7c83463fdf7841350f37960a7abadd3e650b41 ] Instead of EINVAL which should be used for malformed netlink messages. Fixes: eb31628e37a0 ("netfilter: nf_tables: Add support for IPv6 NAT") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-20Bluetooth: Add SCO fallback for invalid LMP parameters errorHsin-Yu Chao1-0/+1
[ Upstream commit 56b5453a86203a44726f523b4133c1feca49ce7c ] Bluetooth PTS test case HFP/AG/ACC/BI-12-I accepts SCO connection with invalid parameter at the first SCO request expecting AG to attempt another SCO request with the use of "safe settings" for given codec, base on section 5.7.1.2 of HFP 1.7 specification. This patch addresses it by adding "Invalid LMP Parameters" (0x1e) to the SCO fallback case. Verified with below log: < HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17 Handle: 256 Transmit bandwidth: 8000 Receive bandwidth: 8000 Max latency: 13 Setting: 0x0003 Input Coding: Linear Input Data Format: 1's complement Input Sample Size: 8-bit # of bits padding at MSB: 0 Air Coding Format: Transparent Data Retransmission effort: Optimize for link quality (0x02) Packet type: 0x0380 3-EV3 may not be used 2-EV5 may not be used 3-EV5 may not be used > HCI Event: Command Status (0x0f) plen 4 Setup Synchronous Connection (0x01|0x0028) ncmd 1 Status: Success (0x00) > HCI Event: Number of Completed Packets (0x13) plen 5 Num handles: 1 Handle: 256 Count: 1 > HCI Event: Max Slots Change (0x1b) plen 3 Handle: 256 Max slots: 1 > HCI Event: Synchronous Connect Complete (0x2c) plen 17 Status: Invalid LMP Parameters / Invalid LL Parameters (0x1e) Handle: 0 Address: 00:1B:DC:F2:21:59 (OUI 00-1B-DC) Link type: eSCO (0x02) Transmission interval: 0x00 Retransmission window: 0x02 RX packet length: 0 TX packet length: 0 Air mode: Transparent (0x03) < HCI Command: Setup Synchronous Connection (0x01|0x0028) plen 17 Handle: 256 Transmit bandwidth: 8000 Receive bandwidth: 8000 Max latency: 8 Setting: 0x0003 Input Coding: Linear Input Data Format: 1's complement Input Sample Size: 8-bit # of bits padding at MSB: 0 Air Coding Format: Transparent Data Retransmission effort: Optimize for link quality (0x02) Packet type: 0x03c8 EV3 may be used 2-EV3 may not be used 3-EV3 may not be used 2-EV5 may not be used 3-EV5 may not be used > HCI Event: Command Status (0x0f) plen 4 Setup Synchronous Connection (0x01|0x0028) ncmd 1 Status: Success (0x00) > HCI Event: Max Slots Change (0x1b) plen 3 Handle: 256 Max slots: 5 > HCI Event: Max Slots Change (0x1b) plen 3 Handle: 256 Max slots: 1 > HCI Event: Synchronous Connect Complete (0x2c) plen 17 Status: Success (0x00) Handle: 257 Address: 00:1B:DC:F2:21:59 (OUI 00-1B-DC) Link type: eSCO (0x02) Transmission interval: 0x06 Retransmission window: 0x04 RX packet length: 30 TX packet length: 30 Air mode: Transparent (0x03) Signed-off-by: Hsin-Yu Chao <hychao@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-20ipv6: fix IPV6_ADDRFORM operation logicHangbin Liu1-6/+7
[ Upstream commit 79a1f0ccdbb4ad700590f61b00525b390cb53905 ] Socket option IPV6_ADDRFORM supports UDP/UDPLITE and TCP at present. Previously the checking logic looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol != IPPROTO_TCP) break; After commit b6f6118901d1 ("ipv6: restrict IPV6_ADDRFORM operation"), TCP was blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; break; else break; Then after commit 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") UDP/UDPLITE were blocked as the logic changed to: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; if (sk->sk_protocol == IPPROTO_TCP) do_some_check; if (sk->sk_protocol != IPPROTO_TCP) break; Fix it by using Eric's code and simply remove the break in TCP check, which looks like: if (sk->sk_protocol == IPPROTO_UDP || sk->sk_protocol == IPPROTO_UDPLITE) do_some_check; else if (sk->sk_protocol == IPPROTO_TCP) do_some_check; else break; Fixes: 82c9ae440857 ("ipv6: fix restrict IPV6_ADDRFORM operation") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11l2tp: do not use inet_hash()/inet_unhash()Eric Dumazet2-14/+42
[ Upstream commit 02c71b144c811bcdd865e0a1226d0407d11357e8 ] syzbot recently found a way to crash the kernel [1] Issue here is that inet_hash() & inet_unhash() are currently only meant to be used by TCP & DCCP, since only these protocols provide the needed hashinfo pointer. L2TP uses a single list (instead of a hash table) This old bug became an issue after commit 610236587600 ("bpf: Add new cgroup attach type to enable sock modifications") since after this commit, sk_common_release() can be called while the L2TP socket is still considered 'hashed'. general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 7063 Comm: syz-executor654 Not tainted 5.7.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00 RSP: 0018:ffffc90001777d30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242 RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008 RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1 R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0 R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00 FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: sk_common_release+0xba/0x370 net/core/sock.c:3210 inet_create net/ipv4/af_inet.c:390 [inline] inet_create+0x966/0xe00 net/ipv4/af_inet.c:248 __sock_create+0x3cb/0x730 net/socket.c:1428 sock_create net/socket.c:1479 [inline] __sys_socket+0xef/0x200 net/socket.c:1521 __do_sys_socket net/socket.c:1530 [inline] __se_sys_socket net/socket.c:1528 [inline] __x64_sys_socket+0x6f/0xb0 net/socket.c:1528 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x441e29 Code: e8 fc b3 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffdce184148 EFLAGS: 00000246 ORIG_RAX: 0000000000000029 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000441e29 RDX: 0000000000000073 RSI: 0000000000000002 RDI: 0000000000000002 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000402c30 R14: 0000000000000000 R15: 0000000000000000 Modules linked in: ---[ end trace 23b6578228ce553e ]--- RIP: 0010:inet_unhash+0x11f/0x770 net/ipv4/inet_hashtables.c:600 Code: 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e dd 04 00 00 48 8d 7d 08 44 8b 73 08 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 55 05 00 00 48 8d 7d 14 4c 8b 6d 08 48 b8 00 00 RSP: 0018:ffffc90001777d30 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff88809a6df940 RCX: ffffffff8697c242 RDX: 0000000000000001 RSI: ffffffff8697c251 RDI: 0000000000000008 RBP: 0000000000000000 R08: ffff88809f3ae1c0 R09: fffffbfff1514cc1 R10: ffffffff8a8a6607 R11: fffffbfff1514cc0 R12: ffff88809a6df9b0 R13: 0000000000000007 R14: 0000000000000000 R15: ffffffff873a4d00 FS: 0000000001d2b880(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000006cd090 CR3: 000000009403a000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Andrii Nakryiko <andriin@fb.com> Reported-by: syzbot+3610d489778b57cc8031@syzkaller.appspotmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11l2tp: add sk_family checks to l2tp_validate_socketEric Dumazet1-0/+2
[ Upstream commit d9a81a225277686eb629938986d97629ea102633 ] syzbot was able to trigger a crash after using an ISDN socket and fool l2tp. Fix this by making sure the UDP socket is of the proper family. BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78 Write of size 1 at addr ffff88808ed0c590 by task syz-executor.5/3018 CPU: 0 PID: 3018 Comm: syz-executor.5 Not tainted 5.7.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 print_address_description.constprop.0.cold+0xd3/0x413 mm/kasan/report.c:382 __kasan_report.cold+0x20/0x38 mm/kasan/report.c:511 kasan_report+0x33/0x50 mm/kasan/common.c:625 setup_udp_tunnel_sock+0x465/0x540 net/ipv4/udp_tunnel.c:78 l2tp_tunnel_register+0xb15/0xdd0 net/l2tp/l2tp_core.c:1523 l2tp_nl_cmd_tunnel_create+0x4b2/0xa60 net/l2tp/l2tp_netlink.c:249 genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline] genl_family_rcv_msg net/netlink/genetlink.c:718 [inline] genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735 netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469 genl_rcv+0x24/0x40 net/netlink/genetlink.c:746 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e6/0x810 net/socket.c:2352 ___sys_sendmsg+0x100/0x170 net/socket.c:2406 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 RIP: 0033:0x45ca29 Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007effe76edc78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004fe1c0 RCX: 000000000045ca29 RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000005 RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 000000000000094e R14: 00000000004d5d00 R15: 00007effe76ee6d4 Allocated by task 3018: save_stack+0x1b/0x40 mm/kasan/common.c:49 set_track mm/kasan/common.c:57 [inline] __kasan_kmalloc mm/kasan/common.c:495 [inline] __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:468 __do_kmalloc mm/slab.c:3656 [inline] __kmalloc+0x161/0x7a0 mm/slab.c:3665 kmalloc include/linux/slab.h:560 [inline] sk_prot_alloc+0x223/0x2f0 net/core/sock.c:1612 sk_alloc+0x36/0x1100 net/core/sock.c:1666 data_sock_create drivers/isdn/mISDN/socket.c:600 [inline] mISDN_sock_create+0x272/0x400 drivers/isdn/mISDN/socket.c:796 __sock_create+0x3cb/0x730 net/socket.c:1428 sock_create net/socket.c:1479 [inline] __sys_socket+0xef/0x200 net/socket.c:1521 __do_sys_socket net/socket.c:1530 [inline] __se_sys_socket net/socket.c:1528 [inline] __x64_sys_socket+0x6f/0xb0 net/socket.c:1528 do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x49/0xb3 Freed by task 2484: save_stack+0x1b/0x40 mm/kasan/common.c:49 set_track mm/kasan/common.c:57 [inline] kasan_set_free_info mm/kasan/common.c:317 [inline] __kasan_slab_free+0xf7/0x140 mm/kasan/common.c:456 __cache_free mm/slab.c:3426 [inline] kfree+0x109/0x2b0 mm/slab.c:3757 kvfree+0x42/0x50 mm/util.c:603 __free_fdtable+0x2d/0x70 fs/file.c:31 put_files_struct fs/file.c:420 [inline] put_files_struct+0x248/0x2e0 fs/file.c:413 exit_files+0x7e/0xa0 fs/file.c:445 do_exit+0xb04/0x2dd0 kernel/exit.c:791 do_group_exit+0x125/0x340 kernel/exit.c:894 get_signal+0x47b/0x24e0 kernel/signal.c:2739 do_signal+0x81/0x2240 arch/x86/kernel/signal.c:784 exit_to_usermode_loop+0x26c/0x360 arch/x86/entry/common.c:161 prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline] syscall_return_slowpath arch/x86/entry/common.c:279 [inline] do_syscall_64+0x6b1/0x7d0 arch/x86/entry/common.c:305 entry_SYSCALL_64_after_hwframe+0x49/0xb3 The buggy address belongs to the object at ffff88808ed0c000 which belongs to the cache kmalloc-2k of size 2048 The buggy address is located 1424 bytes inside of 2048-byte region [ffff88808ed0c000, ffff88808ed0c800) The buggy address belongs to the page: page:ffffea00023b4300 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea0002838208 ffffea00015ba288 ffff8880aa000e00 raw: 0000000000000000 ffff88808ed0c000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88808ed0c480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88808ed0c500: 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88808ed0c580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88808ed0c600: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88808ed0c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Fixes: 6b9f34239b00 ("l2tp: fix races in tunnel creation") Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Cc: Guillaume Nault <gnault@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Acked-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11vsock: fix timeout in vsock_accept()Stefano Garzarella1-1/+1
[ Upstream commit 7e0afbdfd13d1e708fe96e31c46c4897101a6a43 ] The accept(2) is an "input" socket interface, so we should use SO_RCVTIMEO instead of SO_SNDTIMEO to set the timeout. So this patch replace sock_sndtimeo() with sock_rcvtimeo() to use the right timeout in the vsock_accept(). Fixes: d021c344051a ("VSOCK: Introduce VM Sockets") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11devinet: fix memleak in inetdev_init()Yang Yingliang1-0/+1
[ Upstream commit 1b49cd71b52403822731dc9f283185d1da355f97 ] When devinet_sysctl_register() failed, the memory allocated in neigh_parms_alloc() should be freed. Fixes: 20e61da7ffcf ("ipv4: fail early when creating netdev named all or default") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-11esp6: fix memleak on error path in esp6_inputZhen Lei1-1/+3
commit 7284fdf39a912322ce97de2d30def3c6068a418c upstream. This ought to be an omission in e6194923237 ("esp: Fix memleaks on error paths."). The memleak on error path in esp6_input is similar to esp_input of esp4. Fixes: e6194923237 ("esp: Fix memleaks on error paths.") Fixes: 3f29770723f ("ipsec: check return value of skb_to_sgvec always") Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03net: rtnl_configure_link: fix dev flags changes arg to __dev_notify_flagsRoopa Prabhu1-1/+1
commit 56a49d7048703f5ffdb84d3a0ee034108fba6850 upstream. This fix addresses https://bugzilla.kernel.org/show_bug.cgi?id=201071 Commit 5025f7f7d506 wrongly relied on __dev_change_flags to notify users of dev flag changes in the case when dev->rtnl_link_state = RTNL_LINK_INITIALIZED. Fix it by indicating flag changes explicitly to __dev_notify_flags. Fixes: 5025f7f7d506 ("rtnetlink: add rtnl_link_state check in rtnl_configure_link") Reported-By: Liam mcbirnie <liam.mcbirnie@boeing.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03mac80211: fix memory leakSudip Mukherjee1-0/+1
commit ea32f065bd3e3e09f0bcb3042f1664caf6b3e233 upstream. On error we jumped to the error label and returned the error code but we missed releasing sinfo. Fixes: 5fe74014172d ("mac80211: avoid excessive stack usage in sta_info") Reviewed-by: Julian Calaby <julian.calaby@gmail.com> Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03netfilter: nf_conntrack_pptp: fix compilation warning with W=1 buildPablo Neira Ayuso1-1/+1
commit 4946ea5c1237036155c3b3a24f049fd5f849f8f6 upstream. >> include/linux/netfilter/nf_conntrack_pptp.h:13:20: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers] extern const char *const pptp_msg_name(u_int16_t msg); ^~~~~~ Reported-by: kbuild test robot <lkp@intel.com> Fixes: 4c559f15efcc ("netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03netfilter: nf_conntrack_pptp: prevent buffer overflows in debug codePablo Neira Ayuso2-32/+37
commit 4c559f15efcc43b996f4da528cd7f9483aaca36d upstream. Dan Carpenter says: "Smatch complains that the value for "cmd" comes from the network and can't be trusted." Add pptp_msg_name() helper function that checks for the array boundary. Fixes: f09943fefe6b ("[NETFILTER]: nf_conntrack/nf_nat: add PPTP helper port") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03netfilter: ipset: Fix subcounter update skipPhil Sutter1-1/+1
commit a164b95ad6055c50612795882f35e0efda1f1390 upstream. If IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE is set, user requested to not update counters in sub sets. Therefore IPSET_FLAG_SKIP_COUNTER_UPDATE must be set, not unset. Fixes: 6e01781d1c80e ("netfilter: ipset: set match: add support to match the counters") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03netfilter: nft_reject_bridge: enable reject with bridge vlanMichael Braun1-0/+6
commit e9c284ec4b41c827f4369973d2792992849e4fa5 upstream. Currently, using the bridge reject target with tagged packets results in untagged packets being sent back. Fix this by mirroring the vlan id as well. Fixes: 85f5b3086a04 ("netfilter: bridge: add reject support") Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03ip_vti: receive ipip packet by calling ip_tunnel_rcvXin Long1-1/+22
commit 976eba8ab596bab94b9714cd46d38d5c6a2c660d upstream. In Commit dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel"), it tries to receive IPIP packets in vti by calling xfrm_input(). This case happens when a small packet or frag sent by peer is too small to get compressed. However, xfrm_input() will still get to the IPCOMP path where skb sec_path is set, but never dropped while it should have been done in vti_ipcomp4_protocol.cb_handler(vti_rcv_cb), as it's not an ipcomp4 packet. This will cause that the packet can never pass xfrm4_policy_check() in the upper protocol rcv functions. So this patch is to call ip_tunnel_rcv() to process IPIP packets instead. Fixes: dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03vti4: eliminated some duplicate code.Jeremy Sowden1-38/+22
commit f981c57ffd2d7cf2dd4b6d6f8fcb3965df42f54c upstream. The ipip tunnel introduced in commit dd9ee3444014 ("vti4: Fix a ipip packet processing bug in 'IPCOMP' virtual tunnel") largely duplicated the existing vti_input and vti_recv functions. Refactored to deduplicate the common code. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-03xfrm: fix a NULL-ptr deref in xfrm_local_errorXin Long1-1/+2
commit f6a23d85d078c2ffde79c66ca81d0a1dde451649 upstream. This patch is to fix a crash: [ ] kasan: GPF could be caused by NULL-ptr deref or user memory access [ ] general protection fault: 0000 [#1] SMP KASAN PTI [ ] RIP: 0010:ipv6_local_error+0xac/0x7a0 [ ] Call Trace: [ ] xfrm6_local_error+0x1eb/0x300 [ ] xfrm_local_error+0x95/0x130 [ ] __xfrm6_output+0x65f/0xb50 [ ] xfrm6_output+0x106/0x46f [ ] udp_tunnel6_xmit_skb+0x618/0xbf0 [ip6_udp_tunnel] [ ] vxlan_xmit_one+0xbc6/0x2c60 [vxlan] [ ] vxlan_xmit+0x6a0/0x4276 [vxlan] [ ] dev_hard_start_xmit+0x165/0x820 [ ] __dev_queue_xmit+0x1ff0/0x2b90 [ ] ip_finish_output2+0xd3e/0x1480 [ ] ip_do_fragment+0x182d/0x2210 [ ] ip_output+0x1d0/0x510 [ ] ip_send_skb+0x37/0xa0 [ ] raw_sendmsg+0x1b4c/0x2b80 [ ] sock_sendmsg+0xc0/0x110 This occurred when sending a v4 skb over vxlan6 over ipsec, in which case skb->protocol == htons(ETH_P_IPV6) while skb->sk->sk_family == AF_INET in xfrm_local_error(). Then it will go to xfrm6_local_error() where it tries to get ipv6 info from a ipv4 sk. This issue was actually fixed by Commit 628e341f319f ("xfrm: make local error reporting more robust"), but brought back by Commit 844d48746e4b ("xfrm: choose protocol family by skb protocol"). So to fix it, we should call xfrm6_local_error() only when skb->protocol is htons(ETH_P_IPV6) and skb->sk->sk_family is AF_INET6. Fixes: 844d48746e4b ("xfrm: choose protocol family by skb protocol") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>