summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-03-18devlink: validate length of param valuesJakub Kicinski1-12/+19
[ Upstream commit 8750939b6ad86abc3f53ec8a9683a1cded4a5654 ] DEVLINK_ATTR_PARAM_VALUE_DATA may have different types so it's not checked by the normal netlink policy. Make sure the attribute length is what we expect. Fixes: e3b7ca18ad7b ("devlink: Add param set command") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: memcg: fix lockdep splat in inet_csk_accept()Eric Dumazet1-7/+7
commit 06669ea346e476a5339033d77ef175566a40efbb upstream. Locking newsk while still holding the listener lock triggered a lockdep splat [1] We can simply move the memcg code after we release the listener lock, as this can also help if multiple threads are sharing a common listener. Also fix a typo while reading socket sk_rmem_alloc. [1] WARNING: possible recursive locking detected 5.6.0-rc3-syzkaller #0 Not tainted -------------------------------------------- syz-executor598/9524 is trying to acquire lock: ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b8b90 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492 but task is already holding lock: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(sk_lock-AF_INET6); lock(sk_lock-AF_INET6); *** DEADLOCK *** May be due to missing lock nesting notation 1 lock held by syz-executor598/9524: #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: lock_sock include/net/sock.h:1541 [inline] #0: ffff88808b5b9590 (sk_lock-AF_INET6){+.+.}, at: inet_csk_accept+0x8d/0xd30 net/ipv4/inet_connection_sock.c:445 stack backtrace: CPU: 0 PID: 9524 Comm: syz-executor598 Not tainted 5.6.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x188/0x20d lib/dump_stack.c:118 print_deadlock_bug kernel/locking/lockdep.c:2370 [inline] check_deadlock kernel/locking/lockdep.c:2411 [inline] validate_chain kernel/locking/lockdep.c:2954 [inline] __lock_acquire.cold+0x114/0x288 kernel/locking/lockdep.c:3954 lock_acquire+0x197/0x420 kernel/locking/lockdep.c:4484 lock_sock_nested+0xc5/0x110 net/core/sock.c:2947 lock_sock include/net/sock.h:1541 [inline] inet_csk_accept+0x69f/0xd30 net/ipv4/inet_connection_sock.c:492 inet_accept+0xe9/0x7c0 net/ipv4/af_inet.c:734 __sys_accept4_file+0x3ac/0x5b0 net/socket.c:1758 __sys_accept4+0x53/0x90 net/socket.c:1809 __do_sys_accept4 net/socket.c:1821 [inline] __se_sys_accept4 net/socket.c:1818 [inline] __x64_sys_accept4+0x93/0xf0 net/socket.c:1818 do_syscall_64+0xf6/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x4445c9 Code: e8 0c 0d 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffc35b37608 EFLAGS: 00000246 ORIG_RAX: 0000000000000120 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004445c9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000306777 R09: 0000000000306777 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00000000004053d0 R14: 0000000000000000 R15: 0000000000000000 Fixes: d752a4986532 ("net: memcg: late association of sock to memcg") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Shakeel Butt <shakeelb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: memcg: late association of sock to memcgShakeel Butt3-15/+24
[ Upstream commit d752a4986532cb6305dfd5290a614cde8072769d ] If a TCP socket is allocated in IRQ context or cloned from unassociated (i.e. not associated to a memcg) in IRQ context then it will remain unassociated for its whole life. Almost half of the TCPs created on the system are created in IRQ context, so, memory used by such sockets will not be accounted by the memcg. This issue is more widespread in cgroup v1 where network memory accounting is opt-in but it can happen in cgroup v2 if the source socket for the cloning was created in root memcg. To fix the issue, just do the association of the sockets at the accept() time in the process context and then force charge the memory buffer already used and reserved by the socket. Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18cgroup: memcg: net: do not associate sock with unrelated cgroupShakeel Butt2-0/+8
[ Upstream commit e876ecc67db80dfdb8e237f71e5b43bb88ae549c ] We are testing network memory accounting in our setup and noticed inconsistent network memory usage and often unrelated cgroups network usage correlates with testing workload. On further inspection, it seems like mem_cgroup_sk_alloc() and cgroup_sk_alloc() are broken in irq context specially for cgroup v1. mem_cgroup_sk_alloc() and cgroup_sk_alloc() can be called in irq context and kind of assumes that this can only happen from sk_clone_lock() and the source sock object has already associated cgroup. However in cgroup v1, where network memory accounting is opt-in, the source sock can be unassociated with any cgroup and the new cloned sock can get associated with unrelated interrupted cgroup. Cgroup v2 can also suffer if the source sock object was created by process in the root cgroup or if sk_alloc() is called in irq context. The fix is to just do nothing in interrupt. WARNING: Please note that about half of the TCP sockets are allocated from the IRQ context, so, memory used by such sockets will not be accouted by the memcg. The stack trace of mem_cgroup_sk_alloc() from IRQ-context: CPU: 70 PID: 12720 Comm: ssh Tainted: 5.6.0-smp-DEV #1 Hardware name: ... Call Trace: <IRQ> dump_stack+0x57/0x75 mem_cgroup_sk_alloc+0xe9/0xf0 sk_clone_lock+0x2a7/0x420 inet_csk_clone_lock+0x1b/0x110 tcp_create_openreq_child+0x23/0x3b0 tcp_v6_syn_recv_sock+0x88/0x730 tcp_check_req+0x429/0x560 tcp_v6_rcv+0x72d/0xa40 ip6_protocol_deliver_rcu+0xc9/0x400 ip6_input+0x44/0xd0 ? ip6_protocol_deliver_rcu+0x400/0x400 ip6_rcv_finish+0x71/0x80 ipv6_rcv+0x5b/0xe0 ? ip6_sublist_rcv+0x2e0/0x2e0 process_backlog+0x108/0x1e0 net_rx_action+0x26b/0x460 __do_softirq+0x104/0x2a6 do_softirq_own_stack+0x2a/0x40 </IRQ> do_softirq.part.19+0x40/0x50 __local_bh_enable_ip+0x51/0x60 ip6_finish_output2+0x23d/0x520 ? ip6table_mangle_hook+0x55/0x160 __ip6_finish_output+0xa1/0x100 ip6_finish_output+0x30/0xd0 ip6_output+0x73/0x120 ? __ip6_finish_output+0x100/0x100 ip6_xmit+0x2e3/0x600 ? ipv6_anycast_cleanup+0x50/0x50 ? inet6_csk_route_socket+0x136/0x1e0 ? skb_free_head+0x1e/0x30 inet6_csk_xmit+0x95/0xf0 __tcp_transmit_skb+0x5b4/0xb20 __tcp_send_ack.part.60+0xa3/0x110 tcp_send_ack+0x1d/0x20 tcp_rcv_state_process+0xe64/0xe80 ? tcp_v6_connect+0x5d1/0x5f0 tcp_v6_do_rcv+0x1b1/0x3f0 ? tcp_v6_do_rcv+0x1b1/0x3f0 __release_sock+0x7f/0xd0 release_sock+0x30/0xa0 __inet_stream_connect+0x1c3/0x3b0 ? prepare_to_wait+0xb0/0xb0 inet_stream_connect+0x3b/0x60 __sys_connect+0x101/0x120 ? __sys_getsockopt+0x11b/0x140 __x64_sys_connect+0x1a/0x20 do_syscall_64+0x51/0x200 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The stack trace of mem_cgroup_sk_alloc() from IRQ-context: Fixes: 2d7580738345 ("mm: memcontrol: consolidate cgroup socket tracking") Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Shakeel Butt <shakeelb@google.com> Reviewed-by: Roman Gushchin <guro@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18bnxt_en: reinitialize IRQs when MTU is modifiedVasundhara Volam1-2/+2
[ Upstream commit a9b952d267e59a3b405e644930f46d252cea7122 ] MTU changes may affect the number of IRQs so we must call bnxt_close_nic()/bnxt_open_nic() with the irq_re_init parameter set to true. The reason is that a larger MTU may require aggregation rings not needed with smaller MTU. We may not be able to allocate the required number of aggregation rings and so we reduce the number of channels which will change the number of IRQs. Without this patch, it may crash eventually in pci_disable_msix() when the IRQs are not properly unwound. Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18sfc: detach from cb_page in efx_copy_channel()Edward Cree1-0/+1
[ Upstream commit 4b1bd9db078f7d5332c8601a2f5bd43cf0458fd4 ] It's a resource, not a parameter, so we can't copy it into the new channel's TX queues, otherwise aliasing will lead to resource- management bugs if the channel is subsequently torn down without being initialised. Before the Fixes:-tagged commit there was a similar bug with tsoh_page, but I'm not sure it's worth doing another fix for such old kernels. Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Suggested-by: Derek Shute <Derek.Shute@stratus.com> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18r8152: check disconnect status after long sleepYou-Sheng Yang1-0/+8
[ Upstream commit d64c7a08034b32c285e576208ae44fc3ba3fa7df ] Dell USB Type C docking WD19/WD19DC attaches additional peripherals as: /: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M |__ Port 1: Dev 11, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 3: Dev 12, If 0, Class=Hub, Driver=hub/4p, 5000M |__ Port 4: Dev 13, If 0, Class=Vendor Specific Class, Driver=r8152, 5000M where usb 2-1-3 is a hub connecting all USB Type-A/C ports on the dock. When hotplugging such dock with additional usb devices already attached on it, the probing process may reset usb 2.1 port, therefore r8152 ethernet device is also reset. However, during r8152 device init there are several for-loops that, when it's unable to retrieve hardware registers due to being disconnected from USB, may take up to 14 seconds each in practice, and that has to be completed before USB may re-enumerate devices on the bus. As a result, devices attached to the dock will only be available after nearly 1 minute after the dock was plugged in: [ 216.388290] [250] r8152 2-1.4:1.0: usb_probe_interface [ 216.388292] [250] r8152 2-1.4:1.0: usb_probe_interface - got id [ 258.830410] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): PHY not ready [ 258.830460] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Invalid header when reading pass-thru MAC addr [ 258.830464] r8152 2-1.4:1.0 (unnamed net_device) (uninitialized): Get ether addr fail This happens in, for example, r8153_init: static int generic_ocp_read(struct r8152 *tp, u16 index, u16 size, void *data, u16 type) { if (test_bit(RTL8152_UNPLUG, &tp->flags)) return -ENODEV; ... } static u16 ocp_read_word(struct r8152 *tp, u16 type, u16 index) { u32 data; ... generic_ocp_read(tp, index, sizeof(tmp), &tmp, type | byen); data = __le32_to_cpu(tmp); ... return (u16)data; } static void r8153_init(struct r8152 *tp) { ... if (test_bit(RTL8152_UNPLUG, &tp->flags)) return; for (i = 0; i < 500; i++) { if (ocp_read_word(tp, MCU_TYPE_PLA, PLA_BOOT_CTRL) & AUTOLOAD_DONE) break; msleep(20); } ... } Since ocp_read_word() doesn't check the return status of generic_ocp_read(), and the only exit condition for the loop is to have a match in the returned value, such loops will only ends after exceeding its maximum runs when the device has been marked as disconnected, which takes 500 * 20ms = 10 seconds in theory, 14 in practice. To solve this long latency another test to RTL8152_UNPLUG flag should be added after those 20ms sleep to skip unnecessary loops, so that the device probe can complete early and proceed to parent port reset/reprobe process. This can be reproduced on all kernel versions up to latest v5.6-rc2, but after v5.5-rc7 the reproduce rate is dramatically lowered to 1/30 or less while it was around 1/2. Signed-off-by: You-Sheng Yang <vicamo.yang@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: systemport: fix index check to avoid an array out of bounds accessColin Ian King1-1/+1
[ Upstream commit c0368595c1639947839c0db8294ee96aca0b3b86 ] Currently the bounds check on index is off by one and can lead to an out of bounds access on array priv->filters_loc when index is RXCHK_BRCM_TAG_MAX. Fixes: bb9051a2b230 ("net: systemport: Add support for WAKE_FILTER") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: stmmac: dwmac1000: Disable ACS if enhanced descs are not usedRemi Pommarel1-1/+2
[ Upstream commit b723bd933980f4956dabc8a8d84b3e83be8d094c ] ACS (auto PAD/FCS stripping) removes FCS off 802.3 packets (LLC) so that there is no need to manually strip it for such packets. The enhanced DMA descriptors allow to flag LLC packets so that the receiving callback can use that to strip FCS manually or not. On the other hand, normal descriptors do not support that. Thus in order to not truncate LLC packet ACS should be disabled when using normal DMA descriptors. Fixes: 47dd7a540b8a0 ("net: add support for STMicroelectronics Ethernet controllers.") Signed-off-by: Remi Pommarel <repk@triplefau.lt> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net/packet: tpacket_rcv: do not increment ring index on dropWillem de Bruijn1-6/+7
[ Upstream commit 46e4c421a053c36bf7a33dda2272481bcaf3eed3 ] In one error case, tpacket_rcv drops packets after incrementing the ring producer index. If this happens, it does not update tp_status to TP_STATUS_USER and thus the reader is stalled for an iteration of the ring, causing out of order arrival. The only such error path is when virtio_net_hdr_from_skb fails due to encountering an unknown GSO type. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: nfc: fix bounds checking bugs on "pipe"Dan Carpenter1-3/+16
[ Upstream commit a3aefbfe45751bf7b338c181b97608e276b5bb73 ] This is similar to commit 674d9de02aa7 ("NFC: Fix possible memory corruption when handling SHDLC I-Frame commands") and commit d7ee81ad09f0 ("NFC: nci: Add some bounds checking in nci_hci_cmd_received()") which added range checks on "pipe". The "pipe" variable comes skb->data[0] in nfc_hci_msg_rx_work(). It's in the 0-255 range. We're using it as the array index into the hdev->pipes[] array which has NFC_HCI_MAX_PIPES (128) members. Fixes: 118278f20aa8 ("NFC: hci: Add pipes table to reference them with a tuple {gate, host}") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: macsec: update SCI upon MAC address change.Dmitry Bogdanov1-5/+6
[ Upstream commit 6fc498bc82929ee23aa2f35a828c6178dfd3f823 ] SCI should be updated, because it contains MAC in its first 6 octets. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18netlink: Use netlink header as base to calculate bad attribute offsetPablo Neira Ayuso1-1/+1
[ Upstream commit 84b3268027641401bb8ad4427a90a3cce2eb86f5 ] Userspace might send a batch that is composed of several netlink messages. The netlink_ack() function must use the pointer to the netlink header as base to calculate the bad attribute offset. Fixes: 2d4bc93368f5 ("netlink: extended ACK reporting") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net/ipv6: use configured metric when add peer routeHangbin Liu1-3/+3
[ Upstream commit 07758eb9ff52794fba15d03aa88d92dbd1b7d125 ] When we add peer address with metric configured, IPv4 could set the dest metric correctly, but IPv6 do not. e.g. ]# ip addr add 192.0.2.1 peer 192.0.2.2/32 dev eth1 metric 20 ]# ip route show dev eth1 192.0.2.2 proto kernel scope link src 192.0.2.1 metric 20 ]# ip addr add 2001:db8::1 peer 2001:db8::2/128 dev eth1 metric 20 ]# ip -6 route show dev eth1 2001:db8::1 proto kernel metric 20 pref medium 2001:db8::2 proto kernel metric 256 pref medium Fix this by using configured metric instead of default one. Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 8308f3ff1753 ("net/ipv6: Add support for specifying metric of connected routes") Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18ipvlan: don't deref eth hdr before checking it's setMahesh Bandewar1-8/+10
[ Upstream commit ad8192767c9f9cf97da57b9ffcea70fb100febef ] IPvlan in L3 mode discards outbound multicast packets but performs the check before ensuring the ether-header is set or not. This is an error that Eric found through code browsing. Fixes: 2ad7bf363841 (“ipvlan: Initial check-in of the IPVLAN driver.”) Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18ipvlan: do not use cond_resched_rcu() in ipvlan_process_multicast()Eric Dumazet1-1/+1
[ Upstream commit afe207d80a61e4d6e7cfa0611a4af46d0ba95628 ] Commit e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") added a cond_resched_rcu() in a loop using rcu protection to iterate over slaves. This is breaking rcu rules, so lets instead use cond_resched() at a point we can reschedule Fixes: e18b353f102e ("ipvlan: add cond_resched_rcu() while processing muticast backlog") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18ipvlan: do not add hardware address of master to its unicast filter listJiri Wiesner1-4/+1
[ Upstream commit 63aae7b17344d4b08a7d05cb07044de4c0f9dcc6 ] There is a problem when ipvlan slaves are created on a master device that is a vmxnet3 device (ipvlan in VMware guests). The vmxnet3 driver does not support unicast address filtering. When an ipvlan device is brought up in ipvlan_open(), the ipvlan driver calls dev_uc_add() to add the hardware address of the vmxnet3 master device to the unicast address list of the master device, phy_dev->uc. This inevitably leads to the vmxnet3 master device being forced into promiscuous mode by __dev_set_rx_mode(). Promiscuous mode is switched on the master despite the fact that there is still only one hardware address that the master device should use for filtering in order for the ipvlan device to be able to receive packets. The comment above struct net_device describes the uc_promisc member as a "counter, that indicates, that promiscuous mode has been enabled due to the need to listen to additional unicast addresses in a device that does not implement ndo_set_rx_mode()". Moreover, the design of ipvlan guarantees that only the hardware address of a master device, phy_dev->dev_addr, will be used to transmit and receive all packets from its ipvlan slaves. Thus, the unicast address list of the master device should not be modified by ipvlan_open() and ipvlan_stop() in order to make ipvlan a workable option on masters that do not support unicast address filtering. Fixes: 2ad7bf3638411 ("ipvlan: Initial check-in of the IPVLAN driver") Reported-by: Per Sundstrom <per.sundstrom@redqube.se> Signed-off-by: Jiri Wiesner <jwiesner@suse.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18ipvlan: add cond_resched_rcu() while processing muticast backlogMahesh Bandewar1-0/+1
[ Upstream commit e18b353f102e371580f3f01dd47567a25acc3c1d ] If there are substantial number of slaves created as simulated by Syzbot, the backlog processing could take much longer and result into the issue found in the Syzbot report. INFO: rcu_sched detected stalls on CPUs/tasks: (detected by 1, t=10502 jiffies, g=5049, c=5048, q=752) All QSes seen, last rcu_sched kthread activity 10502 (4294965563-4294955061), jiffies_till_next_fqs=1, root ->qsmask 0x0 syz-executor.1 R running task on cpu 1 10984 11210 3866 0x30020008 179034491270 Call Trace: <IRQ> [<ffffffff81497163>] _sched_show_task kernel/sched/core.c:8063 [inline] [<ffffffff81497163>] _sched_show_task.cold+0x2fd/0x392 kernel/sched/core.c:8030 [<ffffffff8146a91b>] sched_show_task+0xb/0x10 kernel/sched/core.c:8073 [<ffffffff815c931b>] print_other_cpu_stall kernel/rcu/tree.c:1577 [inline] [<ffffffff815c931b>] check_cpu_stall kernel/rcu/tree.c:1695 [inline] [<ffffffff815c931b>] __rcu_pending kernel/rcu/tree.c:3478 [inline] [<ffffffff815c931b>] rcu_pending kernel/rcu/tree.c:3540 [inline] [<ffffffff815c931b>] rcu_check_callbacks.cold+0xbb4/0xc29 kernel/rcu/tree.c:2876 [<ffffffff815e3962>] update_process_times+0x32/0x80 kernel/time/timer.c:1635 [<ffffffff816164f0>] tick_sched_handle+0xa0/0x180 kernel/time/tick-sched.c:161 [<ffffffff81616ae4>] tick_sched_timer+0x44/0x130 kernel/time/tick-sched.c:1193 [<ffffffff815e75f7>] __run_hrtimer kernel/time/hrtimer.c:1393 [inline] [<ffffffff815e75f7>] __hrtimer_run_queues+0x307/0xd90 kernel/time/hrtimer.c:1455 [<ffffffff815e90ea>] hrtimer_interrupt+0x2ea/0x730 kernel/time/hrtimer.c:1513 [<ffffffff844050f4>] local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1031 [inline] [<ffffffff844050f4>] smp_apic_timer_interrupt+0x144/0x5e0 arch/x86/kernel/apic/apic.c:1056 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 RIP: 0010:do_raw_read_lock+0x22/0x80 kernel/locking/spinlock_debug.c:153 RSP: 0018:ffff8801dad07ab8 EFLAGS: 00000a02 ORIG_RAX: ffffffffffffff12 RAX: 0000000000000000 RBX: ffff8801c4135680 RCX: 0000000000000000 RDX: 1ffff10038826afe RSI: ffff88019d816bb8 RDI: ffff8801c41357f0 RBP: ffff8801dad07ac0 R08: 0000000000004b15 R09: 0000000000310273 R10: ffff88019d816bb8 R11: 0000000000000001 R12: ffff8801c41357e8 R13: 0000000000000000 R14: ffff8801cfb19850 R15: ffff8801cfb198b0 [<ffffffff8101460e>] __raw_read_lock_bh include/linux/rwlock_api_smp.h:177 [inline] [<ffffffff8101460e>] _raw_read_lock_bh+0x3e/0x50 kernel/locking/spinlock.c:240 [<ffffffff840d78ca>] ipv6_chk_mcast_addr+0x11a/0x6f0 net/ipv6/mcast.c:1006 [<ffffffff84023439>] ip6_mc_input+0x319/0x8e0 net/ipv6/ip6_input.c:482 [<ffffffff840211c8>] dst_input include/net/dst.h:449 [inline] [<ffffffff840211c8>] ip6_rcv_finish+0x408/0x610 net/ipv6/ip6_input.c:78 [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:292 [inline] [<ffffffff840214de>] NF_HOOK include/linux/netfilter.h:286 [inline] [<ffffffff840214de>] ipv6_rcv+0x10e/0x420 net/ipv6/ip6_input.c:278 [<ffffffff83a29efa>] __netif_receive_skb_one_core+0x12a/0x1f0 net/core/dev.c:5303 [<ffffffff83a2a15c>] __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:5417 [<ffffffff83a2f536>] process_backlog+0x216/0x6c0 net/core/dev.c:6243 [<ffffffff83a30d1b>] napi_poll net/core/dev.c:6680 [inline] [<ffffffff83a30d1b>] net_rx_action+0x47b/0xfb0 net/core/dev.c:6748 [<ffffffff846002c8>] __do_softirq+0x2c8/0x99a kernel/softirq.c:317 [<ffffffff813e656a>] invoke_softirq kernel/softirq.c:399 [inline] [<ffffffff813e656a>] irq_exit+0x16a/0x1a0 kernel/softirq.c:439 [<ffffffff84405115>] exiting_irq arch/x86/include/asm/apic.h:561 [inline] [<ffffffff84405115>] smp_apic_timer_interrupt+0x165/0x5e0 arch/x86/kernel/apic/apic.c:1058 [<ffffffff84401cbe>] apic_timer_interrupt+0x8e/0xa0 arch/x86/entry/entry_64.S:778 </IRQ> RIP: 0010:__sanitizer_cov_trace_pc+0x26/0x50 kernel/kcov.c:102 RSP: 0018:ffff880196033bd8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff12 RAX: ffff88019d8161c0 RBX: 00000000ffffffff RCX: ffffc90003501000 RDX: 0000000000000002 RSI: ffffffff816236d1 RDI: 0000000000000005 RBP: ffff880196033bd8 R08: ffff88019d8161c0 R09: 0000000000000000 R10: 1ffff10032c067f0 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000 [<ffffffff816236d1>] do_futex+0x151/0x1d50 kernel/futex.c:3548 [<ffffffff816260f0>] C_SYSC_futex kernel/futex_compat.c:201 [inline] [<ffffffff816260f0>] compat_SyS_futex+0x270/0x3b0 kernel/futex_compat.c:175 [<ffffffff8101da17>] do_syscall_32_irqs_on arch/x86/entry/common.c:353 [inline] [<ffffffff8101da17>] do_fast_syscall_32+0x357/0xe1c arch/x86/entry/common.c:415 [<ffffffff84401a9b>] entry_SYSENTER_compat+0x8b/0x9d arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f23c69 RSP: 002b:00000000f5d1f12c EFLAGS: 00000282 ORIG_RAX: 00000000000000f0 RAX: ffffffffffffffda RBX: 000000000816af88 RCX: 0000000000000080 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000816af8c RBP: 00000000f5d1f228 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 rcu_sched kthread starved for 10502 jiffies! g5049 c5048 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0 ->cpu=1 rcu_sched R running task on cpu 1 13048 8 2 0x90000000 179099587640 Call Trace: [<ffffffff8147321f>] context_switch+0x60f/0xa60 kernel/sched/core.c:3209 [<ffffffff8100095a>] __schedule+0x5aa/0x1da0 kernel/sched/core.c:3934 [<ffffffff810021df>] schedule+0x8f/0x1b0 kernel/sched/core.c:4011 [<ffffffff8101116d>] schedule_timeout+0x50d/0xee0 kernel/time/timer.c:1803 [<ffffffff815c13f1>] rcu_gp_kthread+0xda1/0x3b50 kernel/rcu/tree.c:2327 [<ffffffff8144b318>] kthread+0x348/0x420 kernel/kthread.c:246 [<ffffffff84400266>] ret_from_fork+0x56/0x70 arch/x86/entry/entry_64.S:393 Fixes: ba35f8588f47 (“ipvlan: Defer multicast / broadcast processing to a work-queue”) Signed-off-by: Mahesh Bandewar <maheshb@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18ipv6/addrconf: call ipv6_mc_up() for non-Ethernet interfaceHangbin Liu1-0/+4
[ Upstream commit 60380488e4e0b95e9e82aa68aa9705baa86de84c ] Rafał found an issue that for non-Ethernet interface, if we down and up frequently, the memory will be consumed slowly. The reason is we add allnodes/allrouters addressed in multicast list in ipv6_add_dev(). When link down, we call ipv6_mc_down(), store all multicast addresses via mld_add_delrec(). But when link up, we don't call ipv6_mc_up() for non-Ethernet interface to remove the addresses. This makes idev->mc_tomb getting bigger and bigger. The call stack looks like: addrconf_notify(NETDEV_REGISTER) ipv6_add_dev ipv6_dev_mc_inc(ff01::1) ipv6_dev_mc_inc(ff02::1) ipv6_dev_mc_inc(ff02::2) addrconf_notify(NETDEV_UP) addrconf_dev_config /* Alas, we support only Ethernet autoconfiguration. */ return; addrconf_notify(NETDEV_DOWN) addrconf_ifdown ipv6_mc_down igmp6_group_dropped(ff02::2) mld_add_delrec(ff02::2) igmp6_group_dropped(ff02::1) igmp6_group_dropped(ff01::1) After investigating, I can't found a rule to disable multicast on non-Ethernet interface. In RFC2460, the link could be Ethernet, PPP, ATM, tunnels, etc. In IPv4, it doesn't check the dev type when calls ip_mc_up() in inetdev_event(). Even for IPv6, we don't check the dev type and call ipv6_add_dev(), ipv6_dev_mc_inc() after register device. So I think it's OK to fix this memory consumer by calling ipv6_mc_up() for non-Ethernet interface. v2: Also check IFF_MULTICAST flag to make sure the interface supports multicast Reported-by: Rafał Miłecki <zajec5@gmail.com> Tested-by: Rafał Miłecki <zajec5@gmail.com> Fixes: 74235a25c673 ("[IPV6] addrconf: Fix IPv6 on tuntap tunnels") Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18inet_diag: return classid for all socket typesDmitry Yakunin5-40/+40
[ Upstream commit 83f73c5bb7b9a9135173f0ba2b1aa00c06664ff9 ] In commit 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") croup classid reporting was fixed. But this works only for TCP sockets because for other socket types icsk parameter can be NULL and classid code path is skipped. This change moves classid handling to inet_diag_msg_attrs_fill() function. Also inet_diag_msg_attrs_size() helper was added and addends in nlmsg_new() were reordered to save order from inet_sk_diag_fill(). Fixes: 1ec17dbd90f8 ("inet_diag: fix reporting cgroup classid and fallback to priority") Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18gre: fix uninit-value in __iptunnel_pull_headerEric Dumazet1-2/+10
[ Upstream commit 17c25cafd4d3e74c83dce56b158843b19c40b414 ] syzbot found an interesting case of the kernel reading an uninit-value [1] Problem is in the handling of ETH_P_WCCP in gre_parse_header() We look at the byte following GRE options to eventually decide if the options are four bytes longer. Use skb_header_pointer() to not pull bytes if we found that no more bytes were needed. All callers of gre_parse_header() are properly using pskb_may_pull() anyway before proceeding to next header. [1] BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2303 [inline] BUG: KMSAN: uninit-value in __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 CPU: 1 PID: 11784 Comm: syz-executor940 Not tainted 5.6.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 pskb_may_pull include/linux/skbuff.h:2303 [inline] __iptunnel_pull_header+0x30c/0xbd0 net/ipv4/ip_tunnel_core.c:94 iptunnel_pull_header include/net/ip_tunnels.h:411 [inline] gre_rcv+0x15e/0x19c0 net/ipv6/ip6_gre.c:606 ip6_protocol_deliver_rcu+0x181b/0x22c0 net/ipv6/ip6_input.c:432 ip6_input_finish net/ipv6/ip6_input.c:473 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ip6_input net/ipv6/ip6_input.c:482 [inline] ip6_mc_input+0xdf2/0x1460 net/ipv6/ip6_input.c:576 dst_input include/net/dst.h:442 [inline] ip6_rcv_finish net/ipv6/ip6_input.c:76 [inline] NF_HOOK include/linux/netfilter.h:307 [inline] ipv6_rcv+0x683/0x710 net/ipv6/ip6_input.c:306 __netif_receive_skb_one_core net/core/dev.c:5198 [inline] __netif_receive_skb net/core/dev.c:5312 [inline] netif_receive_skb_internal net/core/dev.c:5402 [inline] netif_receive_skb+0x66b/0xf20 net/core/dev.c:5461 tun_rx_batched include/linux/skbuff.h:4321 [inline] tun_get_user+0x6aef/0x6f60 drivers/net/tun.c:1997 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 RIP: 0023:0xf7f62d99 Code: 90 e8 0b 00 00 00 f3 90 0f ae e8 eb f9 8d 74 26 00 89 3c 24 c3 90 90 90 90 90 90 90 90 90 90 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 eb 0d 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 002b:00000000fffedb2c EFLAGS: 00000217 ORIG_RAX: 0000000000000004 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000020002580 RDX: 0000000000000fca RSI: 0000000000000036 RDI: 0000000000000004 RBP: 0000000000008914 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:144 [inline] kmsan_internal_poison_shadow+0x66/0xd0 mm/kmsan/kmsan.c:127 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:82 slab_alloc_node mm/slub.c:2793 [inline] __kmalloc_node_track_caller+0xb40/0x1200 mm/slub.c:4401 __kmalloc_reserve net/core/skbuff.c:142 [inline] __alloc_skb+0x2fd/0xac0 net/core/skbuff.c:210 alloc_skb include/linux/skbuff.h:1051 [inline] alloc_skb_with_frags+0x18c/0xa70 net/core/skbuff.c:5766 sock_alloc_send_pskb+0xada/0xc60 net/core/sock.c:2242 tun_alloc_skb drivers/net/tun.c:1529 [inline] tun_get_user+0x10ae/0x6f60 drivers/net/tun.c:1843 tun_chr_write_iter+0x1f2/0x360 drivers/net/tun.c:2026 call_write_iter include/linux/fs.h:1901 [inline] new_sync_write fs/read_write.c:483 [inline] __vfs_write+0xa5a/0xca0 fs/read_write.c:496 vfs_write+0x44a/0x8f0 fs/read_write.c:558 ksys_write+0x267/0x450 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __ia32_sys_write+0xdb/0x120 fs/read_write.c:620 do_syscall_32_irqs_on arch/x86/entry/common.c:339 [inline] do_fast_syscall_32+0x3c7/0x6e0 arch/x86/entry/common.c:410 entry_SYSENTER_compat+0x68/0x77 arch/x86/entry/entry_64_compat.S:139 Fixes: 95f5c64c3c13 ("gre: Move utility functions to common headers") Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18cgroup, netclassid: periodically release file_lock on classid updatingDmitry Yakunin1-10/+37
[ Upstream commit 018d26fcd12a75fb9b5fe233762aa3f2f0854b88 ] In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts. This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion. Now update is non atomic and socket may be skipped using calls: dup2(oldfd, newfd); close(oldfd); But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer. New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration. So in common cases this patch has no side effects. Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru> Reviewed-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18net: phy: Avoid multiple suspendsFlorian Fainelli1-2/+3
commit 503ba7c6961034ff0047707685644cad9287c226 upstream. It is currently possible for a PHY device to be suspended as part of a network device driver's suspend call while it is still being attached to that net_device, either via phy_suspend() or implicitly via phy_stop(). Later on, when the MDIO bus controller get suspended, we would attempt to suspend again the PHY because it is still attached to a network device. This is both a waste of time and creates an opportunity for improper clock/power management bugs to creep in. Fixes: 803dd9c77ac3 ("net: phy: avoid suspending twice a PHY") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-18phy: Revert toggling reset changes.David S. Miller1-6/+5
commit 7b566f70e1bf65b189b66eb3de6f431c30f7dff2 upstream. This reverts: ef1b5bf506b1 ("net: phy: Fix not to call phy_resume() if PHY is not attached") 8c85f4b81296 ("net: phy: micrel: add toggling phy reset if PHY is not attached") Andrew Lunn informs me that there are alternative efforts underway to fix this more properly. Signed-off-by: David S. Miller <davem@davemloft.net> [just take the ef1b5bf506b1 revert - gregkh] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-16Linux 4.19.110v4.19.110Greg Kroah-Hartman1-1/+1
2020-03-16KVM: SVM: fix up incorrect backportGreg Kroah-Hartman1-1/+1
When I backported 52918ed5fcf0 ("KVM: SVM: Override default MMIO mask if memory encryption is enabled") to 4.19 (which resulted in commit a4e761c9f63a ("KVM: SVM: Override default MMIO mask if memory encryption is enabled")), I messed up the call to kvm_mmu_set_mmio_spte_mask() Fix that here now. Reported-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Sean Christopherson <sean.j.christopherson@intel.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11Linux 4.19.109v4.19.109Greg Kroah-Hartman1-1/+1
2020-03-11scsi: pm80xx: Fixed kernel panic during error recovery for SATA driveDeepak Ukey3-2/+8
commit 196ba6629cf95e51403337235d09742fcdc3febd upstream. Disabling the SATA drive interface cause kernel panic. When the drive Interface is disabled, device should be deregistered after aborting all pending I/Os. Also changed the port recovery timeout to 10000 ms for PM8006 controller. Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11dm integrity: fix a deadlock due to offloading to an incorrect workqueueMikulas Patocka1-2/+13
commit 53770f0ec5fd417429775ba006bc4abe14002335 upstream. If we need to perform synchronous I/O in dm_integrity_map_continue(), we must make sure that we are not in the map function - in order to avoid the deadlock due to bio queuing in generic_make_request. To avoid the deadlock, we offload the request to metadata_wq. However, metadata_wq also processes metadata updates for write requests. If there are too many requests that get offloaded to metadata_wq at the beginning of dm_integrity_map_continue, the workqueue metadata_wq becomes clogged and the system is incapable of processing any metadata updates. This causes a deadlock because all the requests that need to do metadata updates wait for metadata_wq to proceed and metadata_wq waits inside wait_and_add_new_range until some existing request releases its range lock (which doesn't happen because the range lock is released after metadata update). In order to fix the deadlock, we create a new workqueue offload_wq and offload requests to it - so that processing of offload_wq is independent from processing of metadata_wq. Fixes: 7eada909bfd7 ("dm: add integrity target") Cc: stable@vger.kernel.org # v4.12+ Reported-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11efi/x86: Handle by-ref arguments covering multiple pages in mixed modeArd Biesheuvel1-19/+26
commit 8319e9d5ad98ffccd19f35664382c73cea216193 upstream. The mixed mode runtime wrappers are fragile when it comes to how the memory referred to by its pointer arguments are laid out in memory, due to the fact that it translates these addresses to physical addresses that the runtime services can dereference when running in 1:1 mode. Since vmalloc'ed pages (including the vmap'ed stack) are not contiguous in the physical address space, this scheme only works if the referenced memory objects do not cross page boundaries. Currently, the mixed mode runtime service wrappers require that all by-ref arguments that live in the vmalloc space have a size that is a power of 2, and are aligned to that same value. While this is a sensible way to construct an object that is guaranteed not to cross a page boundary, it is overly strict when it comes to checking whether a given object violates this requirement, as we can simply take the physical address of the first and the last byte, and verify that they point into the same physical page. When this check fails, we emit a WARN(), but then simply proceed with the call, which could cause data corruption if the next physical page belongs to a mapping that is entirely unrelated. Given that with vmap'ed stacks, this condition is much more likely to trigger, let's relax the condition a bit, but fail the runtime service call if it does trigger. Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y") Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200221084849.26878-4-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11efi/x86: Align GUIDs to their size in the mixed mode runtime wrapperArd Biesheuvel1-4/+21
commit 63056e8b5ebf41d52170e9f5ba1fc83d1855278c upstream. Hans reports that his mixed mode systems running v5.6-rc1 kernels hit the WARN_ON() in virt_to_phys_or_null_size(), caused by the fact that efi_guid_t objects on the vmap'ed stack happen to be misaligned with respect to their sizes. As a quick (i.e., backportable) fix, copy GUID pointer arguments to the local stack into a buffer that is naturally aligned to its size, so that it is guaranteed to cover only one physical page. Note that on x86, we cannot rely on the stack pointer being aligned the way the compiler expects, so we need to allocate an 8-byte aligned buffer of sufficient size, and copy the GUID into that buffer at an offset that is aligned to 16 bytes. Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y") Reported-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Hans de Goede <hdegoede@redhat.com> Cc: linux-efi@vger.kernel.org Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20200221084849.26878-2-ardb@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systemsDesnes A. Nunes do Rosario1-1/+3
commit fc37a1632d40c80c067eb1bc235139f5867a2667 upstream. PowerVM systems running compatibility mode on a few Power8 revisions are still vulnerable to the hardware defect that loses PMU exceptions arriving prior to a context switch. The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG cpu_feature bit, nevertheless this bit also needs to be set for PowerVM compatibility mode systems. Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG") Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com> Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11dmaengine: coh901318: Fix a double lock bug in dma_tc_handle()Dan Carpenter1-4/+0
commit 36d5d22090d13fd3a7a8c9663a711cbe6970aac8 upstream. The caller is already holding the lock so this will deadlock. Fixes: 0b58828c923e ("DMAENGINE: COH 901 318 remove irq counting") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/20200217144050.3i4ymbytogod4ijn@kili.mountain Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11hwmon: (adt7462) Fix an error return in ADT7462_REG_VOLT()Dan Carpenter1-1/+1
commit 44f2f882909fedfc3a56e4b90026910456019743 upstream. This is only called from adt7462_update_device(). The caller expects it to return zero on error. I fixed a similar issue earlier in commit a4bf06d58f21 ("hwmon: (adt7462) ADT7462_REG_VOLT_MAX() should return 0") but I missed this one. Fixes: c0b4e3ab0c76 ("adt7462: new hwmon driver") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Link: https://lore.kernel.org/r/20200303101608.kqjwfcazu2ylhi2a@kili.mountain Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ARM: dts: imx7-colibri: Fix frequency for sd/mmcOleksandr Suvorov1-1/+0
commit 2773fe1d31c42ffae2a9cb9a6055d99dd86e2fee upstream. SD/MMC on Colibri iMX7S/D modules successfully support 200Mhz frequency in HS200 mode. Removing the unnecessary max-frequency limit significantly increases the performance: == before fix ==== root@colibri-imx7-emmc:~# hdparm -t /dev/mmcblk0 /dev/mmcblk0: Timing buffered disk reads: 252 MB in 3.02 seconds = 83.54 MB/sec ================== === after fix ==== root@colibri-imx7-emmc:~# hdparm -t /dev/mmcblk0 /dev/mmcblk0: Timing buffered disk reads: 408 MB in 3.00 seconds = 135.94 MB/sec ================== Fixes: f928a4a377e4 ("ARM: dts: imx7: add Toradex Colibri iMX7D 1GB (eMMC) support") Signed-off-by: Oleksandr Suvorov <oleksandr.suvorov@toradex.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ARM: dts: am437x-idk-evm: Fix incorrect OPP node namesSuman Anna1-2/+2
commit 31623468be0bf57617b8057dcd335693935a9491 upstream. The commit 337c6c9a69af ("ARM: dts: am437x-idk-evm: Disable OPP50 for MPU") adjusts couple of OPP nodes defined in the common am4372.dtsi file, but used outdated node names. This results in these getting treated as new OPP nodes with missing properties. Fix this properly by using the correct node names as updated in commit b9cb2ba71848 ("ARM: dts: Use - instead of @ for DT OPP entries for TI SoCs"). Reported-by: Roger Quadros <rogerq@ti.com> Fixes: 337c6c9a69af ("ARM: dts: am437x-idk-evm: Disable OPP50 for MPU") Signed-off-by: Suman Anna <s-anna@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ARM: imx: build v7_cpu_resume() unconditionallyAhmad Fatoum4-16/+28
commit 512a928affd51c2dc631401e56ad5ee5d5dd68b6 upstream. This function is not only needed by the platform suspend code, but is also reused as the CPU resume function when the ARM cores can be powered down completely in deep idle, which is the case on i.MX6SX and i.MX6UL(L). Providing the static inline stub whenever CONFIG_SUSPEND is disabled means that those platforms will hang on resume from cpuidle if suspend is disabled. So there are two problems: - The static inline stub masks the linker error - The function is not available where needed Fix both by just building the function unconditionally, when CONFIG_SOC_IMX6 is enabled. The actual code is three instructions long, so it's arguably ok to just leave it in for all i.MX6 kernel configurations. Fixes: 05136f0897b5 ("ARM: imx: support arm power off in cpuidle for i.mx6sx") Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Rouven Czerwinski <r.czerwinski@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11IB/hfi1, qib: Ensure RCU is locked when accessing listDennis Dalessandro2-1/+5
commit 817a68a6584aa08e323c64283fec5ded7be84759 upstream. The packet handling function, specifically the iteration of the qp list for mad packet processing misses locking RCU before running through the list. Not only is this incorrect, but the list_for_each_entry_rcu() call can not be called with a conditional check for lock dependency. Remedy this by invoking the rcu lock and unlock around the critical section. This brings MAD packet processing in line with what is done for non-MAD packets. Fixes: 7724105686e7 ("IB/hfi1: add driver files") Link: https://lore.kernel.org/r/20200225195445.140896.41873.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11RMDA/cm: Fix missing ib_cm_destroy_id() in ib_cm_insert_listen()Jason Gunthorpe1-0/+1
commit c14dfddbd869bf0c2bafb7ef260c41d9cebbcfec upstream. The algorithm pre-allocates a cm_id since allocation cannot be done while holding the cm.lock spinlock, however it doesn't free it on one error path, leading to a memory leak. Fixes: 067b171b8679 ("IB/cm: Share listening CM IDs") Link: https://lore.kernel.org/r/20200221152023.GA8680@ziepe.ca Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11RDMA/iwcm: Fix iwcm work deallocationBernard Metzler1-1/+3
commit 810dbc69087b08fd53e1cdd6c709f385bc2921ad upstream. The dealloc_work_entries() function must update the work_free_list pointer while freeing its entries, since potentially called again on same list. A second iteration of the work list caused system crash. This happens, if work allocation fails during cma_iw_listen() and free_cm_id() tries to free the list again during cleanup. Fixes: 922a8e9fb2e0 ("RDMA: iWARP Connection Manager.") Link: https://lore.kernel.org/r/20200302181614.17042-1-bmt@zurich.ibm.com Reported-by: syzbot+cb0c054eabfba4342146@syzkaller.appspotmail.com Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ARM: dts: imx6: phycore-som: fix emmc supplyMarco Felsch1-1/+0
commit eb0bbba7636b9fc81939d6087a5fe575e150c95a upstream. Currently the vmmc is supplied by the 1.8V pmic rail but this is wrong. The default module behaviour is to power VCCQ and VCC by the 3.3V power rail. Optional the user can connect the VCCQ to the pmic 1.8V emmc power rail using a solder jumper. Fixes: ddec5d1c0047 ("ARM: dts: imx6: Add initial support for phyCORE-i.MX 6 SOM") Signed-off-by: Marco Felsch <m.felsch@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11phy: mapphone-mdm6600: Fix write timeouts with shorter GPIO toggle intervalTony Lindgren1-1/+8
commit 46b7edf1c7b7c91004c4db2c355cbd033f2385f9 upstream. I've noticed that when writing data to the modem the writes can time out at some point eventually. Looks like kicking the modem idle GPIO every 600 ms instead of once a second fixes the issue. Note that this rate is different from our runtime PM autosuspend rate MDM6600_MODEM_IDLE_DELAY_MS that we still want to keep at 1 second, so let's add a separate define for PHY_MDM6600_IDLE_KICK_MS. Fixes: f7f50b2a7b05 ("phy: mapphone-mdm6600: Add runtime PM support for n_gsm on USB suspend") Cc: Marcel Partap <mpartap@gmx.net> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Michael Scott <hashcode0f@gmail.com> Cc: NeKit <nekit1000@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11phy: mapphone-mdm6600: Fix timeouts by adding wake-up handlingTony Lindgren1-2/+16
commit be4e3c737eebd75815633f4b8fd766defaf0f1fc upstream. We have an interrupt handler for the wake-up GPIO pin, but we're missing the code to wake-up the system. This can cause timeouts receiving data for the UART that shares the wake-up GPIO pin with the USB PHY. All we need to do is just wake the system and kick the autosuspend timeout to fix the issue. Fixes: 5d1ebbda0318 ("phy: mapphone-mdm6600: Add USB PHY driver for MDM6600 on Droid 4") Cc: Marcel Partap <mpartap@gmx.net> Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Michael Scott <hashcode0f@gmail.com> Cc: NeKit <nekit1000@gmail.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11drm/sun4i: de2/de3: Remove unsupported VI layer formatsJernej Skrabec2-14/+0
commit a4769905f0ae32cae4f096f646ab03b8b4794c74 upstream. YUV444 and YVU444 are planar formats, but HW format RGB888 is packed. This means that those two mappings were never correct. Remove them. Fixes: 60a3dcf96aa8 ("drm/sun4i: Add DE2 definitions for YUV formats") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200224173901.174016-2-jernej.skrabec@siol.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11drm/sun4i: Fix DE2 VI layer format supportJernej Skrabec2-11/+67
commit 20896ef137340e9426cf322606f764452f5eb960 upstream. DE2 VI layer doesn't support blending which means alpha channel is ignored. Replace all formats with alpha with "don't care" (X) channel. Fixes: 7480ba4d7571 ("drm/sun4i: Add support for DE2 VI planes") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net> Link: https://patchwork.freedesktop.org/patch/msgid/20200224173901.174016-4-jernej.skrabec@siol.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ASoC: dapm: Correct DAPM handling of active widgets during shutdownCharles Keepax1-1/+1
commit 9b3193089e77d3b59b045146ff1c770dd899acb1 upstream. commit c2caa4da46a4 ("ASoC: Fix widget powerdown on shutdown") added a set of the power state during snd_soc_dapm_shutdown to ensure the widgets powered off. However, when commit 39eb5fd13dff ("ASoC: dapm: Delay w->power update until the changes are written") added the new_power member of the widget structure, to differentiate between the current power state and the target power state, it did not update the shutdown to use the new_power member. As new_power has not updated it will be left in the state set by the last DAPM sequence, ie. 1 for active widgets. So as the DAPM sequence for the shutdown proceeds it will turn the widgets on (despite them already being on) rather than turning them off. Fixes: 39eb5fd13dff ("ASoC: dapm: Delay w->power update until the changes are written") Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Link: https://lore.kernel.org/r/20200228153145.21013-1-ckeepax@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ASoC: pcm512x: Fix unbalanced regulator enable call in probe error pathMatthias Reichl1-3/+5
commit ac0a68997935c4acb92eaae5ad8982e0bb432d56 upstream. When we get a clock error during probe we have to call regulator_bulk_disable before bailing out, otherwise we trigger a warning in regulator_put. Fix this by using "goto err" like in the error cases above. Fixes: 5a3af1293194d ("ASoC: pcm512x: Add PCM512x driver") Signed-off-by: Matthias Reichl <hias@horus.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20200220202956.29233-1-hias@horus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11ASoC: pcm: Fix possible buffer overflow in dpcm state sysfs outputTakashi Iwai1-8/+8
commit 6c89ffea60aa3b2a33ae7987de1e84bfb89e4c9e upstream. dpcm_show_state() invokes multiple snprintf() calls to concatenate formatted strings on the fixed size buffer. The usage of snprintf() is supposed for avoiding the buffer overflow, but it doesn't work as expected because snprintf() doesn't return the actual output size but the size to be written. Fix this bug by replacing all snprintf() calls with scnprintf() calls. Fixes: f86dcef87b77 ("ASoC: dpcm: Add debugFS support for DPCM") Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://lore.kernel.org/r/20200218111737.14193-4-tiwai@suse.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-03-11dmaengine: imx-sdma: remove dma_slave_config direction usage and leave ↵Vinod Koul1-17/+39
sdma_event_enable() [ Upstream commit 107d06441b709d31ce592535086992799ee51e17 ] dma_slave_config direction was marked as deprecated quite some time back, remove the usage from this driver so that the field can be removed ENBLn bit should be set before any dma request triggered, please refer to the below information from i.mx6sololite RM. Otherwise, spi/uart test will be fail because there is dma request from tx fifo always before dmaengine_prep_slave_sg() in where ENBLn set and violate the below rule. https://www.nxp.com/docs/en/reference-manual/IMX6SLRM.pdf: 40.8.28 Channel Enable RAM (SDMAARM_CHNENBLn) "It is thus essential for the Arm platform to program them before any DMA request is triggered to the SDMA, otherwise an unpredictable combination of channels may be started". Signed-off-by: Robin Gong <yibin.gong@nxp.com> [vkoul: sqashed patch from Robin into direction change] Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-03-11ASoC: intel: skl: Fix possible buffer overflow in debug outputsTakashi Iwai1-14/+14
commit 549cd0ba04dcfe340c349cd983bd440480fae8ee upstream. The debugfs output of intel skl driver writes strings with multiple snprintf() calls with the fixed size. This was supposed to avoid the buffer overflow but actually it still would, because snprintf() returns the expected size to be output, not the actual output size. Fix it by replacing snprintf() calls with scnprintf(). Fixes: d14700a01f91 ("ASoC: Intel: Skylake: Debugfs facility to dump module config") Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Cezary Rojewski <cezary.rojewski@intel.com> Link: https://lore.kernel.org/r/20200218111737.14193-3-tiwai@suse.de Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>