summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-07-22sched: Fix race against ptrace_freeze_trace()Peter Zijlstra1-10/+14
There is apparently one site that violates the rule that only current and ttwu() will modify task->state, namely ptrace_{,un}freeze_traced() will change task->state for a remote task. Oleg explains: "TASK_TRACED/TASK_STOPPED was always protected by siglock. In particular, ttwu(__TASK_TRACED) must be always called with siglock held. That is why ptrace_freeze_traced() assumes it can safely do s/TASK_TRACED/__TASK_TRACED/ under spin_lock(siglock)." This breaks the ordering scheme introduced by commit: dbfb089d360b ("sched: Fix loadavg accounting race") Specifically, the reload not matching no longer implies we don't have to block. Simply things by noting that what we need is a LOAD->STORE ordering and this can be provided by a control dependency. So replace: prev_state = prev->state; raw_spin_lock(&rq->lock); smp_mb__after_spinlock(); /* SMP-MB */ if (... && prev_state && prev_state == prev->state) deactivate_task(); with: prev_state = prev->state; if (... && prev_state) /* CTRL-DEP */ deactivate_task(); Since that already implies the 'prev->state' load must be complete before allowing the 'prev->on_rq = 0' store to become visible. Fixes: dbfb089d360b ("sched: Fix loadavg accounting race") Reported-by: Jiri Slaby <jirislaby@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com> Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
2020-07-22x86, vmlinux.lds: Page-align end of ..page_aligned sectionsJoerg Roedel2-1/+5
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is page-aligned, but the end of the .bss..page_aligned section is not guaranteed to be page-aligned. As a result, objects from other .bss sections may end up on the same 4k page as the idt_table, and will accidentially get mapped read-only during boot, causing unexpected page-faults when the kernel writes to them. This could be worked around by making the objects in the page aligned sections page sized, but that's wrong. Explicit sections which store only page aligned objects have an implicit guarantee that the object is alone in the page in which it is placed. That works for all objects except the last one. That's inconsistent. Enforcing page sized objects for these sections would wreckage memory sanitizers, because the object becomes artificially larger than it should be and out of bound access becomes legit. Align the end of the .bss..page_aligned and .data..page_aligned section on page-size so all objects places in these sections are guaranteed to have their own page. [ tglx: Amended changelog ] Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
2020-07-22net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offloadMurali Karicheri1-1/+2
Currently drive supports taprio offload which is a tc feature offloaded to cpsw hardware. So driver has to set the hw feature flag, NETIF_F_HW_TC in the net device to be compliant. This patch adds the flag. Fixes: 8127224c2708 ("ethernet: ti: am65-cpsw-qos: add TAPRIO offload support") Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: ethernet: ave: Fix error returns in ave_initWang Hai1-1/+1
When regmap_update_bits failed in ave_init(), calls of the functions reset_control_assert() and clk_disable_unprepare() were missed. Add goto out_reset_assert to do this. Fixes: 57878f2f4697 ("net: ethernet: ave: add support for phy-mode setting of system controller") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22drivers/net/wan/x25_asy: Fix to make it workXie He1-7/+14
This driver is not working because of problems of its receiving code. This patch fixes it to make it work. When the driver receives an LAPB frame, it should first pass the frame to the LAPB module to process. After processing, the LAPB module passes the data (the packet) back to the driver, the driver should then add a one-byte pseudo header and pass the data to upper layers. The changes to the "x25_asy_bump" function and the "x25_asy_data_indication" function are to correctly implement this procedure. Also, the "x25_asy_unesc" function ignores any frame that is shorter than 3 bytes. However the shortest frames are 2-byte long. So we need to change it to allow 2-byte frames to pass. Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Reviewed-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22Revert "drm/amd/display: Expose connector VRR range via debugfs"Bhanuprakash Modem1-20/+0
v3: * Rebase (Manasi) v2: * Rebase (Manasi) As both VRR min and max are already part of drm_display_info, drm can expose this VRR range for each connector. Hence this logic should move to core DRM. This reverts commit 727962f030c23422a01e8b22d0f463815fb15ec4. Signed-off-by: Bhanuprakash Modem <bhanuprakash.modem@intel.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: AMD gfx <amd-gfx@lists.freedesktop.org> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-22ipvs: fix the connection sync failed in some casesguodeqing1-4/+8
The sync_thread_backup only checks sk_receive_queue is empty or not, there is a situation which cannot sync the connection entries when sk_receive_queue is empty and sk_rmem_alloc is larger than sk_rcvbuf, the sync packets are dropped in __udp_enqueue_schedule_skb, this is because the packets in reader_queue is not read, so the rmem is not reclaimed. Here I add the check of whether the reader_queue of the udp sock is empty or not to solve this problem. Fixes: 2276f58ac589 ("udp: use a separate rx queue for packet reception") Reported-by: zhouxudong <zhouxudong8@huawei.com> Signed-off-by: guodeqing <geffrey.guo@huawei.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-07-22selftest: txtimestamp: fix net ns entry logicPaolo Pisati1-1/+1
According to 'man 8 ip-netns', if `ip netns identify` returns an empty string, there's no net namespace associated with current PID: fix the net ns entrance logic. Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22xtensa: fix access check in csum_and_copy_from_userMax Filippov1-1/+1
Commit d341659f470b ("xtensa: switch to providing csum_and_copy_from_user()") introduced access check, but incorrectly tested dst instead of src. Fix access_ok argument in csum_and_copy_from_user. Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: d341659f470b ("xtensa: switch to providing csum_and_copy_from_user()") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-07-22Merge branch 'qed-suppress-irrelevant-error-messages-on-HW-init'David S. Miller4-26/+34
Alexander Lobakin says: ==================== qed: suppress irrelevant error messages on HW init This raises the verbosity level of several error/warning messages on driver/module initialization, most of which are false-positives, and the one actively spamming the log for no reason. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22qed: suppress false-positives interrupt error messages on HW initAlexander Lobakin3-24/+32
It was found that qed_pglueb_rbc_attn_handler() can produce a lot of false-positive error detections on driver load/reload (especially after crashes/recoveries) and spam the kernel log: [ 4.958275] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d00ff0 [ 2079.146764] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [ 2116.374631] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [ 2135.250564] [qed_pglueb_rbc_attn_handler:324()]ICPL error - 00d80ff0 [...] Reduce the logging level of two false-positive prone error messages from notice to verbose on initialization (only) to not mix it with real error attentions while debugging. Fixes: 666db4862f2d ("qed: Revise load sequence to avoid PCI errors") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22qed: suppress "don't support RoCE & iWARP" flooding on HW initAlexander Lobakin1-2/+2
Change the verbosity of the "don't support RoCE & iWARP simultaneously" warning to debug level to stop flooding on driver/hardware initialization: [ 4.783230] qede 01:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0 [MBI 15.10.6] [eth0] [ 4.810020] [qed_rdma_set_pf_params:2076()]Current day drivers don't support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only [ 4.861186] qede 01:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0 [MBI 15.10.6] [eth1] [ 4.893311] [qed_rdma_set_pf_params:2076()]Current day drivers don't support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only [ 5.181713] qede a1:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0 [MBI 15.10.6] [eth2] [ 5.224740] [qed_rdma_set_pf_params:2076()]Current day drivers don't support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only [ 5.276449] qede a1:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0 [MBI 15.10.6] [eth3] [ 5.318671] [qed_rdma_set_pf_params:2076()]Current day drivers don't support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only [ 5.369548] qede a1:00.02: Storm FW 8.37.7.0, Management FW 8.52.9.0 [MBI 15.10.6] [eth4] [ 5.411645] [qed_rdma_set_pf_params:2076()]Current day drivers don't support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only Fixes: e0a8f9de16fc ("qed: Add iWARP enablement support") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22netdevsim: fix unbalaced locking in nsim_create()Taehee Yoo1-2/+2
In the nsim_create(), rtnl_lock() is called before nsim_bpf_init(). If nsim_bpf_init() is failed, rtnl_unlock() should be called, but it isn't called. So, unbalanced locking would occur. Fixes: e05b2d141fef ("netdevsim: move netdev creation/destruction to dev probe") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: dsa: microchip: call phy_remove_link_mode during probeHelmut Grohne3-23/+23
When doing "ip link set dev ... up" for a ksz9477 backed link, ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove 1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is called. Doing so reverts any previous change to advertised link modes e.g. using a udevd .link file. phy_remove_link_mode is not meant to be used while opening a link and should be called during phy probe when the link is not yet available to userspace. Therefore move the phy_remove_link_mode calls into ksz9477_switch_register. It indirectly calls dsa_register_switch, which creates the relevant struct phy_devices and we update the link modes right after that. At that time dev->features is already initialized by ksz9477_switch_detect. Remove phy_setup from ksz_dev_ops as no users remain. Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/ Fixes: 42fc6a4c613019 ("net: dsa: microchip: prepare PHY for proper advertisement") Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22Merge branch 'hns3-fixes'David S. Miller5-41/+38
Huazhong Tan says: ==================== net: hns3: fixes for -net There are some bugfixes for the HNS3 ethernet driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: hns3: fix return value error when query MAC link status failJian Shen2-27/+25
Currently, PF queries the MAC link status per second by calling function hclge_get_mac_link_status(). It return the error code when failed to send cmdq command to firmware. It's incorrect, because this return value is used as the MAC link status, which 0 means link down, and none-zero means link up. So fixes it. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: hns3: fix error handling for desc fillingYunsheng Lin2-0/+9
The content of the TX desc is automatically cleared by the HW when the HW has sent out the packet to the wire. When desc filling fails in hns3_nic_net_xmit(), it will call hns3_clear_desc() to do the error handling, which miss zeroing of the TX desc and the checking if a unmapping is needed. So add the zeroing and checking in hns3_clear_desc() to avoid the above problem. Also add DESC_TYPE_UNKNOWN to indicate the info in desc_cb is not valid, because hns3_nic_reclaim_desc() may treat the desc_cb->type of zero as packet and add to the sent pkt statistics accordingly. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: hns3: fix for not calculating TX BD send size correctlyYunsheng Lin2-3/+1
With GRO and fraglist support, the SKB can be aggregated to a total size of 65535, and when that SKB is forwarded through a bridge, the size of the SKB may be pushed to exceed the size of 65535 when br_dev_queue_push_xmit() is called. The max send size of BD supported by the HW is 65535, when a SKB with a headlen of over 65535 is sent to the driver, the driver needs to use multi BD to send the linear data, and the send size of the last BD is calculated incorrectly by the driver who is using '&' operation, which causes a TX error. Use '%' operation to fix this problem. Fixes: 3fe13ed95dd3 ("net: hns3: avoid mult + div op in critical data path") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: hns3: fix for not unmapping TX buffer correctlyYunsheng Lin1-11/+3
When a big TX buffer is sent using multi BD, the driver maps the whole TX buffer, and unmaps it using info in desc_cb corresponding to each BD, but only the info in the desc_cb of first BD is correct, other info in desc_cb is wrong, which causes TX unmapping problem when SMMU is on. Only set the mapping and freeing info in the desc_cb of first BD to fix this problem, because the TX buffer only need to be unmapped and freed once. Fixes: 1e8a7977d09f("net: hns3: add handling for big TX fragment") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: udp: Fix wrong clean up for IS_UDPLITE macroMiaohe Lin2-2/+2
We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is checked. Fixes: b2bf1e2659b1 ("[UDP]: Clean up for IS_UDPLITE macro") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net-sysfs: add a newline when printing 'tx_timeout' by sysfsXiongfeng Wang1-1/+1
When I cat 'tx_timeout' by sysfs, it displays as follows. It's better to add a newline for easy reading. root@syzkaller:~# cat /sys/devices/virtual/net/lo/queues/tx-0/tx_timeout 0root@syzkaller:~# Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22net: ethernet: ravb: exit if re-initialization fails in tx timeoutYoshihiro Shimoda1-2/+24
According to the report of [1], this driver is possible to cause the following error in ravb_tx_timeout_work(). ravb e6800000.ethernet ethernet: failed to switch device to config mode This error means that the hardware could not change the state from "Operation" to "Configuration" while some tx and/or rx queue are operating. After that, ravb_config() in ravb_dmac_init() will fail, and then any descriptors will be not allocaled anymore so that NULL pointer dereference happens after that on ravb_start_xmit(). To fix the issue, the ravb_tx_timeout_work() should check the return values of ravb_stop_dma() and ravb_dmac_init(). If ravb_stop_dma() fails, ravb_tx_timeout_work() re-enables TX and RX and just exits. If ravb_dmac_init() fails, just exits. [1] https://lore.kernel.org/linux-renesas-soc/20200518045452.2390-1-dirk.behme@de.bosch.com/ Reported-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22Merge branch 'udp-Fix-reuseport-selection-with-connected-sockets'David S. Miller3-12/+19
Kuniyuki Iwashima says: ==================== udp: Fix reuseport selection with connected sockets. This patch set addresses two issues which happen when both connected and unconnected sockets are in the same UDP reuseport group. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22udp: Improve load balancing for SO_REUSEPORT.Kuniyuki Iwashima2-12/+18
Currently, SO_REUSEPORT does not work well if connected sockets are in a UDP reuseport group. Then reuseport_has_conns() returns true and the result of reuseport_select_sock() is discarded. Also, unconnected sockets have the same score, hence only does the first unconnected socket in udp_hslot always receive all packets sent to unconnected sockets. So, the result of reuseport_select_sock() should be used for load balancing. The noteworthy point is that the unconnected sockets placed after connected sockets in sock_reuseport.socks will receive more packets than others because of the algorithm in reuseport_select_sock(). index | connected | reciprocal_scale | result --------------------------------------------- 0 | no | 20% | 40% 1 | no | 20% | 20% 2 | yes | 20% | 0% 3 | no | 20% | 40% 4 | yes | 20% | 0% If most of the sockets are connected, this can be a problem, but it still works better than now. Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets") CC: Willem de Bruijn <willemb@google.com> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-22udp: Copy has_conns in reuseport_grow().Kuniyuki Iwashima1-0/+1
If an unconnected socket in a UDP reuseport group connect()s, has_conns is set to 1. Then, when a packet is received, udp[46]_lib_lookup2() scans all sockets in udp_hslot looking for the connected socket with the highest score. However, when the number of sockets bound to the port exceeds max_socks, reuseport_grow() resets has_conns to 0. It can cause udp[46]_lib_lookup2() to return without scanning all sockets, resulting in that packets sent to connected sockets may be distributed to unconnected sockets. Therefore, reuseport_grow() should copy has_conns. Fixes: acdcecc61285 ("udp: correct reuseport selection with connected sockets") CC: Willem de Bruijn <willemb@google.com> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-21btrfs: fix mount failure caused by race with umountBoris Burkov1-0/+8
It is possible to cause a btrfs mount to fail by racing it with a slow umount. The crux of the sequence is generic_shutdown_super not yet calling sop->put_super before btrfs_mount_root calls btrfs_open_devices. If that occurs, btrfs_open_devices will decide the opened counter is non-zero, increment it, and skip resetting fs_devices->total_rw_bytes to 0. From here, mount will call sget which will result in grab_super trying to take the super block umount semaphore. That semaphore will be held by the slow umount, so mount will block. Before up-ing the semaphore, umount will delete the super block, resulting in mount's sget reliably allocating a new one, which causes the mount path to dutifully fill it out, and increment total_rw_bytes a second time, which causes the mount to fail, as we see double the expected bytes. Here is the sequence laid out in greater detail: CPU0 CPU1 down_write sb->s_umount btrfs_kill_super kill_anon_super(sb) generic_shutdown_super(sb); shrink_dcache_for_umount(sb); sync_filesystem(sb); evict_inodes(sb); // SLOW btrfs_mount_root btrfs_scan_one_device fs_devices = device->fs_devices fs_info->fs_devices = fs_devices // fs_devices-opened makes this a no-op btrfs_open_devices(fs_devices, mode, fs_type) s = sget(fs_type, test, set, flags, fs_info); find sb in s_instances grab_super(sb); down_write(&s->s_umount); // blocks sop->put_super(sb) // sb->fs_devices->opened == 2; no-op spin_lock(&sb_lock); hlist_del_init(&sb->s_instances); spin_unlock(&sb_lock); up_write(&sb->s_umount); return 0; retry lookup don't find sb in s_instances (deleted by CPU0) s = alloc_super return s; btrfs_fill_super(s, fs_devices, data) open_ctree // fs_devices total_rw_bytes improperly set! btrfs_read_chunk_tree read_one_dev // increment total_rw_bytes again!! super_total_bytes < fs_devices->total_rw_bytes // ERROR!!! To fix this, we clear total_rw_bytes from within btrfs_read_chunk_tree before the calls to read_one_dev, while holding the sb umount semaphore and the uuid mutex. To reproduce, it is sufficient to dirty a decent number of inodes, then quickly umount and mount. for i in $(seq 0 500) do dd if=/dev/zero of="/mnt/foo/$i" bs=1M count=1 done umount /mnt/foo& mount /mnt/foo does the trick for me. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21btrfs: fix page leaks after failure to lock page for delallocRobbie Ko1-1/+2
When locking pages for delalloc, we check if it's dirty and mapping still matches. If it does not match, we need to return -EAGAIN and release all pages. Only the current page was put though, iterate over all the remaining pages too. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21btrfs: qgroup: fix data leak caused by race between writeback and truncateQu Wenruo1-13/+10
[BUG] When running tests like generic/013 on test device with btrfs quota enabled, it can normally lead to data leak, detected at unmount time: BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096 ------------[ cut here ]------------ WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs] RIP: 0010:close_ctree+0x1dc/0x323 [btrfs] Call Trace: btrfs_put_super+0x15/0x17 [btrfs] generic_shutdown_super+0x72/0x110 kill_anon_super+0x18/0x30 btrfs_kill_super+0x17/0x30 [btrfs] deactivate_locked_super+0x3b/0xa0 deactivate_super+0x40/0x50 cleanup_mnt+0x135/0x190 __cleanup_mnt+0x12/0x20 task_work_run+0x64/0xb0 __prepare_exit_to_usermode+0x1bc/0x1c0 __syscall_return_slowpath+0x47/0x230 do_syscall_64+0x64/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ---[ end trace caf08beafeca2392 ]--- BTRFS error (device dm-3): qgroup reserved space leaked [CAUSE] In the offending case, the offending operations are: 2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0 2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0 The following sequence of events could happen after the writev(): CPU1 (writeback) | CPU2 (truncate) ----------------------------------------------------------------- btrfs_writepages() | |- extent_write_cache_pages() | |- Got page for 1003520 | | 1003520 is Dirty, no writeback | | So (!clear_page_dirty_for_io()) | | gets called for it | |- Now page 1003520 is Clean. | | | btrfs_setattr() | | |- btrfs_setsize() | | |- truncate_setsize() | | New i_size is 18388 |- __extent_writepage() | | |- page_offset() > i_size | |- btrfs_invalidatepage() | |- Page is clean, so no qgroup | callback executed This means, the qgroup reserved data space is not properly released in btrfs_invalidatepage() as the page is Clean. [FIX] Instead of checking the dirty bit of a page, call btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage(). As qgroup rsv are completely bound to the QGROUP_RESERVED bit of io_tree, not bound to page status, thus we won't cause double freeing anyway. Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21drm/amdgpu: Fix NULL dereference in dpm sysfs handlersPaweł Gronowski1-6/+3
NULL dereference occurs when string that is not ended with space or newline is written to some dpm sysfs interface (for example pp_dpm_sclk). This happens because strsep replaces the tmp with NULL if the delimiter is not present in string, which is then dereferenced by tmp[0]. Reproduction example: sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk' Signed-off-by: Paweł Gronowski <me@woland.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-07-21drm/amd/powerplay: fix a crash when overclocking Vega MQiu Wenbo1-4/+6
Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and vddci_voltage_table is empty. It has been tested on Intel Hades Canyon (i7-8809G). Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489 Fixes: ac7822b0026f ("drm/amd/powerplay: add smumgr support for VEGAM (v2)") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-07-21btrfs: fix double free on ulist after backref resolution failureFilipe Manana1-0/+1
At btrfs_find_all_roots_safe() we allocate a ulist and set the **roots argument to point to it. However if later we fail due to an error returned by find_parent_nodes(), we free that ulist but leave a dangling pointer in the **roots argument. Upon receiving the error, a caller of this function can attempt to free the same ulist again, resulting in an invalid memory access. One such scenario is during qgroup accounting: btrfs_qgroup_account_extents() --> calls btrfs_find_all_roots() passes &new_roots (a stack allocated pointer) to btrfs_find_all_roots() --> btrfs_find_all_roots() just calls btrfs_find_all_roots_safe() passing &new_roots to it --> allocates ulist and assigns its address to **roots (which points to new_roots from btrfs_qgroup_account_extents()) --> find_parent_nodes() returns an error, so we free the ulist and leave **roots pointing to it after returning --> btrfs_qgroup_account_extents() sees btrfs_find_all_roots() returned an error and jumps to the label 'cleanup', which just tries to free again the same ulist Stack trace example: ------------[ cut here ]------------ BTRFS: tree first key check failed WARNING: CPU: 1 PID: 1763215 at fs/btrfs/disk-io.c:422 btrfs_verify_level_key+0xe0/0x180 [btrfs] Modules linked in: dm_snapshot dm_thin_pool (...) CPU: 1 PID: 1763215 Comm: fsstress Tainted: G W 5.8.0-rc3-btrfs-next-64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:btrfs_verify_level_key+0xe0/0x180 [btrfs] Code: 28 5b 5d (...) RSP: 0018:ffffb89b473779a0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff90397759bf08 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000027 RDI: 00000000ffffffff RBP: ffff9039a419c000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffb89b43301000 R12: 000000000000005e R13: ffffb89b47377a2e R14: ffffb89b473779af R15: 0000000000000000 FS: 00007fc47e1e1000(0000) GS:ffff9039ac200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc47e1df000 CR3: 00000003d9e4e001 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: read_block_for_search+0xf6/0x350 [btrfs] btrfs_next_old_leaf+0x242/0x650 [btrfs] resolve_indirect_refs+0x7cf/0x9e0 [btrfs] find_parent_nodes+0x4ea/0x12c0 [btrfs] btrfs_find_all_roots_safe+0xbf/0x130 [btrfs] btrfs_qgroup_account_extents+0x9d/0x390 [btrfs] btrfs_commit_transaction+0x4f7/0xb20 [btrfs] btrfs_sync_file+0x3d4/0x4d0 [btrfs] do_fsync+0x38/0x70 __x64_sys_fdatasync+0x13/0x20 do_syscall_64+0x5c/0xe0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc47e2d72e3 Code: Bad RIP value. RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50 irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0 softirqs last enabled at (0): [<ffffffffb8eb5e85>] copy_process+0x755/0x1eb0 softirqs last disabled at (0): [<0000000000000000>] 0x0 ---[ end trace 8639237550317b48 ]--- BTRFS error (device sdc): tree first key mismatch detected, bytenr=62324736 parent_transid=94 key expected=(262,108,1351680) has=(259,108,1921024) general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b6b: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 2 PID: 1763215 Comm: fsstress Tainted: G W 5.8.0-rc3-btrfs-next-64 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:ulist_release+0x14/0x60 [btrfs] Code: c7 07 00 (...) RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840 FS: 00007fc47e1e1000(0000) GS:ffff9039ac600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8c1c0a51c8 CR3: 00000003d9e4e004 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ulist_free+0x13/0x20 [btrfs] btrfs_qgroup_account_extents+0xf3/0x390 [btrfs] btrfs_commit_transaction+0x4f7/0xb20 [btrfs] btrfs_sync_file+0x3d4/0x4d0 [btrfs] do_fsync+0x38/0x70 __x64_sys_fdatasync+0x13/0x20 do_syscall_64+0x5c/0xe0 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fc47e2d72e3 Code: Bad RIP value. RSP: 002b:00007fffa32098c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004b RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fc47e2d72e3 RDX: 00007fffa3209830 RSI: 00007fffa3209830 RDI: 0000000000000003 RBP: 000000000000072e R08: 0000000000000001 R09: 0000000000000003 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000003e8 R13: 0000000051eb851f R14: 00007fffa3209970 R15: 00005607c4ac8b50 Modules linked in: dm_snapshot dm_thin_pool (...) ---[ end trace 8639237550317b49 ]--- RIP: 0010:ulist_release+0x14/0x60 [btrfs] Code: c7 07 00 (...) RSP: 0018:ffffb89b47377d60 EFLAGS: 00010282 RAX: 6b6b6b6b6b6b6b6b RBX: ffff903959b56b90 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000270024 RDI: ffff9036e2adc840 RBP: ffff9036e2adc848 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff9036e2adc840 R13: 0000000000000015 R14: ffff9039a419ccf8 R15: ffff90395d605840 FS: 00007fc47e1e1000(0000) GS:ffff9039ad200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6a776f7d40 CR3: 00000003d9e4e002 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Fix this by making btrfs_find_all_roots_safe() set *roots to NULL after it frees the ulist. Fixes: 8da6d5815c592b ("Btrfs: added btrfs_find_all_roots()") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-07-21drm/amdgpu/sienna_cichlid: add SMU i2c support (v2)Alex Deucher1-0/+239
Enable SMU i2c bus access for sienna_cichlid asics. v2: change callback name Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu/navi1x: add SMU i2c support (v2)Alex Deucher1-0/+239
Enable SMU i2c bus access for navi1x asics. v2: add missing implementation Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu/swSMU: remove eeprom from the smu i2c handlers (v2)Alex Deucher4-36/+36
The driver uses it for EEPROM access, but it's just an i2c bus. v2: change the callback name as well. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu/vega20: enable the smu i2c bus for all boardsAlex Deucher1-7/+4
There is no longer a ras dependency so it's safe to expose on all boards. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu: remove eeprom from the smu i2c handlersAlex Deucher3-30/+30
The driver uses it for EEPROM access, but it's just an i2c bus. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu: move i2c bus lock out of ras structureAlex Deucher3-8/+4
It's not really ras related. It's just a lock for the bus in general. This removes the ras dependency from the smu i2c bus. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu: Fix NULL dereference in dpm sysfs handlersPaweł Gronowski1-6/+3
NULL dereference occurs when string that is not ended with space or newline is written to some dpm sysfs interface (for example pp_dpm_sclk). This happens because strsep replaces the tmp with NULL if the delimiter is not present in string, which is then dereferenced by tmp[0]. Reproduction example: sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk' Signed-off-by: Paweł Gronowski <me@woland.xyz> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: fix a crash when overclocking Vega MQiu Wenbo1-4/+6
Avoid kernel crash when vddci_control is SMU7_VOLTAGE_CONTROL_NONE and vddci_voltage_table is empty. It has been tested on Intel Hades Canyon (i7-8809G). Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208489 Fixes: ac7822b0026f ("drm/amd/powerplay: add smumgr support for VEGAM (v2)") Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Qiu Wenbo <qiuwenbo@phytium.com.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: retrieve VCN dpm table per instancesJiansong Chen1-30/+37
To accommodate VCN instances variance, otherwise it may trigger smu response error for configuration with less instances. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: update driver if version for navy_flounderJiansong Chen1-1/+1
It's in accordance with pmfw 65.3.0 for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/powerplay: fix typos for clk mapJiansong Chen1-2/+2
It should be DCLK1->PPCLK_DCLK_1 and VCLK->PPCLK_VCLK_0. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Acked-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amdgpu/vcn: merge shared memory into vcpuJames Zhu2-13/+6
Merge vcn firmware shared memory bo into vcn vcpu bo. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21Revert "drm/amdgpu/vcn: add shared memory restore after wake up from sleep."James Zhu2-28/+1
This reverts commit 21b704d78352c289d31697824ceea7ad0ff4ce59. To merge vcn firmware shared memory bo into vcn vcpu bo. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: 3.2.95Aric Cyr1-1/+1
Signed-off-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: interface to obtain minimum plane size capsIgor Kravchenko5-5/+18
[Why] Implement an interface to obtain plane size caps [How] Add min_width, min_height fields to dc_plane_cap structure. Set values to 16x16 for discrete ASICs, and 64x64 for others. Signed-off-by: Igor Kravchenko <Igor.Kravchenko@amd.com> Reviewed-by: Aric Cyr <Aric.Cyr@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: Add additional config guards for DCNAurabindo Pillai1-2/+4
[Why&How] Fix build error by protecting code with config guard to enable building amdgpu without CONFIG_DRM_AMD_DC_DCN enabled. This option is disabled by default for allmodconfig. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: Call dsc related functions indirectly via dc interfaceAurabindo Pillai1-1/+1
[Why&How] Accessing dcn20_add_dsc_to_stream_resource directly causes build failure for configuration which has CONFIG_DRM_AMD_DC_DCN disabled. Fix this by calling the corresponding function exposed via dc resource functions. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: Improve compatibility by re-ordering info-packetsNaveed Ashfaq1-4/+4
[why] On DCN20, Some features would not be activated when ALLM was turned on. TV seemed to activate only the latest info packet sent, and the ALLM info packet was sent after the VSIF info packet. The packet indices was also inconsistent between DCN10 and DCN20. [how] Change the packet indices of DCN20 to match those of DCN10. This makes them consistent and also makes the vendor info packet be sent after the hfvsif info packet. Signed-off-by: Naveed Ashfaq <Naveed.Ashfaq@amd.com> Reviewed-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-21drm/amd/display: [FW Promotion] Release 0.0.25Anthony Koo1-2/+2
Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>