kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2018-03-24	sfp: fix non-detection of PHY	Russell King	1	-4/+4
	[ Upstream commit 20b56ed9f8adfb9a7fb1c878878c54aa4ed645c1 ] The detection of a PHY changed in commit e98a3aabf85f ("mdio_bus: don't return NULL from mdiobus_scan()") which now causes sfp to print an error message. Update for this change. Fixes: 73970055450e ("sfp: add SFP module support") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	sfp: fix EEPROM reading in the case of non-SFF8472 SFPs	Russell King	1	-4/+3
	[ Upstream commit 2794ffc441dde3109804085dc745e8014a4de224 ] The EEPROM reading was trying to read from the second EEPROM address if we requested the last byte from the SFF8079 EEPROM, which caused a failure when the second EEPROM is not present. Discovered with a S-RJ01 SFP module. Fix this. Fixes: 73970055450e ("sfp: add SFP module support") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	net: phy: meson-gxl: check phy_write return value	Jerome Brunet	1	-12/+38
	[ Upstream commit 9042b46eda33ef5db3cdfc9e12b3c8cabb196141 ] Always check phy_write return values. Better to be safe than sorry Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	qmi_wwan: set FLAG_SEND_ZLP to avoid network initiated disconnect	Bjørn Mork	1	-2/+2
	[ Upstream commit 245d21190aec547c0de64f70c0e6de871c185a24 ] It has been reported that the dummy byte we add to avoid ZLPs can be forwarded by the modem to the PGW/GGSN, and that some operators will drop the connection if this happens. In theory, QMI devices are based on CDC ECM and should as such both support ZLPs and silently ignore the dummy byte. The latter assumption failed. Let's test out the first. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	ath10k: handling qos at STA side based on AP WMM enable/disable	Balaji Pothunoori	1	-1/+1
	[ Upstream commit 07ffb4497360ae8789f05555fec8171ee952304d ] Data packets are not sent by the STA when it has joined a non-QoS AP (WMM disabled AP). This happens because the STA sends data packets from the host to the firmware with QoS enabled along with a non-QoS queue value (TID = 16). Because QoS is enabled, the firmware discards the packet. This patch fixes the issue by updating the QoS setting based on the station WME capability field if WMM is disabled in the AP. This patch is required by 10.4 family chipsets like QCA4019/QCA9888/QCA9884/QCA99X0. Firmware Version: 10.4-3.5.1-00018. For 10.2.4 family chipsets QCA988X/QCA9887 and QCA6174 this patch has no effect. Signed-off-by: Balaji Pothunoori <bpothuno@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	rtlwifi: always initialize variables given to RT_TRACE()	Nicolas Iooss	1	-1/+1
	[ Upstream commit e4779162f7377baa9fb9a044555ecaae22c3f125 ] In rtl_rx_ampdu_apply(), when rtlpriv->cfg->ops->get_btc_status() returns false, RT_TRACE() is called with the values of variables reject_agg and agg_size, which have not been initialized. Always initialize these variables in order to prevent using uninitialized values. This issue has been found with clang. The compiler reported: drivers/net/wireless/realtek/rtlwifi/base.c:1665:6: error: variable 'agg_size' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (rtlpriv->cfg->ops->get_btc_status()) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/wireless/realtek/rtlwifi/base.c:1671:31: note: uninitialized use occurs here reject_agg, ctrl_agg_size, agg_size); ^~~~~~~~ drivers/net/wireless/realtek/rtlwifi/base.c:1665:6: error: variable 'reject_agg' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (rtlpriv->cfg->ops->get_btc_status()) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/wireless/realtek/rtlwifi/base.c:1671:4: note: uninitialized use occurs here reject_agg, ctrl_agg_size, agg_size); ^~~~~~~~~~ Fixes: 2635664e6e4a ("rtlwifi: Add rx ampdu cfg for btcoexist.") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	rtlwifi: rtl_pci: Fix the bug when inactiveps is enabled.	Tsang-Shian Lin	1	-0/+7
	[ Upstream commit b7573a0a27bfa8270dea9b145448f6884b7cacc1 ] Reset the driver current tx read/write index to zero when inactiveps nic out of sync with HW state. Wrong driver tx read/write index will cause Tx fail. Signed-off-by: Tsang-Shian Lin <thlin@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Yan-Hsuan Chuang <yhchuang@realtek.com> Cc: Birming Chiu <birming@realtek.com> Cc: Shaofu <shaofu@realtek.com> Cc: Steven Ting <steventing@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24	hv_netvsc: Fix the TX/RX buffer default sizes	Haiyang Zhang	2	-5/+12
	[ Upstream commit 41f61db2cd24d5ad802386719cccde1479aa82a6 ] The values were not computed correctly. There is no significant visible impact, though. The intended size of RX buffer is 16 MB, and the default slot size is 1728. So, NETVSC_DEFAULT_RX should be 16*1024*1024 / 1728 = 9709. The intended size of TX buffer is 1 MB, and the slot size is 6144. So, NETVSC_DEFAULT_TX should be 1024*1024 / 6144 = 170. The patch puts the formula directly into the macro, and moves them to hyperv_net.h, together with related macros. Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
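The arithmetic in the hv_netvsc commit message above can be checked with a short sketch. The buffer and slot sizes are taken from the message; the helper name `default_slots` is hypothetical and only illustrates C-style integer division, it is not the macro from hyperv_net.h.

```python
# Sizes as stated in the commit message above.
RX_BUF_BYTES = 16 * 1024 * 1024   # intended RX buffer: 16 MB
RX_SLOT_BYTES = 1728              # default RX slot size
TX_BUF_BYTES = 1024 * 1024        # intended TX buffer: 1 MB
TX_SLOT_BYTES = 6144              # TX slot size


def default_slots(buf_bytes: int, slot_bytes: int) -> int:
    """Whole slots that fit in the buffer (integer division, as in C)."""
    return buf_bytes // slot_bytes


netvsc_default_rx = default_slots(RX_BUF_BYTES, RX_SLOT_BYTES)
netvsc_default_tx = default_slots(TX_BUF_BYTES, TX_SLOT_BYTES)
print(netvsc_default_rx, netvsc_default_tx)  # 9709 170
```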
2018-03-24	hv_netvsc: Fix the receive buffer size limit	Haiyang Zhang	2	-2/+9
	[ Upstream commit 11b2b653102571ac791885324371d9a1a17b900e ] The max should be 31 MB on hosts with NVSP version > 2. On legacy hosts (NVSP version <= 2) only a 15 MB receive buffer is allowed; otherwise the buffer request will be rejected by the host, resulting in the vNIC not coming up. The NVSP version is only available after negotiation. So, we add the limit checking for legacy hosts in netvsc_init_buf(). Fixes: 5023a6db73196 ("netvsc: increase default receive buffer size") Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
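A minimal sketch of the version-dependent limit described above. The 31 MB and 15 MB limits come from the commit message; the function name, the plain-integer version numbers, and the choice to clamp the request (rather than fail it) are assumptions made for illustration and may not match netvsc_init_buf().

```python
MB = 1024 * 1024
MAX_RX_BUF_BYTES = 31 * MB           # hosts with NVSP version > 2
MAX_RX_BUF_BYTES_LEGACY = 15 * MB    # legacy hosts, NVSP version <= 2


def clamp_rx_buf_size(requested_bytes: int, nvsp_version: int) -> int:
    """Hypothetical helper: cap the request at the limit for this host.

    nvsp_version is a simplified integer; the real protocol encodes
    versions differently.
    """
    if nvsp_version <= 2:
        limit = MAX_RX_BUF_BYTES_LEGACY
    else:
        limit = MAX_RX_BUF_BYTES
    return min(requested_bytes, limit)


# A 16 MB default fits on a newer host but is reduced on a legacy one.
print(clamp_rx_buf_size(16 * MB, 5) // MB, clamp_rx_buf_size(16 * MB, 2) // MB)
```

The limit can only be applied after negotiation, because that is when the version becomes known.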
2018-03-19	ipvlan: add L2 check for packets arriving via virtual devices	Mahesh Bandewar	1	-0/+4
	[ Upstream commit 92ff42645028fa6f9b8aa767718457b9264316b4 ] Packets that don't have dest mac as the mac of the master device should not be entertained by the IPvlan rx-handler. This is mostly true as the packet path mostly takes care of that, except when the master device is a virtual device. As demonstrated in the following case - ip netns add ns1 ip link add ve1 type veth peer name ve2 ip link add link ve2 name iv1 type ipvlan mode l2 ip link set dev iv1 netns ns1 ip link set ve1 up ip link set ve2 up ip -n ns1 link set iv1 up ip addr add 192.168.10.1/24 dev ve1 ip -n ns1 addr 192.168.10.2/24 dev iv1 ping -c2 192.168.10.2 <Works!> ip neigh show dev ve1 ip neigh show 192.168.10.2 lladdr <random> dev ve1 ping -c2 192.168.10.2 <Still works! Wrong!!> This patch adds that missing check in the IPvlan rx-handler. Reported-by: Amit Sikka <amit.sikka@ericsson.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	mac80211_hwsim: enforce PS_MANUAL_POLL to be set after PS_ENABLED	Adiel Aloni	1	-6/+11
	[ Upstream commit e16ea4bb516bc21ea2202f2107718b29218bea59 ] Enforce using PS_MANUAL_POLL in ps hwsim debugfs to trigger a poll, only if PS_ENABLED was set before. This is required due to commit c9491367b759 ("mac80211: always update the PM state of a peer on MGMT / DATA frames") that enforces the ap to check only mgmt/data frames ps bit, and then update station's power save accordingly. When sending only ps-poll (control frame) the ap will not be aware that the station entered power save. Setting ps enable before triggering ps_poll, will send NDP with PM bit enabled first. Signed-off-by: Adiel Aloni <adiel.aloni@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	veth: set peer GSO values	Stephen Hemminger	1	-0/+3
	[ Upstream commit 72d24955b44a4039db54a1c252b5031969eeaac3 ] When new veth is created, and GSO values have been configured on one device, clone those values to the peer. For example: # ip link add dev vm1 gso_max_size 65530 type veth peer name vm2 This should create vm1 <--> vm2 with both having GSO maximum size set to 65530. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	virtio_net: Disable interrupts if napi_complete_done rescheduled napi	Toshiaki Makita	1	-3/+6
	[ Upstream commit fdaa767aefc1685f9a41e91f447c9aea94103df6 ] Since commit 39e6c8208d7b ("net: solve a NAPI race") napi has been able to be rescheduled within napi_complete_done() even in non-busypoll case, but virtnet_poll() always enabled interrupts before complete, and when napi was rescheduled within napi_complete_done() it did not disable interrupts. This caused more interrupts when event idx is disabled. According to commit cbdadbbf0c79 ("virtio_net: fix race in RX VQ processing") we cannot place virtqueue_enable_cb_prepare() after NAPI_STATE_SCHED is cleared, so disable interrupts again if napi_complete_done() returned false. Tested with vhost-user of OVS 2.7 on host, which does not have the event idx feature. * Before patch: $ netperf -t UDP_STREAM -H 192.168.150.253 -l 60 -- -m 1472 MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.150.253 () port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 1472 60.00 32763206 0 6430.32 212992 60.00 23384299 4589.56 Interrupts on guest: 9872369 Packets/interrupt: 2.37 * After patch $ netperf -t UDP_STREAM -H 192.168.150.253 -l 60 -- -m 1472 MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.150.253 () port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 212992 1472 60.00 32794646 0 6436.49 212992 60.00 32793501 6436.27 Interrupts on guest: 4941299 Packets/interrupt: 6.64 Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
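The packets-per-interrupt figures quoted in the virtio_net commit message can be reproduced from the numbers it reports. This sketch assumes the ratio uses the second netperf row (messages received by the guest) divided by the guest interrupt count; the function name is hypothetical.

```python
def packets_per_interrupt(received_messages: int, interrupts: int) -> float:
    """Average number of received packets handled per guest interrupt."""
    return received_messages / interrupts


# Values copied from the netperf output and interrupt counts above.
before = packets_per_interrupt(23384299, 9872369)
after = packets_per_interrupt(32793501, 4941299)
print(round(before, 2), round(after, 2))  # 2.37 6.64
```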
2018-03-19	bnxt_en: Don't print "Link speed -1 no longer supported" messages.	Michael Chan	1	-3/+7
	[ Upstream commit a8168b6cee6e9334dfebb4b9108e8d73794f6088 ] On some dual port NICs, the 2 ports have to be configured with compatible link speeds. Under some conditions, a port's configured speed may no longer be supported. The firmware will send a message to the driver when this happens. Improve this logic that prints out the warning by only printing it if we can determine the link speed that is no longer supported. If the speed is unknown or it is in autoneg mode, skip the warning message. Reported-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Tested-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	ath10k: fix invalid STS_CAP_OFFSET_MASK	Ben Greear	1	-1/+2
	[ Upstream commit 8cec57f5277ef0e354e37a0bf909dc71bc1f865b ] The 10.4 firmware defines this as a 3-bit field, as does the mac80211 stack. The 4th bit is defined as CONF_IMPLICIT_BF at least in the firmware header I have seen. This patch fixes the ath10k wmi header to match the firmware. Signed-off-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
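To illustrate why the width of the mask matters, the sketch below extracts a field with a 3-bit and a 4-bit mask. The bit positions and the sample word are hypothetical; only the idea that the fourth bit belongs to a different flag comes from the commit message.

```python
MASK_3BIT = 0x7   # three low bits: the field itself
MASK_4BIT = 0xF   # too wide: also picks up the neighbouring flag bit


def extract_field(word: int, mask: int, shift: int = 0) -> int:
    """Return the bits of `word` selected by `mask` after shifting right."""
    return (word >> shift) & mask


# Hypothetical word: field value 5 (0b101) with the adjacent flag bit set.
word = 0b1101
print(extract_field(word, MASK_3BIT), extract_field(word, MASK_4BIT))  # 5 13
```

With the wider mask the neighbouring flag leaks into the field value.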
2018-03-19	mwifiex: cfg80211: do not change virtual interface during scan processing	Limin Zhu	1	-0/+6
	[ Upstream commit c61cfe49f0f0f0d1f8b56d0b045838d597e8c3a3 ] (1) The change virtual interface operation in the cfg80211 process resets and reinitializes the private data structure. (2) A scan result event processed in the main process dereferences the private data structure concurrently, which occasionally crashes the kernel. The corner case can be triggered by the steps below: (1) wpa_cli mlan0 scan (2) ./hostapd mlan0.conf The cfg80211 asynchronous scan procedure does not always run under the rtnl lock, so add protection here to serialize the cfg80211 scan and change_virtual interface operations. Signed-off-by: Limin Zhu <liminzhu@marvell.com> Signed-off-by: Xinming Hu <huxm@marvell.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	bnxt_en: Uninitialized variable in bnxt_tc_parse_actions()	Dan Carpenter	1	-4/+1
	[ Upstream commit 92425c40676d498efccae6fecdb8f8e4dcf7e4a4 ] Smatch warns that: drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c:160 bnxt_tc_parse_actions() error: uninitialized symbol 'rc'. "rc" is either uninitialized or set to zero here so we can just remove the check. Fixes: 8c95f773b4a3 ("bnxt_en: add support for Flower based vxlan encap/decap offload") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	iwlwifi: mvm: avoid dumping assert log when device is stopped	Sara Sharon	1	-0/+6
	[ Upstream commit 6362ab721ef5c4ecfa01f53ad4137d3d984f0c6c ] We might erroneously get to error dumping code when the device is already stopped. In that case the driver will detect a defective value and will try to reset the HW, assuming it is only a bus issue. The driver then proceeds with the dumping. The result has two side effects: 1. The device won't be stopped again, since the transport status is already stopped, so the device remains powered on while it actually should be stopped. 2. The dump in that case is complete garbage and useless. Detect and avoid this. It will also make debugging such issues easier. Signed-off-by: Sara Sharon <sara.sharon@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	ath10k: update tdls teardown state to target	Manikanta Pubbisetty	1	-0/+10
	[ Upstream commit 424ea0d174e82365f85c6770225dba098b8f1d5f ] It is required to update the teardown state of the peer when a tdls link with that peer is terminated. This information is useful for the target to perform some cleanups wrt the tdls peer. Without proper cleanup, target assumes that the peer is connected and blocks future connection requests, updating the teardown state of the peer addresses the problem. Tested this change on QCA9888 with 10.4-3.5.1-00018 fw version. Signed-off-by: Manikanta Pubbisetty <mpubbise@qti.qualcomm.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	net: thunderx: Set max queue count taking XDP_TX into account	Sunil Goutham	1	-0/+5
	[ Upstream commit 87de083857aa269fb171ef0b39696b2888361c58 ] on T81 there are only 4 cores, hence setting max queue count to 4 would leave nothing for XDP_TX. This patch fixes this by doubling max queue count in above scenarios. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: cjacob <cjacob@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	net: ieee802154: adf7242: Fix bug if defined DEBUG	Michael Hennerich	1	-2/+2
	[ Upstream commit 388b3b2b03701f3b3c10975c272892d7f78080df ] This fixes undefined reference to struct adf7242_local *lp in case DEBUG is defined. Signed-off-by: Michael Hennerich <michael.hennerich@analog.com> Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-19	iwlwifi: mvm: rs: don't override the rate history in the search cycle	Emmanuel Grumbach	1	-3/+1
	[ Upstream commit 992172e3aec19e5b0ea5b757ba40a146b9282d1e ] When we are in a search cycle, we try different combinations of parameters. Those combinations are called 'columns'. When we switch to a new column, we first need to check if this column has a suitable rate; if not, we can't try it. This means we must not erase the statistics we gathered for the previous column until we are sure that we are indeed switching column. The code that tries to switch to a new column first sets a whole bunch of things for the new column, and only then checks that we can find suitable rates in that column. While doing that, the code mistakenly erased the rate statistics. This code was right until struct iwl_scale_tbl_info grew up for TPC. Fix this to make sure we don't erase the rate statistics until we are sure that we can indeed switch to the new column. Note that this bug is really harmless since it causes a change in the behavior only when we can't find any rate in the new column, which should really not happen. In the case we do find a suitable rate, we reset the rate statistics a few lines later anyway. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-15	mac80211_hwsim: don't use WQ_MEM_RECLAIM	Johannes Berg	1	-1/+1
	commit ce162bfbc0b601841886965baba14877127c7c7c upstream. We're obviously not part of a memory reclaim path, so don't set the flag. This also causes a warning in check_flush_dependency() since we end up in a code path that flushes a non-reclaim workqueue, and we shouldn't do that if we were really part of reclaim. Reported-by: syzbot+41cdaf4232c50e658934@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net: phy: Restore phy_resume() locking assumption	Andrew Lunn	2	-6/+14
	[ Upstream commit 9c2c2e62df3fa30fb13fbeb7512a4eede729383b ] Commit f5e64032a799 ("net: phy: fix resume handling") changes the locking semantics for phy_resume() such that the caller now needs to hold the phy mutex. Not all call sites were adapted to this new semantic, resulting in warnings from the added WARN_ON(!mutex_is_locked(&phydev->lock)). Rather than change the semantics, add a __phy_resume() and restore the old behavior of phy_resume(). Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Fixes: f5e64032a799 ("net: phy: fix resume handling") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net/mlx5: Fix error handling when adding flow rules	Vlad Buslov	1	-2/+8
	[ Upstream commit 9238e380e823a39983ee8d6b6ee8d1a9c4ba8a65 ] If building match list or adding existing fg fails when node is locked, function returned without unlocking it. This happened if node version changed or adding existing fg returned with EAGAIN after jumping to search_again_locked label. Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	cxgb4: fix trailing zero in CIM LA dump	Rahul Lakkireddy	2	-2/+2
	[ Upstream commit e6f02a4d57cc438099bc8abfba43ba1400d77b38 ] Set correct size of the CIM LA dump for T6. Fixes: 27887bc7cb7f ("cxgb4: collect hardware LA dumps") Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	virtio-net: disable NAPI only when enabled during XDP set	Jason Wang	1	-3/+5
	[ Upstream commit 4e09ff5362843dff3accfa84c805c7f3a99de9cd ] We try to disable NAPI to prevent a single XDP TX queue being used by multiple cpus. But we don't check if the device is up (NAPI is enabled), which could result in a stall because of an infinite wait in napi_disable(). Fix this by checking the device state through netif_running() first. Fixes: 4941d472bf95b ("virtio-net: do not reset during XDP set") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	tuntap: disable preemption during XDP processing	Jason Wang	1	-0/+6
	[ Upstream commit 23e43f07f896f8578318cfcc9466f1e8b8ab21b6 ] Except for tuntap, all other drivers' XDP was implemented in the NAPI poll() routine in a bh. This guarantees that all XDP operations are done on the same CPU, which is required by e.g. BPF_MAP_TYPE_PERCPU_ARRAY. But for tuntap, we do it in process context and we try to protect XDP processing by the RCU reader lock. This is insufficient since CONFIG_PREEMPT_RCU can preempt the RCU reader critical section, which breaks the assumption that all XDP is processed on the same CPU. Fix this by simply disabling preemption during XDP processing. Fixes: 761876c857cb ("tap: XDP support") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	tuntap: correctly add the missing XDP flush	Jason Wang	1	-0/+1
	[ Upstream commit 1bb4f2e868a2891ab8bc668b8173d6ccb8c4ce6f ] We don't flush batched XDP packets through xdp_do_flush_map(); this will cause packets to stall at the TX queue. Considering that we don't do XDP on NAPI poll(), the only possible fix is to call xdp_do_flush_map() immediately after xdp_do_redirect(). Note that this in fact won't try to batch packets through devmap; we could address that in the future. Reported-by: Christoffer Dall <christoffer.dall@linaro.org> Fixes: 761876c857cb ("tap: XDP support") Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	mlxsw: spectrum_router: Do not unconditionally clear route offload indication	Ido Schimmel	1	-0/+3
	[ Upstream commit d1c95af366961101819f07e3c64d44f3be7f0367 ] When mlxsw replaces (or deletes) a route it removes the offload indication from the replaced route. This is problematic for IPv4 routes, as the offload indication is stored in the fib_info which is usually shared between multiple routes. Instead of unconditionally clearing the offload indication, only clear it if no other route is using the fib_info. Fixes: 3984d1a89fe7 ("mlxsw: spectrum_router: Provide offload indication using nexthop flags") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Alexander Petrovskiy <alexpe@mellanox.com> Tested-by: Alexander Petrovskiy <alexpe@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	amd-xgbe: Restore PCI interrupt enablement setting on resume	Tom Lendacky	1	-0/+2
	[ Upstream commit cfd092f2db8b4b6727e1c03ef68a7842e1023573 ] After resuming from suspend, the PCI device support must re-enable the interrupt setting so that interrupts are actually delivered. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net/mlx5e: Verify inline header size do not exceed SKB linear size	Eran Ben Elisha	1	-1/+1
	[ Upstream commit f600c6088018d1dbc5777d18daa83660f7ea4a64 ] Driver tries to copy at least MLX5E_MIN_INLINE bytes into the control segment of the WQE. It assumes that the linear part contains at least MLX5E_MIN_INLINE bytes, which can be wrong. Cited commit verified that driver will not copy more bytes into the inline header part than the actual size of the packet. Re-factor this check to make sure we do not exceed the linear part as well. This fix is aligned with the current driver's assumption that the entire L2 will be present in the linear part of the SKB. Fixes: 6aace17e64f4 ("net/mlx5e: Fix inline header size for small packets") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	mlxsw: spectrum_router: Fix error path in mlxsw_sp_vr_create	Jiri Pirko	1	-14/+18
	[ Upstream commit 0f2d2b2736b08dafa3bde31d048750fbc8df3a31 ] Since mlxsw_sp_fib_create() and mlxsw_sp_mr_table_create() use ERR_PTR macro to propagate int err through return of a pointer, the return value is not NULL in case of failure. So if one of the calls fails, one of vr->fib4, vr->fib6 or vr->mr4_table is not NULL and mlxsw_sp_vr_is_used wrongly assumes that vr is in use which leads to crash like following one: [ 1293.949291] BUG: unable to handle kernel NULL pointer dereference at 00000000000006c9 [ 1293.952729] IP: mlxsw_sp_mr_table_flush+0x15/0x70 [mlxsw_spectrum] Fix this by using local variables to hold the pointers and set vr->* only in case everything went fine. Fixes: 76610ebbde18 ("mlxsw: spectrum_router: Refactor virtual router handling") Fixes: a3d9bc506d64 ("mlxsw: spectrum_router: Extend virtual routers with IPv6 support") Fixes: d42b0965b1d4 ("mlxsw: spectrum_router: Add multicast routes notification handling functionality") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net/mlx5e: Fix loopback self test when GRO is off	Inbar Karmy	1	-1/+2
	[ Upstream commit ef7a3518f7dd4f4cf5e5b5358c93d1eb78df28fb ] When GRO is off, the transport header pointer in sk_buff is initialized to network's header. To find the udp header, instead of using udp_hdr() which assumes skb_network_header was set, manually calculate the udp header offset. Fixes: 0952da791c97 ("net/mlx5e: Add support for loopback selftest") Signed-off-by: Inbar Karmy <inbark@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net: phy: fix phy_start to consider PHY_IGNORE_INTERRUPT	Heiner Kallweit	1	-1/+1
	[ Upstream commit 08f5138512180a479ce6b9d23b825c9f4cd3be77 ] This condition wasn't adjusted when PHY_IGNORE_INTERRUPT (-2) was added long ago. In case of PHY_IGNORE_INTERRUPT the MAC interrupt indicates also PHY state changes and we should do what the symbol says. Fixes: 84a527a41f38 ("net: phylib: fix interrupts re-enablement in phy_start") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net/mlx5e: Specify numa node when allocating drop rq	Gal Pressman	1	-2/+8
	[ Upstream commit 2f0db87901698cd73d828cc6fb1957b8916fc911 ] When allocating a drop rq, no numa node is explicitly set which means allocations are done on node zero. This is not necessarily the nearest numa node to the HCA, and even worse, might even be a memoryless numa node. Choose the numa_node given to us by the pci device in order to properly allocate the coherent dma memory instead of assuming zero is valid. Fixes: 556dd1b9c313 ("net/mlx5e: Set drop RQ's necessary parameters only") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	mlxsw: spectrum_switchdev: Check success of FDB add operation	Shalom Toledo	1	-2/+27
	[ Upstream commit 0a8a1bf17e3af34f1f8d2368916a6327f8b3bfd5 ] Until now, we assumed that in case of error when adding FDB entries, the write operation will fail, but this is not the case. Instead, we need to check that the number of entries reported in the response is equal to the number of entries specified in the request. Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Reported-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net/mlx5e: Fix TCP checksum in LRO buffers	Gal Pressman	1	-14/+35
	[ Upstream commit 8babd44d2079079f9d5a4aca7005aed80236efe0 ] When receiving an LRO packet, the checksum field is set by the hardware to the checksum of the first coalesced packet. Obviously, this checksum is not valid for the merged LRO packet and should be fixed. We can use the CQE checksum which covers the checksum of the entire merged packet TCP payload to help us calculate the checksum incrementally. Tested by sending IPv4/6 traffic with LRO enabled, RX checksum disabled and watching nstat checksum error counters (in addition to the obvious bandwidth drop caused by checksum errors). This bug is usually "hidden" since LRO packets would go through the CHECKSUM_UNNECESSARY flow which does not validate the packet checksum. It's important to note that previous to this patch, LRO packets provided with CHECKSUM_UNNECESSARY are indeed packets with a correct validated checksum (even though the checksum inside the TCP header is incorrect), since the hardware LRO aggregation is terminated upon receiving a packet with bad checksum. Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	ppp: prevent unregistered channels from connecting to PPP units	Guillaume Nault	1	-0/+9
	[ Upstream commit 77f840e3e5f09c6d7d727e85e6e08276dd813d11 ] PPP units don't hold any reference on the channels connected to them. It is the channel's responsibility to ensure that it disconnects from its unit before being destroyed. In practice, this is ensured by ppp_unregister_channel() disconnecting the channel from the unit before dropping a reference on the channel. However, it is possible for an unregistered channel to connect to a PPP unit: register a channel with ppp_register_net_channel(), attach a /dev/ppp file to it with ioctl(PPPIOCATTCHAN), unregister the channel with ppp_unregister_channel() and finally connect the /dev/ppp file to a PPP unit with ioctl(PPPIOCCONNECT). Once in this situation, the channel is only held by the /dev/ppp file, which can be released at any time and free the channel without letting the parent PPP unit know. Then the ppp structure ends up with dangling pointers in its ->channels list. Prevent this scenario by forbidding unregistered channels from connecting to PPP units. This maintains the code logic by keeping ppp_unregister_channel() responsible for disconnecting the channel if necessary and avoids modifying the reference counting mechanism. This issue seems to predate git history (successfully reproduced on Linux 2.6.26 and earlier PPP commits are unrelated). Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	net: ethernet: ti: cpsw: fix net watchdog timeout	Grygorii Strashko	1	-2/+14
	[ Upstream commit 62f94c2101f35cd45775df00ba09bde77580e26a ] It was discovered that a simple program which indefinitely sends 200b UDP packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a network watchdog timeout in the TI CPSW driver (<6 hours run). The network watchdog timeout is triggered by a race between cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI]:
	cpsw_ndo_start_xmit()
		if (unlikely(!cpdma_check_free_tx_desc(txch))) {
			txq = netdev_get_tx_queue(ndev, q_idx);
			netif_tx_stop_queue(txq);
			^^ as per [1] a barrier has to be used after set_bit(), otherwise the new value might not be visible to other CPUs
		}
	cpsw_tx_handler()
		if (unlikely(netif_tx_queue_stopped(txq)))
			netif_tx_wake_queue(txq);
	When the race is hit, the ndev TX queue stays disabled forever while the driver's HW TX queue is empty. Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue() calls and by double-checking for free TX descriptors after stopping the ndev TX queue: if there are free TX descriptors, wake up the ndev TX queue. [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
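	The stop, barrier, re-check pattern described in this commit can be sketched in user-space C. This is only an illustration under stated assumptions, not the driver code: struct txq_model and its helpers are hypothetical names, the queue state is reduced to a flag and a counter, and a C11 sequentially consistent fence stands in for smp_mb__after_atomic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of one TX queue: a "stopped" flag plus a count of
 * free hardware descriptors. Not the real cpsw data structures. */
struct txq_model {
	atomic_bool stopped;
	atomic_int free_desc;
};

static void txq_init(struct txq_model *q, int free_desc)
{
	atomic_init(&q->stopped, false);
	atomic_init(&q->free_desc, free_desc);
}

/* Transmit side: returns true if the queue is left stopped. */
static bool xmit_check(struct txq_model *q)
{
	if (atomic_load(&q->free_desc) > 0)
		return false;

	atomic_store(&q->stopped, true);
	/* Stand-in for smp_mb__after_atomic(): make the stop visible
	 * before the descriptor count is read again. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Double check: completion may have freed descriptors meanwhile. */
	if (atomic_load(&q->free_desc) > 0) {
		atomic_store(&q->stopped, false);
		return false;
	}
	return true;
}

/* Completion side: free n descriptors, then wake the queue if stopped. */
static void tx_complete(struct txq_model *q, int n)
{
	assert(n > 0);
	atomic_fetch_add(&q->free_desc, n);
	if (atomic_load(&q->stopped))
		atomic_store(&q->stopped, false);
}
```

	With zero free descriptors xmit_check() leaves the queue stopped; a later tx_complete() wakes it again. The re-check after the fence covers the window in which completion runs between the first test and the stop.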
2018-03-09	net: amd-xgbe: fix comparison to bitshift when dealing with a mask	Wolfram Sang	1	-1/+1
	[ Upstream commit a3276892db7a588bedc33168e502572008f714a9 ] Due to a typo, the mask was destroyed by a comparison instead of a bit shift. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
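	The class of typo described here is easy to reproduce in isolation. The sketch below is a generic illustration, not the amd-xgbe code; mask_buggy() and mask_fixed() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Typo: `1 < bit` is a comparison, so the result is only ever 0 or 1. */
static uint32_t mask_buggy(unsigned int bit)
{
	return (uint32_t)(1 < bit);
}

/* Intended: shift a single set bit into position `bit`. */
static uint32_t mask_fixed(unsigned int bit)
{
	assert(bit < 32);	/* shifting by >= 32 would be undefined */
	return UINT32_C(1) << bit;
}
```

	For bit 5 the comparison form yields 1 while the shift form yields 0x20, so any value ANDed with the buggy "mask" keeps at most its lowest bit.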
2018-03-09	hdlc_ppp: carrier detect ok, don't turn off negotiation	Denis Du	1	-1/+4
	[ Upstream commit b6c3bad1ba83af1062a7ff6986d9edc4f3d7fc8e ] Sometimes a physical line carries just enough noise to make the protocol handshake fail while carrier detect remains good. After the noise is removed, nothing triggers the protocol to start again, so the link never comes back. The fix is not to terminate the protocol handshake while the carrier is still on. Signed-off-by: Denis Du <dudenis2000@yahoo.ca> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-09	ixgbe: fix crash in build_skb Rx code path	Emil Tantilov	1	-0/+8
	commit 0c5661ecc5dd7ce296870a3eb7b62b1b280a5e89 upstream. Add check for build_skb enabled ring in ixgbe_dma_sync_frag(). In that case &skb_shinfo(skb)->frags[0] may not always be set which can lead to a crash. Instead we derive the page offset from skb->data. Fixes: 42073d91a214 ("ixgbe: Have the CPU take ownership of the buffers sooner") CC: stable <stable@vger.kernel.org> Reported-by: Ambarish Soman <asoman@redhat.com> Suggested-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28	net: thunderbolt: Run disconnect flow asynchronously when logout is received	Mika Westerberg	1	-1/+13
	commit 027d351c541744c0c780dd5801c63e4b90750b90 upstream. The control channel calls registered callbacks when control messages such as XDomain protocol messages are received. The control channel handling is done in a worker running on system workqueue which means the networking driver can't run tear down flow which includes sending disconnect request and waiting for a reply in the same worker. Otherwise reply is never received (as the work is already running) and the operation times out. To fix this run disconnect ThunderboltIP flow asynchronously once ThunderboltIP logout message is received. Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28	net: thunderbolt: Tear down connection properly on suspend	Mika Westerberg	1	-4/+1
	commit 8e021a14d908475fea89ef85b5421865f7ad650d upstream. When suspending to mem or disk the Thunderbolt controller typically goes down as well tearing down the connection automatically. However, when suspend to idle is used this does not happen so we need to make sure the connection is properly disconnected before it can be re-established during resume. Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable") Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-28	PCI/cxgb4: Extend T3 PCI quirk to T4+ devices	Casey Leedom	1	-10/+0
	commit 7dcf688d4c78a18ba9538b2bf1b11dc7a43fe9be upstream. We've run into a problem where our device is attached to a Virtual Machine and the use of the new pci_set_vpd_size() API doesn't help. The VM kernel has been informed that the accesses are okay, but all of the actual VPD Capability Accesses are trapped down into the KVM Hypervisor where it goes ahead and imposes the silent denials. The right idea is to follow the kernel.org commit 1c7de2b4ff88 ("PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored to establish a PCI Quirk for our T3-based adapters. This commit extends that PCI Quirk to cover Chelsio T4 devices and later. The advantage of this approach is that the VPD Size gets set early in the Base OS/Hypervisor Boot and doesn't require that the cxgb4 driver even be available in the Base OS/Hypervisor. Thus PF4 can be exported to a Virtual Machine and everything should work. Fixes: 67e658794ca1 ("cxgb4: Set VPD size so we can read both VPD structures") Cc: <stable@vger.kernel.org> # v4.9+ Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25	tun: fix tun_napi_alloc_frags() frag allocator	Eric Dumazet	1	-10/+6
	commit 43a08e0f58b3f236165029710a4e3b303815253b upstream. <Mark Rutland reported> While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a misaligned atomic in __skb_clone: atomic_inc(&(skb_shinfo(skb)->dataref)); where dataref doesn't have the required natural alignment, and the atomic operation faults. e.g. i often see it aligned to a single byte boundary rather than a four byte boundary. AFAICT, the skb_shared_info is misaligned at the instant it's allocated in __napi_alloc_skb() __napi_alloc_skb() </end of report> Problem is caused by tun_napi_alloc_frags() using napi_alloc_frag() with user provided seg sizes, leading to other users of this API getting unaligned page fragments. Since we would like to not necessarily add paddings or alignments to the frags that tun_napi_alloc_frags() attaches to the skb, switch to another page frag allocator. As a bonus skb_page_frag_refill() can use GFP_KERNEL allocations, meaning that we can not deplete memory reserves as easily. Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22	mvpp2: fix multicast address filter	Mikulas Patocka	1	-3/+8
	commit 7ac8ff95f48cbfa609a060fd6a1e361dd62feeb3 upstream. IPv6 doesn't work on the MacchiatoBIN board. This is caused by a broken multicast address filter in the mvpp2 driver. The driver doesn't load any multicast entries if "allmulti" is not set. This condition should be reversed. The condition !netdev_mc_empty(dev) is useless (because netdev_for_each_mc_addr is a nop if the list is empty). This patch also fixes a possible overflow of the multicast list: if mvpp2_prs_mac_da_accept fails, we set the allmulti flag and retry. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
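	The corrected filter logic can be modelled roughly as follows. This is a simplified sketch and not the mvpp2 code: struct filter_model, FILTER_SLOTS and set_rx_mode() are hypothetical, and a full table stands in for a failing mvpp2_prs_mac_da_accept() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FILTER_SLOTS 4	/* hypothetical capacity of the hardware table */

struct filter_model {
	bool allmulti;	/* accept every multicast frame */
	size_t used;	/* per-address entries currently programmed */
};

/* Program the filter for n_addrs multicast addresses. */
static void set_rx_mode(struct filter_model *f, size_t n_addrs, bool allmulti)
{
	assert(f != NULL);
	f->allmulti = allmulti;
	f->used = 0;

	/* Entries are loaded only when allmulti is NOT set; the bug
	 * described above had this condition the wrong way round. */
	if (f->allmulti)
		return;

	for (size_t i = 0; i < n_addrs; i++) {
		if (f->used == FILTER_SLOTS) {
			/* Adding an entry failed: fall back to allmulti
			 * instead of overflowing the list. */
			f->allmulti = true;
			f->used = 0;
			return;
		}
		f->used++;
	}
}
```

	An empty address list needs no separate check, since the loop body simply never runs; this mirrors the remark that testing !netdev_mc_empty(dev) is redundant.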
2018-02-22	rtlwifi: rtl8821ae: Fix connection lost problem correctly	Larry Finger	2	-2/+4
	commit c713fb071edc0efc01a955f65a006b0e1795d2eb upstream. There has been a coding error in rtl8821ae since it was first introduced, namely that an 8-bit register was read using a 16-bit read in _rtl8821ae_dbi_read(). This error was fixed with commit 40b368af4b75 ("rtlwifi: Fix alignment issues"); however, this change led to instability in the connection. To restore stability, this change was reverted in commit b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection lost problem"). Unfortunately, the unaligned access causes machine checks in ARM architecture, and we were finally forced to find the actual cause of the problem on x86 platforms. Following a suggestion from Pkshih <pkshih@realtek.com>, it was found that increasing the ASPM L1 latency from 0 to 7 fixed the instability. This parameter was varied to see if a smaller value would work; however, it appears that 7 is the safest value. A new symbol is defined for this quantity, thus it can be easily changed if necessary. Fixes: b8b8b16352cd ("rtlwifi: rtl8821ae: Fix connection lost problem") Cc: Stable <stable@vger.kernel.org> # 4.14+ Fix-suggested-by: Pkshih <pkshih@realtek.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: James Cameron <quozl@laptop.org> # x86_64 OLPC NL3 Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-22	mwifiex: resolve reset vs. remove()/shutdown() deadlocks	Brian Norris	1	-1/+4
	commit a64e7a79dd6030479caad603c8d78e6c9c14904f upstream. Commit b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()") resolves races between driver reset and removal, but it introduces some new deadlock problems. If we see a timeout while we've already started suspending, removing, or shutting down the driver, we might see: (a) a worker thread, running mwifiex_pcie_work() -> mwifiex_pcie_card_reset_work() -> pci_reset_function() (b) a removal thread, running mwifiex_pcie_remove() -> mwifiex_free_adapter() -> mwifiex_unregister() -> mwifiex_cleanup_pcie() -> cancel_work_sync(&card->work) Unfortunately, mwifiex_pcie_remove() already holds the device lock that pci_reset_function() is now requesting, and so we see a deadlock. It's necessary to cancel and synchronize our outstanding work before tearing down the driver, so we can't have this work wait indefinitely for the lock. It's reasonable to only "try" to reset here, since this will mostly happen for cases where it's already difficult to reset the firmware anyway (e.g., while we're suspending or powering off the system). And if reset really needs to happen, we can always try again later. Fixes: b014e96d1abb ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()") Cc: <stable@vger.kernel.org> Cc: Xinming Hu <huxm@marvell.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
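	The idea of only "trying" to reset can be illustrated with a non-blocking lock attempt. The sketch below is a user-space model under assumptions, not the mwifiex or PCI core code: struct dev_model and its helpers are hypothetical, and a C11 atomic_flag stands in for the device lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device: a lock flag plus a counter of completed resets. */
struct dev_model {
	atomic_flag lock;	/* stand-in for the device lock */
	int resets;
};

/* Non-blocking lock attempt: true if the lock was acquired. */
static bool dev_trylock(struct dev_model *d)
{
	return !atomic_flag_test_and_set(&d->lock);
}

static void dev_unlock(struct dev_model *d)
{
	atomic_flag_clear(&d->lock);
}

/* Reset worker: gives up immediately if the lock is already held
 * (for example by a removal path that will cancel this work), so it
 * can never block the thread that is waiting for it to finish. */
static bool try_reset(struct dev_model *d)
{
	assert(d != NULL);
	if (!dev_trylock(d))
		return false;	/* skip now; a reset can be retried later */
	d->resets++;
	dev_unlock(d);
	return true;
}
```

	If the removal path holds the lock, try_reset() returns false instead of waiting, which breaks the cycle between the lock holder and the work item it is synchronizing with.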