kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2017-01-25	net: Specify the owning module for lwtunnel ops	Robert Shearman	6	-0/+8
	Modules implementing lwtunnel ops should not be allowed to unload while there is state alive using those ops, so specify the owning module for all lwtunnel ops. Signed-off-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	Merge branch 'tipc-topology-fixes'	David S. Miller	4	-83/+99
	Parthasarathy Bhuvaragan says: ==================== tipc: topology server fixes for nametable soft lockup In this series, we revert the commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") and provide an alternate solution to fix the race conditions in commits 2-4. We have to do this as the above commit introduced a nametbl soft lockup at module exit as described by patch#4. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: fix cleanup at module unload	Parthasarathy Bhuvaragan	1	-3/+1
	In tipc_server_stop(), we iterate over the connections with limiting factor as server's idr_in_use. We ignore the fact that this variable is decremented in tipc_close_conn(), leading to premature exit. In this commit, we iterate until the we have no connections left. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: John Thompson <thompa.atl@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: ignore requests when the connection state is not CONNECTED	Parthasarathy Bhuvaragan	1	-6/+7
	In tipc_conn_sendmsg(), we first queue the request to the outqueue followed by the connection state check. If the connection is not connected, we should not queue this message. In this commit, we reject the messages if the connection state is not CF_CONNECTED. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Tested-by: John Thompson <thompa.atl@gmail.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: fix nametbl_lock soft lockup at module exit	Parthasarathy Bhuvaragan	1	-11/+5
	Commit 333f796235a527 ("tipc: fix a race condition leading to subscriber refcnt bug") reveals a soft lockup while acquiring nametbl_lock. Before commit 333f796235a527, we call tipc_conn_shutdown() from tipc_close_conn() in the context of tipc_topsrv_stop(). In that context, we are allowed to grab the nametbl_lock. Commit 333f796235a527, moved tipc_conn_release (renamed from tipc_conn_shutdown) to the connection refcount cleanup. This allows either tipc_nametbl_withdraw() or tipc_topsrv_stop() to the cleanup. Since tipc_exit_net() first calls tipc_topsrv_stop() and then tipc_nametble_withdraw() increases the chances for the later to perform the connection cleanup. The soft lockup occurs in the call chain of tipc_nametbl_withdraw(), when it performs the tipc_conn_kref_release() as it tries to grab nametbl_lock again while holding it already. tipc_nametbl_withdraw() grabs nametbl_lock tipc_nametbl_remove_publ() tipc_subscrp_report_overlap() tipc_subscrp_send_event() tipc_conn_sendmsg() << if (con->flags != CF_CONNECTED) we do conn_put(), triggering the cleanup as refcount=0. >> tipc_conn_kref_release tipc_sock_release tipc_conn_release tipc_subscrb_delete tipc_subscrp_delete tipc_nametbl_unsubscribe << Soft Lockup >> The previous changes in this series fixes the race conditions fixed by commit 333f796235a527. Hence we can now revert the commit. Fixes: 333f796235a52727 ("tipc: fix a race condition leading to subscriber refcnt bug") Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: fix connection refcount error	Parthasarathy Bhuvaragan	1	-9/+10
	Until now, the generic server framework maintains the connection id's per subscriber in server's conn_idr. At tipc_close_conn, we remove the connection id from the server list, but the connection is valid until we call the refcount cleanup. Hence we have a window where the server allocates the same connection to an new subscriber leading to inconsistent reference count. We have another refcount warning we grab the refcount in tipc_conn_lookup() for connections with flag with CF_CONNECTED not set. This usually occurs at shutdown when the we stop the topology server and withdraw TIPC_CFG_SRV publication thereby triggering a withdraw message to subscribers. In this commit, we: 1. remove the connection from the server list at recount cleanup. 2. grab the refcount for a connection only if CF_CONNECTED is set. Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: add subscription refcount to avoid invalid delete	Parthasarathy Bhuvaragan	2	-54/+71
	Until now, the subscribers keep track of the subscriptions using reference count at subscriber level. At subscription cancel or subscriber delete, we delete the subscription only if the timer was pending for the subscription. This approach is incorrect as: 1. del_timer() is not SMP safe, if on CPU0 the check for pending timer returns true but CPU1 might schedule the timer callback thereby deleting the subscription. Thus when CPU0 is scheduled, it deletes an invalid subscription. 2. We export tipc_subscrp_report_overlap(), which accesses the subscription pointer multiple times. Meanwhile the subscription timer can expire thereby freeing the subscription and we might continue to access the subscription pointer leading to memory violations. In this commit, we introduce subscription refcount to avoid deleting an invalid subscription. Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25	tipc: fix nametbl_lock soft lockup at node/link events	Parthasarathy Bhuvaragan	1	-2/+7
	We trigger a soft lockup as we grab nametbl_lock twice if the node has a pending node up/down or link up/down event while: - we process an incoming named message in tipc_named_rcv() and perform an tipc_update_nametbl(). - we have pending backlog items in the name distributor queue during a nametable update using tipc_nametbl_publish() or tipc_nametbl_withdraw(). The following are the call chain associated: tipc_named_rcv() Grabs nametbl_lock tipc_update_nametbl() (publish/withdraw) tipc_node_subscribe()/unsubscribe() tipc_node_write_unlock() << lockup occurs if an outstanding node/link event exits, as we grabs nametbl_lock again >> tipc_nametbl_withdraw() Grab nametbl_lock tipc_named_process_backlog() tipc_update_nametbl() << rest as above >> The function tipc_node_write_unlock(), in addition to releasing the lock processes the outstanding node/link up/down events. To do this, we need to grab the nametbl_lock again leading to the lockup. In this commit we fix the soft lockup by introducing a fast variant of node_unlock(), where we just release the lock. We adapt the node_subscribe()/node_unsubscribe() to use the fast variants. Reported-and-Tested-by: John Thompson <thompa.atl@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Merge branch 'alx-mq-fixes'	David S. Miller	1	-3/+8
	Tobias Regnery says: ==================== alx: fix fallout from multi queue conversion Here are 3 fixes for the multi queue conversion in v4.10. The first patch fixes a wrong condition in an if statement. Patches 2 and 3 fixes regressions in the corner case when requesting msi-x interrupts fails and we fall back to msi or legacy interrupts. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	alx: work around hardware bug in interrupt fallback path	Tobias Regnery	1	-2/+6
	If requesting msi-x interrupts fails in alx_request_irq we fall back to a single tx queue and msi or legacy interrupts. Currently the adapter stops working in this case and we get tx watchdog timeouts. For reasons unknown the adapter gets confused when we load the dma adresses to the chip in alx_init_ring_ptrs twice: the first time with multiple queues and the second time in the fallback case with a single queue. To fix this move the the call to alx_reinit_rings (which calls alx_init_ring_ptrs) after alx_request_irq. At this time it is clear how much tx queues we have and which dma addresses we use. Fixes: d768319cd427 ("alx: enable multiple tx queues") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	alx: fix fallback to msi or legacy interrupts	Tobias Regnery	1	-0/+1
	If requesting msi-x interrupts fails we should fall back to msi or legacy interrupts. However alx_realloc_ressources don't call alx_init_intr, so we fail to set the right number of tx queues. This results in watchdog timeouts and a nonfunctional adapter. Fixes: d768319cd427 ("alx: enable multiple tx queues") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	alx: fix wrong condition to free descriptor memory	Tobias Regnery	1	-1/+1
	The condition to free the descriptor memory is wrong, we want to free the memory if it is set and not if it is unset. Invert the test to fix this issue. Fixes: b0999223f224b ("alx: add ability to allocate and free alx_napi structures") Signed-off-by: Tobias Regnery <tobias.regnery@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	qmi_wwan/cdc_ether: add device ID for HP lt2523 (Novatel E371) WWAN card	Bjørn Mork	2	-0/+15
	Another rebranded Novatel E371. qmi_wwan should drive this device, while cdc_ether should ignore it. Even though the USB descriptors are plain CDC-ETHER that USB interface is a QMI interface. Ref commit 7fdb7846c9ca ("qmi_wwan/cdc_ether: add device IDs for Dell 5804 (Novatel E371) WWAN card") Cc: Dan Williams <dcbw@redhat.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	ibmveth: Add a proper check for the availability of the checksum features	Thomas Huth	1	-2/+5
	When using the ibmveth driver in a KVM/QEMU based VM, it currently always prints out a scary error message like this when it is started: ibmveth 71000003 (unregistered net_device): unable to change checksum offload settings. 1 rc=-2 ret_attr=71000003 This happens because the driver always tries to enable the checksum offloading without checking for the availability of this feature first. QEMU does not support checksum offloading for the spapr-vlan device, thus we always get the error message here. According to the LoPAPR specification, the "ibm,illan-options" property of the corresponding device tree node should be checked first to see whether the H_ILLAN_ATTRIUBTES hypercall and thus the checksum offloading feature is available. Thus let's do this in the ibmveth driver, too, so that the error message is really only limited to cases where something goes wrong, and does not occur if the feature is just missing. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Merge branch 'vxlan-fdb-fixes'	David S. Miller	1	-3/+7
	Roopa Prabhu says: ==================== vxlan: misc fdb fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	vxlan: do not age static remote mac entries	Balakrishnan Raman	1	-1/+1
	Mac aging is applicable only for dynamically learnt remote mac entries. Check for user configured static remote mac entries and skip aging. Signed-off-by: Balakrishnan Raman <ramanb@cumulusnetworks.com> Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	vxlan: don't flush static fdb entries on admin down	Roopa Prabhu	1	-2/+6
	This patch skips flushing static fdb entries in ndo_stop, but flushes all fdb entries during vxlan device delete. This is consistent with the bridge driver fdb Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Merge branch 'ip6_tnl_parse_tlv_enc_lim-fixes'	David S. Miller	2	-12/+27
	Eric Dumazet says: ==================== ipv6: fix ip6_tnl_parse_tlv_enc_lim() issues First patch fixes ip6_tnl_parse_tlv_enc_lim() callers, bug added in linux-3.7 Second patch fixes ip6_tnl_parse_tlv_enc_lim() itself, bug predates linux-2.6.12 Based on a report from Dmitry Vyukov, thanks to KASAN. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	ipv6: fix ip6_tnl_parse_tlv_enc_lim()	Eric Dumazet	1	-12/+22
	This function suffers from multiple issues. First one is that pskb_may_pull() may reallocate skb->head, so the 'raw' pointer needs either to be reloaded or not used at all. Second issue is that NEXTHDR_DEST handling does not validate that the options are present in skb->data, so we might read garbage or access non existent memory. With help from Willem de Bruijn. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	ip6_tunnel: must reload ipv6h in ip6ip6_tnl_xmit()	Eric Dumazet	2	-0/+5
	Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull(), we must reload any pointer that was related to skb->head (or skb->data), or risk use after free. Fixes: c12b395a4664 ("gre: Support GRE over IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Kozlov <xeb@mail.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	virtio_net: fix PAGE_SIZE > 64k	Michael S. Tsirkin	1	-1/+9
	I don't have any guests with PAGE_SIZE > 64k but the code seems to be clearly broken in that case as PAGE_SIZE / MERGEABLE_BUFFER_ALIGN will need more than 8 bit and so the code in mergeable_ctx_to_buf_address does not give us the actual true size. Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	af_unix: move unix_mknod() out of bindlock	WANG Cong	1	-11/+16
	Dmitry reported a deadlock scenario: unix_bind() path: u->bindlock ==> sb_writer do_splice() path: sb_writer ==> pipe->mutex ==> u->bindlock In the unix_bind() code path, unix_mknod() does not have to be done with u->bindlock held, since it is a pure fs operation, so we can just move unix_mknod() out. Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	mlxsw: spectrum_router: Correctly reallocate adjacency entries	Ido Schimmel	1	-4/+6
	mlxsw_sp_nexthop_group_mac_update() is called in one of two cases: 1) When the MAC of a nexthop needs to be updated 2) When the size of a nexthop group has changed In the second case the adjacency entries for the nexthop group need to be reallocated from the adjacency table. In this case we must write to the entries the MAC addresses of all the nexthops that should be offloaded and not only those whose MAC changed. Otherwise, these entries would be filled with garbage data, resulting in packet loss. Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	r8152: don't execute runtime suspend if the tx is not empty	hayeswang	1	-1/+3
	Runtime suspend shouldn't be executed if the tx queue is not empty, because the device is not idle. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Documentation: net: phy: improve explanation when to specify the PHY ID	Martin Blumenstingl	1	-2/+3
	The old description basically read like "ethernet-phy-idAAAA.BBBB" can be specified when you know the actual PHY ID. However, specifying this has a side-effect: it forces Linux to bind to a certain PHY driver (the one that matches the ID given in the compatible string), ignoring the ID which is reported by the actual PHY. Whenever a device is shipped with (multiple) different PHYs during it's production lifetime then explicitly specifying "ethernet-phy-idAAAA.BBBB" could break certain revisions of that device. Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	net: phy: marvell: Add Wake from LAN support for 88E1510 PHY	Jingju Hou	1	-0/+2
	Signed-off-by: Jingju Hou <houjingj@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	Merge tag 'mac80211-for-davem-2017-01-24' of ↵	David S. Miller	1	-2/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A single fix, for a sleeping context problem found by LTP. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24	mac80211: don't try to sleep in rate_control_rate_init()	Johannes Berg	1	-2/+0
	In my previous patch, I missed that rate_control_rate_init() is called from some places that cannot sleep, so it cannot call ieee80211_recalc_min_chandef(). Remove that call for now to fix the context bug, we'll have to find a different way to fix the minimum channel width issue. Fixes: 96aa2e7cf126 ("mac80211: calculate min channel width correctly") Reported-by: Xiaolong Ye (via lkp-robot) <xiaolong.ye@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2017-01-23	net: dsa: Check return value of phy_connect_direct()	Florian Fainelli	1	-4/+2
	We need to check the return value of phy_connect_direct() in dsa_slave_phy_connect() otherwise we may be continuing the initialization of a slave network device with a PHY that already attached somewhere else and which will soon be in error because the PHY device is in error. The conditions for such an error to occur are that we have a port of our switch that is not disabled, and has the same port number as a PHY address (say both 5) that can be probed using the DSA slave MII bus. We end-up having this slave network device find a PHY at the same address as our port number, and we try to attach to it. A slave network (e.g: port 0) has already attached to our PHY device, and we try to re-attach it with a different network device, but since we ignore the error we would end-up initializating incorrect device references by the time the slave network interface is opened. The code has been (re)organized several times, making it hard to provide an exact Fixes tag, this is a bugfix nonetheless. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: phy: Avoid deadlock during phy_error()	Florian Fainelli	1	-5/+9
	phy_error() is called in the PHY state machine workqueue context, and calls phy_trigger_machine() which does a cancel_delayed_work_sync() of the workqueue we execute from, causing a deadlock situation. Augment phy_trigger_machine() machine with a sync boolean indicating whether we should use cancel__sync() or just cancel__work(). Fixes: 3c293f4e08b5 ("net: phy: Trigger state machine on state change and not polling.") Reported-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	net: mpls: Fix multipath selection for LSR use case	David Ahern	1	-23/+25
	MPLS multipath for LSR is broken -- always selecting the first nexthop in the one label case. For example: $ ip -f mpls ro ls 100 nexthop as to 200 via inet 172.16.2.2 dev virt12 nexthop as to 300 via inet 172.16.3.2 dev virt13 101 nexthop as to 201 via inet6 2000:2::2 dev virt12 nexthop as to 301 via inet6 2000:3::2 dev virt13 In this example incoming packets have a single MPLS labels which means BOS bit is set. The BOS bit is passed from mpls_forward down to mpls_multipath_hash which never processes the hash loop because BOS is 1. Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len tracks the total mpls header length on each pass (on pass N mpls_hdr_len is N * sizeof(mpls_shim_hdr)). When the label is found with the BOS set it verifies the skb has sufficient header for ipv4 or ipv6, and find the IPv4 and IPv6 header by using the last mpls_hdr pointer and adding 1 to advance past it. With these changes I have verified the code correctly sees the label, BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp traffic for ipv4 and ipv6 are distributed across the nexthops. Fixes: 1c78efa8319ca ("mpls: flow-based multipath selection") Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	Merge branch 'amd-xgbe-fixes'	David S. Miller	5	-5/+26
	Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver fixes 2017-01-20 This patch series addresses some issues in the AMD XGBE driver. The following fixes are included in this driver update series: - Add a fix for a version of the hardware that uses different register offset values for a device with the same PCI device ID - Add support to check the return code from the xgbe_init() function This patch series is based on net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	amd-xgbe: Check xgbe_init() return code	Lendacky, Thomas	2	-2/+6
	The xgbe_init() routine returns a return code indicating success or failure, but the return code is not checked. Add code to xgbe_init() to issue a message when failures are seen and add code to check the xgbe_init() return code. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-23	amd-xgbe: Add a hardware quirk for register definitions	Lendacky, Thomas	4	-3/+20
	A newer version of the hardware is using the same PCI ids for the network device but has altered register definitions for determining the window settings for the indirect PCS access. Add support to check for this hardware and if found use the new register values. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	bridge: netlink: call br_changelink() during br_dev_newlink()	Ivan Vecera	1	-14/+19
	Any bridge options specified during link creation (e.g. ip link add) are ignored as br_dev_newlink() does not process them. Use br_changelink() to do it. Fixes: 133235161721 ("bridge: implement rtnl_link_ops->changelink") Signed-off-by: Ivan Vecera <cera@cera.cz> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	xen-netfront: Fix Rx stall during network stress and OOM	Vineeth Remanan Pillai	1	-1/+1
	During an OOM scenario, request slots could not be created as skb allocation fails. So the netback cannot pass in packets and netfront wrongly assumes that there is no more work to be done and it disables polling. This causes Rx to stall. The issue is with the retry logic which schedules the timer if the created slots are less than NET_RX_SLOTS_MIN. The count of new request slots to be pushed are calculated as a difference between new req_prod and rsp_cons which could be more than the actual slots, if there are unconsumed responses. The fix is to calculate the count of newly created slots as the difference between new req_prod and old req_prod. Signed-off-by: Vineeth Remanan Pillai <vineethp@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net/mlx5e: Do not recycle pages from emergency reserve	Eric Dumazet	1	-0/+3
	A driver using dev_alloc_page() must not reuse a page allocated from emergency memory reserve. Otherwise all packets using this page will be immediately dropped, unless for very specific sockets having SOCK_MEMALLOC bit set. This issue might be hard to debug, because only a fraction of received packets would be dropped. Fixes: 4415a0319f92 ("net/mlx5e: Implement RX mapped page cache for page recycle") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	bpf: fix samples xdp_tx_iptunnel and tc_l2_redirect with fake KBUILD_MODNAME	Jesper Dangaard Brouer	2	-0/+2
	Fix build errors for samples/bpf xdp_tx_iptunnel and tc_l2_redirect, when dynamic debugging is enabled (CONFIG_DYNAMIC_DEBUG) by defining a fake KBUILD_MODNAME. Just like Daniel Borkmann fixed other samples/bpf in commit 96a8eb1eeed2 ("bpf: fix samples to add fake KBUILD_MODNAME"). Fixes: 12d8bb64e3f6 ("bpf: xdp: Add XDP example for head adjustment") Fixes: 90e02896f1a4 ("bpf: Add test for bpf_redirect to ipip/ip6tnl") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	bcm63xx_enet: avoid uninitialized variable warning	Arnd Bergmann	1	-2/+4
	gcc-7 and probably earlier versions get confused by this function and print a harmless warning: drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open': drivers/net/ethernet/broadcom/bcm63xx_enet.c:1130:3: error: 'phydev' may be used uninitialized in this function [-Werror=maybe-uninitialized] This adds an initialization for the 'phydev' variable when it is unused and changes the check to test for that NULL pointer to make it clear that we always pass a valid pointer here. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	qed: avoid possible stack overflow in qed_ll2_acquire_connection	Arnd Bergmann	3	-61/+53
	struct qed_ll2_info is rather large, so putting it on the stack can cause an overflow, as this warning tries to tell us: drivers/net/ethernet/qlogic/qed/qed_ll2.c: In function 'qed_ll2_start': drivers/net/ethernet/qlogic/qed/qed_ll2.c:2159:1: error: the frame size of 1056 bytes is larger than 1024 bytes [-Werror=frame-larger-than=] qed_ll2_start_ooo() already uses a dynamic allocation for the structure to work around that problem, and we could do the same in qed_ll2_start() as well as qed_roce_ll2_start(), but since the structure is only used to pass a couple of initialization values here, it seems nicer to replace it with a different structure. Lacking any idea for better naming, I'm adding 'struct qed_ll2_conn', which now contains all the initialization data, and this now simply gets copied into struct qed_ll2_info rather than assigning all members one by one. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	Revert "net: sctp: fix array overrun read on sctp_timer_tbl"	David S. Miller	1	-1/+1
	This reverts commit 0e73fc9a56f22f2eec4d2b2910c649f7af67b74d. This fix wasn't correct, a better one is coming right up. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	net: sctp: fix array overrun read on sctp_timer_tbl	Colin Ian King	1	-1/+1
	The comparison on the timeout can lead to an array overrun read on sctp_timer_tbl because of an off-by-one error. Fix this by using < instead of <= and also compare to the array size rather than SCTP_EVENT_TIMEOUT_MAX. Fixes CoverityScan CID#1397639 ("Out-of-bounds read") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	ipv6: seg6_genl_set_tunsrc() must check kmemdup() return value	Eric Dumazet	1	-0/+2
	seg6_genl_get_tunsrc() and set_tun_src() do not handle tun_src being possibly NULL, so we must check kmemdup() return value and abort if it is NULL Fixes: 915d7e5e5930 ("ipv6: sr: add code base for control plane support of SR-IPv6") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: David Lebrun <david.lebrun@uclouvain.be> Acked-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	r8152: fix rtl8152_post_reset function	hayeswang	1	-0/+2
	The rtl8152_post_reset() should sumbit rx urb and interrupt transfer, otherwise the rx wouldn't work and the linking change couldn't be detected. Signed-off-by: Hayes Wang <hayeswang@realtek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-20	virtio-net: restore VIRTIO_HDR_F_DATA_VALID on receiving	Jason Wang	5	-6/+10
	Commit 501db511397f ("virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on xmit") in fact disables VIRTIO_HDR_F_DATA_VALID on receiving path too, fixing this by adding a hint (has_data_valid) and set it only on the receiving path. Cc: Rolf Neugebauer <rolf.neugebauer@docker.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Rolf Neugebauer <rolf.neugebauer@docker.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19	gianfar: Do not reuse pages from emergency reserve	Eric Dumazet	1	-1/+1
	A driver using dev_alloc_page() must not reuse a page that had to use emergency memory reserve. Otherwise all packets using this page will be immediately dropped, unless for very specific sockets having SOCK_MEMALLOC bit set. This issue might be hard to debug, because only a fraction of the RX ring buffer would suffer from drops. Fixes: 75354148ce69 ("gianfar: Add paged allocation and Rx S/G") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Claudiu Manoil <claudiu.manoil@freescale.com> Acked-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19	tcp: initialize max window for a new fastopen socket	Alexey Kodanev	1	-0/+1
	Found that if we run LTP netstress test with large MSS (65K), the first attempt from server to send data comparable to this MSS on fastopen connection will be delayed by the probe timer. Here is an example: < S seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32 > S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0 < . ack 1 win 342 length 0 Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now', as well as in 'size_goal'. This results the segment not queued for transmition until all the data copied from user buffer. Then, inside __tcp_push_pending_frames(), it breaks on send window test and continues with the check probe timer. Fragmentation occurs in tcp_write_wakeup()... +0.2 > P. seq 1:43777 ack 1 win 342 length 43776 < . ack 43777, win 1365 length 0 > P. seq 43777:65001 ack 1 win 342 options [...] length 21224 ... This also contradicts with the fact that we should bound to the half of the window if it is large. Fix this flaw by correctly initializing max_window. Before that, it could have large values that affect further calculations of 'size_goal'. Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19	net/mlx5e: Remove unused variable	Arnd Bergmann	1	-1/+0
	A cleanup removed the only user of this variable mlx5/core/en_ethtool.c: In function 'mlx5e_set_channels': mlx5/core/en_ethtool.c:546:6: error: unused variable 'ncv' [-Werror=unused-variable] Let's remove the declaration as well. Fixes: 639e9e94160e ("net/mlx5e: Remove unnecessary checks when setting num channels") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19	ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock	Kefeng Wang	1	-3/+1
	Just like commit 4acd4945cd1e ("ipv6: addrconf: Avoid calling netdevice notifiers with RCU read-side lock"), it is unnecessary to make addrconf_disable_change() use RCU iteration over the netdev list, since it already holds the RTNL lock, or we may meet Illegal context switch in RCU read-side critical section. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-19	MAINTAINERS: update cxgb4 maintainer	Hariprasad Shenai	1	-1/+1
	Ganesg will be taking over as maintainer from now Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>