Age  Commit message  (Author, files changed, -removed/+added)
2019-01-26  net: hns3: remove dcb_ops->map_update in hclge_dcb  (Yunsheng Lin, 3 files, -14/+6)
After doing down/uninit/init/up in hclge_dcb, it is not necessary to call dcb_ops->map_update in enet, so hclge_map_update can be called directly in hclge_dcb. This prepares for calling hns3_nic_set_real_num_queue with the netdev down when the user changes the mqprio configuration. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: do reinitialization while mqprio configuration changed  (Yunsheng Lin, 2 files, -17/+21)
When the user changes the mqprio configuration, enet needs to be uninited and inited in addition to being downed and upped, because the queue number may change when the TC number changes. Also, it is more suitable to do the down/uninit/init/up operation in the hclge module using hclge_notify_client, because this config change may affect the PF and its VFs. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: After setting the loopback, add the status of getting MAC  (liuzhongzhu, 1 file, -1/+22)
After setting the serdes loopback, the MAC link status needs to be checked. If the status is still abnormal after 200 ms, a timeout error is returned. Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
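For illustration only, a minimal sketch of such a status poll; the helper hclge_get_mac_link_status() and the constants are hypothetical stand-ins for the hns3-internal symbols:

    #include <linux/delay.h>

    #define MAC_LINK_STATUS_WAIT_MS   10
    #define MAC_LINK_STATUS_WAIT_NUM  20   /* 20 * 10 ms = 200 ms budget */

    /* hclge_get_mac_link_status() is assumed to return 1 for link up, 0 for down */
    static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int expect_status)
    {
        int i;

        for (i = 0; i < MAC_LINK_STATUS_WAIT_NUM; i++) {
            if (hclge_get_mac_link_status(hdev) == expect_status)
                return 0;
            msleep(MAC_LINK_STATUS_WAIT_MS);
        }
        return -EBUSY;  /* still not in the expected state after ~200 ms */
    }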
2019-01-26  net: hns3: fix broadcast promisc issue for revision 0x20  (Jian Shen, 2 files, -3/+11)
For revision 0x20, the VLAN filter is always bypassed when broadcast promiscuous mode is enabled. In this case, broadcast packets with any VLAN id can be accepted. Broadcast promiscuous mode should stay disabled until the user explicitly enables it. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: fix return value handle issue for hclge_set_loopback()  (Jian Shen, 2 files, -4/+7)
In the current code, hclge_set_loopback() always returns 0, even when setting the loopback mode failed. This is incorrect. This patch fixes the return value handling for the loopback test. Fixes: 0f29fc23b21d ("net: hns3: Fix for loopback selftest failed problem") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: add error handling in hclge_ieee_setets  (Yunsheng Lin, 1 file, -2/+13)
Currently hclge_ieee_setets returns immediately when an error occurs, which may leave the netdev unable to come up. This patch adds error handling for the case where setting the ETS configuration fails. Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: clear pci private data when unload hns3 driver  (Jian Shen, 1 file, -0/+1)
When unloading the hns3 driver, the PCI private data should be cleared. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: don't update packet statistics for packets dropped by hardware  (Jian Shen, 1 file, -3/+0)
Packet statistics for netdev should not include the packets dropped by hardware. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge branch 'r8169-add-EEE-support-for-RTL8168f'  (David S. Miller, 1 file, -32/+48)
Heiner Kallweit says: ==================== r8169: add EEE support for RTL8168f This series adds EEE support for RTL8168f. Again, the first patch adds the support and the second patch enables EEE per default. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: enable EEE per default on RTL8168f  (Heiner Kallweit, 1 file, -33/+2)
Enable EEE per default on RTL8168f. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: add EEE support for RTL8168f  (Heiner Kallweit, 1 file, -0/+47)
Add EEE support for RTL8168f to the recently added EEE handling framework in the driver. This patch keeps the chip defaults, meaning EEE is typically disabled initially and it is up to the user to enable it via ethtool. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Documentation: net: phy: switch documentation to rst format  (Heiner Kallweit, 3 files, -429/+448)
Switch phylib documentation to rst format. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  tcp: allow zerocopy with fastopen  (Willem de Bruijn, 3 files, -7/+6)
Accept MSG_ZEROCOPY in all the TCP states that allow sendmsg. Remove the explicit check for the ESTABLISHED and CLOSE_WAIT states. This requires correctly handling zerocopy state (uarg, sk_zckey) in all paths reachable from other TCP states, such as the EPIPE case in sk_stream_wait_connect, which a sendmsg() in an incorrect state will now hit. Most paths are already safe. The only extension needed is for TCP Fastopen active open, which can build an skb with data in tcp_send_syn_data. Pass the uarg along with the other fastopen state, so that this skb also generates a zerocopy notification on release. Tested with active and passive tcp fastopen packetdrill scripts at https://github.com/wdebruij/packetdrill/commit/1747eef03d25a2404e8132817d0f1244fd6f129d Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
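From the application side, the change means a zerocopy send can now ride on an active Fastopen SYN. A rough userspace sketch (destination address, port and error handling are placeholders; requires tcp_fastopen client support):

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #ifndef SO_ZEROCOPY
    #define SO_ZEROCOPY 60
    #endif
    #ifndef MSG_ZEROCOPY
    #define MSG_ZEROCOPY 0x4000000
    #endif
    #ifndef MSG_FASTOPEN
    #define MSG_FASTOPEN 0x20000000
    #endif

    int main(void)
    {
        struct sockaddr_in dst = {
            .sin_family = AF_INET,
            .sin_port = htons(8000),
            .sin_addr.s_addr = htonl(INADDR_LOOPBACK),
        };
        static char buf[4096];  /* zero-filled payload, pinned by the kernel */
        int one = 1;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
        /* SYN + data + zerocopy: previously rejected because the socket
         * is not yet in ESTABLISHED/CLOSE_WAIT */
        sendto(fd, buf, sizeof(buf), MSG_FASTOPEN | MSG_ZEROCOPY,
               (struct sockaddr *)&dst, sizeof(dst));
        /* the completion notification arrives on the socket error queue */
        close(fd);
        return 0;
    }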
2019-01-26  ptp: fix debugfs_simple_attr.cocci warnings  (YueHaibing, 1 file, -8/+8)
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
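The conversion pattern looks roughly like this (illustrative attribute, not the actual ptp code):

    #include <linux/debugfs.h>

    static int myval_get(void *data, u64 *val) { *val = *(u64 *)data; return 0; }
    static int myval_set(void *data, u64 val) { *(u64 *)data = val; return 0; }

    /* before: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() goes through the
     * full debugfs file proxy on every access */
    DEFINE_SIMPLE_ATTRIBUTE(myval_fops_old, myval_get, myval_set, "%llu\n");

    /* after: DEFINE_DEBUGFS_ATTRIBUTE generates read/write wrappers that take the
     * debugfs reference themselves, so the cheaper _unsafe creation can be used */
    DEFINE_DEBUGFS_ATTRIBUTE(myval_fops, myval_get, myval_set, "%llu\n");

    static u64 myval;

    static void myval_debugfs_init(struct dentry *parent)
    {
        debugfs_create_file_unsafe("myval", 0600, parent, &myval, &myval_fops);
    }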
2019-01-26  Merge branch 'r8169-add-EEE-support-for-RTL8168g+'  (David S. Miller, 1 file, -10/+206)
Heiner Kallweit says: ==================== r8169: add EEE support for RTL8168g+ This series adds general EEE support to be used with ethtool. In addition it implements EEE for chip versions from RTL8168g. The first patch leaves the default chip settings and the second enables EEE per default. This allows us to revert patch 2 w/o removing EEE support completely if we should face issues with EEE on particular chip versions. Unfortunately Realtek decided not to use the standard EEE MMD registers but to use proprietary registers. Therefore we can't use phylib functions like phy_ethtool_set_eee and have to reimplement the functionality. Tested on a system with RTL8168g (chip version 40). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: enable EEE per default on chip versions from RTL8168g  (Heiner Kallweit, 1 file, -0/+14)
Enable EEE per default on chip versions from RTL8168g. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: add general EEE support for chip versions from RTL8168g  (Heiner Kallweit, 1 file, -10/+192)
This patch adds the general framework to deal with EEE in this driver plus EEE support for chip versions from RTL8168g. We don't touch the default chip settings, therefore EEE will usually be disabled and it's up to the user to enable it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
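The ethtool-facing side of such a framework typically looks like the hedged sketch below; the function names are illustrative, and the accesses r8169 performs through its vendor-specific PHY registers are omitted:

    #include <linux/ethtool.h>
    #include <linux/netdevice.h>

    static int rtl_get_eee(struct net_device *dev, struct ethtool_eee *data)
    {
        /* fill data->supported, data->advertised, data->lp_advertised and
         * data->eee_enabled from the vendor-specific PHY registers */
        return 0;
    }

    static int rtl_set_eee(struct net_device *dev, struct ethtool_eee *data)
    {
        /* write data->advertised and data->eee_enabled back to the PHY */
        return 0;
    }

    static const struct ethtool_ops rtl_ethtool_ops = {
        .get_eee = rtl_get_eee,
        .set_eee = rtl_set_eee,
    };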
2019-01-26  Merge branch 'ipv6-defrag-rbtree'  (David S. Miller, 8 files, -660/+527)
Peter Oskolkov says: ==================== net: IP defrag: use rbtrees in IPv6 defragmentation Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patchset contains four patches: - patch 1 moves rbtree-related code from IPv4 to files shared between IPv4/IPv6 - patch 2 changes IPv6 defragmentation code to use rbtrees for the defrag queue - patch 3 changes nf_conntrack IPv6 defragmentation code to use rbtrees - patch 4 changes the ip_defrag selftest to test the changes made in the previous three patches. Along the way, the 1280-byte restrictions are removed. I plan to introduce similar changes to the 6lowpan defragmentation code once I figure out how to test it. ==================== Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  selftests: net: ip_defrag: cover new IPv6 defrag behavior  (Peter Oskolkov, 2 files, -34/+51)
This patch adds several changes to the ip_defrag selftest, to cover new IPv6 defrag behavior: - min IPv6 frag size is now 8 instead of 1280 - new test cases to cover IPv6 defragmentation in nf_conntrack_reasm.c - new "permissive" mode in negative (overlap) tests: netfilter sometimes drops invalid packets without passing them to IPv6 underneath, and thus defragmentation sometimes succeeds when it is expected to fail; so the permissive mode does not fail the test if the correct reassembled datagram is received instead of a timeout. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c  (Peter Oskolkov, 1 file, -189/+71)
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses the common IP defragmentation queueing and reassembly code in IP6 defragmentation in nf_conntrack, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: IP6 defrag: use rbtrees for IPv6 defrag  (Peter Oskolkov, 2 files, -173/+71)
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses the common IP defragmentation queueing and reassembly code in IPv6, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
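The core idea of the shared rbtree code, heavily simplified (the real code additionally coalesces runs of consecutive fragments; the struct and macro names below are illustrative, not the kernel's):

    #include <linux/rbtree.h>
    #include <linux/skbuff.h>

    struct fragrun_cb {
        int offset;  /* where this fragment's payload starts */
        int end;     /* offset + payload length */
    };
    #define FRAG_CB(skb) ((struct fragrun_cb *)((skb)->cb))

    /* keep fragments sorted by offset; reject overlaps without walking a long list */
    static int fragrun_insert(struct rb_root *root, struct sk_buff *skb)
    {
        struct rb_node **p = &root->rb_node, *parent = NULL;
        struct fragrun_cb *cb = FRAG_CB(skb);

        while (*p) {
            struct sk_buff *cur = rb_entry(*p, struct sk_buff, rbnode);

            parent = *p;
            if (cb->end <= FRAG_CB(cur)->offset)
                p = &parent->rb_left;
            else if (cb->offset >= FRAG_CB(cur)->end)
                p = &parent->rb_right;
            else
                return -EINVAL;  /* overlapping fragment: drop it */
        }
        rb_link_node(&skb->rbnode, parent, p);
        rb_insert_color(&skb->rbnode, root);
        return 0;
    }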
2019-01-26  net: IP defrag: encapsulate rbtree defrag code into callable functions  (Peter Oskolkov, 3 files, -264/+334)
This is a refactoring patch: without changing runtime behavior, it moves rbtree-related code from IPv4-specific files/functions into .h/.c defrag files shared with IPv6 defragmentation code. Signed-off-by: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge branch 's390-qeth-next'  (David S. Miller, 5 files, -337/+167)
Julian Wiedmann says: ==================== s390/qeth: updates 2019-01-25 please apply a first batch of qeth patches for net-next, primarily touching the net_device parts of the driver. In addition to the usual refactoring & code consolidation, patch 7 makes use of netif_device_detach() to let the stack know when our control plane is down. This helps quite a bit with overall locking and proper init/shutdown sequencing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove VLAN tracking for L2 devices  (Julian Wiedmann, 4 files, -63/+18)
For recovery purposes, qeth keeps track of all registered VIDs. Replace this by using the infrastructure introduced in commit 9daae9bd47cf ("net: Call add/kill vid ndo on vlan filter feature toggling"). By managing NETIF_F_HW_VLAN_CTAG_FILTER as a hw_feature, netdev_update_features() will select it from dev->wanted_features and replay all of the netdevice's VIDs to its ndo_vlan_rx_add_vid() callback. z/VM NICs strictly require VLAN registration, so don't expose it as hw_feature there but add a little hack in qeth_enable_hw_features() to make things work regardless. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
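A hedged sketch of the mechanism the patch relies on; function names and the setup helper are illustrative, not the actual qeth code:

    #include <linux/netdevice.h>

    static int l2_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
    {
        /* program the VID into the card's VLAN registration table */
        return 0;
    }

    static int l2_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid)
    {
        /* remove the VID from the card again */
        return 0;
    }

    static const struct net_device_ops l2_netdev_ops = {
        .ndo_vlan_rx_add_vid  = l2_vlan_rx_add_vid,
        .ndo_vlan_rx_kill_vid = l2_vlan_rx_kill_vid,
    };

    static void l2_setup_vlan_filter(struct net_device *dev, bool on_zvm)
    {
        dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
        if (!on_zvm)  /* z/VM NICs must keep the filter forced on */
            dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
    }

    /* with NETIF_F_HW_VLAN_CTAG_FILTER in hw_features, toggling the feature lets
     * netdev_update_features() replay all configured VIDs via ndo_vlan_rx_add_vid() */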
2019-01-26  s390/qeth: detach netdevice while card is offline  (Julian Wiedmann, 4 files, -102/+20)
When a qeth card is offline, it has no connection to the HW. So none of our control callbacks can run IO against it, and we can only cache the input (eg a new MAC address) without providing proper feedback to the caller. In this context, it seems much more reasonable to simply detach the netdevice and let the kernel reject any interaction with it. This also makes all sorts of internal state checks and locking obsolete. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
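A hedged sketch of the pattern; the driver hooks are illustrative:

    #include <linux/netdevice.h>

    static void card_set_offline(struct net_device *dev)
    {
        netif_device_detach(dev);  /* core now rejects most ndo/ethtool calls */
        /* ... tear down the HW connection ... */
    }

    static void card_set_online(struct net_device *dev)
    {
        /* ... re-establish the HW connection, reprogram MAC/VLAN state ... */
        netif_device_attach(dev);  /* device is usable again */
    }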
2019-01-26  s390/qeth: delay netdevice registration  (Julian Wiedmann, 3 files, -40/+43)
Re-order the code flow a bit so that all initial HW setup is done before putting the netdevice into play. For a netdevice that hasn't been registered before, we also don't need to re-enable its HW features or check for recovery actions. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove TX disable from online path  (Julian Wiedmann, 2 files, -3/+0)
At best this is redundant, at worst it papers over a race in the offline / online code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: register MAC address earlier  (Julian Wiedmann, 1 file, -8/+12)
commit 4789a2188048 ("s390/qeth: fix race when setting MAC address") resolved a race where our initial programming of dev_addr into the HW and a call to ndo_set_mac_address() could run concurrently. In this case, we could end up getting confused about which address was actually set in the HW. The quick fix was to introduce additional locking that blocks any ndo_set_mac_address() while the device is being set online. But the race primarily originated from the fact that we first register the netdevice, and only then program its dev_addr. By re-ordering this sequence, userspace will only be able to change the MAC address _after_ we have finished with setting the initial dev_addr. Still, the same MAC address race can also occur during a subsequent call to qeth_l2_set_online(). So keep around the locking for now, until a follow-up patch fully resolves this. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: consolidate open/stop netdev ops  (Julian Wiedmann, 4 files, -124/+84)
The L2 and L3 code for these ops is almost identical, we only need to provide a custom ndo_validate_addr() for L2 that checks whether programming the MAC address succeeded. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove bogus netif_wake_queue()  (Julian Wiedmann, 1 file, -2/+0)
qeth_qdio_cq_handler() doesn't replenish the Output Queue(s), and thus has no reason to wake the txq. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: streamline TX buffer management  (Julian Wiedmann, 1 file, -19/+14)
Consolidate the code that marks the current buffer to be flushed, and let qeth_fill_buffer() advance the Output Queue's buffer cursor. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Documentation: net: phy: reflect latest changes to phylib API  (Heiner Kallweit, 1 file, -8/+10)
Recent changes to the phylib API - removed phy_stop_interrupts - replaced phy_start_interrupts with phy_request_interrupt - moved some functionality from phy_connect() and phy_disconnect() to phy_start() and phy_stop() respectively. Reflect these changes in the documentation. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge tag 'mlx5-updates-2019-01-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux  (David S. Miller, 10 files, -152/+211)
Saeed Mahameed says: ==================== mlx5-updates-2019-01-25 This series provides some updates to the mlx5 driver. From Tariq, 1) Make sure the RX packet header does not cross a page boundary To avoid page boundary crossing, use a stride size that fits the maximum possible header. Stride is increased from 64B to 256B. 2) CQ struct cleanup: Take CQ decompress fields into a separate structure From Moshe, 3) Expand XPS cpumask to cover all online cpus From Jason Gunthorpe and Tariq: 4) Compilation warning cleanup From Or, 5) Add trace points for flow tables create/destroy From Saeed, 6) Software stats update/folding improvements; this also solves a compilation warning on 32bit systems that was reported last release cycle by Arnd and Andrew. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  net/mlx5e: Reuse fold sw stats in representors  (Saeed Mahameed, 3 files, -27/+10)
Representor software stats are basic; this patch reuses mlx5e_fold_sw_stats in the representors, which sums up the basic stats64 for an mlx5e netdevice. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Present the representors SW stats when state is not opened  (Tariq Toukan, 1 file, -10/+7)
This behavior was already adopted for all other cases in the cited patch. The representor's functions were missed; here we modify them to behave similarly. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Separate between ethtool and netdev software stats folding  (Saeed Mahameed, 3 files, -12/+25)
mlx5e_grp_sw_update_stats can be called from two threads, 1) ndo_get_stats64 2) get_ethtool_stats For this reason and to minimize concurrency issue impact on 64bit machines mlx5e_grp_sw_update_stats folds the software stats into a temporary variable then copies it to the global driver stats, both ethtool and ndo statistics callbacks will use the global software stats variable to report whatever stats they need. Actually ndo_get_stats64 doesn't need to fold the whole software stats (mlx5e_grp_sw_update_stats), all it needs is five counters to fill the rtnl_link_stats64 relevant stats parameter. Hence this patch introduces a simpler helper function to fold software stats for ndo_get_stats64 which will work directly on rtnl_link_stats64 stats parameter and not on the global or even temporary mlx5e_sw_stats variable. Since now mlx5e_grp_sw_update_stats is not called by ndo_get_stats64 we can make it static and remove the temp var. Unlike mlx5e_grp_sw_update_stats the new fold stats function doesn't need to zero out the output statistics parameter since it is already done by the stack @dev_get_stats(). This patch is fixing stack usage of mlx5e_grp_sw_update_stats on x86 gcc-4.9 and higher, the concurrency issue between mlx5's ndo_get_stats64 and get_ethtool_stats is resolved as well. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
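Conceptually, the new helper folds directly into the output structure, roughly like this hedged sketch (struct and field names only approximate the mlx5e layout):

    #include <linux/netdevice.h>

    struct rq_stats_like { u64 packets, bytes; };
    struct sq_stats_like { u64 packets, bytes, dropped; };
    struct channel_stats_like {
        struct rq_stats_like rq;
        struct sq_stats_like sq[8];
    };
    struct priv_like {
        int max_nch, max_tc;
        struct channel_stats_like *channel_stats;
    };

    static void fold_sw_stats64(struct priv_like *priv, struct rtnl_link_stats64 *s)
    {
        int i, j;

        /* 's' was already zeroed by dev_get_stats() */
        for (i = 0; i < priv->max_nch; i++) {
            struct channel_stats_like *c = &priv->channel_stats[i];

            s->rx_packets += c->rq.packets;
            s->rx_bytes   += c->rq.bytes;
            for (j = 0; j < priv->max_tc; j++) {
                s->tx_packets += c->sq[j].packets;
                s->tx_bytes   += c->sq[j].bytes;
                s->tx_dropped += c->sq[j].dropped;
            }
        }
    }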
2019-01-25  net/mlx5: Add trace points for flow tables create/destroy  (Or Gerlitz, 3 files, -0/+39)
We were not tracking flow tables so far; add trace points for their create/destroy. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Return the allocated flow directly from __mlx5e_add_fdb_flow  (Jason Gunthorpe, 1 file, -15/+14)
This construction confuses the compiler, which can't see that flow is initialized if !err: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function `mlx5e_configure_flower` drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:2727:28: warning: `flow` may be used uninitialized in this function [-Wmaybe-uninitialized] There is no reason for two function outputs; just return the pointer directly and use ERR_PTR to encode a failure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Expand XPS cpumask to cover all online cpus  (Moshe Shemesh, 2 files, -2/+33)
Currently we have one cpu in the XPS cpumask per tx queue. This is good enough for the default configuration where there is a tx queue per cpu. However, once the configuration changes to use fewer tx queues, part of the cpus are not XPS-mapped, so the select queue decision falls back to hash calculation and balancing is not guaranteed. Expand the XPS cpumask to enable using all cpus even when the number of tx queues is smaller than the number of cpus. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
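A hedged sketch of the idea: instead of mapping exactly one cpu per tx queue, spread all online cpus round-robin across the available queues (the helper name and the assignment policy here are illustrative):

    #include <linux/cpumask.h>
    #include <linux/netdevice.h>

    static void set_xps_cpumasks(struct net_device *dev, unsigned int num_txqs)
    {
        unsigned int txq, cpu;
        cpumask_var_t mask;

        if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
            return;

        for (txq = 0; txq < num_txqs; txq++) {
            cpumask_clear(mask);
            for_each_online_cpu(cpu)
                if (cpu % num_txqs == txq)  /* round-robin cpu -> queue */
                    cpumask_set_cpu(cpu, mask);
            netif_set_xps_queue(dev, mask, txq);
        }
        free_cpumask_var(mask);
    }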
2019-01-25  net/mlx5e: Take CQ decompress fields into a separate structure  (Tariq Toukan, 2 files, -61/+81)
Only the Receive CQ makes use of these fields. Take them out into a separate struct and save space in the generic CQ structure. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: RX, Make sure packet header does not cross page boundary  (Tariq Toukan, 3 files, -32/+9)
In the non-linear SKB memory scheme of Striding RQ, a packet header could cross a page boundary. This requires special care in the fast path that costs LoC, additional runtime instructions and branches. It could happen when the header (up to 256B) does not fit in a single stride. Avoid this by working with a stride size that fits the maximum possible header. Stride is increased from 64B to 256B. Performance: Tested packet rate for UDP streams, single ring, on ConnectX-5. Configuration: Set Striding RQ and LRO ON (to enable the non-linear SKB scheme). GRO OFF, early drop by TC rule. 64B: 4x worse memory utilization, no page-crossers headers - No degradation (5,887,305 pps). - The reduction in memory utilization is compensated by the saving of branches tests. 192B: 1.33x worse memory utilization, avoid page-crossers headers - Before: 5,727,252. After: 5,777,037. ~1% gain. 256B: Same memory util, no page-crossers - Before: 5,691,885. After: 5,748,007. ~1% gain. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net: Revert devlink health changes.  (David S. Miller, 11 files, -1785/+169)
This reverts the devlink health changes from 9/17/2019; Jiri wants things to be designed differently and it was agreed that the easiest way to do this is to start from the beginning again. Commits reverted: cb5ccfbe73b389470e1dc11061bb185ef4bc9aec 880ee82f0313453ec5a6cb122866ac057263066b c7af343b4e33578b7de91786a3f639c8cfa0d97b ff253fedab961b22117a73ab808fcfa9e6852b50 6f9d56132eb6d2603d4273cfc65bed914ec47acb fcd852c69d776c0f46c8f79e8e431e5cc6ddc7b7 8a66704a13d9713593342e29b4f0c19762f5746b 12bd0dcefe88782ac1c9fff632958dd1b71d27e5 aba25279c10094c5c97d09c3491ca86d00b4ad5e ce019faa70f81555fa17ebc1d5a03651f2e7e15a b8c45a033acc607201588f7665ba84207e5149e0 And the follow-on build fix: 33a0efa4baecd689da9474ce0e8b673eb6931c60 Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  bridge: remove duplicated include from br_multicast.c  (YueHaibing, 1 file, -1/+0)
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  mlxfw: Replace license text with SPDX identifiers and adjust copyrights  (Shalom Toledo, 9 files, -297/+20)
Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  tipc: remove dead code in struct tipc_topsrv  (Zhaolong Zhang, 1 file, -3/+0)
max_rcvbuf_size is no longer used since commit "414574a0af36". Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  Merge branch 'tcp_bbr-Improving-TCP-BBR-performance-for-WiFi-and-cellular-networks'  (David S. Miller, 2 files, -23/+161)
Priyaranjan Jha says: ==================== tcp_bbr: Improving TCP BBR performance for WiFi and cellular networks Ack aggregation is quite prevalent with wifi, cellular and cable modem link technologies, ACK decimation in middleboxes, and common offloading techniques such as TSO and GRO at end hosts. Previously, BBR was often cwnd-limited in the presence of severe ACK aggregation, which resulted in low throughput due to insufficient data in flight. To achieve good throughput for wifi and other paths with aggregation, this patch series implements an ACK aggregation estimator for BBR, which estimates the maximum recent degree of ACK aggregation and adapts cwnd based on it. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 (1) A preparatory patch, which refactors bbr_target_cwnd for generic inflight provisioning. (2) Implements the BBR ack aggregation estimator and adapts cwnd based on the measured degree of ACK aggregation. ==================== Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  tcp_bbr: adapt cwnd based on ack aggregation estimation  (Priyaranjan Jha, 2 files, -3/+123)
Aggregation effects are extremely common with wifi, cellular, and cable modem link technologies, ACK decimation in middleboxes, and LRO and GRO in receiving hosts. The aggregation can happen in either direction, data or ACKs, but in either case the aggregation effect is visible to the sender in the ACK stream. Previously BBR's sending was often limited by cwnd under severe ACK aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets were acked in bursts after long delays (e.g. one ACK acking 5*BDP after 5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving the bottleneck idle for potentially long periods. Note that loss-based congestion control does not have this issue because when facing aggregation it continues increasing cwnd after bursts of ACKs, growing cwnd until the buffer is full. To achieve good throughput in the presence of aggregation effects, this algorithm allows the BBR sender to put extra data in flight to keep the bottleneck utilized during silences in the ACK stream that it has evidence to suggest were caused by aggregation. A summary of the algorithm: when a burst of packets are acked by a stretched ACK or a burst of ACKs or both, BBR first estimates the expected amount of data that should have been acked, based on its estimated bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max filter to estimate the recent level of observed ACK aggregation. Then cwnd is increased by the ACK aggregation estimate. The larger cwnd avoids BBR being cwnd-limited in the face of ACK silences that recent history suggests were caused by aggregation. As a sanity check, the ACK aggregation degree is upper-bounded by the cwnd (at the time of measurement) and a global max of BW * 100ms. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 In our internal testing, we observed a significant increase in BBR throughput (measured using netperf), in a basic wifi setup. - Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi) - 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps - 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps Also, this code is running globally on YouTube TCP connections and produced significant bandwidth increases for YouTube traffic. This is based on Ian Swett's max_ack_height_ algorithm from the QUIC BBR implementation. Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
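A runnable toy model of the estimator's core arithmetic (the kernel version works in packets, maintains a proper windowed max filter across sampling epochs, and caps the estimate by cwnd and BW * 100 ms; all of that is omitted here):

    #include <stdint.h>
    #include <stdio.h>

    struct ack_aggr_est {
        uint64_t epoch_start_us;   /* when the current sampling epoch began */
        uint64_t epoch_acked;      /* bytes acked since epoch start */
        uint64_t extra_acked_max;  /* running max of the measured surplus */
    };

    static uint64_t ack_aggr_update(struct ack_aggr_est *e, uint64_t now_us,
                                    uint64_t newly_acked, uint64_t bw_bytes_per_us)
    {
        /* how much the estimated bandwidth says should have been acked by now */
        uint64_t expected = (now_us - e->epoch_start_us) * bw_bytes_per_us;
        uint64_t extra;

        e->epoch_acked += newly_acked;
        extra = e->epoch_acked > expected ? e->epoch_acked - expected : 0;
        if (extra > e->extra_acked_max)
            e->extra_acked_max = extra;
        return e->extra_acked_max;  /* added on top of the usual 2*BDP cwnd */
    }

    int main(void)
    {
        struct ack_aggr_est e = { 0 };

        /* a stretched ACK for 80 KB arrives 50 ms into the epoch on a path
         * estimated at 1 byte/us (~8 Mbit/s): ~30 KB more than expected */
        printf("extra_acked = %llu bytes\n",
               (unsigned long long)ack_aggr_update(&e, 50000, 80000, 1));
        return 0;
    }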
2019-01-25  tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning  (Priyaranjan Jha, 1 file, -21/+39)
Because bbr_target_cwnd() is really a general-purpose BBR helper for computing some volume of inflight data as a function of the estimated BDP, refactor it into following helper functions: - bbr_bdp() - bbr_quantization_budget() - bbr_inflight() Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
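A hedged sketch of the intended split, with units simplified to whole packets and percent gains (the kernel uses BBR_UNIT fixed-point gains and a TSO-segment-based quantization allowance):

    /* volume of data for a given gain over the estimated bandwidth-delay product */
    static unsigned int bbr_bdp(unsigned int bw_pkts_per_rtt, unsigned int gain_pct)
    {
        return bw_pkts_per_rtt * gain_pct / 100;
    }

    /* headroom so sender/receiver quantization (TSO, delayed ACKs) can't stall the pipe */
    static unsigned int bbr_quantization_budget(unsigned int cwnd_pkts)
    {
        return cwnd_pkts + 3;  /* the kernel adds roughly 3 * the TSO segment goal */
    }

    /* the value bbr_target_cwnd() used to compute in one step */
    static unsigned int bbr_inflight(unsigned int bw_pkts_per_rtt, unsigned int gain_pct)
    {
        return bbr_quantization_budget(bbr_bdp(bw_pkts_per_rtt, gain_pct));
    }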
2019-01-25  r8169: factor out PHY init sequence adjusting 10M and ALDPS  (Heiner Kallweit, 1 file, -28/+21)
A few chip versions use the same sequence to adjust 10M and ALDPS, so let's factor it out. This patch also fixes a (most likely) typo in rtl8168g_1_hw_phy_config. There, bit 8 in reg 0x14 on page 0x0bcc was set instead of cleared. According to the vendor driver this bit needs to be cleared in all cases. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  r8169: factor out disabling ALDPS  (Heiner Kallweit, 1 file, -20/+11)
Chip versions from RTL8168g onward use the same sequence to disable ALDPS (Advanced Link-Down Power Saving). So let's factor this out. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>