Age  Commit message  (Author, files changed, -removed/+added)
2019-01-26  net: hns3: remove dcb_ops->map_update in hclge_dcb  (Yunsheng Lin, 3 files, -14/+6)
After doing down/uninit/init/up in hclge_dcb, it is not necessary to call dcb_ops->map_update in enet, so hclge_map_update can be called directly in hclge_dcb. This prepares for calling hns3_nic_set_real_num_queue with the netdev down when the user changes the mqprio configuration. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: do reinitialization while mqprio configuration changed  (Yunsheng Lin, 2 files, -17/+21)
When the user changes the mqprio configuration, enet needs to be uninited and inited in addition to being downed and upped, because the queue number may change when the TC number changes. Also, it is more suitable to do the down/uninit/init/up operation in the hclge module using hclge_notify_client, because this config change may affect the PF and its VFs. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: After setting the loopback, add the status of getting MAC  (liuzhongzhu, 1 file, -1/+22)
After setting the serdes loopback, the MAC link status needs to be checked. If the status is still abnormal after 200 ms, a timeout error is returned. Signed-off-by: liuzhongzhu <liuzhongzhu@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
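For illustration only, a minimal sketch of such a status poll; the helper hclge_get_mac_link_status() and the constants are hypothetical stand-ins for the hns3-internal symbols:

    #include <linux/delay.h>

    #define MAC_LINK_STATUS_WAIT_MS   10
    #define MAC_LINK_STATUS_WAIT_NUM  20   /* 20 * 10 ms = 200 ms budget */

    /* hclge_get_mac_link_status() is assumed to return 1 for link up, 0 for down */
    static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int expect_status)
    {
        int i;

        for (i = 0; i < MAC_LINK_STATUS_WAIT_NUM; i++) {
            if (hclge_get_mac_link_status(hdev) == expect_status)
                return 0;
            msleep(MAC_LINK_STATUS_WAIT_MS);
        }
        return -EBUSY;  /* still not in the expected state after ~200 ms */
    }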
2019-01-26  net: hns3: fix broadcast promisc issue for revision 0x20  (Jian Shen, 2 files, -3/+11)
For revision 0x20, the VLAN filter is always bypassed when broadcast promiscuous mode is enabled. In this case, broadcast packets with any VLAN id can be accepted. Broadcast promiscuous mode should stay disabled until the user explicitly enables it. Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: fix return value handle issue for hclge_set_loopback()  (Jian Shen, 2 files, -4/+7)
In the current code, hclge_set_loopback() always returns 0, even when setting the loopback mode failed. This is incorrect. This patch fixes the return value handling for the loopback test. Fixes: 0f29fc23b21d ("net: hns3: Fix for loopback selftest failed problem") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: add error handling in hclge_ieee_setets  (Yunsheng Lin, 1 file, -2/+13)
Currently hclge_ieee_setets returns immediately when an error occurs, which may leave the netdev unable to come up. This patch adds error handling for the case where setting the ETS configuration fails. Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: clear pci private data when unload hns3 driver  (Jian Shen, 1 file, -0/+1)
When unloading the hns3 driver, the PCI private data should be cleared. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: hns3: don't update packet statistics for packets dropped by hardware  (Jian Shen, 1 file, -3/+0)
Packet statistics for netdev should not include the packets dropped by hardware. Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge branch 'r8169-add-EEE-support-for-RTL8168f'  (David S. Miller, 1 file, -32/+48)
Heiner Kallweit says: ==================== r8169: add EEE support for RTL8168f This series adds EEE support for RTL8168f. Again, the first patch adds the support and the second patch enables EEE per default. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: enable EEE per default on RTL8168f  (Heiner Kallweit, 1 file, -33/+2)
Enable EEE per default on RTL8168f. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: add EEE support for RTL8168f  (Heiner Kallweit, 1 file, -0/+47)
Add EEE support for RTL8168f to the recently added EEE handling framework in the driver. This patch keeps the chip defaults, meaning EEE is typically disabled initially and it is up to the user to enable it via ethtool. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Documentation: net: phy: switch documentation to rst format  (Heiner Kallweit, 3 files, -429/+448)
Switch phylib documentation to rst format. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  tcp: allow zerocopy with fastopen  (Willem de Bruijn, 3 files, -7/+6)
Accept MSG_ZEROCOPY in all the TCP states that allow sendmsg. Remove the explicit check for the ESTABLISHED and CLOSE_WAIT states. This requires correctly handling zerocopy state (uarg, sk_zckey) in all paths reachable from other TCP states, such as the EPIPE case in sk_stream_wait_connect, which a sendmsg() in an incorrect state will now hit. Most paths are already safe. The only extension needed is for TCP Fastopen active open, which can build an skb with data in tcp_send_syn_data. Pass the uarg along with the other fastopen state, so that this skb also generates a zerocopy notification on release. Tested with active and passive tcp fastopen packetdrill scripts at https://github.com/wdebruij/packetdrill/commit/1747eef03d25a2404e8132817d0f1244fd6f129d Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
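From the application side, the change means a zerocopy send can now ride on an active Fastopen SYN. A rough userspace sketch (destination address, port and error handling are placeholders; requires tcp_fastopen client support):

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #ifndef SO_ZEROCOPY
    #define SO_ZEROCOPY 60
    #endif
    #ifndef MSG_ZEROCOPY
    #define MSG_ZEROCOPY 0x4000000
    #endif
    #ifndef MSG_FASTOPEN
    #define MSG_FASTOPEN 0x20000000
    #endif

    int main(void)
    {
        struct sockaddr_in dst = {
            .sin_family = AF_INET,
            .sin_port = htons(8000),
            .sin_addr.s_addr = htonl(INADDR_LOOPBACK),
        };
        static char buf[4096];  /* zero-filled payload, pinned by the kernel */
        int one = 1;
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one));
        /* SYN + data + zerocopy: previously rejected because the socket
         * is not yet in ESTABLISHED/CLOSE_WAIT */
        sendto(fd, buf, sizeof(buf), MSG_FASTOPEN | MSG_ZEROCOPY,
               (struct sockaddr *)&dst, sizeof(dst));
        /* the completion notification arrives on the socket error queue */
        close(fd);
        return 0;
    }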
2019-01-26  ptp: fix debugfs_simple_attr.cocci warnings  (YueHaibing, 1 file, -8/+8)
Use DEFINE_DEBUGFS_ATTRIBUTE rather than DEFINE_SIMPLE_ATTRIBUTE for debugfs files. Semantic patch information: Rationale: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() imposes some significant overhead as compared to DEFINE_DEBUGFS_ATTRIBUTE + debugfs_create_file_unsafe(). Generated by: scripts/coccinelle/api/debugfs/debugfs_simple_attr.cocci Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
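The conversion pattern looks roughly like this (illustrative attribute, not the actual ptp code):

    #include <linux/debugfs.h>

    static int myval_get(void *data, u64 *val) { *val = *(u64 *)data; return 0; }
    static int myval_set(void *data, u64 val) { *(u64 *)data = val; return 0; }

    /* before: DEFINE_SIMPLE_ATTRIBUTE + debugfs_create_file() goes through the
     * full debugfs file proxy on every access */
    DEFINE_SIMPLE_ATTRIBUTE(myval_fops_old, myval_get, myval_set, "%llu\n");

    /* after: DEFINE_DEBUGFS_ATTRIBUTE generates read/write wrappers that take the
     * debugfs reference themselves, so the cheaper _unsafe creation can be used */
    DEFINE_DEBUGFS_ATTRIBUTE(myval_fops, myval_get, myval_set, "%llu\n");

    static u64 myval;

    static void myval_debugfs_init(struct dentry *parent)
    {
        debugfs_create_file_unsafe("myval", 0600, parent, &myval, &myval_fops);
    }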
2019-01-26  Merge branch 'r8169-add-EEE-support-for-RTL8168g+'  (David S. Miller, 1 file, -10/+206)
Heiner Kallweit says: ==================== r8169: add EEE support for RTL8168g+ This series adds general EEE support to be used with ethtool. In addition it implements EEE for chip versions from RTL8168g. The first patch leaves the default chip settings and the second enables EEE per default. This allows us to revert patch 2 w/o removing EEE support completely if we should face issues with EEE on particular chip versions. Unfortunately Realtek decided not to use the standard EEE MMD registers but to use proprietary registers. Therefore we can't use phylib functions like phy_ethtool_set_eee and have to reimplement the functionality. Tested on a system with RTL8168g (chip version 40). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: enable EEE per default on chip versions from RTL8168g  (Heiner Kallweit, 1 file, -0/+14)
Enable EEE per default on chip versions from RTL8168g. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  r8169: add general EEE support for chip versions from RTL8168g  (Heiner Kallweit, 1 file, -10/+192)
This patch adds the general framework to deal with EEE in this driver plus EEE support for chip versions from RTL8168g. We don't touch the default chip settings, therefore EEE will usually be disabled and it's up to the user to enable it. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
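The ethtool-facing side of such a framework typically looks like the hedged sketch below; the function names are illustrative, and the accesses r8169 performs through its vendor-specific PHY registers are omitted:

    #include <linux/ethtool.h>
    #include <linux/netdevice.h>

    static int rtl_get_eee(struct net_device *dev, struct ethtool_eee *data)
    {
        /* fill data->supported, data->advertised, data->lp_advertised and
         * data->eee_enabled from the vendor-specific PHY registers */
        return 0;
    }

    static int rtl_set_eee(struct net_device *dev, struct ethtool_eee *data)
    {
        /* write data->advertised and data->eee_enabled back to the PHY */
        return 0;
    }

    static const struct ethtool_ops rtl_ethtool_ops = {
        .get_eee = rtl_get_eee,
        .set_eee = rtl_set_eee,
    };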
2019-01-26  Merge branch 'ipv6-defrag-rbtree'  (David S. Miller, 8 files, -660/+527)
Peter Oskolkov says: ==================== net: IP defrag: use rbtrees in IPv6 defragmentation Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patchset contains four patches: - patch 1 moves rbtree-related code from IPv4 to files shared between IPv4/IPv6 - patch 2 changes IPv6 defragmentation code to use rbtrees for the defrag queue - patch 3 changes nf_conntrack IPv6 defragmentation code to use rbtrees - patch 4 changes the ip_defrag selftest to test the changes made in the previous three patches. Along the way, the 1280-byte restrictions are removed. I plan to introduce similar changes to the 6lowpan defragmentation code once I figure out how to test it. ==================== Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  selftests: net: ip_defrag: cover new IPv6 defrag behavior  (Peter Oskolkov, 2 files, -34/+51)
This patch adds several changes to the ip_defrag selftest, to cover new IPv6 defrag behavior: - min IPv6 frag size is now 8 instead of 1280 - new test cases to cover IPv6 defragmentation in nf_conntrack_reasm.c - new "permissive" mode in negative (overlap) tests: netfilter sometimes drops invalid packets without passing them to IPv6 underneath, and thus defragmentation sometimes succeeds when it is expected to fail; so the permissive mode does not fail the test if the correct reassembled datagram is received instead of a timeout. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: IP6 defrag: use rbtrees in nf_conntrack_reasm.c  (Peter Oskolkov, 1 file, -189/+71)
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses the common IP defragmentation queueing and reassembly code in IP6 defragmentation in nf_conntrack, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  net: IP6 defrag: use rbtrees for IPv6 defrag  (Peter Oskolkov, 2 files, -173/+71)
Currently, IPv6 defragmentation code drops non-last fragments that are smaller than 1280 bytes: see commit 0ed4229b08c1 ("ipv6: defrag: drop non-last frags smaller than min mtu") This behavior is not specified in IPv6 RFCs and appears to break compatibility with some IPv6 implementations, as reported here: https://www.spinics.net/lists/netdev/msg543846.html This patch re-uses the common IP defragmentation queueing and reassembly code in IPv6, removing the 1280 byte restriction. Signed-off-by: Peter Oskolkov <posk@google.com> Reported-by: Tom Herbert <tom@herbertland.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
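The core idea of the shared rbtree code, heavily simplified (the real code additionally coalesces runs of consecutive fragments; the struct and macro names below are illustrative, not the kernel's):

    #include <linux/rbtree.h>
    #include <linux/skbuff.h>

    struct fragrun_cb {
        int offset;  /* where this fragment's payload starts */
        int end;     /* offset + payload length */
    };
    #define FRAG_CB(skb) ((struct fragrun_cb *)((skb)->cb))

    /* keep fragments sorted by offset; reject overlaps without walking a long list */
    static int fragrun_insert(struct rb_root *root, struct sk_buff *skb)
    {
        struct rb_node **p = &root->rb_node, *parent = NULL;
        struct fragrun_cb *cb = FRAG_CB(skb);

        while (*p) {
            struct sk_buff *cur = rb_entry(*p, struct sk_buff, rbnode);

            parent = *p;
            if (cb->end <= FRAG_CB(cur)->offset)
                p = &parent->rb_left;
            else if (cb->offset >= FRAG_CB(cur)->end)
                p = &parent->rb_right;
            else
                return -EINVAL;  /* overlapping fragment: drop it */
        }
        rb_link_node(&skb->rbnode, parent, p);
        rb_insert_color(&skb->rbnode, root);
        return 0;
    }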
2019-01-26  net: IP defrag: encapsulate rbtree defrag code into callable functions  (Peter Oskolkov, 3 files, -264/+334)
This is a refactoring patch: without changing runtime behavior, it moves rbtree-related code from IPv4-specific files/functions into .h/.c defrag files shared with IPv6 defragmentation code. Signed-off-by: Peter Oskolkov <posk@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Florian Westphal <fw@strlen.de> Cc: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge branch 's390-qeth-next'  (David S. Miller, 5 files, -337/+167)
Julian Wiedmann says: ==================== s390/qeth: updates 2019-01-25 please apply a first batch of qeth patches for net-next, primarily touching the net_device parts of the driver. In addition to the usual refactoring & code consolidation, patch 7 makes use of netif_device_detach() to let the stack know when our control plane is down. This helps quite a bit with overall locking and proper init/shutdown sequencing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove VLAN tracking for L2 devices  (Julian Wiedmann, 4 files, -63/+18)
For recovery purposes, qeth keeps track of all registered VIDs. Replace this by using the infrastructure introduced in commit 9daae9bd47cf ("net: Call add/kill vid ndo on vlan filter feature toggling"). By managing NETIF_F_HW_VLAN_CTAG_FILTER as a hw_feature, netdev_update_features() will select it from dev->wanted_features and replay all of the netdevice's VIDs to its ndo_vlan_rx_add_vid() callback. z/VM NICs strictly require VLAN registration, so don't expose it as hw_feature there but add a little hack in qeth_enable_hw_features() to make things work regardless. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
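A hedged sketch of the mechanism the patch relies on; function names and the setup helper are illustrative, not the actual qeth code:

    #include <linux/netdevice.h>

    static int l2_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
    {
        /* program the VID into the card's VLAN registration table */
        return 0;
    }

    static int l2_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid)
    {
        /* remove the VID from the card again */
        return 0;
    }

    static const struct net_device_ops l2_netdev_ops = {
        .ndo_vlan_rx_add_vid  = l2_vlan_rx_add_vid,
        .ndo_vlan_rx_kill_vid = l2_vlan_rx_kill_vid,
    };

    static void l2_setup_vlan_filter(struct net_device *dev, bool on_zvm)
    {
        dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
        if (!on_zvm)  /* z/VM NICs must keep the filter forced on */
            dev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
    }

    /* with NETIF_F_HW_VLAN_CTAG_FILTER in hw_features, toggling the feature lets
     * netdev_update_features() replay all configured VIDs via ndo_vlan_rx_add_vid() */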
2019-01-26  s390/qeth: detach netdevice while card is offline  (Julian Wiedmann, 4 files, -102/+20)
When a qeth card is offline, it has no connection to the HW. So none of our control callbacks can run IO against it, and we can only cache the input (eg a new MAC address) without providing proper feedback to the caller. In this context, it seems much more reasonable to simply detach the netdevice and let the kernel reject any interaction with it. This also makes all sorts of internal state checks and locking obsolete. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
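A hedged sketch of the pattern; the driver hooks are illustrative:

    #include <linux/netdevice.h>

    static void card_set_offline(struct net_device *dev)
    {
        netif_device_detach(dev);  /* core now rejects most ndo/ethtool calls */
        /* ... tear down the HW connection ... */
    }

    static void card_set_online(struct net_device *dev)
    {
        /* ... re-establish the HW connection, reprogram MAC/VLAN state ... */
        netif_device_attach(dev);  /* device is usable again */
    }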
2019-01-26  s390/qeth: delay netdevice registration  (Julian Wiedmann, 3 files, -40/+43)
Re-order the code flow a bit so that all initial HW setup is done before putting the netdevice into play. For a netdevice that hasn't been registered before, we also don't need to re-enable its HW features or check for recovery actions. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove TX disable from online path  (Julian Wiedmann, 2 files, -3/+0)
At best this is redundant, at worst it papers over a race in the offline / online code. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: register MAC address earlier  (Julian Wiedmann, 1 file, -8/+12)
commit 4789a2188048 ("s390/qeth: fix race when setting MAC address") resolved a race where our initial programming of dev_addr into the HW and a call to ndo_set_mac_address() could run concurrently. In this case, we could end up getting confused about which address was actually set in the HW. The quick fix was to introduce additional locking that blocks any ndo_set_mac_address() while the device is being set online. But the race primarily originated from the fact that we first register the netdevice, and only then program its dev_addr. By re-ordering this sequence, userspace will only be able to change the MAC address _after_ we have finished with setting the initial dev_addr. Still, the same MAC address race can also occur during a subsequent call to qeth_l2_set_online(). So keep around the locking for now, until a follow-up patch fully resolves this. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: consolidate open/stop netdev ops  (Julian Wiedmann, 4 files, -124/+84)
The L2 and L3 code for these ops is almost identical, we only need to provide a custom ndo_validate_addr() for L2 that checks whether programming the MAC address succeeded. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: remove bogus netif_wake_queue()  (Julian Wiedmann, 1 file, -2/+0)
qeth_qdio_cq_handler() doesn't replenish the Output Queue(s), and thus has no reason to wake the txq. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  s390/qeth: streamline TX buffer management  (Julian Wiedmann, 1 file, -19/+14)
Consolidate the code that marks the current buffer to be flushed, and let qeth_fill_buffer() advance the Output Queue's buffer cursor. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Documentation: net: phy: reflect latest changes to phylib API  (Heiner Kallweit, 1 file, -8/+10)
Recent changes to the phylib API - removed phy_stop_interrupts - replaced phy_start_interrupts with phy_request_interrupt - moved some functionality from phy_connect() and phy_disconnect() to phy_start() and phy_stop() respectively. Reflect these changes in the documentation. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-26  Merge tag 'mlx5-updates-2019-01-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux  (David S. Miller, 10 files, -152/+211)
Saeed Mahameed says: ==================== mlx5-updates-2019-01-25 This series provides some updates to the mlx5 driver. From Tariq, 1) Make sure the RX packet header does not cross a page boundary To avoid page boundary crossing, use a stride size that fits the maximum possible header. Stride is increased from 64B to 256B. 2) CQ struct cleanup: Take CQ decompress fields into a separate structure From Moshe, 3) Expand XPS cpumask to cover all online cpus From Jason Gunthorpe and Tariq: 4) Compilation warning cleanup From Or, 5) Add trace points for flow tables create/destroy From Saeed, 6) Software stats update/folding improvements; this also solves a compilation warning on 32bit systems that was reported last release cycle by Arnd and Andrew. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  net/mlx5e: Reuse fold sw stats in representors  (Saeed Mahameed, 3 files, -27/+10)
Representor software stats are basic; this patch reuses mlx5e_fold_sw_stats in the representors, which sums up the basic stats64 for an mlx5e netdevice. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Present the representors SW stats when state is not opened  (Tariq Toukan, 1 file, -10/+7)
This behavior was already adopted for all other cases in the cited patch. The representor's functions were missed; here we modify them to behave similarly. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Separate between ethtool and netdev software stats folding  (Saeed Mahameed, 3 files, -12/+25)
mlx5e_grp_sw_update_stats can be called from two threads, 1) ndo_get_stats64 2) get_ethtool_stats For this reason and to minimize concurrency issue impact on 64bit machines mlx5e_grp_sw_update_stats folds the software stats into a temporary variable then copies it to the global driver stats, both ethtool and ndo statistics callbacks will use the global software stats variable to report whatever stats they need. Actually ndo_get_stats64 doesn't need to fold the whole software stats (mlx5e_grp_sw_update_stats), all it needs is five counters to fill the rtnl_link_stats64 relevant stats parameter. Hence this patch introduces a simpler helper function to fold software stats for ndo_get_stats64 which will work directly on rtnl_link_stats64 stats parameter and not on the global or even temporary mlx5e_sw_stats variable. Since now mlx5e_grp_sw_update_stats is not called by ndo_get_stats64 we can make it static and remove the temp var. Unlike mlx5e_grp_sw_update_stats the new fold stats function doesn't need to zero out the output statistics parameter since it is already done by the stack @dev_get_stats(). This patch is fixing stack usage of mlx5e_grp_sw_update_stats on x86 gcc-4.9 and higher, the concurrency issue between mlx5's ndo_get_stats64 and get_ethtool_stats is resolved as well. Fixes: 8bfaf07f7806 ("net/mlx5e: Present SW stats when state is not opened") Reported-by: Arnd Bergmann <arnd@arndb.de> Reported-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
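Conceptually, the new helper folds directly into the output structure, roughly like this hedged sketch (struct and field names only approximate the mlx5e layout):

    #include <linux/netdevice.h>

    struct rq_stats_like { u64 packets, bytes; };
    struct sq_stats_like { u64 packets, bytes, dropped; };
    struct channel_stats_like {
        struct rq_stats_like rq;
        struct sq_stats_like sq[8];
    };
    struct priv_like {
        int max_nch, max_tc;
        struct channel_stats_like *channel_stats;
    };

    static void fold_sw_stats64(struct priv_like *priv, struct rtnl_link_stats64 *s)
    {
        int i, j;

        /* 's' was already zeroed by dev_get_stats() */
        for (i = 0; i < priv->max_nch; i++) {
            struct channel_stats_like *c = &priv->channel_stats[i];

            s->rx_packets += c->rq.packets;
            s->rx_bytes   += c->rq.bytes;
            for (j = 0; j < priv->max_tc; j++) {
                s->tx_packets += c->sq[j].packets;
                s->tx_bytes   += c->sq[j].bytes;
                s->tx_dropped += c->sq[j].dropped;
            }
        }
    }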
2019-01-25  net/mlx5: Add trace points for flow tables create/destroy  (Or Gerlitz, 3 files, -0/+39)
We were not tracking flow tables so far; add trace points for their create/destroy. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Return the allocated flow directly from __mlx5e_add_fdb_flow  (Jason Gunthorpe, 1 file, -15/+14)
This construction confuses the compiler, which can't see that flow is initialized if !err: drivers/net/ethernet/mellanox/mlx5/core/en_tc.c: In function `mlx5e_configure_flower` drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:2727:28: warning: `flow` may be used uninitialized in this function [-Wmaybe-uninitialized] There is no reason for two function outputs; just return the pointer directly and use ERR_PTR to encode a failure. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: Expand XPS cpumask to cover all online cpus  (Moshe Shemesh, 2 files, -2/+33)
Currently we have one cpu in the XPS cpumask per tx queue. This is good enough for the default configuration where there is a tx queue per cpu. However, once the configuration changes to use fewer tx queues, part of the cpus are not XPS-mapped, so the select queue decision falls back to hash calculation and balancing is not guaranteed. Expand the XPS cpumask to enable using all cpus even when the number of tx queues is smaller than the number of cpus. Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
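A hedged sketch of the idea: instead of mapping exactly one cpu per tx queue, spread all online cpus round-robin across the available queues (the helper name and the assignment policy here are illustrative):

    #include <linux/cpumask.h>
    #include <linux/netdevice.h>

    static void set_xps_cpumasks(struct net_device *dev, unsigned int num_txqs)
    {
        unsigned int txq, cpu;
        cpumask_var_t mask;

        if (!zalloc_cpumask_var(&mask, GFP_KERNEL))
            return;

        for (txq = 0; txq < num_txqs; txq++) {
            cpumask_clear(mask);
            for_each_online_cpu(cpu)
                if (cpu % num_txqs == txq)  /* round-robin cpu -> queue */
                    cpumask_set_cpu(cpu, mask);
            netif_set_xps_queue(dev, mask, txq);
        }
        free_cpumask_var(mask);
    }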
2019-01-25  net/mlx5e: Take CQ decompress fields into a separate structure  (Tariq Toukan, 2 files, -61/+81)
Only the Receive CQ makes use of these fields. Take them out into a separate struct and save space in the generic CQ structure. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net/mlx5e: RX, Make sure packet header does not cross page boundary  (Tariq Toukan, 3 files, -32/+9)
In the non-linear SKB memory scheme of Striding RQ, a packet header could cross a page boundary. This requires special care in the fast path that costs LoC, additional runtime instructions and branches. It could happen when the header (up to 256B) does not fit in a single stride. Avoid this by working with a stride size that fits the maximum possible header. Stride is increased from 64B to 256B. Performance: Tested packet rate for UDP streams, single ring, on ConnectX-5. Configuration: Set Striding RQ and LRO ON (to enable the non-linear SKB scheme). GRO OFF, early drop by TC rule. 64B: 4x worse memory utilization, no page-crossers headers - No degradation (5,887,305 pps). - The reduction in memory utilization is compensated by the saving of branches tests. 192B: 1.33x worse memory utilization, avoid page-crossers headers - Before: 5,727,252. After: 5,777,037. ~1% gain. 256B: Same memory util, no page-crossers - Before: 5,691,885. After: 5,748,007. ~1% gain. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-01-25  net: Revert devlink health changes.  (David S. Miller, 11 files, -1785/+169)
This reverts the devlink health changes from 9/17/2019; Jiri wants things to be designed differently and it was agreed that the easiest way to do this is to start from the beginning again. Commits reverted: cb5ccfbe73b389470e1dc11061bb185ef4bc9aec 880ee82f0313453ec5a6cb122866ac057263066b c7af343b4e33578b7de91786a3f639c8cfa0d97b ff253fedab961b22117a73ab808fcfa9e6852b50 6f9d56132eb6d2603d4273cfc65bed914ec47acb fcd852c69d776c0f46c8f79e8e431e5cc6ddc7b7 8a66704a13d9713593342e29b4f0c19762f5746b 12bd0dcefe88782ac1c9fff632958dd1b71d27e5 aba25279c10094c5c97d09c3491ca86d00b4ad5e ce019faa70f81555fa17ebc1d5a03651f2e7e15a b8c45a033acc607201588f7665ba84207e5149e0 And the follow-on build fix: 33a0efa4baecd689da9474ce0e8b673eb6931c60 Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  bridge: remove duplicated include from br_multicast.c  (YueHaibing, 1 file, -1/+0)
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  mlxfw: Replace license text with SPDX identifiers and adjust copyrights  (Shalom Toledo, 9 files, -297/+20)
Signed-off-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  tipc: remove dead code in struct tipc_topsrv  (Zhaolong Zhang, 1 file, -3/+0)
max_rcvbuf_size is no longer used since commit "414574a0af36". Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  Merge branch 'tcp_bbr-Improving-TCP-BBR-performance-for-WiFi-and-cellular-networks'  (David S. Miller, 2 files, -23/+161)
Priyaranjan Jha says: ==================== tcp_bbr: Improving TCP BBR performance for WiFi and cellular networks Ack aggregation is quite prevalent with wifi, cellular and cable modem link technologies, ACK decimation in middleboxes, and common offloading techniques such as TSO and GRO at end hosts. Previously, BBR was often cwnd-limited in the presence of severe ACK aggregation, which resulted in low throughput due to insufficient data in flight. To achieve good throughput for wifi and other paths with aggregation, this patch series implements an ACK aggregation estimator for BBR, which estimates the maximum recent degree of ACK aggregation and adapts cwnd based on it. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 (1) A preparatory patch, which refactors bbr_target_cwnd for generic inflight provisioning. (2) Implements the BBR ack aggregation estimator and adapts cwnd based on the measured degree of ACK aggregation. ==================== Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  tcp_bbr: adapt cwnd based on ack aggregation estimation  (Priyaranjan Jha, 2 files, -3/+123)
Aggregation effects are extremely common with wifi, cellular, and cable modem link technologies, ACK decimation in middleboxes, and LRO and GRO in receiving hosts. The aggregation can happen in either direction, data or ACKs, but in either case the aggregation effect is visible to the sender in the ACK stream. Previously BBR's sending was often limited by cwnd under severe ACK aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets were acked in bursts after long delays (e.g. one ACK acking 5*BDP after 5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving the bottleneck idle for potentially long periods. Note that loss-based congestion control does not have this issue because when facing aggregation it continues increasing cwnd after bursts of ACKs, growing cwnd until the buffer is full. To achieve good throughput in the presence of aggregation effects, this algorithm allows the BBR sender to put extra data in flight to keep the bottleneck utilized during silences in the ACK stream that it has evidence to suggest were caused by aggregation. A summary of the algorithm: when a burst of packets are acked by a stretched ACK or a burst of ACKs or both, BBR first estimates the expected amount of data that should have been acked, based on its estimated bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max filter to estimate the recent level of observed ACK aggregation. Then cwnd is increased by the ACK aggregation estimate. The larger cwnd avoids BBR being cwnd-limited in the face of ACK silences that recent history suggests were caused by aggregation. As a sanity check, the ACK aggregation degree is upper-bounded by the cwnd (at the time of measurement) and a global max of BW * 100ms. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 In our internal testing, we observed a significant increase in BBR throughput (measured using netperf), in a basic wifi setup. - Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi) - 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps - 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps Also, this code is running globally on YouTube TCP connections and produced significant bandwidth increases for YouTube traffic. This is based on Ian Swett's max_ack_height_ algorithm from the QUIC BBR implementation. Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
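A runnable toy model of the estimator's core arithmetic (the kernel version works in packets, maintains a proper windowed max filter across sampling epochs, and caps the estimate by cwnd and BW * 100 ms; all of that is omitted here):

    #include <stdint.h>
    #include <stdio.h>

    struct ack_aggr_est {
        uint64_t epoch_start_us;   /* when the current sampling epoch began */
        uint64_t epoch_acked;      /* bytes acked since epoch start */
        uint64_t extra_acked_max;  /* running max of the measured surplus */
    };

    static uint64_t ack_aggr_update(struct ack_aggr_est *e, uint64_t now_us,
                                    uint64_t newly_acked, uint64_t bw_bytes_per_us)
    {
        /* how much the estimated bandwidth says should have been acked by now */
        uint64_t expected = (now_us - e->epoch_start_us) * bw_bytes_per_us;
        uint64_t extra;

        e->epoch_acked += newly_acked;
        extra = e->epoch_acked > expected ? e->epoch_acked - expected : 0;
        if (extra > e->extra_acked_max)
            e->extra_acked_max = extra;
        return e->extra_acked_max;  /* added on top of the usual 2*BDP cwnd */
    }

    int main(void)
    {
        struct ack_aggr_est e = { 0 };

        /* a stretched ACK for 80 KB arrives 50 ms into the epoch on a path
         * estimated at 1 byte/us (~8 Mbit/s): ~30 KB more than expected */
        printf("extra_acked = %llu bytes\n",
               (unsigned long long)ack_aggr_update(&e, 50000, 80000, 1));
        return 0;
    }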
2019-01-25  tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioning  (Priyaranjan Jha, 1 file, -21/+39)
Because bbr_target_cwnd() is really a general-purpose BBR helper for computing some volume of inflight data as a function of the estimated BDP, refactor it into following helper functions: - bbr_bdp() - bbr_quantization_budget() - bbr_inflight() Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
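A hedged sketch of the intended split, with units simplified to whole packets and percent gains (the kernel uses BBR_UNIT fixed-point gains and a TSO-segment-based quantization allowance):

    /* volume of data for a given gain over the estimated bandwidth-delay product */
    static unsigned int bbr_bdp(unsigned int bw_pkts_per_rtt, unsigned int gain_pct)
    {
        return bw_pkts_per_rtt * gain_pct / 100;
    }

    /* headroom so sender/receiver quantization (TSO, delayed ACKs) can't stall the pipe */
    static unsigned int bbr_quantization_budget(unsigned int cwnd_pkts)
    {
        return cwnd_pkts + 3;  /* the kernel adds roughly 3 * the TSO segment goal */
    }

    /* the value bbr_target_cwnd() used to compute in one step */
    static unsigned int bbr_inflight(unsigned int bw_pkts_per_rtt, unsigned int gain_pct)
    {
        return bbr_quantization_budget(bbr_bdp(bw_pkts_per_rtt, gain_pct));
    }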
2019-01-25  r8169: factor out PHY init sequence adjusting 10M and ALDPS  (Heiner Kallweit, 1 file, -28/+21)
A few chip versions use the same sequence to adjust 10M and ALDPS, so let's factor it out. This patch also fixes a (most likely) typo in rtl8168g_1_hw_phy_config. There, bit 8 in reg 0x14 on page 0x0bcc was set instead of cleared. According to the vendor driver this bit needs to be cleared in all cases. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25  r8169: factor out disabling ALDPS  (Heiner Kallweit, 1 file, -20/+11)
Chip versions from RTL8168g onward use the same sequence to disable ALDPS (Advanced Link-Down Power Saving). So let's factor this out. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>