path: root/drivers/net
Age  Commit message  Author  Files  Lines
2018-04-04  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next  Linus Torvalds  875  -16446/+62322
Pull networking updates from David Miller:
1) Support offloading wireless authentication to userspace via NL80211_CMD_EXTERNAL_AUTH, from Srinivas Dasari.
2) A lot of work on network namespace setup/teardown from Kirill Tkhai. Setup and cleanup of namespaces now all run asynchronously and thus performance is significantly increased.
3) Add rx/tx timestamping support to mv88e6xxx driver, from Brandon Streiff.
4) Support zerocopy on RDS sockets, from Sowmini Varadhan.
5) Use denser instruction encoding in x86 eBPF JIT, from Daniel Borkmann.
6) Support hw offload of vlan filtering in mvpp2 driver, from Maxime Chevallier.
7) Support grafting of child qdiscs in mlxsw driver, from Nogah Frankel.
8) Add packet forwarding tests to selftests, from Ido Schimmel.
9) Deal with sub-optimal GSO packets better in BBR congestion control, from Eric Dumazet.
10) Support 5-tuple hashing in ipv6 multipath routing, from David Ahern.
11) Add path MTU tests to selftests, from Stefano Brivio.
12) Various bits of IPSEC offloading support for mlx5, from Aviad Yehezkel, Yossi Kuperman, and Saeed Mahameed.
13) Support RSS spreading on ntuple filters in SFC driver, from Edward Cree.
14) Lots of sockmap work from John Fastabend. Applications can use eBPF to filter sendmsg and sendpage operations.
15) In-kernel receive TLS support, from Dave Watson.
16) Add XDP support to ixgbevf, this is significant because it should allow optimized XDP usage in various cloud environments. From Tony Nguyen.
17) Add new Intel E800 series "ice" ethernet driver, from Anirudh Venkataramanan et al.
18) IP fragmentation match offload support in nfp driver, from Pieter Jansen van Vuuren.
19) Support XDP redirect in i40e driver, from Björn Töpel.
20) Add BPF_RAW_TRACEPOINT program type for accessing the arguments of tracepoints in their raw form, from Alexei Starovoitov.
21) Lots of striding RQ improvements to mlx5 driver with many performance improvements, from Tariq Toukan.
22) Use rhashtable for inet frag reassembly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1678 commits) net: mvneta: improve suspend/resume net: mvneta: split rxq/txq init and txq deinit into SW and HW parts ipv6: frags: fix /proc/sys/net/ipv6/ip6frag_low_thresh net: bgmac: Fix endian access in bgmac_dma_tx_ring_free() net: bgmac: Correctly annotate register space route: check sysctl_fib_multipath_use_neigh earlier than hash fix typo in command value in drivers/net/phy/mdio-bitbang. sky2: Increase D3 delay to sky2 stops working after suspend net/mlx5e: Set EQE based as default TX interrupt moderation mode ibmvnic: Disable irqs before exiting reset from closed state net: sched: do not emit messages while holding spinlock vlan: also check phy_driver ts_info for vlan's real device Bluetooth: Mark expected switch fall-throughs Bluetooth: Set HCI_QUIRK_SIMULTANEOUS_DISCOVERY for BTUSB_QCA_ROME Bluetooth: btrsi: remove unused including <linux/version.h> Bluetooth: hci_bcm: Remove DMI quirk for the MINIX Z83-4 sh_eth: kill useless check in __sh_eth_get_regs() sh_eth: add sh_eth_cpu_data::no_xdfar flag ipv6: factorize sk_wmem_alloc updates done by __ip6_append_data() ipv4: factorize sk_wmem_alloc updates done by __ip_append_data() ...
2018-04-03  Merge tag 'arch-removal' of ↵  Linus Torvalds  22  -9359/+10
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull removal of obsolete architecture ports from Arnd Bergmann:
"This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users.
In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees.
[ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ]
The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases.
After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-03  Merge tag 'nds32-for-linus-4.17' of ↵  Linus Torvalds  1  -3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux Pull nds32 architecture support from Greentime Hu: "This contains the core nds32 Linux port (including interrupt controller driver and timer driver), which has been through seven rounds of review on mailing list. It is able to boot to shell and passes most LTP-2017 testsuites in nds32 AE3XX platform: Total Tests: 1901 Total Skipped Tests: 618 Total Failures: 78" Reviewed-by: Arnd Bergmann <arnd@arndb.de> * tag 'nds32-for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux: (44 commits) nds32: To use the generic dump_stack() nds32: fix building failed if using elf toolchain. nios2: add ioremap_nocache declaration before include asm-generic/io.h. nds32: fix building failed if using older version gcc. dt-bindings: timer: Add andestech atcpit100 timer binding doc clocksource/drivers/atcpit100: VDSO support clocksource/drivers/atcpit100: Add andestech atcpit100 timer net: faraday add nds32 support. irqchip: Andestech Internal Vector Interrupt Controller driver dt-bindings: interrupt-controller: Andestech Internal Vector Interrupt Controller dt-bindings: nds32 SoC Bindings dt-bindings: nds32 L2 cache controller Bindings dt-bindings: nds32 CPU Bindings MAINTAINERS: Add nds32 nds32: Build infrastructure nds32: defconfig nds32: Miscellaneous header files nds32: Device tree support nds32: Generic timers support nds32: Loadable modules ...
2018-04-02  net: mvneta: improve suspend/resume  Jisheng Zhang  1  -7/+62
Current suspend/resume implementation reuses the mvneta_open() and mvneta_close(), but it could be optimized to take only necessary actions during suspend/resume. One obvious problem of current implementation is: after hundreds of system suspend/resume cycles, the resume of mvneta could fail due to fragmented dma coherent memory. After this patch, the non-necessary memory alloc/free is optimized out. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02  net: mvneta: split rxq/txq init and txq deinit into SW and HW parts  Jisheng Zhang  1  -19/+66
This is to prepare the suspend/resume improvement in next patch. The SW parts can be optimized out during resume. As for rxq handling during suspend, we'd like to drop packets by calling mvneta_rxq_drop_pkts() which is both SW and HW operation, so we don't split rxq deinit. Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02  net: bgmac: Fix endian access in bgmac_dma_tx_ring_free()  Florian Fainelli  1  -1/+2
bgmac_dma_tx_ring_free() assigns the ctl1 word, which is a little-endian 32-bit word, without using proper accessors. Fix this and, because a length cannot be negative, use unsigned int while at it. Fixes: 9cde94506eac ("bgmac: implement scatter/gather support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
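As a rough userspace illustration of why an accessor matters here, the sketch below stores and loads a 32-bit little-endian word byte by byte, so the result does not depend on the host's byte order. The struct and function names are hypothetical and are not taken from the bgmac driver; in the kernel, the equivalent job is done by le32_to_cpu() and cpu_to_le32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor; only the little-endian control word is modeled. */
struct example_dma_desc {
    uint8_t ctl1[4]; /* 32-bit word as laid out in memory, least significant byte first */
};

/* Store a CPU-order value as little-endian bytes. */
void example_desc_set_ctl1(struct example_dma_desc *d, uint32_t v)
{
    d->ctl1[0] = (uint8_t)(v & 0xff);
    d->ctl1[1] = (uint8_t)((v >> 8) & 0xff);
    d->ctl1[2] = (uint8_t)((v >> 16) & 0xff);
    d->ctl1[3] = (uint8_t)((v >> 24) & 0xff);
}

/* Load the little-endian bytes back into a CPU-order value. */
uint32_t example_desc_get_ctl1(const struct example_dma_desc *d)
{
    return (uint32_t)d->ctl1[0] |
           ((uint32_t)d->ctl1[1] << 8) |
           ((uint32_t)d->ctl1[2] << 16) |
           ((uint32_t)d->ctl1[3] << 24);
}
```

Reading the word through such a helper gives the same value on big-endian and little-endian hosts, which a plain assignment from the descriptor field would not.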
2018-04-02  net: bgmac: Correctly annotate register space  Florian Fainelli  1  -3/+3
All the members: base, idm_base and nicpm_base should be annotated with __iomem since they are pointers to register space. This fixes a bunch of sparse reported warnings. Fixes: f6a95a24957a ("net: ethernet: bgmac: Add platform device support") Fixes: dd5c5d037f5e ("net: ethernet: bgmac: add NS2 support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02  fix typo in command value in drivers/net/phy/mdio-bitbang.  Frans Meulenbroeks  1  -1/+1
mdio-bitbang mentioned 10 for both read and write. However mdio read opcode is 10 and write opcode is 01 Fixed comment. Signed-off-by: Frans Meulenbroeks <fransmeulenbroeks@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
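To make the opcode values concrete, here is a minimal sketch that packs the leading fields of a Clause 22 MDIO management frame: start bits 01, then the 2-bit opcode (10 for read, 01 for write), then the 5-bit PHY and register addresses. The enum and function names are illustrative and are not the identifiers used in mdio-bitbang.c.

```c
#include <assert.h>
#include <stdint.h>

/* Clause 22 field values; names are hypothetical. */
enum {
    EXAMPLE_MDIO_START = 0x1, /* binary 01 */
    EXAMPLE_MDIO_WRITE = 0x1, /* binary 01 */
    EXAMPLE_MDIO_READ  = 0x2  /* binary 10 */
};

/* Pack ST(2) OP(2) PHYAD(5) REGAD(5) into the low 14 bits of the result. */
uint16_t example_mdio_header(unsigned int op, unsigned int phy, unsigned int reg)
{
    assert(phy < 32 && reg < 32);
    return (uint16_t)((EXAMPLE_MDIO_START << 12) | ((op & 0x3) << 10) |
                      (phy << 5) | reg);
}
```

The turnaround and data bits that follow the header are left out to keep the sketch short.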
2018-04-02  sky2: Increase D3 delay to sky2 stops working after suspend  Kai-Heng Feng  1  -1/+1
The sky2 ethernet stops working after system resume from suspend: [ 582.852065] sky2 0000:04:00.0: Refused to change power state, currently in D3 The current 150ms delay is not enough; changing it to 200ms solves the issue. BugLink: https://bugs.launchpad.net/bugs/1758507 Cc: Stable <stable@vger.kernel.org> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02  net/mlx5e: Set EQE based as default TX interrupt moderation mode  Tal Gilboa  1  -4/+4
The default TX moderation mode was mistakenly set to CQE based. The intention was to add a control ability in order to improve some specific use-cases. In general, we prefer to use EQE based moderation as it gives much better numbers for the common cases. CQE based causes a degradation in the common case since it resets the moderation timer on CQE generation. This causes an issue when TSO is well utilized (large TSO sessions). The timer is set to 16us so traffic of ~64KB TSO sessions per second would mean timer reset (CQE per TSO session -> long time between CQEs). In this case we quickly reach the tcp_limit_output_bytes (256KB by default) and cause a halt in TX traffic. By setting EQE based moderation we make sure timer would expire after 16us regardless of the packet rate. This fixes a packet rate degradation of up to 40% and a bandwidth degradation of up to 23%. Fixes: 0088cbbc4b66 ("net/mlx5e: Enable CQE based moderation on TX CQ") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
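The difference between the two timer behaviours can be illustrated with a toy model. This is only a sketch under simplifying assumptions: completions arrive at known times, the packet-count threshold is ignored, and in the EQE-like mode the timer is assumed to run from the first pending completion. The function name is hypothetical and does not correspond to anything in mlx5.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the time (in us) at which the first interrupt fires for completions
 * arriving at t[0..n-1] (sorted, in us) with a moderation period of period_us.
 * restart_on_cqe != 0: every completion restarts the timer (CQE-like mode).
 * restart_on_cqe == 0: the timer keeps running from the first completion
 * (EQE-like mode, simplified).
 */
unsigned long example_first_irq_time(const unsigned long *t, size_t n,
                                     unsigned long period_us, int restart_on_cqe)
{
    unsigned long deadline;
    size_t i;

    assert(n > 0);
    deadline = t[0] + period_us;
    for (i = 1; i < n && t[i] < deadline; i++) {
        if (restart_on_cqe)
            deadline = t[i] + period_us; /* push the interrupt further out */
    }
    return deadline;
}
```

With completions every 10us and a 16us period, the restart-on-completion variant keeps postponing the interrupt until the stream pauses, while the other variant fires 16us after the first completion.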
2018-04-02  ibmvnic: Disable irqs before exiting reset from closed state  John Allen  1  -10/+18
When the driver is closed, all the associated irqs are disabled. In the event that the driver exits a reset in the closed state, we should be consistent with the state we are in directly after a close. So before we exit the reset routine, all irqs should be disabled as well. This will prevent the irqs from being enabled twice in this case and reporting a number of noisy warning traces. Signed-off-by: John Allen <jallen@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-02  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  David S. Miller  21  -173/+262
Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c, we had some overlapping changes: 1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE --> MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE 2) In 'net-next' params->log_rq_size is renamed to be params->log_rq_mtu_frames. 3) In 'net-next' params->hard_mtu is added. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  sh_eth: kill useless check in __sh_eth_get_regs()  Sergei Shtylyov  1  -15/+10
Iff TSU registers exist on a given [G]Ether controller, they always include the CAM entry table registers (TSU_ADR{H|L}<n>), thus the check for invalid TSU_ADRH0 offset in __sh_eth_get_regs() is useless... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  sh_eth: add sh_eth_cpu_data::no_xdfar flag  Sergei Shtylyov  2  -2/+4
The commit 6ded286555c2 ("sh_eth: Fix RX recovery on R-Car in case of RX ring underrun") added a check for a bad RDFAR offset in sh_eth_rx(), so that the code could work on the R-Car Ether controllers which don't have this register (and TDFAR), then the commit 3365711df02 ("sh_eth: WARN on access to a register not implemented in a particular chip") replaced offset 0 with 0xffff. Adding/checking the 'no_xdfar' bit field in the 'struct sh_eth_cpu_data' instead results in less object code... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  Revert "net: usb: asix88179_178a: de-duplicate code"  David S. Miller  1  -31/+86
This reverts commit cf29bded91f9153208c4c2145d602cd82430ad1a. It causes regressions. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  net/mlx4_en: CHECKSUM_COMPLETE support for fragments  Eric Dumazet  1  -6/+4
Refine the RX checksumming handling to propagate the hardware-provided checksum so that we do not have to compute it later in software. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Tariq Toukan <tariqt@mellanox.com> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: spectrum: Don't use resource ID of 0  Petr Machata  1  -1/+1
In commit 145307460ba9 ("devlink: Remove top_hierarchy arg to devlink_resource_register"), the "top_hierarchy" parameter to devlink_resource_register() was removed in favor of using the parameter "parent_resource_id" exclusively to determine who the parent is. The root node's resource ID for this purpose is DEVLINK_RESOURCE_ID_PARENT_TOP with the value 0. It is therefore problematic that the resource MLXSW_SP_RESOURCE_KVD has also ID of 0. Fix this by numbering driver-specific resources from 1. Fixes: 145307460ba9 ("devlink: Remove top_hierarchy arg to devlink_resource_register") Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
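The collision described above can be sketched in a few lines. The identifiers below are illustrative stand-ins, not the real devlink or mlxsw names; the only fact taken from the commit is that the "top" parent ID has the value 0, so driver resource IDs should start at 1.

```c
#include <assert.h>

/* Stand-in for DEVLINK_RESOURCE_ID_PARENT_TOP, which the commit says is 0. */
#define EXAMPLE_PARENT_TOP_ID 0

/* Hypothetical driver resource IDs, numbered from 1 so none aliases the root. */
enum example_resource_id {
    EXAMPLE_RESOURCE_KVD = 1,
    EXAMPLE_RESOURCE_KVD_LINEAR,
    EXAMPLE_RESOURCE_KVD_HASH
};

/* A parent ID of 0 means "attach at the top"; any other value names a resource. */
int example_parent_is_top(unsigned int parent_resource_id)
{
    return parent_resource_id == EXAMPLE_PARENT_TOP_ID;
}
```

Had EXAMPLE_RESOURCE_KVD been 0, a child naming it as parent would have been indistinguishable from a top-level resource.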
2018-04-01  mlxsw: spectrum: Pass mlxsw_core as arg of mlxsw_sp_kvdl_resources_register()  Jiri Pirko  3  -4/+4
Pass struct mlxsw_core instead of devlink since it is nicer within mlxsw code and we need both structs in mlxsw_sp_kvdl_resources_register() anyway. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: Move "resources_query_enable" out of mlxsw_config_profile  Jiri Pirko  6  -12/+8
As struct mlxsw_config_profile is mapped to the payload of the FW command of the same name, resources_query_enable flag does not belong there. Move it to struct mlxsw_driver. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: Move "used_kvd_sizes" check to mlxsw_pci_config_profile  Jiri Pirko  3  -6/+4
The check should be done directly in mlxsw_pci_config_profile, as for other profile items. Also, be consistent in naming with the rest and rename to "used_kvd_sizes". Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: core: Fix arg name of MLXSW_CORE_RES_VALID and MLXSW_CORE_RES_GET  Jiri Pirko  1  -4/+4
First arg of these helpers should be "mlxsw_core". Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: remove kvd_hash_granularity from config profile struct  Jiri Pirko  2  -4/+2
This should not be part of the struct, as the struct fields are tightly coupled with the FW command payload of the same name. Just use the "granularity" define directly, as in other places. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: spectrum: Change KVD linear parts from list to array  Jiri Pirko  1  -143/+92
The parts info is an array. The parts copy this info array, yet they are a list. So make the indexing according to the id and change the list of parts into an array of parts. This helps to eliminate lookups and constructs like mlxsw_sp_kvdl_part_update() (took me some non-trivial time to figure out what is going on there). Alongside that, introduce a helper macro to define the parts infos. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
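The idea of indexing parts by ID instead of walking a list might look roughly like the sketch below. All type names, IDs and allocation sizes are hypothetical and chosen only for illustration; they are not copied from spectrum_kvdl.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical part IDs; the last entry gives the array size. */
enum example_part_id {
    EXAMPLE_PART_SINGLE,
    EXAMPLE_PART_CHUNKS,
    EXAMPLE_PART_LARGE_CHUNKS,
    EXAMPLE_PART_MAX
};

struct example_part_info { unsigned int alloc_size; };
struct example_part { const struct example_part_info *info; unsigned int usage; };
struct example_kvdl { struct example_part parts[EXAMPLE_PART_MAX]; };

/* Static description of each part, indexed by its ID (sizes are made up). */
static const struct example_part_info example_part_infos[EXAMPLE_PART_MAX] = {
    [EXAMPLE_PART_SINGLE]       = { .alloc_size = 1 },
    [EXAMPLE_PART_CHUNKS]       = { .alloc_size = 32 },
    [EXAMPLE_PART_LARGE_CHUNKS] = { .alloc_size = 512 },
};

/* Each part points at the info entry with the same index. */
void example_kvdl_init(struct example_kvdl *k)
{
    for (size_t i = 0; i < EXAMPLE_PART_MAX; i++) {
        k->parts[i].info = &example_part_infos[i];
        k->parts[i].usage = 0;
    }
}

/* Constant-time lookup by ID; no list traversal is needed. */
struct example_part *example_kvdl_part(struct example_kvdl *k, enum example_part_id id)
{
    assert(id < EXAMPLE_PART_MAX);
    return &k->parts[id];
}
```

Because the part array and the info array share one index space, a lookup is a single array access.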
2018-04-01  mlxsw: Constify devlink_resource_ops  Jiri Pirko  2  -4/+4
devlink_resource_ops should be const as the arg of register function is also const. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  mlxsw: spectrum_kvdl: Fix handling of resource_size_param  Jiri Pirko  1  -33/+14
Current code uses global variables, adjusts them and passes pointer down to devlink. With every other mlxsw_core instance, the previously passed pointer values are rewritten. Fix this by de-globalizing the variables. Fixes: 7f47b19bd744 ("mlxsw: spectrum_kvdl: Add support for per part occupancy") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
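The bug class described here, a registry keeping a pointer to storage that a later instance overwrites, can be avoided by giving each instance its own copy. The sketch below is a simplified model with hypothetical names; the small array stands in for whatever the registry (devlink in the commit) keeps internally.

```c
#include <assert.h>

/* Hypothetical size parameters handed to a registry by pointer. */
struct example_size_params { unsigned long size_min, size_max; };

/* Each instance owns its parameters instead of sharing one global. */
struct example_instance { struct example_size_params linear_params; };

/* Stand-in for a registry that stores the pointer it is given. */
const struct example_size_params *example_registered[2];

void example_instance_register(struct example_instance *inst, unsigned int slot,
                               unsigned long size_max)
{
    assert(slot < 2);
    inst->linear_params.size_min = 0;
    inst->linear_params.size_max = size_max;
    example_registered[slot] = &inst->linear_params; /* points at per-instance storage */
}
```

Registering a second instance leaves the values seen through the first instance's pointer untouched, which would not hold if both pointed at one shared global.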
2018-04-01  mlxsw: spectrum_acl: Fix flex actions header ifndef define construct  Jiri Pirko  1  -2/+2
Fix copy&paste error in flex actions header ifndef define construct Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  cxgb4: LLD driver changes to support TLS  Atul Gupta  3  -15/+131
Read the Inline TLS capability from firmware. Determine the area reserved for storing the keys. Dump the Inline TLS tx and rx records count. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Reviewed-by: Michael Werner <werner@chelsio.com> Reviewed-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  cxgb4: Inline TLS FW Interface  Atul Gupta  3  -6/+283
Key area size in hw-config file. CPL struct for TLS request and response. Work request for Inline TLS. Signed-off-by: Atul Gupta <atul.gupta@chelsio.com> Reviewed-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  David S. Miller  14  -115/+794
Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-31 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Add raw BPF tracepoint API in order to have a BPF program type that can access kernel internal arguments of the tracepoints in their raw form similar to kprobes based BPF programs. This infrastructure also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which returns an anon-inode backed fd for the tracepoint object that allows for automatic detach of the BPF program resp. unregistering of the tracepoint probe on fd release, from Alexei. 2) Add new BPF cgroup hooks at bind() and connect() entry in order to allow BPF programs to reject, inspect or modify user space passed struct sockaddr, and as well a hook at post bind time once the port has been allocated. They are used in FB's container management engine for implementing policy, replacing fragile LD_PRELOAD wrapper intercepting bind() and connect() calls that only works in limited scenarios like glibc based apps but not for other runtimes in containerized applications, from Andrey. 3) BPF_F_INGRESS flag support has been added to sockmap programs for their redirect helper call bringing it in line with cls_bpf based programs. Support is added for both variants of sockmap programs, meaning for tx ULP hooks as well as recv skb hooks, from John. 4) Various improvements on BPF side for the nfp driver, besides others this work adds BPF map update and delete helper call support from the datapath, JITing of 32 and 64 bit XADD instructions as well as offload support of bpf_get_prandom_u32() call. Initial implementation of nfp packet cache has been tackled that optimizes memory access (see merge commit for further details), from Jakub and Jiong. 5) Removal of struct bpf_verifier_env argument from the print_bpf_insn() API has been done in order to prepare to use print_bpf_insn() soon out of perf tool directly. 
This makes the print_bpf_insn() API more generic and pushes the env into private data. bpftool is adjusted as well with the print_bpf_insn() argument removal, from Jiri. 6) Couple of cleanups and prep work for the upcoming BTF (BPF Type Format). The latter will reuse the current BPF verifier log as well, thus bpf_verifier_log() is further generalized, from Martin. 7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read and write support has been added in similar fashion to existing IPv6 IPV6_TCLASS socket option we already have, from Nikita. 8) Fixes in recent sockmap scatterlist API usage, which did not use sg_init_table() for initialization thus triggering a BUG_ON() in scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and uses a small helper sg_init_marker() to properly handle the affected cases, from Prashant. 9) Let the BPF core follow IDR code convention and therefore use the idr_preload() and idr_preload_end() helpers, which would also help idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory pressure, from Shaohua. 10) Last but not least, a spelling fix in an error message for the BPF cookie UID helper under BPF sample code, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Add ULP calls to stop and restart IRQs.  Michael Chan  3  -17/+90
When the driver needs to re-initialize the IRQ vectors, we make the new ulp_irq_stop() call to tell the RDMA driver to disable and free the IRQ vectors. After IRQ vectors have been re-initialized, we make the ulp_irq_restart() call to tell the RDMA driver that IRQs can be restarted. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Reserve completion rings and MSIX for bnxt_re RDMA driver.  Michael Chan  3  -16/+65
Add additional logic to reserve completion rings for the bnxt_re driver when it requests MSIX vectors. The function bnxt_cp_rings_in_use() will return the total number of completion rings used by both drivers that need to be reserved. If the network interface is up, we will close and open the NIC to reserve the new set of completion rings and re-initialize the vectors. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Refactor bnxt_need_reserve_rings().  Michael Chan  1  -32/+25
Refactor bnxt_need_reserve_rings() slightly so that __bnxt_reserve_rings() can call it and remove some duplicated code. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Add IRQ remapping logic.  Michael Chan  1  -17/+42
Add remapping logic so that bnxt_en can use any arbitrary MSIX vectors. This will allow the driver to reserve one range of MSIX vectors to be used by both bnxt_en and bnxt_re. bnxt_en can now skip over the MSIX vectors used by bnxt_re. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Change IRQ assignment for RDMA driver.  Michael Chan  3  -3/+61
In the current code, the range of MSIX vectors allocated for the RDMA driver is disjoint from the network driver. This creates a problem for the new firmware ring reservation scheme. The new scheme requires the reserved completion rings/MSIX vectors to be in a contiguous range. Change the logic to allocate RDMA MSIX vectors to be contiguous with the vectors used by bnxt_en on new firmware using the new scheme. The new function bnxt_get_num_msix() calculates the exact number of vectors needed by both drivers. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Improve ring allocation logic.  Michael Chan  2  -15/+21
Currently, the driver code makes some assumptions about the group index and the map index of rings. This makes the code more difficult to understand and less flexible. Improve it by adding the grp_idx and map_idx fields explicitly to the bnxt_ring_struct as a union. The grp_idx is initialized for each tx ring and rx agg ring during init time. We do the same for the map_idx for each cmpl ring. The grp_idx ties the tx ring to the ring group. The map_idx is the doorbell index of the ring. With this new infrastructure, we can change the ring index mapping scheme easily in the future. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Improve valid bit checking in firmware response message.  Michael Chan  2  -5/+18
When firmware sends a DMA response to the driver, the last byte of the message will be set to 1 to indicate that the whole response is valid. The driver waits for the message to be valid before reading the message. The firmware spec allows these response messages to increase in length by adding new fields to the end of these messages. The older spec's valid location may become a new field in a newer spec. To guarantee compatibility, the driver should zero the valid byte before interpreting the entire message so that any new fields not implemented by the older spec will be read as zero. For messages that are forwarded to VFs, we need to set the length and re-instate the valid bit so the VF will see the valid response. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
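The valid-byte handling can be sketched as follows. This is a simplified model, not the bnxt_en code: the response is a plain byte buffer, the length reported by firmware is passed in as resp_len, and the function name and buffer size are hypothetical. The one fact taken from the commit is that the last byte of the message is set to 1 when the response is complete and should be zeroed before the message is interpreted.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_RESP_BUF_LEN 8 /* hypothetical buffer size */

/*
 * Check the valid byte at the end of a resp_len-byte response.
 * Returns 0 and clears the valid byte if the response is complete,
 * or -1 if firmware has not finished writing it.
 */
int example_resp_consume(uint8_t *buf, unsigned int resp_len)
{
    assert(resp_len >= 1 && resp_len <= EXAMPLE_RESP_BUF_LEN);
    if (buf[resp_len - 1] != 1)
        return -1;
    /* A newer spec may define a field at this offset; make it read as zero. */
    buf[resp_len - 1] = 0;
    return 0;
}
```

After the call, a driver built against a longer response layout would see zero in any field that the shorter, older response did not carry.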
2018-04-01  bnxt_en: Improve resource accounting for SRIOV.  Michael Chan  1  -10/+8
When VFs are created, the current code subtracts the maximum VF resources from the PF's pool. This under-estimates the resources remaining in the PF pool. Instead, we should subtract the minimum VF resources. The VF minimum resources are guaranteed to the VFs and only these should be subtracted from the PF's pool. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
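The accounting change amounts to a small piece of arithmetic, shown here with made-up numbers. The function name and the figures are hypothetical; the point is only that subtracting the guaranteed minimum per VF leaves more in the PF pool than subtracting the maximum.

```c
#include <assert.h>

/* Resources left in the PF pool after guaranteeing per_vf to each of num_vfs VFs. */
unsigned int example_pf_pool_remaining(unsigned int total, unsigned int num_vfs,
                                       unsigned int per_vf)
{
    unsigned int reserved = num_vfs * per_vf;

    assert(reserved <= total);
    return total - reserved;
}
```

For example, with a pool of 128 rings and 8 VFs, a minimum of 2 rings per VF leaves 112 in the pool, while a maximum of 8 per VF would leave only 64.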
2018-04-01  bnxt_en: Check max_tx_scheduler_inputs value from firmware.  Michael Chan  3  -2/+19
When checking for the maximum pre-set TX channels for ethtool -l, we need to check the current max_tx_scheduler_inputs parameter from firmware. This parameter specifies the max input for the internal QoS nodes currently available to this function. The function's TX rings will be capped by this parameter. By adding this logic, we provide a more accurate pre-set max TX channels to the user. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Add extended port statistics support  Vasundhara Volam  3  -2/+81
Gather periodic extended port statistics, if the device is PF and link is up. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Include additional hardware port statistics in ethtool -S.  Vasundhara Volam  1  -0/+5
Include additional hardware port statistics in ethtool -S, which are useful for debugging. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Add support for ndo_set_vf_trust  Vasundhara Volam  4  -9/+37
Trusted VFs are allowed to modify MAC address, even when PF has assigned one. Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: fix clear flags in ethtool reset handling  Scott Branden  1  -2/+6
Clear the flags for the specified components when the reset command is processed successfully. Fixes: 6502ad5963a5 ("bnxt_en: Add ETH_RESET_AP support") Signed-off-by: Scott Branden <scott.branden@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Use a dedicated VNIC mode for RDMA.  Michael Chan  2  -4/+15
If the RDMA driver is registered, use a new VNIC mode that allows RDMA traffic to be seen on the netdev in promiscuous mode. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Adjust default rings for multi-port NICs.  Michael Chan  1  -3/+9
Change the default ring logic to select default number of rings to be up to 8 per port if the default rings x NIC ports <= total CPUs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  bnxt_en: Update firmware interface to 1.9.1.15.  Michael Chan  4  -103/+210
Minor changes, such as new extended port statistics. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01  Merge tag 'mlx5-updates-2018-03-30' of ↵  David S. Miller  9  -443/+402
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-03-30 This series contains updates to mlx5 core and mlx5e netdev drivers. The main highlight of this series is the RX optimizations for striding RQ path, introduced by Tariq. First Four patches are trivial misc cleanups. - Spelling mistake fix - Dead code removal - Warning messages RX optimizations for striding RQ: 1) RX refactoring, cleanups and micro optimizations - MTU calculation simplifications, obsoletes some WQEs-to-packets translation functions and helps delete ~60 LOC. - Do not busy-wait a pending UMR completion. - post the new values of UMR WQE inline, instead of using a data pointer. - use pre-initialized structures to save calculations in datapath. 2) Use linear SKB in Striding RQ "build_skb", (Using linear SKB has many advantages): - Saves a memcpy of the headers. - No page-boundary checks in datapath. - No filler CQEs. - Significantly smaller CQ. - SKB data continuously resides in linear part, and not split to small amount (linear part) and large amount (fragment). This saves datapath cycles in driver and improves utilization of SKB fragments in GRO. - The fragments of a resulting GRO SKB follow the IP forwarding assumption of equal-size fragments. implementation details: HW writes the packets to the beginning of a stride, i.e. does not keep headroom. To overcome this we make sure we can extend backwards and use the last bytes of stride i-1. Extra care is needed for stride 0 as it has no preceding stride. We make sure headroom bytes are available by shifting the buffer pointer passed to HW by headroom bytes. This configuration now becomes default, whenever capable. Of course, this implies turning LRO off. Performance testing: ConnectX-5, single core, single RX ring, default MTU. 
UDP packet rate, early drop in TC layer: -------------------------------------------- | pkt size | before | after | ratio | -------------------------------------------- | 1500byte | 4.65 Mpps | 5.96 Mpps | 1.28x | | 500byte | 5.23 Mpps | 5.97 Mpps | 1.14x | | 64byte | 5.94 Mpps | 5.96 Mpps | 1.00x | -------------------------------------------- TCP streams: ~20% gain 3) Support XDP over Striding RQ: Now that linear SKB is supported over Striding RQ, we can support XDP by setting stride size to PAGE_SIZE and headroom to XDP_PACKET_HEADROOM. Striding RQ is capable of a higher packet-rate than conventional RQ. Performance testing: ConnectX-5, 24 rings, default MTU. CQE compression ON (to reduce completions BW in PCI). XDP_DROP packet rate: -------------------------------------------------- | pkt size | XDP rate | 100GbE linerate | pct% | -------------------------------------------------- | 64byte | 126.2 Mpps | 148.0 Mpps | 85% | | 128byte | 80.0 Mpps | 84.8 Mpps | 94% | | 256byte | 42.7 Mpps | 42.7 Mpps | 100% | | 512byte | 23.4 Mpps | 23.4 Mpps | 100% | -------------------------------------------------- 4) Remove mlx5 page_ref bulking in Striding RQ and use page_ref_inc only when needed. Without this bulking, we have: - no atomic ops on WQE allocation or free - one atomic op per SKB - In the default MTU configuration (1500, stride size is 2K), the non-bulking method execute 2 atomic ops as before - For larger MTUs with stride size of 4K, non-bulking method executes only a single op. - For XDP (stride size of 4K, no SKBs), non-bulking have no atomic ops per packet at all. Performance testing: ConnectX-5, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. Single core packet rate (64 bytes). Early drop in TC: no degradation. XDP_DROP: before: 14,270,188 pps after: 20,503,603 pps, 43% improvement. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
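The ratio and percentage columns in the mlx5 performance tables above can be recomputed from the reported rates. The following is a minimal sketch in Python; the variable and function names are illustrative and are not part of the series. It only re-derives the published columns from the published figures and makes no claim about how the measurements were taken.

```python
# Figures copied from the tables in the commit message above (all in Mpps).
# Keys are packet sizes in bytes.
udp_mpps = {1500: (4.65, 5.96), 500: (5.23, 5.97), 64: (5.94, 5.96)}  # (before, after)
xdp_mpps = {64: (126.2, 148.0), 128: (80.0, 84.8),
            256: (42.7, 42.7), 512: (23.4, 23.4)}  # (XDP rate, reported linerate)


def speedup(before, after):
    """Return the after/before ratio used in the 'ratio' column."""
    return after / before


def percent_of(rate, linerate):
    """Return rate as a percentage of linerate, as in the 'pct%' column."""
    return 100.0 * rate / linerate


# Recompute the derived columns from the raw rates.
udp_ratio = {size: round(speedup(b, a), 2) for size, (b, a) in udp_mpps.items()}
xdp_pct = {size: round(percent_of(r, l)) for size, (r, l) in xdp_mpps.items()}

# Single-core XDP_DROP gain from item 4 (packets per second, before and after).
xdp_drop_gain_pct = 100.0 * (20_503_603 / 14_270_188 - 1.0)
```

The recomputed values agree with the table columns, and the XDP_DROP gain lands between 43% and 44%, consistent with the "43% improvement" quoted in the message.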
2018-04-01 | hv_netvsc: Clean up extra parameter from rndis_filter_receive_data() | Haiyang Zhang | 1 | -7/+9

The variables, msg and data, have the same value. This patch removes the extra one.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 | ethernet: hisilicon: hns: hns_dsaf_mac: Use generic eth_broadcast_addr | Joe Perches | 1 | -4/+2

Rather than use an on-stack array to copy a broadcast address, use the generic eth_broadcast_addr function to save a trivial amount of object code.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
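The difference between the two patterns can be modelled outside the kernel. The sketch below is a toy Python model, not the driver code: `copy_from_temp_array` and `eth_broadcast_addr_model` are hypothetical names, and the second one only imitates the in-place fill that a helper like eth_broadcast_addr performs on a 6-octet address.

```python
ETH_ALEN = 6  # length of an Ethernet MAC address in octets


def copy_from_temp_array(dst: bytearray) -> None:
    """Old pattern (modelled): build a temporary broadcast array, then copy it."""
    bc_addr = bytes([0xFF] * ETH_ALEN)  # stand-in for the on-stack array
    dst[:ETH_ALEN] = bc_addr


def eth_broadcast_addr_model(dst: bytearray) -> None:
    """New pattern (modelled): fill the destination in place, as a memset would."""
    for i in range(ETH_ALEN):
        dst[i] = 0xFF


old = bytearray(ETH_ALEN)
new = bytearray(ETH_ALEN)
copy_from_temp_array(old)
eth_broadcast_addr_model(new)
```

Both buffers end up holding ff:ff:ff:ff:ff:ff; the second form simply avoids the temporary array, which is where the small object-code saving mentioned in the message would come from.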
2018-04-01 | net: Do not take net_rwsem in __rtnl_link_unregister() | Kirill Tkhai | 2 | -0/+4

This function calls call_netdevice_notifier(), which may also take net_rwsem, so we can't use net_rwsem here.

This patch makes callers of this function take pernet_ops_rwsem, as register_netdevice_notifier() does. This protects modifications of net_namespace_list and allows notifiers to take net_rwsem (they won't have to care about context).

Since __rtnl_link_unregister() is used on module load and unload (which are infrequent operations), this looks better to me than making every call_netdevice_notifier() always execute in a "protected net_namespace_list" context.

This also fixes the problem we dealt with in 328fbe747ad4 "Close race between {un, }register_netdevice_notifier and ...", and guarantees that __rtnl_link_unregister() does not skip an exiting net.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-01 | net: hns3: remove unnecessary pci_set_drvdata() and devm_kfree() | Wei Yongjun | 1 | -4/+0

There is no need for explicit calls to devm_kfree(), as the allocated memory will be freed when the driver detaches.

The driver core clears the driver data to NULL after device_release, so there is no need to clear the device driver data manually.

Remove the unnecessary pci_set_drvdata() and devm_kfree() calls.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
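The reasoning behind the hns3 cleanup can be illustrated with a toy model of device-managed resources. This is a hedged Python sketch, not kernel code: `ToyDevice`, `managed_alloc`, and `detach` are hypothetical names, and the release order (newest first) is a property of this model chosen for illustration.

```python
class ToyDevice:
    """Toy stand-in for a device that owns managed (devm-style) allocations."""

    def __init__(self):
        self._managed = []   # resources whose lifetime is tied to the device
        self.drvdata = None  # stand-in for the driver data pointer
        self.freed = []      # record of released resources, for inspection

    def managed_alloc(self, name):
        """Register a resource that the device releases on detach."""
        self._managed.append(name)
        return name

    def detach(self):
        """Model driver detach: release managed resources, then clear driver data."""
        while self._managed:
            self.freed.append(self._managed.pop())  # newest first in this model
        self.drvdata = None


dev = ToyDevice()
dev.drvdata = dev.managed_alloc("hdev")
dev.managed_alloc("ring buffers")
dev.detach()  # no explicit free or drvdata reset needed beforehand
```

In this model, an explicit free or an explicit reset of `drvdata` just before `detach()` would be redundant, which is the argument the commit message makes for dropping the devm_kfree() and pci_set_drvdata() calls.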