kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2022-04-13	ice: synchronize_rcu() when terminating rings	Maciej Fijalkowski	1	-1/+1
	[ Upstream commit f9124c68f05ffdb87a47e3ea6d5fae9dad7cb6eb ] Unfortunately, the ice driver doesn't respect the RCU critical section that XSK wakeup is surrounded with. To fix this, add synchronize_rcu() calls to paths that destroy resources that might be in use. This was addressed in other AF_XDP ZC enabled drivers, for reference see for example commit b3873a5be757 ("net/i40e: Fix concurrency issues between config flow and XSK") Fixes: efc2214b6047 ("ice: Add support for XDP") Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-14	ice: Use port number instead of PF ID for WoL	Anirudh Venkataramanan	1	-1/+1
	commit 3176551979b92b02756979c0f1e2d03d1fc82b1e upstream. As per the spec, the WoL control word read from the NVM should be interpreted as port numbers, and not PF numbers. So when checking if WoL supported, use the port number instead of the PF ID. Also, ice_is_wol_supported doesn't really need a pointer to the pf struct, but just needs a pointer to the hw instance. Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-14	ice: remove DCBNL_DEVRESET bit from PF state	Dave Ertman	1	-1/+0
	commit 741b7b743bbcb5a3848e4e55982064214f900d2f upstream. The original purpose of the ICE_DCBNL_DEVRESET was to protect the driver during DCBNL device resets. But, the flow for DCBNL device resets now consists of only calls up the stack such as dev_close() and dev_open() that will result in NDO calls to the driver. These will be handled with state changes from the stack. Also, there is a problem of the dev_close and dev_open being blocked by checks for reset in progress also using the ICE_DCBNL_DEVRESET bit. Since the ICE_DCBNL_DEVRESET bit is not necessary for protecting the driver from DCBNL device resets and it is actually blocking changes coming from the DCBNL interface, remove the bit from the PF state and don't block driver function based on DCBNL reset in progress. Fixes: b94b013eb626 ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-14	ice: prevent ice_open and ice_stop during reset	Krzysztof Goreczny	1	-0/+1
	commit e95fc8573e07c5e4825df4650fd8b8c93fad27a7 upstream. There is a possibility of race between ice_open or ice_stop calls performed by OS and reset handling routine both trying to modify VSI resources. Observed scenarios: - reset handler deallocates memory in ice_vsi_free_arrays and ice_open tries to access it in ice_vsi_cfg_txq leading to driver crash - reset handler deallocates memory in ice_vsi_free_arrays and ice_close tries to access it in ice_down leading to driver crash - reset handler clears port scheduler topology and sets port state to ICE_SCHED_PORT_STATE_INIT leading to ice_ena_vsi_txq fail in ice_open To prevent this additional checks in ice_open and ice_stop are introduced to make sure that OS is not allowed to alter VSI config while reset is in progress. Fixes: cdedef59deb0 ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04	ice: Fix state bits on LLDP mode switch	Dave Ertman	1	-2/+0
	[ Upstream commit 0d4907f65dc8fc5e897ad19956fca1acb3b33bc8 ] DCBX_CAP bits were not being adjusted when switching between SW and FW controlled LLDP. Adjust bits to correctly indicate which mode the LLDP logic is in. Fixes: b94b013eb626 ("ice: Implement DCBNL support") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-04	ice: Fix MSI-X vector fallback logic	Brett Creeley	1	-1/+3
	[ Upstream commit f3fe97f64384fa4073d9dc0278c4b351c92e295c ] The current MSI-X enablement logic tries to enable best-case MSI-X vectors and if that fails we only support a bare-minimum set. This includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X for the OICR interrupt. Unfortunately, the driver fails to load when we don't get as many MSI-X as requested for a couple reasons. First, the code to allocate MSI-X in the driver tries to allocate num_online_cpus() MSI-X for LAN traffic without caring about the number of MSI-X actually enabled/requested from the kernel for LAN traffic. So, when calling ice_get_res() for the PF VSI, it returns failure because the number of available vectors is less than requested. Fix this by not allowing the PF VSI to allocation more than pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues. Limiting the number of queues is done because we don't want more than 1 Tx/Rx queue per interrupt due to performance conerns. Second, the driver assigns pf->num_lan_msix = 2, to account for LAN traffic and the OICR. However, pf->num_lan_msix is only meant for LAN MSI-X. This is causing a failure when the PF VSI tries to allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has already been reserved, so there may not be enough MSI-X vectors left. Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for the failure case. Update the related defines used in ice_ena_msix_range() to align with the above behavior and remove the unused RDMA defines because RDMA is currently not supported. Also, remove the now incorrect comment. Fixes: 152b978a1f90 ("ice: Rework ice_ena_msix_range") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-09	ice: refactor devlink_port to be per-VSI	Jacob Keller	1	-3/+4
	Currently, the devlink_port structure is stored within the ice_pf. This made sense because we create a single devlink_port for each PF. This setup does not mesh with the abstractions in the driver very well, and led to a flow where we accidentally call devlink_port_unregister twice during error cleanup. In particular, if devlink_port_register or devlink_port_unregister are called twice, this leads to a kernel panic. This appears to occur during some possible flows while cleaning up from a failure during driver probe. If register_netdev fails, then we will call devlink_port_unregister in ice_cfg_netdev as it cleans up. Later, we again call devlink_port_unregister since we assume that we must cleanup the port that is associated with the PF structure. This occurs because we cleanup the devlink_port for the main PF even though it was not allocated. We allocated the port within a per-VSI function for managing the main netdev, but did not release the port when cleaning up that VSI, the allocation and destruction are not aligned. Instead of attempting to manage the devlink_port as part of the PF structure, manage it as part of the PF VSI. Doing this has advantages, as we can match the de-allocation of the devlink_port with the unregister_netdev associated with the main PF VSI. Moving the port to the VSI is preferable as it paves the way for handling devlink ports allocated for other purposes such as SR-IOV VFs. Since we're changing up how we allocate the devlink_port, also change the indexing. Originally, we indexed the port using the PF id number. This came from an old goal of sharing a devlink for each physical function. Managing devlink instances across multiple function drivers is not workable. Instead, lets set the port number to the logical port number returned by firmware and set the index using the VSI index (sometimes referred to as VSI handle). Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-09-26	intel-ethernet: clean up W=1 warnings in kdoc	Jesse Brandeburg	1	-1/+1
	This takes care of all of the trivial W=1 fixes in the Intel Ethernet drivers, which allows developers and maintainers to build more of the networking tree with more complete warning checks. There are three classes of kdoc warnings fixed: - cannot understand function prototype: 'x' - Excess function parameter 'x' description in 'y' - Function parameter or member 'x' not described in 'y' All of the changes were trivial comment updates on function headers. Inspired by Lee Jones' series of wireless work to do the same. Compile tested only, and passes simple test of $ git ls-files *.[ch] \| egrep drivers/net/ethernet/intel \| \ xargs scripts/kernel-doc -none Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-31	xsk: i40e: ice: ixgbe: mlx5: Pass buffer pool to driver instead of umem	Magnus Karlsson	1	-9/+9
	Replace the explicit umem reference passed to the driver in AF_XDP zero-copy mode with the buffer pool instead. This in preparation for extending the functionality of the zero-copy mode so that umems can be shared between queues on the same netdev and also between netdevs. In this commit, only an umem reference has been added to the buffer pool struct. But later commits will add other entities to it. These are going to be entities that are different between different queue ids and netdevs even though the umem is shared between them. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com
2020-08-01	ice: add useful statistics	Jesse Brandeburg	1	-0/+1
	Display and count some useful hot-path statistics. The usefulness is as follows: - tx_restart: use to determine if the transmit ring size is too small or if the transmit interrupt rate is too low. - rx_gro_dropped: use to count drops from GRO layer, which previously were completely uncounted when occurring. - tx_busy: use to determine when the driver is miscounting number of descriptors needed for an skb. - tx_timeout: as our other drivers, count the number of times we've reset due to timeout because the kernel only prints a warning once per netdev. Several of these were already counted but not displayed. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-29	ice: implement device flash update via devlink	Jacob Keller	1	-0/+9
	Use the newly added pldmfw library to implement device flash update for the Intel ice networking device driver. This support uses the devlink flash update interface. The main parts of the flash include the Option ROM, the netlist module, and the main NVM data. The PLDM firmware file contains modules for each of these components. Using the pldmfw library, the provided firmware file will be scanned for the three major components, "fw.undi" for the Option ROM, "fw.mgmt" for the main NVM module containing the primary device firmware, and "fw.netlist" containing the netlist module. The flash is separated into two banks, the active bank containing the running firmware, and the inactive bank which we use for update. Each module is updated in a staged process. First, the inactive bank is erased, preparing the device for update. Second, the contents of the component are copied to the inactive portion of the flash. After all components are updated, the driver signals the device to switch the active bank during the next EMP reset (which would usually occur during the next reboot). Although the firmware AdminQ interface does report an immediate status for each command, the NVM erase and NVM write commands receive status asynchronously. The driver must not continue writing until previous erase and write commands have finished. The real status of the NVM commands is returned over the receive AdminQ. Implement a simple interface that uses a wait queue so that the main update thread can sleep until the completion status is reported by firmware. For erasing the inactive banks, this can take quite a while in practice. To help visualize the process to the devlink application and other applications based on the devlink netlink interface, status is reported via the devlink_flash_update_status_notify. While we do report status after each 4k block when writing, there is no real status we can report during erasing. We simply must wait for the complete module erasure to finish. With this implementation, basic flash update for the ice hardware is supported. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24	ice: support Total Port Shutdown on devices that support it	Bruce Allan	1	-0/+1
	When the Port Disable bit is set in the Link Default Override Mask TLV PFA module in the NVM, Total Port Shutdown mode is supported and enabled. In this mode, the driver should act as if the link-down-on-close ethtool private flag is always enabled and dis-allow any change to that flag. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-24	ice: add link lenient and default override support	Paul Greenwalt	1	-0/+3
	Adds functions to check for link override firmware support and get the override settings for a port. The previously supported/default link mode was strict mode. In strict mode link is configured based on get PHY capabilities PHY types with media. Lenient mode is now the default link mode. In lenient mode the link is configured based on get PHY capabilities PHY types without media. This allows the user to configure link that the media does not report. Limit the minimum supported link mode to 25G for devices that support 100G, and 1G for devices that support less than 100G. Default override is only supported in lenient mode. If default override is supported and enabled, then default override values are used for configuring speed and FEC. Default override provide persistent link settings in the NVM. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Evan Swanson <evan.swanson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-24	ice: restore PHY settings on media insertion	Paul Greenwalt	1	-0/+4
	After the transition from no media to media FW will clear the set-phy-cfg data set by the user. Save initial PHY settings and any settings later requested by the user and use that data to restore PHY settings on media insertion. Since PHY configuration is now being stored, replace calls that were calling FW to get the configuration with the saved copy. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-24	ice: Add advanced power mgmt for WoL	Akeem G Abodunrin	1	-0/+3
	Add callbacks needed to support advanced power management for Wake on LAN. Also make ice_pf_state_is_nominal function available for all configurations not just CONFIG_PCI_IOV. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-02	ice: implement snapshot for device capabilities	Jacob Keller	1	-0/+1
	Add a new devlink region used for capturing a snapshot of the device capabilities buffer which is reported by the firmware over the AdminQ. This information can useful in debugging driver and firmware interactions. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-06-26	net/intel: remove driver versions from Intel drivers	Jeff Kirsher	1	-1/+0
	As with other networking drivers, remove the unnecessary driver version from the Intel drivers. The ethtool driver information and module version will then report the kernel version instead. For ixgbe, i40e and ice drivers, the driver passes the driver version to the firmware to confirm that we are up and running. So we now pass the value of UTS_RELEASE to the firmware. This adminq call is required per the HAS document. The Device then sends an indication to the BMC that the PF driver is present. This is done using Host NC Driver Status Indication in NC-SI Get Link command or via the Host Network Controller Driver Status Change AEN. What the BMC may do with this information is implementation-dependent, but this is a standard NC-SI 1.1 command we honor per the HAS. CC: Bruce Allan <bruce.w.allan@intel.com> CC: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Alek Loktionov <aleksandr.loktionov@intel.com> CC: Kevin Liedtke <kevin.d.liedtke@intel.com> CC: Aaron Rowden <aaron.f.rowden@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com>
2020-05-23	ice: Implement aRFS	Brett Creeley	1	-0/+14
	Enable accelerated Receive Flow Steering (aRFS). It is used to steer Rx flows to a specific queue. This functionality is triggered by the network stack through ndo_rx_flow_steer and requires Flow Director (ntuple on) to function. The fltr_info is used to add/remove/update flow rules in the HW, the fltr_state is used to determine what to do with the filter with respect to HW and/or SW, and the flow_id is used in co-ordination with the network stack. The work for aRFS is split into two paths: the ndo_rx_flow_steer operation and the ice_service_task. The former is where the kernel hands us an Rx SKB among other items to setup aRFS and the latter is where the driver adds/updates/removes filter rules from HW and updates filter state. In the Rx path the following things can happen: 1. New aRFS entries are added to the hash table and the state is set to ICE_ARFS_INACTIVE so the filter can be updated in HW by the ice_service_task path. 2. aRFS entries have their Rx Queue updated if we receive a pre-existing flow_id and the filter state is ICE_ARFS_ACTIVE. The state is set to ICE_ARFS_INACTIVE so the filter can be updated in HW by the ice_service_task path. 3. aRFS entries marked as ICE_ARFS_TODEL are deleted In the ice_service_task path the following things can happen: 1. New aRFS entries marked as ICE_ARFS_INACTIVE are added or updated in HW. and their state is updated to ICE_ARFS_ACTIVE. 2. aRFS entries are deleted from HW and their state is updated to ICE_ARFS_TODEL. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-23	ice: Restore filters following reset	Henry Tieman	1	-0/+2
	Following a reset, Flow Director filters are cleared from the hardware. Rebuild the filters using the software structures containing the filter rules. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-23	ice: Support IPv4 Flow Director filters	Henry Tieman	1	-0/+4
	Support the addition and deletion of IPv4 filters. Supported fields are: src-ip, dst-ip, src-port, and dst-port Supported flow-types are: tcp4, udp4, sctp4, ip4 Example usage: ethtool -N eth0 flow-type tcp4 src-ip 192.168.0.55 dst-ip 172.16.0.55 \ src-port 16 dst-port 12 action 32 Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-23	ice: Support displaying ntuple rules	Henry Tieman	1	-0/+9
	Add functionality for ethtool --show-ntuple, allowing for filters to be displayed when set functionality is added. Add statistics related to Flow Director matches and status. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-23	ice: Initialize Flow Director resources	Henry Tieman	1	-0/+24
	Flow Director allows for redirection based on ntuple rules. Rules are programmed using the ethtool set-ntuple interface. Supported actions are redirect to queue and drop. Setup the initial framework to process Flow Director filters. Create and allocate resources to manage and program filters to the hardware. Filters are processed via a sideband interface; a control VSI is created to manage communication and process requests through the sideband. Upon allocation of resources, update the hardware tables to accept perfect filters. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22	ice: cleanup vf_id signedness	Jesse Brandeburg	1	-1/+1
	The vf_id variable is dealt with in the code in inconsistent ways of sign usage, preventing compilation with -Werror=sign-compare. Fix this problem in the code by always treating vf_id as unsigned, since there are no valid values of vf_id that are negative. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22	ice: Fix casting issues	Karol Kolacinski	1	-5/+5
	Change min() macros to min_t() which has compare type specified and it helps avoid precision loss. In some cases there was precision loss during calls or assignments. Some fields in structs were unnecessarily large and gave multiple warnings. There were also some minor type differences which are now fixed as well as some cases where a simple cast was needed. Callers were were passing data that is a u16 to ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8. Fix that by changing the function to take a u16. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22	ice: Provide more meaningful error message	Lihong Yang	1	-0/+2
	When printing the ice status or AQ error codes, instead of printing out the numerical value, provide the description of the error code. This provides more info about the issue than a number. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22	ice: Add VF promiscuous support	Brett Creeley	1	-0/+1
	Implement promiscuous support for VF VSIs. Behaviour of promiscuous support is based on VF trust as well as the, introduced, vf-true-promisc flag. A trusted VF with vf-true-promisc disabled will be the default VSI, which means that all traffic without a matching destination MAC address in the device's internal switch will be forwarded to this VF VSI. A trusted VF with vf-true-promisc enabled will go into "true promiscuous mode". This amounts to the VF receiving all ingress and egress traffic that hits the device's internal switch. An untrusted VF will only receive traffic destined for that VF. The vf-true-promisc-support flag cannot be toggled while any VF is in promiscuous mode. This flag should be set prior to loading the iavf driver or spawning VF(s). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22	ice: Add support for tunnel offloads	Tony Nguyen	1	-0/+4
	Create a boost TCAM entry for each tunnel port in order to get a tunnel PTYPE. Update netdev feature flags and implement the appropriate logic to get and set values for hardware offloads. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-03-27	ice: add a devlink region for dumping NVM contents	Jacob Keller	1	-0/+2
	Add a devlink region for exposing the device's Non Volatime Memory flash contents. Support the recently added .snapshot operation, enabling userspace to request a snapshot of the NVM contents via DEVLINK_CMD_REGION_NEW. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-21	ice: enable initial devlink support	Jacob Keller	1	-0/+4
	Begin implementing support for the devlink interface with the ice driver. The pf structure is currently memory managed through devres, via a devm_alloc. To mimic this behavior, after allocating the devlink pointer, use devm_add_action to add a teardown action for releasing the devlink memory on exit. The ice hardware is a multi-function PCIe device. Thus, each physical function will get its own devlink instance. This means that each function will be treated independently, with its own parameters and configuration. This is done because the ice driver loads a separate instance for each function. Due to this, the implementation does not enable devlink to manage device-wide resources or configuration, as each physical function will be treated independently. This is done for simplicity, as managing a devlink instance across multiple driver instances would significantly increase the complexity for minimal gain. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-03-10	ice: Increase mailbox receive queue length to maximum	Lukasz Czapnik	1	-1/+0
	Currently the PF's mailbox receive queue is only 512 entries. This fine, but considering that all VF's mailbox send queues funnel into the PF's single mailbox receive queue, let's increase it to the maximum size. This will help prevent any possible bottleneck/slowdown occurring from the PF's mailbox receive queue being full. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-03-10	ice: Fix removing driver while bare-metal VFs pass traffic	Brett Creeley	1	-0/+1
	Currently, if there are bare-metal VFs passing traffic and the ice driver is removed, there is a possibility of VFs triggering a Tx timeout right before iavf_remove(). This is causing iavf_close() to not be called because there is a check in the beginning of iavf_remove() that bails out early if (adapter->state < IAVF_DOWN_PENDING). This makes it so some resources do not get cleaned up. Specifically, free_irq() is never called for data interrupts, which results in the following line of code to trigger: pci_disable_msix() free_msi_irqs() ... BUG_ON(irq_has_action(entry->irq + i)); ... To prevent the Tx timeout from occurring on the VF during driver unload for ice and the iavf there are a few changes that are needed. [1] Don't disable all active VF Tx/Rx queues prior to calling pci_disable_sriov. [2] Call ice_free_vfs() before disabling the service task. [3] Disable VF resets when the ice driver is being unloaded by setting the pf->state flag __ICE_VF_RESETS_DISABLED. Changing [1] and [2] allow each VF driver's remove flow to successfully send VIRTCHNL requests, which includes queue disable. This prevents unexpected Tx timeouts because the PF driver is no longer forcefully disabling queues. Due to [1] and [2] there is a possibility that the PF driver will get a VFLR or reset request over VIRTCHNL from a VF during PF driver unload. Prevent that by doing [3]. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-03-10	ice: Improve clarity of prints and variables	Brett Creeley	1	-2/+2
	Currently when the device runs out of MSI-X interrupts a cryptic and unhelpful message is printed. This will cause confusion when hitting this case. Fix this by clearing up the error message for both SR-IOV and non SR-IOV use cases. Also, make a few minor changes to increase clarity of variables. 1. Change per VF MSI-X and queue pair variables in the PF structure. 2. Use ICE_NONQ_VECS_VF when determining pf->num_msix_per_vf instead of the magic number "1". This vector is reserved for the OICR. All of the resource tracking functions were moved to avoid adding any forward declaration function prototypes. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-03-10	ice: allow bigger VFs	Mitch Williams	1	-1/+0
	Unlike the XL710 series, 800-series hardware can allocate more than 4 MSI-X vectors per VF. This patch enables that functionality. We dynamically allocate vectors and queues depending on how many VFs are enabled. Allocating the maximum number of VFs replicates XL710 behavior with 4 queues and 4 vectors. But allocating a smaller number of VFs will give you 16 queues and 16 vectors. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19	ice: update malicious driver detection event handling	Paul Greenwalt	1	-0/+4
	Update the PF VFs MDD event message to rate limit once per second and report the total number Rx\|Tx event count. Add support to print pending MDD events that occur during the rate limit. The use of net_ratelimit did not allow for per VF Rx\|Tx granularity. Additional PF MDD log messages are guarded by netif_msg_[rx\|tx]_err(). Since VF RX MDD events disable the queue, add ethtool private flag mdd-auto-reset-vf to configure VF reset to re-enable the queue. Disable anti-spoof detection interrupt to prevent spurious events during a function reset. To avoid race condition do not make PF MDD register reads conditional on global MDD result. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-04	ice: Add a boundary check in ice_xsk_umem()	Krzysztof Kazimierczak	1	-2/+3
	In ice_xsk_umem(), variable qid which is later used as an array index, is not validated for a possible boundary exceedance. Because of that, a calling function might receive an invalid address, which causes general protection fault when dereferenced. To address this, add a boundary check to see if qid is greater than the size of a UMEM array. Also, don't let user change vsi->num_xsk_umems just by trying to setup a second UMEM if its value is already set up (i.e. UMEM region has already been allocated for this VSI). While at it, make sure that ring->zca.free pointer is always zeroed out if there is no UMEM on a specified ring. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-04	ice: Add code to keep track of current dflt_vsi	Brett Creeley	1	-0/+2
	We can't have more than one default VSI so prevent another VSI from overwriting the current dflt_vsi. This was achieved by adding the following functions: ice_is_dflt_vsi_in_use() - Used to check if the default VSI is already being used. ice_is_vsi_dflt_vsi() - Used to check if VSI passed in is in fact the default VSI. ice_set_dflt_vsi() - Used to set the default VSI via a switch rule ice_clear_dflt_vsi() - Used to clear the default VSI via a switch rule. Also, there was no need to introduce any locking because all mailbox events and synchronization of switch filters for the PF happen in the service task. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-04	ice: Fix VF spoofchk	Brett Creeley	1	-0/+1
	There are many things wrong with the function ice_set_vf_spoofchk(). 1. The VSI being modified is the PF VSI, not the VF VSI. 2. We are enabling Rx VLAN pruning instead of Tx VLAN anti-spoof. 3. The spoofchk setting for each VF is not initialized correctly or re-initialized correctly on reset. To fix [1] we need to make sure we are modifying the VF VSI. This is done by using the vf->lan_vsi_idx to index into the PF's VSI array. To fix [2] replace setting Rx VLAN pruning in ice_set_vf_spoofchk() with setting Tx VLAN anti-spoof. To Fix [3] we need to make sure the initial VSI settings match what is done in ice_set_vf_spoofchk() for spoofchk=on. Also make sure this also works for VF reset. This was done by modifying ice_vsi_init() to account for the current spoofchk state of the VF VSI. Because of these changes, Tx VLAN anti-spoof needs to be removed from ice_cfg_vlan_pruning(). This is okay for the VF because this is now controlled from the admin enabling/disabling spoofchk. For the PF, Tx VLAN anti-spoof should not be set. This change requires us to call ice_set_vf_spoofchk() when configuring promiscuous mode for the VF which requires ice_set_vf_spoofchk() to move in order to prevent a forward declaration prototype. Also, add VLAN 0 by default when allocating a VF since the PF is unaware if the guest OS is running the 8021q module. Without this, MDD events will trigger on untagged traffic because spoofcheck is enabled by default. Due to this change, ignore add/delete messages for VLAN 0 from VIRTCHNL since this is added/deleted during VF initialization/teardown respectively and should not be modified. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-23	ice: Implement ethtool ops for channels	Henry Tieman	1	-0/+4
	Add code to query and set the number of channels on the primary VSI for a PF. This is accessed from the 'ethtool -l' and 'ethtool -L' commands, respectively. Though the ice driver supports asymmetric queues report an IRQ vector that has both Rx and Tx queues attached and is counted as a 'combined' channel. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-23	ice: Add ice_pf_to_dev(pf) macro	Brett Creeley	1	-0/+2
	We use &pf->dev->pdev all over the code. Add a simple macro to do this for us. When multiple de-references like this are being done add a local struct device variable. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-08	ice: Implement DCBNL support	Dave Ertman	1	-0/+2
	Implement interface layer for the DCBNL subsystem. These are the functions to support the callbacks defined in the dcbnl_rtnl_ops struct. These callbacks are going to be used to interface with the DCB settings of the device. Implementation of dcb_nl set functions and supporting SW DCB functions. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-08	ice: Use ice_ena_vsi and ice_dis_vsi in DCB configuration flow	Anirudh Venkataramanan	1	-4/+0
	DCB configuration flow needs to disable and enable only the PF (main) VSI, so use ice_ena_vsi and ice_dis_vsi. To avoid the use of ifdef to control the staticness of these functions, move them to ice_lib.c. Also replace the allocate and copy of old_cfg to kmemdup() in ice_pf_dcb_cfg(). Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-05	ice: introduce legacy Rx flag	Maciej Fijalkowski	1	-0/+1
	Add an ethtool "legacy-rx" priv flag for toggling the Rx path. This control knob will be mainly used for build_skb usage as well as buffer size/MTU manipulation. In preparation for adding build_skb support in a way that it takes care of how we set the values of max_frame and rx_buf_len fields of struct ice_vsi. Specifically, in this patch mentioned fields are set to values that will allow us to provide headroom and tailroom in-place. This can be mostly broken down onto following: - for legacy-rx "on" ethtool control knob, old behaviour is kept; - for standard 1500 MTU size configure the buffer of size 1536, as network stack is expecting the NET_SKB_PAD to be provided and NET_IP_ALIGN can have a non-zero value (these can be typically equal to 32 and 2, respectively); - for larger MTUs go with max_frame set to 9k and configure the 3k buffer in case when PAGE_SIZE of underlying arch is less than 8k; 3k buffer is implying the need for order 1 page, so that our page recycling scheme can still be applied; With that said, substitute the hardcoded ICE_RXBUF_2048 and PAGE_SIZE values in DMA API that we're making use of with rx_ring->rx_buf_len and ice_rx_pg_size(rx_ring). The latter is an introduced helper for determining the page size based on its order (which was figured out via ice_rx_pg_order). Last but not least, take care of truesize calculation. In the followup patch the headroom/tailroom computation logic will be introduced. This change aligns the buffer and frame configuration with other Intel drivers, most importantly with iavf. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04	ice: Add support for AF_XDP	Krzysztof Kazimierczak	1	-0/+26
	Add zero copy AF_XDP support. This patch adds zero copy support for Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c. For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse existing XDP Tx queues for this task, since XDP_REDIRECT guarantees mutual exclusion between different NAPI contexts based on CPU ID. In turn, a netdev can XDP_REDIRECT to another netdev with a different NAPI context, since the operation is bound to a specific core and each core has its own hardware ring. For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues that AF_XDP is enabled. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04	ice: Add support for XDP	Maciej Fijalkowski	1	-2/+22
	Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit. Upon load of an XDP program, allocate additional Tx rings for dedicated XDP use. The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT, XDP_PASS, and XDP_ABORTED. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04	ice: Introduce ice_base.c	Anirudh Venkataramanan	1	-0/+8
	Remove a few uses of kernel configuration flags from ice_lib.c by introducing a new source file ice_base.c. Also move corresponding function prototypes from ice_lib.h to ice_base.h and include ice_base.h where required. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-12	ice: Enable DDP package download	Tony Nguyen	1	-0/+14
	Attempt to request an optional device-specific DDP package file (one with the PCIe Device Serial Number in its name so that different DDP package files can be used on different devices). If the optional package file exists, download it to the device. If not, download the default package file. Log an appropriate message based on whether or not a DDP package file exists and the return code from the attempt to download it to the device. If the download fails and there is not already a package file on the device, go into "Safe Mode" where some features are not supported. Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-12	ice: Fix FW version formatting in dmesg	Lukasz Czapnik	1	-1/+0
	The FW build id is currently being displayed as an int which doesn't make sense. Instead display FW build id as a hex value. Also add other useful information to the output such as NVM version, API patch info, and FW build hash. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-12	ice: send driver version to firmware	Paul M Stillwell Jr	1	-0/+1
	The driver is required to send a version to the firmware to indicate that the driver is up. If the driver doesn't do this the firmware doesn't behave properly. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05	ice: change default number of receive descriptors	Jesse Brandeburg	1	-17/+2
	The driver should start out with a reasonable number of descriptors that can prevent drops due to a CPU being in a power management state. Change the default number of descriptors to 2048. The user can always change the value at runtime. Transmit descriptor counts are not modified because they don't need to change due to the speed of the interface, or for power managed CPUs, but the code is simplified to a fixed value for the transmit default. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05	ice: Minor refactor in queue management	Anirudh Venkataramanan	1	-2/+2
	Remove q_left_tx and q_left_rx from the PF struct as these can be obtained by calling ice_get_avail_txq_count and ice_get_avail_rxq_count respectively. The function ice_determine_q_usage is only setting num_lan_tx and num_lan_rx in the PF structure, and these are later assigned to vsi->alloc_txq and vsi->alloc_rxq respectively. This is an unnecessary indirection, so remove ice_determine_q_usage and just assign values for vsi->alloc_txq and vsi->alloc_rxq in ice_vsi_set_num_qs and use these to set num_lan_tx and num_lan_rx respectively. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>