kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2023-12-14	net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops	Ahmed Zaki	9	-153/+153
	The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added. This is part 1/2 of the fix, as suggested in [1]: - First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present. - Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack. Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	idpf: add get/set for Ethtool's header split ringparam	Michal Kubiak	5	-7/+90
	idpf supports the header split feature and that feature is always enabled by default. However, for flexibility reasons and to simplify some scenarios, it would be useful to have the support for switching the header split off (and on) from the userspace. Address that need by adding the user config parameter, the functions for disabling (or enabling) the header split feature, and calls to them from the Ethtool ringparam callbacks. It still is enabled by default if supported by the hardware. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231212142752.935000-3-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-14	ice: use VLAN proto from ring packet context in skb path	Larysa Zaremba	2	-10/+6
	VLAN proto, used in ice XDP hints implementation is stored in ring packet context. Utilize this value in skb VLAN processing too instead of checking netdev features. At the same time, use vlan_tci instead of vlan_tag in touched code, because VLAN tag often refers to VLAN proto and VLAN TCI combined, while in the code we clearly store only VLAN TCI. Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-12-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Implement VLAN tag hint	Larysa Zaremba	6	-9/+59
	Implement .xmo_rx_vlan_tag callback to allow XDP code to read packet's VLAN tag. At the same time, use vlan_tci instead of vlan_tag in touched code, because VLAN tag often refers to VLAN proto and VLAN TCI combined, while in the code we clearly store only VLAN TCI. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-11-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Support XDP hints in AF_XDP ZC mode	Larysa Zaremba	2	-0/+19
	In AF_XDP ZC, xdp_buff is not stored on ring, instead it is provided by xsk_buff_pool. Space for metadata sources right after such buffers was already reserved in commit 94ecc5ca4dbf ("xsk: Add cb area to struct xdp_buff_xsk"). Some things (such as pointer to packet context) do not change on a per-packet basis, so they can be set at the same time as RX queue info. On the other hand, RX descriptor is unique for each packet, but is already known when setting DMA addresses. This minimizes performance impact of hints on regular packet processing. Update AF_XDP ZC packet processing to support XDP hints. Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-9-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Support RX hash XDP hint	Larysa Zaremba	2	-204/+281
	RX hash XDP hint requests both hash value and type. Type is XDP-specific, so we need a separate way to map these values to the hardware ptypes, so create a lookup table. Instead of creating a new long list, reuse contents of ice_decode_rx_desc_ptype[] through preprocessor. Current hash type enum does not contain ICMP packet type, but ice devices support it, so also add a new type into core code. Then use previously refactored code and create a function that allows XDP code to read RX hash. Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-7-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Support HW timestamp hint	Larysa Zaremba	7	-7/+42
	Use previously refactored code and create a function that allows XDP code to read HW timestamp. Also, introduce packet context, where hints-related data will be stored. ice_xdp_buff contains only a pointer to this structure, to avoid copying it in ZC mode later in the series. HW timestamp is the first supported hint in the driver, so also add xdp_metadata_ops. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-6-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Introduce ice_xdp_buff	Larysa Zaremba	3	-5/+30
	In order to use XDP hints via kfuncs we need to put RX descriptor and miscellaneous data next to xdp_buff. Same as in hints implementations in other drivers, we achieve this through putting xdp_buff into a child structure. Currently, xdp_buff is stored in the ring structure, so replace it with union that includes child structure. This way enough memory is available while existing XDP code remains isolated from hints. Minimum size of the new child structure (ice_xdp_buff) is exactly 64 bytes (single cache line). To place it at the start of a cache line, move 'next' field from CL1 to CL4, as it isn't used often. This still leaves 192 bits available in CL3 for packet context extensions. Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-5-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: Make ptype internal to descriptor info processing	Larysa Zaremba	4	-13/+16
	Currently, rx_ptype variable is used only as an argument to ice_process_skb_fields() and is computed just before the function call. Therefore, there is no reason to pass this value as an argument. Instead, remove this argument and compute the value directly inside ice_process_skb_fields() function. Also, separate its calculation into a short function, so the code can later be reused in .xmo_() callbacks. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-4-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: make RX HW timestamp reading code more reusable	Larysa Zaremba	3	-20/+36
	Previously, we only needed RX HW timestamp in skb path, hence all related code was written with skb in mind. But with the addition of XDP hints via kfuncs to the ice driver, the same logic will be needed in .xmo_() callbacks. Put generic process of reading RX HW timestamp from a descriptor into a separate function. Move skb-related code into another source file. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-3-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-14	ice: make RX hash reading code more reusable	Larysa Zaremba	1	-11/+25
	Previously, we only needed RX hash in skb path, hence all related code was written with skb in mind. But with the addition of XDP hints via kfuncs to the ice driver, the same logic will be needed in .xmo_() callbacks. Separate generic process of reading RX hash from a descriptor into a separate function. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20231205210847.28460-2-larysa.zaremba@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-12-13	i40e: Fix ST code value for Clause 45	Ivan Vecera	2	-3/+3
	ST code value for clause 45 that has been changed by commit 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros") is currently wrong. The mentioned commit refactored ..MDIO_CLAUSE??_STCODE_MASK so their value is the same for both clauses. The value is correct for clause 22 but not for clause 45. Fix the issue by adding a parameter to I40E_GLGEN_MSCA_STCODE_MASK macro that specifies required value. Fixes: 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-13	ice: fix theoretical out-of-bounds access in ethtool link modes	Michal Schmidt	1	-2/+2
	To map phy types reported by the hardware to ethtool link mode bits, ice uses two lookup tables (phy_type_low_lkup, phy_type_high_lkup). The "low" table has 64 elements to cover every possible bit the hardware may report, but the "high" table has only 13. If the hardware reports a higher bit in phy_types_high, the driver would access memory beyond the lookup table's end. Instead of iterating through all 64 bits of phy_types_{low,high}, use the sizes of the respective lookup tables. Fixes: 9136e1f1e5c3 ("ice: refactor PHY type to ethtool link mode") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-13	dpll: remove leftover mode_supported() op and use mode_get() instead	Jiri Pirko	1	-26/+0
	Mode supported is currently reported to the user exactly the same, as the current mode. That's because mode changing is not implemented. Remove the leftover mode_supported() op and use mode_get() to fill up the supported mode exposed to user. One, if even, mode changing is going to be introduced, this could be very easily taken back. In the meantime, prevent drivers form implementing this in wrong way (as for example recent netdevsim implementation attempt intended to do). Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-12	e1000e: Use pcie_capability_read_word() for reading LNKSTA	Ilpo Järvinen	2	-8/+4
	Use pcie_capability_read_word() for reading LNKSTA and remove the custom define that matches to PCI_EXP_LNKSTA. As only single user for cap_offset remains, replace it with a call to pci_pcie_cap(). Instead of e1000_adapter, make local variable out of pci_dev because both users are interested in it. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12	iavf: Fix iavf_shutdown to call iavf_remove instead iavf_close	Slawomir Laba	1	-51/+21
	Make the flow for pci shutdown be the same to the pci remove. iavf_shutdown was implementing an incomplete version of iavf_remove. It misses several calls to the kernel like iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap that might break the system on reboot or hibernation. Implement the call of iavf_remove directly in iavf_shutdown to close this gap. Fixes below error messages (dmesg) during shutdown stress tests - [685814.900917] ice 0000:88:00.0: MAC 02:d0:5f:82:43:5d does not exist for VF 0 [685814.900928] ice 0000:88:00.0: MAC 33:33:00:00:00:01 does not exist for VF 0 Reproduction: 1. Create one VF interface: echo 1 > /sys/class/net/<interface_name>/device/sriov_numvfs 2. Run live dmesg on the host: dmesg -wH 3. On SUT, script below steps into vf_namespace_assignment.sh <#!/bin/sh> // Remove <>. Git removes # line if=<VF name> (edit this per VF name) loop=0 while true; do echo test round $loop let loop++ ip netns add ns$loop ip link set dev $if up ip link set dev $if netns ns$loop ip netns exec ns$loop ip link set dev $if up ip netns exec ns$loop ip link set dev $if netns 1 ip netns delete ns$loop done 4. Run the script for at least 1000 iterations on SUT: ./vf_namespace_assignment.sh Expected result: No errors in dmesg. Fixes: 129cf89e5856 ("iavf: rename functions and structs to new name") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Co-developed-by: Ranganatha Rao <ranganatha.rao@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12	iavf: Handle ntuple on/off based on new state machines for flow director	Piotr Gardocki	1	-0/+59
	ntuple-filter feature on/off: Default is on. If turned off, the filters will be removed from both PF and iavf list. The removal is irrespective of current filter state. Steps to reproduce: ------------------- 1. Ensure ntuple is on. ethtool -K enp8s0 ntuple-filters on 2. Create a filter to receive the traffic into non-default rx-queue like 15 and ensure traffic is flowing into queue into 15. Now, turn off ntuple. Traffic should not flow to configured queue 15. It should flow to default RX queue. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12	iavf: Introduce new state machines for flow director	Piotr Gardocki	5	-23/+139
	New states introduced: IAVF_FDIR_FLTR_DIS_REQUEST IAVF_FDIR_FLTR_DIS_PENDING IAVF_FDIR_FLTR_INACTIVE Current FDIR state machines (SM) are not adequate to handle a few scenarios in the link DOWN/UP event, reset event and ntuple-feature. For example, when VF link goes DOWN and comes back UP administratively, the expectation is that previously installed filters should also be restored. But with current SM, filters are not restored. So with new SM, during link DOWN filters are marked as INACTIVE in the iavf list but removed from PF. After link UP, SM will transition from INACTIVE to ADD_REQUEST to restore the filter. Similarly, with VF reset, filters will be removed from the PF, but marked as INACTIVE in the iavf list. Filters will be restored after reset completion. Steps to reproduce: ------------------- 1. Create a VF. Here VF is enp8s0. 2. Assign IP addresses to VF and link partner and ping continuously from remote. Here remote IP is 1.1.1.1. 3. Check default RX Queue of traffic. ethtool -S enp8s0 \| grep -E "rx-[[:digit:]]+\.packets" 4. Add filter - change default RX Queue (to 15 here) ethtool -U ens8s0 flow-type ip4 src-ip 1.1.1.1 action 15 loc 5 5. Ensure filter gets added and traffic is received on RX queue 15 now. Link event testing: ------------------- 6. Bring VF link down and up. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Reset event testing: -------------------- 7. Reset the VF. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12	e1000e: Use PCI_EXP_LNKSTA_NLW & FIELD_GET() instead of custom defines/code	Ilpo Järvinen	2	-5/+4
	e1000e has own copy of PCI Negotiated Link Width field defines. Use the ones from include/uapi/linux/pci_regs.h instead of the custom ones and remove the custom ones and convert to FIELD_GET(). Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-12	igb: Use FIELD_GET() to extract Link Width	Ilpo Järvinen	1	-3/+3
	Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-08	net: Convert some ethtool_sprintf() to ethtool_puts()	justinstitt@google.com	7	-24/+13
	This patch converts some basic cases of ethtool_sprintf() to ethtool_puts(). The conversions are used in cases where ethtool_sprintf() was being used with just two arguments: \| ethtool_sprintf(&data, buffer[i].name); or when it's used with format string: "%s" \| ethtool_sprintf(&data, "%s", buffer[i].name); which both now become: \| ethtool_puts(&data, buffer[i].name); Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	6	-24/+14
	Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/stmicro/stmmac/dwmac5.c drivers/net/ethernet/stmicro/stmmac/dwmac5.h drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c drivers/net/ethernet/stmicro/stmmac/hwif.h 37e4b8df27bc ("net: stmmac: fix FPE events losing") c3f3b97238f6 ("net: stmmac: Refactor EST implementation") https://lore.kernel.org/all/20231206110306.01e91114@canb.auug.org.au/ Adjacent changes: net/ipv4/tcp_ao.c 9396c4ee93f9 ("net/tcp: Don't store TCP-AO maclen on reqsk") 7b0f570f879a ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-05	iavf: validate tx_coalesce_usecs even if rx_coalesce_usecs is zero	Jacob Keller	2	-11/+2
	In __iavf_set_coalesce, the driver checks both ec->rx_coalesce_usecs and ec->tx_coalesce_usecs for validity. It does this via a chain if if/else-if blocks. If every single branch of the series of if statements exited, this would be fine. However, the rx_coalesce_usecs is checked against zero to print an informative message if use_adaptive_rx_coalesce is enabled. If this check is true, it short circuits the entire chain of statements, preventing validation of the tx_coalesce_usecs field. Indeed, since commit e792779e6b63 ("iavf: Prevent changing static ITR values if adaptive moderation is on") the iavf driver actually rejects any change to the tx_coalesce_usecs or rx_coalesce_usecs when use_adaptive_tx_coalesce or use_adaptive_rx_coalesce is enabled, making this checking a bit redundant. Fix this error by removing the unnecessary and redundant checks for use_adaptive_rx_coalesce and use_adaptive_tx_coalesce. Since zero is a valid value, and since the tx_coalesce_usecs and rx_coalesce_usecs fields are already unsigned, remove the minimum value check. This allows assigning an ITR value ranging from 0-8160 as described by the printed message. Fixes: 65e87c0398f5 ("i40evf: support queue-specific settings for interrupt moderation") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-05	i40e: Fix unexpected MFS warning message	Ivan Vecera	1	-1/+1
	Commit 3a2c6ced90e1 ("i40e: Add a check to see if MFS is set") added a warning message that reports unexpected size of port's MFS (max frame size) value. This message use for the port number local variable 'i' that is wrong. In i40e_probe() this 'i' variable is used only to iterate VSIs to find FDIR VSI: <code> ... /* if FDIR VSI was set up, start it now */ for (i = 0; i < pf->num_alloc_vsi; i++) { if (pf->vsi[i] && pf->vsi[i]->type == I40E_VSI_FDIR) { i40e_vsi_open(pf->vsi[i]); break; } } ... </code> So the warning message use for the port number index of FDIR VSI if this exists or pf->num_alloc_vsi if not. Fix the message by using 'pf->hw.port' for the port number. Fixes: 3a2c6ced90e1 ("i40e: Add a check to see if MFS is set") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-05	ice: Restore fix disabling RX VLAN filtering	Marcin Szycik	1	-3/+8
	Fix setting dis_rx_filtering depending on whether port vlan is being turned on or off. This was originally fixed in commit c793f8ea15e3 ("ice: Fix disabling Rx VLAN filtering with port VLAN enabled"), but while refactoring ice_vf_vsi_init_vlan_ops(), the fix has been lost. Restore the fix along with the original comment from that change. Also delete duplicate lines in ice_port_vlan_on(). Fixes: 2946204b3fa8 ("ice: implement bridge port vlan") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-05	ice: change vfs.num_msix_per to vf->num_msix	Michal Swiatkowski	2	-9/+3
	vfs::num_msix_per should be only used as default value for vf->num_msix. For other use cases vf->num_msix should be used, as VF can have different MSI-X amount values. Fix incorrect register index calculation. vfs::num_msix_per and pf->sriov_base_vector shouldn't be used after implementation of changing MSI-X amount on VFs. Instead vf->first_vector_idx should be used, as it is storing value for first irq index. Fixes: fe1c5ca2fe76 ("ice: implement num_msix field per VF") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-05	ice: Rename E822 to E82X	Karol Kolacinski	6	-281/+280
	When code is applicable for both E822 and E823 devices, rename it from E822 to E82X. ICE_PHY_PER_NAC_E822 was unused, so just remove it. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	ice: periodically kick Tx timestamp interrupt	Jacob Keller	1	-0/+50
	The E822 hardware for Tx timestamping keeps track of how many outstanding timestamps are still in the PHY memory block. It will not generate a new interrupt to the MAC until all of the timestamps in the region have been read. If somehow all the available data is not read, but the driver has exited its interrupt routine already, the PHY will not generate a new interrupt even if new timestamp data is captured. Because no interrupt is generated, the driver never processes the timestamp data. This state results in a permanent failure for all future Tx timestamps. It is not clear how the driver and hardware could enter this state. However, if it does, there is currently no recovery mechanism. Add a recovery mechanism via the periodic PTP work thread which invokes ice_ptp_periodic_work(). Introduce a new check, ice_ptp_maybe_trigger_tx_interrupt() which checks the PHY timestamp ready bitmask. If any bits are set, trigger a software interrupt by writing to PFINT_OICR. Once triggered, the main timestamp processing thread will read through the PHY data and clear the outstanding timestamp data. Once cleared, new data should trigger interrupts as expected. This should allow recovery from such a state rather than leaving the device in a state where we cannot process Tx timestamps. It is possible that this function checks for timestamp data simultaneously with the interrupt, and it might trigger additional unnecessary interrupts. This will cause a small amount of additional processing. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	ice: Re-enable timestamping correctly after reset	Karol Kolacinski	2	-10/+11
	During reset, TX_TSYN interrupt should be processed as it may process timestamps in brief moments before and after reset. Timestamping should be enabled on VSIs at the end of reset procedure. On ice_get_phy_tx_tstamp_ready error, interrupt should not be rearmed because error only happens on resets. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	ice: Improve logs for max ntuple errors	Pawel Kaminski	1	-5/+8
	Supported number of ntuple filters affect also maximum location value that can be provided to ethtool command. Update error message to provide info about max supported value. Fix double spaces in the error messages. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Pawel Kaminski <pawel.kaminski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	ice: add CGU info to devlink info callback	Arkadiusz Kubalewski	1	-0/+20
	If Clock Generation Unit is present on NIC board user shall know its details. Provide the devlink info callback with a new: - fixed type object (cgu.id) indicating hardware variant of onboard CGU, - running type object (fw.cgu) consisting of CGU id, config and firmware versions. These information shall be known for debugging purposes. Test (on NIC board with CGU) $ devlink dev info <bus_name>/<dev_name> \| grep cgu cgu.id 36 fw.cgu 8032.16973825.6021 Test (on NIC board without CGU) $ devlink dev info <bus_name>/<dev_name> \| grep cgu -c 0 Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	ice: read internal temperature sensor	Konrad Knitter	10	-1/+249
	Since 4.30 firmware exposes internal thermal sensor reading via admin queue commands. Expose those readouts via hwmon API when supported. Datasheet: Get Sensor Reading Command (Opcode: 0x0632) +--------------------+--------+--------------------+-------------------------+ \| Name \| Bytes \| Value \| Remarks \| +--------------------+--------+--------------------+-------------------------+ \| Flags \| 1-0 \| \| \| \| Opcode \| 2-3 \| 0x0632 \| Command opcode \| \| Datalen \| 4-5 \| 0 \| No external buffer. \| \| Return value \| 6-7 \| \| Return value. \| \| Cookie High \| 8-11 \| Cookie \| \| \| Cookie Low \| 12-15 \| Cookie \| \| \| Sensor \| 16 \| \| 0x00: Internal temp \| \| \| \| \| 0x01-0xFF: Reserved. \| \| Format \| 17 \| Requested response \| Only 0x00 is supported. \| \| \| \| format \| 0x01-0xFF: Reserved. \| \| Reserved \| 18-23 \| \| \| \| Data Address high \| 24-27 \| Response buffer \| \| \| \| \| address \| \| \| Data Address low \| 28-31 \| Response buffer \| \| \| \| \| address \| \| +--------------------+--------+--------------------+-------------------------+ Get Sensor Reading Response (Opcode: 0x0632) +--------------------+--------+--------------------+-------------------------+ \| Name \| Bytes \| Value \| Remarks \| +--------------------+--------+--------------------+-------------------------+ \| Flags \| 1-0 \| \| \| \| Opcode \| 2-3 \| 0x0632 \| Command opcode \| \| Datalen \| 4-5 \| 0 \| No external buffer \| \| Return value \| 6-7 \| \| Return value. \| \| \| \| \| EINVAL: Invalid \| \| \| \| \| parameters \| \| \| \| \| ENOENT: Unsupported \| \| \| \| \| sensor \| \| \| \| \| EIO: Sensor access \| \| \| \| \| error \| \| Cookie High \| 8-11 \| Cookie \| \| \| Cookie Low \| 12-15 \| Cookie \| \| \| Sensor Reading \| 16-23 \| \| Format of the reading \| \| \| \| \| is dependent on request \| \| Data Address high \| 24-27 \| Response buffer \| \| \| \| \| address \| \| \| Data Address low \| 28-31 \| Response buffer \| \| \| \| \| address \| \| +--------------------+--------+--------------------+-------------------------+ Sensor Reading for Sensor 0x00 (Internal Chip Temperature): +--------------------+--------+--------------------+-------------------------+ \| Name \| Bytes \| Value \| Remarks \| +--------------------+--------+--------------------+-------------------------+ \| Thermal Sensor \| 0 \| \| Reading in degrees \| \| reading \| \| \| Celsius. Signed int8 \| \| Warning High \| 1 \| \| Warning High threshold \| \| threshold \| \| \| in degrees Celsius. \| \| \| \| \| Unsigned int8. \| \| \| \| \| 0xFF when unsupported \| \| Critical High \| 2 \| \| Critical High threshold \| \| threshold \| \| \| in degrees Celsius. \| \| \| \| \| Unsigned int8. \| \| \| \| \| 0xFF when unsupported \| \| Fatal High \| 3 \| \| Fatal High threshold \| \| threshold \| \| \| in degrees Celsius. \| \| \| \| \| Unsigned int8. \| \| \| \| \| 0xFF when unsupported \| \| Reserved \| 4-7 \| \| \| +--------------------+--------+--------------------+-------------------------+ Driver provides current reading from HW as well as device specific thresholds for thermal alarm (Warning, Critical, Fatal) events. $ sensors Output ========================================================= ice-pci-b100 Adapter: PCI adapter temp1: +62.0°C (high = +95.0°C, crit = +105.0°C) (emerg = +115.0°C) Tested on Intel Corporation Ethernet Controller E810-C for SFP Co-developed-by: Marcin Domagala <marcinx.domagala@intel.com> Signed-off-by: Marcin Domagala <marcinx.domagala@intel.com> Co-developed-by: Eric Joyner <eric.joyner@intel.com> Signed-off-by: Eric Joyner <eric.joyner@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Konrad Knitter <konrad.knitter@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-12-05	net: Add NAPI IRQ support	Amritha Nambiar	1	-0/+2
	Add support to associate the interrupt vector number for a NAPI instance. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147334728.5260.13221803396905901904.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-05	ice: Add support in the driver for associating queue with napi	Amritha Nambiar	4	-3/+84
	After the napi context is initialized, map the napi instance with the queue/queue-set on the corresponding irq line. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://lore.kernel.org/r/170147332060.5260.13310934657151560599.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	4	-50/+118
	Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-30	ice: Fix VF Reset paths when interface in a failed over aggregate	Dave Ertman	4	-50/+118
	There is an error when an interface has the following conditions: - PF is in an aggregate (bond) - PF has VFs created on it - bond is in a state where it is failed-over to the secondary interface - A VF reset is issued on one or more of those VFs The issue is generated by the originating PF trying to rebuild or reconfigure the VF resources. Since the bond is failed over to the secondary interface the queue contexts are in a modified state. To fix this issue, have the originating interface reclaim its resources prior to the tear-down and rebuild or reconfigure. Then after the process is complete, move the resources back to the currently active interface. There are multiple paths that can be used depending on what triggered the event, so create a helper function to move the queues and use paired calls to the helper (back to origin, process, then move back to active interface) under the same lag_mutex lock. Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20231127212340.1137657-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29	Merge branch '40GbE' of ↵	Jakub Kicinski	12	-200/+90
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-11-27 (i40e, iavf) This series contains updates to i40e and iavf drivers. Ivan Vecera performs more cleanups on i40e and iavf drivers; removing unused fields, defines, and unneeded fields. Petr Oros utilizes iavf_schedule_aq_request() helper to replace open coded equivalents. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: iavf: use iavf_schedule_aq_request() helper iavf: Remove queue tracking fields from iavf_adminq_ring i40e: Remove queue tracking fields from i40e_adminq_ring i40e: Remove AQ register definitions for VF types i40e: Delete unused and useless i40e_pf fields ==================== Link: https://lore.kernel.org/r/20231127211037.1135403-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29	ice: fix error code in ice_eswitch_attach()	Dan Carpenter	1	-1/+3
	Set the "err" variable on this error path. Fixes: fff292b47ac1 ("ice: add VF representors one by one") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/e0349ee5-76e6-4ff4-812f-4aa0d3f76ae7@moroto.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-27	iavf: use iavf_schedule_aq_request() helper	Petr Oros	2	-17/+8
	Use the iavf_schedule_aq_request() helper when we need to schedule a watchdog task immediately. No functional change. Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27	iavf: Remove queue tracking fields from iavf_adminq_ring	Ivan Vecera	4	-70/+39
	Fields 'head', 'tail', 'len', 'bah' and 'bal' in iavf_adminq_ring are used to store register offsets. These offsets are initialized and remains constant so there is no need to store them in the iavf_adminq_ring structure. Remove these fields from iavf_adminq_ring and use register offset constants instead. Remove iavf_adminq_init_regs() that originally stores these constants into these fields. Finally improve iavf_check_asq_alive() that assumes that non-zero value of hw->aq.asq.len indicates fully initialized AdminQ send queue. Replace it by check for non-zero value of field hw->aq.asq.count that is non-zero when the sending queue is initialized and is zeroed during shutdown of the queue. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27	i40e: Remove queue tracking fields from i40e_adminq_ring	Ivan Vecera	4	-70/+39
	Fields 'head', 'tail', 'len', 'bah' and 'bal' in i40e_adminq_ring are used to store register offsets. These offsets are initialized and remains constant so there is no need to store them in the i40e_adminq_ring structure. Remove these fields from i40e_adminq_ring and use register offset constants instead. Remove i40e_adminq_init_regs() that originally stores these constants into these fields. Finally improve i40e_check_asq_alive() that assumes that non-zero value of hw->aq.asq.len indicates fully initialized AdminQ send queue. Replace it by check for non-zero value of field hw->aq.asq.count that is non-zero when the sending queue is initialized and is zeroed during shutdown of the queue. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27	i40e: Remove AQ register definitions for VF types	Ivan Vecera	1	-10/+0
	The i40e driver does not handle its VF device types so there is no need to keep AdminQ register definitions for such device types. Remove them. Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-27	i40e: Delete unused and useless i40e_pf fields	Ivan Vecera	3	-33/+4
	Removed fields: .fc_autoneg_status Since commit c56999f94876 ("i40e/i40evf: Add set_fc and init of FC settings") write-only and otherwise unused .eeprom_version Write-only and otherwise unused .atr_sample_rate Has only one possible value (I40E_DEFAULT_ATR_SAMPLE_RATE). Remove it and replace its occurrences by I40E_DEFAULT_ATR_SAMPLE_RATE .adminq_work_limit Has only one possible value (I40E_AQ_WORK_LIMIT). Remove it and replace its occurrences by I40E_AQ_WORK_LIMIT .tx_sluggish_count Unused, never written .pf_seid Used to store VSI downlink seid and it is referenced only once in the same codepath. There is no need to save it into i40e_pf. Remove it and use downlink_seid directly in the mentioned log message. .instance Write only. Remove it as well as ugly static local variable 'pfs_found' in i40e_probe. .int_policy .switch_kobj .ptp_pps_work .ptp_extts1_work .ptp_pps_start .pps_delay .ptp_pin .override_q_count All these unused at all Prior the patch: pahole -Ci40e_pf drivers/net/ethernet/intel/i40e/i40e.ko \| tail -5 /* size: 5368, cachelines: 84, members: 127 / / sum members: 5297, holes: 20, sum holes: 71 / / paddings: 6, sum paddings: 19 / / last cacheline: 56 bytes / }; After the patch: pahole -Ci40e_pf drivers/net/ethernet/intel/i40e/i40e.ko \| tail -5 / size: 4976, cachelines: 78, members: 112 / / sum members: 4905, holes: 17, sum holes: 71 / / paddings: 6, sum paddings: 19 / / last cacheline: 48 bytes */ }; Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-23	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	6	-89/+92
	Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/ice/ice_main.c c9663f79cd82 ("ice: adjust switchdev rebuild path") 7758017911a4 ("ice: restore timestamp configuration after device reset") https://lore.kernel.org/all/20231121211259.3348630-1-anthony.l.nguyen@intel.com/ Adjacent changes: kernel/bpf/verifier.c bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations") 5f99f312bd3b ("bpf: add register bounds sanity checks and sanitization") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-23	i40e: Fix adding unsupported cloud filters	Ivan Vecera	1	-7/+9
	If a VF tries to add unsupported cloud filter through virtchnl then i40e_add_del_cloud_filter(_big_buf) returns -ENOTSUPP but this error code is stored in 'ret' instead of 'aq_ret' that is used as error code sent back to VF. In this scenario where one of the mentioned functions fails the value of 'aq_ret' is zero so the VF will incorrectly receive a 'success'. Use 'aq_ret' to store return value and remove 'ret' local variable. Additionally fix the issue when filter allocation fails, in this case no notification is sent back to the VF. Fixes: e284fc280473 ("i40e: Add and delete cloud filter") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20231121211338.3348677-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-23	ice: restore timestamp configuration after device reset	Jacob Keller	3	-40/+51
	The driver calls ice_ptp_cfg_timestamp() during ice_ptp_prepare_for_reset() to disable timestamping while the device is resetting. This operation destroys the user requested configuration. While the driver does call ice_ptp_cfg_timestamp in ice_rebuild() to restore some hardware settings after a reset, it unconditionally passes true or false, resulting in failure to restore previous user space configuration. This results in a device reset forcibly disabling timestamp configuration regardless of current user settings. This was not detected previously due to a quirk of the LinuxPTP ptp4l application. If ptp4l detects a missing timestamp, it enters a fault state and performs recovery logic which includes executing SIOCSHWTSTAMP again, restoring the now accidentally cleared configuration. Not every application does this, and for these applications, timestamps will mysteriously stop after a PF reset, without being restored until an application restart. Fix this by replacing ice_ptp_cfg_timestamp() with two new functions: 1) ice_ptp_disable_timestamp_mode() which unconditionally disables the timestamping logic in ice_ptp_prepare_for_reset() and ice_ptp_release() 2) ice_ptp_restore_timestamp_mode() which calls ice_ptp_restore_tx_interrupt() to restore Tx timestamping configuration, calls ice_set_rx_tstamp() to restore Rx timestamping configuration, and issues an immediate TSYN_TX interrupt to ensure that timestamps which may have occurred during the device reset get processed. Modify the ice_ptp_set_timestamp_mode to directly save the user configuration and then call ice_ptp_restore_timestamp_mode. This way, reset no longer destroys the saved user configuration. This obsoletes the ice_set_tx_tstamp() function which can now be safely removed. With this change, all devices should now restore Tx and Rx timestamping functionality correctly after a PF reset without application intervention. Fixes: 77a781155a65 ("ice: enable receive hardware timestamping") Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-23	ice: unify logic for programming PFINT_TSYN_MSK	Jacob Keller	1	-26/+34
	Commit d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") modified how Tx timestamps are handled for E822 devices. On these devices, only the clock owner handles reading the Tx timestamp data from firmware. To do this, the PFINT_TSYN_MSK register is modified from the default value to one which enables reacting to a Tx timestamp on all PHY ports. The driver currently programs PFINT_TSYN_MSK in different places depending on whether the port is the clock owner or not. For the clock owner, the PFINT_TSYN_MSK value is programmed during ice_ptp_init_owner just before calling ice_ptp_tx_ena_intr to program the PHY ports. For the non-clock owner ports, the PFINT_TSYN_MSK is programmed during ice_ptp_init_port. If a large enough device reset occurs, the PFINT_TSYN_MSK register will be reset to the default value in which only the PHY associated directly with the PF will cause the Tx timestamp interrupt to trigger. The driver lacks logic to reprogram the PFINT_TSYN_MSK register after a device reset. For the E822 device, this results in the PF no longer responding to interrupts for other ports. This results in failure to deliver Tx timestamps to user space applications. Rename ice_ptp_configure_tx_tstamp to ice_ptp_cfg_tx_interrupt, and unify the logic for programming PFINT_TSYN_MSK and PFINT_OICR_ENA into one place. This function will program both registers according to the combination of user configuration and device requirements. This ensures that PFINT_TSYN_MSK is always restored when we configure the Tx timestamp interrupt. Fixes: d938a8cca88a ("ice: Auxbus devices & driver for E822 TS") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-23	ice: remove ptp_tx ring parameter flag	Jacob Keller	3	-18/+0
	Before performing a Tx timestamp in ice_stamp(), the driver checks a ptp_tx ring variable to see if timestamping is enabled on that ring. This value is set for all rings whenever userspace configures Tx timestamping. Ostensibly this was done to avoid wasting cycles checking other fields when timestamping has not been enabled. However, for Tx timestamps we already get an individual per-SKB flag indicating whether userspace wants to request a timestamp on that packet. We do not gain much by also having a separate flag to check for whether timestamping was enabled. In fact, the driver currently fails to restore the field after a PF reset. Because of this, if a PF reset occurs, timestamps will be disabled. Since this flag doesn't add value in the hotpath, remove it and always provide a timestamp if the SKB flag has been set. A following change will fix the reset path to properly restore user timestamping configuration completely. This went unnoticed for some time because one of the most common applications using Tx timestamps, ptp4l, will reconfigure the socket as part of its fault recovery logic. Fixes: ea9b847cda64 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-19	Merge branch '100GbE' of ↵	Jakub Kicinski	14	-422/+553
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: one by one port representors creation Michal Swiatkowski says: Currently ice supports creating port representors only for VFs. For that use case they can be created and removed in one step. This patchset is refactoring current flow to support port representor creation also for subfunctions and SIOV. In this case port representors need to be created and removed one by one. Also, they can be added and removed while other port representors are running. To achieve that we need to change the switchdev configuration flow. Three first patches are only cosmetic (renaming, removing not used code). Next few ones are preparation for new flow. The most important one is "add VF representor one by one". It fully implements new flow. New type of port representor (for subfunction) will be introduced in follow up patchset. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: reserve number of CP queues ice: adjust switchdev rebuild path ice: add VF representors one by one ice: realloc VSI stats arrays ice: set Tx topology every time new repr is added ice: allow changing SWITCHDEV_CTRL VSI queues ice: return pointer to representor ice: make representor code generic ice: remove VF pointer reference in eswitch code ice: track port representors in xarray ice: use repr instead of vf->repr ice: track q_id in representor ice: remove unused control VSI parameter ice: remove redundant max_vsi_num variable ice: rename switchdev to eswitch ==================== Link: https://lore.kernel.org/r/20231114181449.1290117-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-19	Merge branch '1GbE' of ↵	Jakub Kicinski	6	-42/+105
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== igc: Add support for physical + free-running timers Vinicius Costa Gomes says: The objective is to allow having functionality that depends on the physical timer (taprio and ETF offloads, for example) and vclocks operating together. The "big" missing piece is the implementation of the .getcyclesx64() function in igc, as i225/i226 have multiple timers, we use one of those timers (timer 1) as a free-running (non adjustable) timer. The complication is that only implementing .getcyclesx64() and nothing else will break synchronization when using vclocks, as reading the clock will retrieve the free-running value but timnestamps will come from the adjustable timer. The solution is to modify "in one go" the timestamping code to be able to retrieve the timestamp from the correct timer (if a socket is "phc_bound" to a vclock the timestamp will come from the free-running timer). I was debating whether or not to do the adjustments for the internal latencies for the free-running timestamps, decided to do the adjustments so the path delay when using vclocks is similar to the one when using the physical clock. One future improvement is to implement the .getcrosscycles() function. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: Add support for PTP .getcyclesx64() igc: Simplify setting flags in the TX data descriptor ==================== Link: https://lore.kernel.org/r/20231114183640.1303163-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>