summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-10-29Merge branch 'clean-up-sfp-register-definitions'Jakub Kicinski1-90/+97
Russell King says: ==================== Clean up SFP register definitions This two-part patch series cleans up the SFP register definitions by 1. converting them from hex to decimal, as all the definitions in the documents use decimal, this makes it easier to cross-reference. 2. moving the bit definitions for each register along side their register address definition ==================== Link: https://lore.kernel.org/r/Y1qFvaDlLVM1fHdG@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: sfp: move field definitions along side register indexRussell King (Oracle)1-10/+17
Just as we do for the A2h enum, arrange the A0h enum to have the field definitions next to their corresponding register index. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: sfp: convert register indexes from hex to decimalRussell King (Oracle)1-90/+90
The register indexes in the standards are in decimal rather than hex, so lets specify them in decimal in the header file so we can easily cross-reference without converting between hex and decimal. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29Merge branch 'net-mtk_eth_soc-improve-pcs-implementation'Jakub Kicinski3-77/+134
Russell King says: ==================== net: mtk_eth_soc: improve PCS implementation As a result of invesigations from Frank Wunderlich, we know a lot more about the Mediatek "SGMII" PCS block, and can implement the PCS support correctly. This series achieves that, and Frank has tested the final result and reports that it works for him. The series could do with further testing by others, but I suspect that is unlikely to happen until it is merged based on past performances with this driver. Briefly, the patches in order: 1. Add a new helper to get the link timer duration in nanoseconds 2. Add definitions for the newly discovered registers and updates to bit definitions, including bitmasks for the BMCR, BMSR and two advertisement registers. 3. Remove unnecessary/unused error handling (functions always returning zero.) 4. Adding the missing pcs_get_state() implementation. 5. Converting the code to use regmap_update_bits() rather than open-coding read-modify-write sequences. 6. Adding out-of-band speed and duplex forcing for all non-inband modes not just the 802.3z link modes the code currently does. 7. Moving the release of the PHY power down to the main pcs_config() function. 8. Moving the interface speed selection to the main pcs_config() function. 9. Adding advertisement programming. 10. Adding correct link timer programming using the new helper in the first patch. 11. Adding support for 802.3z negotiation. There is one remaining issue - when configuring the PCS for in-band, for some reason the AN restart bit is always set. This should not be necessary, but requires further investigation with the hardware to find out whether it is really necessary. I suspect this was a work around for a previous poor implementation. ==================== Link: https://lore.kernel.org/r/Y1qDMw+DJLAJHT40@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: add support for in-band 802.3z negotiationRussell King (Oracle)1-35/+42
As a result of help from Frank Wunderlich to investigate and test, we now know how to program this PCS for in-band 802.3z negotiation. Add support for this by moving the contents of the two functions into the common mtk_pcs_config() function and adding the register settings for 802.3z negotiation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: move and correct link timer programmingRussell King (Oracle)1-5/+8
Program the link timer appropriately for the interface mode being used, using the newly introduced phylink helper that provides the nanosecond link timer interval. The intervals are 1.6ms for SGMII based protocols and 10ms for 802.3z based protocols. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: add advertisement programmingRussell King (Oracle)1-1/+12
Program the advertisement into the mtk PCS block. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: move interface speed selectionRussell King (Oracle)1-8/+10
Move the selection of the underlying interface speed to the pcs_config function, so we always program the interface speed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: move PHY power upRussell King (Oracle)1-7/+4
The PHY power up is common to both configuration paths, so move it into the parent function. We need to do this for all serdes modes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: add out of band forcing of speed and duplex in pcs_link_upRussell King (Oracle)1-11/+17
Add support for forcing the link speed and duplex setting in the pcs_link_up() method for out of band modes, which will be useful when we finish converting the pcs_config() method. Until then, we still have to force duplex for 802.3z modes to work correctly. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: convert mtk_sgmii to use regmap_update_bits()Russell King (Oracle)1-35/+26
mtk_sgmii does a lot of read-modify-write operations, for which there is a specific regmap function. Use this function instead of open-coding the operations. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: add pcs_get_state() implementationRussell King (Oracle)1-0/+15
Add a pcs_get_state() implementation which uses the advertisements to compute the resulting link modes, and BMSR contents to determine negotiation and link status. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: eliminate unnecessary error handlingRussell King (Oracle)1-12/+6
The functions called by the pcs_config() method always return zero, so there is no point trying to handle an error from these functions. Make these functions void, eliminate the "err" variable and simply return zero from the pcs_config() function itself. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: mtk_eth_soc: add definitions for PCSRussell King (Oracle)1-3/+10
As a result of help from Frank Wunderlich to investigate and test, we know a bit more about the PCS on the Mediatek platforms. Update the definitions from this investigation. This PCS appears similar, but not identical to the Lynx PCS. Although not included in this patch, but for future reference, the PHY ID registers at offset 4 read as 0x4d544950 'MTIP'. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: phylink: add phylink_get_link_timer_ns() helperRussell King (Oracle)1-0/+24
Add a helper to convert the PHY interface mode to the required link timer setting as stated by the appropriate standard. Inappropriate interface modes return an error. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29Merge branch 'net-remove-the-obsolte-u64_stats_fetch_-_irq'Jakub Kicinski79-319/+318
Sebastian Andrzej Siewior says: ==================== net: Remove the obsolte u64_stats_fetch_*_irq() This is the removal of u64_stats_fetch_*_irq() users in networking. The prerequisites are part of v6.1-rc1. The spi and bpf bits are not part of the series and have been routed directly. ==================== Link: https://lore.kernel.org/r/20221026132215.696950-1-bigeasy@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: Remove the obsolte u64_stats_fetch_*_irq() users (net).Thomas Gleixner16-45/+44
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).Thomas Gleixner63-274/+274
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-29Kalle Valo says:Jakub Kicinski122-1106/+35581
==================== pull-request: wireless-next-2022-10-28 First set of patches v6.2. mac80211 refactoring continues for Wi-Fi 7. All mac80211 driver are now converted to use internal TX queues, this might cause some regressions so we wanted to do this early in the cycle. Note: wireless tree was merged[1] to wireless-next to avoid some conflicts with mac80211 patches between the trees. Unfortunately there are still two smaller conflicts in net/mac80211/util.c which Stephen also reported[2]. In the first conflict initialise scratch_len to "params->scratch_len ?: 3 * params->len" (note number 3, not 2!) and in the second conflict take the version which uses elems->scratch_pos. [1] https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next.git/commit/?id=dfd2d876b3fda1790bc0239ba4c6967e25d16e91 [2] https://lore.kernel.org/all/20221020032340.5cf101c0@canb.auug.org.au/ mac80211 - preparation for Wi-Fi 7 Multi-Link Operation (MLO) continues - add API to show the link STAs in debugfs - all mac80211 drivers are now using mac80211 internal TX queues (iTXQs) rtw89 - support 8852BE rtl8xxxu - support RTL8188FU brmfmac - support two station interfaces concurrently bcma - support SPROM rev 11 ==================== Link: https://lore.kernel.org/r/20221028132943.304ECC433B5@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28nfc: s3fwrn5: use devm_clk_get_optional_enabled() helperDmitry Torokhov1-14/+5
Because we enable the clock immediately after acquiring it in probe, we can combine the 2 operations and use devm_clk_get_optional_enabled() helper. Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge branch 'txgbe'David S. Miller12-19/+1273
Jiawen Wu says: ==================== net: WangXun txgbe ethernet driver This patch series adds support for WangXun 10 gigabit NIC, to initialize hardware, set mac address, and register netdev. Change log: v6: address comments: Jakub Kicinski: check with scripts/kernel-doc v5: address comments: Jakub Kicinski: clean build with W=1 C=1 v4: address comments: Andrew Lunn: https://lore.kernel.org/all/YzXROBtztWopeeaA@lunn.ch/ v3: address comments: Andrew Lunn: remove hw function ops, reorder functions, use BIT(n) for register bit offset, move the same code of txgbe and ngbe to libwx v2: address comments: Andrew Lunn: https://lore.kernel.org/netdev/YvRhld5rD%2FxgITEg@lunn.ch/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Set MAC address and register netdevJiawen Wu7-7/+566
Add MAC address related operations, and register netdev. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Reset hardwareJiawen Wu9-9/+434
Reset and initialize the hardware by configuring the MAC layer. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: txgbe: Store PCI infoJiawen Wu9-10/+280
Get PCI config space info, set LAN id and check flash status. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge branch 'tcp-plb'David S. Miller13-2/+305
Mubashir Adnan Qureshi says: ==================== net: Add PLB functionality to TCP This patch series adds PLB (Protective Load Balancing) to TCP and hooks it up to DCTCP. PLB is disabled by default and can be enabled using relevant sysctls and support from underlying CC. PLB (Protective Load Balancing) is a host based mechanism for load balancing across switch links. It leverages congestion signals(e.g. ECN) from transport layer to randomly change the path of the connection experiencing congestion. PLB changes the path of the connection by changing the outgoing IPv6 flow label for IPv6 connections (implemented in Linux by calling sk_rethink_txhash()). Because of this implementation mechanism, PLB can currently only work for IPv6 traffic. For more information, see the SIGCOMM 2022 paper: https://doi.org/10.1145/3544216.3544226 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add rcv_wnd and plb_rehash to TCP_INFOMubashir Adnan Qureshi2-0/+7
rcv_wnd can be useful to diagnose TCP performance where receiver window becomes the bottleneck. rehash reports the PLB and timeout triggered rehash attempts by the TCP connection. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add u32 counter in tcp_sock and an SNMP counter for PLBMubashir Adnan Qureshi6-0/+9
A u32 counter is added to tcp_sock for counting the number of PLB triggered rehashes for a TCP connection. An SNMP counter is also added to count overall PLB triggered rehash events for a host. These counters are hooked up to PLB implementation for DCTCP. TCP_NLA_REHASH is added to SCM_TIMESTAMPING_OPT_STATS that reports the rehash attempts triggered due to PLB or timeouts. This gives a historical view of sustained congestion or timeouts experienced by the TCP connection. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add support for PLB in DCTCPMubashir Adnan Qureshi1-1/+22
PLB support is added to TCP DCTCP code. As DCTCP uses ECN as the congestion signal, PLB also uses ECN to make decisions whether to change the path or not upon sustained congestion. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add PLB functionality for TCPMubashir Adnan Qureshi4-2/+137
Congestion control algorithms track PLB state and cause the connection to trigger a path change when either of the 2 conditions is satisfied: - No packets are in flight and (# consecutive congested rounds >= sysctl_tcp_plb_idle_rehash_rounds) - (# consecutive congested rounds >= sysctl_tcp_plb_rehash_rounds) A round (RTT) is marked as congested when congestion signal (ECN ce_ratio) over an RTT is greater than sysctl_tcp_plb_cong_thresh. In the event of RTO, PLB (via tcp_write_timeout()) triggers a path change and disables congestion-triggered path changes for random time between (sysctl_tcp_plb_suspend_rto_sec, 2*sysctl_tcp_plb_suspend_rto_sec) to avoid hopping onto the "connectivity blackhole". RTO-triggered path changes can still happen during this cool-off period. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28tcp: add sysctls for TCP PLB parametersMubashir Adnan Qureshi4-0/+131
PLB (Protective Load Balancing) is a host based mechanism for load balancing across switch links. It leverages congestion signals(e.g. ECN) from transport layer to randomly change the path of the connection experiencing congestion. PLB changes the path of the connection by changing the outgoing IPv6 flow label for IPv6 connections (implemented in Linux by calling sk_rethink_txhash()). Because of this implementation mechanism, PLB can currently only work for IPv6 traffic. For more information, see the SIGCOMM 2022 paper: https://doi.org/10.1145/3544216.3544226 This commit adds new sysctl knobs and sets their default values for TCP PLB. Signed-off-by: Mubashir Adnan Qureshi <mubashirq@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge branch 'mxl-gpy-MDI-X'David S. Miller1-8/+90
Raju Lakkaraju says: ==================== net: phy: mxl-gpy: Add MDI-X This patch series add the MDI-X feature to GPY211 PHYs and Also Change return type to gpy_update_interface() function ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: phy: mxl-gpy: Add PHY Auto/MDI/MDI-X set driver for GPY211 chipsRaju Lakkaraju1-1/+71
Add support for MDI-X status and configuration for GPY211 chips Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28net: phy: mxl-gpy: Change gpy_update_interface() function return typeRaju Lakkaraju1-8/+20
gpy_update_interface() is called from gpy_read_status() which does return error codes. gpy_read_status() would benefit from returning -EINVAL, etc. Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski14-45/+695
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next 1) Move struct nft_payload_set definition to .c file where it is only used. 2) Shrink transport and inner header offset fields in the nft_pktinfo structure to 16-bits, from Florian Westphal. 3) Get rid of nft_objref Kbuild toggle, make it built-in into nf_tables. This expression is used to instantiate conntrack helpers in nftables. After removing the conntrack helper auto-assignment toggle it this feature became more important so move it to the nf_tables core module. Also from Florian. 4) Extend the existing function to calculate payload inner header offset to deal with the GRE and IPIP transport protocols. 6) Add inner expression support for nf_tables. This new expression provides a packet parser for tunneled packets which uses a userspace description of the expected inner headers. The inner expression invokes the payload expression (via direct call) to match on the inner header protocol fields using the inner link, network and transport header offsets. An example of the bytecode generated from userspace to match on IP source encapsulated in a VxLAN packet: # nft --debug=netlink add rule netdev x y udp dport 4789 vxlan ip saddr 1.2.3.4 netdev x y [ meta load l4proto => reg 1 ] [ cmp eq reg 1 0x00000011 ] [ payload load 2b @ transport header + 2 => reg 1 ] [ cmp eq reg 1 0x0000b512 ] [ inner type vxlan hdrsize 8 flags f [ meta load protocol => reg 1 ] ] [ cmp eq reg 1 0x00000008 ] [ inner type vxlan hdrsize 8 flags f [ payload load 4b @ network header + 12 => reg 1 ] ] [ cmp eq reg 1 0x04030201 ] 7) Store inner link, network and transport header offsets in percpu area to parse inner packet header once only. Matching on a different tunnel type invalidates existing offsets in the percpu area and it invokes the inner tunnel parser again. 8) Add support for inner meta matching. This support for NFTA_META_PROTOCOL, which specifies the inner ethertype, and NFT_META_L4PROTO, which specifies the inner transport protocol. 9) Extend nft_inner to parse GENEVE optional fields to calculate the link layer offset. 10) Update inner expression so tunnel offset points to GRE header to normalize tunnel header handling. This also allows to perform different interpretations of the GRE header from userspace. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nft_inner: set tunnel offset to GRE header offset netfilter: nft_inner: add geneve support netfilter: nft_meta: add inner match support netfilter: nft_inner: add percpu inner context netfilter: nft_inner: support for inner tunnel header matching netfilter: nft_payload: access ipip payload for inner offset netfilter: nft_payload: access GRE payload via inner offset netfilter: nft_objref: make it builtin netfilter: nf_tables: reduce nft_pktinfo by 8 bytes netfilter: nft_payload: move struct nft_payload_set definition where it belongs ==================== Link: https://lore.kernel.org/r/20221026132227.3287-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28net: dpaa2-eth: Simplify bool conversionYang Li1-1/+1
./drivers/net/ethernet/freescale/dpaa2/dpaa2-xsk.c:453:42-47: WARNING: conversion to bool not needed here Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2577 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20221026051824.38730-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge branch 'ionic-vf-attr-replay-and-other-updates'Jakub Kicinski6-32/+176
Shannon Nelson says: ==================== ionic: VF attr replay and other updates For better VF management when a FW update restart or a FW crash recover is detected, the PF now will replay any user specified VF attributes to be sure the FW hasn't lost them in the restart. Newer FW offers more packet processing offloads, so we now support them in the driver. A small refactor of the Rx buffer fill cleans a bit of code and will help future work on buffer caching. ==================== Link: https://lore.kernel.org/r/20221026143744.11598-1-snelson@pensando.io Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28ionic: refactor use of ionic_rx_fill()Neel Patel1-11/+12
The same pre-work code is used before each call to ionic_rx_fill(), so bring it in and make it a part of the routine. Signed-off-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28ionic: enable tunnel offloadsNeel Patel2-3/+13
Support stateless offloads for GRE, VXLAN, GENEVE, IPXIP4 and IPXIP6 when the FW supports them. Signed-off-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28ionic: new ionic device identity level and VF start controlShannon Nelson5-2/+64
A new ionic dev_cmd is added to the interface in ionic_if.h, with a new capabilities field in the ionic device identity to signal its availability in the FW. The identity level code is incremented to '2' to show support for this new capabilities bitfield. If the driver has indicated with the new identity level that it has the VF_CTRL command, newer FW will wait for the start command before starting the VFs after a FW update or crash recovery. This patch updates the driver to make use of the new VF start control in fw_up path to be sure that the PF has set the user attributes on the VF before the FW allows the VFs to restart. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28ionic: only save the user set VF attributesShannon Nelson1-16/+17
Report the current FW values for the VF attributes, but don't save the FW values locally, only save the vf attributes that are given to us from the user. This allows us to replay user data, and doesn't end up confusing things like "who set the mac address". Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28ionic: replay VF attributes after fw crash recoveryShannon Nelson1-0/+70
The VF attributes that the user has set into the FW through the PF can be lost over a FW crash recovery. Much like we already replay the PF mac/vlan filters, we now add a replay in the recovery path to be sure the FW has the up-to-date VF configurations. Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski142-578/+1616
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c 2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion") abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start") 8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27Merge tag 'net-6.1-rc3-2' of ↵Linus Torvalds62-347/+1197
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from 802.15.4 (Zigbee et al). Current release - regressions: - ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1 Current release - new code bugs: - mptcp: fix abba deadlock on fastopen - eth: stmmac: rk3588: allow multiple gmac controllers in one system Previous releases - regressions: - ip: rework the fix for dflt addr selection for connected nexthop - net: couple more fixes for misinterpreting bits in struct page after the signature was added Previous releases - always broken: - ipv6: ensure sane device mtu in tunnels - openvswitch: switch from WARN to pr_warn on a user-triggerable path - ethtool: eeprom: fix null-deref on genl_info in dump - ieee802154: more return code fixes for corner cases in dgram_sendmsg - mac802154: fix link-quality-indicator recording - eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack offload - eth: fec: limit register access on i.MX6UL - eth: bcm4908_enet: update TX stats after actual transmission - can: rcar_canfd: improve IRQ handling for RZ/G2L Misc: - genetlink: piggy back on the newly added resv_op_start to enforce more sanity checks on new commands" * tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits) net: enetc: survive memory pressure without crashing kcm: do not sense pfmemalloc status in kcm_sendpage() net: do not sense pfmemalloc status in skb_append_pagefrags() net/mlx5e: Fix macsec sci endianness at rx sa update net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function net/mlx5e: Fix macsec rx security association (SA) update/delete net/mlx5e: Fix macsec coverity issue at rx sa update net/mlx5: Fix crash during sync firmware reset net/mlx5: Update fw fatal reporter state on PCI handlers successful recover net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed net/mlx5e: TC, Reject forwarding from internal port to internal port net/mlx5: Fix possible use-after-free in async command interface net/mlx5: ASO, Create the ASO SQ with the correct timestamp format net/mlx5e: Update restore chain id for slow path packets net/mlx5e: Extend SKB room check to include PTP-SQ net/mlx5: DR, Fix matcher disconnect error flow net/mlx5: Wait for firmware to enable CRS before pci_restore_state net/mlx5e: Do not increment ESN when updating IPsec ESN state netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed ...
2022-10-27Merge tag 'execve-v6.1-rc3' of ↵Linus Torvalds2-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve fixes from Kees Cook: - Fix an ancient signal action copy race (Bernd Edlinger) - Fix a memory leak in ELF loader, when under memory pressure (Li Zetao) * tag 'execve-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: fs/binfmt_elf: Fix memory leak in load_elf_binary() exec: Copy oldsighand->action under spin-lock
2022-10-27Merge tag 'hardening-v6.1-rc3' of ↵Linus Torvalds4-36/+58
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening fixes from Kees Cook: - Fix older Clang vs recent overflow KUnit test additions (Nick Desaulniers, Kees Cook) - Fix kern-doc visibility for overflow helpers (Kees Cook) * tag 'hardening-v6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: overflow: Refactor test skips for Clang-specific issues overflow: disable failing tests for older clang versions overflow: Fix kern-doc markup for functions
2022-10-27Merge tag 'media/v6.1-3' of ↵Linus Torvalds7-13/+83
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A bunch of patches addressing issues in the vivid driver and adding new checks in V4L2 to validate the input parameters from some ioctls" * tag 'media/v6.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: vivid.rst: loop_video is set on the capture devnode media: vivid: set num_in/outputs to 0 if not supported media: vivid: drop GFP_DMA32 media: vivid: fix control handler mutex deadlock media: videodev2.h: V4L2_DV_BT_BLANKING_HEIGHT should check 'interlaced' media: v4l2-dv-timings: add sanity checks for blanking values media: vivid: dev->bitmap_cap wasn't freed in all cases media: vivid: s_fbuf: add more sanity checks
2022-10-27Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds3-9/+15
Pull fscrypt fix from Eric Biggers: "Fix a memory leak that was introduced by a change that went into -rc1" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: fix keyring memory leak on mount failure
2022-10-27net: enetc: survive memory pressure without crashingVladimir Oltean1-0/+5
Under memory pressure, enetc_refill_rx_ring() may fail, and when called during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not checked for. An extreme case of memory pressure will result in exactly zero buffers being allocated for the RX ring, and in such a case it is expected that hardware drops all RX packets due to lack of buffers. This does not happen, because the reset-default value of the consumer and produces index is 0, and this makes the ENETC think that all buffers have been initialized and that it owns them (when in reality none were). The hardware guide explains this best: | Configure the receive ring producer index register RBaPIR with a value | of 0. The producer index is initially configured by software but owned | by hardware after the ring has been enabled. Hardware increments the | index when a frame is received which may consume one or more BDs. | Hardware is not allowed to increment the producer index to match the | consumer index since it is used to indicate an empty condition. The ring | can hold at most RBLENR[LENGTH]-1 received BDs. | | Configure the receive ring consumer index register RBaCIR. The | consumer index is owned by software and updated during operation of the | of the BD ring by software, to indicate that any receive data occupied | in the BD has been processed and it has been prepared for new data. | - If consumer index and producer index are initialized to the same | value, it indicates that all BDs in the ring have been prepared and | hardware owns all of the entries. | - If consumer index is initialized to producer index plus N, it would | indicate N BDs have been prepared. Note that hardware cannot start if | only a single buffer is prepared due to the restrictions described in | (2). | - Software may write consumer index to match producer index anytime | while the ring is operational to indicate all received BDs prior have | been processed and new BDs prepared for hardware. Normally, the value of rx_ring->rcir (consumer index) is brought in sync with the rx_ring->next_to_use software index, but this only happens if page allocation ever succeeded. When PI==CI==0, the hardware appears to receive frames and write them to DMA address 0x0 (?!), then set the READY bit in the BD. The enetc_clean_rx_ring() function (and its XDP derivative) is naturally not prepared to handle such a condition. It will attempt to process those frames using the rx_swbd structure associated with index i of the RX ring, but that structure is not fully initialized (enetc_new_page() does all of that). So what happens next is undefined behavior. To operate using no buffer, we must initialize the CI to PI + 1, which will block the hardware from advancing the CI any further, and drop everything. The issue was seen while adding support for zero-copy AF_XDP sockets, where buffer memory comes from user space, which can even decide to supply no buffers at all (example: "xdpsock --txonly"). However, the bug is present also with the network stack code, even though it would take a very determined person to trigger a page allocation failure at the perfect time (a series of ifup/ifdown under memory pressure should eventually reproduce it given enough retries). Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27kcm: do not sense pfmemalloc status in kcm_sendpage()Eric Dumazet1-1/+1
Similar to changes done in TCP in blamed commit. We should not sense pfmemalloc status in sendpage() methods. Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027040637.1107703-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27net: do not sense pfmemalloc status in skb_append_pagefrags()Eric Dumazet1-1/+1
skb_append_pagefrags() is used by af_unix and udp sendpage() implementation so far. In commit 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status") we explained why we should not sense pfmemalloc status for pages owned by user space. We should also use skb_fill_page_desc_noacc() in skb_append_pagefrags() to avoid following KCSAN report: BUG: KCSAN: data-race in lru_add_fn / skb_append_pagefrags write to 0xffffea00058fc1c8 of 8 bytes by task 17319 on cpu 0: __list_add include/linux/list.h:73 [inline] list_add include/linux/list.h:88 [inline] lruvec_add_folio include/linux/mm_inline.h:323 [inline] lru_add_fn+0x327/0x410 mm/swap.c:228 folio_batch_move_lru+0x1e1/0x2a0 mm/swap.c:246 lru_add_drain_cpu+0x73/0x250 mm/swap.c:669 lru_add_drain+0x21/0x60 mm/swap.c:773 free_pages_and_swap_cache+0x16/0x70 mm/swap_state.c:311 tlb_batch_pages_flush mm/mmu_gather.c:59 [inline] tlb_flush_mmu_free mm/mmu_gather.c:256 [inline] tlb_flush_mmu+0x5b2/0x640 mm/mmu_gather.c:263 tlb_finish_mmu+0x86/0x100 mm/mmu_gather.c:363 exit_mmap+0x190/0x4d0 mm/mmap.c:3098 __mmput+0x27/0x1b0 kernel/fork.c:1185 mmput+0x3d/0x50 kernel/fork.c:1207 copy_process+0x19fc/0x2100 kernel/fork.c:2518 kernel_clone+0x166/0x550 kernel/fork.c:2671 __do_sys_clone kernel/fork.c:2812 [inline] __se_sys_clone kernel/fork.c:2796 [inline] __x64_sys_clone+0xc3/0xf0 kernel/fork.c:2796 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffffea00058fc1c8 of 8 bytes by task 17325 on cpu 1: page_is_pfmemalloc include/linux/mm.h:1817 [inline] __skb_fill_page_desc include/linux/skbuff.h:2432 [inline] skb_fill_page_desc include/linux/skbuff.h:2453 [inline] skb_append_pagefrags+0x210/0x600 net/core/skbuff.c:3974 unix_stream_sendpage+0x45e/0x990 net/unix/af_unix.c:2338 kernel_sendpage+0x184/0x300 net/socket.c:3561 sock_sendpage+0x5a/0x70 net/socket.c:1054 pipe_to_sendpage+0x128/0x160 fs/splice.c:361 splice_from_pipe_feed fs/splice.c:415 [inline] __splice_from_pipe+0x222/0x4d0 fs/splice.c:559 splice_from_pipe fs/splice.c:594 [inline] generic_splice_sendpage+0x89/0xc0 fs/splice.c:743 do_splice_from fs/splice.c:764 [inline] direct_splice_actor+0x80/0xa0 fs/splice.c:931 splice_direct_to_actor+0x305/0x620 fs/splice.c:886 do_splice_direct+0xfb/0x180 fs/splice.c:974 do_sendfile+0x3bf/0x910 fs/read_write.c:1255 __do_sys_sendfile64 fs/read_write.c:1323 [inline] __se_sys_sendfile64 fs/read_write.c:1309 [inline] __x64_sys_sendfile64+0x10c/0x150 fs/read_write.c:1309 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x0000000000000000 -> 0xffffea00058fc188 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 17325 Comm: syz-executor.0 Not tainted 6.1.0-rc1-syzkaller-00158-g440b7895c990-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/11/2022 Fixes: 326140063946 ("tcp: TX zerocopy should not sense pfmemalloc status") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20221027040346.1104204-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>