summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
2023-08-20ipv4: fix data-races around inet->inet_idEric Dumazet1-1/+1
UDP sendmsg() is lockless, so ip_select_ident_segs() can very well be run from multiple cpus [1] Convert inet->inet_id to an atomic_t, but implement a dedicated path for TCP, avoiding cost of a locked instruction (atomic_add_return()) Note that this patch will cause a trivial merge conflict because we added inet->flags in net-next tree. v2: added missing change in drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c (David Ahern) [1] BUG: KCSAN: data-race in __ip_make_skb / __ip_make_skb read-write to 0xffff888145af952a of 2 bytes by task 7803 on cpu 1: ip_select_ident_segs include/net/ip.h:542 [inline] ip_select_ident include/net/ip.h:556 [inline] __ip_make_skb+0x844/0xc70 net/ipv4/ip_output.c:1446 ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560 udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260 inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830 sock_sendmsg_nosec net/socket.c:725 [inline] sock_sendmsg net/socket.c:748 [inline] ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494 ___sys_sendmsg net/socket.c:2548 [inline] __sys_sendmmsg+0x269/0x500 net/socket.c:2634 __do_sys_sendmmsg net/socket.c:2663 [inline] __se_sys_sendmmsg net/socket.c:2660 [inline] __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff888145af952a of 2 bytes by task 7804 on cpu 0: ip_select_ident_segs include/net/ip.h:541 [inline] ip_select_ident include/net/ip.h:556 [inline] __ip_make_skb+0x817/0xc70 net/ipv4/ip_output.c:1446 ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560 udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260 inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830 sock_sendmsg_nosec net/socket.c:725 [inline] sock_sendmsg net/socket.c:748 [inline] ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494 ___sys_sendmsg net/socket.c:2548 [inline] __sys_sendmmsg+0x269/0x500 net/socket.c:2634 __do_sys_sendmmsg net/socket.c:2663 [inline] __se_sys_sendmmsg net/socket.c:2660 [inline] __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x184d -> 0x184e Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 7804 Comm: syz-executor.1 Not tainted 6.5.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 ================================================================== Fixes: 23f57406b82d ("ipv4: avoid using shared IP generator for connected sockets") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20net: validate veth and vxcan peer ifindexesJakub Kicinski2-10/+2
veth and vxcan need to make sure the ifindexes of the peer are not negative, core does not validate this. Using iproute2 with user-space-level checking removed: Before: # ./ip link add index 10 type veth peer index -1 # ip link show 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 52:54:00:74:b2:03 brd ff:ff:ff:ff:ff:ff 10: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 8a:90:ff:57:6d:5d brd ff:ff:ff:ff:ff:ff -1: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:ed:18:e6:fa:7f brd ff:ff:ff:ff:ff:ff Now: $ ./ip link add index 10 type veth peer index -1 Error: ifindex can't be negative. This problem surfaced in net-next because an explicit WARN() was added, the root cause is older. Fixes: e6f8f1a739b6 ("veth: Allow to create peer link with given ifindex") Fixes: a8f820a380a2 ("can: add Virtual CAN Tunnel driver (vxcan)") Reported-by: syzbot+5ba06978f34abb058571@syzkaller.appspotmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20net: dsa: realtek: add phylink_get_caps implementationRussell King (Oracle)1-0/+28
The user ports use RSGMII, but we don't have that, and DT doesn't specify a phy interface mode, so phylib defaults to GMII. These support 1G, 100M and 10M with flow control. It is unknown whether asymetric pause is supported at all speeds. The CPU port uses MII/GMII/RGMII/REVMII by hardware pin strapping, and support speeds specific to each, with full duplex only supported in some modes. Flow control may be supported again by hardware pin strapping, and theoretically is readable through a register but no information is given in the datasheet for that. So, we do a best efforts - and be lenient. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20net: bcmgenet: Fix return value check for fixed_phy_register()Ruan Jinjie1-1/+1
The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: b0ba512e25d7 ("net: bcmgenet: enable driver to work without a device tree") Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Doug Berger <opendmb@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20net: bgmac: Fix return value check for fixed_phy_register()Ruan Jinjie1-1/+1
The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: c25b23b8a387 ("bgmac: register fixed PHY for ARM BCM470X / BCM5301X chipsets") Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20net/mlx5: Add RoCE MACsec steering infrastructure in corePatrisious Haddad2-9/+391
Adds all the core steering helper functions that are needed in order to setup RoCE steering rules which includes both the RX and TX rules addition and deletion. As well as exporting the function to be ready to use from the IB driver where we expose functions to allow deletion of all rules, which is needed when a GID is deleted, or a deletion of a specific rule when an SA is deleted, and a similar manner for the rules addition. These functions are used in a later patch by IB driver to trigger the rules addition/deletion when needed. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Configure MACsec steering for ingress RoCEv2 trafficPatrisious Haddad2-9/+352
Add steering tables/rules to check if the decrypted traffic is RoCEv2, if so copy reg_b_metadata to a temp reg and forward it to RDMA_RX domain. The rules are added once the MACsec device is assigned an IP address where we verify that the packet ip is for MACsec device and that the temp reg has MACsec operation and a valid SCI inside. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Configure MACsec steering for egress RoCEv2 trafficPatrisious Haddad1-1/+45
Add steering table in RDMA_TX domain, to forward MACsec traffic to MACsec crypto table in NIC domain. The tables are created in a lazy manner when the first TX SA is being created, and destroyed upon the destruction of the last SA. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Add MACsec priorities in RDMA namespacesPatrisious Haddad1-3/+32
Add MACsec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the MACsec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20RDMA/mlx5: Implement MACsec gid addition and deletionPatrisious Haddad2-33/+0
Handle MACsec IP ambiguity issue, since mlx5 hw can't support programming both the MACsec and the physical gid when they have the same IP address, because it wouldn't know to whom to steer the traffic. Hence in such case we delete the physical gid from the hw gid table, which would then cause all traffic sent over it to fail, and we'll only be able to send traffic over the MACsec gid. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Maintain fs_id xarray per MACsec device inside macsec steeringPatrisious Haddad3-94/+271
Remove fs_id from the MACsec SA, since it has no real usage there and instead maintain with the MACsec steering data inside the core. Downstream patches requires this change to facilitate IB driver accesses to the fs_ids to avoid RoCE MACsec dependency on EN driver. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Remove netdevice from MACsec steeringPatrisious Haddad3-77/+71
Since MACsec steering was moved from ethernet private code to core, remove the netdevice from the MACsec steering, and use core device methods for error reporting instead. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5e: Move MACsec flow steering and statistics database from ethernet ↵Patrisious Haddad5-25/+20
to core Since now MACsec flow steering (macsec_fs) and MACsec statistics (stats) are maintained by the core driver, move their data as well to be saved inside core structures instead of staying part of ethernet MACsec database. In addition cleanup all MACsec stats functions from the ethernet MACsec code and move what's needed to be part of macsec_fs instead. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5e: Rename MACsec flow steering functions/parameters to suit core ↵Patrisious Haddad5-119/+119
naming style Rename MACsec flow steering(macsec_fs) functions and parameters from ethernet(core/en_accel) naming convention to core naming convention. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5: Remove dependency of macsec flow steering on ethernetPatrisious Haddad1-9/+34
Since macsec flow steering was moved to core, it should be independent of all ethernet code and structures hence we remove all ethernet header includes and redefine ethernet structs internally for macsec_fs usage where needed. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20net/mlx5e: Move MACsec flow steering operations to be used as core libraryPatrisious Haddad9-29/+27
Move MACsec flow steering operations(macsec_fs) from core/en_accel to core/lib, this mandates moving MACsec statistics structure from the general MACsec code header(en_accel/macsec.h) to macsec_fs header to remove macsec_fs.h dependency over en_accel/macsec.h. This to lay the ground for RoCE MACsec by moving all the data that will need to be accessed by both ethernet MACsec and RoCE MACsec to be shared at core. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20macsec: add functions to get macsec real netdevice and check offloadPatrisious Haddad1-0/+12
Given a macsec net_device add two functions to return the real net_device for that device, and check if that macsec device is offloaded or not. This is needed for auxiliary drivers that implement MACsec offload, but have flows which are triggered over the macsec net_device, this allows the drivers in such cases to verify if the device is offloaded or not, and to access the real device of that macsec device, which would belong to the driver, and would be needed for the offload procedure. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-19net: microchip: sparx5: Update return value check for vcap_get_rule()Ruan Jinjie1-1/+1
As Simon Horman suggests, update vcap_get_rule() to always return an ERR_PTR() and update the error detection conditions to use IS_ERR(), so use IS_ERR() to check the return value. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Suggested-by: Simon Horman <horms@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: lan966x: Fix return value check for vcap_get_rule()Ruan Jinjie1-2/+2
As Simon Horman suggests, update vcap_get_rule() to always return an ERR_PTR() and update the error detection conditions to use IS_ERR(), so use IS_ERR() to fix the return value issue. Fixes: 72df3489fb10 ("net: lan966x: Add ptp trap rules") Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Suggested-by: Simon Horman <horms@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: microchip: vcap api: Always return ERR_PTR for vcap_get_rule()Ruan Jinjie1-1/+1
As Simon Horman suggests, update vcap_get_rule() to always return an ERR_PTR() and update the error detection conditions to use IS_ERR(), which would be more cleaner in this case. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Suggested-by: Simon Horman <horms@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: phy: Fix deadlocking in phy_error() invocationSerge Semin1-4/+7
Since commit 91a7cda1f4b8 ("net: phy: Fix race condition on link status change") all the phy_error() method invocations have been causing the nested-mutex-lock deadlock because it's normally done in the PHY-driver threaded IRQ handlers which since that change have been called with the phydev->lock mutex held. Here is the calls thread: IRQ: phy_interrupt() +-> mutex_lock(&phydev->lock); <--------------------+ drv->handle_interrupt() | Deadlock due +-> ERROR: phy_error() + to the nested +-> phy_process_error() | mutex lock +-> mutex_lock(&phydev->lock); <-+ phydev->state = PHY_ERROR; mutex_unlock(&phydev->lock); mutex_unlock(&phydev->lock); The problem can be easily reproduced just by calling phy_error() from any PHY-device threaded interrupt handler. Fix it by dropping the phydev->lock mutex lock from the phy_process_error() method and printing a nasty error message to the system log if the mutex isn't held in the caller execution context. Note for the fix to work correctly in the PHY-subsystem itself the phydev->lock mutex locking must be added to the phy_error_precise() function. Link: https://lore.kernel.org/netdev/20230816180944.19262-1-fancer.lancer@gmail.com Fixes: 91a7cda1f4b8 ("net: phy: Fix race condition on link status change") Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: mdio: xgene: remove useless xgene_mdio_statusRussell King (Oracle)1-3/+0
xgene_mdio_status is declared static, and is only written once by the driver. It appears to have been this way since the driver was first added to the kernel tree. No other users can be found, so let's remove it. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19stmmac: intel: Enable correction of MAC propagation delayKurt Kanzenbach1-0/+1
All captured timestamps should be corrected by PHY, MAC and CDC introduced latency/errors. The CDC correction is already used. Enable MAC propagation delay correction as well which is available since commit 26cfb838aa00 ("net: stmmac: correct MAC propagation delay"). Before: |ptp4l[390.458]: rms 7 max 21 freq +177 +/- 14 delay 357 +/- 1 After: |ptp4l[620.012]: rms 7 max 20 freq +195 +/- 14 delay 345 +/- 1 Tested on Intel Elkhart Lake. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Johannes Zink <j.zink@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: sfp: handle 100G/25G active optical cables in sfp_parse_supportJosua Mayer1-0/+10
Handle extended compliance code 0x1 (SFF8024_ECC_100G_25GAUI_C2M_AOC) for active optical cables supporting 25G and 100G speeds. Since the specification makes no statement about transmitter range, and as the specific sfp module that had been tested features only 2m fiber - short-range (SR) modes are selected. The 100G speed is irrelevant because it would require multiple fibers / multiple SFP28 modules combined under one netdev. sfp-bus.c only handles a single module per netdev, so only 25Gbps modes are selected. sfp_parse_support already handles SFF8024_ECC_100GBASE_SR4_25GBASE_SR with compatible properties, however that entry is a contradiction in itself since with SFP(28) 100GBASE_SR4 is impossible - that would likely be a mode for qsfp modules only. Add a case for SFF8024_ECC_100G_25GAUI_C2M_AOC selecting 25gbase-r interface mode and 25000baseSR link mode. Also enforce SFP28 bitrate limits on the values read from sfp eeprom as requested by Russell King. Tested with fs.com S28-AO02 AOC SFP28 module. Signed-off-by: Josua Mayer <josua@solid-run.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: mdio: mdio-bitbang: Fix C45 read/write protocolSerge Semin1-2/+2
Based on the original code semantic in case of Clause 45 MDIO, the address command is supposed to be followed by the command sending the MMD address, not the CSR address. The commit 002dd3de097c ("net: mdio: mdio-bitbang: Separate C22 and C45 transactions") has erroneously broken that. So most likely due to an unfortunate variable name it switched the code to sending the CSR address. In our case it caused the protocol malfunction so the read operation always failed with the turnaround bit always been driven to one by PHY instead of zero. Fix that by getting back the correct behaviour: sending MMD address command right after the regular address command. Fixes: 002dd3de097c ("net: mdio: mdio-bitbang: Separate C22 and C45 transactions") Signed-off-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19net: dsa: mt7530: fix handling of 802.1X PAE framesArınç ÜNAL2-0/+6
802.1X PAE frames are link-local frames, therefore they must be trapped to the CPU port. Currently, the MT753X switches treat 802.1X PAE frames as regular multicast frames, therefore flooding them to user ports. To fix this, set 802.1X PAE frames to be trapped to the CPU port(s). Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-19mlxsw: Fix the size of 'VIRT_ROUTER_MSB'Amit Cohen3-5/+5
The field 'virtual router' was extended to 12 bits in Spectrum-4. Therefore, the element 'MLXSW_AFK_ELEMENT_VIRT_ROUTER_MSB' needs 3 bits for Spectrum < 4 and 4 bits for Spectrum >= 4. The elements are stored in an internal storage scratchpad. Currently, the MSB is defined there as 3 bits. It means that for Spectrum-4, only 2K VRFs can be used for multicast routing, as the highest bit is not really used by the driver. Fix the definition of 'VIRT_ROUTER_MSB' to use 4 bits. Adjust the definitions of 'virtual router' field in the blocks accordingly - use '_avoid_size_check' for Spectrum-2 instead of for Spectrum-4. Fix the mask in parse function to use 4 bits. Fixes: 6d5d8ebb881c ("mlxsw: Rename virtual router flex key element") Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/79bed2b70f6b9ed58d4df02e9798a23da648015b.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19mlxsw: reg: Fix SSPR register layoutIdo Schimmel1-9/+0
The two most significant bits of the "local_port" field in the SSPR register are always cleared since they are overwritten by the deprecated and overlapping "sub_port" field. On systems with more than 255 local ports (e.g., Spectrum-4), this results in the firmware maintaining invalid mappings between system port and local port. Specifically, two different systems ports (0x1 and 0x101) point to the same local port (0x1), which eventually leads to firmware errors. Fix by removing the deprecated "sub_port" field. Fixes: fd24b29a1b74 ("mlxsw: reg: Align existing registers to use extended local_port field") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/9b909a3033c8d3d6f67f237306bef4411c5e6ae4.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19mlxsw: pci: Set time stamp fields also when its type is MIRROR_UTCDanielle Ratson1-2/+6
Currently, in Spectrum-2 and above, time stamps are extracted from the CQE into the time stamp fields in 'struct mlxsw_skb_cb', only when the CQE time stamp type is UTC. The time stamps are read directly from the CQE and software can get the time stamp in UTC format using CQEv2. From Spectrum-4, the time stamps that are read from the CQE are allowed to be also from MIRROR_UTC type. Therefore, we get a warning [1] from the driver that the time stamp fields were not set, when LLDP control packet is sent. Allow the time stamp type to be MIRROR_UTC and set the time stamp in this case as well. [1] WARNING: CPU: 11 PID: 0 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1409 mlxsw_sp2_ptp_hwtstamp_fill+0x1f/0x70 [mlxsw_spectrum] [...] Call Trace: <IRQ> mlxsw_sp2_ptp_receive+0x3c/0x80 [mlxsw_spectrum] mlxsw_core_skb_receive+0x119/0x190 [mlxsw_core] mlxsw_pci_cq_tasklet+0x3c9/0x780 [mlxsw_pci] tasklet_action_common.constprop.0+0x9f/0x110 __do_softirq+0xbb/0x296 irq_exit_rcu+0x79/0xa0 common_interrupt+0x86/0xa0 </IRQ> <TASK> Fixes: 4735402173e6 ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/bcef4d044ef608a4e258d33a7ec0ecd91f480db5.1692268427.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19net: freescale: Remove unused declarationsYue Haibing2-9/+0
Commit 5d93cfcf7360 ("net: dpaa: Convert to phylink") removed fman_set_mac_active_pause()/fman_get_pause_cfg() but not declarations. Commit 48257c4f168e ("Add fs_enet ethernet network driver, for several embedded platforms.") declared but never implemented fs_enet_platform_init() and fs_enet_platform_cleanup(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230817134159.38484-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19ipvlan: Fix a reference count leak warning in ipvlan_ns_exit()Lu Wei1-1/+2
There are two network devices(veth1 and veth3) in ns1, and ipvlan1 with L3S mode and ipvlan2 with L2 mode are created based on them as figure (1). In this case, ipvlan_register_nf_hook() will be called to register nf hook which is needed by ipvlans in L3S mode in ns1 and value of ipvl_nf_hook_refcnt is set to 1. (1) ns1 ns2 ------------ ------------ veth1--ipvlan1 (L3S) veth3--ipvlan2 (L2) (2) ns1 ns2 ------------ ------------ veth1--ipvlan1 (L3S) ipvlan2 (L2) veth3 | | |------->-------->--------->-------- migrate When veth3 migrates from ns1 to ns2 as figure (2), veth3 will register in ns2 and calls call_netdevice_notifiers with NETDEV_REGISTER event: dev_change_net_namespace call_netdevice_notifiers ipvlan_device_event ipvlan_migrate_l3s_hook ipvlan_register_nf_hook(newnet) (I) ipvlan_unregister_nf_hook(oldnet) (II) In function ipvlan_migrate_l3s_hook(), ipvl_nf_hook_refcnt in ns1 is not 0 since veth1 with ipvlan1 still in ns1, (I) and (II) will be called to register nf_hook in ns2 and unregister nf_hook in ns1. As a result, ipvl_nf_hook_refcnt in ns1 is decreased incorrectly and this in ns2 is increased incorrectly. When the second net namespace is removed, a reference count leak warning in ipvlan_ns_exit() will be triggered. This patch add a check before ipvlan_migrate_l3s_hook() is called. The warning can be triggered as follows: $ ip netns add ns1 $ ip netns add ns2 $ ip netns exec ns1 ip link add veth1 type veth peer name veth2 $ ip netns exec ns1 ip link add veth3 type veth peer name veth4 $ ip netns exec ns1 ip link add ipv1 link veth1 type ipvlan mode l3s $ ip netns exec ns1 ip link add ipv2 link veth3 type ipvlan mode l2 $ ip netns exec ns1 ip link set veth3 netns ns2 $ ip net del ns2 Fixes: 3133822f5ac1 ("ipvlan: use pernet operations and restrict l3s hooks to master netns") Signed-off-by: Lu Wei <luwei32@huawei.com> Reviewed-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20230817145449.141827-1-luwei32@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Add tx_resets ring counterMichael Chan3-0/+10
Add a new tx_resets ring counter. This counter will be saved as tx_total_resets across any reset. Since we currently do a full reset in bnxt_sched_reset_txr(), the per ring counter will always be cleared during reset. Only the tx_total_resets count will be meaningful and we only display this under ethtool -S. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Display the ring error counters under ethtool -SMichael Chan1-25/+23
The existing driver displays the sum of 4 ring counters under ethtool -S. These counters are in the array bnxt_sw_func_stats. These counters are summed at the time of ethtool -S and will be lost when the device is reset. Replace these counters with the new total ring error counters added in the last patch. These new counters are saved before reset. ethtool -S will now display the sum of the saved counters plus the current counters. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Save ring error counters across resetMichael Chan2-1/+46
Currently, the ring counters are stored in the per ring datastructure. During reset, all the rings are freed together with the associated datastructures. As a result, all the ring error counters will be reset to zero. Add logic to keep track of the total error counts of all the rings and save them before reset (including ifdown). The next patch will display these total ring error counters under ethtool -S. Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/ Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Increment rx_resets counter in bnxt_disable_napi()Michael Chan1-5/+7
If we are doing a complete reset with irq_re_init set to true in bnxt_close_nic(), all the ring structures will be freed. New structures will be allocated in bnxt_open_nic(). The current code increments rx_resets counter in bnxt_enable_napi() if bnapi->in_reset is true. In a complete reset, bnapi->in_reset will never be true since the structure is just allocated. Increment the rx_resets counter in bnxt_disable_napi() instead. This will allow us to save all the ring error counters including the rx_resets counters in the next patch. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Let the page pool manage the DMA mappingSomnath Kotur1-22/+10
Use the page pool's ability to maintain DMA mappings for us. This avoids re-mapping of the recycled pages. Link: https://lore.kernel.org/netdev/20230728231829.235716-4-michael.chan@broadcom.com/ Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19bnxt_en: Use the unified RX page pool buffers for XDP and non-XDPSomnath Kotur2-60/+14
Convert to use the page pool buffers for the aggregation ring when running in non-XDP mode. This simplifies the driver and we benefit from the recycling of pages. Adjust the page pool size to account for the aggregation ring size. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20230817231911.165035-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19Merge branch '100GbE' of ↵Jakub Kicinski19-735/+624
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-08-17 (ice) This series contains updates to ice driver only. Jan removes unused functions and refactors code to make, possible, functions static. Jake rearranges some functions to be logically grouped. Marcin removes an unnecessary call to disable VLAN stripping. Yang Yingliang utilizes list_for_each_entry() helper for a couple list traversals. Przemek removes some parameters from ice_aq_alloc_free_res() which were always the same and reworks ice_aq_wait_for_event() to reduce chance of race. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: split ice_aq_wait_for_event() func into two ice: embed &ice_rq_event_info event into struct ice_aq_task ice: ice_aq_check_events: fix off-by-one check when filling buffer ice: drop two params from ice_aq_alloc_free_res() ice: use list_for_each_entry() helper ice: Remove redundant VSI configuration in eswitch setup ice: move E810T functions to before device agnostic ones ice: refactor ice_vsi_is_vlan_pruning_ena ice: refactor ice_ptp_hw to make functions static ice: refactor ice_sched to make functions static ice: Utilize assign_bit() helper ice: refactor ice_vf_lib to make functions static ice: refactor ice_lib to make functions static ice: refactor ice_ddp to make functions static ice: remove unused methods ==================== Link: https://lore.kernel.org/r/20230817212239.2601543-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19net: dsa: felix: fix oversize frame dropping for always closed tc-taprio gatesVladimir Oltean1-0/+3
The blamed commit resolved a bug where frames would still get stuck at egress, even though they're smaller than the maxSDU[tc], because the driver did not take into account the extra 33 ns that the queue system needs for scheduling the frame. It now takes that into account, but the arithmetic that we perform in vsc9959_tas_remaining_gate_len_ps() is buggy, because we operate on 64-bit unsigned integers, so gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS may become a very large integer if gate_len_ns < 33 ns. In practice, this means that we've introduced a regression where all traffic class gates which are permanently closed will not get detected by the driver, and we won't enable oversize frame dropping for them. Before: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 2 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 3 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 4 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 5 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 6 min gate len 0, sending all frames mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS After: mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000 mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS Fixes: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230817120111.3522827-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19net: dm9051: Use PTR_ERR_OR_ZERO() to simplify codeRuan Jinjie1-4/+1
Return PTR_ERR_OR_ZERO() instead of return 0 or PTR_ERR() to simplify code. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230817022418.3588831-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19sky2: Remove redundant NULL check for debugfs_create_dirRuan Jinjie1-1/+1
Since debugfs_create_dir() returns ERR_PTR, IS_ERR() is enough to check whether the directory is successfully created. So remove the redundant NULL check. Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230817073017.350002-1-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19octeontx2-af: SDP: fix receive link configHariprasad Kelam1-1/+2
On SDP interfaces, frame oversize and undersize errors are observed as driver is not considering packet sizes of all subscribers of the link before updating the link config. This patch fixes the same. Fixes: 9b7dd87ac071 ("octeontx2-af: Support to modify min/max allowed packet lengths") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230817063006.10366-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19pds_core: remove redundant pci_clear_master()Yu Liao1-4/+2
do_pci_disable_device() disable PCI bus-mastering as following: static void do_pci_disable_device(struct pci_dev *dev) { u16 pci_command; pci_read_config_word(dev, PCI_COMMAND, &pci_command); if (pci_command & PCI_COMMAND_MASTER) { pci_command &= ~PCI_COMMAND_MASTER; pci_write_config_word(dev, PCI_COMMAND, pci_command); } pcibios_disable_device(dev); } And pci_disable_device() sets dev->is_busmaster to 0. pci_enable_device() is called only once before calling to pci_disable_device() and such pci_clear_master() is not needed. So remove redundant pci_clear_master(). Also rename goto label 'err_out_clear_master' to 'err_out_disable_device'. Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20230817025709.2023553-1-liaoyu15@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19Merge branch '40GbE' of ↵Jakub Kicinski6-56/+42
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== virtchnl: fix fake 1-elem arrays Alexander Lobakin says: 6.5-rc1 started spitting warning splats when composing virtchnl messages, precisely on virtchnl_rss_key and virtchnl_lut: [ 84.167709] memcpy: detected field-spanning write (size 52) of single field "vrk->key" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1095 (size 1) [ 84.169915] WARNING: CPU: 3 PID: 11 at drivers/net/ethernet/intel/ iavf/iavf_virtchnl.c:1095 iavf_set_rss_key+0x123/0x140 [iavf] ... [ 84.191982] Call Trace: [ 84.192439] <TASK> [ 84.192900] ? __warn+0xc9/0x1a0 [ 84.193353] ? iavf_set_rss_key+0x123/0x140 [iavf] [ 84.193818] ? report_bug+0x12c/0x1b0 [ 84.194266] ? handle_bug+0x42/0x70 [ 84.194714] ? exc_invalid_op+0x1a/0x50 [ 84.195149] ? asm_exc_invalid_op+0x1a/0x20 [ 84.195592] ? iavf_set_rss_key+0x123/0x140 [iavf] [ 84.196033] iavf_watchdog_task+0xb0c/0xe00 [iavf] ... [ 84.225476] memcpy: detected field-spanning write (size 64) of single field "vrl->lut" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1127 (size 1) [ 84.227190] WARNING: CPU: 27 PID: 1044 at drivers/net/ethernet/intel/ iavf/iavf_virtchnl.c:1127 iavf_set_rss_lut+0x123/0x140 [iavf] ... [ 84.246601] Call Trace: [ 84.247228] <TASK> [ 84.247840] ? __warn+0xc9/0x1a0 [ 84.248263] ? iavf_set_rss_lut+0x123/0x140 [iavf] [ 84.248698] ? report_bug+0x12c/0x1b0 [ 84.249122] ? handle_bug+0x42/0x70 [ 84.249549] ? exc_invalid_op+0x1a/0x50 [ 84.249970] ? asm_exc_invalid_op+0x1a/0x20 [ 84.250390] ? iavf_set_rss_lut+0x123/0x140 [iavf] [ 84.250820] iavf_watchdog_task+0xb16/0xe00 [iavf] Gustavo already tried to fix those back in 2021[0][1]. Unfortunately, a VM can run a different kernel than the host, meaning that those structures are sorta ABI. However, it is possible to have proper flex arrays + struct_size() calculations and still send the very same messages with the same sizes. The common rule is: elem[1] -> elem[] size = struct_size() + <difference between the old and the new msg size> The "old" size in the current code is calculated 3 different ways for 10 virtchnl structures total. Each commit addresses one of the ways cumulatively instead of per-structure. I was planning to send it to -net initially, but given that virtchnl was renamed from i40evf and got some fat style cleanup commits in the past, it's not very straightforward to even pick appropriate SHAs, not speaking of automatic portability. I may send manual backports for a couple of the latest supported kernels later on if anyone needs it at all. [0] https://lore.kernel.org/all/20210525230912.GA175802@embeddedor [1] https://lore.kernel.org/all/20210525231851.GA176647@embeddedor * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: virtchnl: fix fake 1-elem arrays for structures allocated as `nents` virtchnl: fix fake 1-elem arrays in structures allocated as `nents + 1` virtchnl: fix fake 1-elem arrays in structs allocated as `nents + 1` - 1 ==================== Link: https://lore.kernel.org/r/20230816210657.1326772-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19net: fec: use napi_consume_skb() in fec_enet_tx_queue()Wei Fang1-1/+1
Now that the "budget" is passed into fec_enet_tx_queue(), one optimization we can do is to use napi_consume_skb() to instead of dev_kfree_skb_any(). Signed-off-by: Wei Fang <wei.fang@nxp.com> Suggested-by: Alexander H Duyck <alexander.duyck@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20230816090242.463822-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-19sfc: Remove unneeded semicolonYang Li1-1/+1
./drivers/net/ethernet/sfc/tc_conntrack.c:464:2-3: Unneeded semicolon Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20230816004944.10841-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski23-49/+186
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/sfc/tc.c fa165e194997 ("sfc: don't unregister flow_indr if it was never registered") 3bf969e88ada ("sfc: add MAE table machinery for conntrack table") https://lore.kernel.org/all/20230818112159.7430e9b4@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-18net: altera-tse: make ALTERA_TSE depend on HAS_IOMEMBaoquan He1-0/+1
On s390 systems (aka mainframes), it has classic channel devices for networking and permanent storage that are currently even more common than PCI devices. Hence it could have a fully functional s390 kernel with CONFIG_PCI=n, then the relevant iomem mapping functions [including ioremap(), devm_ioremap(), etc.] are not available. Here let ALTERA_TSE depend on HAS_IOMEM so that it won't be built to cause below compiling error if PCI is unset: ------ ERROR: modpost: "devm_ioremap" [drivers/net/ethernet/altera/altera_tse.ko] undefined! ------ Link: https://lkml.kernel.org/r/20230707135852.24292-6-bhe@redhat.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202306211329.ticOJCSv-lkp@intel.com/ Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Joyce Ooi <joyce.ooi@intel.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18asm-generic/iomap.h: remove ARCH_HAS_IOREMAP_xx macrosBaoquan He2-2/+2
Patch series "mm: ioremap: Convert architectures to take GENERIC_IOREMAP way", v8. Motivation and implementation: ============================== Currently, many architecutres have't taken the standard GENERIC_IOREMAP way to implement ioremap_prot(), iounmap(), and ioremap_xx(), but make these functions specifically under each arch's folder. Those cause many duplicated code of ioremap() and iounmap(). In this patchset, firstly introduce generic_ioremap_prot() and generic_iounmap() to extract the generic code for GENERIC_IOREMAP. By taking GENERIC_IOREMAP method, the generic generic_ioremap_prot(), generic_iounmap(), and their generic wrapper ioremap_prot(), ioremap() and iounmap() are all visible and available to arch. Arch needs to provide wrapper functions to override the generic version if there's arch specific handling in its corresponding ioremap_prot(), ioremap() or iounmap(). With these changes, duplicated ioremap/iounmap() code uder ARCH-es are removed, and the equivalent functioality is kept as before. Background info: ================ 1) The converting more architectures to take GENERIC_IOREMAP way is suggested by Christoph in below discussion: https://lore.kernel.org/all/Yp7h0Jv6vpgt6xdZ@infradead.org/T/#u 2) In the previous v1 to v3, it's basically further action after arm64 has converted to GENERIC_IOREMAP way in below patchset. It's done by adding hook ioremap_allowed() and iounmap_allowed() in ARCH to add ARCH specific handling the middle of ioremap_prot() and iounmap(). [PATCH v5 0/6] arm64: Cleanup ioremap() and support ioremap_prot() https://lore.kernel.org/all/20220607125027.44946-1-wangkefeng.wang@huawei.com/T/#u Later, during v3 reviewing, Christophe Leroy suggested to introduce generic_ioremap_prot() and generic_iounmap() to generic codes, and ARCH can provide wrapper function ioremap_prot(), ioremap() or iounmap() if needed. Christophe made a RFC patchset as below to specially demonstrate his idea. This is what v4 and now v5 is doing. [RFC PATCH 0/8] mm: ioremap: Convert architectures to take GENERIC_IOREMAP way https://lore.kernel.org/all/cover.1665568707.git.christophe.leroy@csgroup.eu/T/#u Testing: ======== In v8, I only applied this patchset onto the latest linus's tree to build and run on arm64 and s390. This patch (of 19): Let's use '#define ioremap_xx' and "#ifdef ioremap_xx" instead. To remove defined ARCH_HAS_IOREMAP_xx macros in <asm/io.h> of each ARCH, the ARCH's own ioremap_wc|wt|np definition need be above "#include <asm-generic/iomap.h>. Otherwise the redefinition error would be seen during compiling. So the relevant adjustments are made to avoid compiling error: loongarch: - doesn't include <asm-generic/iomap.h>, defining ARCH_HAS_IOREMAP_WC is redundant, so simply remove it. m68k: - selected GENERIC_IOMAP, <asm-generic/iomap.h> has been added in <asm-generic/io.h>, and <asm/kmap.h> is included above <asm-generic/iomap.h>, so simply remove ARCH_HAS_IOREMAP_WT defining. mips: - move "#include <asm-generic/iomap.h>" below ioremap_wc definition in <asm/io.h> powerpc: - remove "#include <asm-generic/iomap.h>" in <asm/io.h> because it's duplicated with the one in <asm-generic/io.h>, let's rely on the latter. x86: - selected GENERIC_IOMAP, remove #include <asm-generic/iomap.h> in the middle of <asm/io.h>. Let's rely on <asm-generic/io.h>. Link: https://lkml.kernel.org/r/20230706154520.11257-2-bhe@redhat.com Signed-off-by: Baoquan He <bhe@redhat.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Helge Deller <deller@gmx.de> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Stafford Horne <shorne@gmail.com> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Rich Felker <dalias@libc.org> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18Merge tag 'net-6.5-rc7' of ↵Linus Torvalds23-48/+185
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from ipsec and netfilter. No known outstanding regressions. Fixes to fixes: - virtio-net: set queues after driver_ok, avoid a potential race added by recent fix - Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning when VLAN 0 is registered explicitly - nf_tables: - fix false-positive lockdep splat in recent fixes - don't fail inserts if duplicate has expired (fix test failures) - fix races between garbage collection and netns dismantle Current release - new code bugs: - mlx5: Fix mlx5_cmd_update_root_ft() error flow Previous releases - regressions: - phy: fix IRQ-based wake-on-lan over hibernate / power off Previous releases - always broken: - sock: fix misuse of sk_under_memory_pressure() preventing system from exiting global TCP memory pressure if a single cgroup is under pressure - fix the RTO timer retransmitting skb every 1ms if linear option is enabled - af_key: fix sadb_x_filter validation, amment netlink policy - ipsec: fix slab-use-after-free in decode_session6() - macb: in ZynqMP resume always configure PS GTR for non-wakeup source Misc: - netfilter: set default timeout to 3 secs for sctp shutdown send and recv state (from 300ms), align with protocol timers" * tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits) ice: Block switchdev mode when ADQ is active and vice versa qede: fix firmware halt over suspend and resume net: do not allow gso_size to be set to GSO_BY_FRAGS sock: Fix misuse of sk_under_memory_pressure() sfc: don't fail probe if MAE/TC setup fails sfc: don't unregister flow_indr if it was never registered net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset net/mlx5: Fix mlx5_cmd_update_root_ft() error flow net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT i40e: fix misleading debug logs iavf: fix FDIR rule fields masks validation ipv6: fix indentation of a config attribute mailmap: add entries for Simon Horman broadcom: b44: Use b44_writephy() return value net: openvswitch: reject negative ifindex team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves net: phy: broadcom: stub c45 read/write for 54810 netfilter: nft_dynset: disallow object maps netfilter: nf_tables: GC transaction race with netns dismantle netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path ...