summaryrefslogtreecommitdiff
path: root/drivers/net
AgeCommit message (Collapse)AuthorFilesLines
7 daysnet: phy: fix phy_uses_state_machine()Russell King (Oracle)1-2/+3
[ Upstream commit e0d1c55501d377163eb57feed863777ed1c973ad ] The blamed commit changed the conditions which phylib uses to stop and start the state machine in the suspend and resume paths, and while improving it, has caused two issues. The original code used this test: phydev->attached_dev && phydev->adjust_link and if true, the paths would handle the PHY state machine. This test evaluates true for normal drivers that are using phylib directly while the PHY is attached to the network device, but false in all other cases, which include the following cases: - when the PHY has never been attached to a network device. - when the PHY has been detached from a network device (as phy_detach() sets phydev->attached_dev to NULL, phy_disconnect() calls phy_detach() and additionally sets phydev->adjust_link NULL.) - when phylink is using the driver (as phydev->adjust_link is NULL.) Only the third case was incorrect, and the blamed commit attempted to fix this by changing this test to (simplified for brevity, see phy_uses_state_machine()): phydev->phy_link_change == phy_link_change ? phydev->attached_dev && phydev->adjust_link : true However, this also incorrectly evaluates true in the first two cases. Fix the first case by ensuring that phy_uses_state_machine() returns false when phydev->phy_link_change is NULL. Fix the second case by ensuring that phydev->phy_link_change is set to NULL when phy_detach() is called. Reported-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250806082931.3289134-1-xu.yang_2@nxp.com Fixes: fc75ea20ffb4 ("net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY") Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/E1uvMEz-00000003Aoe-3qWe@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Rajani Kantha <681739313@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet: phy: allow MDIO bus PM ops to start/stop state machine for ↵Vladimir Oltean1-2/+29
phylink-controlled PHY [ Upstream commit fc75ea20ffb452652f0d4033f38fe88d7cfdae35 ] DSA has 2 kinds of drivers: 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz 2. Those who don't: all others. The above methods should be optional. For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(), and dsa_switch_resume() calls dsa_user_resume() -> phylink_start(). These seem good candidates for setting mac_managed_pm = true because that is essentially its definition [1], but that does not seem to be the biggest problem for now, and is not what this change focuses on. Talking strictly about the 2nd category of DSA drivers here (which do not have MAC managed PM, meaning that for their attached PHYs, mdio_bus_phy_suspend() and mdio_bus_phy_resume() should run in full), I have noticed that the following warning from mdio_bus_phy_resume() is triggered: WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY && phydev->state != PHY_UP); because the PHY state machine is running. It's running as a result of a previous dsa_user_open() -> ... -> phylink_start() -> phy_start() having been initiated by the user. The previous mdio_bus_phy_suspend() was supposed to have called phy_stop_machine(), but it didn't. So this is why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume() runs. mdio_bus_phy_suspend() did not call phy_stop_machine() because for phylink, the phydev->adjust_link function pointer is NULL. This seems a technicality introduced by commit fddd91016d16 ("phylib: fix PAL state machine restart on resume"). That commit was written before phylink existed, and was intended to avoid crashing with consumer drivers which don't use the PHY state machine - phylink always does, when using a PHY. But phylink itself has historically not been developed with suspend/resume in mind, and apparently not tested too much in that scenario, allowing this bug to exist unnoticed for so long. Plus, prior to the WARN_ON(), it would have likely been invisible. This issue is not in fact restricted to type 2 DSA drivers (according to the above ad-hoc classification), but can be extrapolated to any MAC driver with phylink and MDIO-bus-managed PHY PM ops. DSA is just where the issue was reported. Assuming mac_managed_pm is set correctly, a quick search indicates the following other drivers might be affected: $ grep -Zlr PHYLINK_NETDEV drivers/ | xargs -0 grep -L mac_managed_pm drivers/net/ethernet/atheros/ag71xx.c drivers/net/ethernet/microchip/sparx5/sparx5_main.c drivers/net/ethernet/microchip/lan966x/lan966x_main.c drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c drivers/net/ethernet/freescale/dpaa/dpaa_eth.c drivers/net/ethernet/freescale/ucc_geth.c drivers/net/ethernet/freescale/enetc/enetc_pf_common.c drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c drivers/net/ethernet/marvell/mvneta.c drivers/net/ethernet/marvell/prestera/prestera_main.c drivers/net/ethernet/mediatek/mtk_eth_soc.c drivers/net/ethernet/altera/altera_tse_main.c drivers/net/ethernet/wangxun/txgbe/txgbe_phy.c drivers/net/ethernet/meta/fbnic/fbnic_phylink.c drivers/net/ethernet/tehuti/tn40_phy.c drivers/net/ethernet/mscc/ocelot_net.c Make the existing conditions dependent on the PHY device having a phydev->phy_link_change() implementation equal to the default phy_link_change() provided by phylib. Otherwise, we implicitly know that the phydev has the phylink-provided phylink_phy_change() callback, and when phylink is used, the PHY state machine always needs to be stopped/ started on the suspend/resume path. The code is structured as such that if phydev->phy_link_change() is absent, it is a matter of time until the kernel will crash - no need to further complicate the test. Thus, for the situation where the PM is not managed by the MAC, we will make the MDIO bus PM ops treat identically the phylink-controlled PHYs with the phylib-controlled PHYs where an adjust_link() callback is supplied. In both cases, the MDIO bus PM ops should stop and restart the PHY state machine. [1] https://lore.kernel.org/netdev/Z-1tiW9zjcoFkhwc@shell.armlinux.org.uk/ Fixes: 744d23c71af3 ("net: phy: Warn about incorrect mdio_bus_phy_resume() state") Reported-by: Wei Fang <wei.fang@nxp.com> Tested-by: Wei Fang <wei.fang@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250407094042.2155633-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Rajani Kantha <681739313@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet: phy: move phy_link_change() prior to mdio_bus_phy_may_suspend()Vladimir Oltean1-13/+13
[ Upstream commit f40a673d6b4a128fe95dd9b8c3ed02da50a6a862 ] In an upcoming change, mdio_bus_phy_may_suspend() will need to distinguish a phylib-based PHY client from a phylink PHY client. For that, it will need to compare the phydev->phy_link_change() function pointer with the eponymous phy_link_change() provided by phylib. To avoid forward function declarations, the default PHY link state change method should be moved upwards. There is no functional change associated with this patch, it is only to reduce the noise from a real bug fix. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250407093900.2155112-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ Minor context change fixed ] Signed-off-by: Rajani Kantha <681739313@139.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet: macb: Move devm_{free,request}_irq() out of spin lock areaKevin Hao1-5/+8
[ Upstream commit 317e49358ebbf6390fa439ef3c142f9239dd25fb ] The devm_free_irq() and devm_request_irq() functions should not be executed in an atomic context. During device suspend, all userspace processes and most kernel threads are frozen. Additionally, we flush all tx/rx status, disable all macb interrupts, and halt rx operations. Therefore, it is safe to split the region protected by bp->lock into two independent sections, allowing devm_free_irq() and devm_request_irq() to run in a non-atomic context. This modification resolves the following lockdep warning: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:591 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 501, name: rtcwake preempt_count: 1, expected: 0 RCU nest depth: 1, expected: 0 7 locks held by rtcwake/501: #0: ffff0008038c3408 (sb_writers#5){.+.+}-{0:0}, at: vfs_write+0xf8/0x368 #1: ffff0008049a5e88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0xbc/0x1c8 #2: ffff00080098d588 (kn->active#70){.+.+}-{0:0}, at: kernfs_fop_write_iter+0xcc/0x1c8 #3: ffff800081c84888 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x1ec/0x290 #4: ffff0008009ba0f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x118/0x4f0 #5: ffff800081d00458 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire+0x4/0x48 #6: ffff0008031fb9e0 (&bp->lock){-.-.}-{3:3}, at: macb_suspend+0x144/0x558 irq event stamp: 8682 hardirqs last enabled at (8681): [<ffff8000813c7d7c>] _raw_spin_unlock_irqrestore+0x44/0x88 hardirqs last disabled at (8682): [<ffff8000813c7b58>] _raw_spin_lock_irqsave+0x38/0x98 softirqs last enabled at (7322): [<ffff8000800f1b4c>] handle_softirqs+0x52c/0x588 softirqs last disabled at (7317): [<ffff800080010310>] __do_softirq+0x20/0x2c CPU: 1 UID: 0 PID: 501 Comm: rtcwake Not tainted 7.0.0-rc3-next-20260310-yocto-standard+ #125 PREEMPT Hardware name: ZynqMP ZCU102 Rev1.1 (DT) Call trace: show_stack+0x24/0x38 (C) __dump_stack+0x28/0x38 dump_stack_lvl+0x64/0x88 dump_stack+0x18/0x24 __might_resched+0x200/0x218 __might_sleep+0x38/0x98 __mutex_lock_common+0x7c/0x1378 mutex_lock_nested+0x38/0x50 free_irq+0x68/0x2b0 devm_irq_release+0x24/0x38 devres_release+0x40/0x80 devm_free_irq+0x48/0x88 macb_suspend+0x298/0x558 device_suspend+0x218/0x4f0 dpm_suspend+0x244/0x3a0 dpm_suspend_start+0x50/0x78 suspend_devices_and_enter+0xec/0x560 pm_suspend+0x194/0x290 state_store+0x110/0x158 kobj_attr_store+0x1c/0x30 sysfs_kf_write+0xa8/0xd0 kernfs_fop_write_iter+0x11c/0x1c8 vfs_write+0x248/0x368 ksys_write+0x7c/0xf8 __arm64_sys_write+0x28/0x40 invoke_syscall+0x4c/0xe8 el0_svc_common+0x98/0xf0 do_el0_svc+0x28/0x40 el0_svc+0x54/0x1e0 el0t_64_sync_handler+0x84/0x130 el0t_64_sync+0x198/0x1a0 Fixes: 558e35ccfe95 ("net: macb: WoL support for GEM type of Ethernet controller") Cc: stable@vger.kernel.org Reviewed-by: Théo Lebrun <theo.lebrun@bootlin.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://patch.msgid.link/20260318-macb-irq-v2-1-f1179768ab24@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ replaced `tmp` variable with direct `MACB_BIT(MAG)` ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 dayswifi: virt_wifi: remove SET_NETDEV_DEV to avoid use-after-freeAlexander Popov1-1/+0
commit 789b06f9f39cdc7e895bdab2c034e39c41c8f8d6 upstream. Currently we execute `SET_NETDEV_DEV(dev, &priv->lowerdev->dev)` for the virt_wifi net devices. However, unregistering a virt_wifi device in netdev_run_todo() can happen together with the device referenced by SET_NETDEV_DEV(). It can result in use-after-free during the ethtool operations performed on a virt_wifi device that is currently being unregistered. Such a net device can have the `dev.parent` field pointing to the freed memory, but ethnl_ops_begin() calls `pm_runtime_get_sync(dev->dev.parent)`. Let's remove SET_NETDEV_DEV for virt_wifi to avoid bugs like this: ================================================================== BUG: KASAN: slab-use-after-free in __pm_runtime_resume+0xe2/0xf0 Read of size 2 at addr ffff88810cfc46f8 by task pm/606 Call Trace: <TASK> dump_stack_lvl+0x4d/0x70 print_report+0x170/0x4f3 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 kasan_report+0xda/0x110 ? __pm_runtime_resume+0xe2/0xf0 ? __pm_runtime_resume+0xe2/0xf0 __pm_runtime_resume+0xe2/0xf0 ethnl_ops_begin+0x49/0x270 ethnl_set_features+0x23c/0xab0 ? __pfx_ethnl_set_features+0x10/0x10 ? kvm_sched_clock_read+0x11/0x20 ? local_clock_noinstr+0xf/0xf0 ? local_clock+0x10/0x30 ? kasan_save_track+0x25/0x60 ? __kasan_kmalloc+0x7f/0x90 ? genl_family_rcv_msg_attrs_parse.isra.0+0x150/0x2c0 genl_family_rcv_msg_doit+0x1e7/0x2c0 ? __pfx_genl_family_rcv_msg_doit+0x10/0x10 ? __pfx_cred_has_capability.isra.0+0x10/0x10 ? stack_trace_save+0x8e/0xc0 genl_rcv_msg+0x411/0x660 ? __pfx_genl_rcv_msg+0x10/0x10 ? __pfx_ethnl_set_features+0x10/0x10 netlink_rcv_skb+0x121/0x380 ? __pfx_genl_rcv_msg+0x10/0x10 ? __pfx_netlink_rcv_skb+0x10/0x10 ? __pfx_down_read+0x10/0x10 genl_rcv+0x23/0x30 netlink_unicast+0x60f/0x830 ? __pfx_netlink_unicast+0x10/0x10 ? __pfx___alloc_skb+0x10/0x10 netlink_sendmsg+0x6ea/0xbc0 ? __pfx_netlink_sendmsg+0x10/0x10 ? __futex_queue+0x10b/0x1f0 ____sys_sendmsg+0x7a2/0x950 ? copy_msghdr_from_user+0x26b/0x430 ? __pfx_____sys_sendmsg+0x10/0x10 ? __pfx_copy_msghdr_from_user+0x10/0x10 ___sys_sendmsg+0xf8/0x180 ? __pfx____sys_sendmsg+0x10/0x10 ? __pfx_futex_wait+0x10/0x10 ? fdget+0x2e4/0x4a0 __sys_sendmsg+0x11f/0x1c0 ? __pfx___sys_sendmsg+0x10/0x10 do_syscall_64+0xe2/0x570 ? exc_page_fault+0x66/0xb0 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> This fix may be combined with another one in the ethtool subsystem: https://lore.kernel.org/all/20260322075917.254874-1-alex.popov@linux.com/T/#u Fixes: d43c65b05b848e0b ("ethtool: runtime-resume netdev parent in ethnl_ops_begin") Cc: stable@vger.kernel.org Signed-off-by: Alexander Popov <alex.popov@linux.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260324224607.374327-1-alex.popov@linux.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet: enetc: fix PF !of_device_is_available() teardown pathVladimir Oltean1-1/+1
Upstream commit e15c5506dd39 ("net: enetc: allocate vf_state during PF probes") was backported incorrectly to kernels where enetc_pf_probe() still has to manually check whether the OF node of the PCI device is enabled. In kernels which contain commit bfce089ddd0e ("net: enetc: remove of_device_is_available() handling") and its dependent change, commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled status"), the "err_device_disabled" label has disappeared. Yet, linux-6.1.y and earlier still contains it. The trouble is that upstream commit e15c5506dd39 ("net: enetc: allocate vf_state during PF probes"), backported as 35668e29e979 in linux-6.1.y, introduces new code for the err_setup_mac_addresses and err_alloc_netdev labels which calls kfree(pf->vf_state). This code must not execute for the err_device_disabled label, because at that stage, the pf structure has not yet been allocated, and is an uninitialized pointer. By moving the err_device_disabled label to undo just the previous operation, i.e. a successful enetc_psi_create() call with enetc_psi_destroy(), the dereference of uninitialized pf->vf_state is avoided. Fixes: 35668e29e979 ("net: enetc: allocate vf_state during PF probes") Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/linux-patches/20260330073356.GA1017537@ax162/ Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: ftgmac100: fix ring allocation unwind on open failureYufan Chen1-4/+24
commit c0fd0fe745f5e8c568d898cd1513d0083e46204a upstream. ftgmac100_alloc_rings() allocates rx_skbs, tx_skbs, rxdes, txdes, and rx_scratch in stages. On intermediate failures it returned -ENOMEM directly, leaking resources allocated earlier in the function. Rework the failure path to use staged local unwind labels and free allocated resources in reverse order before returning -ENOMEM. This matches common netdev allocation cleanup style. Fixes: d72e01a0430f ("ftgmac100: Use a scratch buffer for failed RX allocations") Cc: stable@vger.kernel.org Signed-off-by: Yufan Chen <yufan.chen@linux.dev> Link: https://patch.msgid.link/20260328163257.60836-1-yufan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysvxlan: validate ND option lengths in vxlan_na_createYang Yang1-2/+4
commit afa9a05e6c4971bd5586f1b304e14d61fb3d9385 upstream. vxlan_na_create() walks ND options according to option-provided lengths. A malformed option can make the parser advance beyond the computed option span or use a too-short source LLADDR option payload. Validate option lengths against the remaining NS option area before advancing, and only read source LLADDR when the option is large enough for an Ethernet address. Fixes: 4b29dba9c085 ("vxlan: fix nonfunctional neigh_reduce()") Cc: stable@vger.kernel.org Reported-by: Yifan Wu <yifanwucs@gmail.com> Reported-by: Juefei Pu <tomapufckgml@gmail.com> Tested-by: Ao Zhou <n05ec@lzu.edu.cn> Co-developed-by: Yuan Tan <tanyuan98@outlook.com> Signed-off-by: Yuan Tan <tanyuan98@outlook.com> Suggested-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Yang Yang <n05ec@lzu.edu.cn> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260326034441.2037420-4-n05ec@lzu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 dayswifi: iwlwifi: mvm: fix potential out-of-bounds read in ↵Alexey Velichayshiy1-1/+1
iwl_mvm_nd_match_info_handler() commit 744fabc338e87b95c4d1ff7c95bc8c0f834c6d99 upstream. The memcpy function assumes the dynamic array notif->matches is at least as large as the number of bytes to copy. Otherwise, results->matches may contain unwanted data. To guarantee safety, extend the validation in one of the checks to ensure sufficient packet length. Found by Linux Verification Center (linuxtesting.org) with SVACE. Cc: stable@vger.kernel.org Fixes: 5ac54afd4d97 ("wifi: iwlwifi: mvm: Add handling for scan offload match info notification") Signed-off-by: Alexey Velichayshiy <a.velichayshiy@ispras.ru> Link: https://patch.msgid.link/20260207150335.1013646-1-a.velichayshiy@ispras.ru Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 dayswifi: wilc1000: fix u8 overflow in SSID scan buffer size calculationYasuaki Torimaru1-1/+1
commit d049e56b1739101d1c4d81deedb269c52a8dbba0 upstream. The variable valuesize is declared as u8 but accumulates the total length of all SSIDs to scan. Each SSID contributes up to 33 bytes (IEEE80211_MAX_SSID_LEN + 1), and with WILC_MAX_NUM_PROBED_SSID (10) SSIDs the total can reach 330, which wraps around to 74 when stored in a u8. This causes kmalloc to allocate only 75 bytes while the subsequent memcpy writes up to 331 bytes into the buffer, resulting in a 256-byte heap buffer overflow. Widen valuesize from u8 to u32 to accommodate the full range. Fixes: c5c77ba18ea6 ("staging: wilc1000: Add SDIO/SPI 802.11 driver") Cc: stable@vger.kernel.org Signed-off-by: Yasuaki Torimaru <yasuakitorimaru@gmail.com> Link: https://patch.msgid.link/20260324100624.983458-1-yasuakitorimaru@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet/mlx5: Avoid "No data available" when FW version queries failSaeed Mahameed3-24/+37
[ Upstream commit 10dc35f6a443d488f219d1a1e3fb8f8dac422070 ] Avoid printing the misleading "kernel answers: No data available" devlink output when querying firmware or pending firmware version fails (e.g. MLX5 fw state errors / flash failures). FW can fail on loading the pending flash image and get its version due to various reasons, examples: mlxfw: Firmware flash failed: key not applicable, err (7) mlx5_fw_image_pending: can't read pending fw version while fw state is 1 and the resulting: $ devlink dev info kernel answers: No data available Instead, just report 0 or 0xfff.. versions in case of failure to indicate a problem, and let other information be shown. after the fix: $ devlink dev info pci/0000:00:06.0: driver mlx5_core serial_number xxx... board.serial_number MT2225300179 versions: fixed: fw.psid MT_0000000436 running: fw.version 22.41.0188 fw 22.41.0188 stored: fw.version 255.255.65535 fw 255.255.65535 Fixes: 9c86b07e3069 ("net/mlx5: Added fw version query command") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260330194015.53585-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet/mlx5: lag: Check for LAG device before creating debugfsShay Drory1-0/+3
[ Upstream commit bf16bca6653679d8a514d6c1c5a2c67065033f14 ] __mlx5_lag_dev_add_mdev() may return 0 (success) even when an error occurs that is handled gracefully. Consequently, the initialization flow proceeds to call mlx5_ldev_add_debugfs() even when there is no valid LAG context. mlx5_ldev_add_debugfs() blindly created the debugfs directory and attributes. This exposed interfaces (like the members file) that rely on a valid ldev pointer, leading to potential NULL pointer dereferences if accessed when ldev is NULL. Add a check to verify that mlx5_lag_dev(dev) returns a valid pointer before attempting to create the debugfs entries. Fixes: 7f46a0b7327a ("net/mlx5: Lag, add debugfs to query hardware lag state") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260330194015.53585-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: macb: properly unregister fixed rate clocksFedor Pchelkin1-4/+4
[ Upstream commit f0f367a4f459cc8118aadc43c6bba53c60d93f8d ] The additional resources allocated with clk_register_fixed_rate() need to be released with clk_unregister_fixed_rate(), otherwise they are lost. Fixes: 83a77e9ec415 ("net: macb: Added PCI wrapper for Platform Driver.") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20260330184542.626619-2-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: macb: fix clk handling on PCI glue driver removalFedor Pchelkin1-2/+4
[ Upstream commit ce8fe5287b87e24e225c342f3b0ec04f0b3680fe ] platform_device_unregister() may still want to use the registered clks during runtime resume callback. Note that there is a commit d82d5303c4c5 ("net: macb: fix use after free on rmmod") that addressed the similar problem of clk vs platform device unregistration but just moved the bug to another place. Save the pointers to clks into local variables for reuse after platform device is unregistered. BUG: KASAN: use-after-free in clk_prepare+0x5a/0x60 Read of size 8 at addr ffff888104f85e00 by task modprobe/597 CPU: 2 PID: 597 Comm: modprobe Not tainted 6.1.164+ #114 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x8d/0xba print_report+0x17f/0x496 kasan_report+0xd9/0x180 clk_prepare+0x5a/0x60 macb_runtime_resume+0x13d/0x410 [macb] pm_generic_runtime_resume+0x97/0xd0 __rpm_callback+0xc8/0x4d0 rpm_callback+0xf6/0x230 rpm_resume+0xeeb/0x1a70 __pm_runtime_resume+0xb4/0x170 bus_remove_device+0x2e3/0x4b0 device_del+0x5b3/0xdc0 platform_device_del+0x4e/0x280 platform_device_unregister+0x11/0x50 pci_device_remove+0xae/0x210 device_remove+0xcb/0x180 device_release_driver_internal+0x529/0x770 driver_detach+0xd4/0x1a0 bus_remove_driver+0x135/0x260 driver_unregister+0x72/0xb0 pci_unregister_driver+0x26/0x220 __do_sys_delete_module+0x32e/0x550 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 </TASK> Allocated by task 519: kasan_save_stack+0x2c/0x50 kasan_set_track+0x21/0x30 __kasan_kmalloc+0x8e/0x90 __clk_register+0x458/0x2890 clk_hw_register+0x1a/0x60 __clk_hw_register_fixed_rate+0x255/0x410 clk_register_fixed_rate+0x3c/0xa0 macb_probe+0x1d8/0x42e [macb_pci] local_pci_probe+0xd7/0x190 pci_device_probe+0x252/0x600 really_probe+0x255/0x7f0 __driver_probe_device+0x1ee/0x330 driver_probe_device+0x4c/0x1f0 __driver_attach+0x1df/0x4e0 bus_for_each_dev+0x15d/0x1f0 bus_add_driver+0x486/0x5e0 driver_register+0x23a/0x3d0 do_one_initcall+0xfd/0x4d0 do_init_module+0x18b/0x5a0 load_module+0x5663/0x7950 __do_sys_finit_module+0x101/0x180 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Freed by task 597: kasan_save_stack+0x2c/0x50 kasan_set_track+0x21/0x30 kasan_save_free_info+0x2a/0x50 __kasan_slab_free+0x106/0x180 __kmem_cache_free+0xbc/0x320 clk_unregister+0x6de/0x8d0 macb_remove+0x73/0xc0 [macb_pci] pci_device_remove+0xae/0x210 device_remove+0xcb/0x180 device_release_driver_internal+0x529/0x770 driver_detach+0xd4/0x1a0 bus_remove_driver+0x135/0x260 driver_unregister+0x72/0xb0 pci_unregister_driver+0x26/0x220 __do_sys_delete_module+0x32e/0x550 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: d82d5303c4c5 ("net: macb: fix use after free on rmmod") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20260330184542.626619-1-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: xilinx: axienet: Correct BD length masks to match AXIDMA IP specSuraj Gupta1-2/+2
[ Upstream commit 393e0b4f178ec7fce1141dacc3304e3607a92ee9 ] The XAXIDMA_BD_CTRL_LENGTH_MASK and XAXIDMA_BD_STS_ACTUAL_LEN_MASK macros were defined as 0x007FFFFF (23 bits), but the AXI DMA IP product guide (PG021) specifies the buffer length field as bits 25:0 (26 bits). Update both masks to match the IP documentation. In practice this had no functional impact, since Ethernet frames are far smaller than 2^23 bytes and the extra bits were always zero, but the masks should still reflect the hardware specification. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com> Reviewed-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20260327073238.134948-2-suraj.gupta2@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daystg3: Fix race for querying speed/duplexThomas Bogendoerfer1-1/+1
[ Upstream commit bb417456c7814d1493d98b7dd9c040bf3ce3b4ed ] When driver signals carrier up via netif_carrier_on() its internal link_up state isn't updated immediately. This leads to inconsistent speed/duplex in /proc/net/bonding/bondX where the speed and duplex is shown as unknown while ethtool shows correct values. Fix this by using netif_carrier_ok() for link checking in get_ksettings function. Fixes: 84421b99cedc ("tg3: Update link_up flag for phylib devices") Signed-off-by: Thomas Bogendoerfer <tbogendoerfer@suse.de> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daystg3: replace placeholder MAC address with device propertyPaul SAGE1-0/+11
[ Upstream commit e4c00ba7274b613e3ab19e27eb009f0ec2e28379 ] On some systems (e.g. iMac 20,1 with BCM57766), the tg3 driver reads a default placeholder mac address (00:10:18:00:00:00) from the mailbox. The correct value on those systems are stored in the 'local-mac-address' property. This patch, detect the default value and tries to retrieve the correct address from the device_get_mac_address function instead. The patch has been tested on two different systems: - iMac 20,1 (BCM57766) model which use the local-mac-address property - iMac 13,2 (BCM57766) model which can use the mailbox, NVRAM or MAC control registers Tested-by: Rishon Jonathan R <mithicalaviator85@gmail.com> Co-developed-by: Vincent MORVAN <vinc@42.fr> Signed-off-by: Vincent MORVAN <vinc@42.fr> Signed-off-by: Paul SAGE <paul.sage@42.fr> Signed-off-by: Atharva Tiwari <atharvatiwarilinuxdev@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20260314215432.3589-1-atharvatiwarilinuxdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: macb: Use dev_consume_skb_any() to free TX SKBsKevin Hao1-1/+1
commit 647b8a2fe474474704110db6bd07f7a139e621eb upstream. The napi_consume_skb() function is not intended to be called in an IRQ disabled context. However, after commit 6bc8a5098bf4 ("net: macb: Fix tx_ptr_lock locking"), the freeing of TX SKBs is performed with IRQs disabled. To resolve the following call trace, use dev_consume_skb_any() for freeing TX SKBs: WARNING: kernel/softirq.c:430 at __local_bh_enable_ip+0x174/0x188, CPU#0: ksoftirqd/0/15 Modules linked in: CPU: 0 UID: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 7.0.0-rc4-next-20260319-yocto-standard-dirty #37 PREEMPT Hardware name: ZynqMP ZCU102 Rev1.1 (DT) pstate: 200000c5 (nzCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __local_bh_enable_ip+0x174/0x188 lr : local_bh_enable+0x24/0x38 sp : ffff800082b3bb10 x29: ffff800082b3bb10 x28: ffff0008031f3c00 x27: 000000000011ede0 x26: ffff000800a7ff00 x25: ffff800083937ce8 x24: 0000000000017a80 x23: ffff000803243a78 x22: 0000000000000040 x21: 0000000000000000 x20: ffff000800394c80 x19: 0000000000000200 x18: 0000000000000001 x17: 0000000000000001 x16: ffff000803240000 x15: 0000000000000000 x14: ffffffffffffffff x13: 0000000000000028 x12: ffff000800395650 x11: ffff8000821d1528 x10: ffff800081c2bc08 x9 : ffff800081c1e258 x8 : 0000000100000301 x7 : ffff8000810426ec x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000001 x3 : 0000000000000000 x2 : 0000000000000008 x1 : 0000000000000200 x0 : ffff8000810428dc Call trace: __local_bh_enable_ip+0x174/0x188 (P) local_bh_enable+0x24/0x38 skb_attempt_defer_free+0x190/0x1d8 napi_consume_skb+0x58/0x108 macb_tx_poll+0x1a4/0x558 __napi_poll+0x50/0x198 net_rx_action+0x1f4/0x3d8 handle_softirqs+0x16c/0x560 run_ksoftirqd+0x44/0x80 smpboot_thread_fn+0x1d8/0x338 kthread+0x120/0x150 ret_from_fork+0x10/0x20 irq event stamp: 29751 hardirqs last enabled at (29750): [<ffff8000813be184>] _raw_spin_unlock_irqrestore+0x44/0x88 hardirqs last disabled at (29751): [<ffff8000813bdf60>] _raw_spin_lock_irqsave+0x38/0x98 softirqs last enabled at (29150): [<ffff8000800f1aec>] handle_softirqs+0x504/0x560 softirqs last disabled at (29153): [<ffff8000800f2fec>] run_ksoftirqd+0x44/0x80 Fixes: 6bc8a5098bf4 ("net: macb: Fix tx_ptr_lock locking") Signed-off-by: Kevin Hao <haokexin@gmail.com> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260321-macb-tx-v1-1-b383a58dd4e6@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysvirtio_net: Fix UAF on dst_ops when IFF_XMIT_DST_RELEASE is cleared and ↵xietangxin1-0/+1
napi_tx is false commit ba8bda9a0896746053aa97ac6c3e08168729172c upstream. A UAF issue occurs when the virtio_net driver is configured with napi_tx=N and the device's IFF_XMIT_DST_RELEASE flag is cleared (e.g., during the configuration of tc route filter rules). When IFF_XMIT_DST_RELEASE is removed from the net_device, the network stack expects the driver to hold the reference to skb->dst until the packet is fully transmitted and freed. In virtio_net with napi_tx=N, skbs may remain in the virtio transmit ring for an extended period. If the network namespace is destroyed while these skbs are still pending, the corresponding dst_ops structure has freed. When a subsequent packet is transmitted, free_old_xmit() is triggered to clean up old skbs. It then calls dst_release() on the skb associated with the stale dst_entry. Since the dst_ops (referenced by the dst_entry) has already been freed, a UAF kernel paging request occurs. fix it by adds skb_dst_drop(skb) in start_xmit to explicitly release the dst reference before the skb is queued in virtio_net. Call Trace: Unable to handle kernel paging request at virtual address ffff80007e150000 CPU: 2 UID: 0 PID: 6236 Comm: ping Kdump: loaded Not tainted 7.0.0-rc1+ #6 PREEMPT ... percpu_counter_add_batch+0x3c/0x158 lib/percpu_counter.c:98 (P) dst_release+0xe0/0x110 net/core/dst.c:177 skb_release_head_state+0xe8/0x108 net/core/skbuff.c:1177 sk_skb_reason_drop+0x54/0x2d8 net/core/skbuff.c:1255 dev_kfree_skb_any_reason+0x64/0x78 net/core/dev.c:3469 napi_consume_skb+0x1c4/0x3a0 net/core/skbuff.c:1527 __free_old_xmit+0x164/0x230 drivers/net/virtio_net.c:611 [virtio_net] free_old_xmit drivers/net/virtio_net.c:1081 [virtio_net] start_xmit+0x7c/0x530 drivers/net/virtio_net.c:3329 [virtio_net] ... Reproduction Steps: NETDEV="enp3s0" config_qdisc_route_filter() { tc qdisc del dev $NETDEV root tc qdisc add dev $NETDEV root handle 1: prio tc filter add dev $NETDEV parent 1:0 \ protocol ip prio 100 route to 100 flowid 1:1 ip route add 192.168.1.100/32 dev $NETDEV realm 100 } test_ns() { ip netns add testns ip link set $NETDEV netns testns ip netns exec testns ifconfig $NETDEV 10.0.32.46/24 ip netns exec testns ping -c 1 10.0.32.1 ip netns del testns } config_qdisc_route_filter test_ns sleep 2 test_ns Fixes: f2fc6a54585a ("[NETNS][IPV6] route6 - move ip6_dst_ops inside the network namespace") Cc: stable@vger.kernel.org Signed-off-by: xietangxin <xietangxin@yeah.net> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Fixes: 0287587884b1 ("net: better IFF_XMIT_DST_RELEASE support") Link: https://patch.msgid.link/20260312025406.15641-1-xietangxin@yeah.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 daysnet: macb: use the current queue number for statsPaolo Valerio1-1/+1
[ Upstream commit 72d96e4e24bbefdcfbc68bdb9341a05d8f5cb6e5 ] There's a potential mismatch between the memory reserved for statistics and the amount of memory written. gem_get_sset_count() correctly computes the number of stats based on the active queues, whereas gem_get_ethtool_stats() indiscriminately copies data using the maximum number of queues, and in the case the number of active queues is less than MACB_MAX_QUEUES, this results in a OOB write as observed in the KASAN splat. ================================================================== BUG: KASAN: vmalloc-out-of-bounds in gem_get_ethtool_stats+0x54/0x78 [macb] Write of size 760 at addr ffff80008080b000 by task ethtool/1027 CPU: [...] Tainted: [E]=UNSIGNED_MODULE Hardware name: raspberrypi rpi/rpi, BIOS 2025.10 10/01/2025 Call trace: show_stack+0x20/0x38 (C) dump_stack_lvl+0x80/0xf8 print_report+0x384/0x5e0 kasan_report+0xa0/0xf0 kasan_check_range+0xe8/0x190 __asan_memcpy+0x54/0x98 gem_get_ethtool_stats+0x54/0x78 [macb 926c13f3af83b0c6fe64badb21ec87d5e93fcf65] dev_ethtool+0x1220/0x38c0 dev_ioctl+0x4ac/0xca8 sock_do_ioctl+0x170/0x1d8 sock_ioctl+0x484/0x5d8 __arm64_sys_ioctl+0x12c/0x1b8 invoke_syscall+0xd4/0x258 el0_svc_common.constprop.0+0xb4/0x240 do_el0_svc+0x48/0x68 el0_svc+0x40/0xf8 el0t_64_sync_handler+0xa0/0xe8 el0t_64_sync+0x1b0/0x1b8 The buggy address belongs to a 1-page vmalloc region starting at 0xffff80008080b000 allocated at dev_ethtool+0x11f0/0x38c0 The buggy address belongs to the physical page: page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff00000a333000 pfn:0xa333 flags: 0x7fffc000000000(node=0|zone=0|lastcpupid=0x1ffff) raw: 007fffc000000000 0000000000000000 dead000000000122 0000000000000000 raw: ffff00000a333000 0000000000000000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff80008080b080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff80008080b100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff80008080b180: 00 00 00 00 00 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ^ ffff80008080b200: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ffff80008080b280: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 ================================================================== Fix it by making sure the copied size only considers the active number of queues. Fixes: 512286bbd4b7 ("net: macb: Added some queue statistics") Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Reviewed-by: Nicolai Buchwitz <nb@tipi-net.de> Link: https://patch.msgid.link/20260323191634.2185840-1-pvalerio@redhat.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: enetc: fix the output issue of 'ethtool --show-ring'Wei Fang1-0/+2
[ Upstream commit 70b439bf06f6a12e491f827fa81a9887a11501f9 ] Currently, enetc_get_ringparam() only provides rx_pending and tx_pending, but 'ethtool --show-ring' no longer displays these fields. Because the ringparam retrieval path has moved to the new netlink interface, where rings_fill_reply() emits the *x_pending only if the *x_max_pending values are non-zero. So rx_max_pending and tx_max_pending to are added to enetc_get_ringparam() to fix the issue. Note that the maximum tx/rx ring size of hardware is 64K, but we haven't added set_ringparam() to make the ring size configurable. To avoid users mistakenly believing that the ring size can be increased, so set the *x_max_pending to priv->*x_bd_count. Fixes: e4a1717b677c ("ethtool: provide ring sizes with RINGS_GET request") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260320094222.706339-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysice: use ice_update_eth_stats() for representor statsPetr Oros2-4/+13
[ Upstream commit 2526e440df2725e7328d59b835a164826f179b93 ] ice_repr_get_stats64() and __ice_get_ethtool_stats() call ice_update_vsi_stats() on the VF's src_vsi. This always returns early because ICE_VSI_DOWN is permanently set for VF VSIs - ice_up() is never called on them since queues are managed by iavf through virtchnl. In __ice_get_ethtool_stats() the original code called ice_update_vsi_stats() for all VSIs including representors, iterated over ice_gstrings_vsi_stats[] to populate the data, and then bailed out with an early return before the per-queue ring stats section. That early return was necessary because representor VSIs have no rings on the PF side - the rings belong to the VF driver (iavf), so accessing per-queue stats would be invalid. Move the representor handling to the top of __ice_get_ethtool_stats() and call ice_update_eth_stats() directly to read the hardware GLV_* counters. This matches ice_get_vf_stats() which already uses ice_update_eth_stats() for the same VF VSI in legacy mode. Apply the same fix to ice_repr_get_stats64(). Note that ice_gstrings_vsi_stats[] contains five software ring counters (rx_buf_failed, rx_page_failed, tx_linearize, tx_busy, tx_restart) that are always zero for representors since the PF never processes packets on VF rings. This is pre-existing behavior unchanged by this patch. Fixes: 7aae80cef7ba ("ice: add port representor ethtool ops and stats") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Patryk Holda <patryk.holda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysrtnetlink: pass netlink message header and portid to rtnl_configure_link()Hangbin Liu5-6/+6
[ Upstream commit 1d997f1013079c05b642c739901e3584a3ae558d ] This patch pass netlink message header and portid to rtnl_configure_link() All the functions in this call chain need to add the parameters so we can use them in the last call rtnl_notify(), and notify the userspace about the new link info if NLM_F_ECHO flag is set. - rtnl_configure_link() - __dev_notify_flags() - rtmsg_ifinfo() - rtmsg_ifinfo_event() - rtmsg_ifinfo_build_skb() - rtmsg_ifinfo_send() - rtnl_notify() Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub suggested. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 6931d21f87bc ("openvswitch: defer tunnel netdev_put to RCU release") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysionic: fix persistent MAC address override on PFMohammad Heib1-6/+11
[ Upstream commit cbcb3cfcdc436d6f91a3d95ecfa9c831abe14aed ] The use of IONIC_CMD_LIF_SETATTR in the MAC address update path causes the ionic firmware to update the LIF's identity in its persistent state. Since the firmware state is maintained across host warm boots and driver reloads, any MAC change on the Physical Function (PF) becomes "sticky. This is problematic because it causes ethtool -P to report the user-configured MAC as the permanent factory address, which breaks system management tools that rely on a stable hardware identity. While Virtual Functions (VFs) need this hardware-level programming to properly handle MAC assignments in guest environments, the PF should maintain standard transient behavior. This patch gates the ionic_program_mac call using is_virtfn so that PF MAC changes remain local to the netdev filters and do not overwrite the firmware's permanent identity block. Fixes: 19058be7c48c ("ionic: VF initial random MAC address if no assigned mac") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20260317170806.35390-1-mheib@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 daysnet: usb: r8152: add TRENDnet TUC-ET2GValentin Spreckels1-0/+1
[ Upstream commit 15fba71533bcdfaa8eeba69a5a5a2927afdf664a ] The TRENDnet TUC-ET2G is a RTL8156 based usb ethernet adapter. Add its vendor and product IDs. Signed-off-by: Valentin Spreckels <valentin@spreckels.dev> Link: https://patch.msgid.link/20260226195409.7891-2-valentin@spreckels.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25wifi: brcmfmac: fix use-after-free when rescheduling brcmf_btcoex_info workDuoming Zhou1-4/+2
[ Upstream commit 9cb83d4be0b9b697eae93d321e0da999f9cdfcfc ] The brcmf_btcoex_detach() only shuts down the btcoex timer, if the flag timer_on is false. However, the brcmf_btcoex_timerfunc(), which runs as timer handler, sets timer_on to false. This creates critical race conditions: 1.If brcmf_btcoex_detach() is called while brcmf_btcoex_timerfunc() is executing, it may observe timer_on as false and skip the call to timer_shutdown_sync(). 2.The brcmf_btcoex_timerfunc() may then reschedule the brcmf_btcoex_info worker after the cancel_work_sync() has been executed, resulting in use-after-free bugs. The use-after-free bugs occur in two distinct scenarios, depending on the timing of when the brcmf_btcoex_info struct is freed relative to the execution of its worker thread. Scenario 1: Freed before the worker is scheduled The brcmf_btcoex_info is deallocated before the worker is scheduled. A race condition can occur when schedule_work(&bt_local->work) is called after the target memory has been freed. The sequence of events is detailed below: CPU0 | CPU1 brcmf_btcoex_detach | brcmf_btcoex_timerfunc | bt_local->timer_on = false; if (cfg->btcoex->timer_on) | ... | cancel_work_sync(); | ... | kfree(cfg->btcoex); // FREE | | schedule_work(&bt_local->work); // USE Scenario 2: Freed after the worker is scheduled The brcmf_btcoex_info is freed after the worker has been scheduled but before or during its execution. In this case, statements within the brcmf_btcoex_handler() — such as the container_of macro and subsequent dereferences of the brcmf_btcoex_info object will cause a use-after-free access. The following timeline illustrates this scenario: CPU0 | CPU1 brcmf_btcoex_detach | brcmf_btcoex_timerfunc | bt_local->timer_on = false; if (cfg->btcoex->timer_on) | ... | cancel_work_sync(); | ... | schedule_work(); // Reschedule | kfree(cfg->btcoex); // FREE | brcmf_btcoex_handler() // Worker /* | btci = container_of(....); // USE The kfree() above could | ... also occur at any point | btci-> // USE during the worker's execution| */ | To resolve the race conditions, drop the conditional check and call timer_shutdown_sync() directly. It can deactivate the timer reliably, regardless of its current state. Once stopped, the timer_on state is then set to false. Fixes: 61730d4dfffc ("brcmfmac: support critical protocol API for DHCP") Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://patch.msgid.link/20250822050839.4413-1-duoming@zju.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com> [ Keep del_timer_sync() instead of timer_shutdown_sync() here. ] Signed-off-by: Robert Garcia <rob_garcia@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: stmmac: fix TSO DMA API usage causing oopsRussell King (Oracle)1-3/+4
[ Upstream commit 4c49f38e20a57f8abaebdf95b369295b153d1f8e ] Commit 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") moved the assignment of tx_skbuff_dma[]'s members to be later in stmmac_tso_xmit(). The buf (dma cookie) and len stored in this structure are passed to dma_unmap_single() by stmmac_tx_clean(). The DMA API requires that the dma cookie passed to dma_unmap_single() is the same as the value returned from dma_map_single(). However, by moving the assignment later, this is not the case when priv->dma_cap.addr64 > 32 as "des" is offset by proto_hdr_len. This causes problems such as: dwc-eth-dwmac 2490000.ethernet eth0: Tx DMA map failed and with DMA_API_DEBUG enabled: DMA-API: dwc-eth-dwmac 2490000.ethernet: device driver tries to +free DMA memory it has not allocated [device address=0x000000ffffcf65c0] [size=66 bytes] Fix this by maintaining "des" as the original DMA cookie, and use tso_des to pass the offset DMA cookie to stmmac_tso_allocator(). Full details of the crashes can be found at: https://lore.kernel.org/all/d8112193-0386-4e14-b516-37c2d838171a@nvidia.com/ https://lore.kernel.org/all/klkzp5yn5kq5efgtrow6wbvnc46bcqfxs65nz3qy77ujr5turc@bwwhelz2l4dw/ Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Thierry Reding <thierry.reding@gmail.com> Fixes: 66600fac7a98 ("net: stmmac: TSO: Fix unbalanced DMA map/unmap for non-paged SKB data") Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/E1tJXcx-006N4Z-PC@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ The context change is due to the commit 041cc86b3653 ("net: stmmac: Enable TSO on VLANs") in v6.11 which is irrelevant to the logic of this patch. ] Signed-off-by: Rahul Sharma <black.hawk@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: dsa: bcm_sf2: fix missing clk_disable_unprepare() in error pathsAnas Iqbal1-2/+6
[ Upstream commit b48731849609cbd8c53785a48976850b443153fd ] Smatch reports: drivers/net/dsa/bcm_sf2.c:997 bcm_sf2_sw_resume() warn: 'priv->clk' from clk_prepare_enable() not released on lines: 983,990. The clock enabled by clk_prepare_enable() in bcm_sf2_sw_resume() is not released if bcm_sf2_sw_rst() or bcm_sf2_cfp_resume() fails. Add the missing clk_disable_unprepare() calls in the error paths to properly release the clock resource. Fixes: e9ec5c3bd238 ("net: dsa: bcm_sf2: request and handle clocks") Reviewed-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Anas Iqbal <mohd.abd.6602@gmail.com> Link: https://patch.msgid.link/20260318084212.1287-1-mohd.abd.6602@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: mvpp2: guard flow control update with global_tx_fc in buffer switchingMuhammad Hammad Ijaz1-2/+2
[ Upstream commit 8a63baadf08453f66eb582fdb6dd234f72024723 ] mvpp2_bm_switch_buffers() unconditionally calls mvpp2_bm_pool_update_priv_fc() when switching between per-cpu and shared buffer pool modes. This function programs CM3 flow control registers via mvpp2_cm3_read()/mvpp2_cm3_write(), which dereference priv->cm3_base without any NULL check. When the CM3 SRAM resource is not present in the device tree (the third reg entry added by commit 60523583b07c ("dts: marvell: add CM3 SRAM memory to cp11x ethernet device tree")), priv->cm3_base remains NULL and priv->global_tx_fc is false. Any operation that triggers mvpp2_bm_switch_buffers(), for example an MTU change that crosses the jumbo frame threshold, will crash: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000096000006 EC = 0x25: DABT (current EL), IL = 32 bits pc : readl+0x0/0x18 lr : mvpp2_cm3_read.isra.0+0x14/0x20 Call trace: readl+0x0/0x18 mvpp2_bm_pool_update_fc+0x40/0x12c mvpp2_bm_pool_update_priv_fc+0x94/0xd8 mvpp2_bm_switch_buffers.isra.0+0x80/0x1c0 mvpp2_change_mtu+0x140/0x380 __dev_set_mtu+0x1c/0x38 dev_set_mtu_ext+0x78/0x118 dev_set_mtu+0x48/0xa8 dev_ifsioc+0x21c/0x43c dev_ioctl+0x2d8/0x42c sock_ioctl+0x314/0x378 Every other flow control call site in the driver already guards hardware access with either priv->global_tx_fc or port->tx_fc. mvpp2_bm_switch_buffers() is the only place that omits this check. Add the missing priv->global_tx_fc guard to both the disable and re-enable calls in mvpp2_bm_switch_buffers(), consistent with the rest of the driver. Fixes: 3a616b92a9d1 ("net: mvpp2: Add TX flow control support for jumbo frames") Signed-off-by: Muhammad Hammad Ijaz <mhijaz@amazon.com> Reviewed-by: Gunnar Kudrjavets <gunnarku@amazon.com> Link: https://patch.msgid.link/20260316193157.65748-1-mhijaz@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: bonding: fix NULL deref in bond_debug_rlb_hash_showXiang Mei1-5/+11
[ Upstream commit 605b52497bf89b3b154674deb135da98f916e390 ] rlb_clear_slave intentionally keeps RLB hash-table entries on the rx_hashtbl_used_head list with slave set to NULL when no replacement slave is available. However, bond_debug_rlb_hash_show visites client_info->slave without checking if it's NULL. Other used-list iterators in bond_alb.c already handle this NULL-slave state safely: - rlb_update_client returns early on !client_info->slave - rlb_req_update_slave_clients, rlb_clear_slave, and rlb_rebalance compare slave values before visiting - lb_req_update_subnet_clients continues if slave is NULL The following NULL deref crash can be trigger in bond_debug_rlb_hash_show: [ 1.289791] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1.292058] RIP: 0010:bond_debug_rlb_hash_show (drivers/net/bonding/bond_debugfs.c:41) [ 1.293101] RSP: 0018:ffffc900004a7d00 EFLAGS: 00010286 [ 1.293333] RAX: 0000000000000000 RBX: ffff888102b48200 RCX: ffff888102b48204 [ 1.293631] RDX: ffff888102b48200 RSI: ffffffff839daad5 RDI: ffff888102815078 [ 1.293924] RBP: ffff888102815078 R08: ffff888102b4820e R09: 0000000000000000 [ 1.294267] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888100f929c0 [ 1.294564] R13: ffff888100f92a00 R14: 0000000000000001 R15: ffffc900004a7ed8 [ 1.294864] FS: 0000000001395380(0000) GS:ffff888196e75000(0000) knlGS:0000000000000000 [ 1.295239] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.295480] CR2: 0000000000000000 CR3: 0000000102adc004 CR4: 0000000000772ef0 [ 1.295897] Call Trace: [ 1.296134] seq_read_iter (fs/seq_file.c:231) [ 1.296341] seq_read (fs/seq_file.c:164) [ 1.296493] full_proxy_read (fs/debugfs/file.c:378 (discriminator 1)) [ 1.296658] vfs_read (fs/read_write.c:572) [ 1.296981] ksys_read (fs/read_write.c:717) [ 1.297132] do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1)) [ 1.297325] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Add a NULL check and print "(none)" for entries with no assigned slave. Fixes: caafa84251b88 ("bonding: add the debugfs interface to see RLB hash table") Reported-by: Weiming Shi <bestswngs@gmail.com> Signed-off-by: Xiang Mei <xmei5@asu.edu> Link: https://patch.msgid.link/20260317005034.1888794-1-xmei5@asu.edu Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: macb: fix uninitialized rx_fs_lockFedor Pchelkin1-0/+3
[ Upstream commit 34b11cc56e4369bc08b1f4c4a04222d75ed596ce ] If hardware doesn't support RX Flow Filters, rx_fs_lock spinlock is not initialized leading to the following assertion splat triggerable via set_rxnfc callback. INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 1 PID: 949 Comm: syz.0.6 Not tainted 6.1.164+ #113 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.1-0-g3208b098f51a-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x8d/0xba lib/dump_stack.c:106 assign_lock_key kernel/locking/lockdep.c:974 [inline] register_lock_class+0x141b/0x17f0 kernel/locking/lockdep.c:1287 __lock_acquire+0x74f/0x6c40 kernel/locking/lockdep.c:4928 lock_acquire kernel/locking/lockdep.c:5662 [inline] lock_acquire+0x190/0x4b0 kernel/locking/lockdep.c:5627 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x33/0x50 kernel/locking/spinlock.c:162 gem_del_flow_filter drivers/net/ethernet/cadence/macb_main.c:3562 [inline] gem_set_rxnfc+0x533/0xac0 drivers/net/ethernet/cadence/macb_main.c:3667 ethtool_set_rxnfc+0x18c/0x280 net/ethtool/ioctl.c:961 __dev_ethtool net/ethtool/ioctl.c:2956 [inline] dev_ethtool+0x229c/0x6290 net/ethtool/ioctl.c:3095 dev_ioctl+0x637/0x1070 net/core/dev_ioctl.c:510 sock_do_ioctl+0x20d/0x2c0 net/socket.c:1215 sock_ioctl+0x577/0x6d0 net/socket.c:1320 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x18c/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:46 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:76 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 A more straightforward solution would be to always initialize rx_fs_lock, just like rx_fs_list. However, in this case the driver set_rxnfc callback would return with a rather confusing error code, e.g. -EINVAL. So deny set_rxnfc attempts directly if the RX filtering feature is not supported by hardware. Fixes: ae8223de3df5 ("net: macb: Added support for RX filtering") Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Link: https://patch.msgid.link/20260316103826.74506-2-pchelkin@ispras.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25wifi: wlcore: Return -ENOMEM instead of -EAGAIN if there is not enough headroomGuenter Roeck1-1/+1
[ Upstream commit deb353d9bb009638b7762cae2d0b6e8fdbb41a69 ] Since upstream commit e75665dd0968 ("wifi: wlcore: ensure skb headroom before skb_push"), wl1271_tx_allocate() and with it wl1271_prepare_tx_frame() returns -EAGAIN if pskb_expand_head() fails. However, in wlcore_tx_work_locked(), a return value of -EAGAIN from wl1271_prepare_tx_frame() is interpreted as the aggregation buffer being full. This causes the code to flush the buffer, put the skb back at the head of the queue, and immediately retry the same skb in a tight while loop. Because wlcore_tx_work_locked() holds wl->mutex, and the retry happens immediately with GFP_ATOMIC, this will result in an infinite loop and a CPU soft lockup. Return -ENOMEM instead so the packet is dropped and the loop terminates. The problem was found by an experimental code review agent based on gemini-3.1-pro while reviewing backports into v6.18.y. Assisted-by: Gemini:gemini-3.1-pro Fixes: e75665dd0968 ("wifi: wlcore: ensure skb headroom before skb_push") Cc: Peter Astrand <astrand@lysator.liu.se> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20260318064636.3065925-1-linux@roeck-us.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25iavf: fix VLAN filter lost on add/delete racePetr Oros1-3/+6
[ Upstream commit fc9c69be594756b81b54c6bc40803fa6052f35ae ] When iavf_add_vlan() finds an existing filter in IAVF_VLAN_REMOVE state, it transitions the filter to IAVF_VLAN_ACTIVE assuming the pending delete can simply be cancelled. However, there is no guarantee that iavf_del_vlans() has not already processed the delete AQ request and removed the filter from the PF. In that case the filter remains in the driver's list as IAVF_VLAN_ACTIVE but is no longer programmed on the NIC. Since iavf_add_vlans() only picks up filters in IAVF_VLAN_ADD state, the filter is never re-added, and spoof checking drops all traffic for that VLAN. CPU0 CPU1 Workqueue ---- ---- --------- iavf_del_vlan(vlan 100) f->state = REMOVE schedule AQ_DEL_VLAN iavf_add_vlan(vlan 100) f->state = ACTIVE iavf_del_vlans() f is ACTIVE, skip iavf_add_vlans() f is ACTIVE, skip Filter is ACTIVE in driver but absent from NIC. Transition to IAVF_VLAN_ADD instead and schedule IAVF_FLAG_AQ_ADD_VLAN_FILTER so iavf_add_vlans() re-programs the filter. A duplicate add is idempotent on the PF. Fixes: 0c0da0e95105 ("iavf: refactor VLAN filter states") Signed-off-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25igc: fix missing update of skb->tail in igc_xmit_frame()Kohei Enju1-5/+2
[ Upstream commit 0ffba246652faf4a36aedc66059c2f94e4c83ea5 ] igc_xmit_frame() misses updating skb->tail when the packet size is shorter than the minimum one. Use skb_put_padto() in alignment with other Intel Ethernet drivers. Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Signed-off-by: Kohei Enju <kohei@enjuk.jp> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Avigail Dahan <avigailx.dahan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: usb: aqc111: Do not perform PM inside suspend callbackNikola Z. Ivanov1-6/+6
[ Upstream commit 069c8f5aebe4d5224cf62acc7d4b3486091c658a ] syzbot reports "task hung in rpm_resume" This is caused by aqc111_suspend calling the PM variant of its write_cmd routine. The simplified call trace looks like this: rpm_suspend() usb_suspend_both() - here udev->dev.power.runtime_status == RPM_SUSPENDING aqc111_suspend() - called for the usb device interface aqc111_write32_cmd() usb_autopm_get_interface() pm_runtime_resume_and_get() rpm_resume() - here we call rpm_resume() on our parent rpm_resume() - Here we wait for a status change that will never happen. At this point we block another task which holds rtnl_lock and locks up the whole networking stack. Fix this by replacing the write_cmd calls with their _nopm variants Reported-by: syzbot+48dc1e8dfc92faf1124c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=48dc1e8dfc92faf1124c Fixes: e58ba4544c77 ("net: usb: aqc111: Add support for wake on LAN by MAGIC packet") Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Link: https://patch.msgid.link/20260313141643.1181386-1-zlatistiv@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: mana: fix use-after-free in mana_hwc_destroy_channel() by reordering ↵Dipayaan Roy1-3/+3
teardown [ Upstream commit fa103fc8f56954a60699a29215cb713448a39e87 ] A potential race condition exists in mana_hwc_destroy_channel() where hwc->caller_ctx is freed before the HWC's Completion Queue (CQ) and Event Queue (EQ) are destroyed. This allows an in-flight CQ interrupt handler to dereference freed memory, leading to a use-after-free or NULL pointer dereference in mana_hwc_handle_resp(). mana_smc_teardown_hwc() signals the hardware to stop but does not synchronize against IRQ handlers already executing on other CPUs. The IRQ synchronization only happens in mana_hwc_destroy_cq() via mana_gd_destroy_eq() -> mana_gd_deregister_irq(). Since this runs after kfree(hwc->caller_ctx), a concurrent mana_hwc_rx_event_handler() can dereference freed caller_ctx (and rxq->msg_buf) in mana_hwc_handle_resp(). Fix this by reordering teardown to reverse-of-creation order: destroy the TX/RX work queues and CQ/EQ before freeing hwc->caller_ctx. This ensures all in-flight interrupt handlers complete before the memory they access is freed. Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/abHA3AjNtqa1nx9k@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: bcmgenet: increase WoL poll timeoutJustin Chen1-1/+1
[ Upstream commit 6cfc3bc02b977f2fba5f7268e6504d1931a774f7 ] Some systems require more than 5ms to get into WoL mode. Increase the timeout value to 50ms. Fixes: c51de7f3976b ("net: bcmgenet: add Wake-on-LAN support code") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20260312191852.3904571-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-25net: stmmac: remove support for lpi_intr_oRussell King (Oracle)6-58/+0
commit 14eb64db8ff07b58a35b98375f446d9e20765674 upstream. The dwmac databook for v3.74a states that lpi_intr_o is a sideband signal which should be used to ungate the application clock, and this signal is synchronous to the receive clock. The receive clock can run at 2.5, 25 or 125MHz depending on the media speed, and can stop under the control of the link partner. This means that the time it takes to clear is dependent on the negotiated media speed, and thus can be 8, 40, or 400ns after reading the LPI control and status register. It has been observed with some aggressive link partners, this clock can stop while lpi_intr_o is still asserted, meaning that the signal remains asserted for an indefinite period that the local system has no direct control over. The LPI interrupts will still be signalled through the main interrupt path in any case, and this path is not dependent on the receive clock. This, since we do not gate the application clock, and the chances of adding clock gating in the future are slim due to the clocks being ill-defined, lpi_intr_o serves no useful purpose. Remove the code which requests the interrupt, and all associated code. Reported-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com> Tested-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com> # Renesas RZ/V2H board Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1vnJbt-00000007YYN-28nm@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ovidiu Panait <ovidiu.panait.rb@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25eth: bnxt: always recalculate features after XDP clearing, fix null-derefJakub Kicinski3-13/+21
[ Upstream commit f0aa6a37a3dbb40b272df5fc6db93c114688adcd ] Recalculate features when XDP is detached. Before: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: off [requested on] After: # ip li set dev eth0 xdp obj xdp_dummy.bpf.o sec xdp # ip li set dev eth0 xdp off # ethtool -k eth0 | grep gro rx-gro-hw: on The fact that HW-GRO doesn't get re-enabled automatically is just a minor annoyance. The real issue is that the features will randomly come back during another reconfiguration which just happens to invoke netdev_update_features(). The driver doesn't handle reconfiguring two things at a time very robustly. Starting with commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") we only reconfigure the RSS hash table if the "effective" number of Rx rings has changed. If HW-GRO is enabled "effective" number of rings is 2x what user sees. So if we are in the bad state, with HW-GRO re-enablement "pending" after XDP off, and we lower the rings by / 2 - the HW-GRO rings doing 2x and the ethtool -L doing / 2 may cancel each other out, and the: if (old_rx_rings != bp->hw_resc.resv_rx_rings && condition in __bnxt_reserve_rings() will be false. The RSS map won't get updated, and we'll crash with: BUG: kernel NULL pointer dereference, address: 0000000000000168 RIP: 0010:__bnxt_hwrm_vnic_set_rss+0x13a/0x1a0 bnxt_hwrm_vnic_rss_cfg_p5+0x47/0x180 __bnxt_setup_vnic_p5+0x58/0x110 bnxt_init_nic+0xb72/0xf50 __bnxt_open_nic+0x40d/0xab0 bnxt_open_nic+0x2b/0x60 ethtool_set_channels+0x18c/0x1d0 As we try to access a freed ring. The issue is present since XDP support was added, really, but prior to commit 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") it wasn't causing major issues. Fixes: 1054aee82321 ("bnxt_en: Use NETIF_F_GRO_HW.") Fixes: 98ba1d931f61 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20250109043057.2888953-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ The context change is due to the commit 1f6e77cb9b32 ("bnxt_en: Add bnxt_l2_filter hash table.") in v6.8 and the commit 8336a974f37d ("bnxt_en: Save user configured filters in a lookup list") in v6.9 which are irrelevant to the logic of this patch. ] Signed-off-by: Rahul Sharma <black.hawk@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25gve: defer interrupt enabling until NAPI registrationAnkit Garg2-1/+5
commit 3d970eda003441f66551a91fda16478ac0711617 upstream. Currently, interrupts are automatically enabled immediately upon request. This allows interrupt to fire before the associated NAPI context is fully initialized and cause failures like below: [ 0.946369] Call Trace: [ 0.946369] <IRQ> [ 0.946369] __napi_poll+0x2a/0x1e0 [ 0.946369] net_rx_action+0x2f9/0x3f0 [ 0.946369] handle_softirqs+0xd6/0x2c0 [ 0.946369] ? handle_edge_irq+0xc1/0x1b0 [ 0.946369] __irq_exit_rcu+0xc3/0xe0 [ 0.946369] common_interrupt+0x81/0xa0 [ 0.946369] </IRQ> [ 0.946369] <TASK> [ 0.946369] asm_common_interrupt+0x22/0x40 [ 0.946369] RIP: 0010:pv_native_safe_halt+0xb/0x10 Use the `IRQF_NO_AUTOEN` flag when requesting interrupts to prevent auto enablement and explicitly enable the interrupt in NAPI initialization path (and disable it during NAPI teardown). This ensures that interrupt lifecycle is strictly coupled with readiness of NAPI context. Cc: stable@vger.kernel.org Fixes: 1dfc2e46117e ("gve: Refactor napi add and remove functions") Signed-off-by: Ankit Garg <nktgrg@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20251219102945.2193617-1-hramamurthy@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> [ modified to re-introduce the irq member to struct gve_notify_block, which was introuduced in commit 9a5e0776d11f ("gve: Avoid rescheduling napi if on wrong cpu"). ] Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: stmmac: dwmac-loongson: Set clk_csr_i to 100-150MHzHuacai Chen1-1/+1
commit e1aa5ef892fb4fa9014a25e87b64b97347919d37 upstream. Current clk_csr_i setting of Loongson STMMAC (including LS7A1000/2000 and LS2K1000/2000/3000) are copy & paste from other drivers. In fact, Loongson STMMAC use 125MHz clocks and need 62 freq division to within 2.5MHz, meeting most PHY MDC requirement. So fix by setting clk_csr_i to 100-150MHz, otherwise some PHYs may link fail. Cc: stable@vger.kernel.org Fixes: 30bba69d7db40e7 ("stmmac: pci: Add dwmac support for Loongson") Signed-off-by: Hongliang Wang <wanghongliang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20260203062901.2158236-1-chenhuacai@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: enetc: allocate vf_state during PF probesWei Fang1-9/+9
[ Upstream commit e15c5506dd39885cd047f811a64240e2e8ab401b ] In the previous implementation, vf_state is allocated memory only when VF is enabled. However, net_device_ops::ndo_set_vf_mac() may be called before VF is enabled to configure the MAC address of VF. If this is the case, enetc_pf_set_vf_mac() will access vf_state, resulting in access to a null pointer. The simplified error log is as follows. root@ls1028ardb:~# ip link set eno0 vf 1 mac 00:0c:e7:66:77:89 [ 173.543315] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004 [ 173.637254] pc : enetc_pf_set_vf_mac+0x3c/0x80 Message from sy [ 173.641973] lr : do_setlink+0x4a8/0xec8 [ 173.732292] Call trace: [ 173.734740] enetc_pf_set_vf_mac+0x3c/0x80 [ 173.738847] __rtnl_newlink+0x530/0x89c [ 173.742692] rtnl_newlink+0x50/0x7c [ 173.746189] rtnetlink_rcv_msg+0x128/0x390 [ 173.750298] netlink_rcv_skb+0x60/0x130 [ 173.754145] rtnetlink_rcv+0x18/0x24 [ 173.757731] netlink_unicast+0x318/0x380 [ 173.761665] netlink_sendmsg+0x17c/0x3c8 Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20241031060247.1290941-2-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Rahul Sharma <black.hawk@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: enetc: reimplement RFS/RSS memory clearing as PCI quirkVladimir Oltean1-30/+73
[ Upstream commit f0168042a21292d20007d24ab2e4fc32f79ebf11 ] The workaround implemented in commit 3222b5b613db ("net: enetc: initialize RFS/RSS memories for unused ports too") is no longer effective after commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled status"). Thus, it has introduced a regression and we see AER errors being reported again: $ ip link set sw2p0 up && dhclient -i sw2p0 && ip addr show sw2p0 fsl_enetc 0000:00:00.2 eno2: configuring for fixed/internal link mode fsl_enetc 0000:00:00.2 eno2: Link is Up - 2.5Gbps/Full - flow control rx/tx mscc_felix 0000:00:00.5 swp2: configuring for fixed/sgmii link mode mscc_felix 0000:00:00.5 swp2: Link is Up - 1Gbps/Full - flow control off sja1105 spi2.2 sw2p0: configuring for phy/rgmii-id link mode sja1105 spi2.2 sw2p0: Link is Up - 1Gbps/Full - flow control off pcieport 0000:00:1f.0: AER: Multiple Corrected error received: 0000:00:00.0 pcieport 0000:00:1f.0: AER: can't find device of ID0000 Rob's suggestion is to reimplement the enetc driver workaround as a PCI fixup, and to modify the PCI core to run the fixups for all PCI functions. This change handles the first part. We refactor the common code in enetc_psi_create() and enetc_psi_destroy(), and use the PCI fixup only for those functions for which enetc_pf_probe() won't get called. This avoids some work being done twice for the PFs which are enabled. Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/netdev/CAL_JsqLsVYiPLx2kcHkDQ4t=hQVCR7NHziDwi9cCFUFhx48Qow@mail.gmail.com/ Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Rahul Sharma <black.hawk@163.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: fec: handle page_pool_dev_alloc_pages errorKevin Groeneveld1-5/+14
[ Upstream commit 001ba0902046cb6c352494df610718c0763e77a5 ] The fec_enet_update_cbd function calls page_pool_dev_alloc_pages but did not handle the case when it returned NULL. There was a WARN_ON(!new_page) but it would still proceed to use the NULL pointer and then crash. This case does seem somewhat rare but when the system is under memory pressure it can happen. One case where I can duplicate this with some frequency is when writing over a smbd share to a SATA HDD attached to an imx6q. Setting /proc/sys/vm/min_free_kbytes to higher values also seems to solve the problem for my test case. But it still seems wrong that the fec driver ignores the memory allocation error and can crash. This commit handles the allocation error by dropping the current packet. Fixes: 95698ff6177b5 ("net: fec: using page pool to manage RX buffers") Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250113154846.1765414-1-kgroeneveld@lenbrook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ The context change is due to the commit 6c8fae0caf5d (net: fec: simplify the code logic of quirks") in v6.2 which is irrelevant to the logic of this patch. ] Signed-off-by: Johnny Hao <johnny_haocn@sina.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25ice: reintroduce retry mechanism for indirect AQJakub Staniszewski1-3/+9
[ Upstream commit 326256c0a72d4877cec1d4df85357da106233128 ] Add retry mechanism for indirect Admin Queue (AQ) commands. To do so we need to keep the command buffer. This technically reverts commit 43a630e37e25 ("ice: remove unused buffer copy code in ice_sq_send_cmd_retry()"), but combines it with a fix in the logic by using a kmemdup() call, making it more robust and less likely to break in the future due to programmer error. Cc: Michal Schmidt <mschmidt@redhat.com> Cc: stable@vger.kernel.org Fixes: 3056df93f7a8 ("ice: Re-send some AQ commands, as result of EBUSY AQ error") Signed-off-by: Jakub Staniszewski <jakub.staniszewski@linux.intel.com> Co-developed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25ice: sleep, don't busy-wait, in the SQ send retry loopMichal Schmidt1-1/+1
[ Upstream commit b488ae52ef9f74155ab358f8c68e74327b45e0e1 ] 10 ms is a lot of time to spend busy-waiting. Sleeping is clearly allowed here, because we have just returned from ice_sq_send_cmd(), which takes a mutex. On kernels with HZ=100, this msleep may be twice as long, but I don't think it matters. I did not actually observe any retries happening here. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: 326256c0a72d ("ice: reintroduce retry mechanism for indirect AQ") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25ice: remove unused buffer copy code in ice_sq_send_cmd_retry()Michal Schmidt1-11/+2
[ Upstream commit 43a630e37e259fee83ab3fd769c42e2fed97ca81 ] The 'buf_cpy'-related code in ice_sq_send_cmd_retry() looks broken. 'buf' is nowhere copied into 'buf_cpy'. The reason this does not cause problems is that all commands for which 'is_cmd_for_retry' is true go with a NULL buf. Let's remove 'buf_cpy'. Add a WARN_ON in case the assumption no longer holds in the future. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: 326256c0a72d ("ice: reintroduce retry mechanism for indirect AQ") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: macb: Reinitialize tx/rx queue pointer registers and rx ring during resumeKevin Hao1-0/+10
[ Upstream commit 718d0766ce4c7634ce62fa78b526ea7263487edd ] On certain platforms, such as AMD Versal boards, the tx/rx queue pointer registers are cleared after suspend, and the rx queue pointer register is also disabled during suspend if WOL is enabled. Previously, we assumed that these registers would be restored by macb_mac_link_up(). However, in commit bf9cf80cab81, macb_init_buffers() was moved from macb_mac_link_up() to macb_open(). Therefore, we should call macb_init_buffers() to reinitialize the tx/rx queue pointer registers during resume. Due to the reset of these two registers, we also need to adjust the tx/rx rings accordingly. The tx ring will be handled by gem_shuffle_tx_rings() in macb_mac_link_up(), so we only need to initialize the rx ring here. Fixes: bf9cf80cab81 ("net: macb: Fix tx/rx malfunction after phy link down and up") Reported-by: Quanyang Wang <quanyang.wang@windriver.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Tested-by: Quanyang Wang <quanyang.wang@windriver.com> Cc: stable@vger.kernel.org Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260312-macb-versal-v1-2-467647173fa4@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: macb: Introduce gem_init_rx_ring()Kevin Hao1-4/+9
[ Upstream commit 1a7124ecd655bcaf1845197fe416aa25cff4c3ea ] Extract the initialization code for the GEM RX ring into a new function. This change will be utilized in a subsequent patch. No functional changes are introduced. Signed-off-by: Kevin Hao <haokexin@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260312-macb-versal-v1-1-467647173fa4@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 718d0766ce4c ("net: macb: Reinitialize tx/rx queue pointer registers and rx ring during resume") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-03-25net: macb: queue tie-off or disable during WOL suspendVineeth Karumanchi2-3/+64
[ Upstream commit 759cc793ebfc2d1a02f357ae97e5dcdcd63f758f ] When GEM is used as a wake device, it is not mandatory for the RX DMA to be active. The RX engine in IP only needs to receive and identify a wake packet through an interrupt. The wake packet is of no further significance; hence, it is not required to be copied into memory. By disabling RX DMA during suspend, we can avoid unnecessary DMA processing of any incoming traffic. During suspend, perform either of the below operations: - tie-off/dummy descriptor: Disable unused queues by connecting them to a looped descriptor chain without free slots. - queue disable: The newer IP version allows disabling individual queues. Co-developed-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Tested-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> # on SAMA7G5 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: 718d0766ce4c ("net: macb: Reinitialize tx/rx queue pointer registers and rx ring during resume") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>