summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)AuthorFilesLines
4 daysidpf: use actual mbx receive payload lengthJoshua Hay1-8/+1
commit 640f70063e6d3a76a63f57e130fba43ba8c7e980 upstream. When a mailbox message is received, the driver is checking for a non 0 datalen in the controlq descriptor. If it is valid, the payload is attached to the ctlq message to give to the upper layer. However, the payload response size given to the upper layer was taken from the buffer metadata which is _always_ the max buffer size. This meant the API was returning 4K as the payload size for all messages. This went unnoticed since the virtchnl exchange response logic was checking for a response size less than 0 (error), not less than exact size, or not greater than or equal to the max mailbox buffer size (4K). All of these checks will pass in the success case since the size provided is always 4K. However, this breaks anyone that wants to validate the exact response size. Fetch the actual payload length from the value provided in the descriptor data_len field (instead of the buffer metadata). Unfortunately, this means we lose some extra error parsing for variable sized virtchnl responses such as create vport and get ptypes. However, the original checks weren't really helping anyways since the size was _always_ 4K. Fixes: 34c21fa894a1 ("idpf: implement virtchnl transaction manager") Cc: stable@vger.kernel.org # 6.9+ Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysice: Fix improper handling of refcount in ice_sriov_set_msix_vec_count()Gui-Dong Han1-2/+6
commit d517cf89874c6039e6294b18d66f40988e62502a upstream. This patch addresses an issue with improper reference count handling in the ice_sriov_set_msix_vec_count() function. First, the function calls ice_get_vf_by_id(), which increments the reference count of the vf pointer. If the subsequent call to ice_get_vf_vsi() fails, the function currently returns an error without decrementing the reference count of the vf pointer, leading to a reference count leak. The correct behavior, as implemented in this patch, is to decrement the reference count using ice_put_vf(vf) before returning an error when vsi is NULL. Second, the function calls ice_sriov_get_irqs(), which sets vf->first_vector_idx. If this call returns a negative value, indicating an error, the function returns an error without decrementing the reference count of the vf pointer, resulting in another reference count leak. The patch addresses this by adding a call to ice_put_vf(vf) before returning an error when vf->first_vector_idx < 0. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and identifying potential mismanagement of reference counts. In this case, the tool flagged the missing decrement operation as a potential issue, leading to this patch. Fixes: 4035c72dc1ba ("ice: reconfig host after changing MSI-X on VF") Fixes: 4d38cb44bd32 ("ice: manage VFs MSI-X using resource tracking") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <hanguidong02@outlook.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysice: Fix improper handling of refcount in ice_dpll_init_rclk_pins()Gui-Dong Han1-2/+2
commit ccca30a18e36a742e606d5bf0630e75be7711d0a upstream. This patch addresses a reference count handling issue in the ice_dpll_init_rclk_pins() function. The function calls ice_dpll_get_pins(), which increments the reference count of the relevant resources. However, if the condition WARN_ON((!vsi || !vsi->netdev)) is met, the function currently returns an error without properly releasing the resources acquired by ice_dpll_get_pins(), leading to a reference count leak. To resolve this, the check has been moved to the top of the function. This ensures that the function verifies the state before any resources are acquired, avoiding the need for additional resource management in the error path. This bug was identified by an experimental static analysis tool developed by our team. The tool specializes in analyzing reference count operations and detecting potential issues where resources are not properly managed. In this case, the tool flagged the missing release operation as a potential problem, which led to the development of this patch. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <hanguidong02@outlook.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysnet: ibm: emac: mal: add dcr_unmap to _removeRosen Penev1-0/+2
[ Upstream commit 080ddc22f3b0a58500f87e8e865aabbf96495eea ] It's done in probe so it should be undone here. Fixes: 1d3bb996481e ("Device tree aware EMAC driver") Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241008233050.9422-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysnet: ti: icssg-prueth: Fix race condition for VLAN table accessMD Danish Anwar3-0/+5
[ Upstream commit ff8ee11e778520c5716b7f165d2c7ce14d6a068b ] The VLAN table is a shared memory between the two ports/slices in a ICSSG cluster and this may lead to race condition when the common code paths for both ports are executed in different CPUs. Fix the race condition access by locking the shared memory access Fixes: 487f7323f39a ("net: ti: icssg-prueth: Add helper functions to configure FDB") Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysnet: ibm: emac: mal: fix wrong gotoRosen Penev1-1/+1
[ Upstream commit 08c8acc9d8f3f70d62dd928571368d5018206490 ] dcr_map is called in the previous if and therefore needs to be unmapped. Fixes: 1ff0fcfcb1a6 ("ibm_newemac: Fix new MAL feature handling") Signed-off-by: Rosen Penev <rosenp@gmail.com> Link: https://patch.msgid.link/20241007235711.5714-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 dayse1000e: change I219 (19) devices to ADPVitaly Lifshits2-4/+4
[ Upstream commit 9d9e5347b035412daa844f884b94a05bac94f864 ] Sporadic issues, such as PHY access loss, have been observed on I219 (19) devices. It was found that these devices have hardware more closely related to ADP than MTP and the issues were caused by taking MTP-specific flows. Change the MAC and board types of these devices from MTP to ADP to correctly reflect the LAN hardware, and flows, of these devices. Fixes: db2d737d63c5 ("e1000e: Separate MTP board type from ADP") Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysigb: Do not bring the device up after non-fatal errorMohamed Khalfella1-0/+4
[ Upstream commit 330a699ecbfc9c26ec92c6310686da1230b4e7eb ] Commit 004d25060c78 ("igb: Fix igb_down hung on surprise removal") changed igb_io_error_detected() to ignore non-fatal pcie errors in order to avoid hung task that can happen when igb_down() is called multiple times. This caused an issue when processing transient non-fatal errors. igb_io_resume(), which is called after igb_io_error_detected(), assumes that device is brought down by igb_io_error_detected() if the interface is up. This resulted in panic with stacktrace below. [ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down [ T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0 [ T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID) [ T292] igb 0000:09:00.0: device [8086:1537] error status/mask=00004000/00000000 [ T292] igb 0000:09:00.0: [14] CmpltTO [ 200.105524,009][ T292] igb 0000:09:00.0: AER: TLP Header: 00000000 00000000 00000000 00000000 [ T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message [ T292] igb 0000:09:00.0: Non-correctable non-fatal error reported. [ T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message [ T292] pcieport 0000:00:1c.5: AER: broadcast resume message [ T292] ------------[ cut here ]------------ [ T292] kernel BUG at net/core/dev.c:6539! [ T292] invalid opcode: 0000 [#1] PREEMPT SMP [ T292] RIP: 0010:napi_enable+0x37/0x40 [ T292] Call Trace: [ T292] <TASK> [ T292] ? die+0x33/0x90 [ T292] ? do_trap+0xdc/0x110 [ T292] ? napi_enable+0x37/0x40 [ T292] ? do_error_trap+0x70/0xb0 [ T292] ? napi_enable+0x37/0x40 [ T292] ? napi_enable+0x37/0x40 [ T292] ? exc_invalid_op+0x4e/0x70 [ T292] ? napi_enable+0x37/0x40 [ T292] ? asm_exc_invalid_op+0x16/0x20 [ T292] ? napi_enable+0x37/0x40 [ T292] igb_up+0x41/0x150 [ T292] igb_io_resume+0x25/0x70 [ T292] report_resume+0x54/0x70 [ T292] ? report_frozen_detected+0x20/0x20 [ T292] pci_walk_bus+0x6c/0x90 [ T292] ? aer_print_port_info+0xa0/0xa0 [ T292] pcie_do_recovery+0x22f/0x380 [ T292] aer_process_err_devices+0x110/0x160 [ T292] aer_isr+0x1c1/0x1e0 [ T292] ? disable_irq_nosync+0x10/0x10 [ T292] irq_thread_fn+0x1a/0x60 [ T292] irq_thread+0xe3/0x1a0 [ T292] ? irq_set_affinity_notifier+0x120/0x120 [ T292] ? irq_affinity_notify+0x100/0x100 [ T292] kthread+0xe2/0x110 [ T292] ? kthread_complete_and_exit+0x20/0x20 [ T292] ret_from_fork+0x2d/0x50 [ T292] ? kthread_complete_and_exit+0x20/0x20 [ T292] ret_from_fork_asm+0x11/0x20 [ T292] </TASK> To fix this issue igb_io_resume() checks if the interface is running and the device is not down this means igb_io_error_detected() did not bring the device down and there is no need to bring it up. Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com> Fixes: 004d25060c78 ("igb: Fix igb_down hung on surprise removal") Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysi40e: Fix macvlan leak by synchronizing access to mac_filter_hashAleksandr Loktionov2-0/+3
[ Upstream commit dac6c7b3d33756d6ce09f00a96ea2ecd79fae9fb ] This patch addresses a macvlan leak issue in the i40e driver caused by concurrent access to vsi->mac_filter_hash. The leak occurs when multiple threads attempt to modify the mac_filter_hash simultaneously, leading to inconsistent state and potential memory leaks. To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock), ensuring atomic operations and preventing concurrent access. Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in i40e_add_mac_filter() to help catch similar issues in the future. Reproduction steps: 1. Spawn VFs and configure port vlan on them. 2. Trigger concurrent macvlan operations (e.g., adding and deleting portvlan and/or mac filters). 3. Observe the potential memory leak and inconsistent state in the mac_filter_hash. This synchronization ensures the integrity of the mac_filter_hash and prevents the described leak. Fixes: fed0d9f13266 ("i40e: Fix VF's MAC Address change on VM") Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: Fix increasing MSI-X on VFMarcin Szycik3-5/+9
[ Upstream commit bce9af1b030bf59d51bbabf909a3ef164787e44e ] Increasing MSI-X value on a VF leads to invalid memory operations. This is caused by not reallocating some arrays. Reproducer: modprobe ice echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count Default MSI-X is 16, so 17 and above triggers this issue. KASAN reports: BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] Read of size 8 at addr ffff8888b937d180 by task bash/28433 (...) Call Trace: (...) ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] kasan_report+0xed/0x120 ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_cfg_def+0x3360/0x4770 [ice] ? mutex_unlock+0x83/0xd0 ? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice] ? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vf_reconfig_vsi+0x114/0x210 [ice] ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice] sriov_vf_msix_count_store+0x21c/0x300 (...) Allocated by task 28201: (...) ice_vsi_cfg_def+0x1c8e/0x4770 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vsi_setup+0x179/0xa30 [ice] ice_sriov_configure+0xcaa/0x1520 [ice] sriov_numvfs_store+0x212/0x390 (...) To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This causes the required arrays to be reallocated taking the new queue count into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq before ice_vsi_rebuild(), so that realloc uses the newly set queue count. Additionally, ice_vsi_rebuild() does not remove VSI filters (ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer necessary. Reported-by: Jacob Keller <jacob.e.keller@intel.com> Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: Flush FDB entries before resetWojciech Drewek3-22/+8
[ Upstream commit fbcb968a98ac0b71f5a2bda2751d7a32d201f90d ] Triggering the reset while in switchdev mode causes errors[1]. Rules are already removed by this time because switch content is flushed in case of the reset. This means that rules were deleted from HW but SW still thinks they exist so when we get SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to delete not existing rule. We can avoid these errors by clearing the rules early in the reset flow before they are removed from HW. Switchdev API will get notified that the rule was removed so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification. Remove unnecessary ice_clear_sw_switch_recipes. [1] ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2 ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2 Fixes: 7c945a1a8e5f ("ice: Switchdev FDB events support") Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: Fix netif_is_ice() in Safe ModeMarcin Szycik1-1/+2
[ Upstream commit 8e60dbcbaaa177dacef55a61501790e201bf8c88 ] netif_is_ice() works by checking the pointer to netdev ops. However, it only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops, so in Safe Mode it always returns false, which is unintuitive. While it doesn't look like netif_is_ice() is currently being called anywhere in Safe Mode, this could change and potentially lead to unexpected behaviour. Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: Fix entering Safe ModeMarcin Szycik1-3/+1
[ Upstream commit b972060a47780aa2d46441e06b354156455cc877 ] If DDP package is missing or corrupted, the driver should enter Safe Mode. Instead, an error is returned and probe fails. To fix this, don't exit init if ice_init_ddp_config() returns an error. Repro: * Remove or rename DDP package (/lib/firmware/intel/ice/ddp/ice.pkg) * Load ice Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysnet: ethernet: adi: adin1110: Fix some error handling path in ↵Christophe JAILLET1-2/+2
adin1110_read_fifo() [ Upstream commit 83211ae1640516accae645de82f5a0a142676897 ] If 'frame_size' is too small or if 'round_len' is an error code, it is likely that an error code should be returned to the caller. Actually, 'ret' is likely to be 0, so if one of these sanity checks fails, 'success' is returned. Return -EINVAL instead. Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/8ff73b40f50d8fa994a454911b66adebce8da266.1727981562.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysRevert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled"Jakub Kicinski1-1/+1
[ Upstream commit 5546da79e6cc5bb3324bf25688ed05498fd3f86d ] This reverts commit b514c47ebf41a6536551ed28a05758036e6eca7c. The commit describes that we don't have to sync the page when recycling, and it tries to optimize that case. But we do need to sync after allocation. Recycling side should be changed to pass the right sync size instead. Fixes: b514c47ebf41 ("net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled") Reported-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/20241004070846.2502e9ea@kernel.org Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/20241004142115.910876-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 dayssfc: Don't invoke xdp_do_flush() from netpoll.Sebastian Andrzej Siewior2-2/+4
[ Upstream commit 55e802468e1d38dec8e25a2fdb6078d45b647e8c ] Yury reported a crash in the sfc driver originated from netpoll_send_udp(). The netconsole sends a message and then netpoll invokes the driver's NAPI function with a budget of zero. It is dedicated to allow driver to free TX resources, that it may have used while sending the packet. In the netpoll case the driver invokes xdp_do_flush() unconditionally, leading to crash because bpf_net_context was never assigned. Invoke xdp_do_flush() only if budget is not zero. Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Reported-by: Yury Vostrikov <mon@unformed.ru> Closes: https://lore.kernel.org/5627f6d1-5491-4462-9d75-bc0612c26a22@app.fastmail.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20241002125837.utOcRo6Y@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: fix VLAN replay after resetDave Ertman1-2/+0
[ Upstream commit 0eae2c136cb624e4050092feb59f18159b4f2512 ] There is a bug currently when there are more than one VLAN defined and any reset that affects the PF is initiated, after the reset rebuild no traffic will pass on any VLAN but the last one created. This is caused by the iteration though the VLANs during replay each clearing the vsi_map bitmap of the VSI that is being replayed. The problem is that during rhe replay, the pointer to the vsi_map bitmap is used by each successive vlan to determine if it should be replayed on this VSI. The logic was that the replay of the VLAN would replace the bit in the map before the next VLAN would iterate through. But, since the replay copies the old bitmap pointer to filt_replay_rules and creates a new one for the recreated VLANS, it does not do this, and leaves the old bitmap broken to be used to replay the remaining VLANs. Since the old bitmap will be cleaned up in post replay cleanup, there is no need to alter it and break following VLAN replay, so don't clear the bit. Fixes: 334cb0626de1 ("ice: Implement VSI replay framework") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: disallow DPLL_PIN_STATE_SELECTABLE for dpll output pinsArkadiusz Kubalewski1-0/+2
[ Upstream commit afe6e30e7701979f536f8fbf6fdef7212441f61a ] Currently the user may request DPLL_PIN_STATE_SELECTABLE for an output pin, and this would actually set the DISCONNECTED state instead. It doesn't make any sense. SELECTABLE is valid only in case of input pins (on AUTOMATIC type dpll), where dpll itself would select best valid input. For the output pin only CONNECTED/DISCONNECTED are expected. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: fix memleak in ice_init_tx_topology()Przemek Kitszel3-39/+31
[ Upstream commit c188afdc36113760873ec78cbc036f6b05f77621 ] Fix leak of the FW blob (DDP pkg). Make ice_cfg_tx_topo() const-correct, so ice_init_tx_topology() can avoid copying whole FW blob. Copy just the topology section, and only when needed. Reuse the buffer allocated for the read of the current topology. This was found by kmemleak, with the following trace for each PF: [<ffffffff8761044d>] kmemdup_noprof+0x1d/0x50 [<ffffffffc0a0a480>] ice_init_ddp_config+0x100/0x220 [ice] [<ffffffffc0a0da7f>] ice_init_dev+0x6f/0x200 [ice] [<ffffffffc0a0dc49>] ice_init+0x29/0x560 [ice] [<ffffffffc0a10c1d>] ice_probe+0x21d/0x310 [ice] Constify ice_cfg_tx_topo() @buf parameter. This cascades further down to few more functions. Fixes: cc5776fe1832 ("ice: Enable switching default Tx scheduler topology") CC: Larysa Zaremba <larysa.zaremba@intel.com> CC: Jacob Keller <jacob.e.keller@intel.com> CC: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> CC: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: clear port vlan config during resetMichal Swiatkowski3-0/+65
[ Upstream commit d019b1a9128d65956f04679ec2bb8b0800f13358 ] Since commit 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") VF VSI is only reconfigured instead of recreated. The context configuration from previous setting is still the same. If any of the config needs to be cleared it needs to be cleared explicitly. Previously there was assumption that port vlan will be cleared automatically. Now, when VSI is only reconfigured we have to do it in the code. Not clearing port vlan configuration leads to situation when the driver VSI config is different than the VSI config in HW. Traffic can't be passed after setting and clearing port vlan, because of invalid VSI config in HW. Example reproduction: > ip a a dev $(VF) $(VF_IP_ADDRESS) > ip l s dev $(VF) up > ping $(VF_IP_ADDRESS) ping is working fine here > ip link set eth5 vf 0 vlan 100 > ip link set eth5 vf 0 vlan 0 > ping $(VF_IP_ADDRESS) ping isn't working Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Piotr Tyda <piotr.tyda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysice: set correct dst VSI in only LAN filtersMichal Swiatkowski1-0/+11
[ Upstream commit 839e3f9bee425c90a0423d14b102a42fe6635c73 ] The filters set that will reproduce the problem: $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \ skip_sw dst_mac ff:ff:ff:ff:ff:ff action mirred egress \ redirect dev $PF0 $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \ skip_sw dst_mac ff:ff:ff:ff:ff:ff src_mac 52:54:00:00:00:10 \ action mirred egress mirror dev $VF1_PR Expected behaviour is to set all broadcast from VF0 to the LAN. If the src_mac match the value from filters, send packet to LAN and to VF1. In this case both LAN_EN and LB_EN flags in switch is set in case of packet matching both filters. As dst VSI for the only LAN enable bit is PF VSI, the packet is being seen on PF. To fix this change dst VSI to the source VSI. It will block receiving any packet even when LB_EN is set by switch, because local loopback is clear on VF VSI during normal operation. Side note: if the second filters action is redirect instead of mirror LAN_EN is clear, because switch is AND-ing LAN_EN from each matched filters and OR-ing LB_EN. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Fixes: 73b483b79029 ("ice: Manage act flags for switchdev offloads") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysnet: fec: don't save PTP state if PTP is unsupportedWei Fang1-2/+4
commit 6be063071a457767ee229db13f019c2ec03bfe44 upstream. Some platforms (such as i.MX25 and i.MX27) do not support PTP, so on these platforms fec_ptp_init() is not called and the related members in fep are not initialized. However, fec_ptp_save_state() is called unconditionally, which causes the kernel to panic. Therefore, add a condition so that fec_ptp_save_state() is not called if PTP is not supported. Fixes: a1477dc87dc4 ("net: fec: Restart PPS after link state change") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/lkml/353e41fe-6bb4-4ee9-9980-2da2a9c1c508@roeck-us.net/ Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20241008061153.1977930-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysr8169: add tally counter fields added with RTL8125Heiner Kallweit1-0/+27
[ Upstream commit ced8e8b8f40accfcce4a2bbd8b150aa76d5eff9a ] RTL8125 added fields to the tally counter, what may result in the chip dma'ing these new fields to unallocated memory. Therefore make sure that the allocated memory area is big enough to hold all of the tally counter values, even if we use only parts of it. Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/741d26a9-2b2b-485d-91d9-ecb302e345b5@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysr8169: Fix spelling mistake: "tx_underun" -> "tx_underrun"Colin Ian King1-2/+2
[ Upstream commit 8df9439389a44fb2cc4ef695e08d6a8870b1616c ] There is a spelling mistake in the struct field tx_underun, rename it to tx_underrun. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/20240909140021.64884-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: ced8e8b8f40a ("r8169: add tally counter fields added with RTL8125") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: stmmac: Fix zero-division error when disabling tc cbsKhaiWenTan1-0/+1
commit 675faf5a14c14a2be0b870db30a70764df81e2df upstream. The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled") allows the "port_transmit_rate_kbps" to be set to a value of 0, which is then passed to the "div_s64" function when tc-cbs is disabled. This leads to a zero-division error. When tc-cbs is disabled, the idleslope, sendslope, and credit values the credit values are not required to be configured. Therefore, adding a return statement after setting the txQ mode to DCB when tc-cbs is disabled would prevent a zero-division error. Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled") Cc: <stable@vger.kernel.org> Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: KhaiWenTan <khai.wen.tan@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240918061422.1589662-1-khai.wen.tan@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 daysnfp: Use IRQF_NO_AUTOEN flag in request_irq()Jinjie Ruan1-3/+2
[ Upstream commit daaba19d357f0900b303a530ced96c78086267ea ] disable_irq() after request_irq() still has a time gap in which interrupts can come. request_irq() with IRQF_NO_AUTOEN flag will disable IRQ auto-enable when request IRQ. Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patch.msgid.link/20240911094445.1922476-4-ruanjinjie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: atlantic: Avoid warning about potential string truncationSimon Horman1-2/+2
[ Upstream commit 5874e0c9f25661c2faefe4809907166defae3d7f ] W=1 builds with GCC 14.2.0 warn that: .../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=] 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~ .../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254] 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~~~~~~ .../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8 278 | snprintf(tc_string, 8, "TC%d ", tc); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8, the range is 0 - 255. Further, on inspecting the code, it seems that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so the range is actually 0 - 8. So, it seems that the condition that GCC flags will not occur. But, nonetheless, it would be nice if it didn't emit the warning. It seems that this can be achieved by changing the format specifier from %d to %u, in which case I believe GCC recognises an upper bound on the range of tc of 0 - 255. After some experimentation I think this is due to the combination of the use of %u and the type of cfg->tcs (u8). Empirically, updating the type of the tc variable to unsigned int has the same effect. As both of these changes seem to make sense in relation to what the code is actually doing - iterating over unsigned values - do both. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysbnxt_en: Extend maximum length of version string by 1 byteSimon Horman1-1/+1
[ Upstream commit ffff7ee843c351ce71d6e0d52f0f20bea35e18c9 ] This corrects an out-by-one error in the maximum length of the package version string. The size argument of snprintf includes space for the trailing '\0' byte, so there is no need to allow extra space for it by reducing the value of the size argument by 1. Found by inspection. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240813-bnxt-str-v2-1-872050a157e7@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: mvpp2: Increase size of queue_name bufferSimon Horman1-1/+1
[ Upstream commit 91d516d4de48532d967a77967834e00c8c53dfe6 ] Increase size of queue_name buffer from 30 to 31 to accommodate the largest string written to it. This avoids truncation in the possibly unlikely case where the string is name is the maximum size. Flagged by gcc-14: .../mvpp2_main.c: In function 'mvpp2_probe': .../mvpp2_main.c:7636:32: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=] 7636 | "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev), | ^ .../mvpp2_main.c:7635:9: note: 'snprintf' output between 10 and 31 bytes into a destination of size 30 7635 | snprintf(priv->queue_name, sizeof(priv->queue_name), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7636 | "stats-wq-%s%s", netdev_name(priv->port_list[0]->dev), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 7637 | priv->port_count > 1 ? "+" : ""); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Introduced by commit 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics"). I am not flagging this as a bug as I am not aware that it is one. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Marcin Wojtas <marcin.s.wojtas@gmail.com> Link: https://patch.msgid.link/20240806-mvpp2-namelen-v1-1-6dc773653f2f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: hisilicon: hns_mdio: fix OF node leak in probe()Krzysztof Kozlowski1-0/+1
[ Upstream commit e62beddc45f487b9969821fad3a0913d9bc18a2f ] Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-4-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()Krzysztof Kozlowski1-0/+1
[ Upstream commit 5680cf8d34e1552df987e2f4bb1bff0b2a8c8b11 ] Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in hns_mac_get_info(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-3-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: hisilicon: hip04: fix OF node leak in probe()Krzysztof Kozlowski1-0/+1
[ Upstream commit 17555297dbd5bccc93a01516117547e26a61caf1 ] Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysice: Adjust over allocation of memory in ice_sched_add_root_node() and ↵Aleksandr Mishin1-4/+2
ice_sched_add_node() [ Upstream commit 62fdaf9e8056e9a9e6fe63aa9c816ec2122d60c6 ] In ice_sched_add_root_node() and ice_sched_add_node() there are calls to devm_kcalloc() in order to allocate memory for array of pointers to 'ice_sched_node' structure. But incorrect types are used as sizeof() arguments in these calls (structures instead of pointers) which leads to over allocation of memory. Adjust over allocation of memory by correcting types in devm_kcalloc() sizeof() arguments. Found by Linux Verification Center (linuxtesting.org) with SVACE. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 dayse1000e: avoid failing the system during pm_suspendVitaly Lifshits1-8/+11
[ Upstream commit 0a6ad4d9e1690c7faa3a53f762c877e477093657 ] Occasionally when the system goes into pm_suspend, the suspend might fail due to a PHY access error on the network adapter. Previously, this would have caused the whole system to fail to go to a low power state. An example of this was reported in the following Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205015 [ 1663.694828] e1000e 0000:00:19.0 eth0: Failed to disable ULP [ 1664.731040] asix 2-3:1.0 eth1: link up, 100Mbps, full-duplex, lpa 0xC1E1 [ 1665.093513] e1000e 0000:00:19.0 eth0: Hardware Error [ 1665.596760] e1000e 0000:00:19.0: pci_pm_resume+0x0/0x80 returned 0 after 2975399 usecs and then the system never recovers from it, and all the following suspend failed due to this [22909.393854] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x760 [e1000e] returns -2 [22909.393858] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x160 returns -2 [22909.393861] PM: Device 0000:00:1f.6 failed to suspend async: error -2 This can be avoided by changing the return values of __e1000_shutdown and e1000e_pm_suspend functions so that they always return 0 (success). This is consistent with what other drivers do. If the e1000e driver encounters a hardware error during suspend, potential side effects include slightly higher power draw or non-working wake on LAN. This is preferred to a system-level suspend failure, and a warning message is written to the system log, so that the user can be aware that the LAN controller experienced a problem during suspend. Link: https://bugzilla.kernel.org/show_bug.cgi?id=205015 Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: stmmac: dwmac4: extend timeout for VLAN Tag register busy bit checkShenwei Wang1-9/+9
[ Upstream commit 4c1b56671b68ffcbe6b78308bfdda6bcce6491ae ] Increase the timeout for checking the busy bit of the VLAN Tag register from 10µs to 500ms. This change is necessary to accommodate scenarios where Energy Efficient Ethernet (EEE) is enabled. Overnight testing revealed that when EEE is active, the busy bit can remain set for up to approximately 300ms. The new 500ms timeout provides a safety margin. Fixes: ed64639bc1e0 ("net: stmmac: Add support for VLAN Rx filtering") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Link: https://patch.msgid.link/20240924205424.573913-1-shenwei.wang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: fec: Reload PTP registers after link-state changeCsókás, Bence2-0/+23
[ Upstream commit d9335d0232d2da605585eea1518ac6733518f938 ] On link-state change, the controller gets reset, which clears all PTP registers, including PHC time, calibrated clock correction values etc. For correct IEEE 1588 operation we need to restore these after the reset. Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock") Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20240924093705.2897329-2-csokas.bence@prolan.hu Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: fec: Restart PPS after link state changeCsókás, Bence3-1/+46
[ Upstream commit a1477dc87dc4996dcf65a4893d4e2c3a6b593002 ] On link state change, the controller gets reset, causing PPS to drop out. Re-enable PPS if it was enabled before the controller reset. Fixes: 6605b730c061 ("FEC: Add time stamping code and a PTP hardware clock") Signed-off-by: Csókás, Bence <csokas.bence@prolan.hu> Link: https://patch.msgid.link/20240924093705.2897329-1-csokas.bence@prolan.hu Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: ethernet: lantiq_etop: fix memory disclosureAleksander Jan Bajkowski1-1/+3
[ Upstream commit 45c0de18ff2dc9af01236380404bbd6a46502c69 ] When applying padding, the buffer is not zeroed, which results in memory disclosure. The mentioned data is observed on the wire. This patch uses skb_put_padto() to pad Ethernet frames properly. The mentioned function zeroes the expanded buffer. In case the packet cannot be padded it is silently dropped. Statistics are also not incremented. This driver does not support statistics in the old 32-bit format or the new 64-bit format. These will be added in the future. In its current form, the patch should be easily backported to stable versions. Ethernet MACs on Amazon-SE and Danube cannot do padding of the packets in hardware, so software padding must be applied. Fixes: 504d4721ee8e ("MIPS: Lantiq: Add ethernet driver") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240923214949.231511-2-olek2@wp.pl Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet/mlx5e: Fix crash caused by calling __xfrm_state_delete() twiceJianbo Liu1-1/+7
[ Upstream commit 7b124695db40d5c9c5295a94ae928a8d67a01c3d ] The km.state is not checked in driver's delayed work. When xfrm_state_check_expire() is called, the state can be reset to XFRM_STATE_EXPIRED, even if it is XFRM_STATE_DEAD already. This happens when xfrm state is deleted, but not freed yet. As __xfrm_state_delete() is called again in xfrm timer, the following crash occurs. To fix this issue, skip xfrm_state_check_expire() if km.state is not XFRM_STATE_VALID. Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] SMP CPU: 5 UID: 0 PID: 7448 Comm: kworker/u102:2 Not tainted 6.11.0-rc2+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Workqueue: mlx5e_ipsec: eth%d mlx5e_ipsec_handle_sw_limits [mlx5_core] RIP: 0010:__xfrm_state_delete+0x3d/0x1b0 Code: 0f 84 8b 01 00 00 48 89 fd c6 87 c8 00 00 00 05 48 8d bb 40 10 00 00 e8 11 04 1a 00 48 8b 95 b8 00 00 00 48 8b 85 c0 00 00 00 <48> 89 42 08 48 89 10 48 8b 55 10 48 b8 00 01 00 00 00 00 ad de 48 RSP: 0018:ffff88885f945ec8 EFLAGS: 00010246 RAX: dead000000000122 RBX: ffffffff82afa940 RCX: 0000000000000036 RDX: dead000000000100 RSI: 0000000000000000 RDI: ffffffff82afb980 RBP: ffff888109a20340 R08: ffff88885f945ea0 R09: 0000000000000000 R10: 0000000000000000 R11: ffff88885f945ff8 R12: 0000000000000246 R13: ffff888109a20340 R14: ffff88885f95f420 R15: ffff88885f95f400 FS: 0000000000000000(0000) GS:ffff88885f940000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f2163102430 CR3: 00000001128d6001 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> ? die_addr+0x33/0x90 ? exc_general_protection+0x1a2/0x390 ? asm_exc_general_protection+0x22/0x30 ? __xfrm_state_delete+0x3d/0x1b0 ? __xfrm_state_delete+0x2f/0x1b0 xfrm_timer_handler+0x174/0x350 ? __xfrm_state_delete+0x1b0/0x1b0 __hrtimer_run_queues+0x121/0x270 hrtimer_run_softirq+0x88/0xd0 handle_softirqs+0xcc/0x270 do_softirq+0x3c/0x50 </IRQ> <TASK> __local_bh_enable_ip+0x47/0x50 mlx5e_ipsec_handle_sw_limits+0x7d/0x90 [mlx5_core] process_one_work+0x137/0x2d0 worker_thread+0x28d/0x3a0 ? rescuer_thread+0x480/0x480 kthread+0xb8/0xe0 ? kthread_park+0x80/0x80 ret_from_fork+0x2d/0x50 ? kthread_park+0x80/0x80 ret_from_fork_asm+0x11/0x20 </TASK> Fixes: b2f7b01d36a9 ("net/mlx5e: Simulate missing IPsec TX limits hardware functionality") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet/mlx5e: SHAMPO, Fix overflow of hd_per_wqDragos Tatulea1-1/+1
[ Upstream commit 023d2a43ed0d9ab73d4a35757121e4c8e01298e5 ] When having larger RQ sizes and small MTUs sizes, the hd_per_wq variable can overflow. Like in the following case: $> ethtool --set-ring eth1 rx 8192 $> ip link set dev eth1 mtu 144 $> ethtool --features eth1 rx-gro-hw on ... yields in dmesg: mlx5_core 0000:08:00.1: mlx5_cmd_out_err:808:(pid 194797): CREATE_MKEY(0x200) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x3bf6f), err(-22) because hd_per_wq is 64K which overflows to 0 and makes the command fail. This patch increases the variable size to 32 bit. Fixes: 99be56171fa9 ("net/mlx5e: SHAMPO, Re-enable HW-GRO") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet/mlx5e: Fix NULL deref in mlx5e_tir_builder_alloc()Elena Salomatkina1-0/+3
[ Upstream commit f25389e779500cf4a59ef9804534237841bce536 ] In mlx5e_tir_builder_alloc() kvzalloc() may return NULL which is dereferenced on the next line in a reference to the modify field. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: a6696735d694 ("net/mlx5e: Convert TIR to a dedicated object") Signed-off-by: Elena Salomatkina <esalomatkina@ispras.ru> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet/mlx5: Added cond_resched() to crdump collectionMohamed Khalfella1-0/+10
[ Upstream commit ec793155894140df7421d25903de2e6bc12c695b ] Collecting crdump involves reading vsc registers from pci config space of mlx device, which can take long time to complete. This might result in starving other threads waiting to run on the cpu. Numbers I got from testing ConnectX-5 Ex MCX516A-CDAT in the lab: - mlx5_vsc_gw_read_block_fast() was called with length = 1310716. - mlx5_vsc_gw_read_fast() reads 4 bytes at a time. It was not used to read the entire 1310716 bytes. It was called 53813 times because there are jumps in read_addr. - On average mlx5_vsc_gw_read_fast() took 35284.4ns. - In total mlx5_vsc_wait_on_flag() called vsc_read() 54707 times. The average time for each call was 17548.3ns. In some instances vsc_read() was called more than one time when the flag was not set. As expected the thread released the cpu after 16 iterations in mlx5_vsc_wait_on_flag(). - Total time to read crdump was 35284.4ns * 53813 ~= 1.898s. It was seen in the field that crdump can take more than 5 seconds to complete. During that time mlx5_vsc_wait_on_flag() did not release the cpu because it did not complete 16 iterations. It is believed that pci config reads were slow. Adding cond_resched() every 128 register read improves the situation. In the common case the, crdump takes ~1.8989s, the thread yields the cpu every ~4.51ms. If crdump takes ~5s, the thread yields the cpu every ~18.0ms. Fixes: 8b9d8baae1de ("net/mlx5: Add Crdump support") Reviewed-by: Yuanyuan Zhong <yzhong@purestorage.com> Signed-off-by: Mohamed Khalfella <mkhalfella@purestorage.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet/mlx5: Fix error path in multi-packet WQE transmitGerd Bayer1-1/+0
[ Upstream commit 2bcae12c795f32ddfbf8c80d1b5f1d3286341c32 ] Remove the erroneous unmap in case no DMA mapping was established The multi-packet WQE transmit code attempts to obtain a DMA mapping for the skb. This could fail, e.g. under memory pressure, when the IOMMU driver just can't allocate more memory for page tables. While the code tries to handle this in the path below the err_unmap label it erroneously unmaps one entry from the sq's FIFO list of active mappings. Since the current map attempt failed this unmap is removing some random DMA mapping that might still be required. If the PCI function now presents that IOVA, the IOMMU may assumes a rogue DMA access and e.g. on s390 puts the PCI function in error state. The erroneous behavior was seen in a stress-test environment that created memory pressure. Fixes: 5af75c747e2a ("net/mlx5e: Enhanced TX MPWQE for SKBs") Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Acked-by: Maxim Mikityanskiy <maxtram95@gmail.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
11 daysnet: sparx5: Fix invalid timestampsAakash Menon1-1/+5
[ Upstream commit 151ac45348afc5b56baa584c7cd4876addf461ff ] Bit 270-271 are occasionally unexpectedly set by the hardware. This issue was observed with 10G SFPs causing huge time errors (> 30ms) in PTP. Only 30 bits are needed for the nanosecond part of the timestamp, clear 2 most significant bits before extracting timestamp from the internal frame header. Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping") Signed-off-by: Aakash Menon <aakash.menon@protempis.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04idpf: fix netdev Tx queue stop/wakeMichal Kubiak3-27/+21
commit e4b398dd82f5d5867bc5f442c43abc8fba30ed2c upstream. netif_txq_maybe_stop() returns -1, 0, or 1, while idpf_tx_maybe_stop_common() says it returns 0 or -EBUSY. As a result, there sometimes are Tx queue timeout warnings despite that the queue is empty or there is at least enough space to restart it. Make idpf_tx_maybe_stop_common() inline and returning true or false, handling the return of netif_txq_maybe_stop() properly. Use a correct goto in idpf_tx_maybe_stop_splitq() to avoid stopping the queue or incrementing the stops counter twice. Fixes: 6818c4d5b3c2 ("idpf: add splitq start_xmit") Fixes: a5ab9ee0df0b ("idpf: add singleq start_xmit and napi poll") Cc: stable@vger.kernel.org # 6.7+ Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-04net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabledFurong Xu1-1/+1
[ Upstream commit b514c47ebf41a6536551ed28a05758036e6eca7c ] Commit 5fabb01207a2 ("net: stmmac: Add initial XDP support") sets PP_FLAG_DMA_SYNC_DEV flag for page_pool unconditionally, page_pool_recycle_direct() will call page_pool_dma_sync_for_device() on every page even the page is not going to be reused by XDP program. When XDP is not enabled, the page which holds the received buffer will be recycled once the buffer is copied into new SKB by skb_copy_to_linear_data(), then the MAC core will never reuse this page any longer. Always setting PP_FLAG_DMA_SYNC_DEV wastes CPU cycles on unnecessary calling of page_pool_dma_sync_for_device(). After this patch, up to 9% noticeable performance improvement was observed on certain platforms. Fixes: 5fabb01207a2 ("net: stmmac: Add initial XDP support") Signed-off-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/20240919121028.1348023-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04net: ravb: Fix R-Car RX frame size limitPaul Barker1-2/+10
[ Upstream commit ec8234717db8589078d08b17efa528a235c61f4f ] The RX frame size limit should not be based on the current MTU setting. Instead it should be based on the hardware capabilities. While we're here, improve the description of the receive frame length setting as suggested by Niklas. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04net: ravb: Fix maximum TX frame size for GbEth devicesPaul Barker2-1/+6
[ Upstream commit 1d63864299cafa7c8cbde56491c9932afdbff7ea ] The datasheets for all SoCs using the GbEth IP specify a maximum transmission frame size of 1.5 kByte. I've confirmed through internal discussions that support for 1522 byte frames has been validated, which allows us to support the default MTU of 1500 bytes after reserving space for the Ethernet header, frame checksums and an optional VLAN tag. Fixes: 2e95e08ac009 ("ravb: Add rx_max_buf_size to struct ravb_hw_info") Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race ↵Kaixin Wang1-0/+2
Condition [ Upstream commit b5109b60ee4fcb2f2bb24f589575e10cc5283ad4 ] In the ether3_probe function, a timer is initialized with a callback function ether3_ledoff, bound to &prev(dev)->timer. Once the timer is started, there is a risk of a race condition if the module or device is removed, triggering the ether3_remove function to perform cleanup. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | ether3_ledoff ether3_remove | free_netdev(dev); | put_devic | kfree(dev); | | ether3_outw(priv(dev)->regs.config2 |= CFG2_CTRLO, REG_CONFIG2); | // use dev Fix it by ensuring that the timer is canceled before proceeding with the cleanup in ether3_remove. Fixes: 6fd9c53f7186 ("net: seeq: Convert timers to use timer_setup()") Signed-off-by: Kaixin Wang <kxwang23@m.fudan.edu.cn> Link: https://patch.msgid.link/20240915144045.451-1-kxwang23@m.fudan.edu.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-04net: xilinx: axienet: Fix packet countingSean Anderson1-9/+14
[ Upstream commit 5a6caa2cfabb559309b5ce29ee7c8e9ce1a9a9df ] axienet_free_tx_chain returns the number of DMA descriptors it's handled. However, axienet_tx_poll treats the return as the number of packets. When scatter-gather SKBs are enabled, a single packet may use multiple DMA descriptors, which causes incorrect packet counts. Fix this by explicitly keepting track of the number of packets processed as separate from the DMA descriptors. Budget does not affect the number of Tx completions we can process for NAPI, so we use the ring size as the limit instead of budget. As we no longer return the number of descriptors processed to axienet_tx_poll, we now update tx_bd_ci in axienet_free_tx_chain. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20240913145156.2283067-1-sean.anderson@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>