kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2023-08-23	net/mlx5: Push devlink port PF/VF init/cleanup calls out of devlink_port_register/unregister()	Jiri Pirko	4	-20/+58
	In order to prepare for mlx5_esw_offloads_devlink_port_register/unregister() to be used for SFs as well, push out the PF/VF specific init/cleanup calls outside. Introduce mlx5_eswitch_load/unload_pf_vf_vport() and call them from there. Use these new helpers of PF/VF loading and make mlx5_eswitch_load/unload_vport() reusable for SFs. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-23	net/mlx5: Push out SF devlink port init and cleanup code to separate helpers	Jiri Pirko	1	-10/+51
	Similar to what was done for PFs/VFs, introduce devlink port init and cleanup helpers for SFs and manage the vport->dl_port pointer there. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-23	net/mlx5: Rework devlink port alloc/free into init/cleanup	Jiri Pirko	1	-22/+43
	In order to prepare the devlink port registration function to be common for PFs/VFs and SFs, change the existing devlink port allocation and free functions into PF/VF init and cleanup, so similar helpers could be later on introduced for SFs. Make the init/cleanup helpers responsible for setting/clearing the vport->dl_port pointer. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-23	igc: Fix the typo in the PTM Control macro	Sasha Neftin	1	-1/+1
	The IGC_PTM_CTRL_SHRT_CYC defines the time between two consecutive PTM requests. The bit resolution of this field is six bits. Bit five was missing from the mask. This patch corrects the typo in the IGC_PTM_CTRL_SHRT_CYC macro. Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://lore.kernel.org/r/20230821171721.2203572-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
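To illustrate why a single missing mask bit matters for a six-bit field, here is a minimal sketch in C. The macro names and the faulty mask value are hypothetical and are not taken from the igc driver; only the six-bit field width comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical six-bit field mask: all six low bits set (0b111111). */
#define DEMO_FIELD_MASK_FULL   0x3fu
/* Hypothetical faulty mask with one of the six bits cleared (0b101111). */
#define DEMO_FIELD_MASK_FAULTY 0x2fu

/* Keep only the bits of value that the given mask allows through. */
static inline uint32_t demo_field_prep(uint32_t value, uint32_t mask)
{
    return value & mask;
}
```

With the faulty mask, any value that relies on the cleared bit is silently changed before it reaches the register, which is why a one-character typo in a mask can alter hardware timing.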
2023-08-23	igb: Avoid starting unnecessary workqueues	Alessio Igor Bogani	1	-12/+12
	If ptp_clock_register() fails or CONFIG_PTP isn't enabled, avoid starting PTP related workqueues. In this way we can fix this: BUG: unable to handle page fault for address: ffffc9000440b6f8 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 100000067 P4D 100000067 PUD 1001e0067 PMD 107dc5067 PTE 0 Oops: 0000 [#1] PREEMPT SMP [...] Workqueue: events igb_ptp_overflow_check RIP: 0010:igb_rd32+0x1f/0x60 [...] Call Trace: igb_ptp_read_82580+0x20/0x50 timecounter_read+0x15/0x60 igb_ptp_overflow_check+0x1a/0x50 process_one_work+0x1cb/0x3c0 worker_thread+0x53/0x3f0 ? rescuer_thread+0x370/0x370 kthread+0x142/0x160 ? kthread_associate_blkcg+0xc0/0xc0 ret_from_fork+0x1f/0x30 Fixes: 1f6e8178d685 ("igb: Prevent dropped Tx timestamps via work items and interrupts.") Fixes: d339b1331616 ("igb: add PTP Hardware Clock code") Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230821171927.2203644-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
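The guard described in this commit can be sketched in plain C as follows. This is a simplified userspace model, not the igb code: every name is hypothetical, and registration failure is modelled as a NULL return only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical adapter state, loosely modelled on the commit description. */
struct demo_adapter {
    void *ptp_clock;            /* NULL when no clock was registered */
    bool overflow_work_started; /* stands in for the periodic PTP work */
};

/* Hypothetical stand-in for clock registration; returns NULL on failure. */
static void *demo_clock_register(bool succeed)
{
    static int clock_obj;
    return succeed ? &clock_obj : NULL;
}

/* Start PTP-related work only after the clock was registered successfully. */
static void demo_ptp_init(struct demo_adapter *a, bool register_ok)
{
    a->overflow_work_started = false;
    a->ptp_clock = demo_clock_register(register_ok);
    if (!a->ptp_clock)
        return; /* no clock: leave all PTP work unscheduled */
    a->overflow_work_started = true;
}
```

The design point is ordering: work that reads clock hardware is scheduled only on the success path, so it can never run against state that was never set up.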
2023-08-23	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue	Jakub Kicinski	5	-33/+14
	Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2023-08-21 (ice) This series contains updates to ice driver only. Jesse fixes an issue on calculating buffer size. Petr Oros reverts a commit that does not fully resolve VF reset issues and implements one that provides a fuller fix. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Fix NULL pointer deref during VF reset Revert "ice: Fix ice VF reset during iavf initialization" ice: fix receive buffer size miscalculation ==================== Link: https://lore.kernel.org/r/20230821171633.2203505-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-23	bnx2x: new flag for track HW resource allocation	Thinh Tran	4	-28/+44
	While injecting PCIe errors to the upstream PCIe switch of a BCM57810 NIC, system hangs/crashes were observed. After several calls to bnx2x_tx_timeout() complete, bnx2x_nic_unload() is called to free up HW resources and bnx2x_napi_disable() is called to release NAPI objects. Later, when the EEH driver calls bnx2x_io_slot_reset() to complete the recovery process, bnx2x attempts to disable NAPI again by calling bnx2x_napi_disable() and freeing resources which have already been freed, resulting in a hang or crash. Introduce a new flag to track the HW resource and NAPI allocation state, refactor duplicated code into a single function, check page pool allocation status before freeing, and reduce debug output when a TX timeout event occurs. Reviewed-by: Manish Chopra <manishc@marvell.com> Tested-by: Abdul Haleem <abdhalee@in.ibm.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Venkata Sai Duggi <venkata.sai.duggi@ibm.com> Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> Link: https://lore.kernel.org/r/20230818161443.708785-2-thinhtr@linux.vnet.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	sfc: allocate a big enough SKB for loopback selftest packet	Edward Cree	3	-3/+3
	Cited commits passed a size to alloc_skb that was only big enough for the actual packet contents, but the following skb_put + memcpy writes the whole struct efx_loopback_payload including leading and trailing padding bytes (which are then stripped off with skb_pull/skb_trim). This could cause an skb_over_panic, although in practice we get saved by kmalloc_size_roundup. Pass the entire size we use, instead of the size of the final packet. Reported-by: Andy Moreton <andy.moreton@amd.com> Fixes: cf60ed469629 ("sfc: use padding to fix alignment in loopback test") Fixes: 30c24dd87f3f ("sfc: siena: use padding to fix alignment in loopback test") Fixes: 1186c6b31ee1 ("sfc: falcon: use padding to fix alignment in loopback test") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230821180153.18652-1-edward.cree@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
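The sizing rule in this commit (allocate for everything that is written, then point past the padding) can be sketched in userspace C. The struct layout and all names below are hypothetical and much simpler than struct efx_loopback_payload; malloc() stands in for alloc_skb().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical payload: padding surrounds the bytes that go on the wire. */
struct demo_payload {
    char pad[2];      /* leading padding, stripped after the copy */
    char packet[12];  /* the actual packet contents */
    char trailer[2];  /* trailing padding, also stripped */
};

/*
 * Copy the whole struct into a freshly allocated buffer and report where
 * the packet bytes start. The buffer is sized for the full struct because
 * the memcpy writes the full struct, not just the packet field.
 * Returns NULL on allocation failure; the caller frees the buffer.
 */
static unsigned char *demo_build(const struct demo_payload *src,
                                 unsigned char **packet_out)
{
    unsigned char *buf = malloc(sizeof(*src));

    if (!buf)
        return NULL;
    memcpy(buf, src, sizeof(*src));
    *packet_out = buf + offsetof(struct demo_payload, packet);
    return buf;
}
```

Sizing the allocation by `sizeof(src->packet)` instead would make the memcpy overrun the buffer, which is the class of bug the commit describes.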
2023-08-22	Merge tag 'mlx5-updates-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux	Jakub Kicinski	12	-83/+182
	Saeed Mahameed says: ==================== mlx5-updates-2023-08-16 1) aRFS ethtool stats Improve aRFS observability by adding new set of counters. Each Rx ring will have this set of counters listed below. These counters are exposed through ethtool -S. 1.1) arfs_add: number of times a new rule has been created. 1.2) arfs_request_in: number of times a rule was requested to move from its current Rx ring to a new Rx ring (incremented on the destination Rx ring). 1.3) arfs_request_out: number of times a rule was requested to move out from its current Rx ring (incremented on source/current Rx ring). 1.4) arfs_expired: number of times a rule has been expired by the kernel and removed from HW. 1.5) arfs_err: number of times a rule creation or modification has failed. 2) Supporting inline WQE when possible in SW steering 3) Misc cleanups and fixups to net-next branch * tag 'mlx5-updates-2023-08-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5: Devcom, only use devcom after NULL check in mlx5_devcom_send_event() net/mlx5: DR, Supporting inline WQE when possible net/mlx5: Rename devlink port ops struct for PFs/VFs net/mlx5: Remove VPORT_UPLINK handling from devlink_port.c net/mlx5: Call mlx5_esw_offloads_rep_load/unload() for uplink port directly net/mlx5: Update dead links in Kconfig documentation net/mlx5: Remove health syndrome enum duplication net/mlx5: DR, Remove unneeded local variable net/mlx5: DR, Fix code indentation net/mlx5: IRQ, consolidate irq and affinity mask allocation net/mlx5e: Fix spelling mistake "Faided" -> "Failed" net/mlx5e: aRFS, Introduce ethtool stats net/mlx5e: aRFS, Warn if aRFS table does not exist for aRFS rule net/mlx5e: aRFS, Prevent repeated kernel rule migrations requests ==================== Link: https://lore.kernel.org/r/20230821175739.81188-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	net: ethernet: mtk_eth_soc: fix NULL pointer on hw reset	Daniel Golle	1	-2/+10
	When a hardware reset is triggered on devices not initializing WED the calls to mtk_wed_fe_reset and mtk_wed_fe_reset_complete dereference a pointer on uninitialized stack memory. Break out of both functions in case a hw_list entry is 0. Fixes: 08a764a7c51b ("net: ethernet: mtk_wed: add reset/reset_complete callbacks") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/5465c1609b464cc7407ae1530c40821dcdf9d3e6.1692634266.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
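The "break out when a hw_list entry is 0" guard can be sketched as a loop that stops at the first empty slot. This is a hypothetical userspace model; the names and the list layout are assumptions and do not mirror the mtk_wed data structures.

```c
#include <stddef.h>

/* Hypothetical per-device state. */
struct demo_hw {
    int resets;
};

/*
 * Walk a list of device pointers and reset each one, stopping at the
 * first NULL entry instead of dereferencing it. Returns the number of
 * devices that were actually touched.
 */
static int demo_reset_all(struct demo_hw *list[], size_t n)
{
    int count = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!list[i])
            break; /* uninitialized slot: nothing to reset from here on */
        list[i]->resets++;
        count++;
    }
    return count;
}
```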
2023-08-22	net: ethernet: ti: Remove unused declarations	Yue Haibing	2	-3/+0
	Commit e8609e69470f ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK") removed am65_cpsw_nuss_adjust_link() but not its declaration. Commit 84640e27f230 ("net: netcp: Add Keystone NetCP core ethernet driver") declared but never implemented netcp_device_find_module(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Roger Quadros <rogerq@kernel.org> Link: https://lore.kernel.org/r/20230821134029.40084-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	net: microchip: Remove unused declarations	Yue Haibing	3	-7/+0
	Commit 264a9c5c9dff ("net: sparx5: Remove unused GLAG handling in PGID") removed sparx5_pgid_alloc_glag() but not its declaration. Commit 27d293cceee5 ("net: microchip: sparx5: Add support for rule count by cookie") removed vcap_rule_iter() but not its declaration. Commit 8beef08f4618 ("net: microchip: sparx5: Adding initial VCAP API support") declared but never implemented vcap_api_set_client(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230821135556.43224-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	ionic: Remove unused declarations	Yue Haibing	3	-3/+0
	Commit fbfb8031533c ("ionic: Add hardware init and device commands") declared but never implemented ionic_q_rewind()/ionic_set_dma_mask(). Commit 969f84394604 ("ionic: sync the filters in the work task") declared but never implemented ionic_rx_filters_need_sync(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20230821134717.51936-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	net: mscc: ocelot: Remove unused declarations	Yue Haibing	2	-3/+0
	Commit 6c30384eb1de ("net: mscc: ocelot: register devlink ports") declared but never implemented ocelot_devlink_init() and ocelot_devlink_teardown(). Commit 2096805497e2 ("net: mscc: ocelot: automatically detect VCAP constants") declared but never implemented ocelot_detect_vcap_constants(). Commit 403ffc2c34de ("net: mscc: ocelot: add support for preemptible traffic classes") declared but never implemented ocelot_port_update_preemptible_tcs(). Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20230821130218.19096-1-yuehaibing@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-22	alx: fix OOB-read compiler warning	GONG, Ruiqi	1	-3/+2
	The following message shows up when compiling with W=1: In function ‘fortify_memcpy_chk’, inlined from ‘alx_get_ethtool_stats’ at drivers/net/ethernet/atheros/alx/ethtool.c:297:2: ./include/linux/fortify-string.h:592:4: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 592 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In order to get alx stats altogether, alx_get_ethtool_stats() reads beyond hw->stats.rx_ok. Fix this warning by directly copying hw->stats, and refactor the unnecessarily complicated BUILD_BUG_ON btw. Signed-off-by: GONG, Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20230821013218.1614265-1-gongruiqi@huaweicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
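A minimal sketch of the "copy the whole struct instead of reading past its first field" fix follows. The struct and names are hypothetical and far smaller than the alx statistics block; `_Static_assert` stands in for the kernel's BUILD_BUG_ON.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical statistics block made up only of 64-bit counters. */
struct demo_stats {
    uint64_t rx_ok;
    uint64_t rx_bcast;
    uint64_t tx_ok;
};

#define DEMO_NUM_STATS (sizeof(struct demo_stats) / sizeof(uint64_t))

/* Compile-time check that the struct really is a packed run of counters. */
_Static_assert(sizeof(struct demo_stats) == DEMO_NUM_STATS * sizeof(uint64_t),
               "demo_stats must contain only uint64_t counters");

/*
 * Copy every counter into out[]. The source of the memcpy is the struct
 * itself, so the copy length matches the object being read; using
 * &stats->rx_ok as the source would read beyond that single field.
 */
static void demo_get_stats(const struct demo_stats *stats, uint64_t *out)
{
    memcpy(out, stats, sizeof(*stats));
}
```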
2023-08-22	tg3: Use slab_build_skb() when needed	Kees Cook	1	-1/+4
	The tg3 driver will use kmalloc() under some conditions. Check the frag_size and use slab_build_skb() when frag_size is 0. Silences the warning introduced by commit ce098da1497c ("skbuff: Introduce slab_build_skb()"): Use slab_build_skb() instead ... tg3_poll_work+0x638/0xf90 [tg3] Fixes: ce098da1497c ("skbuff: Introduce slab_build_skb()") Reported-by: Fiona Ebner <f.ebner@proxmox.com> Closes: https://lore.kernel.org/all/1bd4cb9c-4eb8-3bdb-3e05-8689817242d1@proxmox.com Cc: Siva Reddy Kallam <siva.kallam@broadcom.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/20230818175417.never.273-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-21	Revert "pds_core: Fix some kernel-doc comments"	Jakub Kicinski	1	-2/+2
	This reverts commit cb39c35783f26892bb1a72b1115c94fa2e77f4c5. Patch was applied too hastily, the problem is already fixed in Alex's vfio tree: https://lore.kernel.org/all/20230821112237.105872b5.alex.williamson@redhat.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-21	net/mlx5: Devcom, only use devcom after NULL check in mlx5_devcom_send_event()	Li Zetao	1	-1/+2
	There is a warning reported by kernel test robot: smatch warnings: drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.c:264 mlx5_devcom_send_event() warn: variable dereferenced before IS_ERR check devcom (see line 259) The reason for the warning is that the pointer is used before the check. Put the assignment to comp after the devcom check to silence the warning. Fixes: 88d162b47981 ("net/mlx5: Devcom, Infrastructure changes") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Closes: https://lore.kernel.org/r/202308041028.AkXYDwJ6-lkp@intel.com/ Signed-off-by: Li Zetao <lizetao1@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: DR, Supporting inline WQE when possible	Itamar Gozlan	1	-13/+102
	In a WQE (Work Queue Entry), the two types of data segment memory are pointers and inline data, where inline data is passed directly as part of the WQE. For software steering, the maximal inline size should be less than 2*MLX5_SEND_WQE_BB, i.e., the potential data must fit with the required inline WQE headers. Two consecutive blocks (MLX5_SEND_WQE_BB) are not guaranteed to reside on the same memory page. Hence, writes to MLX5_SEND_WQE_BB should be done separately, i.e., each MLX5_SEND_WQE_BB should be obtained using the mlx5_wq_cyc_get_wqe macro. Signed-off-by: Itamar Gozlan <igozlan@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
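The "write each block separately" rule can be sketched with a small cyclic ring in userspace C. Everything here is hypothetical: the block size, the ring depth, and the lookup helper are illustrative stand-ins, and the sketch models non-contiguity through index wrap-around rather than through page boundaries.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BB          64 /* hypothetical basic block size in bytes */
#define DEMO_RING_BLOCKS 4  /* hypothetical ring depth */

struct demo_ring {
    unsigned char blocks[DEMO_RING_BLOCKS][DEMO_BB];
};

/* Look up one block by (wrapping) index; loosely analogous to a per-block getter. */
static unsigned char *demo_get_block(struct demo_ring *r, unsigned int idx)
{
    return r->blocks[idx % DEMO_RING_BLOCKS];
}

/*
 * Copy len bytes starting at block index start. Each block is fetched on
 * its own, so the copy never assumes that block idx + 1 directly follows
 * block idx in memory.
 */
static void demo_write(struct demo_ring *r, unsigned int start,
                       const unsigned char *data, size_t len)
{
    unsigned int idx = start;

    while (len > 0) {
        size_t chunk = len < DEMO_BB ? len : DEMO_BB;

        memcpy(demo_get_block(r, idx), data, chunk);
        data += chunk;
        len -= chunk;
        idx++;
    }
}
```

A single memcpy of 100 bytes starting at the last block would run off the end of the ring; splitting per block lets the second chunk land in block 0 instead.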
2023-08-21	net/mlx5: Rename devlink port ops struct for PFs/VFs	Jiri Pirko	1	-2/+2
	As this struct is only used for devlink ports created for PF/VF, add it to the name of the variable to distinguish from the SF related ops struct. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: Remove VPORT_UPLINK handling from devlink_port.c	Jiri Pirko	1	-10/+2
	It is not possible that the functions in devlink_port.c are called for uplink port. Remove this leftover code. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: Call mlx5_esw_offloads_rep_load/unload() for uplink port directly	Jiri Pirko	1	-12/+8
	For uplink port, mlx5_esw_offloads_load/unload_rep() are currently called. There are 2 checks inside, which effectively make the functions simple wrappers of mlx5_esw_offloads_rep_load/unload() for uplink port. So avoid one check and indirection and call mlx5_esw_offloads_rep_load/unload() for uplink port directly. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: Remove health syndrome enum duplication	Gal Pressman	1	-25/+11
	Health syndrome enum values were duplicated in mlx5_ifc and health.c, the correct place for them is mlx5_ifc. Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: DR, Remove unneeded local variable	Yevgeny Kliteynik	1	-1/+0
	Remove local variable that is already defined outside of the scope of this block. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: DR, Fix code indentation	Yevgeny Kliteynik	1	-1/+1
	Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5: IRQ, consolidate irq and affinity mask allocation	Saeed Mahameed	1	-8/+6
	Consolidate the mlx5_irq and mlx5_irq->mask allocation, to simplify error flows and to match the deallocation sequence @irq_release for symmetry. Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com>
2023-08-21	net/mlx5e: Fix spelling mistake "Faided" -> "Failed"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a warning message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5e: aRFS, Introduce ethtool stats	Adham Faris	3	-8/+44
	Improve aRFS observability by adding new set of counters. Each Rx ring will have this set of counters listed below. These counters are exposed through ethtool -S. 1) arfs_add: number of times a new rule has been created. 2) arfs_request_in: number of times a rule was requested to move from its current Rx ring to a new Rx ring (incremented on the destination Rx ring). 3) arfs_request_out: number of times a rule was requested to move out from its current Rx ring (incremented on source/current Rx ring). 4) arfs_expired: number of times a rule has been expired by the kernel and removed from HW. 5) arfs_err: number of times a rule creation or modification has failed. This patch removes rx[i]_xsk_arfs_err counter and its documentation in mlx5/counters.rst since aRFS activity does not occur in XSK RQ's. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com>
2023-08-21	net/mlx5e: aRFS, Warn if aRFS table does not exist for aRFS rule	Adham Faris	1	-0/+2
	aRFS tables should be allocated and exist in advance. Driver shouldn't reach a point where it tries to add aRFS rule to table that does not exist. Add warning if driver encounters such situation. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-08-21	net/mlx5e: aRFS, Prevent repeated kernel rule migrations requests	Adham Faris	1	-1/+1
	aRFS rule movement requests from one Rx ring to another Rx ring arrive from the kernel to ensure that packets are steered to the right Rx ring. In the time interval until satisfying such a request, several more requests might follow, for the same flow. This patch detects and prevents repeated aRFS rules movement requests. In mlx5e_rx_flow_steer() ndo, after finding the aRFS rule that has been requested to move by the kernel, check if it's already requested to move by calling work_busy(&arfs_rule->arfs_work) handler. IOW, if this request is pending to be executed (in the work queue) or it's executing now but hasn't finished yet, return current filter ID and don't issue a new transition work. Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
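The deduplication logic described above can be sketched as a small state machine in userspace C. All names are hypothetical; the state enum is a stand-in for what work_busy() reports about a work item, and the sketch does not reproduce mlx5e_rx_flow_steer().

```c
#include <stdbool.h>

/* Hypothetical work item states, modelling pending and running work. */
enum demo_work_state { DEMO_IDLE, DEMO_PENDING, DEMO_RUNNING };

/* Hypothetical steering rule with an attached migration work item. */
struct demo_rule {
    int filter_id;
    int rxq;                    /* Rx ring the rule currently targets */
    enum demo_work_state state; /* state of the migration work */
    int queued;                 /* how many times work was queued */
};

/* True while the work is queued or executing. */
static bool demo_work_busy(const struct demo_rule *r)
{
    return r->state != DEMO_IDLE;
}

/*
 * Handle a request to move the rule to new_rxq. If a migration is already
 * pending or running, return the current filter ID without queueing again.
 */
static int demo_flow_steer(struct demo_rule *r, int new_rxq)
{
    if (r->rxq == new_rxq || demo_work_busy(r))
        return r->filter_id;

    r->rxq = new_rxq;
    r->state = DEMO_PENDING;
    r->queued++;
    return r->filter_id;
}
```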
2023-08-21	ice: Fix NULL pointer deref during VF reset	Petr Oros	1	-7/+8
	During a stress test that attached and detached a VF from KVM while simultaneously changing the VF's spoofcheck and trust settings, there was a NULL pointer dereference in ice_reset_vf() because the VF's VSI was NULL. More than one instance of ice_reset_vf() can be running at a given time. When we rebuild the VSI in ice_reset_vf(), another reset can be triggered from ice_service_task. In this case we can access the currently uninitialized VSI and cause a panic. The window for this race condition has been around for a long time but it's much worse after commit 227bf4500aaa ("ice: move VSI delete outside deconfig") because the reset runs faster. ice_reset_vf() uses vf->cfg_lock, and taking this lock before accessing the VF VSI fixes the BUG in all cases. The panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes in ice_vsi_stop_all_rx_rings(). With our reproducer, we can hit the BUG: ~8h before commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). ~20m after commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). After this fix we are not able to reproduce it after ~48h. There was commit cf90b74341ee ("ice: Fix call trace with null VSI during VF reset") which also tried to fix this issue, but it was only partially resolved and the bug still exists. [ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 6420.665382] #PF: supervisor read access in kernel mode [ 6420.670521] #PF: error_code(0x0000) - not-present page [ 6420.675659] PGD 0 [ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1 [ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022 [ 6420.698729] Workqueue: ice ice_service_task [ice] [ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice] [ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted [ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83 [ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized [ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246 [ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000 [ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828 [ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000 [ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0 [ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff [ 6420.783742] FS: 0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000 [ 6420.791829] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0 [ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002) [ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6420.811841] PKRU: 55555554 [ 6420.811842] Call Trace: [ 6420.811843] <TASK> [ 6420.811844] ice_reset_vf+0x9a/0x450 [ice] [ 6420.811876] ice_process_vflr_event+0x8f/0xc0 [ice] [ 6420.841343] ice_service_task+0x23b/0x600 [ice] [ 6420.845884] ? __schedule+0x212/0x550 [ 6420.849550] process_one_work+0x1e2/0x3b0 [ 6420.853563] ? rescuer_thread+0x390/0x390 [ 6420.857577] worker_thread+0x50/0x3a0 [ 6420.861242] ? rescuer_thread+0x390/0x390 [ 6420.865253] kthread+0xdd/0x100 [ 6420.868400] ? kthread_complete_and_exit+0x20/0x20 [ 6420.873194] ret_from_fork+0x1f/0x30 [ 6420.876774] </TASK> [ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk r Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Fixes: f23df5220d2b ("ice: Fix spurious interrupt during removal of trusted VF") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-21	Revert "ice: Fix ice VF reset during iavf initialization"	Petr Oros	4	-25/+4
	This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc. After this commit we are not able to attach VF to VM: virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52 error: Failed to attach interface error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable ice_check_vf_ready_for_cfg() already contains waiting for reset. The new condition in ice_check_vf_ready_for_reset() only causes problems. Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-21	ice: fix receive buffer size miscalculation	Jesse Brandeburg	1	-1/+2
	The driver is misconfiguring the hardware for some values of MTU such that it could use multiple descriptors to receive a packet when it could have simply used one. Change the driver to use a round-up instead of the result of a shift, as the shift can truncate the lower bits of the size, and result in the problem noted above. It also aligns this driver with similar code in i40e. The insidiousness of this problem is that everything works with the wrong size, it's just not working as well as it could, as some MTU sizes end up using two or more descriptors, and there is no way to tell that is happening without looking at ice_trace or a bus analyzer. Fixes: efc2214b6047 ("ice: Add support for XDP") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
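The difference between a plain shift and a round-up is easy to see with numbers. The sketch below assumes the hardware expresses buffer sizes in 128-byte units (a shift of 7); that unit size and all names are assumptions made for illustration, not values quoted from the ice driver.

```c
#include <stdint.h>

#define DEMO_DBUF_SHIFT 7u /* assumed: buffer size is programmed in 128-byte units */

/* Truncating conversion: drops the low bits of the byte count. */
static uint32_t demo_units_by_shift(uint32_t bytes)
{
    return bytes >> DEMO_DBUF_SHIFT;
}

/* Rounding conversion: any remainder bumps the result to the next unit. */
static uint32_t demo_units_by_round_up(uint32_t bytes)
{
    return (bytes + (1u << DEMO_DBUF_SHIFT) - 1u) >> DEMO_DBUF_SHIFT;
}
```

For a 1500-byte frame the shift yields 11 units, which is 1408 bytes and too small for the frame, while the round-up yields 12 units (1536 bytes). Under this assumption a truncated size forces a second descriptor for a packet that would otherwise fit in one.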
2023-08-21	pds_core: Fix some kernel-doc comments	Yang Li	1	-2/+2
	Fix some kernel-doc comments to silence the warnings: drivers/net/ethernet/amd/pds_core/auxbus.c:18: warning: Function parameter or member 'pf' not described in 'pds_client_register' drivers/net/ethernet/amd/pds_core/auxbus.c:18: warning: Excess function parameter 'pf_pdev' description in 'pds_client_register' drivers/net/ethernet/amd/pds_core/auxbus.c:58: warning: Function parameter or member 'pf' not described in 'pds_client_unregister' drivers/net/ethernet/amd/pds_core/auxbus.c:58: warning: Excess function parameter 'pf_pdev' description in 'pds_client_unregister' Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20	net: stmmac: Check more MAC HW features for XGMAC Core 3.20	Furong Xu	5	-8/+149
	1. XGMAC Core does not have hash_filter definition, it uses vlhash(VLAN Hash Filtering) instead, skip hash_filter when XGMAC. 2. Show exact size of Hash Table instead of raw register value. 3. Show full description of safety features defined by Synopsys Databook. 4. When safety feature is configured with no parity, or ECC only, keep FSM Parity Checking disabled. Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20	net: lan743x: Return PTR_ERR() for fixed_phy_register()	Ruan Jinjie	1	-1/+1
	fixed_phy_register() returns -EPROBE_DEFER, -EINVAL and -EBUSY, etc, in addition to -EIO. The best practice is to return these error codes with PTR_ERR(). Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20	net: bcmgenet: Return PTR_ERR() for fixed_phy_register()	Ruan Jinjie	1	-1/+1
	fixed_phy_register() returns -EPROBE_DEFER, -EINVAL and -EBUSY, etc, in addition to -ENODEV. The best practice is to return these error codes with PTR_ERR(). Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Acked-by: Doug Berger <opendmb@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20	net: bgmac: Return PTR_ERR() for fixed_phy_register()	Ruan Jinjie	1	-1/+1
	fixed_phy_register() returns -EPROBE_DEFER, -EINVAL and -EBUSY, etc, in addition to -ENODEV. The best practice is to return these error codes with PTR_ERR(). Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
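The three PTR_ERR() commits above share one idea: propagate the error code carried in the pointer instead of replacing it with a fixed one. Below is a userspace sketch of that pattern. The demo_* helpers imitate the kernel's ERR_PTR()/PTR_ERR()/IS_ERR() but are hypothetical reimplementations, and demo_register() is an invented stand-in for a registration function.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the errno value carried by an error pointer. */
static inline long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer lies in the top DEMO_MAX_ERRNO addresses. */
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-(long)DEMO_MAX_ERRNO);
}

/* Hypothetical registration: fails with -fail_with, or succeeds when it is 0. */
static void *demo_register(int fail_with)
{
    static int phy_obj;
    return fail_with ? demo_err_ptr(-fail_with) : (void *)&phy_obj;
}

/* Old style: every failure collapses into one fixed error code. */
static int demo_probe_fixed(int fail_with)
{
    void *phy = demo_register(fail_with);

    if (demo_is_err(phy))
        return -ENODEV;
    return 0;
}

/* New style: the caller sees the real reason for the failure. */
static int demo_probe_ptr_err(int fail_with)
{
    void *phy = demo_register(fail_with);

    if (demo_is_err(phy))
        return (int)demo_ptr_err(phy);
    return 0;
}
```

Preserving the original code matters most for values that callers act on; in the kernel, for example, a deferred-probe error tells the driver core to retry later, which a blanket -ENODEV would hide.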
2023-08-20	ipv4: fix data-races around inet->inet_id	Eric Dumazet	1	-1/+1
	UDP sendmsg() is lockless, so ip_select_ident_segs() can very well be run from multiple cpus [1]
	Convert inet->inet_id to an atomic_t, but implement a dedicated path for TCP, avoiding cost of a locked instruction (atomic_add_return())
	Note that this patch will cause a trivial merge conflict because we added inet->flags in net-next tree.
	v2: added missing change in drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_cm.c (David Ahern)
	[1] BUG: KCSAN: data-race in __ip_make_skb / __ip_make_skb
	read-write to 0xffff888145af952a of 2 bytes by task 7803 on cpu 1:
	ip_select_ident_segs include/net/ip.h:542 [inline]
	ip_select_ident include/net/ip.h:556 [inline]
	__ip_make_skb+0x844/0xc70 net/ipv4/ip_output.c:1446
	ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560
	udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260
	inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830
	sock_sendmsg_nosec net/socket.c:725 [inline]
	sock_sendmsg net/socket.c:748 [inline]
	____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494
	___sys_sendmsg net/socket.c:2548 [inline]
	__sys_sendmmsg+0x269/0x500 net/socket.c:2634
	__do_sys_sendmmsg net/socket.c:2663 [inline]
	__se_sys_sendmmsg net/socket.c:2660 [inline]
	__x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660
	do_syscall_x64 arch/x86/entry/common.c:50 [inline]
	do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
	entry_SYSCALL_64_after_hwframe+0x63/0xcd
	read to 0xffff888145af952a of 2 bytes by task 7804 on cpu 0:
	ip_select_ident_segs include/net/ip.h:541 [inline]
	ip_select_ident include/net/ip.h:556 [inline]
	__ip_make_skb+0x817/0xc70 net/ipv4/ip_output.c:1446
	ip_make_skb+0x233/0x2c0 net/ipv4/ip_output.c:1560
	udp_sendmsg+0x1199/0x1250 net/ipv4/udp.c:1260
	inet_sendmsg+0x63/0x80 net/ipv4/af_inet.c:830
	sock_sendmsg_nosec net/socket.c:725 [inline]
	sock_sendmsg net/socket.c:748 [inline]
	____sys_sendmsg+0x37c/0x4d0 net/socket.c:2494
	___sys_sendmsg net/socket.c:2548 [inline]
	__sys_sendmmsg+0x269/0x500 net/socket.c:2634
	__do_sys_sendmmsg net/socket.c:2663 [inline]
	__se_sys_sendmmsg net/socket.c:2660 [inline]
	__x64_sys_sendmmsg+0x57/0x60 net/socket.c:2660
	do_syscall_x64 arch/x86/entry/common.c:50 [inline]
	do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
	entry_SYSCALL_64_after_hwframe+0x63/0xcd
	value changed: 0x184d -> 0x184e
	Reported by Kernel Concurrency Sanitizer on:
	CPU: 0 PID: 7804 Comm: syz-executor.1 Not tainted 6.5.0-rc6-syzkaller #0
	Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
	==================================================================
	Fixes: 23f57406b82d ("ipv4: avoid using shared IP generator for connected sockets")
	Reported-by: syzbot <syzkaller@googlegroups.com>
	Signed-off-by: Eric Dumazet <edumazet@google.com>
	Reviewed-by: David Ahern <dsahern@kernel.org>
	Signed-off-by: David S. Miller <davem@davemloft.net>
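	The idea in the inet->inet_id commit message above can be sketched in userspace with C11 atomics. This is an illustration under stated assumptions, not the kernel code: `struct demo_sock`, `demo_select_ident_lockless()` and `demo_select_ident_locked()` are hypothetical names, and `<stdatomic.h>` stands in for the kernel's atomic_t API. The lockless variant uses an atomic read-modify-write so concurrent senders never hand out the same ID; the locked variant assumes the caller already serializes access (as the message says TCP does) and so can use a plain load followed by a store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-socket ID counter (inet->inet_id). */
struct demo_sock {
	atomic_uint id;
};

/*
 * Lockless path (UDP-like): several threads may call this at once.
 * atomic_fetch_add reserves 'segs' consecutive IDs in one indivisible
 * step and returns the first of them.
 */
static uint16_t demo_select_ident_lockless(struct demo_sock *sk,
					   unsigned int segs)
{
	assert(segs > 0); /* sketch precondition: reserve at least one ID */
	return (uint16_t)atomic_fetch_add_explicit(&sk->id, segs,
						   memory_order_relaxed);
}

/*
 * Locked path (TCP-like): the caller is assumed to hold a lock that
 * excludes all other writers, so a separate load and store is enough
 * and the cost of a locked read-modify-write instruction is avoided.
 */
static uint16_t demo_select_ident_locked(struct demo_sock *sk,
					 unsigned int segs)
{
	unsigned int cur;

	assert(segs > 0); /* sketch precondition: reserve at least one ID */
	cur = atomic_load_explicit(&sk->id, memory_order_relaxed);
	atomic_store_explicit(&sk->id, cur + segs, memory_order_relaxed);
	return (uint16_t)cur;
}
```

	Calling the locked variant without external serialization would reintroduce the lost-update race that the KCSAN report describes, so it is only appropriate where a lock is already held.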
2023-08-20	net: bcmgenet: Fix return value check for fixed_phy_register()	Ruan Jinjie	1	-1/+1
	The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: b0ba512e25d7 ("net: bcmgenet: enable driver to work without a device tree") Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Doug Berger <opendmb@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-20	net: bgmac: Fix return value check for fixed_phy_register()	Ruan Jinjie	1	-1/+1
	The fixed_phy_register() function returns error pointers and never returns NULL. Update the checks accordingly. Fixes: c25b23b8a387 ("bgmac: register fixed PHY for ARM BCM470X / BCM5301X chipsets") Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
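	The two fixed_phy_register() fixes above hinge on the kernel's error-pointer convention. A minimal userspace sketch of why a NULL check misses such errors follows. The ERR_PTR()/IS_ERR()/PTR_ERR() definitions here are simplified stand-ins modeled on include/linux/err.h, and `struct demo_phy` and `mock_fixed_phy_register()` are hypothetical names for illustration only, not the real driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel helpers: an errno value
 * is encoded as a pointer in the last 4095 addresses of the address space. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the PHY device type. */
struct demo_phy { int id; };

static struct demo_phy demo_phy_instance = { .id = 1 };

/* Hypothetical mock with the same contract the commit message describes:
 * returns a valid pointer or an error pointer, never NULL. */
static struct demo_phy *mock_fixed_phy_register(int fail)
{
	if (fail)
		return ERR_PTR(-ENODEV);
	return &demo_phy_instance;
}

/* Buggy pattern: an error pointer is not NULL, so the failure goes unnoticed. */
static int probe_with_null_check(int fail)
{
	struct demo_phy *phydev = mock_fixed_phy_register(fail);

	if (!phydev)
		return -ENODEV;
	return 0; /* real code would now dereference an error pointer */
}

/* Fixed pattern: test with IS_ERR() and propagate the encoded errno. */
static int probe_with_is_err_check(int fail)
{
	struct demo_phy *phydev = mock_fixed_phy_register(fail);

	if (IS_ERR(phydev))
		return (int)PTR_ERR(phydev);
	assert(phydev != NULL); /* contract: success is never NULL */
	return 0;
}
```

	With the mock forced to fail, probe_with_null_check() reports success while probe_with_is_err_check() returns -ENODEV.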
2023-08-20	net/mlx5: Add RoCE MACsec steering infrastructure in core	Patrisious Haddad	2	-9/+391
	Add all the core steering helper functions needed to set up RoCE steering rules, covering both RX and TX rule addition and deletion. Also export the functions so they are ready to use from the IB driver, where we expose functions to allow deletion of all rules, which is needed when a GID is deleted, or deletion of a specific rule when an SA is deleted, and similarly for rule addition. These functions are used in a later patch by the IB driver to trigger rule addition/deletion when needed. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5: Configure MACsec steering for ingress RoCEv2 traffic	Patrisious Haddad	2	-9/+352
	Add steering tables/rules to check if the decrypted traffic is RoCEv2, if so copy reg_b_metadata to a temp reg and forward it to RDMA_RX domain. The rules are added once the MACsec device is assigned an IP address where we verify that the packet ip is for MACsec device and that the temp reg has MACsec operation and a valid SCI inside. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5: Configure MACsec steering for egress RoCEv2 traffic	Patrisious Haddad	1	-1/+45
	Add steering table in RDMA_TX domain, to forward MACsec traffic to MACsec crypto table in NIC domain. The tables are created in a lazy manner when the first TX SA is being created, and destroyed upon the destruction of the last SA. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5: Add MACsec priorities in RDMA namespaces	Patrisious Haddad	1	-3/+32
	Add MACsec flow steering priorities in RDMA namespaces. This allows adding tables/rules to forward RoCEv2 traffic to the MACsec crypto tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	RDMA/mlx5: Implement MACsec gid addition and deletion	Patrisious Haddad	2	-33/+0
	Handle MACsec IP ambiguity issue, since mlx5 hw can't support programming both the MACsec and the physical gid when they have the same IP address, because it wouldn't know to whom to steer the traffic. Hence in such case we delete the physical gid from the hw gid table, which would then cause all traffic sent over it to fail, and we'll only be able to send traffic over the MACsec gid. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5: Maintain fs_id xarray per MACsec device inside macsec steering	Patrisious Haddad	3	-94/+271
	Remove fs_id from the MACsec SA, since it has no real usage there, and instead maintain it with the MACsec steering data inside the core. Downstream patches require this change to facilitate IB driver access to the fs_ids and to avoid a RoCE MACsec dependency on the EN driver. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5: Remove netdevice from MACsec steering	Patrisious Haddad	3	-77/+71
	Since MACsec steering was moved from ethernet private code to core, remove the netdevice from the MACsec steering, and use core device methods for error reporting instead. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5e: Move MACsec flow steering and statistics database from ethernet to core	Patrisious Haddad	5	-25/+20
	Since now MACsec flow steering (macsec_fs) and MACsec statistics (stats) are maintained by the core driver, move their data as well to be saved inside core structures instead of staying part of ethernet MACsec database. In addition cleanup all MACsec stats functions from the ethernet MACsec code and move what's needed to be part of macsec_fs instead. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-08-20	net/mlx5e: Rename MACsec flow steering functions/parameters to suit core naming style	Patrisious Haddad	5	-119/+119
	Rename MACsec flow steering (macsec_fs) functions and parameters from ethernet (core/en_accel) naming convention to core naming convention. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>