summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)AuthorFilesLines
2024-05-02net/mlx5e: Advertise mlx5 ethernet driver updates sk_buff md_dst for MACsecRahul Rameshbabu1-0/+1
commit 39d26a8f2efcb8b5665fe7d54a7dba306a8f1dff upstream. mlx5 Rx flow steering and CQE handling enable the driver to be able to update an skb's md_dst attribute as MACsec when MACsec traffic arrives when a device is configured for offloading. Advertise this to the core stack to take advantage of this capability. Cc: stable@vger.kernel.org Fixes: b7c9400cbc48 ("net/mlx5e: Implement MACsec Rx data path using MACsec skb_metadata_dst") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240423181319.115860-5-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02macsec: Detect if Rx skb is macsec-related for offloading devices that ↵Rahul Rameshbabu1-7/+37
update md_dst commit 642c984dd0e37dbaec9f87bd1211e5fac1f142bf upstream. Can now correctly identify where the packets should be delivered by using md_dst or its absence on devices that provide it. This detection is not possible without device drivers that update md_dst. A fallback pattern should be used for supporting such device drivers. This fallback mode causes multicast messages to be cloned to both the non-macsec and macsec ports, independent of whether the multicast message received was encrypted over MACsec or not. Other non-macsec traffic may also fail to be handled correctly for devices in promiscuous mode. Link: https://lore.kernel.org/netdev/ZULRxX9eIbFiVi7v@hog/ Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: stable@vger.kernel.org Fixes: 860ead89b851 ("net/macsec: Add MACsec skb_metadata_dst Rx Data path support") Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Reviewed-by: Benjamin Poirier <bpoirier@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/20240423181319.115860-4-rrameshbabu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02i2c: smbus: fix NULL function pointer dereferenceWolfram Sang1-6/+6
[ Upstream commit 91811a31b68d3765b3065f4bb6d7d6d84a7cfc9f ] Baruch reported an OOPS when using the designware controller as target only. Target-only modes break the assumption of one transfer function always being available. Fix this by always checking the pointer in __i2c_transfer. Reported-by: Baruch Siach <baruch@tkos.co.il> Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il Fixes: 4b1acc43331d ("i2c: core changes for slave support") [wsa: dropped the simplification in core-smbus to avoid theoretical regressions] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Tested-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregisteredHans de Goede1-11/+12
[ Upstream commit bf6e4ee5c43690e4c5a8a057bbcd4ff986bed052 ] The power_supply frame-work is not really designed for there to be long living in kernel references to power_supply devices. Specifically unregistering a power_supply while some other code has a reference to it triggers a WARN in power_supply_unregister(): WARN_ON(atomic_dec_return(&psy->use_cnt)); Folllowed by the power_supply still getting removed and the backing data freed anyway, leaving the tusb1210 charger-detect code with a dangling reference, resulting in a crash the next time tusb1210_get_online() is called. Fix this by only holding the reference in tusb1210_get_online() freeing it at the end of the function. Note this still leaves a theoretical race window, but it avoids the issue when manually rmmod-ing the charger chip driver during development. Fixes: 48969a5623ed ("phy: ti: tusb1210: Add charger detection") Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20240406140821.18624-1-hdegoede@redhat.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02dmaengine: idxd: Fix oops during rmmod on single-CPU platformsFenghua Yu1-6/+3
[ Upstream commit f221033f5c24659dc6ad7e5cf18fb1b075f4a8be ] During the removal of the idxd driver, registered offline callback is invoked as part of the clean up process. However, on systems with only one CPU online, no valid target is available to migrate the perf context, resulting in a kernel oops: BUG: unable to handle page fault for address: 000000000002a2b8 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 1470e1067 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 20 Comm: cpuhp/0 Not tainted 6.8.0-rc6-dsa+ #57 Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023 RIP: 0010:mutex_lock+0x2e/0x50 ... Call Trace: <TASK> __die+0x24/0x70 page_fault_oops+0x82/0x160 do_user_addr_fault+0x65/0x6b0 __pfx___rdmsr_safe_on_cpu+0x10/0x10 exc_page_fault+0x7d/0x170 asm_exc_page_fault+0x26/0x30 mutex_lock+0x2e/0x50 mutex_lock+0x1e/0x50 perf_pmu_migrate_context+0x87/0x1f0 perf_event_cpu_offline+0x76/0x90 [idxd] cpuhp_invoke_callback+0xa2/0x4f0 __pfx_perf_event_cpu_offline+0x10/0x10 [idxd] cpuhp_thread_fun+0x98/0x150 smpboot_thread_fn+0x27/0x260 smpboot_thread_fn+0x1af/0x260 __pfx_smpboot_thread_fn+0x10/0x10 kthread+0x103/0x140 __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 <TASK> Fix the issue by preventing the migration of the perf context to an invalid target. Fixes: 81dd4d4d6178 ("dmaengine: idxd: Add IDXD performance monitor support") Reported-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20240313214031.1658045-1-fenghua.yu@intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02dma: xilinx_dpdma: Fix lockingSean Anderson1-3/+10
[ Upstream commit 244296cc3a155199a8b080d19e645d7d49081a38 ] There are several places where either chan->lock or chan->vchan.lock was not held. Add appropriate locking. This fixes lockdep warnings like [ 31.077578] ------------[ cut here ]------------ [ 31.077831] WARNING: CPU: 2 PID: 40 at drivers/dma/xilinx/xilinx_dpdma.c:834 xilinx_dpdma_chan_queue_transfer+0x274/0x5e0 [ 31.077953] Modules linked in: [ 31.078019] CPU: 2 PID: 40 Comm: kworker/u12:1 Not tainted 6.6.20+ #98 [ 31.078102] Hardware name: xlnx,zynqmp (DT) [ 31.078169] Workqueue: events_unbound deferred_probe_work_func [ 31.078272] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 31.078377] pc : xilinx_dpdma_chan_queue_transfer+0x274/0x5e0 [ 31.078473] lr : xilinx_dpdma_chan_queue_transfer+0x270/0x5e0 [ 31.078550] sp : ffffffc083bb2e10 [ 31.078590] x29: ffffffc083bb2e10 x28: 0000000000000000 x27: ffffff880165a168 [ 31.078754] x26: ffffff880164e920 x25: ffffff880164eab8 x24: ffffff880164d480 [ 31.078920] x23: ffffff880165a148 x22: ffffff880164e988 x21: 0000000000000000 [ 31.079132] x20: ffffffc082aa3000 x19: ffffff880164e880 x18: 0000000000000000 [ 31.079295] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 31.079453] x14: 0000000000000000 x13: ffffff8802263dc0 x12: 0000000000000001 [ 31.079613] x11: 0001ffc083bb2e34 x10: 0001ff880164e98f x9 : 0001ffc082aa3def [ 31.079824] x8 : 0001ffc082aa3dec x7 : 0000000000000000 x6 : 0000000000000516 [ 31.079982] x5 : ffffffc7f8d43000 x4 : ffffff88003c9c40 x3 : ffffffffffffffff [ 31.080147] x2 : ffffffc7f8d43000 x1 : 00000000000000c0 x0 : 0000000000000000 [ 31.080307] Call trace: [ 31.080340] xilinx_dpdma_chan_queue_transfer+0x274/0x5e0 [ 31.080518] xilinx_dpdma_issue_pending+0x11c/0x120 [ 31.080595] zynqmp_disp_layer_update+0x180/0x3ac [ 31.080712] zynqmp_dpsub_plane_atomic_update+0x11c/0x21c [ 31.080825] drm_atomic_helper_commit_planes+0x20c/0x684 [ 31.080951] drm_atomic_helper_commit_tail+0x5c/0xb0 [ 31.081139] commit_tail+0x234/0x294 [ 31.081246] drm_atomic_helper_commit+0x1f8/0x210 [ 31.081363] drm_atomic_commit+0x100/0x140 [ 31.081477] drm_client_modeset_commit_atomic+0x318/0x384 [ 31.081634] drm_client_modeset_commit_locked+0x8c/0x24c [ 31.081725] drm_client_modeset_commit+0x34/0x5c [ 31.081812] __drm_fb_helper_restore_fbdev_mode_unlocked+0x104/0x168 [ 31.081899] drm_fb_helper_set_par+0x50/0x70 [ 31.081971] fbcon_init+0x538/0xc48 [ 31.082047] visual_init+0x16c/0x23c [ 31.082207] do_bind_con_driver.isra.0+0x2d0/0x634 [ 31.082320] do_take_over_console+0x24c/0x33c [ 31.082429] do_fbcon_takeover+0xbc/0x1b0 [ 31.082503] fbcon_fb_registered+0x2d0/0x34c [ 31.082663] register_framebuffer+0x27c/0x38c [ 31.082767] __drm_fb_helper_initial_config_and_unlock+0x5c0/0x91c [ 31.082939] drm_fb_helper_initial_config+0x50/0x74 [ 31.083012] drm_fbdev_dma_client_hotplug+0xb8/0x108 [ 31.083115] drm_client_register+0xa0/0xf4 [ 31.083195] drm_fbdev_dma_setup+0xb0/0x1cc [ 31.083293] zynqmp_dpsub_drm_init+0x45c/0x4e0 [ 31.083431] zynqmp_dpsub_probe+0x444/0x5e0 [ 31.083616] platform_probe+0x8c/0x13c [ 31.083713] really_probe+0x258/0x59c [ 31.083793] __driver_probe_device+0xc4/0x224 [ 31.083878] driver_probe_device+0x70/0x1c0 [ 31.083961] __device_attach_driver+0x108/0x1e0 [ 31.084052] bus_for_each_drv+0x9c/0x100 [ 31.084125] __device_attach+0x100/0x298 [ 31.084207] device_initial_probe+0x14/0x20 [ 31.084292] bus_probe_device+0xd8/0xdc [ 31.084368] deferred_probe_work_func+0x11c/0x180 [ 31.084451] process_one_work+0x3ac/0x988 [ 31.084643] worker_thread+0x398/0x694 [ 31.084752] kthread+0x1bc/0x1c0 [ 31.084848] ret_from_fork+0x10/0x20 [ 31.084932] irq event stamp: 64549 [ 31.084970] hardirqs last enabled at (64548): [<ffffffc081adf35c>] _raw_spin_unlock_irqrestore+0x80/0x90 [ 31.085157] hardirqs last disabled at (64549): [<ffffffc081adf010>] _raw_spin_lock_irqsave+0xc0/0xdc [ 31.085277] softirqs last enabled at (64503): [<ffffffc08001071c>] __do_softirq+0x47c/0x500 [ 31.085390] softirqs last disabled at (64498): [<ffffffc080017134>] ____do_softirq+0x10/0x1c [ 31.085501] ---[ end trace 0000000000000000 ]--- Fixes: 7cbb0c63de3f ("dmaengine: xilinx: dpdma: Add the Xilinx DisplayPort DMA engine driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Link: https://lore.kernel.org/r/20240308210034.3634938-2-sean.anderson@linux.dev Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bitsSebastian Reichel1-3/+6
[ Upstream commit 55491a5fa163bf15158f34f3650b3985f25622b9 ] Currently the PCIe v3 PHY driver only sets the pcie1ln_sel bits, but does not clear them because of an incorrect write mask. This fixes up the issue by using a newly introduced constant for the write mask. While at it also introduces a proper GENMASK based constant for the PCIE30_PHY_MODE. Fixes: 2e9bffc4f713 ("phy: rockchip: Support PCIe v3") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240404-rk3588-pcie-bifurcation-fixes-v1-2-9907136eeafd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: rockchip-snps-pcie3: fix bifurcation on rk3588Michal Tomek1-16/+8
[ Upstream commit f8020dfb311d2b6cf657668792aaa5fa8863a7dd ] So far all RK3588 boards use fully aggregated PCIe. CM3588 is one of the few boards using this feature and apparently it is broken. The PHY offers the following mapping options: port 0 lane 0 - always mapped to controller 0 (4L) port 0 lane 1 - to controller 0 or 2 (1L0) port 1 lane 0 - to controller 0 or 1 (2L) port 1 lane 1 - to controller 0, 1 or 3 (1L1) The data-lanes DT property maps these as follows: 0 = no controller (unsupported by the HW) 1 = 4L 2 = 2L 3 = 1L0 4 = 1L1 That allows the following configurations with first column being the mainline data-lane mapping, second column being the downstream name, third column being PCIE3PHY_GRF_CMN_CON0 and PHP_GRF_PCIESEL register values and final column being the user visible lane setup: <1 1 1 1> = AGGREG = [4 0] = x4 (aggregation) <1 1 2 2> = NANBNB = [0 0] = x2 x2 (no bif.) <1 3 2 2> = NANBBI = [1 1] = x2 x1x1 (bif. of port 0) <1 1 2 4> = NABINB = [2 2] = x1x1 x2 (bif. of port 1) <1 3 2 4> = NABIBI = [3 3] = x1x1 x1x1 (bif. of both ports) The driver currently does not program PHP_GRF_PCIESEL correctly, which is fixed by this patch. As a side-effect the new logic is much simpler than the old logic. Fixes: 2e9bffc4f713 ("phy: rockchip: Support PCIe v3") Signed-off-by: Michal Tomek <mtdev79b@gmail.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Acked-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20240404-rk3588-pcie-bifurcation-fixes-v1-1-9907136eeafd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: freescale: imx8m-pcie: fix pcie link-up instabilityMarcel Ziswiler1-2/+4
[ Upstream commit 3a161017f1de55cc48be81f6156004c151f32677 ] Leaving AUX_PLL_REFCLK_SEL at its reset default of AUX_IN (PLL clock) proves to be more stable on the i.MX 8M Mini. Fixes: 1aa97b002258 ("phy: freescale: pcie: Initialize the imx8 pcie standalone phy driver") Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Link: https://lore.kernel.org/r/20240322130646.1016630-2-marcel@ziswiler.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: freescale: imx8m-pcie: Refine i.MX8MM PCIe PHY driverRichard Zhu1-40/+66
[ Upstream commit ca679c49c4463595499a053ba94328acb574fffa ] To make it more flexible and easy to expand. Refine i.MX8MM PCIe PHY driver. - Use gpr compatible string to avoid the codes duplications when add another platform PCIe PHY support. - Re-arrange the codes to let it more flexible and easy to expand. No functional change. Re-arrange the TX tuning, since internal registers can be wrote through APB interface before assertion of CMN_RST. Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Marek Vasut <marex@denx.de> Tested-by: Richard Leitner <richard.leitner@skidata.com> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://lore.kernel.org/r/1665625622-20551-4-git-send-email-hongxing.zhu@nxp.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Stable-dep-of: 3a161017f1de ("phy: freescale: imx8m-pcie: fix pcie link-up instability") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: marvell: a3700-comphy: Fix hardcoded array sizeMikhail Kobuk1-1/+1
[ Upstream commit 627207703b73615653eea5ab7a841d5b478d961e ] Replace hardcoded 'gbe_phy_init' array size by explicit one. Fixes: 934337080c6c ("phy: marvell: phy-mvebu-a3700-comphy: Add native kernel implementation") Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru> Link: https://lore.kernel.org/r/20240321164734.49273-2-m.kobuk@ispras.ru Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02phy: marvell: a3700-comphy: Fix out of bounds readMikhail Kobuk1-3/+4
[ Upstream commit e4308bc22b9d46cf33165c9dfaeebcf29cd56f04 ] There is an out of bounds read access of 'gbe_phy_init_fix[fix_idx].addr' every iteration after 'fix_idx' reaches 'ARRAY_SIZE(gbe_phy_init_fix)'. Make sure 'gbe_phy_init[addr]' is used when all elements of 'gbe_phy_init_fix' array are handled. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 934337080c6c ("phy: marvell: phy-mvebu-a3700-comphy: Add native kernel implementation") Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20240321164734.49273-1-m.kobuk@ispras.ru Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02idma64: Don't try to serve interrupts when device is powered offAndy Shevchenko1-0/+4
[ Upstream commit 9140ce47872bfd89fca888c2f992faa51d20c2bc ] When iDMA 64-bit device is powered off, the IRQ status register is all 1:s. This is never happen in real case and signalling that the device is simply powered off. Don't try to serve interrupts that are not ours. Fixes: 667dfed98615 ("dmaengine: add a driver for Intel integrated DMA 64-bit") Reported-by: Heiner Kallweit <hkallweit1@gmail.com> Closes: https://lore.kernel.org/r/700bbb84-90e1-4505-8ff0-3f17ea8bc631@gmail.com Tested-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240321120453.1360138-1-andriy.shevchenko@linux.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02dmaengine: tegra186: Fix residual calculationAkhil R1-0/+3
[ Upstream commit 30f0ced9971b2d8c8c24ae75786f9079489a012d ] The existing residual calculation returns an incorrect value when bytes_xfer == bytes_req. This scenario occurs particularly with drivers like UART where DMA is scheduled for maximum number of bytes and is terminated when the bytes inflow stops. At higher baud rates, it could request the tx_status while there is no bytes left to transfer. This will lead to incorrect residual being set. Hence return residual as '0' when bytes transferred equals to the bytes requested. Fixes: ee17028009d4 ("dmaengine: tegra: Add tegra gpcdma driver") Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://lore.kernel.org/r/20240315124411.17582-1-akhilrajeev@nvidia.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02dmaengine: owl: fix register access functionsArnd Bergmann1-2/+2
[ Upstream commit 43c633ef93a5d293c96ebcedb40130df13128428 ] When building with 'make W=1', clang notices that the computed register values are never actually written back but instead the wrong variable is set: drivers/dma/owl-dma.c:244:6: error: variable 'regval' set but not used [-Werror,-Wunused-but-set-variable] 244 | u32 regval; | ^ drivers/dma/owl-dma.c:268:6: error: variable 'regval' set but not used [-Werror,-Wunused-but-set-variable] 268 | u32 regval; | ^ Change these to what was most likely intended. Fixes: 47e20577c24d ("dmaengine: Add Actions Semi Owl family S900 DMA driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Peter Korsgaard <peter@korsgaard.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240322132116.906475-1-arnd@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mtd: diskonchip: work around ubsan link failureArnd Bergmann1-2/+2
commit 21c9fb611c25d5cd038f6fe485232e7884bb0b3d upstream. I ran into a randconfig build failure with UBSAN using gcc-13.2: arm-linux-gnueabi-ld: error: unplaced orphan section `.bss..Lubsan_data31' from `drivers/mtd/nand/raw/diskonchip.o' I'm not entirely sure what is going on here, but I suspect this has something to do with the check for the end of the doc_locations[] array that contains an (unsigned long)0xffffffff element, which is compared against the signed (int)0xffffffff. If this is the case, we should get a runtime check for undefined behavior, but we instead get an unexpected build-time error. I would have expected this to work fine on 32-bit architectures despite the signed integer overflow, though on 64-bit architectures this likely won't ever work. Changing the contition to instead check for the size of the array makes the code safe everywhere and avoids the ubsan check that leads to the link error. The loop code goes back to before 2.6.12. Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20240405143015.717429-1-arnd@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02fbdev: fix incorrect address computation in deferred IONam Cao1-1/+1
commit 78d9161d2bcd442d93d917339297ffa057dbee8c upstream. With deferred IO enabled, a page fault happens when data is written to the framebuffer device. Then driver determines which page is being updated by calculating the offset of the written virtual address within the virtual memory area, and uses this offset to get the updated page within the internal buffer. This page is later copied to hardware (thus the name "deferred IO"). This offset calculation is only correct if the virtual memory area is mapped to the beginning of the internal buffer. Otherwise this is wrong. For example, if users do: mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000); Then the virtual memory area will mapped at offset 0xff000 within the internal buffer. This offset 0xff000 is not accounted for, and wrong page is updated. Correct the calculation by using vmf->pgoff instead. With this change, the variable "offset" will no longer hold the exact offset value, but it is rounded down to multiples of PAGE_SIZE. But this is still correct, because this variable is only used to calculate the page offset. Reported-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Closes: https://lore.kernel.org/linux-fbdev/271372d6-e665-4e7f-b088-dee5f4ab341a@oracle.com Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct") Cc: <stable@vger.kernel.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240423115053.4490-1-namcao@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02net: b44: set pause params only when interface is upPeter Münster1-6/+8
commit e3eb7dd47bd4806f00e104eb6da092c435f9fb21 upstream. b44_free_rings() accesses b44::rx_buffers (and ::tx_buffers) unconditionally, but b44::rx_buffers is only valid when the device is up (they get allocated in b44_open(), and deallocated again in b44_close()), any other time these are just a NULL pointers. So if you try to change the pause params while the network interface is disabled/administratively down, everything explodes (which likely netifd tries to do). Link: https://github.com/openwrt/openwrt/issues/13789 Fixes: 1da177e4c3f4 (Linux-2.6.12-rc2) Cc: stable@vger.kernel.org Reported-by: Peter Münster <pm@a16n.net> Suggested-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Vaclav Svoboda <svoboda@neng.cz> Tested-by: Peter Münster <pm@a16n.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Peter Münster <pm@a16n.net> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/87y192oolj.fsf@a16n.net Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02ACPI: CPPC: Fix access width used for PCC registersVanshidhar Konda1-16/+37
commit f489c948028b69cea235d9c0de1cc10eeb26a172 upstream. commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses") modified cpc_read()/cpc_write() to use access_width to read CPC registers. However, for PCC registers the access width field in the ACPI register macro specifies the PCC subspace ID. For non-zero PCC subspace ID it is incorrectly treated as access width. This causes errors when reading from PCC registers in the CPPC driver. For PCC registers, base the size of read/write on the bit width field. The debug message in cpc_read()/cpc_write() is updated to print relevant information for the address space type used to read the register. Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses") Signed-off-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com> Tested-by: Jarred White <jarredwhite@linux.microsoft.com> Reviewed-by: Jarred White <jarredwhite@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Cc: 5.15+ <stable@vger.kernel.org> # 5.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02ACPI: CPPC: Fix bit_offset shift in MASK_VAL() macroJarred White1-2/+2
commit 05d92ee782eeb7b939bdd0189e6efcab9195bf95 upstream. Commit 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses") neglected to properly wrap the bit_offset shift when it comes to applying the mask. This may cause incorrect values to be read and may cause the cpufreq module not be loaded. [ 11.059751] cpu_capacity: CPU0 missing/invalid highest performance. [ 11.066005] cpu_capacity: partial information: fallback to 1024 for all CPUs Also, corrected the bitmask generation in GENMASK (extra bit being added). Fixes: 2f4a4d63a193 ("ACPI: CPPC: Use access_width over bit_width for system memory accesses") Signed-off-by: Jarred White <jarredwhite@linux.microsoft.com> Cc: 5.15+ <stable@vger.kernel.org> # 5.15+ Reviewed-by: Vanshidhar Konda <vanshikonda@os.amperecomputing.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02ACPI: CPPC: Use access_width over bit_width for system memory accessesJarred White1-5/+26
commit 2f4a4d63a193be6fd530d180bb13c3592052904c upstream. To align with ACPI 6.3+, since bit_width can be any 8-bit value, it cannot be depended on to be always on a clean 8b boundary. This was uncovered on the Cobalt 100 platform. SError Interrupt on CPU26, code 0xbe000011 -- SError CPU: 26 PID: 1510 Comm: systemd-udevd Not tainted 5.15.2.1-13 #1 Hardware name: MICROSOFT CORPORATION, BIOS MICROSOFT CORPORATION pstate: 62400009 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) pc : cppc_get_perf_caps+0xec/0x410 lr : cppc_get_perf_caps+0xe8/0x410 sp : ffff8000155ab730 x29: ffff8000155ab730 x28: ffff0080139d0038 x27: ffff0080139d0078 x26: 0000000000000000 x25: ffff0080139d0058 x24: 00000000ffffffff x23: ffff0080139d0298 x22: ffff0080139d0278 x21: 0000000000000000 x20: ffff00802b251910 x19: ffff0080139d0000 x18: ffffffffffffffff x17: 0000000000000000 x16: ffffdc7e111bad04 x15: ffff00802b251008 x14: ffffffffffffffff x13: ffff013f1fd63300 x12: 0000000000000006 x11: ffffdc7e128f4420 x10: 0000000000000000 x9 : ffffdc7e111badec x8 : ffff00802b251980 x7 : 0000000000000000 x6 : ffff0080139d0028 x5 : 0000000000000000 x4 : ffff0080139d0018 x3 : 00000000ffffffff x2 : 0000000000000008 x1 : ffff8000155ab7a0 x0 : 0000000000000000 Kernel panic - not syncing: Asynchronous SError Interrupt CPU: 26 PID: 1510 Comm: systemd-udevd Not tainted 5.15.2.1-13 #1 Hardware name: MICROSOFT CORPORATION, BIOS MICROSOFT CORPORATION Call trace: dump_backtrace+0x0/0x1e0 show_stack+0x24/0x30 dump_stack_lvl+0x8c/0xb8 dump_stack+0x18/0x34 panic+0x16c/0x384 add_taint+0x0/0xc0 arm64_serror_panic+0x7c/0x90 arm64_is_fatal_ras_serror+0x34/0xa4 do_serror+0x50/0x6c el1h_64_error_handler+0x40/0x74 el1h_64_error+0x7c/0x80 cppc_get_perf_caps+0xec/0x410 cppc_cpufreq_cpu_init+0x74/0x400 [cppc_cpufreq] cpufreq_online+0x2dc/0xa30 cpufreq_add_dev+0xc0/0xd4 subsys_interface_register+0x134/0x14c cpufreq_register_driver+0x1b0/0x354 cppc_cpufreq_init+0x1a8/0x1000 [cppc_cpufreq] do_one_initcall+0x50/0x250 do_init_module+0x60/0x27c load_module+0x2300/0x2570 __do_sys_finit_module+0xa8/0x114 __arm64_sys_finit_module+0x2c/0x3c invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0x180/0x1a0 do_el0_svc+0x84/0xa0 el0_svc+0x2c/0xc0 el0t_64_sync_handler+0xa4/0x12c el0t_64_sync+0x1a4/0x1a8 Instead, use access_width to determine the size and use the offset and width to shift and mask the bits to read/write out. Make sure to add a check for system memory since pcc redefines the access_width to subspace id. If access_width is not set, then fall back to using bit_width. Signed-off-by: Jarred White <jarredwhite@linux.microsoft.com> Reviewed-by: Easwar Hariharan <eahariha@linux.microsoft.com> Cc: 5.15+ <stable@vger.kernel.org> # 5.15+ [ rjw: Subject and changelog edits, comment adjustments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02irqchip/gic-v3-its: Prevent double free on errorGuanrui Huang1-7/+2
commit c26591afd33adce296c022e3480dea4282b7ef91 upstream. The error handling path in its_vpe_irq_domain_alloc() causes a double free when its_vpe_init() fails after successfully allocating at least one interrupt. This happens because its_vpe_irq_domain_free() frees the interrupts along with the area bitmap and the vprop_page and its_vpe_irq_domain_alloc() subsequently frees the area bitmap and the vprop_page again. Fix this by unconditionally invoking its_vpe_irq_domain_free() which handles all cases correctly and by removing the bitmap/vprop_page freeing from its_vpe_irq_domain_alloc(). [ tglx: Massaged change log ] Fixes: 7d75bbb4bc1a ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown") Signed-off-by: Guanrui Huang <guanrui.huang@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240418061053.96803-2-guanrui.huang@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02drm/amdgpu: Fix leak when GPU memory allocation failsMukul Joshi1-0/+1
commit 25e9227c6afd200bed6774c866980b8e36d033af upstream. Free the sync object if the memory allocation fails for any reason. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3Alex Deucher1-11/+15
commit 9792b7cc18aaa0c2acae6af5d0acf249bcb1ab0d upstream. This avoids a potential conflict with firmwares with the newer HDP flush mechanism. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02HID: i2c-hid: remove I2C_HID_READ_PENDING flag to prevent lock-upNam Cao1-9/+0
commit 9c0f59e47a90c54d0153f8ddc0f80d7a36207d0e upstream. The flag I2C_HID_READ_PENDING is used to serialize I2C operations. However, this is not necessary, because I2C core already has its own locking for that. More importantly, this flag can cause a lock-up: if the flag is set in i2c_hid_xfer() and an interrupt happens, the interrupt handler (i2c_hid_irq) will check this flag and return immediately without doing anything, then the interrupt handler will be invoked again in an infinite loop. Since interrupt handler is an RT task, it takes over the CPU and the flag-clearing task never gets scheduled, thus we have a lock-up. Delete this unnecessary flag. Reported-and-tested-by: Eva Kurchatova <nyandarknessgirl@gmail.com> Closes: https://lore.kernel.org/r/CA+eeCSPUDpUg76ZO8dszSbAGn+UHjcyv8F1J-CUPVARAzEtW9w@mail.gmail.com Fixes: 4a200c3b9a40 ("HID: i2c-hid: introduce HID over i2c specification implementation") Cc: <stable@vger.kernel.org> Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02mmc: sdhci-msm: pervent access to suspended controllerMantas Pucka1-1/+15
commit f8def10f73a516b771051a2f70f2f0446902cb4f upstream. Generic sdhci code registers LED device and uses host->runtime_suspended flag to protect access to it. The sdhci-msm driver doesn't set this flag, which causes a crash when LED is accessed while controller is runtime suspended. Fix this by setting the flag correctly. Cc: stable@vger.kernel.org Fixes: 67e6db113c90 ("mmc: sdhci-msm: Add pm_runtime and system PM support") Signed-off-by: Mantas Pucka <mantas@8devices.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240321-sdhci-mmc-suspend-v1-1-fbc555a64400@8devices.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02Bluetooth: qca: fix NULL-deref on non-serdev suspendJohan Hovold1-0/+3
commit 73e87c0a49fda31d7b589edccf4c72e924411371 upstream. Qualcomm ROME controllers can be registered from the Bluetooth line discipline and in this case the HCI UART serdev pointer is NULL. Add the missing sanity check to prevent a NULL-pointer dereference when wakeup() is called for a non-serdev controller during suspend. Just return true for now to restore the original behaviour and address the crash with pre-6.2 kernels, which do not have commit e9b3e5b8c657 ("Bluetooth: hci_qca: only assign wakeup with serial port support") that causes the crash to happen already at setup() time. Fixes: c1a74160eaf1 ("Bluetooth: hci_qca: Add device_may_wakeup support") Cc: stable@vger.kernel.org # 5.13 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0bda:0x4853WangYuli1-0/+2
commit d1a5a7eede2977da3d2002d5ea3b519019cc1a98 upstream. Add the support ID(0x0bda, 0x4853) to usb_device_id table for Realtek RTL8852BE. Without this change the device utilizes an obsolete version of the firmware that is encoded in it rather than the updated Realtek firmware and config files from the firmware directory. The latter files implement many new features. The device table is as follows: T: Bus=03 Lev=01 Prnt=01 Port=09 Cnt=03 Dev#= 4 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0bda ProdID=4853 Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms Cc: stable@vger.kernel.org Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: WangYuli <wangyuli@uniontech.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02virtio_net: Do not send RSS key if it is not supportedBreno Leitao1-4/+22
commit 059a49aa2e25c58f90b50151f109dd3c4cdb3a47 upstream. There is a bug when setting the RSS options in virtio_net that can break the whole machine, getting the kernel into an infinite loop. Running the following command in any QEMU virtual machine with virtionet will reproduce this problem: # ethtool -X eth0 hfunc toeplitz This is how the problem happens: 1) ethtool_set_rxfh() calls virtnet_set_rxfh() 2) virtnet_set_rxfh() calls virtnet_commit_rss_command() 3) virtnet_commit_rss_command() populates 4 entries for the rss scatter-gather 4) Since the command above does not have a key, then the last scatter-gatter entry will be zeroed, since rss_key_size == 0. sg_buf_size = vi->rss_key_size; 5) This buffer is passed to qemu, but qemu is not happy with a buffer with zero length, and do the following in virtqueue_map_desc() (QEMU function): if (!sz) { virtio_error(vdev, "virtio: zero sized buffers are not allowed"); 6) virtio_error() (also QEMU function) set the device as broken vdev->broken = true; 7) Qemu bails out, and do not repond this crazy kernel. 8) The kernel is waiting for the response to come back (function virtnet_send_command()) 9) The kernel is waiting doing the following : while (!virtqueue_get_buf(vi->cvq, &tmp) && !virtqueue_is_broken(vi->cvq)) cpu_relax(); 10) None of the following functions above is true, thus, the kernel loops here forever. Keeping in mind that virtqueue_is_broken() does not look at the qemu `vdev->broken`, so, it never realizes that the vitio is broken at QEMU side. Fix it by not sending RSS commands if the feature is not available in the device. Fixes: c7114b1249fa ("drivers/net/virtio_net: Added basic RSS support.") Cc: stable@vger.kernel.org Cc: qemu-devel@nongnu.org Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Heng Qi <hengqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Konstantin Ovsepian <ovs@ovs.to> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02net: ethernet: ti: am65-cpts: Fix PTPv1 message type on TX packetsJason Reeder1-0/+5
[ Upstream commit 1b9e743e923b256e353a9a644195372285e5a6c0 ] The CPTS, by design, captures the messageType (Sync, Delay_Req, etc.) field from the second nibble of the PTP header which is defined in the PTPv2 (1588-2008) specification. In the PTPv1 (1588-2002) specification the first two bytes of the PTP header are defined as the versionType which is always 0x0001. This means that any PTPv1 packets that are tagged for TX timestamping by the CPTS will have their messageType set to 0x0 which corresponds to a Sync message type. This causes issues when a PTPv1 stack is expecting a Delay_Req (messageType: 0x1) timestamp that never appears. Fix this by checking if the ptp_class of the timestamped TX packet is PTP_CLASS_V1 and then matching the PTP sequence ID to the stored sequence ID in the skb->cb data structure. If the sequence IDs match and the packet is of type PTPv1 then there is a chance that the messageType has been incorrectly stored by the CPTS so overwrite the messageType stored by the CPTS with the messageType from the skb->cb data structure. This allows the PTPv1 stack to receive TX timestamps for Delay_Req packets which are necessary to lock onto a PTP Leader. Signed-off-by: Jason Reeder <jreeder@ti.com> Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Tested-by: Ed Trexel <ed.trexel@hp.com> Fixes: f6bd59526ca5 ("net: ethernet: ti: introduce am654 common platform time sync driver") Link: https://lore.kernel.org/r/20240424071626.32558-1-r-gunasekaran@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02iavf: Fix TC config comparison with existing adapter TC configSudheer Mogilappagari1-1/+29
[ Upstream commit 54976cf58d6168b8d15cebb395069f23b2f34b31 ] Same number of TCs doesn't imply that underlying TC configs are same. The config could be different due to difference in number of queues in each TC. Add utility function to determine if TC configs are same. Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Mineri Bhange <minerix.bhange@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240423182723.740401-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02i40e: Report MFS in decimal base instead of hexErwan Velu1-2/+2
[ Upstream commit ef3c313119ea448c22da10366faa26b5b4b1a18e ] If the MFS is set below the default (0x2600), a warning message is reported like the following : MFS for port 1 has been set below the default: 600 This message is a bit confusing as the number shown here (600) is in fact an hexa number: 0x600 = 1536 Without any explicit "0x" prefix, this message is read like the MFS is set to 600 bytes. MFS, as per MTUs, are usually expressed in decimal base. This commit reports both current and default MFS values in decimal so it's less confusing for end-users. A typical warning message looks like the following : MFS for port 1 (1536) has been set below the default (9728) Signed-off-by: Erwan Velu <e.velu@criteo.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Fixes: 3a2c6ced90e1 ("i40e: Add a check to see if MFS is set") Link: https://lore.kernel.org/r/20240423182723.740401-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02i40e: Do not use WQ_MEM_RECLAIM flag for workqueueSindhu Devale1-1/+1
[ Upstream commit 2cc7d150550cc981aceedf008f5459193282425c ] Issue reported by customer during SRIOV testing, call trace: When both i40e and the i40iw driver are loaded, a warning in check_flush_dependency is being triggered. This seems to be because of the i40e driver workqueue is allocated with the WQ_MEM_RECLAIM flag, and the i40iw one is not. Similar error was encountered on ice too and it was fixed by removing the flag. Do the same for i40e too. [Feb 9 09:08] ------------[ cut here ]------------ [ +0.000004] workqueue: WQ_MEM_RECLAIM i40e:i40e_service_task [i40e] is flushing !WQ_MEM_RECLAIM infiniband:0x0 [ +0.000060] WARNING: CPU: 0 PID: 937 at kernel/workqueue.c:2966 check_flush_dependency+0x10b/0x120 [ +0.000007] Modules linked in: snd_seq_dummy snd_hrtimer snd_seq snd_timer snd_seq_device snd soundcore nls_utf8 cifs cifs_arc4 nls_ucs2_utils rdma_cm iw_cm ib_cm cifs_md4 dns_resolver netfs qrtr rfkill sunrpc vfat fat intel_rapl_msr intel_rapl_common irdma intel_uncore_frequency intel_uncore_frequency_common ice ipmi_ssif isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp gnss coretemp ib_uverbs rapl intel_cstate ib_core iTCO_wdt iTCO_vendor_support acpi_ipmi mei_me ipmi_si intel_uncore ioatdma i2c_i801 joydev pcspkr mei ipmi_devintf lpc_ich intel_pch_thermal i2c_smbus ipmi_msghandler acpi_power_meter acpi_pad xfs libcrc32c ast sd_mod drm_shmem_helper t10_pi drm_kms_helper sg ixgbe drm i40e ahci crct10dif_pclmul libahci crc32_pclmul igb crc32c_intel libata ghash_clmulni_intel i2c_algo_bit mdio dca wmi dm_mirror dm_region_hash dm_log dm_mod fuse [ +0.000050] CPU: 0 PID: 937 Comm: kworker/0:3 Kdump: loaded Not tainted 6.8.0-rc2-Feb-net_dev-Qiueue-00279-gbd43c5687e05 #1 [ +0.000003] Hardware name: Intel Corporation S2600BPB/S2600BPB, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020 [ +0.000001] Workqueue: i40e i40e_service_task [i40e] [ +0.000024] RIP: 0010:check_flush_dependency+0x10b/0x120 [ +0.000003] Code: ff 49 8b 54 24 18 48 8d 8b b0 00 00 00 49 89 e8 48 81 c6 b0 00 00 00 48 c7 c7 b0 97 fa 9f c6 05 8a cc 1f 02 01 e8 35 b3 fd ff <0f> 0b e9 10 ff ff ff 80 3d 78 cc 1f 02 00 75 94 e9 46 ff ff ff 90 [ +0.000002] RSP: 0018:ffffbd294976bcf8 EFLAGS: 00010282 [ +0.000002] RAX: 0000000000000000 RBX: ffff94d4c483c000 RCX: 0000000000000027 [ +0.000001] RDX: ffff94d47f620bc8 RSI: 0000000000000001 RDI: ffff94d47f620bc0 [ +0.000001] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffff7fff [ +0.000001] R10: ffffbd294976bb98 R11: ffffffffa0be65e8 R12: ffff94c5451ea180 [ +0.000001] R13: ffff94c5ab5e8000 R14: ffff94c5c20b6e05 R15: ffff94c5f1330ab0 [ +0.000001] FS: 0000000000000000(0000) GS:ffff94d47f600000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007f9e6f1fca70 CR3: 0000000038e20004 CR4: 00000000007706f0 [ +0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ +0.000001] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000002] ? __warn+0x80/0x130 [ +0.000003] ? check_flush_dependency+0x10b/0x120 [ +0.000002] ? report_bug+0x195/0x1a0 [ +0.000005] ? handle_bug+0x3c/0x70 [ +0.000003] ? exc_invalid_op+0x14/0x70 [ +0.000002] ? asm_exc_invalid_op+0x16/0x20 [ +0.000006] ? check_flush_dependency+0x10b/0x120 [ +0.000002] ? check_flush_dependency+0x10b/0x120 [ +0.000002] __flush_workqueue+0x126/0x3f0 [ +0.000015] ib_cache_cleanup_one+0x1c/0xe0 [ib_core] [ +0.000056] __ib_unregister_device+0x6a/0xb0 [ib_core] [ +0.000023] ib_unregister_device_and_put+0x34/0x50 [ib_core] [ +0.000020] i40iw_close+0x4b/0x90 [irdma] [ +0.000022] i40e_notify_client_of_netdev_close+0x54/0xc0 [i40e] [ +0.000035] i40e_service_task+0x126/0x190 [i40e] [ +0.000024] process_one_work+0x174/0x340 [ +0.000003] worker_thread+0x27e/0x390 [ +0.000001] ? __pfx_worker_thread+0x10/0x10 [ +0.000002] kthread+0xdf/0x110 [ +0.000002] ? __pfx_kthread+0x10/0x10 [ +0.000002] ret_from_fork+0x2d/0x50 [ +0.000003] ? __pfx_kthread+0x10/0x10 [ +0.000001] ret_from_fork_asm+0x1b/0x30 [ +0.000004] </TASK> [ +0.000001] ---[ end trace 0000000000000000 ]--- Fixes: 4d5957cbdecd ("i40e: remove WQ_UNBOUND and the task limit of our workqueue") Signed-off-by: Sindhu Devale <sindhu.devale@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Robert Ganzynkowicz <robert.ganzynkowicz@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240423182723.740401-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02eth: bnxt: fix counting packets discarded due to OOM and netpollJakub Kicinski1-26/+18
[ Upstream commit 730117730709992c9f6535dd7b47638ee561ec45 ] I added OOM and netpoll discard counters, naively assuming that the cpr pointer is pointing to a common completion ring. Turns out that is usually *a* completion ring but not *the* completion ring which bnapi->cp_ring points to. bnapi->cp_ring is where the stats are read from, so we end up reporting 0 thru ethtool -S and qstat even though the drop events have happened. Make 100% sure we're recording statistics in the correct structure. Fixes: 907fd4a294db ("bnxt: count discards due to memory allocation errors") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240424002148.3937059-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix memory leak when canceling rehash workIdo Schimmel1-1/+5
[ Upstream commit fb4e2b70a7194b209fc7320bbf33b375f7114bd5 ] The rehash delayed work is rescheduled with a delay if the number of credits at end of the work is not negative as supposedly it means that the migration ended. Otherwise, it is rescheduled immediately. After "mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehash" the above is no longer accurate as a non-negative number of credits is no longer indicative of the migration being done. It can also happen if the work encountered an error in which case the migration will resume the next time the work is scheduled. The significance of the above is that it is possible for the work to be pending and associated with hints that were allocated when the migration started. This leads to the hints being leaked [1] when the work is canceled while pending as part of ACL region dismantle. Fix by freeing the hints if hints are associated with a work that was canceled while pending. Blame the original commit since the reliance on not having a pending work associated with hints is fragile. [1] unreferenced object 0xffff88810e7c3000 (size 256): comm "kworker/0:16", pid 176, jiffies 4295460353 hex dump (first 32 bytes): 00 30 95 11 81 88 ff ff 61 00 00 00 00 00 00 80 .0......a....... 00 00 61 00 40 00 00 00 00 00 00 00 04 00 00 00 ..a.@........... backtrace (crc 2544ddb9): [<00000000cf8cfab3>] kmalloc_trace+0x23f/0x2a0 [<000000004d9a1ad9>] objagg_hints_get+0x42/0x390 [<000000000b143cf3>] mlxsw_sp_acl_erp_rehash_hints_get+0xca/0x400 [<0000000059bdb60a>] mlxsw_sp_acl_tcam_vregion_rehash_work+0x868/0x1160 [<00000000e81fd734>] process_one_work+0x59c/0xf20 [<00000000ceee9e81>] worker_thread+0x799/0x12c0 [<00000000bda6fe39>] kthread+0x246/0x300 [<0000000070056d23>] ret_from_fork+0x34/0x70 [<00000000dea2b93e>] ret_from_fork_asm+0x1a/0x30 Fixes: c9c9af91f1d9 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/0cc12ebb07c4d4c41a1265ee2c28b392ff997a86.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix incorrect list API usageIdo Schimmel1-0/+7
[ Upstream commit b377add0f0117409c418ddd6504bd682ebe0bf79 ] Both the function that migrates all the chunks within a region and the function that migrates all the entries within a chunk call list_first_entry() on the respective lists without checking that the lists are not empty. This is incorrect usage of the API, which leads to the following warning [1]. Fix by returning if the lists are empty as there is nothing to migrate in this case. [1] WARNING: CPU: 0 PID: 6437 at drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_tcam.c:1266 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0> Modules linked in: CPU: 0 PID: 6437 Comm: kworker/0:37 Not tainted 6.9.0-rc3-custom-00883-g94a65f079ef6 #39 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_sp_acl_tcam_vchunk_migrate_all+0x1f1/0x2c0 [...] Call Trace: <TASK> mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x4a0 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> Fixes: 6f9579d4e302 ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/4628e9a22d1d84818e28310abbbc498e7bc31bc9.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix warning during rehashIdo Schimmel1-3/+17
[ Upstream commit 743edc8547a92b6192aa1f1b6bb78233fa21dc9b ] As previously explained, the rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters. When the work runs out of credits it stores the current chunk and entry as markers in the per-work context so that it would know where to resume the migration from the next time the work is scheduled. Upon error, the chunk marker is reset to NULL, but without resetting the entry markers despite being relative to it. This can result in migration being resumed from an entry that does not belong to the chunk being migrated. In turn, this will eventually lead to a chunk being iterated over as if it is an entry. Because of how the two structures happen to be defined, this does not lead to KASAN splats, but to warnings such as [1]. Fix by creating a helper that resets all the markers and call it from all the places the currently only reset the chunk marker. For good measures also call it when starting a completely new rehash. Add a warning to avoid future cases. [1] WARNING: CPU: 7 PID: 1076 at drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_keys.c:407 mlxsw_afk_encode+0x242/0x2f0 Modules linked in: CPU: 7 PID: 1076 Comm: kworker/7:24 Tainted: G W 6.9.0-rc3-custom-00880-g29e61d91b77b #29 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:mlxsw_afk_encode+0x242/0x2f0 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_entry_add+0xd9/0x3c0 mlxsw_sp_acl_tcam_entry_create+0x5e/0xa0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x109/0x290 mlxsw_sp_acl_tcam_vregion_rehash_work+0x6c/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 </TASK> Fixes: 6f9579d4e302 ("mlxsw: spectrum_acl: Remember where to continue rehash migration") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/cc17eed86b41dd829d39b07906fec074a9ce580e.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix memory leak during rehashIdo Schimmel1-0/+4
[ Upstream commit 8ca3f7a7b61393804c46f170743c3b839df13977 ] The rehash delayed work migrates filters from one region to another. This is done by iterating over all chunks (all the filters with the same priority) in the region and in each chunk iterating over all the filters. If the migration fails, the code tries to migrate the filters back to the old region. However, the rollback itself can also fail in which case another migration will be erroneously performed. Besides the fact that this ping pong is not a very good idea, it also creates a problem. Each virtual chunk references two chunks: The currently used one ('vchunk->chunk') and a backup ('vchunk->chunk2'). During migration the first holds the chunk we want to migrate filters to and the second holds the chunk we are migrating filters from. The code currently assumes - but does not verify - that the backup chunk does not exist (NULL) if the currently used chunk does not reference the target region. This assumption breaks when we are trying to rollback a rollback, resulting in the backup chunk being overwritten and leaked [1]. Fix by not rolling back a failed rollback and add a warning to avoid future cases. [1] WARNING: CPU: 5 PID: 1063 at lib/parman.c:291 parman_destroy+0x17/0x20 Modules linked in: CPU: 5 PID: 1063 Comm: kworker/5:11 Tainted: G W 6.9.0-rc2-custom-00784-gc6a05c468a0b #14 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work RIP: 0010:parman_destroy+0x17/0x20 [...] Call Trace: <TASK> mlxsw_sp_acl_atcam_region_fini+0x19/0x60 mlxsw_sp_acl_tcam_region_destroy+0x49/0xf0 mlxsw_sp_acl_tcam_vregion_rehash_work+0x1f1/0x470 process_one_work+0x151/0x370 worker_thread+0x2cb/0x3e0 kthread+0xd0/0x100 ret_from_fork+0x34/0x50 ret_from_fork_asm+0x1a/0x30 </TASK> Fixes: 843500518509 ("mlxsw: spectrum_acl: Do rollback as another call to mlxsw_sp_acl_tcam_vchunk_migrate_all()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/d5edd4f4503934186ae5cfe268503b16345b4e0f.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Rate limit error messageIdo Schimmel1-1/+1
[ Upstream commit 5bcf925587e9b5d36420d572a0b4d131c90fb306 ] In the rare cases when the device resources are exhausted it is likely that the rehash delayed work will fail. An error message will be printed whenever this happens which can be overwhelming considering the fact that the work is per-region and that there can be hundreds of regions. Fix by rate limiting the error message. Fixes: e5e7962ee5c2 ("mlxsw: spectrum_acl: Implement region migration according to hints") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/c510763b2ebd25e7990d80183feff91cde593145.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix possible use-after-free during rehashIdo Schimmel1-0/+1
[ Upstream commit 54225988889931467a9b55fdbef534079b665519 ] The rehash delayed work migrates filters from one region to another according to the number of available credits. The migrated from region is destroyed at the end of the work if the number of credits is non-negative as the assumption is that this is indicative of migration being complete. This assumption is incorrect as a non-negative number of credits can also be the result of a failed migration. The destruction of a region that still has filters referencing it can result in a use-after-free [1]. Fix by not destroying the region if migration failed. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230 Read of size 8 at addr ffff8881735319e8 by task kworker/0:31/3858 CPU: 0 PID: 3858 Comm: kworker/0:31 Tainted: G W 6.9.0-rc2-custom-00782-gf2275c2157d8 #5 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_tcam_vregion_rehash_work Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xce/0x670 kasan_report+0xd7/0x110 mlxsw_sp_acl_ctcam_region_entry_remove+0x21d/0x230 mlxsw_sp_acl_ctcam_entry_del+0x2e/0x70 mlxsw_sp_acl_atcam_entry_del+0x81/0x210 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3cd/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 174: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __kmalloc+0x19c/0x360 mlxsw_sp_acl_tcam_region_create+0xdf/0x9c0 mlxsw_sp_acl_tcam_vregion_rehash_work+0x954/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 Freed by task 7: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 poison_slab_object+0x102/0x170 __kasan_slab_free+0x14/0x30 kfree+0xc1/0x290 mlxsw_sp_acl_tcam_region_destroy+0x272/0x310 mlxsw_sp_acl_tcam_vregion_rehash_work+0x731/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 Fixes: c9c9af91f1d9 ("mlxsw: spectrum_acl: Allow to interrupt/continue rehash work") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/3e412b5659ec2310c5c615760dfe5eac18dd7ebd.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix possible use-after-free during activity updateIdo Schimmel1-2/+8
[ Upstream commit 79b5b4b18bc85b19d3a518483f9abbbe6d7b3ba4 ] The rule activity update delayed work periodically traverses the list of configured rules and queries their activity from the device. As part of this task it accesses the entry pointed by 'ventry->entry', but this entry can be changed concurrently by the rehash delayed work, leading to a use-after-free [1]. Fix by closing the race and perform the activity query under the 'vregion->lock' mutex. [1] BUG: KASAN: slab-use-after-free in mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140 Read of size 8 at addr ffff8881054ed808 by task kworker/0:18/181 CPU: 0 PID: 181 Comm: kworker/0:18 Not tainted 6.9.0-rc2-custom-00781-gd5ab772d32f7 #2 Hardware name: Mellanox Technologies Ltd. MSN3700/VMOD0005, BIOS 5.11 01/06/2019 Workqueue: mlxsw_core mlxsw_sp_acl_rule_activity_update_work Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xce/0x670 kasan_report+0xd7/0x110 mlxsw_sp_acl_tcam_flower_rule_activity_get+0x121/0x140 mlxsw_sp_acl_rule_activity_update_work+0x219/0x400 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 </TASK> Allocated by task 1039: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x8f/0xa0 __kmalloc+0x19c/0x360 mlxsw_sp_acl_tcam_entry_create+0x7b/0x1f0 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x30d/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 Freed by task 1039: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 poison_slab_object+0x102/0x170 __kasan_slab_free+0x14/0x30 kfree+0xc1/0x290 mlxsw_sp_acl_tcam_vchunk_migrate_all+0x3d7/0xb50 mlxsw_sp_acl_tcam_vregion_rehash_work+0x157/0x1300 process_one_work+0x8eb/0x19b0 worker_thread+0x6c9/0xf70 kthread+0x2c9/0x3b0 ret_from_fork+0x4d/0x80 ret_from_fork_asm+0x1a/0x30 Fixes: 2bffc5322fd8 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/1fcce0a60b231ebeb2515d91022284ba7b4ffe7a.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: spectrum_acl_tcam: Fix race during rehash delayed workIdo Schimmel1-2/+2
[ Upstream commit d90cfe20562407d9f080d24123078d666d730707 ] The purpose of the rehash delayed work is to reduce the number of masks (eRPs) used by an ACL region as the eRP bank is a global and limited resource. This is done in three steps: 1. Creating a new set of masks and a new ACL region which will use the new masks and to which the existing filters will be migrated to. The new region is assigned to 'vregion->region' and the region from which the filters are migrated from is assigned to 'vregion->region2'. 2. Migrating all the filters from the old region to the new region. 3. Destroying the old region and setting 'vregion->region2' to NULL. Only the second steps is performed under the 'vregion->lock' mutex although its comments says that among other things it "Protects consistency of region, region2 pointers". This is problematic as the first step can race with filter insertion from user space that uses 'vregion->region', but under the mutex. Fix by holding the mutex across the entirety of the delayed work and not only during the second step. Fixes: 2bffc5322fd8 ("mlxsw: spectrum_acl: Don't take mutex in mlxsw_sp_acl_tcam_vregion_rehash_work()") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/1ec1d54edf2bad0a369e6b4fa030aba64e1f124b.1713797103.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02Bluetooth: qca: set power_ctrl_enabled on NULL returned by gpiod_get_optional()Bartosz Golaszewski1-5/+13
[ Upstream commit 3d05fc82237aa97162d0d7dc300b55bb34e91d02 ] Any return value from gpiod_get_optional() other than a pointer to a GPIO descriptor or a NULL-pointer is an error and the driver should abort probing. That being said: commit 56d074d26c58 ("Bluetooth: hci_qca: don't use IS_ERR_OR_NULL() with gpiod_get_optional()") no longer sets power_ctrl_enabled on NULL-pointer returned by devm_gpiod_get_optional(). Restore this behavior but bail-out on errors. While at it: also bail-out on error returned when trying to get the "swctrl" GPIO. Reported-by: Wren Turkal <wt@penguintechs.org> Reported-by: Zijun Hu <quic_zijuhu@quicinc.com> Closes: https://lore.kernel.org/linux-bluetooth/1713449192-25926-2-git-send-email-quic_zijuhu@quicinc.com/ Fixes: 56d074d26c58 ("Bluetooth: hci_qca: don't use IS_ERR_OR_NULL() with gpiod_get_optional()") Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Tested-by: Wren Turkal <wt@penguintechs.org> Reported-by: Wren Turkal <wt@penguintechs.org> Reported-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Krzysztof Kozlowski<krzysztof.kozlowski@linaro.org> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02net: gtp: Fix Use-After-Free in gtp_dellinkHyunwoo Kim1-1/+2
[ Upstream commit f2a904107ee2b647bb7794a1a82b67740d7c8a64 ] Since call_rcu, which is called in the hlist_for_each_entry_rcu traversal of gtp_dellink, is not part of the RCU read critical section, it is possible that the RCU grace period will pass during the traversal and the key will be free. To prevent this, it should be changed to hlist_for_each_entry_safe. Fixes: 94dc550a5062 ("gtp: fix an use-after-free in ipv4_pdp_find()") Signed-off-by: Hyunwoo Kim <v4bel@theori.io> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02net: usb: ax88179_178a: stop lying about skb->truesizeEric Dumazet1-8/+3
[ Upstream commit 4ce62d5b2f7aecd4900e7d6115588ad7f9acccca ] Some usb drivers try to set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize overide. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") Fixes: f8ebb3ac881b ("net: usb: ax88179_178a: Fix packet receiving") Reported-by: shironeko <shironeko@tesaguri.club> Closes: https://lore.kernel.org/netdev/c110f41a0d2776b525930f213ca9715c@tesaguri.club/ Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jose Alonso <joalonsof@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240421193828.1966195-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02NFC: trf7970a: disable all regulators on removalPaul Geurts1-19/+23
[ Upstream commit 6bea4f03c6a4e973ef369e15aac88f37981db49e ] During module probe, regulator 'vin' and 'vdd-io' are used and enabled, but the vdd-io regulator overwrites the 'vin' regulator pointer. During remove, only the vdd-io is disabled, as the vin regulator pointer is not available anymore. When regulator_put() is called during resource cleanup a kernel warning is given, as the regulator is still enabled. Store the two regulators in separate pointers and disable both the regulators on module remove. Fixes: 49d22c70aaf0 ("NFC: trf7970a: Add device tree option of 1.8 Volt IO voltage") Signed-off-by: Paul Geurts <paul_geurts@live.nl> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/DB7PR09MB26847A4EBF88D9EDFEB1DA0F950E2@DB7PR09MB2684.eurprd09.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02bnxt_en: Fix the PCI-AER routinesVikas Gupta1-3/+16
[ Upstream commit a1acdc226baec331512f815d6ac9dd6f8435cc7f ] We do not support two simultaneous recoveries so check for reset flag, BNXT_STATE_IN_FW_RESET, and do not proceed with AER further. When the pci channel state is pci_channel_io_frozen, the PCIe link can not be trusted so we disable the traffic immediately and stop BAR access by calling bnxt_fw_fatal_close(). BAR access after AER fatal error can cause an NMI. Fixes: f75d9a0aa967 ("bnxt_en: Re-write PCI BARs after PCI fatal error.") Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02bnxt_en: refactor reset close codeVikas Gupta1-6/+11
[ Upstream commit 7474b1c82be3780692d537d331f9aa7fc1e5a368 ] Introduce bnxt_fw_fatal_close() API which can be used to stop data path and disable device when firmware is in fatal state. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Stable-dep-of: a1acdc226bae ("bnxt_en: Fix the PCI-AER routines") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02mlxsw: core: Unregister EMAD trap using FORWARD actionIdo Schimmel1-1/+1
[ Upstream commit 976c44af48141cd8595601c0af2a19a43c5b228b ] The device's manual (PRM - Programmer's Reference Manual) classifies the trap that is used to deliver EMAD responses as an "event trap". Among other things, it means that the only actions that can be associated with the trap are TRAP and FORWARD (NOP). Currently, during driver de-initialization the driver unregisters the trap by setting its action to DISCARD, which violates the above guideline. Future firmware versions will prevent such misuses by returning an error. This does not prevent the driver from working, but an error will be printed to the kernel log during module removal / devlink reload: mlxsw_spectrum 0000:03:00.0: Reg cmd access status failed (status=7(bad parameter)) mlxsw_spectrum 0000:03:00.0: Reg cmd access failed (reg_id=7003(hpkt),type=write) Suppress the error message by aligning the driver to the manual and use a FORWARD (NOP) action when unregistering the trap. Fixes: 4ec14b7634b2 ("mlxsw: Add interface to access registers and process events") Cc: Jiri Pirko <jiri@resnulli.us> Cc: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/753a89e14008fde08cb4a2c1e5f537b81d8eb2d6.1713446092.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02vxlan: drop packets from invalid src-addressDavid Bauer1-0/+4
[ Upstream commit f58f45c1e5b92975e91754f5407250085a6ae7cf ] The VXLAN driver currently does not check if the inner layer2 source-address is valid. In case source-address snooping/learning is enabled, a entry in the FDB for the invalid address is created with the layer3 address of the tunnel endpoint. If the frame happens to have a non-unicast address set, all this non-unicast traffic is subsequently not flooded to the tunnel network but sent to the learnt host in the FDB. To make matters worse, this FDB entry does not expire. Apply the same filtering for packets as it is done for bridges. This not only drops these invalid packets but avoids them from being learnt into the FDB. Fixes: d342894c5d2f ("vxlan: virtual extensible lan") Suggested-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David Bauer <mail@david-bauer.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>