kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
7 days	spi: tegra114: Preserve SPI mode bits in def_command1_reg	Vishwaroop A	1	-0/+3
	[ Upstream commit a0a75b40c919b9f6d3a0b6c978e6ccf344c1be5a ] The COMMAND1 register bits [29:28] set the SPI mode, which controls the clock idle level. When a transfer ends, tegra_spi_transfer_end() writes def_command1_reg back to restore the default state, but this register value currently lacks the mode bits. This results in the clock always being configured as idle low, breaking devices that need it high. Fix this by storing the mode bits in def_command1_reg during setup, to prevent this field from always being cleared. Fixes: f333a331adfa ("spi/tegra114: add spi driver") Signed-off-by: Vishwaroop A <va@nvidia.com> Link: https://patch.msgid.link/20260204141212.1540382-1-va@nvidia.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra: Fix a memory leak in tegra_slink_probe()	Felix Gu	1	-2/+4
	[ Upstream commit 41d9a6795b95d6ea28439ac1e9ce8c95bbca20fc ] In tegra_slink_probe(), when platform_get_irq() fails, it directly returns from the function with an error code, which causes a memory leak. Replace it with a goto label to ensure proper cleanup. Fixes: eb9913b511f1 ("spi: tegra: Fix missing IRQ check in tegra_slink_probe()") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260202-slink-v1-1-eac50433a6f9@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Protect curr_xfer check in IRQ handler	Breno Leitao	1	-0/+20
	[ Upstream commit edf9088b6e1d6d88982db7eb5e736a0e4fbcc09e ] Now that all other accesses to curr_xfer are done under the lock, protect the curr_xfer NULL check in tegra_qspi_isr_thread() with the spinlock. Without this protection, the following race can occur: CPU0 (ISR thread) CPU1 (timeout path) ---------------- ------------------- if (!tqspi->curr_xfer) // sees non-NULL spin_lock() tqspi->curr_xfer = NULL spin_unlock() handle_*_xfer() spin_lock() t = tqspi->curr_xfer // NULL! ... t->len ... // NULL dereference! With this patch, all curr_xfer accesses are now properly synchronized. Although all accesses to curr_xfer are done under the lock, in tegra_qspi_isr_thread() it checks for NULL, releases the lock and reacquires it later in handle_cpu_based_xfer()/handle_dma_based_xfer(). There is a potential for an update in between, which could cause a NULL pointer dereference. To handle this, add a NULL check inside the handlers after acquiring the lock. This ensures that if the timeout path has already cleared curr_xfer, the handler will safely return without dereferencing the NULL pointer. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-6-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Protect curr_xfer clearing in ↵	Breno Leitao	1	-0/+3
	tegra_qspi_non_combined_seq_xfer [ Upstream commit 6d7723e8161f3c3f14125557e19dd080e9d882be ] Protect the curr_xfer clearing in tegra_qspi_non_combined_seq_xfer() with the spinlock to prevent a race with the interrupt handler that reads this field to check if a transfer is in progress. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-5-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Protect curr_xfer in tegra_qspi_combined_seq_xfer	Breno Leitao	1	-0/+5
	[ Upstream commit bf4528ab28e2bf112c3a2cdef44fd13f007781cd ] The curr_xfer field is read by the IRQ handler without holding the lock to check if a transfer is in progress. When clearing curr_xfer in the combined sequence transfer loop, protect it with the spinlock to prevent a race with the interrupt handler. Protect the curr_xfer clearing at the exit path of tegra_qspi_combined_seq_xfer() with the spinlock to prevent a race with the interrupt handler that reads this field. Without this protection, the IRQ handler could read a partially updated curr_xfer value, leading to NULL pointer dereference or use-after-free. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-4-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Protect curr_xfer assignment in ↵	Breno Leitao	1	-0/+3
	tegra_qspi_setup_transfer_one [ Upstream commit f5a4d7f5e32ba163cff893493ec1cbb0fd2fb0d5 ] When the timeout handler processes a completed transfer and signals completion, the transfer thread can immediately set up the next transfer and assign curr_xfer to point to it. If a delayed ISR from the previous transfer then runs, it checks if (!tqspi->curr_xfer) (currently without the lock also -- to be fixed soon) to detect stale interrupts, but this check passes because curr_xfer now points to the new transfer. The ISR then incorrectly processes the new transfer's context. Protect the curr_xfer assignment with the spinlock to ensure the ISR either sees NULL (and bails out) or sees the new value only after the assignment is complete. Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller") Signed-off-by: Breno Leitao <leitao@debian.org> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-3-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Move curr_xfer read inside spinlock	Breno Leitao	1	-2/+4
	[ Upstream commit ef13ba357656451d6371940d8414e3e271df97e3 ] Move the assignment of the transfer pointer from curr_xfer inside the spinlock critical section in both handle_cpu_based_xfer() and handle_dma_based_xfer(). Previously, curr_xfer was read before acquiring the lock, creating a window where the timeout path could clear curr_xfer between reading it and using it. By moving the read inside the lock, the handlers are guaranteed to see a consistent value that cannot be modified by the timeout path. Fixes: 921fc1838fb0 ("spi: tegra210-quad: Add support for Tegra210 QSPI controller") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Thierry Reding <treding@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-2-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: tegra210-quad: Return IRQ_HANDLED when timeout already processed transfer	Breno Leitao	1	-2/+17
	[ Upstream commit aabd8ea0aa253d40cf5f20a609fc3d6f61e38299 ] When the ISR thread wakes up late and finds that the timeout handler has already processed the transfer (curr_xfer is NULL), return IRQ_HANDLED instead of IRQ_NONE. Use a similar approach to tegra_qspi_handle_timeout() by reading QSPI_TRANS_STATUS and checking the QSPI_RDY bit to determine if the hardware actually completed the transfer. If QSPI_RDY is set, the interrupt was legitimate and triggered by real hardware activity. The fact that the timeout path handled it first doesn't make it spurious. Returning IRQ_NONE incorrectly suggests the interrupt wasn't for this device, which can cause issues with shared interrupt lines and interrupt accounting. Fixes: b4e002d8a7ce ("spi: tegra210-quad: Fix timeout handling") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Usama Arif <usamaarif642@gmail.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/20260126-tegra_xfer-v2-1-6d2115e4f387@debian.org Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	regulator: spacemit-p1: Fix n_voltages for BUCK and LDO regulators	Guodong Xu	1	-3/+3
	[ Upstream commit 41399c5d476156635c9a58de870d39318e22fa09 ] Higher voltage settings were unusable due to incorrect n_voltages values causing registration failures. For example, setting aldo4 to 3.3V failed with -EINVAL because the required selector (123) exceeded the allowed range (n_voltages=117). Fix by aligning n_voltages with the hardware register widths per the P1 datasheet [1]: - BUCK: 255 (was 254), allows selectors 0-254, selector 255 is reserved - LDO: 128 (was 117), allows selectors 0-127, selectors 0-10 are for suspend mode, valid operational range is 11-127 This enables the full voltage range supported by the hardware. Fixes: 8b84d712ad84 ("regulator: spacemit: support SpacemiT P1 regulators") Link: https://developer.spacemit.com/documentation [1] Signed-off-by: Guodong Xu <guodong@riscstar.com> Link: https://patch.msgid.link/20260122-spacemit-p1-v1-1-309be27fbff9@riscstar.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	i2c: imx: preserve error state in block data length handler	LI Qingwu	1	-1/+2
	commit b126097b0327437048bd045a0e4d273dea2910dd upstream. When a block read returns an invalid length, zero or >I2C_SMBUS_BLOCK_MAX, the length handler sets the state to IMX_I2C_STATE_FAILED. However, i2c_imx_master_isr() unconditionally overwrites this with IMX_I2C_STATE_READ_CONTINUE, causing an endless read loop that overruns buffers and crashes the system. Guard the state transition to preserve error states set by the length handler. Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode") Signed-off-by: LI Qingwu <Qing-wu.Li@leica-geosystems.com.cn> Cc: <stable@vger.kernel.org> # v6.13+ Reviewed-by: Stefan Eichenberger <eichest@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20260116111906.3413346-2-Qing-wu.Li@leica-geosystems.com.cn Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	gpio: loongson-64bit: Fix incorrect NULL check after devm_kcalloc()	Chen Ni	1	-1/+1
	[ Upstream commit e34f77b09080c86c929153e2a72da26b4f8947ff ] Fix incorrect NULL check in loongson_gpio_init_irqchip(). The function checks chip->parent instead of chip->irq.parents. Fixes: 03c146cb6cd1 ("gpio: loongson-64bit: Add support for Loongson-2K0300 SoC") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Link: https://patch.msgid.link/20260205072649.3271158-1-nichen@iscas.ac.cn Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	firmware: cs_dsp: rate-limit log messages in KUnit builds	Richard Fitzgerald	7	-14/+138
	[ Upstream commit 10db9f6899dd3a2dfd26efd40afd308891dc44a8 ] Use the dev_*_ratelimit() macros if the cs_dsp KUnit tests are enabled in the build, and allow the KUnit tests to disable message output. Some of the KUnit tests cause a very large number of log messages from cs_dsp, because the tests perform many different test cases. This could cause some lines to be dropped from the kernel log. Dropped lines can prevent the KUnit wrappers from parsing the ktap output in the dmesg log. The KUnit builds of cs_dsp export three bools that the KUnit tests can use to entirely disable log output of err, warn and info messages. Some tests have been updated to use this, replacing the previous fudge of a usleep() in the exit handler of each test. We don't necessarily want to disable all log messages if they aren't expected to be excessive, so the rate-limiting allows leaving some logging enabled. The rate-limited macros are not used in normal builds because it is not appropriate to rate-limit every message. That could cause important messages to be dropped, and there wouldn't be such a high rate of messages in normal operation. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/linux-sound/af393f08-facb-4c44-a054-1f61254803ec@opensource.cirrus.com/T/#t Fixes: cd8c058499b6 ("firmware: cs_dsp: Add KUnit testing of bin error cases") Link: https://patch.msgid.link/20260130171256.863152-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	firmware: cs_dsp: Factor out common debugfs string read	Richard Fitzgerald	1	-24/+21
	[ Upstream commit 78cfd833bc04c0398ca4cfc64704350aebe4d4c2 ] cs_dsp_debugfs_wmfw_read() and cs_dsp_debugfs_bin_read() were identical except for which struct member they printed. Move all this duplicated code into a common function cs_dsp_debugfs_string_read(). The check for dsp->booted has been removed because this is redundant. The two strings are set when the DSP is booted and cleared when the DSP is powered-down. Access to the string char * must be protected by the pwr_lock mutex. The string is passed into cs_dsp_debugfs_string_read() as a pointer to the char * so that the mutex lock can also be factored out into cs_dsp_debugfs_string_read(). wmfw_file_name and bin_file_name members of struct cs_dsp have been changed to const char *. It makes for a better API to pass a const pointer into cs_dsp_debugfs_string_read(). Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20251120130640.1169780-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org> Stable-dep-of: 10db9f6899dd ("firmware: cs_dsp: rate-limit log messages in KUnit builds") Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	nvme-pci: handle changing device dma map requirements	Keith Busch	1	-15/+30
	[ Upstream commit 071be3b0b6575d45be9df9c5b612f5882bfc5e88 ] The initial state of dma_needs_unmap may be false, but change to true while mapping the data iterator. Enabling swiotlb is one such case that can change the result. The nvme driver needs to save the mapped dma vectors to be unmapped later, so allocate as needed during iteration rather than assume it was always allocated at the beginning. This fixes a NULL dereference from accessing an uninitialized dma_vecs when the device dma unmapping requirements change mid-iteration. Fixes: b8b7570a7ec8 ("nvme-pci: fix dma unmapping when using PRPs and not using the IOVA mapping") Link: https://lore.kernel.org/linux-nvme/20260202125738.1194899-1-pradeep.pragallapati@oss.qualcomm.com/ Reported-by: Pradeep P V K <pradeep.pragallapati@oss.qualcomm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/xe/guc: Fix CFI violation in debugfs access.	Daniele Ceraolo Spurio	2	-3/+5
	[ Upstream commit 4cb1b327135dddf3d0ec2544ea36ed05ba2252bc ] xe_guc_print_info is void-returning, but the function pointer it is assigned to expects an int-returning function, leading to the following CFI error: [ 206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe] (target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a) Fix this by updating xe_guc_print_info to return an integer. Fixes: e15826bb3c2c ("drm/xe/guc: Refactor GuC debugfs initialization") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: George D Sworo <george.d.sworo@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com (cherry picked from commit dd8ea2f2ab71b98887fdc426b0651dbb1d1ea760) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	hwmon: (occ) Mark occ_init_attribute() as __printf	Arnd Bergmann	1	-0/+1
	[ Upstream commit 831a2b27914cc880130ffe8fb8d1e65a5324d07f ] This is a printf-style function, which gcc -Werror=suggest-attribute=format correctly points out: drivers/hwmon/occ/common.c: In function 'occ_init_attribute': drivers/hwmon/occ/common.c:761:9: error: function 'occ_init_attribute' might be a candidate for 'gnu_printf' format attribute [-Werror=suggest-attribute=format] Add the attribute to avoid this warning and ensure any incorrect format strings are detected here. Fixes: 744c2fe950e9 ("hwmon: (occ) Rework attribute registration for stack usage") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20260203163440.2674340-1-arnd@kernel.org Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/xe/pm: Disable D3Cold for BMG only on specific platforms	Karthik Poosa	1	-3/+10
	[ Upstream commit bb36170d959fad7f663f91eb9c32a84dd86bef2b ] Restrict D3Cold disablement for BMG to unsupported NUC platforms, instead of disabling it on all platforms. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 3e331a6715ee ("drm/xe/pm: Temporarily disable D3Cold on BMG") Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 39125eaf8863ab09d70c4b493f58639b08d5a897) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/xe/query: Fix topology query pointer advance	Shuicheng Lin	1	-1/+1
	[ Upstream commit 7ee9b3e091c63da71e15c72003f1f07e467f5158 ] The topology query helper advanced the user pointer by the size of the pointer, not the size of the structure. This can misalign the output blob and corrupt the following mask. Fix the increment to use sizeof(topo). There is no issue currently, as sizeof(topo) happens to be equal to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes). Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit c2a6859138e7f73ad904be17dd7d1da6cc7f06b3) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/mgag200: fix mgag200_bmc_stop_scanout()	Jacob Keller	2	-19/+18
	[ Upstream commit 0e0c8f4d16de92520623aa1ea485cadbf64e6929 ] The mgag200_bmc_stop_scanout() function is called by the .atomic_disable() handler for the MGA G200 VGA BMC encoder. This function performs a few register writes to inform the BMC of an upcoming mode change, and then polls to wait until the BMC actually stops. The polling is implemented using a busy loop with udelay() and an iteration timeout of 300, resulting in the function blocking for 300 milliseconds. The function gets called ultimately by the output_poll_execute work thread for the DRM output change polling thread of the mgag200 driver: kworker/0:0-mm_ 3528 [000] 4555.315364: ffffffffaa0e25b3 delay_halt.part.0+0x33 ffffffffc03f6188 mgag200_bmc_stop_scanout+0x178 ffffffffc087ae7a disable_outputs+0x12a ffffffffc087c12a drm_atomic_helper_commit_tail+0x1a ffffffffc03fa7b6 mgag200_mode_config_helper_atomic_commit_tail+0x26 ffffffffc087c9c1 commit_tail+0x91 ffffffffc087d51b drm_atomic_helper_commit+0x11b ffffffffc0509694 drm_atomic_commit+0xa4 ffffffffc05105e8 drm_client_modeset_commit_atomic+0x1e8 ffffffffc0510ce6 drm_client_modeset_commit_locked+0x56 ffffffffc0510e24 drm_client_modeset_commit+0x24 ffffffffc088a743 __drm_fb_helper_restore_fbdev_mode_unlocked+0x93 ffffffffc088a683 drm_fb_helper_hotplug_event+0xe3 ffffffffc050f8aa drm_client_dev_hotplug+0x9a ffffffffc088555a output_poll_execute+0x29a ffffffffa9b35924 process_one_work+0x194 ffffffffa9b364ee worker_thread+0x2fe ffffffffa9b3ecad kthread+0xdd ffffffffa9a08549 ret_from_fork+0x29 On a server running ptp4l with the mgag200 driver loaded, we found that ptp4l would sometimes get blocked from execution because of this busy waiting loop. Every so often, approximately once every 20 minutes -- though with large variance -- the output_poll_execute() thread would detect some sort of change that required performing a hotplug event which results in attempting to stop the BMC scanout, resulting in a 300msec delay on one CPU. On this system, ptp4l was pinned to a single CPU. When the output_poll_execute() thread ran on that CPU, it blocked ptp4l from executing for its 300 millisecond duration. This resulted in PTP service disruptions such as failure to send a SYNC message on time, failure to handle ANNOUNCE messages on time, and clock check warnings from the application. All of this despite the application being configured with FIFO_RT and a higher priority than the background workqueue tasks. (However, note that the kernel did not use CONFIG_PREEMPT...) It is unclear if the event is due to a faulty VGA connection, another bug, or actual events causing a change in the connection. At least on the system under test it is not a one-time event and consistently causes disruption to the time sensitive applications. The function has some helpful comments explaining what steps it is attempting to take. In particular, step 3a and 3b are explained as such: 3a - The third step is to verify if there is an active scan. We are waiting on a 0 on remhsyncsts (<XSPAREREG<0>. 3b - This step occurs only if the remove is actually scanning. We are waiting for the end of the frame which is a 1 on remvsyncsts (<XSPAREREG<1>). The actual steps 3a and 3b are implemented as while loops with a non-sleeping udelay(). The first step iterates while the tmp value at position 0 is not set. That is, it keeps iterating as long as the bit is zero. If the bit is already 0 (because there is no active scan), it will iterate the entire 300 attempts which wastes 300 milliseconds in total. This is opposite of what the description claims. The step 3b logic only executes if we do not iterate over the entire 300 attempts in the first loop. If it does trigger, it is trying to check and wait for a 1 on the remvsyncsts. However, again the condition is actually inverted and it will loop as long as the bit is 1, stopping once it hits zero (rather than the explained attempt to wait until we see a 1). Worse, both loops are implemented using non-sleeping waits which spin instead of allowing the scheduler to run other processes. If the kernel is not configured to allow arbitrary preemption, it will waste valuable CPU time doing nothing. There does not appear to be any documentation for the BMC register interface, beyond what is in the comments here. It seems more probable that the comment here is correct and the implementation accidentally got inverted from the intended logic. Reading through other DRM driver implementations, it does not appear that the .atomic_enable or .atomic_disable handlers need to delay instead of sleep. For example, the ast_astdp_encoder_helper_atomic_disable() function calls ast_dp_set_phy_sleep() which uses msleep(). The "atomic" in the name is referring to the atomic modesetting support, which is the support to enable atomic configuration from userspace, and not to the "atomic context" of the kernel. There is no reason to use udelay() here if a sleep would be sufficient. Replace the while loops with a read_poll_timeout() based implementation that will sleep between iterations, and which stops polling once the condition is met (instead of looping as long as the condition is met). This aligns with the commented behavior and avoids blocking on the CPU while doing nothing. Note the RREG_DAC is implemented using a statement expression to allow working properly with the read_poll_timeout family of functions. The other RREG_<TYPE> macros ought to be cleaned up to have better semantics, and several places in the mgag200 driver could make use of RREG_DAC or similar RREG_* macros should likely be cleaned up for better semantics as well, but that task has been left as a future cleanup for a non-bugfix. Fixes: 414c45310625 ("mgag200: initial g200se driver (v2)") Suggested-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260202-jk-mgag200-fix-bad-udelay-v2-1-ce1e9665987d@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: ethernet: adi: adin1110: Check return value of ↵	Chen Ni	1	-0/+3
	devm_gpiod_get_optional() in adin1110_check_spi() [ Upstream commit 78211543d2e44f84093049b4ef5f5bfa535f4645 ] The devm_gpiod_get_optional() function may return an ERR_PTR in case of genuine GPIO acquisition errors, not just NULL which indicates the legitimate absence of an optional GPIO. Add an IS_ERR() check after the call in adin1110_check_spi(). On error, return the error code to ensure proper failure handling rather than proceeding with invalid pointers. Fixes: 36934cac7aaf ("net: ethernet: adi: adin1110: add reset GPIO") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Nuno Sá <nuno.sa@analog.com> Link: https://patch.msgid.link/20260202040228.4129097-1-nichen@iscas.ac.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/amd/display: fix wrong color value mapping on MCM shaper LUT	Melissa Wen	1	-2/+5
	[ Upstream commit 8f959d37c1f2efec6dac55915ee82302e98101fb ] Some shimmer/colorful points appears when using the steamOS color pipeline for HDR on gaming with DCN32. These points look like black values being wrongly mapped to red/blue/green values. It was caused because the number of hw points in regular LUTs and in a shaper LUT was treated as the same. DCN3+ regular LUTs have 257 bases and implicit deltas (i.e. HW calculates them), but shaper LUT is a special case: it has 256 bases and 256 deltas, as in DCN1-2 regular LUTs, and outputs 14-bit values. Fix that by setting by decreasing in 1 the number of HW points computed in the LUT segmentation so that shaper LUT (i.e. fixpoint == true) keeps the same DCN10 CM logic and regular LUTs go with `hw_points + 1`. CC: Krunoslav Kovac <Krunoslav.Kovac@amd.com> Fixes: 4d5fd3d08ea9 ("drm/amd/display: PQ tail accuracy") Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5006505b19a2119e71c008044d59f6d753c858b9) Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	wifi: iwlwifi: mvm: pause TCM on fast resume	Miri Korenblit	1	-1/+5
	[ Upstream commit fb7f54aa2a99b07945911152c5d3d4a6eb39f797 ] Not pausing it means that we can have the TCM work queued into a non-freezable workqueue, which, in resume, is re-activated before the driver's resume is called. The TCM work might send commands to the FW before we resumed the device, leading to an assert. Closes: https://lore.kernel.org/linux-wireless/aTDoDiD55qlUZ0pn@debian.local/ Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: e8bb19c1d590 ("wifi: iwlwifi: support fast resume") Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260129212650.05621f3faedb.I44df9cf9183b5143df8078131e0d87c0fd7e1763@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	wifi: iwlwifi: mld: cancel mlo_scan_start_wk	Miri Korenblit	2	-2/+2
	[ Upstream commit 5ff641011ab7fb63ea101251087745d9826e8ef5 ] mlo_scan_start_wk is not canceled on disconnection. In fact, it is not canceled anywhere except in the restart cleanup, where we don't really have to. This can cause an init-after-queue issue: if, for example, the work was queued and then drv_change_interface got executed. This can also cause use-after-free: if the work is executed after the vif is freed. Fixes: 9748ad82a9d9 ("wifi: iwlwifi: defer MLO scan after link activation") Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260129212650.a36482a60719.I5bf64a108ca39dacb5ca0dcd8b7258a3ce8db74c@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: enetc: Convert 16-bit register reads to 32-bit for ENETC v4	Claudiu Manoil	2	-4/+15
	[ Upstream commit c28d765ec5da160d3a48d0928528084cef97bf19 ] It is not recommended to access the 32‑bit registers of this hardware IP using lower‑width accessors (i.e. 16‑bit), and the only exception to this rule was introduced in the initial ENETC v1 driver for the PMAR1 register, which holds the lower 16 bits of the primary MAC address of an SI. Meanwhile, this exception has been replicated in the v4 driver code as well. Since LS1028 (the only SoC with ENETC v1) is not affected by this issue, the current patch converts the 16‑bit reads from PMAR1 starting with ENETC v4. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260130141035.272471-5-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: enetc: Convert 16-bit register writes to 32-bit for ENETC v4	Claudiu Manoil	1	-2/+2
	[ Upstream commit 21d0fc95b5920ae8e69a2c0394bef82b8392bcc9 ] For ENETC v4, which is integrated into more complex SoCs (compared to v1), 16‑bit register writes are blocked in the SoC interconnect on some chips. To be fair, it is not recommended to access 32‑bit registers of this IP using lower‑width accessors (i.e. 16‑bit), and the only exception to this rule was introduced by me in the initial ENETC v1 driver for the PMAR1 register, which holds the lower 16 bits of the primary MAC address of an SI. Meanwhile, this exception has been replicated for v4 as well. Since LS1028 (the only SoC with ENETC v1) is not affected by this issue, the current patch fixes the 16‑bit writes to PMAR1 starting with ENETC v4. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260130141035.272471-4-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: enetc: Remove CBDR cacheability AXI settings for ENETC v4	Claudiu Manoil	1	-4/+0
	[ Upstream commit 9ae13b2e64fcd2ca00a76b7d60fc4641a6b9209d ] For ENETC v4 these settings are controlled by the global ENETC command cache attribute registers (EnCAR), from the IERB register block. The hardcoded CDBR cacheability settings were inherited from LS1028A, and should be removed from the ENETC v4 driver as they conflict with the global IERB settings. Fixes: e3f4a0a8ddb4 ("net: enetc: add command BD ring support for i.MX95 ENETC") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260130141035.272471-3-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: enetc: Remove SI/BDR cacheability AXI settings for ENETC v4	Claudiu Manoil	1	-4/+7
	[ Upstream commit a69c17230cab07bd156f894fdc82bd78b43ea72f ] For ENETC v4 these settings are controlled by the global ENETC message and buffer cache attribute registers (EnBCAR and EnMCAR), from the IERB register block. The hardcoded cacheability settings were inherited from LS1028A, and should be removed from the ENETC v4 driver as they conflict with the global IERB settings. Fixes: 99100d0d9922 ("net: enetc: add preliminary support for i.MX95 ENETC PF") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20260130141035.272471-2-claudiu.manoil@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	hwmon: (acpi_power_meter) Fix deadlocks related to acpi_power_meter_notify()	Rafael J. Wysocki	1	-3/+14
	[ Upstream commit 615901b57b7ef8eb655f71358f7e956e42bcd16b ] The acpi_power_meter driver's .notify() callback function, acpi_power_meter_notify(), calls hwmon_device_unregister() under a lock that is also acquired by callbacks in sysfs attributes of the device being unregistered which is prone to deadlocks between sysfs access and device removal. Address this by moving the hwmon device removal in acpi_power_meter_notify() outside the lock in question, but notice that doing it alone is not sufficient because two concurrent METER_NOTIFY_CONFIG notifications may be attempting to remove the same device at the same time. To prevent that from happening, add a new lock serializing the execution of the switch () statement in acpi_power_meter_notify(). For simplicity, it is a static mutex which should not be a problem from the performance perspective. The new lock also allows the hwmon_device_register_with_info() in acpi_power_meter_notify() to be called outside the inner lock because it prevents the other notifications handled by that function from manipulating the "resource" object while the hwmon device based on it is being registered. The sending of ACPI netlink messages from acpi_power_meter_notify() is serialized by the new lock too which generally helps to ensure that the order of handling firmware notifications is the same as the order of sending netlink messages related to them. In addition, notice that hwmon_device_register_with_info() may fail in which case resource->hwmon_dev will become an error pointer, so add checks to avoid attempting to unregister the hwmon device pointer to by it in that case to acpi_power_meter_notify() and acpi_power_meter_remove(). Fixes: 16746ce8adfe ("hwmon: (acpi_power_meter) Replace the deprecated hwmon_device_register") Closes: https://lore.kernel.org/linux-hwmon/CAK8fFZ58fidGUCHi5WFX0uoTPzveUUDzT=k=AAm4yWo3bAuCFg@mail.gmail.com/ Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: usb: r8152: fix resume reset deadlock	Sergey Senozhatsky	1	-14/+15
	[ Upstream commit 6d06bc83a5ae8777a5f7a81c32dd75b8d9b2fe04 ] rtl8152 can trigger device reset during reset which potentially can result in a deadlock: ** DPM device timeout after 10 seconds; 15 seconds until panic ** Call Trace: <TASK> schedule+0x483/0x1370 schedule_preempt_disabled+0x15/0x30 __mutex_lock_common+0x1fd/0x470 __rtl8152_set_mac_address+0x80/0x1f0 dev_set_mac_address+0x7f/0x150 rtl8152_post_reset+0x72/0x150 usb_reset_device+0x1d0/0x220 rtl8152_resume+0x99/0xc0 usb_resume_interface+0x3e/0xc0 usb_resume_both+0x104/0x150 usb_resume+0x22/0x110 The problem is that rtl8152 resume calls reset under tp->control mutex while reset basically re-enters rtl8152 and attempts to acquire the same tp->control lock once again. Reset INACCESSIBLE device outside of tp->control mutex scope to avoid recursive mutex_lock() deadlock. Fixes: 4933b066fefb ("r8152: If inaccessible at resume time, issue a reset") Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org> Link: https://patch.msgid.link/20260129031106.3805887-1-senozhatsky@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	macvlan: fix error recovery in macvlan_common_newlink()	Eric Dumazet	1	-2/+3
	[ Upstream commit f8db6475a83649689c087a8f52486fcc53e627e9 ] valis provided a nice repro to crash the kernel: ip link add p1 type veth peer p2 ip link set address 00:00:00:00:00:20 dev p1 ip link set up dev p1 ip link set up dev p2 ip link add mv0 link p2 type macvlan mode source ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20 ping -c1 -I p1 1.2.3.4 He also gave a very detailed analysis: <quote valis> The issue is triggered when a new macvlan link is created with MACVLAN_MODE_SOURCE mode and MACVLAN_MACADDR_ADD (or MACVLAN_MACADDR_SET) parameter, lower device already has a macvlan port and register_netdevice() called from macvlan_common_newlink() fails (e.g. because of the invalid link name). In this case macvlan_hash_add_source is called from macvlan_change_sources() / macvlan_common_newlink(): This adds a reference to vlan to the port's vlan_source_hash using macvlan_source_entry. vlan is a pointer to the priv data of the link that is being created. When register_netdevice() fails, the error is returned from macvlan_newlink() to rtnl_newlink_create(): if (ops->newlink) err = ops->newlink(dev, &params, extack); else err = register_netdevice(dev); if (err < 0) { free_netdev(dev); goto out; } and free_netdev() is called, causing a kvfree() on the struct net_device that is still referenced in the source entry attached to the lower device's macvlan port. Now all packets sent on the macvlan port with a matching source mac address will trigger a use-after-free in macvlan_forward_source(). </quote valis> With all that, my fix is to make sure we call macvlan_flush_sources() regardless of @create value whenever "goto destroy_macvlan_port;" path is taken. Many thanks to valis for following up on this issue. Fixes: aa5fd0fb7748 ("driver: macvlan: Destroy new macvlan port if macvlan_common_newlink failed.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: valis <sec@valis.email> Reported-by: syzbot+7182fbe91e58602ec1fe@syzkaller.appspotmail.com Closes: https: //lore.kernel.org/netdev/695fb1e8.050a0220.1c677c.039f.GAE@google.com/T/#u Cc: Boudewijn van der Heide <boudewijn@delta-utec.com> Link: https://patch.msgid.link/20260129204359.632556-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: sfp: Fix quirk for Ubiquiti U-Fiber Instant SFP module	Marek Behún	1	-0/+2
	[ Upstream commit adcbadfd8e05d3558c9cfaa783f17c645181165f ] Commit fd580c9830316eda ("net: sfp: augment SFP parsing with phy_interface_t bitmap") did not add augumentation for the interface bitmap in the quirk for Ubiquiti U-Fiber Instant. The subsequent commit f81fa96d8a6c7a77 ("net: phylink: use phy_interface_t bitmaps for optical modules") then changed phylink code for selection of SFP interface: instead of using link mode bitmap, the interface bitmap is used, and the fastest interface mode supported by both SFP module and MAC is chosen. Since the interface bitmap contains also modes faster than 1000base-x, this caused a regression wherein this module stopped working out-of-the-box. Fix this. Fixes: fd580c9830316eda ("net: sfp: augment SFP parsing with phy_interface_t bitmap") Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20260129082227.17443-1-kabel@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	i40e: drop udp_tunnel_get_rx_info() call from i40e_open()	Mohammad Heib	1	-1/+0
	[ Upstream commit 40857194956dcaf3d2b66d6bd113d844c93bef54 ] The i40e driver calls udp_tunnel_get_rx_info() during i40e_open(). This is redundant because UDP tunnel RX offload state is preserved across device down/up cycles. The udp_tunnel core handles synchronization automatically when required. Furthermore, recent changes in the udp_tunnel infrastructure require querying RX info while holding the udp_tunnel lock. Calling it directly from the ndo_open path violates this requirement, triggering the following lockdep warning: Call Trace: <TASK> ? __udp_tunnel_nic_assert_locked+0x39/0x40 [udp_tunnel] i40e_open+0x135/0x14f [i40e] __dev_open+0x121/0x2e0 __dev_change_flags+0x227/0x270 dev_change_flags+0x3d/0xb0 devinet_ioctl+0x56f/0x860 sock_do_ioctl+0x7b/0x130 __x64_sys_ioctl+0x91/0xd0 do_syscall_64+0x90/0x170 ... </TASK> Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from i40e_open() resolve the locking violation. Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	ice: drop udp_tunnel_get_rx_info() call from ndo_open()	Mohammad Heib	1	-3/+0
	[ Upstream commit 234e615bfece9e3e91c50fe49ab9e68ee37c791a ] The ice driver calls udp_tunnel_get_rx_info() during ice_open_internal(). This is redundant because UDP tunnel RX offload state is preserved across device down/up cycles. The udp_tunnel core handles synchronization automatically when required. Furthermore, recent changes in the udp_tunnel infrastructure require querying RX info while holding the udp_tunnel lock. Calling it directly from the ndo_open path violates this requirement, triggering the following lockdep warning: Call Trace: <TASK> ice_open_internal+0x253/0x350 [ice] __udp_tunnel_nic_assert_locked+0x86/0xb0 [udp_tunnel] __dev_open+0x2f5/0x880 __dev_change_flags+0x44c/0x660 netif_change_flags+0x80/0x160 devinet_ioctl+0xd21/0x15f0 inet_ioctl+0x311/0x350 sock_ioctl+0x114/0x220 __x64_sys_ioctl+0x131/0x1a0 ... </TASK> Remove the redundant and unsafe call to udp_tunnel_get_rx_info() from ice_open_internal() to resolve the locking violation Fixes: 1ead7501094c ("udp_tunnel: remove rtnl_lock dependency") Signed-off-by: Mohammad Heib <mheib@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	ice: Fix PTP NULL pointer dereference during VSI rebuild	Aaron Ma	3	-5/+29
	[ Upstream commit fc6f36eaaedcf4b81af6fe1a568f018ffd530660 ] Fix race condition where PTP periodic work runs while VSI is being rebuilt, accessing NULL vsi->rx_rings. The sequence was: 1. ice_ptp_prepare_for_reset() cancels PTP work 2. ice_ptp_rebuild() immediately queues PTP work 3. VSI rebuild happens AFTER ice_ptp_rebuild() 4. PTP work runs and accesses NULL vsi->rx_rings Fix: Keep PTP work cancelled during rebuild, only queue it after VSI rebuild completes in ice_rebuild(). Added ice_ptp_queue_work() helper function to encapsulate the logic for queuing PTP work, ensuring it's only queued when PTP is supported and the state is ICE_PTP_READY. Error log: [ 121.392544] ice 0000:60:00.1: PTP reset successful [ 121.392692] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 121.392712] #PF: supervisor read access in kernel mode [ 121.392720] #PF: error_code(0x0000) - not-present page [ 121.392727] PGD 0 [ 121.392734] Oops: Oops: 0000 [#1] SMP NOPTI [ 121.392746] CPU: 8 UID: 0 PID: 1005 Comm: ice-ptp-0000:60 Tainted: G S 6.19.0-rc6+ #4 PREEMPT(voluntary) [ 121.392761] Tainted: [S]=CPU_OUT_OF_SPEC [ 121.392773] RIP: 0010:ice_ptp_update_cached_phctime+0xbf/0x150 [ice] [ 121.393042] Call Trace: [ 121.393047] <TASK> [ 121.393055] ice_ptp_periodic_work+0x69/0x180 [ice] [ 121.393202] kthread_worker_fn+0xa2/0x260 [ 121.393216] ? __pfx_ice_ptp_periodic_work+0x10/0x10 [ice] [ 121.393359] ? __pfx_kthread_worker_fn+0x10/0x10 [ 121.393371] kthread+0x10d/0x230 [ 121.393382] ? __pfx_kthread+0x10/0x10 [ 121.393393] ret_from_fork+0x273/0x2b0 [ 121.393407] ? __pfx_kthread+0x10/0x10 [ 121.393417] ret_from_fork_asm+0x1a/0x30 [ 121.393432] </TASK> Fixes: 803bef817807d ("ice: factor out ice_ptp_rebuild_owner()") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	ice: PTP: fix missing timestamps on E825 hardware	Jacob Keller	3	-78/+103
	[ Upstream commit 88b68f35eb43ad5ac77ac1107059040b04e6f477 ] The E825 hardware currently has each PF handle the PFINT_TSYN_TX cause of the miscellaneous OICR interrupt vector. The actual interrupt cause underlying this is shared by all ports on the same quad: ┌─────────────────────────────────┐ │ │ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │ │PF 0│ │PF 1│ │PF 2│ │PF 3│ │ │ └────┘ └────┘ └────┘ └────┘ │ │ │ └────────────────▲────────────────┘ │ │ ┌────────────────┼────────────────┐ │ PHY QUAD │ └───▲────────▲────────▲────────▲──┘ │ │ │ │ ┌───┼──┐ ┌───┴──┐ ┌───┼──┐ ┌───┼──┐ │Port 0│ │Port 1│ │Port 2│ │Port 3│ └──────┘ └──────┘ └──────┘ └──────┘ If multiple PFs issue Tx timestamp requests near simultaneously, it is possible that the correct PF will not be interrupted and will miss its timestamp. Understanding why is somewhat complex. Consider the following sequence of events: CPU 0: Send Tx packet on PF 0 ... PF 0 enqueues packet with Tx request CPU 1, PF1: ... Send Tx packet on PF1 ... PF 1 enqueues packet with Tx request HW: PHY Port 0 sends packet PHY raises Tx timestamp event interrupt MAC raises each PF interrupt CPU 0, PF0: CPU 1, PF1: ice_misc_intr() checks for Tx timestamps ice_misc_intr() checks for Tx timestamp Sees packet ready bit set Sees nothing available ... Exits ... ... HW: PHY port 1 sends packet PHY interrupt ignored because not all packet timestamps read yet. ... Read timestamp, report to stack Because the interrupt event is shared for all ports on the same quad, the PHY will not raise a new interrupt for any PF until all timestamps are read. In the example above, the second timestamp comes in for port 1 before the timestamp from port 0 is read. At this point, there is no longer an interrupt thread running that will read the timestamps, because each PF has checked and found that there was no work to do. Applications such as ptp4l will timeout after waiting a few milliseconds. Eventually, the watchdog service task will re-check for all quads and notice that there are outstanding timestamps, and issue a software interrupt to recover. However, by this point it is far too late, and applications have already failed. All of this occurs because of the underlying hardware behavior. The PHY cannot raise a new interrupt signal until all outstanding timestamps have been read. As a first step to fix this, switch the E825C hardware to the ICE_PTP_TX_INTERRUPT_ALL mode. In this mode, only the clock owner PF will respond to the PFINT_TSYN_TX cause. Other PFs disable this cause and will not wake. In this mode, the clock owner will iterate over all ports and handle timestamps for each connected port. This matches the E822 behavior, and is a necessary but insufficient step to resolve the missing timestamps. Even with use of the ICE_PTP_TX_INTERRUPT_ALL mode, we still sometimes miss a timestamp event. The ice_ptp_tx_tstamp_owner() does re-check the ready bitmap, but does so before re-enabling the OICR interrupt vector. It also only checks the ready bitmap, but not the software Tx timestamp tracker. To avoid risk of losing a timestamp, refactor the logic to check both the software Tx timestamp tracker bitmap and the hardware ready bitmap. Additionally, do this outside of ice_ptp_process_ts() after we have already re-enabled the OICR interrupt. Remove the checks from the ice_ptp_tx_tstamp(), ice_ptp_tx_tstamp_owner(), and the ice_ptp_process_ts() functions. This results in ice_ptp_tx_tstamp() being nothing more than a wrapper around ice_ptp_process_tx_tstamp() so we can remove it. Add the ice_ptp_tx_tstamps_pending() function which returns a boolean indicating if there are any pending Tx timestamps. First, check the software timestamp tracker bitmap. In ICE_PTP_TX_INTERRUPT_ALL mode, check all ports software trackers. If a tracker has outstanding timestamp requests, return true. Additionally, check the PHY ready bitmap to confirm if the PHY indicates any outstanding timestamps. In the ice_misc_thread_fn(), call ice_ptp_tx_tstamps_pending() just before returning from the IRQ thread handler. If it returns true, write to PFINT_OICR to trigger a PFINT_OICR_TSYN_TX_M software interrupt. This will force the handler to interrupt again and complete the work even if the PHY hardware did not interrupt for any reason. This results in the following new flow for handling Tx timestamps: 1) send Tx packet 2) PHY captures timestamp 3) PHY triggers MAC interrupt 4) clock owner executes ice_misc_intr() with PFINT_OICR_TSYN_TX flag set 5) ice_ptp_ts_irq() returns IRQ_WAKE_THREAD 7) The interrupt thread wakes up and kernel calls ice_misc_intr_thread_fn() 8) ice_ptp_process_ts() is called to handle any outstanding timestamps 9) ice_irq_dynamic_ena() is called to re-enable the OICR hardware interrupt cause 10) ice_ptp_tx_tstamps_pending() is called to check if we missed any more outstanding timestamps, checking both software and hardware indicators. With this change, it should no longer be possible for new timestamps to come in such a way that we lose an interrupt. If a timestamp comes in before the ice_ptp_tx_tstamps_pending() call, it will be noticed by at least one of the software bitmap check or the hardware bitmap check. If the timestamp comes in after this check, it should cause a timestamp interrupt as we have already read all timestamps from the PHY and the OICR vector has been re-enabled. Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemyslaw Korba <przemyslaw.korba@intel.com> Tested-by: Vitaly Grinberg <vgrinber@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	ice: fix missing TX timestamps interrupts on E825 devices	Grzegorz Nitka	1	-1/+4
	[ Upstream commit 99854c167cfc113ad863832b1601c4ca1a639cfe ] Modify PTP (Precision Time Protocol) configuration on link down flow. Previously, PHY_REG_TX_OFFSET_READY register was cleared in such case. This register is used to determine if the timestamp is valid or not on the hardware side. However, there is a possibility that there is still the packet in the HW queue which originally was supposed to be timestamped but the link is already down and given register is cleared. This potentially might lead to the situation in which that 'delayed' packet's timestamp is treated as invalid one when the link is up again. This in turn leads to the situation in which the driver is not able to effectively clean timestamp memory and interrupt configuration. From the hardware perspective, that 'old' interrupt was not handled properly and even if new timestamp packets are processed, no new interrupts is generated. As a result, providing timestamps to the user applications (like ptp4l) is not possible. The solution for this problem is implemented at the driver level rather than the firmware, and maintains the tx_ready bit high, even during link down events. This avoids entering a potential inconsistent state between the driver and the timestamp hardware. Testing hints: - run PTP traffic at higher rate (like 16 PTP messages per second) - observe ptp4l behaviour at the client side in the following conditions: a) trigger link toggle events. It needs to be physiscal link down/up events b) link speed change In all above cases, PTP processing at ptp4l application should resume always. In failure case, the following permanent error message in ptp4l log was observed: controller-0 ptp4l: err [6175.116] ptp4l-legacy timed out while polling for tx timestamp Fixes: 7cab44f1c35f ("ice: Introduce ETH56G PHY model for E825C products") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	dpaa2-switch: add bounds check for if_id in IRQ handler	Junrui Luo	1	-0/+4
	[ Upstream commit 31a7a0bbeb006bac2d9c81a2874825025214b6d8 ] The IRQ handler extracts if_id from the upper 16 bits of the hardware status register and uses it to index into ethsw->ports[] without validation. Since if_id can be any 16-bit value (0-65535) but the ports array is only allocated with sw_attr.num_ifs elements, this can lead to an out-of-bounds read potentially. Add a bounds check before accessing the array, consistent with the existing validation in dpaa2_switch_rx(). Reported-by: Yuhao Jiang <danisjiang@gmail.com> Reported-by: Junrui Luo <moonafterrain@outlook.com> Fixes: 24ab724f8a46 ("dpaa2-switch: use the port index in the IRQ handler") Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/SYBPR01MB7881D420AB43FF1A227B84AFAF91A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: liquidio: Fix off-by-one error in VF setup_nic_devices() cleanup	Zilin Guan	1	-2/+2
	[ Upstream commit 6cbba46934aefdfb5d171e0a95aec06c24f7ca30 ] In setup_nic_devices(), the initialization loop jumps to the label setup_nic_dev_free on failure. The current cleanup loop while(i--) skip the failing index i, causing a memory leak. Fix this by changing the loop to iterate from the current index i down to 0. Compile tested only. Issue found using code review. Fixes: 846b46873eeb ("liquidio CN23XX: VF offload features") Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20260128154440.278369-4-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: liquidio: Fix off-by-one error in PF setup_nic_devices() cleanup	Zilin Guan	1	-2/+3
	[ Upstream commit 8558aef4e8a1a83049ab906d21d391093cfa7e7f ] In setup_nic_devices(), the initialization loop jumps to the label setup_nic_dev_free on failure. The current cleanup loop while(i--) skip the failing index i, causing a memory leak. Fix this by changing the loop to iterate from the current index i down to 0. Also, decrement i in the devlink_alloc failure path to point to the last successfully allocated index. Compile tested only. Issue found using code review. Fixes: f21fb3ed364b ("Add support of Cavium Liquidio ethernet adapters") Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20260128154440.278369-3-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	net: liquidio: Initialize netdev pointer before queue setup	Zilin Guan	1	-17/+17
	[ Upstream commit 926ede0c85e1e57c97d64d9612455267d597bb2c ] In setup_nic_devices(), the netdev is allocated using alloc_etherdev_mq(). However, the pointer to this structure is stored in oct->props[i].netdev only after the calls to netif_set_real_num_rx_queues() and netif_set_real_num_tx_queues(). If either of these functions fails, setup_nic_devices() returns an error without freeing the allocated netdev. Since oct->props[i].netdev is still NULL at this point, the cleanup function liquidio_destroy_nic_device() will fail to find and free the netdev, resulting in a memory leak. Fix this by initializing oct->props[i].netdev before calling the queue setup functions. This ensures that the netdev is properly accessible for cleanup in case of errors. Compile tested only. Issue found using a prototype static analysis tool and code review. Fixes: c33c997346c3 ("liquidio: enhanced ethtool --set-channels feature") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20260128154440.278369-2-zilin@seu.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	dpaa2-switch: prevent ZERO_SIZE_PTR dereference when num_ifs is zero	Junrui Luo	1	-0/+6
	[ Upstream commit ed48a84a72fefb20a82dd90a7caa7807e90c6f66 ] The driver allocates arrays for ports, FDBs, and filter blocks using kcalloc() with ethsw->sw_attr.num_ifs as the element count. When the device reports zero interfaces (either due to hardware configuration or firmware issues), kcalloc(0, ...) returns ZERO_SIZE_PTR (0x10) instead of NULL. Later in dpaa2_switch_probe(), the NAPI initialization unconditionally accesses ethsw->ports[0]->netdev, which attempts to dereference ZERO_SIZE_PTR (address 0x10), resulting in a kernel panic. Add a check to ensure num_ifs is greater than zero after retrieving device attributes. This prevents the zero-sized allocations and subsequent invalid pointer dereference. Reported-by: Yuhao Jiang <danisjiang@gmail.com> Reported-by: Junrui Luo <moonafterrain@outlook.com> Fixes: 0b1b71370458 ("staging: dpaa2-switch: handle Rx path on control interface") Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/SYBPR01MB7881BEABA8DA896947962470AF91A@SYBPR01MB7881.ausprd01.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	hwmon: (dell-smm) Add Dell G15 5510 to fan control whitelist	leobannocloutier@gmail.com	1	-0/+8
	[ Upstream commit 830e0bef79aaaea8b1ef426b8032e70c63a58653 ] On the Dell G15 5510, fans spin at maximum speed when AC power is connected. This behavior has been observed as a regression in recent kernels (v6.18+). Add the Dell G15 5510 to the fan control whitelist to enable manual fan control and resolve the issue. This model requires the same fan control configuration as the Dell G15 5511. Fixes: 1c1658058c99 ("hwmon: (dell-smm) Add support for automatic fan mode") Signed-off-by: Leo Banno-Cloutier <leobannocloutier@gmail.com> Link: https://lore.kernel.org/r/20260117015315.214569-2-leobannocloutier@gmail.com [groeck: Updated patch description to follow guidance] Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	platform/x86/intel/tpmi/plr: Make the file domain<n>/status writeable	Ricardo Neri	1	-1/+1
	[ Upstream commit 008bec8ffe6e7746588d1e12c5b3865fa478fc91 ] The file sys/kernel/debug/tpmi-<n>/plr/domain<n>/status has store and show callbacks. Make it writeable. Fixes: 811f67c51636d ("platform/x86/intel/tpmi: Add new auxiliary driver for performance limits") Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Link: https://patch.msgid.link/20260127-plr-debugfs-write-v1-1-1fffbc370b1e@linux.intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	platform/x86: hp-bioscfg: Skip empty attribute names	Mario Limonciello	1	-0/+5
	[ Upstream commit 6222883af286e2feb3c9ff2bf9fd8fdf4220c55a ] Avoid registering kobjects with empty names when a BIOS attribute name decodes to an empty string. Fixes: a34fc329b1895 ("platform/x86: hp-bioscfg: bioscfg") Reported-by: Alain Cousinie <alain.cousinie@laposte.net> Closes: https://lore.kernel.org/platform-driver-x86/22ed5f78-c8bf-4ab4-8c38-420cc0201e7e@laposte.net/ Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patch.msgid.link/20260128190501.2170068-1-mario.limonciello@amd.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	platform/x86: intel_telemetry: Fix PSS event register mask	Kaushlendra Kumar	1	-1/+1
	[ Upstream commit 39e9c376ac42705af4ed4ae39eec028e8bced9b4 ] The PSS telemetry info parsing incorrectly applies TELEM_INFO_SRAMEVTS_MASK when extracting event register count from firmware response. This reads bits 15-8 instead of the correct bits 7-0, causing misdetection of hardware capabilities. The IOSS path correctly uses TELEM_INFO_NENABLES_MASK for register count. Apply the same mask to PSS parsing for consistency. Fixes: 9d16b482b059 ("platform:x86: Add Intel telemetry platform driver") Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Link: https://patch.msgid.link/20251224061144.3925519-1-kaushlendra.kumar@intel.com Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	platform/x86: toshiba_haps: Fix memory leaks in add/remove routines	Rafael J. Wysocki	1	-1/+1
	[ Upstream commit 128497456756e1b952bd5a912cd073836465109d ] toshiba_haps_add() leaks the haps object allocated by it if it returns an error after allocating that object successfully. toshiba_haps_remove() does not free the object pointed to by toshiba_haps before clearing that pointer, so it becomes unreachable allocated memory. Address these memory leaks by using devm_kzalloc() for allocating the memory in question. Fixes: 23d0ba0c908a ("platform/x86: Toshiba HDD Active Protection Sensor") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	Revert "drm/amd/display: pause the workload setting in dm"	Alex Deucher	1	-11/+0
	[ Upstream commit f377ea0561c9576cdb7e3890bcf6b8168d455464 ] This reverts commit bc6d54ac7e7436721a19443265f971f890c13cc5. The workload profile needs to be in the default state when the dc idle optimizaion state is entered. However, when jobs come in for video or GFX or compute, the profile may be set to a non-default profile resulting in the dc idle optimizations not taking affect and resulting in higher power usage. As such we need to pause the workload profile changes during this transition. When this patch was originally committed, it caused a regression with a Dell U3224KB display, but no other problems were reported at the time. When it was reapplied (this patch) to address increased power usage, it seems to have caused additional regressions. This change seems to have a number of side affects (audio issues, stuttering, etc.). I suspect the pause should only happen when all displays are off or in static screen mode, but I think this call site gets called more often than that which results in idle state entry more often than intended. For now revert. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4894 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4717 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4725 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4517 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4806 Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Roman Li <Roman.Li@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1412482b714358ffa30d38fd3dd0b05795163648) Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	scsi: target: iscsi: Fix use-after-free in iscsit_dec_conn_usage_count()	Maurizio Lombardi	1	-1/+4
	[ Upstream commit 9411a89e9e7135cc459178fa77a3f1d6191ae903 ] In iscsit_dec_conn_usage_count(), the function calls complete() while holding the conn->conn_usage_lock. As soon as complete() is invoked, the waiter (such as iscsit_close_connection()) may wake up and proceed to free the iscsit_conn structure. If the waiter frees the memory before the current thread reaches spin_unlock_bh(), it results in a KASAN slab-use-after-free as the function attempts to release a lock within the already-freed connection structure. Fix this by releasing the spinlock before calling complete(). Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reported-by: Zhaojuan Guo <zguo@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260112165352.138606-2-mlombard@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	scsi: target: iscsi: Fix use-after-free in iscsit_dec_session_usage_count()	Maurizio Lombardi	1	-1/+4
	[ Upstream commit 84dc6037390b8607c5551047d3970336cb51ba9a ] In iscsit_dec_session_usage_count(), the function calls complete() while holding the sess->session_usage_lock. Similar to the connection usage count logic, the waiter signaled by complete() (e.g., in the session release path) may wake up and free the iscsit_session structure immediately. This creates a race condition where the current thread may attempt to execute spin_unlock_bh() on a session structure that has already been deallocated, resulting in a KASAN slab-use-after-free. To resolve this, release the session_usage_lock before calling complete() to ensure all dereferences of the sess pointer are finished before the waiter is allowed to proceed with deallocation. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reported-by: Zhaojuan Guo <zguo@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Link: https://patch.msgid.link/20260112165352.138606-3-mlombard@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	spi: intel-pci: Add support for Nova Lake SPI serial flash	Alan Borzeszkowski	1	-0/+1
	[ Upstream commit caa329649259d0f90c0056c9860ca659d4ba3211 ] Add Intel Nova Lake PCH-S SPI serial flash PCI ID to the list of supported devices. This is the same controller found in previous generations. Signed-off-by: Alan Borzeszkowski <alan.borzeszkowski@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://patch.msgid.link/20260115120305.10080-1-alan.borzeszkowski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>