kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2025-08-28	mmc: sdhci-pci-gli: GL9763e: Mask the replay timer timeout of AER	Victor Shih	1	-0/+3
	commit 340be332e420ed37d15d4169a1b4174e912ad6cb upstream. Due to a flaw in the hardware design, the GL9763e replay timer frequently times out when ASPM is enabled. As a result, the warning messages will often appear in the system log when the system accesses the GL9763e PCI config. Therefore, the replay timer timeout must be masked. Signed-off-by: Victor Shih <victor.shih@genesyslogic.com.tw> Fixes: 1ae1d2d6e555 ("mmc: sdhci-pci-gli: Add Genesys Logic GL9763E support") Cc: stable@vger.kernel.org Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250731065752.450231-4-victorshihgli@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	memstick: Fix deadlock by moving removing flag earlier	Jiayi Li	2	-1/+1
	commit 99d7ab8db9d8230b243f5ed20ba0229e54cc0dfa upstream. The existing memstick core patch: commit 62c59a8786e6 ("memstick: Skip allocating card when removing host") sets host->removing in memstick_remove_host(),but still exists a critical time window where memstick_check can run after host->eject is set but before removing is set. In the rtsx_usb_ms driver, the problematic sequence is: rtsx_usb_ms_drv_remove: memstick_check: host->eject = true cancel_work_sync(handle_req) if(!host->removing) ... memstick_alloc_card() memstick_set_rw_addr() memstick_new_req() rtsx_usb_ms_request() if(!host->eject) skip schedule_work wait_for_completion() memstick_remove_host: [blocks indefinitely] host->removing = true flush_workqueue() [block] 1. rtsx_usb_ms_drv_remove sets host->eject = true 2. cancel_work_sync(&host->handle_req) runs 3. memstick_check work may be executed here <-- danger window 4. memstick_remove_host sets removing = 1 During this window (step 3), memstick_check calls memstick_alloc_card, which may indefinitely waiting for mrq_complete completion that will never occur because rtsx_usb_ms_request sees eject=true and skips scheduling work, memstick_set_rw_addr waits forever for completion. This causes a deadlock when memstick_remove_host tries to flush_workqueue, waiting for memstick_check to complete, while memstick_check is blocked waiting for mrq_complete completion. Fix this by setting removing=true at the start of rtsx_usb_ms_drv_remove, before any work cancellation. This ensures memstick_check will see the removing flag immediately and exit early, avoiding the deadlock. Fixes: 62c59a8786e6 ("memstick: Skip allocating card when removing host") Signed-off-by: Jiayi Li <lijiayi@kylinos.cn> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250804013604.1311218-1-lijiayi@kylinos.cn Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	mmc: sdhci-pci-gli: Add a new function to simplify the code	Victor Shih	1	-14/+16
	commit dec8b38be4b35cae5f7fa086daf2631e2cfa09c1 upstream. In preparation to fix replay timer timeout, add sdhci_gli_mask_replay_timer_timeout() function to simplify some of the code, allowing it to be re-used. Signed-off-by: Victor Shih <victor.shih@genesyslogic.com.tw> Fixes: 1ae1d2d6e555 ("mmc: sdhci-pci-gli: Add Genesys Logic GL9763E support") Cc: stable@vger.kernel.org Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250731065752.450231-2-victorshihgli@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	iommu/arm-smmu-v3: Fix smmu_domain->nr_ats_masters decrement	Nicolin Chen	1	-1/+1
	commit 685ca577b408ffd9c5a4057a2acc0cd3e6978b36 upstream. The arm_smmu_attach_commit() updates master->ats_enabled before calling arm_smmu_remove_master_domain() that is supposed to clean up everything in the old domain, including the old domain's nr_ats_masters. So, it is supposed to use the old ats_enabled state of the device, not an updated state. This isn't a problem if switching between two domains where: - old ats_enabled = false; new ats_enabled = false - old ats_enabled = true; new ats_enabled = true but can fail cases where: - old ats_enabled = false; new ats_enabled = true (old domain should keep the counter but incorrectly decreased it) - old ats_enabled = true; new ats_enabled = false (old domain needed to decrease the counter but incorrectly missed it) Update master->ats_enabled after arm_smmu_remove_master_domain() to fix this. Fixes: 7497f4211f4f ("iommu/arm-smmu-v3: Make changing domains be hitless for ATS") Cc: stable@vger.kernel.org Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Link: https://lore.kernel.org/r/20250801030127.2006979-1-nicolinc@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	Revert "can: ti_hecc: fix -Woverflow compiler warning"	Greg Kroah-Hartman	1	-1/+1
	This reverts commit 1da38b70d90f8529c060dd380d0c18e6d9595463 which is commit 7cae4d04717b002cffe41169da3f239c845a0723 upstream. Reported-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/63e25fdb-095a-40eb-b341-75781e71ea95@roeck-us.net Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	ata: libata-scsi: Return aborted command when missing sense and result TF	Damien Le Moal	1	-12/+15
	commit d2be9ea9a75550a35c5127a6c2633658bc38c76b upstream. ata_gen_ata_sense() is always called for a failed qc missing sense data so that a sense key, code and code qualifier can be generated using ata_to_sense_error() from the qc status and error fields of its result task file. However, if the qc does not have its result task file filled, ata_gen_ata_sense() returns early without setting a sense key. Improve this by defaulting to returning ABORTED COMMAND without any additional sense code, since we do not know the reason for the failure. The same fix is also applied in ata_gen_passthru_sense() with the additional check that the qc failed (qc->err_mask is set). Fixes: 816be86c7993 ("ata: libata-scsi: Check ATA_QCFLAG_RTF_FILLED before using result_tf") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	usb: typec: fusb302: cache PD RX state	Sebastian Reichel	1	-0/+8
	[ Upstream commit 1e61f6ab08786d66a11cfc51e13d6f08a6b06c56 ] This patch fixes a race condition communication error, which ends up in PD hard resets when losing the race. Some systems, like the Radxa ROCK 5B are powered through USB-C without any backup power source and use a FUSB302 chip to do the PD negotiation. This means it is quite important to avoid hard resets, since that effectively kills the system's power-supply. I've found the following race condition while debugging unplanned power loss during booting the board every now and then: 1. lots of TCPM/FUSB302/PD initialization stuff 2. TCPM ends up in SNK_WAIT_CAPABILITIES (tcpm_set_pd_rx is enabled here) 3. the remote PD source does not send anything, so TCPM does a SOFT RESET 4. TCPM ends up in SNK_WAIT_CAPABILITIES for the second time (tcpm_set_pd_rx is enabled again, even though it is still on) At this point I've seen broken CRC good messages being send by the FUSB302 with a logic analyzer sniffing the CC lines. Also it looks like messages are being lost and things generally going haywire with one of the two sides doing a hard reset once a broken CRC good message was send to the bus. I think the system is running into a race condition, that the FIFOs are being cleared and/or the automatic good CRC message generation flag is being updated while a message is already arriving. Let's avoid this by caching the PD RX enabled state, as we have already processed anything in the FIFOs and are in a good state. As a side effect that this also optimizes I2C bus usage :) As far as I can tell the problem theoretically also exists when TCPM enters SNK_WAIT_CAPABILITIES the first time, but I believe this is less critical for the following reason: On devices like the ROCK 5B, which are powered through a TCPM backed USB-C port, the bootloader must have done some prior PD communication (initial communication must happen within 5 seconds after plugging the USB-C plug). This means the first time the kernel TCPM state machine reaches SNK_WAIT_CAPABILITIES, the remote side is not sending messages actively. On other devices a hard reset simply adds some extra delay and things should be good afterwards. Fixes: c034a43e72dda ("staging: typec: Fairchild FUSB302 Type-c chip driver") Cc: stable <stable@kernel.org> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250704-fusb302-race-condition-fix-v1-1-239012c0e27a@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	USB: typec: Use str_enable_disable-like helpers	Krzysztof Kozlowski	6	-21/+27
	[ Upstream commit 13b3af26a41538e5051baedba8678eba521a27d3 ] Replace ternary (condition ? "enable" : "disable") syntax with helpers from string_choices.h because: 1. Simple function call with one argument is easier to read. Ternary operator has three arguments and with wrapping might lead to quite long code. 2. Is slightly shorter thus also easier to read. 3. It brings uniformity in the text - same string. 4. Allows deduping by the linker, which results in a smaller binary file. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/20250114-str-enable-disable-usb-v1-3-c8405df47c19@linaro.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 1e61f6ab0878 ("usb: typec: fusb302: cache PD RX state") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	ACPI: pfr_update: Fix the driver update version check	Chen Yu	1	-1/+1
	commit 8151320c747efb22d30b035af989fed0d502176e upstream. The security-version-number check should be used rather than the runtime version check for driver updates. Otherwise, the firmware update would fail when the update binary had a lower runtime version number than the current one. Fixes: 0db89fa243e5 ("ACPI: Introduce Platform Firmware Runtime Update device driver") Cc: 5.17+ <stable@vger.kernel.org> # 5.17+ Reported-by: "Govindarajulu, Hariganesh" <hariganesh.govindarajulu@intel.com> Signed-off-by: Chen Yu <yu.c.chen@intel.com> Link: https://patch.msgid.link/20250722143233.3970607-1-yu.c.chen@intel.com [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amd/display: Don't overwrite dce60_clk_mgr	Timur Kristóf	1	-1/+0
	commit 4db9cd554883e051df1840d4d58d636043101034 upstream. dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: 62eab49faae7 ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bbddcbe36a686af03e91341b9bbfcca94bd45fb6) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amd/display: fix a Null pointer dereference vulnerability	Siyang Liu	1	-9/+10
	commit 1bcf63a44381691d6192872801f830ce3250e367 upstream. [Why] A null pointer dereference vulnerability exists in the AMD display driver's (DC module) cleanup function dc_destruct(). When display control context (dc->ctx) construction fails (due to memory allocation failure), this pointer remains NULL. During subsequent error handling when dc_destruct() is called, there's no NULL check before dereferencing the perf_trace member (dc->ctx->perf_trace), causing a kernel null pointer dereference crash. [How] Check if dc->ctx is non-NULL before dereferencing. Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> (Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9dd8e2ba268c636c240a918e0a31e6feaee19404) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amd/display: Add primary plane to commits for correct VRR handling	Michel Dänzer	1	-0/+9
	commit 3477c1b0972dc1c8a46f78e8fb1fa6966095b5ec upstream. amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for the primary plane. If a commit affects a CRTC but not its primary plane, it would previously not trigger a refresh cycle or affect LFC, violating current UAPI semantics. Fixes e.g. atomic commits affecting only the cursor plane being limited to the minimum refresh rate. Don't do this for the legacy cursor ioctls though, it would break the UAPI semantics for those. Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cc7bfba95966251b254cb970c21627124da3b7f4) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdkfd: Destroy KFD debugfs after destroy KFD wq	Amber Lin	1	-1/+1
	commit 2e58401a24e7b2d4ec619104e1a76590c1284a4c upstream. Since KFD proc content was moved to kernel debugfs, we can't destroy KFD debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line debugfs_remove_recursive(entry->proc_dentry); tries to remove /sys/kernel/debug/kfd/proc/<pid> while /sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel NULL pointer. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0333052d90683d88531558dcfdbf2525cc37c233) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: update mmhub 4.1.0 client id mappings	Alex Deucher	1	-21/+13
	commit a0b34e4c8663b13e45c78267b4de3004b1a72490 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Tested-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: update mmhub 3.0.1 client id mappings	Alex Deucher	1	-25/+32
	commit 0bae62cc989fa99ac9cb564eb573aad916d1eb61 upstream. Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2a2681eda73b99a2c1ee8cdb006099ea5d0c2505) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: Update external revid for GC v9.5.0	Lijo Lazar	1	-0/+2
	commit 05c8b690511854ba31d8d1bff7139a13ec66b9e7 upstream. Use different external revid for GC v9.5.0 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 21c6764ed4bfaecad034bc4fd15dd64c5a436325) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: Initialize data to NULL in imu_v12_0_program_rlc_ram()	Nathan Chancellor	1	-1/+1
	commit c90f2e1172c51fa25492471dc9910e2d7c1444b9 upstream. After a recent change in clang to expose uninitialized warnings from const variables and pointers [1], there is a warning in imu_v12_0_program_rlc_ram() because data is passed uninitialized to program_imu_rlc_ram(): drivers/gpu/drm/amd/amdgpu/imu_v12_0.c:374:30: error: variable 'data' is uninitialized when used here [-Werror,-Wuninitialized] 374 \| program_imu_rlc_ram(adev, data, (const u32)size); \| ^~~~ As this warning happens early in clang's frontend, it does not realize that due to the assignment of r to -EINVAL, program_imu_rlc_ram() is never actually called, and even if it were, data would not be dereferenced because size is 0. Just initialize data to NULL to silence the warning, as the commit that added program_imu_rlc_ram() mentioned it would eventually be used over the old method, at which point data can be properly initialized and used. Cc: stable@vger.kernel.org Closes: https://github.com/ClangBuiltLinux/linux/issues/2107 Fixes: 56159fffaab5 ("drm/amdgpu: use new method to program rlc ram") Link: https://github.com/llvm/llvm-project/commit/2464313eef01c5b1edf0eccf57a32cdee01472c7 [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: check if hubbub is NULL in debugfs/amdgpu_dm_capabilities	Peter Shkenev	1	-1/+1
	commit b4a69f7f29c8a459ad6b4d8a8b72450f1d9fd288 upstream. HUBBUB structure is not initialized on DCE hardware, so check if it is NULL to avoid null dereference while accessing amdgpu_dm_capabilities file in debugfs. Signed-off-by: Peter Shkenev <mustela@erminea.space> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu: Avoid extra evict-restore process.	Gang Ba	1	-4/+2
	commit 1f02f2044bda1db1fd995bc35961ab075fa7b5a2 upstream. If vm belongs to another process, this is fclose after fork, wait may enable signaling KFD eviction fence and cause parent process queue evicted. [677852.634569] amdkfd_fence_enable_signaling+0x56/0x70 [amdgpu] [677852.634814] __dma_fence_enable_signaling+0x3e/0xe0 [677852.634820] dma_fence_wait_timeout+0x3a/0x140 [677852.634825] amddma_resv_wait_timeout+0x7f/0xf0 [amdkcl] [677852.634831] amdgpu_vm_wait_idle+0x2d/0x60 [amdgpu] [677852.635026] amdgpu_flush+0x34/0x50 [amdgpu] [677852.635208] filp_flush+0x38/0x90 [677852.635213] filp_close+0x14/0x30 [677852.635216] do_close_on_exec+0xdd/0x130 [677852.635221] begin_new_exec+0x1da/0x490 [677852.635225] load_elf_binary+0x307/0xea0 [677852.635231] ? srso_alias_return_thunk+0x5/0xfbef5 [677852.635235] ? ima_bprm_check+0xa2/0xd0 [677852.635240] search_binary_handler+0xda/0x260 [677852.635245] exec_binprm+0x58/0x1a0 [677852.635249] bprm_execve.part.0+0x16f/0x210 [677852.635254] bprm_execve+0x45/0x80 [677852.635257] do_execveat_common.isra.0+0x190/0x200 Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Gang Ba <Gang.Ba@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amd: Restore cached power limit during resume	Mario Limonciello	1	-0/+6
	commit ed4efe426a49729952b3dc05d20e33b94409bdd1 upstream. The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 26a609e053a6fc494403e95403bc6a2470383bec) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	drm/amdgpu/discovery: fix fw based ip discovery	Alex Deucher	2	-36/+41
	commit 514678da56da089b756b4d433efd964fa22b2079 upstream. We only need the fw based discovery table for sysfs. No need to parse it. Additionally parsing some of the board specific tables may result in incorrect data on some boards. just load the binary and don't parse it on those boards. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: 80a0e8282933 ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 62eedd150fa11aefc2d377fc746633fdb1baeb55) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: venc: Clamp param smaller than 1fps and bigger than 240	Ricardo Ribalda	1	-3/+2
	commit 417c01b92ec278a1118a05c6ad8a796eaa0c9c52 upstream. The driver uses "whole" fps in all its calculations (e.g. in load_per_instance()). Those calculation expect an fps bigger than 1, and not big enough to overflow. Clamp the param if the user provides a value that will result in an invalid fps. Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Closes: https://lore.kernel.org/linux-media/f11653a7-bc49-48cd-9cdb-1659147453e4@xs4all.nl/T/#m91cd962ac942834654f94c92206e2f85ff7d97f0 Fixes: aaaa93eda64b ("[media] media: venus: venc: add video encoder files") Cc: stable@vger.kernel.org Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> [bod: Change "parm" to "param"] Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: vdec: Clamp param smaller than 1fps and bigger than 240.	Ricardo Ribalda	2	-3/+4
	commit 377dc500d253f0b26732b2cb062e89668aef890a upstream. The driver uses "whole" fps in all its calculations (e.g. in load_per_instance()). Those calculation expect an fps bigger than 1, and not big enough to overflow. Clamp the value if the user provides a param that will result in an invalid fps. Reported-by: Hans Verkuil <hverkuil@xs4all.nl> Closes: https://lore.kernel.org/linux-media/f11653a7-bc49-48cd-9cdb-1659147453e4@xs4all.nl/T/#m91cd962ac942834654f94c92206e2f85ff7d97f0 Fixes: 7472c1c69138 ("[media] media: venus: vdec: add video decoder files") Cc: stable@vger.kernel.org Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> # qrb5615-rb5 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> [bod: Change "parm" to "param"] Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: protect against spurious interrupts during probe	Jorge Ramirez-Ortiz	1	-4/+4
	commit 3200144a2fa4209dc084a19941b9b203b43580f0 upstream. Make sure the interrupt handler is initialized before the interrupt is registered. If the IRQ is registered before hfi_create(), it's possible that an interrupt fires before the handler setup is complete, leading to a NULL dereference. This error condition has been observed during system boot on Rb3Gen2. Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions") Cc: stable@vger.kernel.org Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Dikshita Agarwal <quic_dikshita@quicinc.com> # RB5 Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: hfi: explicitly release IRQ during teardown	Jorge Ramirez-Ortiz	1	-0/+1
	commit 640803003cd903cea73dc6a86bf6963e238e2b3f upstream. Ensure the IRQ is disabled - and all pending handlers completed - before dismantling the interrupt routing and clearing related pointers. This prevents any possibility of the interrupt triggering after the handler context has been invalidated. Fixes: d96d3f30c0f2 ("[media] media: venus: hfi: add Venus HFI files") Cc: stable@vger.kernel.org Signed-off-by: Jorge Ramirez-Ortiz <jorge.ramirez@oss.qualcomm.com> Reviewed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Tested-by: Dikshita Agarwal <quic_dikshita@quicinc.com> # RB5 Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: Fix MSM8998 frequency table	Konrad Dybcio	1	-5/+5
	commit ee3b94f22638e0f7a1893d95d87b08698b680052 upstream. Fill in the correct data for the production SKU. Fixes: 193b3dac29a4 ("media: venus: add msm8998 support") Cc: stable@vger.kernel.org Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Reviewed-by: Vikash Garodia <quic_vgarodia@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: venus: Add a check for packet size after reading from shared memory	Vedang Nagar	1	-0/+4
	commit 49befc830daa743e051a65468c05c2ff9e8580e6 upstream. Add a check to ensure that the packet size does not exceed the number of available words after reading the packet header from shared memory. This ensures that the size provided by the firmware is safe to process and prevent potential out-of-bounds memory access. Fixes: d96d3f30c0f2 ("[media] media: venus: hfi: add Venus HFI files") Cc: stable@vger.kernel.org Signed-off-by: Vedang Nagar <quic_vnagar@quicinc.com> Co-developed-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Signed-off-by: Dikshita Agarwal <quic_dikshita@quicinc.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: qcom: camss: cleanup media device allocated resource on error path	Vladimir Zapolskiy	1	-1/+3
	commit 69080ec3d0daba8a894025476c98ab16b5a505a4 upstream. A call to media_device_init() requires media_device_cleanup() counterpart to complete cleanup and release any allocated resources. This has been done in the driver .remove() right from the beginning, but error paths on .probe() shall also be fixed. Fixes: a1d7c116fcf7 ("media: camms: Add core files") Cc: stable@vger.kernel.org Signed-off-by: Vladimir Zapolskiy <vladimir.zapolskiy@linaro.org> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Bryan O'Donoghue <bod@kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: ivsc: Fix crash at shutdown due to missing mei_cldev_disable() calls	Hans de Goede	2	-0/+4
	commit 0c92c49fc688cfadacc47ae99b06a31237702e9e upstream. Both the ACE and CSI driver are missing a mei_cldev_disable() call in their remove() function. This causes the mei_cl client to stay part of the mei_device->file_list list even though its memory is freed by mei_cl_bus_dev_release() calling kfree(cldev->cl). This leads to a use-after-free when mei_vsc_remove() runs mei_stop() which first removes all mei bus devices calling mei_ace_remove() and mei_csi_remove() followed by mei_cl_bus_dev_release() and then calls mei_cl_all_disconnect() which walks over mei_device->file_list dereferecing the just freed cldev->cl. And mei_vsc_remove() it self is run at shutdown because of the platform_device_unregister(tp->pdev) in vsc_tp_shutdown() When building a kernel with KASAN this leads to the following KASAN report: [ 106.634504] ================================================================== [ 106.634623] BUG: KASAN: slab-use-after-free in mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei [ 106.634683] Read of size 4 at addr ffff88819cb62018 by task systemd-shutdow/1 [ 106.634729] [ 106.634767] Tainted: [E]=UNSIGNED_MODULE [ 106.634770] Hardware name: Dell Inc. XPS 16 9640/09CK4V, BIOS 1.12.0 02/10/2025 [ 106.634773] Call Trace: [ 106.634777] <TASK> ... [ 106.634871] kasan_report (mm/kasan/report.c:221 mm/kasan/report.c:636) [ 106.634901] mei_cl_set_disconnected (drivers/misc/mei/client.c:783) mei [ 106.634921] mei_cl_all_disconnect (drivers/misc/mei/client.c:2165 (discriminator 4)) mei [ 106.634941] mei_reset (drivers/misc/mei/init.c:163) mei ... [ 106.635042] mei_stop (drivers/misc/mei/init.c:348) mei [ 106.635062] mei_vsc_remove (drivers/misc/mei/mei_dev.h:784 drivers/misc/mei/platform-vsc.c:393) mei_vsc [ 106.635066] platform_remove (drivers/base/platform.c:1424) Add the missing mei_cldev_disable() calls so that the mei_cl gets removed from mei_device->file_list before it is freed to fix this. Fixes: 78876f71b3e9 ("media: pci: intel: ivsc: Add ACE submodule") Fixes: 29006e196a56 ("media: pci: intel: ivsc: Add CSI submodule") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hansg@kernel.org> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: mt9m114: Fix deadlock in get_frame_interval/set_frame_interval	Mathis Foerst	1	-8/+0
	commit 298d1471cf83d5a2a05970e41822a2403f451086 upstream. Getting / Setting the frame interval using the V4L2 subdev pad ops get_frame_interval/set_frame_interval causes a deadlock, as the subdev state is locked in the [1] but also in the driver itself. In [2] it's described that the caller is responsible to acquire and release the lock in this case. Therefore, acquiring the lock in the driver is wrong. Remove the lock acquisitions/releases from mt9m114_ifp_get_frame_interval() and mt9m114_ifp_set_frame_interval(). [1] drivers/media/v4l2-core/v4l2-subdev.c - line 1129 [2] Documentation/driver-api/media/v4l2-subdev.rst Fixes: 24d756e914fc ("media: i2c: Add driver for onsemi MT9M114 camera sensor") Cc: stable@vger.kernel.org Signed-off-by: Mathis Foerst <mathis.foerst@mt.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: ov2659: Fix memory leaks in ov2659_probe()	Zhang Shurong	1	-1/+2
	commit 76142b137b968d47b35cdd8d1dc924677d319c8b upstream. ov2659_probe() doesn't properly free control handler resources in failure paths, causing memory leaks. Add v4l2_ctrl_handler_free() to prevent these memory leaks and reorder the ctrl_handler assignment for better code flow. Fixes: c4c0283ab3cd ("[media] media: i2c: add support for omnivision's ov2659 sensor") Cc: stable@vger.kernel.org Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: pisp_be: Fix pm_runtime underrun in probe	Jacopo Mondi	2	-3/+3
	commit e9bb2eacc7222ff8210903eb3b7d56709cc53228 upstream. During the probe() routine, the PiSP BE driver needs to power up the interface in order to identify and initialize the hardware. The driver resumes the interface by calling the pispbe_runtime_resume() function directly, without going through the pm_runtime helpers, but later suspends it by calling pm_runtime_put_autosuspend(). This causes a PM usage count imbalance at probe time, notified by the runtime_pm framework with the below message in the system log: pispbe 1000880000.pisp_be: Runtime PM usage count underflow! Fix this by resuming the interface using the pm runtime helpers instead of calling the resume function directly and use the pm_runtime framework in the probe() error path. While at it, remove manual suspend of the interface in the remove() function. The driver cannot be unloaded if in use, so simply disable runtime pm. To simplify the implementation, make the driver depend on PM as the RPI5 platform where the ISP is integrated in uses the PM framework by default. Fixes: 12187bd5d4f8 ("media: raspberrypi: Add support for PiSP BE") Cc: stable@vger.kernel.org Tested-by: Naushir Patuck <naush@raspberrypi.com> Reviewed-by: Naushir Patuck <naush@raspberrypi.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: rainshadow-cec: fix TOCTOU race condition in rain_interrupt()	Gui-Dong Han	1	-1/+2
	commit 7af160aea26c7dc9e6734d19306128cce156ec40 upstream. In the interrupt handler rain_interrupt(), the buffer full check on rain->buf_len is performed before acquiring rain->buf_lock. This creates a Time-of-Check to Time-of-Use (TOCTOU) race condition, as rain->buf_len is concurrently accessed and modified in the work handler rain_irq_work_handler() under the same lock. Multiple interrupt invocations can race, with each reading buf_len before it becomes full and then proceeding. This can lead to both interrupts attempting to write to the buffer, incrementing buf_len beyond its capacity (DATA_SIZE) and causing a buffer overflow. Fix this bug by moving the spin_lock() to before the buffer full check. This ensures that the check and the subsequent buffer modification are performed atomically, preventing the race condition. An corresponding spin_unlock() is added to the overflow path to correctly release the lock. This possible bug was found by an experimental static analysis tool developed by our team. Fixes: 0f314f6c2e77 ("[media] rainshadow-cec: new RainShadow Tech HDMI CEC driver") Cc: stable@vger.kernel.org Signed-off-by: Gui-Dong Han <hanguidong02@gmail.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: usbtv: Lock resolution while streaming	Ludwig Disterhof	1	-0/+4
	commit 7e40e0bb778907b2441bff68d73c3eb6b6cd319f upstream. When an program is streaming (ffplay) and another program (qv4l2) changes the TV standard from NTSC to PAL, the kernel crashes due to trying to copy to unmapped memory. Changing from NTSC to PAL increases the resolution in the usbtv struct, but the video plane buffer isn't adjusted, so it overflows. Fixes: 0e0fe3958fdd13d ("[media] usbtv: Add support for PAL video source") Cc: stable@vger.kernel.org Signed-off-by: Ludwig Disterhof <ludwig@disterhof.eu> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> [hverkuil: call vb2_is_busy instead of vb2_is_streaming] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: v4l2-ctrls: Don't reset handler's error in v4l2_ctrl_handler_free()	Sakari Ailus	1	-1/+0
	commit 5a0400aca5fa7c6b8ba456c311a460e733571c88 upstream. It's a common pattern in drivers to free the control handler's resources and then return the handler's error code on drivers' error handling paths. Alas, the v4l2_ctrl_handler_free() function also zeroes the error field, effectively indicating successful return to the caller. There's no apparent need to touch the error field while releasing the control handler's resources and cleaning up stale pointers. Not touching the handler's error field is a more certain way to address this problem than changing all the users, in which case the pattern would be likely to re-emerge in new drivers. Do just that, don't touch the control handler's error field in v4l2_ctrl_handler_free(). Fixes: 0996517cf8ea ("V4L/DVB: v4l2: Add new control handling framework") Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Hans Verkuil <hverkuil@xs4all.nl> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: verisilicon: Fix AV1 decoder clock frequency	Nicolas Dufresne	1	-9/+0
	commit 01350185fe02ae3ea2c12d578e06af0d5186f33e upstream. The desired clock frequency was correctly set to 400MHz in the device tree but was lowered by the driver to 300MHz breaking 4K 60Hz content playback. Fix the issue by removing the driver call to clk_set_rate(), which reduce the amount of board specific code. Fixes: 003afda97c65 ("media: verisilicon: Enable AV1 decoder on rk3588") Cc: stable@vger.kernel.org Reviewed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: vivid: fix wrong pixel_array control size	Hans Verkuil	2	-3/+4
	commit 3e43442d4994c9e1e202c98129a87e330f7faaed upstream. The pixel_array control size was calculated incorrectly: the dimensions were swapped (dims[0] should be the height), and the values should be the width or height divided by PIXEL_ARRAY_DIV and rounded up. So don't use roundup, but use DIV_ROUND_UP instead. This bug is harmless in the sense that nothing will break, except that it consumes way too much memory for this control. Fixes: 6bc7643d1b9c ("media: vivid: add pixel_array test control") Cc: <stable@vger.kernel.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: ipu6: isys: Use correct pads for xlate_streams()	Sakari Ailus	1	-6/+6
	commit ff49672a28f3a856717f09d61380e524e243121f upstream. The pad argument to v4l2_subdev_state_xlate_streams() is incorrect, static pad number is used for the source pad even though the pad number is dependent on the stream. Fix it. Fixes: 3a5c59ad926b ("media: ipu6: Rework CSI-2 sub-device streaming control") Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: imx: fix a potential memory leak in imx_media_csc_scaler_device_init()	Haoxiang Li	1	-1/+1
	commit fc5f8aec77704373ee804b5dba0e0e5029c0f180 upstream. Add video_device_release() in label 'err_m2m' to release the memory allocated by video_device_alloc() and prevent potential memory leaks. Remove the reduntant code in label 'err_m2m'. Fixes: a8ef0488cc59 ("media: imx: add csc/scaler mem2mem device") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li <haoxiang_li2024@163.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: hi556: correct the test pattern configuration	Bingbu Cao	1	-12/+14
	commit 020f602b068c9ce18d5056d02c8302199377d98d upstream. Hynix hi556 support 8 test pattern modes: hi556_test_pattern_menu[] = { { "Disabled", "Solid Colour", "100% Colour Bars", "Fade To Grey Colour Bars", "PN9", "Gradient Horizontal", "Gradient Vertical", "Check Board", "Slant Pattern", } The test pattern is set by a 8-bit register according to the specification. +--------+-------------------------------+ \| BIT[0] \| Solid color \| +--------+-------------------------------+ \| BIT[1] \| Color bar \| +--------+-------------------------------+ \| BIT[2] \| Fade to grey color bar \| +--------+-------------------------------+ \| BIT[3] \| PN9 \| +--------+-------------------------------+ \| BIT[4] \| Gradient horizontal \| +--------+-------------------------------+ \| BIT[5] \| Gradient vertical \| +--------+-------------------------------+ \| BIT[6] \| Check board \| +--------+-------------------------------+ \| BIT[7] \| Slant pattern \| +--------+-------------------------------+ Based on function above, current test pattern programming is wrong. This patch fixes it by 'BIT(pattern - 1)'. If pattern is 0, driver will disable the test pattern generation and set the pattern to 0. Fixes: e62138403a84 ("media: hi556: Add support for Hi-556 sensor") Cc: stable@vger.kernel.org Signed-off-by: Bingbu Cao <bingbu.cao@intel.com> Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	media: gspca: Add bounds checking to firmware parser	Dan Carpenter	1	-2/+8
	commit aef89c0b2417da79cb2062a95476288f9f203ab0 upstream. This sd_init() function reads the firmware. The firmware data holds a series of records and the function reads each record and sends the data to the device. The request_ihex_firmware() function calls ihex_validate_fw() which ensures that the total length of all the records won't read out of bounds of the fw->data[]. However, a potential issue is if there is a single very large record (larger than PAGE_SIZE) and that would result in memory corruption. Generally we trust the firmware, but it's always better to double check. Fixes: 49b61ec9b5af ("[media] gspca: Add new vicam subdriver") Cc: stable@vger.kernel.org Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	soc/tegra: pmc: Ensure power-domains are in a known state	Jon Hunter	1	-22/+29
	commit b6bcbce3359619d05bf387d4f5cc3af63668dbaa upstream. After commit 13a4b7fb6260 ("pmdomain: core: Leave powered-on genpds on until late_initcall_sync") was applied, the Tegra210 Jetson TX1 board failed to boot. Looking into this issue, before this commit was applied, if any of the Tegra power-domains were in 'on' state when the kernel booted, they were being turned off by the genpd core before any driver had chance to request them. This was purely by luck and a consequence of the power-domains being turned off earlier during boot. After this commit was applied, any power-domains in the 'on' state are kept on for longer during boot and therefore, may never transitioned to the off state before they are requested/used. The hang on the Tegra210 Jetson TX1 is caused because devices in some power-domains are accessed without the power-domain being turned off and on, indicating that the power-domain is not in a completely on state. >From reviewing the Tegra PMC driver code, if a power-domain is in the 'on' state there is no guarantee that all the necessary clocks associated with the power-domain are on and even if they are they would not have been requested via the clock framework and so could be turned off later. Some power-domains also have a 'clamping' register that needs to be configured as well. In short, if a power-domain is already 'on' it is difficult to know if it has been configured correctly. Given that the power-domains happened to be switched off during boot previously, to ensure that they are in a good known state on boot, fix this by switching off any power-domains that are on initially when registering the power-domains with the genpd framework. Note that commit 05cfb988a4d0 ("soc/tegra: pmc: Initialise resets associated with a power partition") updated the tegra_powergate_of_get_resets() function to pass the 'off' to ensure that the resets for the power-domain are in the correct state on boot. However, now that we may power off a domain on boot, if it is on, it is better to move this logic into the tegra_powergate_add() function so that there is a single place where we are handling the initial state of the power-domain. Fixes: a38045121bf4 ("soc/tegra: pmc: Add generic PM domain support") Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250731121832.213671-1-jonathanh@nvidia.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	amdgpu/amdgpu_discovery: increase timeout limit for IFWI init	Xaver Hugl	1	-2/+2
	commit 928587381b54b1b6c62736486b1dc6cb16c568c2 upstream. With a timeout of only 1 second, my rx 5700XT fails to initialize, so this increases the timeout to 2s. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3697 Signed-off-by: Xaver Hugl <xaver.hugl@kde.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9ed3d7bdf2dcdf1a1196630fab89a124526e9cc2) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	phy: qcom: phy-qcom-m31: Update IPQ5332 M31 USB phy initialization sequence	Kathiravan Thirumoorthy	1	-4/+10
	commit 4a3556b81b99f0c8c0358f7cc6801a62b4538fe2 upstream. The current configuration used for the IPQ5332 M31 USB PHY fails the Near End High Speed Signal Quality compliance test. To resolve this, update the initialization sequence as specified in the Hardware Design Document. Fixes: 08e49af50701 ("phy: qcom: Introduce M31 USB PHY driver") Cc: stable@kernel.org Signed-off-by: Kathiravan Thirumoorthy <kathiravan.thirumoorthy@oss.qualcomm.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250630-ipq5332_hsphy_complaince-v2-1-63621439ebdb@oss.qualcomm.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	vhost/vsock: Avoid allocating arbitrarily-sized SKBs	Will Deacon	1	-2/+4
	commit 10a886aaed293c4db3417951f396827216299e3d upstream. vhost_vsock_alloc_skb() returns NULL for packets advertising a length larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE in the packet header. However, this is only checked once the SKB has been allocated and, if the length in the packet header is zero, the SKB may not be freed immediately. Hoist the size check before the SKB allocation so that an iovec larger than VIRTIO_VSOCK_MAX_PKT_BUF_SIZE + the header size is rejected outright. The subsequent check on the length field in the header can then simply check that the allocated SKB is indeed large enough to hold the packet. Cc: <stable@vger.kernel.org> Fixes: 71dc9ec9ac7d ("virtio/vsock: replace virtio_vsock_pkt with sk_buff") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Will Deacon <will@kernel.org> Message-Id: <20250717090116.11987-2-will@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: imx6: Delay link start until configfs 'start' written	Richard Zhu	1	-3/+0
	commit 2e6ea70690ddd1ffa422423fd0d4523e4dfe4b62 upstream. According to Documentation/PCI/endpoint/pci-endpoint-cfs.rst, the Endpoint controller (EPC) should only start the link when userspace writes '1' to the '/sys/kernel/config/pci_ep/controllers/<EPC>/start' attribute, which ultimately results in calling imx_pcie_start_link() via pci_epc_start_store(). To align with the documented behavior, do not start the link automatically when adding the EP controller. Fixes: 75c2f26da03f ("PCI: imx6: Add i.MX PCIe EP mode support") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: reworded commit subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250709033722.2924372-3-hongxing.zhu@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: imx6: Remove apps_reset toggling from imx_pcie_{assert/deassert}_core_reset	Richard Zhu	1	-2/+3
	commit d31eb217425591e100b475fad6360cd3da2073c6 upstream. apps_reset corresponds to LTSSM_EN in i.MX7, i.MX8MQ, i.MX8MM and i.MX8MP platforms. Since assertion/de-assertion of apps_reset is done in imx_pcie_ltssm_enable() and imx_pcie_ltssm_disable(), remove it from imx_pcie_assert_core_reset() and imx_pcie_deassert_core_reset(). This also fixes a failure in enumerating the PI7C9X2G608GP (hotplug) chip reliably on i.MX8MM, as reported by Tim. It should be noted that only i.MX7D, i.MX8MQ, i.MX8MM, and i.MX8MP platforms have the apps_reset logic, so this change doesn't have any effect on other platforms. Fixes: ef61c7d8d032 ("PCI: imx6: Deassert apps_reset in imx_pcie_deassert_core_reset()") Reported-by: Tim Harvey <tharvey@gateworks.com> Closes: https://lore.kernel.org/all/CAJ+vNU3ohR2YKTwC4xoYrc1z-neDoH2TTZcMHDy+poj9=jSy+w@mail.gmail.com/ Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: reworded commit subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Tim Harvey <tharvey@gateworks.com> # imx8mp-venice-gw74xx (i.MX8MP + hotplug capable switch) Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250709033722.2924372-2-hongxing.zhu@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: imx6: Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features	Richard Zhu	1	-0/+2
	commit 399444a87acdea5d21c218bc8e9b621fea1cd218 upstream. For IMX8MM_EP and IMX8MP_EP, add fixed 256-byte BAR 4 and reserved BAR 5 in imx8m_pcie_epc_features. Fixes: 75c2f26da03f ("PCI: imx6: Add i.MX PCIe EP mode support") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [bhelgaas: add details in subject] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250708091003.2582846-3-hongxing.zhu@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: endpoint: Fix configfs group removal on driver teardown	Damien Le Moal	2	-0/+2
	commit 910bdb8197f9322790c738bb32feaa11dba26909 upstream. An endpoint driver configfs attributes group is added to the epf_group list of struct pci_epf_driver by pci_epf_add_cfs() but an added group is not removed from this list when the attribute group is unregistered with pci_ep_cfs_remove_epf_group(). Add the missing list_del() call in pci_ep_cfs_remove_epf_group() to correctly remove the attribute group from the driver list. With this change, once the loop over all attribute groups in pci_epf_remove_cfs() completes, the driver epf_group list should be empty. Add a WARN_ON() to make sure of that. Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250624114544.342159-3-dlemoal@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: endpoint: Fix configfs group list head handling	Damien Le Moal	1	-1/+0
	commit d79123d79a8154b4318529b7b2ff7e15806f480b upstream. Doing a list_del() on the epf_group field of struct pci_epf_driver in pci_epf_remove_cfs() is not correct as this field is a list head, not a list entry. This list_del() call triggers a KASAN warning when an endpoint function driver which has a configfs attribute group is torn down: ================================================================== BUG: KASAN: slab-use-after-free in pci_epf_remove_cfs+0x17c/0x198 Write of size 8 at addr ffff00010f4a0d80 by task rmmod/319 CPU: 3 UID: 0 PID: 319 Comm: rmmod Not tainted 6.16.0-rc2 #1 NONE Hardware name: Radxa ROCK 5B (DT) Call trace: show_stack+0x2c/0x84 (C) dump_stack_lvl+0x70/0x98 print_report+0x17c/0x538 kasan_report+0xb8/0x190 __asan_report_store8_noabort+0x20/0x2c pci_epf_remove_cfs+0x17c/0x198 pci_epf_unregister_driver+0x18/0x30 nvmet_pci_epf_cleanup_module+0x24/0x30 [nvmet_pci_epf] __arm64_sys_delete_module+0x264/0x424 invoke_syscall+0x70/0x260 el0_svc_common.constprop.0+0xac/0x230 do_el0_svc+0x40/0x58 el0_svc+0x48/0xdc el0t_64_sync_handler+0x10c/0x138 el0t_64_sync+0x198/0x19c ... Remove this incorrect list_del() call from pci_epf_remove_cfs(). Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250624114544.342159-2-dlemoal@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>