Age | Commit message | Author | Files | Lines
2025-06-19 | Linux 6.15.3 [v6.15.3] | Greg Kroah-Hartman | 1 | -1/+1

Link: https://lore.kernel.org/r/20250617152451.485330293@linuxfoundation.org
Tested-by: Christian Heusel <christian@heusel.eu>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Luna Jernberg <droidbittin@gmail.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-By: Achill Gilgenast <fossdd@pwned.life>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
Tested-by: Peter Schneider <pschneider1968@googlemail.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Hardik Garg <hargar@linux.microsoft.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | thermal/drivers/mediatek/lvts: Remove unused lvts_debugfs_exit | Arnd Bergmann | 1 | -2/+0

commit 3159c96ac2cb3a104ce8ee5b8a4afe98e4b8fcad upstream.

When debugfs is disabled, the function has no reference any more:

drivers/thermal/mediatek/lvts_thermal.c:266:13: error: 'lvts_debugfs_exit' defined but not used [-Werror=unused-function]
  266 | static void lvts_debugfs_exit(struct lvts_domain *lvts_td) { }
      |             ^~~~~~~~~~~~~~~~~

Fixes: ef280c17a840 ("thermal/drivers/mediatek/lvts: Fix debugfs unregister on failure")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250505052502.1812867-1-arnd@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | gfs2: Don't clear sb->s_fs_info in gfs2_sys_fs_add | Andrew Price | 2 | -2/+3

commit 9126d2754c5e5d1818765811a10af0a14cf1fa0a upstream.

When gfs2_sys_fs_add() fails, it sets sb->s_fs_info to NULL on its error path (see commit 0d515210b696 ("GFS2: Add kobject release method")). The intention seems to be to prevent dereferencing sb->s_fs_info once the object pointed to has been deallocated, but that would be better achieved by setting the pointer to NULL in free_sbd().

As a consequence, when the call to gfs2_sys_fs_add() fails in gfs2_fill_super(), sdp = GFS2_SB(inode) will evaluate to NULL in iput() -> gfs2_drop_inode(), and accessing sdp->sd_flags will be a NULL pointer dereference.

Fix that by only setting sb->s_fs_info to NULL when actually freeing the object pointed to in free_sbd().

Fixes: ae9f3bd8259a ("gfs2: replace sd_aspace with sd_inode")
Reported-by: syzbot+b12826218502df019f9d@syzkaller.appspotmail.com
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
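
The ownership rule from the gfs2 fix can be sketched in user-space C. This is only an illustration under stated assumptions, not the gfs2 code: `toy_sb`, `toy_sbd` and every `toy_*` function are hypothetical stand-ins. The point it shows is that the back-pointer is cleared only by the function that frees the object, so a callback that runs on the error path still sees a valid pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct super_block and struct gfs2_sbd. */
struct toy_sbd { unsigned long flags; };
struct toy_sb  { struct toy_sbd *s_fs_info; };

/* Models a registration step that can fail. It never touches s_fs_info. */
static int toy_sys_fs_add(struct toy_sb *sb, int should_fail)
{
    (void)sb;
    return should_fail ? -1 : 0;
}

/* Models a late callback (such as a drop_inode hook) that dereferences s_fs_info. */
static unsigned long toy_drop_inode(const struct toy_sb *sb)
{
    assert(sb->s_fs_info != NULL);
    return sb->s_fs_info->flags;
}

/* The only place that clears the pointer is the place that frees the object. */
static void toy_free_sbd(struct toy_sb *sb)
{
    free(sb->s_fs_info);
    sb->s_fs_info = NULL;
}

/* Returns 0 on success, -1 on failure (after cleaning up). */
static int toy_fill_super(struct toy_sb *sb, int should_fail)
{
    sb->s_fs_info = calloc(1, sizeof(*sb->s_fs_info));
    if (!sb->s_fs_info)
        return -1;
    sb->s_fs_info->flags = 0x1;

    if (toy_sys_fs_add(sb, should_fail) != 0) {
        /* Callbacks on the error path still find a live object. */
        (void)toy_drop_inode(sb);
        toy_free_sbd(sb);
        return -1;
    }
    return 0;
}
```

Had `toy_sys_fs_add()` cleared `s_fs_info` itself, the `assert()` in `toy_drop_inode()` would fire on the failure path, which mirrors the NULL pointer dereference described above.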
2025-06-19 | overflow: Introduce __DEFINE_FLEX for having no initializer | Kees Cook | 1 | -6/+19

commit 5c78e793f78732b60276401f75cc1a101f9ad121 upstream.

While not yet in the tree, there is a proposed patch[1] that was depending on the prior behavior of _DEFINE_FLEX, which did not have an explicit initializer. Provide this via __DEFINE_FLEX now, which can also have attributes applied (e.g. __uninitialized).

Examples of the resulting initializer behaviors can be seen here: https://godbolt.org/z/P7Go8Tr33

Link: https://lore.kernel.org/netdev/20250520205920.2134829-9-anthony.l.nguyen@intel.com [1]
Fixes: 47e36ed78406 ("overflow: Fix direct struct member initialization in _DEFINE_FLEX()")
Signed-off-by: Kees Cook <kees@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | net: usb: aqc111: debug info before sanitation | Oliver Neukum | 1 | -4/+4

commit d3faab9b5a6a0477d69c38bd11c43aa5e936f929 upstream.

If we sanitize error returns, the debug statements need to come before that so that we don't lose information.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Fixes: 405b0d610745 ("net: usb: aqc111: fix error handling of usbnet read calls")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
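
The ordering issue can be illustrated with a small user-space sketch. It is not the aqc111 driver code: `toy_read_checked()`, `toy_debug()` and `TOY_ENODATA` are hypothetical names, and the assumption is simply that a short read gets rewritten to a generic error code. Logging first keeps the raw return value visible.

```c
#include <assert.h>
#include <stdio.h>

#define TOY_ENODATA 61 /* hypothetical stand-in for ENODATA */

/* Remembers what the logger saw, so the ordering is observable. */
static int last_logged_ret;

static void toy_debug(const char *what, int ret)
{
    last_logged_ret = ret;
    fprintf(stderr, "%s failed: %d\n", what, ret);
}

/*
 * Checks the result of a read that should have returned `want` bytes.
 * A short read (0 <= ret < want) is rewritten to -TOY_ENODATA, but only
 * after the raw value has been logged.
 */
static int toy_read_checked(int raw_ret, int want)
{
    int ret = raw_ret;

    if (ret < want) {
        toy_debug("read", ret);    /* log first: raw value still intact */
        if (ret >= 0)
            ret = -TOY_ENODATA;    /* then sanitize */
    }
    return ret;
}
```

If the two statements inside the `if` were swapped, a 2-byte short read would be logged as `-61`, and the information about how many bytes actually arrived would be lost.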
2025-06-19 | usb: misc: onboard_usb_dev: fix build warning for CONFIG_USB_ONBOARD_DEV_USB5744=n | Arnd Bergmann | 1 | -2/+5

commit 662a9ece32add94469138ae66999ee16cb37a531 upstream.

When the USB5744 option is disabled, the onboard_usb driver warns about unused functions:

drivers/usb/misc/onboard_usb_dev.c:358:12: error: 'onboard_dev_5744_i2c_write_byte' defined but not used [-Werror=unused-function]
  358 | static int onboard_dev_5744_i2c_write_byte(struct i2c_client *client, u16 addr, u8 data)
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/usb/misc/onboard_usb_dev.c:313:12: error: 'onboard_dev_5744_i2c_read_byte' defined but not used [-Werror=unused-function]
  313 | static int onboard_dev_5744_i2c_read_byte(struct i2c_client *client, u16 addr, u8 *data)
      |            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Extend the #ifdef block a little further to cover all of these functions. Ideally we'd use if(IS_ENABLED()) instead, but that doesn't currently work because the i2c_transfer() and i2c_smbus_write_word_data() function declarations are hidden when CONFIG_I2C is disabled.

Fixes: 1143d41922c0 ("usb: misc: onboard_usb_dev: Fix usb5744 initialization sequence")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20250523120947.2170302-1-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | regulator: dt-bindings: mt6357: Drop fixed compatible requirement | Nícolas F. R. A. Prado | 1 | -11/+1

commit 9cfdd7752ba5f8cc9b8191e8c9aeeec246241fa4 upstream.

Some of the regulators on the MT6357 PMIC currently reference the fixed-regulator dt-binding, which enforces the presence of a regulator-fixed compatible. However since all regulators on the MT6357 PMIC are handled by a single mt6357-regulator driver, probed through MFD, the compatibles don't serve any purpose. In fact they cause failures in the DT kselftest since they aren't probed by the fixed regulator driver as would be expected. Furthermore this is the only dt-binding in this family like this: mt6359-regulator and mt6358-regulator don't require those compatibles.

Commit d77e89b7b03f ("arm64: dts: mediatek: mt6357: Drop regulator-fixed compatibles") removed the compatibles from Devicetree, but missed updating the binding, which still requires them, introducing dt-binding errors. Remove the compatible requirement by referencing the plain regulator dt-binding instead to fix the dt-binding errors.

Fixes: d77e89b7b03f ("arm64: dts: mediatek: mt6357: Drop regulator-fixed compatibles")
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Link: https://patch.msgid.link/20250514-mt6357-regulator-fixed-compatibles-removal-bindings-v1-1-2421e9cc6cc7@collabora.com
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases | Al Viro | 1 | -21/+25

commit 290da20e333955637f00647d9fff7c6e3c0b61e0 upstream.

... and fix the breakage in anon-to-anon case. There are two cases acceptable for do_move_mount() and mixing checks for those is making things hard to follow.

One case is move of a subtree in caller's namespace.
	* source and destination must be in caller's namespace
	* source must be detachable from parent

Another is moving the entire anon namespace elsewhere
	* source must be the root of anon namespace
	* target must either be in caller's namespace or in a suitable anon namespace (see may_use_mount() for details).
	* target must not be in the same namespace as source.

It's really easier to follow if tests are *not* mixed together...

Reviewed-by: Christian Brauner <brauner@kernel.org>
Fixes: 3b5260d12b1f ("Don't propagate mounts into detached trees")
Reported-by: Allison Karlitskaya <lis@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
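
The two acceptable cases listed in the do_move_mount() message can be written as two separate predicates. The sketch below is a heavily simplified user-space model, not the fs/namespace.c logic: the `toy_*` types and functions are hypothetical, and the "suitable anon namespace" test that the message delegates to may_use_mount() is reduced to a caller-supplied flag (`dst_anon_usable`), which is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: each mount belongs to exactly one namespace. */
struct toy_ns { bool anon; };

struct toy_mnt {
    const struct toy_ns *ns;
    bool is_ns_root;  /* root of its namespace */
    bool detachable;  /* can be detached from its parent */
};

/* Case 1: move of a subtree inside the caller's namespace. */
static bool toy_subtree_move_ok(const struct toy_mnt *src,
                                const struct toy_mnt *dst,
                                const struct toy_ns *caller)
{
    return src->ns == caller && dst->ns == caller && src->detachable;
}

/* Case 2: move of an entire anonymous namespace somewhere else. */
static bool toy_anon_move_ok(const struct toy_mnt *src,
                             const struct toy_mnt *dst,
                             const struct toy_ns *caller,
                             bool dst_anon_usable)
{
    if (!src->ns->anon || !src->is_ns_root)
        return false;
    if (dst->ns == src->ns)
        return false;
    return dst->ns == caller || (dst->ns->anon && dst_anon_usable);
}

/* Dispatch on which case applies, then run only that case's checks. */
static bool toy_move_allowed(const struct toy_mnt *src,
                             const struct toy_mnt *dst,
                             const struct toy_ns *caller,
                             bool dst_anon_usable)
{
    if (src->ns == caller)
        return toy_subtree_move_ok(src, dst, caller);
    return toy_anon_move_ok(src, dst, caller, dst_anon_usable);
}
```

Keeping the predicates apart makes each rule readable on its own, which is the readability argument the commit message makes.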
2025-06-19 | calipso: unlock rcu before returning -EAFNOSUPPORT | Eric Dumazet | 1 | -2/+4

commit 3cae906e1a6184cdc9e4d260e4dbdf9a118d94ad upstream.

syzbot reported that a recent patch forgot to unlock rcu in the error path. Adopt the convention that netlbl_conn_setattr() is already using.

Fixes: 6e9f2df1c550 ("calipso: Don't call calipso functions for AF_INET sk.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://patch.msgid.link/20250604133826.1667664-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
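
A common way to avoid this class of bug is a single exit label that always drops the lock. The sketch below models it in user space with a nesting counter; it is not the netlabel code, and `toy_rcu_read_lock()`, `toy_setattr()` and `TOY_EAFNOSUPPORT` are hypothetical names. Whether the upstream fix uses exactly this shape is an assumption.

```c
#include <assert.h>

#define TOY_EAFNOSUPPORT 97 /* hypothetical stand-in for EAFNOSUPPORT */

/* Models the read-side nesting depth; 0 means "not inside a read section". */
static int toy_rcu_depth;

static void toy_rcu_read_lock(void)   { toy_rcu_depth++; }
static void toy_rcu_read_unlock(void) { toy_rcu_depth--; }

/* Every path, including the error path, leaves through out_unlock. */
static int toy_setattr(int family_supported)
{
    int ret;

    toy_rcu_read_lock();
    if (!family_supported) {
        ret = -TOY_EAFNOSUPPORT;
        goto out_unlock;    /* never return with the lock held */
    }
    ret = 0;

out_unlock:
    toy_rcu_read_unlock();
    return ret;
}
```

A bare `return -TOY_EAFNOSUPPORT;` inside the `if` would leave `toy_rcu_depth` at 1, which is the imbalance the commit message describes.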
2025-06-19 | x86/fred/signal: Prevent immediate repeat of single step trap on return from SIGTRAP handler | Xin Li (Intel) | 3 | -0/+30

commit e34dbbc85d64af59176fe59fad7b4122f4330fe2 upstream.

Clear the software event flag in the augmented SS to prevent immediate repeat of single step trap on return from SIGTRAP handler if the trap flag (TF) is set without an external debugger attached.

Following is a typical single-stepping flow for a user process:

1) The user process is prepared for single-stepping by setting RFLAGS.TF = 1.
2) When any instruction in user space completes, a #DB is triggered.
3) The kernel handles the #DB and returns to user space, invoking the SIGTRAP handler with RFLAGS.TF = 0.
4) After the SIGTRAP handler finishes, the user process performs a sigreturn syscall, restoring the original state, including RFLAGS.TF = 1.
5) Goto step 2.

According to the FRED specification:

A) Bit 17 in the augmented SS is designated as the software event flag, which is set to 1 for FRED event delivery of SYSCALL, SYSENTER, or INT n.
B) If bit 17 of the augmented SS is 1 and ERETU would result in RFLAGS.TF = 1, a single-step trap will be pending upon completion of ERETU.

In step 4) above, the software event flag is set upon the sigreturn syscall, and its corresponding ERETU would restore RFLAGS.TF = 1. This combination causes a pending single-step trap upon completion of ERETU. Therefore, another #DB is triggered before any user space instruction is executed, which leads to an infinite loop in which the SIGTRAP handler keeps being invoked on the same user space IP.

Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code")
Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250609084054.2083189-2-xin%40zytor.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
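
Rules A) and B) of the FRED message reduce to a two-bit test, which the following user-space sketch models. It is not the kernel implementation: the `TOY_*` macros and `toy_*` functions are hypothetical, and only the bit positions (bit 17 of the augmented SS as stated above, bit 8 of RFLAGS for TF) are taken as given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RFLAGS_TF  (UINT64_C(1) << 8)   /* trap flag: RFLAGS bit 8 */
#define TOY_SS_SWEVENT (UINT64_C(1) << 17)  /* software event flag: augmented SS bit 17 */

/* Rule B: a single-step trap is pending after ERETU if both bits are set. */
static bool toy_trap_pending_after_eretu(uint64_t aug_ss, uint64_t rflags)
{
    return (aug_ss & TOY_SS_SWEVENT) && (rflags & TOY_RFLAGS_TF);
}

/* The described fix: clear the software event flag, leave other bits alone. */
static uint64_t toy_clear_swevent(uint64_t aug_ss)
{
    return aug_ss & ~TOY_SS_SWEVENT;
}
```

With the flag cleared, restoring RFLAGS.TF = 1 on sigreturn no longer leaves a trap pending before the first user space instruction, so the flow returns to step 2 instead of looping on the same IP.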
2025-06-19 | x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap() | Roman Kisel | 5 | -43/+64

commit 86c48271e0d60c82665e9fd61277002391efcef7 upstream.

To start an application processor in SNP-isolated guest, a hypercall is used that takes a virtual processor index. The hv_snp_boot_ap() function uses that START_VP hypercall but passes as VP index to it what it receives as a wakeup_secondary_cpu_64 callback: the APIC ID. As those two aren't generally interchangeable, that may lead to hung APs if the VP index and the APIC ID don't match up.

Update the parameter names to avoid confusion as to what the parameter is. Use the APIC ID to the VP index conversion to provide the correct input to the hypercall.

Cc: stable@vger.kernel.org
Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest")
Signed-off-by: Roman Kisel <romank@linux.microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Link: https://lore.kernel.org/r/20250507182227.7421-2-romank@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20250507182227.7421-2-romank@linux.microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
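
Why the two identifiers are not interchangeable is easiest to see with a lookup table in which they differ. The sketch is a user-space illustration only: `toy_cpu` and `toy_apic_id_to_vp_index()` are hypothetical, and the real conversion used by the Hyper-V code is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU record holding both identifiers. */
struct toy_cpu {
    uint32_t apic_id;
    uint32_t vp_index;
};

/* Returns the VP index that belongs to apic_id, or -1 if it is unknown. */
static int64_t toy_apic_id_to_vp_index(const struct toy_cpu *cpus, size_t n,
                                       uint32_t apic_id)
{
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].apic_id == apic_id)
            return cpus[i].vp_index;
    }
    return -1;
}
```

With a sparse APIC ID layout such as 0, 2, 4 mapped to VP indexes 0, 1, 2, passing the APIC ID 4 straight to a start-VP call would address a processor that does not exist, whereas the converted value 2 addresses the intended one.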
2025-06-19 | x86/iopl: Cure TIF_IO_BITMAP inconsistencies | Thomas Gleixner | 2 | -4/+15

commit 8b68e978718f14fdcb080c2a7791c52a0d09bc6d upstream.

io_bitmap_exit() is invoked from exit_thread() when a task exits or when a fork fails. In the latter case the exit_thread() cleans up resources which were allocated during fork().

io_bitmap_exit() invokes task_update_io_bitmap(), which in turn ends up in tss_update_io_bitmap(). tss_update_io_bitmap() operates on the current task. If current has TIF_IO_BITMAP set, but no bitmap installed, tss_update_io_bitmap() crashes with a NULL pointer dereference.

There are two issues, which lead to that problem:

1) io_bitmap_exit() should not invoke task_update_io_bitmap() when the task, which is cleaned up, is not the current task. That's a clear indicator for a cleanup after a failed fork().

2) A task should not have TIF_IO_BITMAP set and neither a bitmap installed nor IOPL emulation level 3 activated. This happens when a kernel thread is created in the context of a user space thread, which has TIF_IO_BITMAP set as the thread flags are copied and the IO bitmap pointer is cleared. Other than in the failed fork() case this has no impact because kernel threads including IO workers never return to user space and therefore never invoke tss_update_io_bitmap().

Cure this by adding the missing cleanups and checks:

1) Prevent io_bitmap_exit() from invoking task_update_io_bitmap() if the task to be cleaned up is not the current task.

2) Clear TIF_IO_BITMAP in copy_thread() unconditionally. For user space forks it is set later, when the IO bitmap is inherited in io_bitmap_share().

For paranoia's sake, add a warning into tss_update_io_bitmap() to catch the case when that code is invoked with inconsistent state.

Fixes: ea5f1cd7ab49 ("x86/ioperm: Remove bitmap if all permissions dropped")
Reported-by: syzbot+e2b1803445d236442e54@syzkaller.appspotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/87wmdceom2.ffs@tglx
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
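
The invariant behind the iopl fix (flag set implies bitmap installed) and the two cures can be modelled in a few lines of user-space C. This is a sketch under simplifying assumptions, not the x86 code: the `toy_*` names are hypothetical, and IOPL emulation level 3 is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the per-task state discussed above. */
struct toy_task {
    bool tif_io_bitmap;    /* models TIF_IO_BITMAP */
    const int *io_bitmap;  /* models the installed bitmap, NULL if none */
};

/* The invariant: the flag must not be set without a bitmap. */
static bool toy_state_consistent(const struct toy_task *t)
{
    return !t->tif_io_bitmap || t->io_bitmap != NULL;
}

/*
 * Fork, step 1: the thread flags are copied and the bitmap pointer is
 * cleared. Cure 2) additionally clears the flag unconditionally.
 */
static void toy_copy_thread(struct toy_task *child, const struct toy_task *parent)
{
    *child = *parent;
    child->io_bitmap = NULL;
    child->tif_io_bitmap = false;
}

/* Fork, step 2 (user space forks only): inherit the bitmap, set the flag. */
static void toy_io_bitmap_share(struct toy_task *child, const struct toy_task *parent)
{
    if (parent->io_bitmap) {
        child->io_bitmap = parent->io_bitmap;
        child->tif_io_bitmap = true;
    }
}

/* Cure 1): per-CPU state is only updated when cleaning up the current task. */
static bool toy_should_update_tss(const struct toy_task *tsk,
                                  const struct toy_task *current_task)
{
    return tsk == current_task;
}
```

Without the unconditional clear in `toy_copy_thread()`, a kernel thread created from a parent with the flag set would end up with the flag but no bitmap, which is exactly the inconsistent state `toy_state_consistent()` rejects.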
2025-06-19 | xen/arm: call uaccess_ttbr0_enable for dm_op hypercall | Stefano Stabellini | 1 | -1/+20

commit 7f9bbc1140ff8796230bc2634055763e271fd692 upstream.

dm_op hypercalls might come from userspace and pass memory addresses as parameters. The memory addresses typically correspond to buffers allocated in userspace to hold extra hypercall parameters.

On ARM, when CONFIG_ARM64_SW_TTBR0_PAN is enabled, they might not be accessible by Xen, as a result ioreq hypercalls might fail. See the existing comment in arch/arm64/xen/hypercall.S regarding privcmd_call for reference.

For privcmd_call, Linux calls uaccess_ttbr0_enable before issuing the hypercall thanks to commit 9cf09d68b89a. We need to do the same for dm_op. This resolves the problem.

Cc: stable@kernel.org
Fixes: 9cf09d68b89a ("arm64: xen: Enable user access before a privcmd hvc call")
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Message-ID: <alpine.DEB.2.22.394.2505121446370.8380@ubuntu-linux-20-04-desktop>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | xfs: don't assume perags are initialised when trimming AGs | Dave Chinner | 1 | -1/+16

commit 23be716b1c4f3f3a6c00ee38d51a57ef7db9ef7d upstream.

When running fstrim immediately after mounting a V4 filesystem, the fstrim fails to trim all the free space in the filesystem. It only trims the first extent in the by-size free space tree in each AG and then returns. If a second fstrim is then run, it runs correctly and the entire free space in the filesystem is iterated and discarded correctly.

The problem lies in the setup of the trim cursor - it assumes that pag->pagf_longest is valid without either reading the AGF first or checking if xfs_perag_initialised_agf(pag) is true or not.

As a result, when a filesystem is mounted without reading the AGF (e.g. a clean mount on a v4 filesystem) and the first operation is a fstrim call, pag->pagf_longest is zero and so the free extent search starts at the wrong end of the by-size btree and exits after discarding the first record in the tree.

Fix this by deferring the initialisation of tcur->count to after we have locked the AGF and guaranteed that the perag is properly initialised. We trigger this on tcur->count == 0 after locking the AGF, as this will only occur on the first call to xfs_trim_gather_extents() for each AG. If we need to iterate, tcur->count will be set to the length of the record we need to restart at, so we can use this to ensure we only sample a valid pag->pagf_longest value for the iteration.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Bill O'Donnell <bodonnel@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Fixes: 89cfa899608f ("xfs: reduce AGF hold times during fstrim operations")
Cc: <stable@vger.kernel.org> # v6.6
Signed-off-by: Carlos Maiolino <cem@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
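
The deferred initialisation in the xfs fix can be sketched as follows. This is a user-space model with hypothetical names (`toy_perag`, `toy_trim_cur`, `toy_read_agf()`, `toy_trim_start_len()`), not the XFS code: reading the AGF is reduced to copying an "on disk" value into the in-memory per-AG structure, and a cursor count of 0 marks the first call for an AG, as the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-memory per-AG state. */
struct toy_perag {
    bool agf_initialised;      /* has the AGF been read yet? */
    unsigned pagf_longest;     /* only valid once agf_initialised is true */
    unsigned on_disk_longest;  /* what reading the AGF would yield */
};

/* Hypothetical trim cursor; count == 0 means "first call for this AG". */
struct toy_trim_cur {
    unsigned count;
};

/* Models locking and reading the AGF, which initialises the perag. */
static void toy_read_agf(struct toy_perag *pag)
{
    if (!pag->agf_initialised) {
        pag->pagf_longest = pag->on_disk_longest;
        pag->agf_initialised = true;
    }
}

/* Returns the extent length at which the by-size search should start. */
static unsigned toy_trim_start_len(struct toy_perag *pag, struct toy_trim_cur *tcur)
{
    toy_read_agf(pag);

    /* Sample pagf_longest only now that it is guaranteed to be valid. */
    if (tcur->count == 0)
        tcur->count = pag->pagf_longest;
    return tcur->count;
}
```

Sampling `pagf_longest` before `toy_read_agf()` would yield 0 on a freshly mounted filesystem, which corresponds to the search starting at the wrong end of the by-size btree.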
2025-06-19 | ring-buffer: Move cpus_read_lock() outside of buffer->mutex | Steven Rostedt | 1 | -5/+6
commit c98cc9797b7009308fff73d41bc1d08642dab77a upstream. Running a modified trace-cmd record --nosplice where it does a mmap of the ring buffer when '--nosplice' is set, caused the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 6.15.0-rc7-test-00002-gfb7d03d8a82f #551 Not tainted ------------------------------------------------------ trace-cmd/1113 is trying to acquire lock: ffff888100062888 (&buffer->mutex){+.+.}-{4:4}, at: ring_buffer_map+0x11c/0xe70 but task is already holding lock: ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #5 (&cpu_buffer->mapping_lock){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 ring_buffer_map+0xcf/0xe70 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 do_mmap+0x9d7/0x1010 vm_mmap_pgoff+0x20b/0x390 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #4 (&mm->mmap_lock){++++}-{4:4}: __might_fault+0xa5/0x110 _copy_to_user+0x22/0x80 _perf_ioctl+0x61b/0x1b70 perf_ioctl+0x62/0x90 __x64_sys_ioctl+0x134/0x190 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #3 (&cpuctx_mutex){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 perf_event_init_cpu+0x325/0x7c0 perf_event_init+0x52a/0x5b0 start_kernel+0x263/0x3e0 x86_64_start_reservations+0x24/0x30 x86_64_start_kernel+0x95/0xa0 common_startup_64+0x13e/0x141 -> #2 (pmus_lock){+.+.}-{4:4}: __mutex_lock+0x192/0x18c0 perf_event_init_cpu+0xb7/0x7c0 cpuhp_invoke_callback+0x2c0/0x1030 __cpuhp_invoke_callback_range+0xbf/0x1f0 _cpu_up+0x2e7/0x690 cpu_up+0x117/0x170 cpuhp_bringup_mask+0xd5/0x120 bringup_nonboot_cpus+0x13d/0x170 smp_init+0x2b/0xf0 kernel_init_freeable+0x441/0x6d0 kernel_init+0x1e/0x160 ret_from_fork+0x34/0x70 ret_from_fork_asm+0x1a/0x30 -> #1 (cpu_hotplug_lock){++++}-{0:0}: cpus_read_lock+0x2a/0xd0 
ring_buffer_resize+0x610/0x14e0 __tracing_resize_ring_buffer.part.0+0x42/0x120 tracing_set_tracer+0x7bd/0xa80 tracing_set_trace_write+0x132/0x1e0 vfs_write+0x21c/0xe80 ksys_write+0xf9/0x1c0 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #0 (&buffer->mutex){+.+.}-{4:4}: __lock_acquire+0x1405/0x2210 lock_acquire+0x174/0x310 __mutex_lock+0x192/0x18c0 ring_buffer_map+0x11c/0xe70 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 do_mmap+0x9d7/0x1010 vm_mmap_pgoff+0x20b/0x390 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: &buffer->mutex --> &mm->mmap_lock --> &cpu_buffer->mapping_lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&cpu_buffer->mapping_lock); lock(&mm->mmap_lock); lock(&cpu_buffer->mapping_lock); lock(&buffer->mutex); *** DEADLOCK *** 2 locks held by trace-cmd/1113: #0: ffff888106b847e0 (&mm->mmap_lock){++++}-{4:4}, at: vm_mmap_pgoff+0x192/0x390 #1: ffff888100a5f9f8 (&cpu_buffer->mapping_lock){+.+.}-{4:4}, at: ring_buffer_map+0xcf/0xe70 stack backtrace: CPU: 5 UID: 0 PID: 1113 Comm: trace-cmd Not tainted 6.15.0-rc7-test-00002-gfb7d03d8a82f #551 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 print_circular_bug.cold+0x178/0x1be check_noncircular+0x146/0x160 __lock_acquire+0x1405/0x2210 lock_acquire+0x174/0x310 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? __mutex_lock+0x169/0x18c0 __mutex_lock+0x192/0x18c0 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? function_trace_call+0x296/0x370 ? __pfx___mutex_lock+0x10/0x10 ? __pfx_function_trace_call+0x10/0x10 ? __pfx___mutex_lock+0x10/0x10 ? _raw_spin_unlock+0x2d/0x50 ? ring_buffer_map+0x11c/0xe70 ? ring_buffer_map+0x11c/0xe70 ? __mutex_lock+0x5/0x18c0 ring_buffer_map+0x11c/0xe70 ? do_raw_spin_lock+0x12d/0x270 ? find_held_lock+0x2b/0x80 ? 
_raw_spin_unlock+0x2d/0x50 ? rcu_is_watching+0x15/0xb0 ? _raw_spin_unlock+0x2d/0x50 ? trace_preempt_on+0xd0/0x110 tracing_buffers_mmap+0x1c4/0x3b0 __mmap_region+0xd8d/0x1f70 ? ring_buffer_lock_reserve+0x99/0xff0 ? __pfx___mmap_region+0x10/0x10 ? ring_buffer_lock_reserve+0x99/0xff0 ? __pfx_ring_buffer_lock_reserve+0x10/0x10 ? __pfx_ring_buffer_lock_reserve+0x10/0x10 ? bpf_lsm_mmap_addr+0x4/0x10 ? security_mmap_addr+0x46/0xd0 ? lock_is_held_type+0xd9/0x130 do_mmap+0x9d7/0x1010 ? 0xffffffffc0370095 ? __pfx_do_mmap+0x10/0x10 vm_mmap_pgoff+0x20b/0x390 ? __pfx_vm_mmap_pgoff+0x10/0x10 ? 0xffffffffc0370095 ksys_mmap_pgoff+0x2e9/0x440 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fb0963a7de2 Code: 00 00 00 0f 1f 44 00 00 41 f7 c1 ff 0f 00 00 75 27 55 89 cd 53 48 89 fb 48 85 ff 74 3b 41 89 ea 48 89 df b8 09 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 5b 5d c3 0f 1f 00 48 8b 05 e1 9f 0d 00 64 RSP: 002b:00007ffdcc8fb878 EFLAGS: 00000246 ORIG_RAX: 0000000000000009 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb0963a7de2 RDX: 0000000000000001 RSI: 0000000000001000 RDI: 0000000000000000 RBP: 0000000000000001 R08: 0000000000000006 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffdcc8fbe68 R14: 00007fb096628000 R15: 00005633e01a5c90 </TASK> The issue is that cpus_read_lock() is taken within buffer->mutex. The memory mapped pages are taken with the mmap_lock held. The buffer->mutex is taken within the cpu_buffer->mapping_lock. There's quite a chain with all these locks, where the deadlock can be fixed by moving the cpus_read_lock() outside the taking of the buffer->mutex. 

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/20250527105820.0f45d045@gandalf.local.home
Fixes: 117c39200d9d7 ("ring-buffer: Introducing ring-buffer mapping functions")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set() | Dmitry Antipov | 1 | -3/+1

commit 40ee2afafc1d9fe3aa44a6fbe440d78a5c96a72e upstream.

Enlarge the critical section in ring_buffer_subbuf_order_set() to ensure that error handling takes place with per-buffer mutex held, thus preventing list corruption and other concurrency-related issues.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com>
Link: https://lore.kernel.org/20250606112242.1510605-1-dmantipov@yandex.ru
Reported-by: syzbot+05d673e83ec640f0ced9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=05d673e83ec640f0ced9
Fixes: f9b94daa542a8 ("ring-buffer: Set new size of the ring buffer sub page")
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | ring-buffer: Do not trigger WARN_ON() due to a commit_overrun | Steven Rostedt | 1 | -8/+18
commit 4fc78a7c9ca994e1da5d3940704d4e8f0ea8c5e4 upstream. When reading a memory mapped buffer the reader page is just swapped out with the last page written in the write buffer. If the reader page is the same as the commit buffer (the buffer that is currently being written to) it was assumed that it should never have missed events. If it does, it triggers a WARN_ON_ONCE(). But there just happens to be one scenario where this can legitimately happen. That is on a commit_overrun. A commit overrun is when an interrupt preempts an event being written to the buffer and then the interrupt adds so many new events that it fills and wraps the buffer back to the commit. Any new events would then be dropped and be reported as "missed_events". In this case, the next page to read is the commit buffer and after the swap of the reader page, the reader page will be the commit buffer, but this time there will be missed events and this triggers the following warning: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 1127 at kernel/trace/ring_buffer.c:7357 ring_buffer_map_get_reader+0x49a/0x780 Modules linked in: kvm_intel kvm irqbypass CPU: 2 UID: 0 PID: 1127 Comm: trace-cmd Not tainted 6.15.0-rc7-test-00004-g478bc2824b45-dirty #564 PREEMPT Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:ring_buffer_map_get_reader+0x49a/0x780 Code: 00 00 00 48 89 fe 48 c1 ee 03 80 3c 2e 00 0f 85 ec 01 00 00 4d 3b a6 a8 00 00 00 0f 85 8a fd ff ff 48 85 c0 0f 84 55 fe ff ff <0f> 0b e9 4e fe ff ff be 08 00 00 00 4c 89 54 24 58 48 89 54 24 50 RSP: 0018:ffff888121787dc0 EFLAGS: 00010002 RAX: 00000000000006a2 RBX: ffff888100062800 RCX: ffffffff8190cb49 RDX: ffff888126934c00 RSI: 1ffff11020200a15 RDI: ffff8881010050a8 RBP: dffffc0000000000 R08: 0000000000000000 R09: ffffed1024d26982 R10: ffff888126934c17 R11: ffff8881010050a8 R12: ffff888126934c00 R13: ffff8881010050b8 R14: ffff888101005000 R15: ffff888126930008 FS: 00007f95c8cd7540(0000) 
GS:ffff8882b576e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95c8de4dc0 CR3: 0000000128452002 CR4: 0000000000172ef0 Call Trace: <TASK> ? __pfx_ring_buffer_map_get_reader+0x10/0x10 tracing_buffers_ioctl+0x283/0x370 __x64_sys_ioctl+0x134/0x190 do_syscall_64+0x79/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f95c8de48db Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00 RSP: 002b:00007ffe037ba110 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffe037bb2b0 RCX: 00007f95c8de48db RDX: 0000000000000000 RSI: 0000000000005220 RDI: 0000000000000006 RBP: 00007ffe037ba180 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe037bb6f8 R14: 00007f95c9065000 R15: 00005575c7492c90 </TASK> irq event stamp: 5080 hardirqs last enabled at (5079): [<ffffffff83e0adb0>] _raw_spin_unlock_irqrestore+0x50/0x70 hardirqs last disabled at (5080): [<ffffffff83e0aa83>] _raw_spin_lock_irqsave+0x63/0x70 softirqs last enabled at (4182): [<ffffffff81516122>] handle_softirqs+0x552/0x710 softirqs last disabled at (4159): [<ffffffff815163f7>] __irq_exit_rcu+0x107/0x210 ---[ end trace 0000000000000000 ]--- The above was triggered by running on a kernel with both lockdep and KASAN as well as kmemleak enabled and executing the following command: # perf record -o perf-test.dat -a -- trace-cmd record --nosplice -e all -p function hackbench 50 With perf interjecting a lot of interrupts and trace-cmd enabling all events as well as function tracing, with lockdep, KASAN and kmemleak enabled, it could cause an interrupt preempting an event being written to add enough events to wrap the buffer. trace-cmd was modified to have --nosplice use mmap instead of reading the buffer. 

To differentiate this case from the normal case, where only one page has been written to and the swap of the reader page received that one page (which is the commit page), check if the tail page is on the reader page. The difference between the commit page and the tail page is that the tail page is where new writes go to, and the commit page holds the first write that hasn't been committed yet. In the case of an interrupt preempting the write of an event and filling the buffer, it would move the tail page but not the commit page.

Have the warning only trigger if the tail page is also on the reader page, and also print out the number of events dropped by a commit overrun as that can not yet be safely added to the page so that the reader can see there were events dropped.

Cc: stable@vger.kernel.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/20250528121555.2066527e@gandalf.local.home
Fixes: fe832be05a8ee ("ring-buffer: Have mmapped ring buffer keep track of missed events")
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
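
The distinction between an unexpected miss and a legitimate commit overrun comes down to comparing three page positions. The sketch below models that decision in user space; `toy_cpu_buffer` and `toy_missed_events_unexpected()` are hypothetical, pages are represented as plain indexes, and the function only models the condition, not the ring buffer itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU buffer state; pages are identified by index. */
struct toy_cpu_buffer {
    int reader_page;             /* page handed to the reader after the swap */
    int commit_page;             /* holds the first write not yet committed */
    int tail_page;               /* where new writes currently go */
    unsigned long missed_events; /* events reported as missed */
};

/*
 * Returns true only when missed events on the reader page cannot be
 * explained by a commit overrun, i.e. when a warning would be justified.
 */
static bool toy_missed_events_unexpected(const struct toy_cpu_buffer *b)
{
    if (b->missed_events == 0)
        return false;
    if (b->reader_page != b->commit_page)
        return false;
    /* A commit overrun moves the tail page away but not the commit page. */
    return b->tail_page == b->reader_page;
}
```

In the overrun scenario the interrupt has advanced `tail_page` past the reader page while `commit_page` stayed put, so the function returns false and no warning is raised.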
2025-06-19 | mm/filemap: use filemap_end_dropbehind() for read invalidation | Jens Axboe | 1 | -4/+3

commit 25b065a744ff0c1099bb357be1c40030b5a14c07 upstream.

Use the filemap_end_dropbehind() helper rather than calling folio_unmap_invalidate() directly, as we need to check if the folio has been redirtied or marked for writeback once the folio lock has been re-acquired.

Cc: stable@vger.kernel.org
Reported-by: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 8026e49bff9b ("mm/filemap: add read support for RWF_DONTCACHE")
Link: https://lore.kernel.org/linux-fsdevel/ba8a9805331ce258a622feaca266b163db681a10.camel@hammerspace.com/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/20250527133255.452431-3-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19 | mm/filemap: gate dropbehind invalidate on folio !dirty && !writeback | Jens Axboe | 1 | -2/+11

commit 095f627add86a6ddda2c2cfd563b0ee05d0172b2 upstream.

It's possible for the folio to either get marked for writeback or redirtied. Add a helper, filemap_end_dropbehind(), which guards the folio_unmap_invalidate() call behind a check for the folio being both non-dirty and not under writeback AFTER the folio lock has been acquired. Use this helper in folio_end_dropbehind_write().

Cc: stable@vger.kernel.org
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes: fb7d3bc41493 ("mm/filemap: drop streaming/uncached pages when writeback completes")
Link: https://lore.kernel.org/linux-fsdevel/20250525083209.GS2023217@ZenIV/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/20250527133255.452431-2-axboe@kernel.dk
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
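
The "check again once the lock is held" pattern behind the dropbehind helper can be modelled in user space like this. It is a sketch with hypothetical names (`toy_folio`, `toy_end_dropbehind()`, `toy_try_dropbehind()`), not the mm/filemap.c code, and the folio lock is reduced to a boolean rather than a real lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical folio state; the lock is modelled as a plain flag. */
struct toy_folio {
    bool locked;
    bool dirty;
    bool writeback;
    bool cached;  /* still present in the page cache */
};

/*
 * Models the guarded invalidate: the caller must hold the folio lock.
 * Returns true if the folio was dropped from the cache.
 */
static bool toy_end_dropbehind(struct toy_folio *folio)
{
    assert(folio->locked);
    if (folio->dirty || folio->writeback)
        return false;   /* redirtied or under writeback: keep it */
    folio->cached = false;
    return true;
}

/* Takes the lock if it is free, re-checks under the lock, then unlocks. */
static bool toy_try_dropbehind(struct toy_folio *folio)
{
    bool dropped;

    if (folio->locked)
        return false;   /* models a failed trylock */
    folio->locked = true;
    dropped = toy_end_dropbehind(folio);
    folio->locked = false;
    return dropped;
}
```

The state test sits inside the locked region on purpose: a check made before taking the lock could be stale by the time the invalidate runs.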
2025-06-19 | Don't propagate mounts into detached trees | Al Viro | 3 | -20/+4

commit 3b5260d12b1fe76b566fe182de8abc586b827ed0 upstream.

All versions up to 6.14 did not propagate mount events into detached trees. Shortly after 6.14 a merge of vfs-6.15-rc1.mount.namespace (130e696aa68b) has changed that.

Unfortunately, that has caused userland regressions (reported in https://lore.kernel.org/all/CAOYeF9WQhFDe+BGW=Dp5fK8oRy5AgZ6zokVyTj1Wp4EUiYgt4w@mail.gmail.com/)

Straight revert wouldn't be an option - in particular, the variant in 6.14 had a bug that got fixed in d1ddc6f1d9f0 ("fix IS_MNT_PROPAGATING uses") and we don't want to bring the bug back.

This is a modification of manual revert posted by Christian, with changes needed to avoid reintroducing the breakage in scenario described in d1ddc6f1d9f0.

Cc: stable@vger.kernel.org
Reported-by: Allison Karlitskaya <lis@redhat.com>
Tested-by: Allison Karlitskaya <lis@redhat.com>
Acked-by: Christian Brauner <brauner@kernel.org>
Co-developed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-199p: Add a migrate_folio methodMatthew Wilcox (Oracle)1-0/+1
commit 03ddd7725ed1b39cf9251e1a420559f25dac49b3 upstream. The migration code used to be able to migrate dirty 9p folios by writing them back using writepage. When the writepage method was removed, we neglected to add a migrate_folio method, which means that dirty 9p folios have been unmovable ever since. This reduced our success at defragmenting memory on machines which use 9p heavily. Fixes: 80105ed2fd27 (9p: Use netfslib read/write_iter) Cc: stable@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: v9fs@lists.linux.dev Signed-off-by: "Matthew Wilcox (Oracle)" <willy@infradead.org> Link: https://lore.kernel.org/r/20250402150005.2309458-2-willy@infradead.org Acked-by: Dominique Martinet <asmadeus@codewreck.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: typec: tcpm: move tcpm_queue_vdm_unlocked to asynchronous workRD Babiera1-20/+71
commit 324d45e53f1a36c88bc649dc39e0c8300a41be0a upstream. A state check was previously added to tcpm_queue_vdm_unlocked to prevent a deadlock where the DisplayPort Alt Mode driver would be executing work and attempting to grab the tcpm_lock while the TCPM was holding the lock and attempting to unregister the altmode, blocking on the altmode driver's cancel_work_sync call. Because the state check isn't protected, there is a small window where the Alt Mode driver could determine that the TCPM is in a ready state and attempt to grab the lock while the TCPM grabs the lock and changes the TCPM state to one that causes the deadlock. The callstack is provided below: [110121.667392][ C7] Call trace: [110121.667396][ C7] __switch_to+0x174/0x338 [110121.667406][ C7] __schedule+0x608/0x9f0 [110121.667414][ C7] schedule+0x7c/0xe8 [110121.667423][ C7] kernfs_drain+0xb0/0x114 [110121.667431][ C7] __kernfs_remove+0x16c/0x20c [110121.667436][ C7] kernfs_remove_by_name_ns+0x74/0xe8 [110121.667442][ C7] sysfs_remove_group+0x84/0xe8 [110121.667450][ C7] sysfs_remove_groups+0x34/0x58 [110121.667458][ C7] device_remove_groups+0x10/0x20 [110121.667464][ C7] device_release_driver_internal+0x164/0x2e4 [110121.667475][ C7] device_release_driver+0x18/0x28 [110121.667484][ C7] bus_remove_device+0xec/0x118 [110121.667491][ C7] device_del+0x1e8/0x4ac [110121.667498][ C7] device_unregister+0x18/0x38 [110121.667504][ C7] typec_unregister_altmode+0x30/0x44 [110121.667515][ C7] tcpm_reset_port+0xac/0x370 [110121.667523][ C7] tcpm_snk_detach+0x84/0xb8 [110121.667529][ C7] run_state_machine+0x4c0/0x1b68 [110121.667536][ C7] tcpm_state_machine_work+0x94/0xe4 [110121.667544][ C7] kthread_worker_fn+0x10c/0x244 [110121.667552][ C7] kthread+0x104/0x1d4 [110121.667557][ C7] ret_from_fork+0x10/0x20 [110121.667689][ C7] Workqueue: events dp_altmode_work [110121.667697][ C7] Call trace: [110121.667701][ C7] __switch_to+0x174/0x338 [110121.667710][ C7] __schedule+0x608/0x9f0 [110121.667717][ C7] 
schedule+0x7c/0xe8 [110121.667725][ C7] schedule_preempt_disabled+0x24/0x40 [110121.667733][ C7] __mutex_lock+0x408/0xdac [110121.667741][ C7] __mutex_lock_slowpath+0x14/0x24 [110121.667748][ C7] mutex_lock+0x40/0xec [110121.667757][ C7] tcpm_altmode_enter+0x78/0xb4 [110121.667764][ C7] typec_altmode_enter+0xdc/0x10c [110121.667769][ C7] dp_altmode_work+0x68/0x164 [110121.667775][ C7] process_one_work+0x1e4/0x43c [110121.667783][ C7] worker_thread+0x25c/0x430 [110121.667789][ C7] kthread+0x104/0x1d4 [110121.667794][ C7] ret_from_fork+0x10/0x20 Change tcpm_queue_vdm_unlocked to queue for tcpm_queue_vdm_work, which can perform the state check while holding the TCPM lock while the Alt Mode lock is no longer held. This requires a new struct to hold the vdm data, altmode_vdm_event. Fixes: cdc9946ea637 ("usb: typec: tcpm: enforce ready state when queueing alt mode vdm") Cc: stable <stable@kernel.org> Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Link: https://lore.kernel.org/r/20250506232853.1968304-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: typec: tcpm/tcpci_maxim: Fix bounds check in process_rx()Amit Sunil Dhamne1-1/+2
commit 0736299d090f5c6a1032678705c4bc0a9511a3db upstream. Register read of TCPC_RX_BYTE_CNT returns the total size consisting of: PD message (pending read) size + 1 Byte for Frame Type (SOP*) This is validated against the max PD message (`struct pd_message`) size without accounting for the extra byte for the frame type. Note that the struct pd_message does not contain a field for the frame_type. This results in false negatives when the "PD message (pending read)" is equal to the max PD message size. Fixes: 6f413b559f86 ("usb: typec: tcpci_maxim: Chip level TCPC driver") Signed-off-by: Amit Sunil Dhamne <amitsd@google.com> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Kyle Tso <kyletso@google.com> Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/stable/20250502-b4-new-fix-pd-rx-count-v1-1-e5711ed09b3d%40google.com Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250502-b4-new-fix-pd-rx-count-v1-1-e5711ed09b3d@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: Flush altsetting 0 endpoints before reinitializating them after reset.Mathias Nyman1-2/+14
commit 89bb3dc13ac29a563f4e4c555e422882f64742bd upstream. usb core avoids sending a Set-Interface altsetting 0 request after device reset, and instead relies on calling usb_disable_interface() and usb_enable_interface() to flush and reset host-side of those endpoints. xHCI hosts allocate and set up endpoint ring buffers and host_ep->hcpriv during usb_hcd_alloc_bandwidth() callback, which in this case is called before flushing the endpoint in usb_disable_interface(). Call usb_disable_interface() before usb_hcd_alloc_bandwidth() to ensure URBs are flushed before new ring buffers for the endpoints are allocated. Otherwise host driver will attempt to find and remove old stale URBs from a freshly allocated new ringbuffer. Cc: stable <stable@kernel.org> Fixes: 4fe0387afa89 ("USB: don't send Set-Interface after reset") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250514132520.225345-1-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: cdnsp: Fix issue with detecting USB 3.2 speedPawel Laszczak2-1/+6
commit 2852788cfbe9ca1ab68509d65804413871f741f9 upstream. Patch adds support for detecting SuperSpeedPlus Gen1 x2 and SuperSpeedPlus Gen2 x2 speed. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@kernel.org> Signed-off-by: Pawel Laszczak <pawell@cadence.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/PH7PR07MB95387AD98EDCA695FECE52BADD96A@PH7PR07MB9538.namprd07.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: cdnsp: Fix issue with detecting command completion eventPawel Laszczak1-1/+17
commit f4ecdc352646f7d23f348e5c544dbe3212c94fc8 upstream. In some cases, there is a small time gap in which CMD_RING_BUSY can be cleared by the controller but adding the command completion event to the event ring will be delayed. As a result, the driver will return an error code. This behavior has been detected on the usbtest driver (test 9) with a configuration including ep1in/ep1out bulk and ep2in/ep2out isoc endpoints. Probably this gap occurred because the controller was busy with adding some other events to the event ring. The CMD_RING_BUSY is cleared to '0' when the Command Descriptor has been executed and not when the command completion event has been added to the event ring. To fix this issue for this test a small delay is sufficient (less than 10us), but to make sure the problem doesn't happen again in the future the patch introduces 10 retries to check with a delay of about 20us before returning an error code. Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver") Cc: stable <stable@kernel.org> Signed-off-by: Pawel Laszczak <pawell@cadence.com> Acked-by: Peter Chen <peter.chen@kernel.org> Link: https://lore.kernel.org/r/PH7PR07MB9538AA45362ACCF1B94EE9B7DD96A@PH7PR07MB9538.namprd07.prod.outlook.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
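Illustration only, not part of the upstream commit: a minimal userspace sketch of the bounded-retry idea described above, assuming the retry count (10) and delay (about 20us) named in the message. All names (sketch_wait_for_event(), the callback types, the two example callbacks) are hypothetical and are not the cdnsp driver's functions; the delay is delegated to a caller-supplied hook so the sketch does not depend on any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_RETRIES  10   /* retry count named in the commit message */
#define SKETCH_DELAY_US 20u  /* approximate delay between checks */

/* Hypothetical hooks: "is the completion event visible yet?" and "wait a bit". */
typedef bool (*sketch_ready_fn)(void *ctx);
typedef void (*sketch_delay_fn)(unsigned int usecs);

/* Returns 0 once the event is seen, -1 if it never shows up within the retries. */
static int sketch_wait_for_event(sketch_ready_fn ready, sketch_delay_fn delay,
				 void *ctx)
{
	assert(ready != NULL && delay != NULL);

	for (int i = 0; i < SKETCH_RETRIES; i++) {
		if (ready(ctx))
			return 0;
		delay(SKETCH_DELAY_US);
	}
	return -1;
}

/* Example hook: reports ready on the Nth poll; ctx points at the countdown. */
static bool sketch_ready_countdown(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}

/* Example hook: a delay that does nothing, for demonstration purposes. */
static void sketch_no_delay(unsigned int usecs)
{
	(void)usecs;
}
```

With a countdown of 3 the event is seen on the third poll and the helper returns 0; with a countdown larger than the retry count it gives up and returns -1.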
2025-06-19usb: misc: onboard_usb_dev: Fix usb5744 initialization sequenceJonathan Stroud1-13/+87
commit 1143d41922c0f87504f095417ba1870167970143 upstream. Introduce i2c read/write APIs for proper configuration register programming. This ensures that a read-modify-write sequence is performed and that reserved bits in the Runtime Flags 2 register are not touched. Also, the legacy smbus block write inserted an extra count value into the i2c data stream, which breaks the register write on the usb5744. Switching to the new i2c read/write APIs fixes both issues. Fixes: 6782311d04df ("usb: misc: onboard_usb_dev: add Microchip usb5744 SMBus programming support") Cc: stable <stable@kernel.org> Signed-off-by: Jonathan Stroud <jonathan.stroud@amd.com> Co-developed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/1747398760-284021-1-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19tty: serial: 8250_omap: fix TX with DMA for am33xxJiri Slaby (SUSE)1-15/+10
commit b495021a973e2468497689bd3e29b736747b896f upstream. Commit 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo") introduced an error in the TX DMA handling for 8250_omap. When the OMAP_DMA_TX_KICK flag is set, the "skip_byte" is pulled from the kfifo and emitted directly in order to start the DMA. While the kfifo is updated, dma->tx_size is not decreased. This leads to uart_xmit_advance() called in omap_8250_dma_tx_complete() advancing the kfifo by one too much. In practice, transmitting N bytes has been seen to result in the last N-1 bytes being sent repeatedly. This change fixes the problem by moving all of the dma setup after the OMAP_DMA_TX_KICK handling and using kfifo_len() instead of the DMA size for the 4-byte cutoff check. This slightly changes the behaviour at buffer wraparound, but it still transmits the correct bytes somehow. Now, the "skip_byte" would no longer be accounted to the stats. As previously, dma->tx_size included also this skip byte, up->icount.tx was updated by aforementioned uart_xmit_advance() in omap_8250_dma_tx_complete(). Fix this by using the uart_fifo_out() helper instead of bare kfifo_get(). Based on patch by Mans Rullgard <mans@mansr.com> Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Fixes: 1788cf6a91d9 ("tty: serial: switch from circ_buf to kfifo") Link: https://lore.kernel.org/all/20250506150748.3162-1-mans@mansr.com/ Reported-by: Mans Rullgard <mans@mansr.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250522053835.3495975-1-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19VMCI: fix race between vmci_host_setup_notify and vmci_ctx_unset_notifyWupeng Ma1-6/+5
commit 1bd6406fb5f36c2bb1e96e27d4c3e9f4d09edde4 upstream. During our test, it is found that a warning can be triggered in try_grab_folio as follows: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1678 at mm/gup.c:147 try_grab_folio+0x106/0x130 Modules linked in: CPU: 0 UID: 0 PID: 1678 Comm: syz.3.31 Not tainted 6.15.0-rc5 #163 PREEMPT(undef) RIP: 0010:try_grab_folio+0x106/0x130 Call Trace: <TASK> follow_huge_pmd+0x240/0x8e0 follow_pmd_mask.constprop.0.isra.0+0x40b/0x5c0 follow_pud_mask.constprop.0.isra.0+0x14a/0x170 follow_page_mask+0x1c2/0x1f0 __get_user_pages+0x176/0x950 __gup_longterm_locked+0x15b/0x1060 ? gup_fast+0x120/0x1f0 gup_fast_fallback+0x17e/0x230 get_user_pages_fast+0x5f/0x80 vmci_host_unlocked_ioctl+0x21c/0xf80 RIP: 0033:0x54d2cd ---[ end trace 0000000000000000 ]--- Digging into the source, context->notify_page may be initialized by get_user_pages_fast and can be seen in vmci_ctx_unset_notify, which will try to put_page it. However, get_user_pages_fast has not finished at that point, which leads to the following try_grab_folio warning. The race condition is shown as follows: cpu0 cpu1 vmci_host_do_set_notify vmci_host_setup_notify get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page); lockless_pages_from_mm gup_pgd_range gup_huge_pmd // update &context->notify_page vmci_host_do_set_notify vmci_ctx_unset_notify notify_page = context->notify_page; if (notify_page) put_page(notify_page); // page is freed __gup_longterm_locked __get_user_pages follow_trans_huge_pmd try_grab_folio // warn here To solve this, use a local variable page so that notify_page becomes visible only after get_user_pages_fast has finished. Fixes: a1d88436d53a ("VMCI: Fix two UVA mapping bugs") Cc: stable <stable@kernel.org> Closes: https://lore.kernel.org/all/e91da589-ad57-3969-d979-879bbd10dddd@huawei.com/ Signed-off-by: Wupeng Ma <mawupeng1@huawei.com> Link: https://lore.kernel.org/r/20250510033040.901582-1-mawupeng1@huawei.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19usb: usbtmc: Fix read_stb function and get_stb ioctlDave Penkler1-8/+9
commit acb3dac2805d3342ded7dbbd164add32bbfdf21c upstream. The usbtmc488_ioctl_read_stb function relied on a positive return from usbtmc_get_stb to reset the srq condition in the driver. The USBTMC_IOCTL_GET_STB case tested for a positive return to return the stb to the user. Commit: <cac01bd178d6> ("usb: usbtmc: Fix erroneous get_stb ioctl error returns") changed the return value of usbtmc_get_stb to 0 on success instead of returning the value of usb_control_msg which is positive in the normal case. This change caused the function usbtmc488_ioctl_read_stb and the USBTMC_IOCTL_GET_STB ioctl to no longer function correctly. Change the test in usbtmc488_ioctl_read_stb to test for failure first and return the failure code immediately. Change the test for the USBTMC_IOCTL_GET_STB ioctl to test for 0 instead of a positive value. Fixes: cac01bd178d6 ("usb: usbtmc: Fix erroneous get_stb ioctl error returns") Cc: stable@vger.kernel.org Signed-off-by: Dave Penkler <dpenkler@gmail.com> Link: https://lore.kernel.org/r/20250521121656.18174-3-dpenkler@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
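Illustration only, not part of the upstream commit: a minimal userspace sketch of the calling convention the message describes, where the getter returns 0 on success and a negative errno on failure, and the caller tests for failure first. sketch_get_stb() and sketch_read_stb() are hypothetical stand-ins, not the usbtmc driver's functions, and the status byte value 0x40 is made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical getter: 0 on success, negative errno on failure (never positive). */
static int sketch_get_stb(int simulate_error, uint8_t *stb)
{
	if (simulate_error)
		return -EIO;
	*stb = 0x40; /* made-up status byte for the example */
	return 0;
}

/* Caller pattern: return the failure code immediately, then use the value. */
static int sketch_read_stb(int simulate_error, uint8_t *out)
{
	uint8_t stb;
	int rv;

	assert(out != NULL);

	rv = sketch_get_stb(simulate_error, &stb);
	if (rv < 0)
		return rv;

	/* Success is rv == 0; testing for rv > 0 here would never be true. */
	*out = stb;
	return 0;
}
```

A caller that still tested for a positive return would silently skip the success path, which is the kind of breakage the message attributes to the earlier change.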
2025-06-19nvmem: zynqmp_nvmem: unbreak driver after cleanupPeter Korsgaard1-0/+1
commit fe8abdd175d7b547ae1a612757e7902bcd62e9cf upstream. Commit 29be47fcd6a0 ("nvmem: zynqmp_nvmem: zynqmp_nvmem_probe cleanup") changed the driver to expect the device pointer to be passed as the "context", but in nvmem the context parameter comes from nvmem_config.priv which is never set - Leading to null pointer exceptions when the device is accessed. Fixes: 29be47fcd6a0 ("nvmem: zynqmp_nvmem: zynqmp_nvmem_probe cleanup") Cc: stable <stable@kernel.org> Signed-off-by: Peter Korsgaard <peter@korsgaard.com> Reviewed-by: Michal Simek <michal.simek@amd.com> Tested-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Srinivas Kandagatla <srini@kernel.org> Link: https://lore.kernel.org/r/20250509122407.11763-3-srini@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19posix-cpu-timers: fix race between handle_posix_cpu_timers() and ↵Oleg Nesterov1-0/+9
posix_cpu_timer_del() commit f90fff1e152dedf52b932240ebbd670d83330eca upstream. If an exiting non-autoreaping task has already passed exit_notify() and calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent or debugger right after unlock_task_sighand(). If a concurrent posix_cpu_timer_del() runs at that moment, it won't be able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or lock_task_sighand() will fail. Add the tsk->exit_state check into run_posix_cpu_timers() to fix this. This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because exit_task_work() is called before exit_notify(). But the check still makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail anyway in this case. Cc: stable@vger.kernel.org Reported-by: Benoît Sevens <bsevens@google.com> Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19powerpc/kernel: Fix ppc_save_regs inclusion in buildMadhavan Srinivasan1-2/+0
commit 93bd4a80efeb521314485a06d8c21157240497bb upstream. A recent patch fixed an old commit 'fc2a5a6161a2 ("powerpc/64s: ppc_save_regs is now needed for all 64s builds")' which is to include building of ppc_save_regs.c only when XMON and KEXEC_CORE and PPC_BOOK3S are enabled. This was valid, since ppc_save_regs was called only in replay_system_reset() of the old irq.c, which was under BOOK3S. But there have been multiple refactorings of irq.c, which added a call to ppc_save_regs() from __replay_soft_interrupts -> replay_soft_interrupts, which is part of irq_64.c included under CONFIG_PPC64. And since ppc_save_regs is called in the CRASH_DUMP path as part of crash_setup_regs in kexec.h, CONFIG_PPC32 also needs it. So this recent patch, which enabled the building of ppc_save_regs.c, caused a build break when none of these (XMON, KEXEC_CORE, BOOK3S) were enabled as part of the config. Enable building of ppc_save_regs.c by default. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250511041111.841158-1-maddy@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19HID: usbhid: Eliminate recurrent out-of-bounds bug in usbhid_parse()Terry Junge4-20/+24
commit fe7f7ac8e0c708446ff017453add769ffc15deed upstream. Update struct hid_descriptor to better reflect the mandatory and optional parts of the HID Descriptor as per USB HID 1.11 specification. Note: the kernel currently does not parse any optional HID class descriptors, only the mandatory report descriptor. Update all references to member element desc[0] to rpt_desc. Add test to verify bLength and bNumDescriptors values are valid. Replace the for loop with direct access to the mandatory HID class descriptor member for the report descriptor. This eliminates the possibility of getting an out-of-bounds fault. Add a warning message if the HID descriptor contains any unsupported optional HID class descriptors. Reported-by: syzbot+c52569baf0c843f35495@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c52569baf0c843f35495 Fixes: f043bfc98c19 ("HID: usbhid: fix out-of-bounds bug") Cc: stable@vger.kernel.org Signed-off-by: Terry Junge <linuxhid@cosmicgizmosystems.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Jiri Kosina <jkosina@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19ALSA: usb-audio: Add implicit feedback quirk for RODE AI-1David Heimann1-0/+1
commit 6a3439a417b910e662c666993798e0691bc81147 upstream. The RODE AI-1 audio interface requires implicit feedback sync between playback endpoint 0x03 and feedback endpoint 0x84 on interface 3, but doesn't advertise this in its USB descriptors. Without this quirk, the device receives audio data but produces no output. Signed-off-by: David Heimann <d@dmeh.net> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/084dc88c-1193-4a94-a002-5599adff936c@app.fastmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19ALSA: usb-audio: Kill timer properly at removalTakashi Iwai1-1/+2
commit 0718a78f6a9f04b88d0dc9616cc216b31c5f3cf1 upstream. The USB-audio MIDI code initializes the timer, but in a rare case, the driver might be freed without the disconnect call. This leaves the timer in an active state while the assigned object is released via snd_usbmidi_free(), which ends up with a kernel warning when the debug configuration is enabled, as spotted by fuzzer. For avoiding the problem, put timer_shutdown_sync() at snd_usbmidi_free(), so that the timer can be killed properly. While we're at it, replace the existing timer_delete_sync() at the disconnect callback with timer_shutdown_sync(), too. Reported-by: syzbot+d8f72178ab6783a7daea@syzkaller.appspotmail.com Closes: https://lore.kernel.org/681c70d7.050a0220.a19a9.00c6.GAE@google.com Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250519212031.14436-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19Revert "wifi: mwifiex: Fix HT40 bandwidth issue."Francesco Dolcini1-4/+2
commit 570896604f47d44d4ff6882d2a588428d2a6ef17 upstream. This reverts commit 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.") That commit introduces a regression: when HT40 mode is enabled, received packets are lost. This was experienced with W8997 with both SDIO-UART and SDIO-SDIO variants. From an initial investigation the issue resolves on its own after some time, but it's not clear what the reason is. Given that this was just a performance optimization, let's revert it till we have a better understanding of the issue and a proper fix. Cc: Jeff Chen <jeff.chen_1@nxp.com> Cc: stable@vger.kernel.org Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.") Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/ Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250605130302.55555-1-francesco@dolcini.it [fix commit reference format] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19tools/resolve_btfids: Fix build when cross compiling kernel with clang.Suleiman Souhlal1-1/+1
commit a298bbab903e3fb4cbe16d36d6195e68fad1b776 upstream. When cross compiling the kernel with clang, we need to override CLANG_CROSS_FLAGS when preparing the step libraries. Prior to commit d1d096312176 ("tools: fix annoying "mkdir -p ..." logs when building tools in parallel"), MAKEFLAGS would have been set to a value that wouldn't set a value for CLANG_CROSS_FLAGS, hiding the fact that we weren't properly overriding it. Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program") Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20250606074538.1608546-1-suleiman@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19pidfs: never refuse ppid == 0 in PIDFD_GET_INFOMike Yuan1-1/+1
commit b55eb6eb2a7427428c59b293a0900131fc849595 upstream. In systemd we spotted an issue after switching to ioctl(PIDFD_GET_INFO) for obtaining pid number the pidfd refers to, that for processes with a parent from outer pidns PIDFD_GET_INFO unexpectedly yields -ESRCH [1]. It turned out that there's an arbitrary check blocking this, which is not really sensible given getppid() happily returns 0 for such processes. Just drop the spurious check and userspace ought to handle ppid == 0 properly everywhere. [1] https://github.com/systemd/systemd/issues/37715 Fixes: cdda1f26e74b ("pidfd: add ioctl to retrieve pid info") Signed-off-by: Mike Yuan <me@yhndnzj.com> Link: https://lore.kernel.org/20250604150238.42664-1-me@yhndnzj.com Cc: Christian Brauner <brauner@kernel.org> Cc: Luca Boccassi <luca.boccassi@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19rust: list: fix path of `assert_pinned!`Benno Lossin1-1/+1
commit eb71feaacaaca227ae8f91c8578cf831553c5ab5 upstream. Commit dbd5058ba60c ("rust: make pin-init its own crate") moved all items from pin-init into the pin-init crate, including the `assert_pinned!` macro. Thus fix the path of the sole user of the `assert_pinned!` macro. This occurrence was missed in the commit above, since it is in a macro rule that has no current users (although binder is a future user). Cc: stable@kernel.org Fixes: dbd5058ba60c ("rust: make pin-init its own crate") Signed-off-by: Benno Lossin <lossin@kernel.org> Link: https://lore.kernel.org/r/20250525173450.853413-1-lossin@kernel.org [ Reworded slightly as discussed in the list. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19rust: compile libcore with edition 2024 for 1.87+Gary Guo2-11/+16
commit f4daa80d6be7d3c55ca72a8e560afc4e21f886aa upstream. Rust 1.87 (released on 2025-05-15) compiles core library with edition 2024 instead of 2021 [1]. Ensure that the edition matches libcore's expectation to avoid potential breakage. [ J3m3 reported in Zulip [2] that the `rust-analyzer` target was broken after this patch -- indeed, we need to avoid `core-cfgs` since those are passed to the `rust-analyzer` target. So, instead, I tweaked the patch to create a new `core-edition` variable and explicitly mention the `--edition` flag instead of reusing `core-cfg`s. In addition, pass a new argument using this new variable to `generate_rust_analyzer.py` so that we set the right edition there. By the way, for future reference: the `filter-out` change is needed for Rust < 1.87, since otherwise we would skip the `--edition=2021` we just added, ending up with no edition flag, and thus the compiler would default to the 2015 one. [2] https://rust-for-linux.zulipchat.com/#narrow/channel/291565/topic/x/near/520206547 - Miguel ] Cc: stable@vger.kernel.org # Needed in 6.12.y and later (Rust is pinned in older LTSs). Link: https://github.com/rust-lang/rust/pull/138162 [1] Reported-by: est31 <est31@protonmail.com> Closes: https://github.com/Rust-for-Linux/linux/issues/1163 Signed-off-by: Gary Guo <gary@garyguo.net> Link: https://lore.kernel.org/r/20250517085600.2857460-1-gary@garyguo.net Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19objtool/rust: relax slice condition to cover more `noreturn` Rust functionsMiguel Ojeda1-1/+2
commit cbeaa41dfe26b72639141e87183cb23e00d4b0dd upstream. Developers are indeed hitting other `noreturn` slice symbols in Nova [1], thus relax the last check in the list so that we catch all of them, i.e. *_4core5slice5index22slice_index_order_fail *_4core5slice5index24slice_end_index_len_fail *_4core5slice5index26slice_start_index_len_fail *_4core5slice5index29slice_end_index_overflow_fail *_4core5slice5index31slice_start_index_overflow_fail These all exist since at least Rust 1.78.0, thus backport it too. See commit 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions") for more details. Cc: stable@vger.kernel.org # Needed in 6.12.y and later. Cc: John Hubbard <jhubbard@nvidia.com> Cc: Timur Tabi <ttabi@nvidia.com> Cc: Kane York <kanepyork@gmail.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Reported-by: Joel Fernandes <joelagnelf@nvidia.com> Fixes: 56d680dd23c3 ("objtool/rust: list `noreturn` Rust functions") Closes: https://lore.kernel.org/rust-for-linux/20250513180757.GA1295002@joelnvbox/ [1] Tested-by: Joel Fernandes <joelagnelf@nvidia.com> Link: https://lore.kernel.org/r/20250520185555.825242-1-ojeda@kernel.org Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-19uapi: bitops: use UAPI-safe variant of BITS_PER_LONG againThomas Weißschuh1-2/+2
commit 11fcf368506d347088e613edf6cd2604d70c454f upstream. Commit 1e7933a575ed ("uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)"") did not take into account that the usage of BITS_PER_LONG in __GENMASK() was changed to __BITS_PER_LONG for UAPI-safety in commit 3c7a8e190bc5 ("uapi: introduce uapi-friendly macros for GENMASK"). BITS_PER_LONG cannot be used in UAPI headers as it derives from the kernel configuration and not from the current compiler invocation. When building compat userspace code or a compat vDSO its value will be incorrect. Switch back to __BITS_PER_LONG. Fixes: 1e7933a575ed ("uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)"") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Yury Norov [NVIDIA] <yury.norov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
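Illustration only, not part of the upstream commit: a minimal userspace sketch of why the bit width should come from the compiler invocation. SKETCH_BITS_PER_LONG and sketch_genmask() are hypothetical names; the width is computed from sizeof(unsigned long), which is an assumption standing in for how __BITS_PER_LONG tracks the current build, and the mask arithmetic is a plain contiguous-bitmask formula rather than a copy of the kernel macro.

```c
#include <assert.h>
#include <limits.h>

/* Width of long for *this* compiler invocation, not for some other config. */
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Build a mask with bits l..h (inclusive) set. */
static unsigned long sketch_genmask(unsigned int h, unsigned int l)
{
	assert(l <= h && h < SKETCH_BITS_PER_LONG);

	/* Everything from bit l upward, ANDed with everything from bit h downward. */
	return (~0UL - (1UL << l) + 1) &
	       (~0UL >> (SKETCH_BITS_PER_LONG - 1 - h));
}
```

If the width constant were taken from a 64-bit configuration while the code was compiled as 32-bit compat userspace, the right shift amount would be wrong, which is the failure mode the message describes.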
2025-06-19block: Fix bvec_set_folio() for very large foliosMatthew Wilcox (Oracle)1-2/+5
[ Upstream commit 5e223e06ee7c6d8f630041a0645ac90e39a42cc6 ] Similarly to 26064d3e2b4d ("block: fix adding folio to bio"), if we attempt to add a folio that is larger than 4GB, we'll silently truncate the offset and len. Widen the parameters to size_t, assert that the length is less than 4GB and set the first page that contains the interesting data rather than the first page of the folio. Fixes: 26db5ee15851 (block: add a bvec_set_folio helper) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
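Illustration only, not part of the upstream commit: a minimal userspace sketch of the widened-parameter idea, assuming a 4096-byte page size. struct sketch_bvec and sketch_bvec_set_folio() are hypothetical stand-ins rather than the kernel's struct bio_vec and bvec_set_folio(); the sketch takes size_t arguments, asserts that the length fits in 32 bits, and records the page that actually contains the start of the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical stand-in for a bio_vec: 32-bit length and in-page offset. */
struct sketch_bvec {
	size_t   first_page; /* index of the page within the folio */
	uint32_t len;
	uint32_t offset;     /* offset within first_page */
};

static void sketch_bvec_set_folio(struct sketch_bvec *bv, size_t len,
				  size_t offset)
{
	/* The stored length is 32 bits wide, so refuse anything that cannot fit. */
	assert(len <= UINT32_MAX);

	/* Point at the page holding the data instead of truncating the offset. */
	bv->first_page = offset / SKETCH_PAGE_SIZE;
	bv->offset = (uint32_t)(offset % SKETCH_PAGE_SIZE);
	bv->len = (uint32_t)len;
}
```

Because the offset is split into a page index and an in-page remainder before being narrowed, no high bits are silently dropped when it is stored in the 32-bit field.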
2025-06-19bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAPMatthew Wilcox (Oracle)1-1/+1
[ Upstream commit f826ec7966a63d48e16e0868af4e038bf9a1a3ae ] It is possible for physically contiguous folios to have discontiguous struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not. This is correctly handled by folio_page_idx(), so remove this open-coded implementation. Fixes: 640d1930bef4 (block: Add bio_for_each_folio_all()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19smb: client: fix perf regression with deferred closesPaulo Alcantara1-3/+6
[ Upstream commit b64af6bcd3b0f3fc633d6a70adb0991737abfef4 ] Customer reported that one of their applications started failing to open files with STATUS_INSUFFICIENT_RESOURCES due to NetApp server hitting the maximum number of opens to same file that it would allow for a single client connection. It turned out the client was failing to reuse open handles with deferred closes because matching ->f_flags directly without masking off O_CREAT|O_EXCL|O_TRUNC bits first broke the comparison and then the client ended up with thousands of deferred closes to same file. Those bits are already satisfied on the original open, so no need to check them against existing open handles. Reproducer: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <fcntl.h> #include <pthread.h> #define NR_THREADS 4 #define NR_ITERATIONS 2500 #define TEST_FILE "/mnt/1/test/dir/foo" static char buf[64]; static void *worker(void *arg) { int i, j; int fd; for (i = 0; i < NR_ITERATIONS; i++) { fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_APPEND, 0666); for (j = 0; j < 16; j++) write(fd, buf, sizeof(buf)); close(fd); } return NULL; } int main(int argc, char *argv[]) { pthread_t t[NR_THREADS]; int fd; int i; fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_TRUNC, 0666); close(fd); memset(buf, 'a', sizeof(buf)); for (i = 0; i < NR_THREADS; i++) pthread_create(&t[i], NULL, worker, NULL); for (i = 0; i < NR_THREADS; i++) pthread_join(t[i], NULL); return 0; } Before patch: $ mount.cifs //srv/share /mnt/1 -o ... $ mkdir -p /mnt/1/test/dir $ gcc repro.c && ./a.out ... 
number of opens: 1 Cc: linux-cifs@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Jay Shin <jaeshin@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Fixes: b8ea3b1ff544 ("smb: enable reuse of deferred file handles for write operations") Acked-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
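
The flag comparison described in the smb commit above can be illustrated with a small userspace sketch. This is not the cifs client code: the helper name flags_match_for_reuse() and the OPEN_ONLY_FLAGS macro are hypothetical, introduced only to show why masking off the creation-time bits before comparing lets a later open match an existing handle.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Hypothetical macro (not from the kernel source): flags that only affect
 * the initial open and say nothing about how the handle is used afterwards.
 */
#define OPEN_ONLY_FLAGS (O_CREAT | O_EXCL | O_TRUNC)

/*
 * Hypothetical helper: decide whether an existing handle opened with
 * handle_flags could serve a new open requested with request_flags.
 * Both sides are masked so creation-time bits never cause a mismatch.
 */
bool flags_match_for_reuse(unsigned int handle_flags, unsigned int request_flags)
{
        return (handle_flags & ~OPEN_ONLY_FLAGS) ==
               (request_flags & ~OPEN_ONLY_FLAGS);
}
```

With this helper, a handle recorded as O_WRONLY|O_APPEND matches a new request for O_WRONLY|O_CREAT|O_APPEND, whereas a direct equality test on the raw flags would not. A request with a different access mode (for example O_RDWR) still fails to match, as it should.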
2025-06-19io_uring: consistently use rcu semantics with sqpoll threadKeith Busch4-15/+38
[ Upstream commit c538f400fae22725580842deb2bef546701b64bd ] The sqpoll thread is dereferenced with rcu read protection in one place, so it needs to be annotated as an __rcu type, and should consistently use rcu helpers for access and assignment to make sparse happy. Since most of the accesses occur under the sqd->lock, we can use rcu_dereference_protected() without declaring an rcu read section. Provide a simple helper to get the thread from a locked context. Fixes: ac0b8b327a5677d ("io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()") Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250611205343.1821117-1-kbusch@meta.com [axboe: fold in fix for register.c] Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_workChristoph Hellwig1-2/+5
[ Upstream commit cf625013d8741c01407bbb4a60c111b61b9fa69d ] Bios queued up in the zone write plug have already gone through all preparation in the submit_bio path, including the freeze protection. Submitting them through submit_bio_noacct_nocheck duplicates the work and can cause deadlocks when freezing a queue with pending bio write plugs. Go straight to ->submit_bio or blk_mq_submit_bio to bypass the superfluous extra freeze protection and checks. Fixes: 9b1ce7f0c6f8 ("block: Implement zone append emulation") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250611044416.2351850-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()Penglei Jiang2-7/+14
[ Upstream commit ac0b8b327a5677dc6fecdf353d808161525b1ff0 ] syzbot reports: BUG: KASAN: slab-use-after-free in getrusage+0x1109/0x1a60 Read of size 8 at addr ffff88810de2d2c8 by task a.out/304 CPU: 0 UID: 0 PID: 304 Comm: a.out Not tainted 6.16.0-rc1 #1 PREEMPT(voluntary) Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x53/0x70 print_report+0xd0/0x670 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? getrusage+0x1109/0x1a60 kasan_report+0xce/0x100 ? getrusage+0x1109/0x1a60 getrusage+0x1109/0x1a60 ? __pfx_getrusage+0x10/0x10 __io_uring_show_fdinfo+0x9fe/0x1790 ? ksys_read+0xf7/0x1c0 ? do_syscall_64+0xa4/0x260 ? vsnprintf+0x591/0x1100 ? __pfx___io_uring_show_fdinfo+0x10/0x10 ? __pfx_vsnprintf+0x10/0x10 ? mutex_trylock+0xcf/0x130 ? __pfx_mutex_trylock+0x10/0x10 ? __pfx_show_fd_locks+0x10/0x10 ? io_uring_show_fdinfo+0x57/0x80 io_uring_show_fdinfo+0x57/0x80 seq_show+0x38c/0x690 seq_read_iter+0x3f7/0x1180 ? inode_set_ctime_current+0x160/0x4b0 seq_read+0x271/0x3e0 ? __pfx_seq_read+0x10/0x10 ? __pfx__raw_spin_lock+0x10/0x10 ? __mark_inode_dirty+0x402/0x810 ? selinux_file_permission+0x368/0x500 ? file_update_time+0x10f/0x160 vfs_read+0x177/0xa40 ? __pfx___handle_mm_fault+0x10/0x10 ? __pfx_vfs_read+0x10/0x10 ? mutex_lock+0x81/0xe0 ? __pfx_mutex_lock+0x10/0x10 ? fdget_pos+0x24d/0x4b0 ksys_read+0xf7/0x1c0 ? __pfx_ksys_read+0x10/0x10 ? 
do_user_addr_fault+0x43b/0x9c0 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f0f74170fc9 Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 8 RSP: 002b:00007fffece049e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0f74170fc9 RDX: 0000000000001000 RSI: 00007fffece049f0 RDI: 0000000000000004 RBP: 00007fffece05ad0 R08: 0000000000000000 R09: 00007fffece04d90 R10: 0000000000000000 R11: 0000000000000206 R12: 00005651720a1100 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 298: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x6e/0x70 kmem_cache_alloc_node_noprof+0xe8/0x330 copy_process+0x376/0x5e00 create_io_thread+0xab/0xf0 io_sq_offload_create+0x9ed/0xf20 io_uring_setup+0x12b0/0x1cc0 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 22: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x37/0x50 kmem_cache_free+0xc4/0x360 rcu_core+0x5ff/0x19f0 handle_softirqs+0x18c/0x530 run_ksoftirqd+0x20/0x30 smpboot_thread_fn+0x287/0x6c0 kthread+0x30d/0x630 ret_from_fork+0xef/0x1a0 ret_from_fork_asm+0x1a/0x30 Last potentially related work creation: kasan_save_stack+0x33/0x60 kasan_record_aux_stack+0x8c/0xa0 __call_rcu_common.constprop.0+0x68/0x940 __schedule+0xff2/0x2930 __cond_resched+0x4c/0x80 mutex_lock+0x5c/0xe0 io_uring_del_tctx_node+0xe1/0x2b0 io_uring_clean_tctx+0xb7/0x160 io_uring_cancel_generic+0x34e/0x760 do_exit+0x240/0x2350 do_group_exit+0xab/0x220 __x64_sys_exit_group+0x39/0x40 x64_sys_call+0x1243/0x1840 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff88810de2cb00 which belongs to the cache task_struct of size 3712 The buggy address is located 1992 bytes 
inside of freed 3712-byte region [ffff88810de2cb00, ffff88810de2d980) which is caused by the task_struct pointed to by sq->thread being released while it is being used in the function __io_uring_show_fdinfo(). Holding ctx->uring_lock does not prevent the release or exit of sq->thread. Fix this by assigning and looking up ->thread under RCU, and grabbing a reference to the task_struct. This ensures that it cannot get released while fdinfo is using it. Reported-by: syzbot+531502bbbe51d2f769f4@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/682b06a5.a70a0220.3849cf.00b3.GAE@google.com Fixes: 3fcb9d17206e ("io_uring/sqpoll: statistics of the true utilization of sq threads") Signed-off-by: Penglei Jiang <superman.xpt@gmail.com> Link: https://lore.kernel.org/r/20250610171801.70960-1-superman.xpt@gmail.com [axboe: massage commit message] Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-19block: use q->elevator with ->elevator_lock held in elv_iosched_show()Ming Lei1-2/+1
[ Upstream commit 94209d27d14104ed828ca88cd5403a99162fe51a ] Use q->elevator with ->elevator_lock held in elv_iosched_show(), since the local cached elevator reference may become stale after getting ->elevator_lock. Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250505141805.2751237-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>