summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
4 daysxfs: fix UAF in xchk_btree_check_block_ownerDarrick J. Wong1-2/+5
commit 1c253e11225bc5167217897885b85093e17c2217 upstream. We cannot dereference bs->cur when trying to determine if bs->cur aliases bs->sc->sa.{bno,rmap}_cur after the latter has been freed. Fix this by sampling before type before any freeing could happen. The correct temporal ordering was broken when we removed xfs_btnum_t. Cc: r772577952@gmail.com Cc: <stable@vger.kernel.org> # v6.9 Fixes: ec793e690f801d ("xfs: remove xfs_btnum_t") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Jiaming Zhang <r772577952@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayserofs: fix UAF issue for file-backed mounts w/ directio optionChao Yu1-1/+6
commit 1caf50ce4af096d0280d59a31abdd85703cd995c upstream. [ 9.269940][ T3222] Call trace: [ 9.269948][ T3222] ext4_file_read_iter+0xac/0x108 [ 9.269979][ T3222] vfs_iocb_iter_read+0xac/0x198 [ 9.269993][ T3222] erofs_fileio_rq_submit+0x12c/0x180 [ 9.270008][ T3222] erofs_fileio_submit_bio+0x14/0x24 [ 9.270030][ T3222] z_erofs_runqueue+0x834/0x8ac [ 9.270054][ T3222] z_erofs_read_folio+0x120/0x220 [ 9.270083][ T3222] filemap_read_folio+0x60/0x120 [ 9.270102][ T3222] filemap_fault+0xcac/0x1060 [ 9.270119][ T3222] do_pte_missing+0x2d8/0x1554 [ 9.270131][ T3222] handle_mm_fault+0x5ec/0x70c [ 9.270142][ T3222] do_page_fault+0x178/0x88c [ 9.270167][ T3222] do_translation_fault+0x38/0x54 [ 9.270183][ T3222] do_mem_abort+0x54/0xac [ 9.270208][ T3222] el0_da+0x44/0x7c [ 9.270227][ T3222] el0t_64_sync_handler+0x5c/0xf4 [ 9.270253][ T3222] el0t_64_sync+0x1bc/0x1c0 EROFS may encounter above panic when enabling file-backed mount w/ directio mount option, the root cause is it may suffer UAF in below race condition: - z_erofs_read_folio wq s_dio_done_wq - z_erofs_runqueue - erofs_fileio_submit_bio - erofs_fileio_rq_submit - vfs_iocb_iter_read - ext4_file_read_iter - ext4_dio_read_iter - iomap_dio_rw : bio was submitted and return -EIOCBQUEUED - dio_aio_complete_work - dio_complete - dio->iocb->ki_complete (erofs_fileio_ki_complete()) - kfree(rq) : it frees iocb, iocb.ki_filp can be UAF in file_accessed(). - file_accessed : access NULL file point Introduce a reference count in struct erofs_fileio_rq, and initialize it as two, both erofs_fileio_ki_complete() and erofs_fileio_rq_submit() will decrease reference count, the last one decreasing the reference count to zero will free rq. Cc: stable@kernel.org Fixes: fb176750266a ("erofs: add file-backed mount support") Fixes: 6422cde1b0d5 ("erofs: use buffered I/O for file-backed mounts by default") Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayshfs: ensure sb->s_fs_info is always cleaned upMehdi Ben Hadj Khelifa2-22/+23
commit 05ce49a902be15dc93854cbfc20161205a9ee446 upstream. When hfs was converted to the new mount api a bug was introduced by changing the allocation pattern of sb->s_fs_info. If setup_bdev_super() fails after a new superblock has been allocated by sget_fc(), but before hfs_fill_super() takes ownership of the filesystem-specific s_fs_info data it was leaked. Fix this by freeing sb->s_fs_info in hfs_kill_super(). Cc: stable@vger.kernel.org Fixes: ffcd06b6d13b ("hfs: convert hfs to use the new mount api") Reported-by: syzbot+ad45f827c88778ff7df6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ad45f827c88778ff7df6 Tested-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Mehdi Ben Hadj Khelifa <mehdi.benhadjkhelifa@gmail.com> Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20251201222843.82310-2-mehdi.benhadjkhelifa@gmail.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysnilfs2: Fix potential block overflow that cause system hangEdward Adam Davis1-0/+4
commit ed527ef0c264e4bed6c7b2a158ddf516b17f5f66 upstream. When a user executes the FITRIM command, an underflow can occur when calculating nblocks if end_block is too small. Since nblocks is of type sector_t, which is u64, a negative nblocks value will become a very large positive integer. This ultimately leads to the block layer function __blkdev_issue_discard() taking an excessively long time to process the bio chain, and the ns_segctor_sem lock remains held for a long period. This prevents other tasks from acquiring the ns_segctor_sem lock, resulting in the hang reported by syzbot in [1]. If the ending block is too small, typically if it is smaller than 4KiB range, depending on the usage of the segment 0, it may be possible to attempt a discard request beyond the device size causing the hang. Exiting successfully and assign the discarded size (0 in this case) to range->len. Although the start and len values in the user input range are too small, a conservative strategy is adopted here to safely ignore them, which is equivalent to a no-op; it will not perform any trimming and will not throw an error. [1] task:segctord state:D stack:28968 pid:6093 tgid:6093 ppid:2 task_flags:0x200040 flags:0x00080000 Call Trace: rwbase_write_lock+0x3dd/0x750 kernel/locking/rwbase_rt.c:272 nilfs_transaction_lock+0x253/0x4c0 fs/nilfs2/segment.c:357 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2569 [inline] nilfs_segctor_thread+0x6ec/0xe00 fs/nilfs2/segment.c:2684 [ryusuke: corrected part of the commit message about the consequences] Fixes: 82e11e857be3 ("nilfs2: add nilfs_sufile_trim_fs to trim clean segs") Reported-by: syzbot+7eedce5eb281acd832f0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7eedce5eb281acd832f0 Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: let send_done handle a completion without IB_SEND_SIGNALEDStefan Metzmacher1-0/+27
commit cf74fcdc43b322b6188a0750b5ee79e38be6d078 upstream. With smbdirect_send_batch processing we likely have requests without IB_SEND_SIGNALED, which will be destroyed in the final request that has IB_SEND_SIGNALED set. If the connection is broken all requests are signaled even without explicit IB_SEND_SIGNALED. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: let smbd_post_send_negotiate_req() use smbd_post_send()Stefan Metzmacher1-25/+7
commit 5b1c6149657af840a02885135c700ab42e6aa322 upstream. The server has similar logic and it makes sure that request->wr is used instead of a stack struct ib_send_wr send_wr. This makes sure send_done can see request->wr.send_flags as the next commit will check for IB_SEND_SIGNALED Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: fix last send credit problem causing disconnectsStefan Metzmacher1-2/+29
commit 93ac432274e1361b4f6cd69e7c5d9b3ac21e13f5 upstream. When we are about to use the last send credit that was granted to us by the peer, we need to wait until we are ourself able to grant at least one credit to the peer. Otherwise it might not be possible for the peer to grant more credits. The following sections in MS-SMBD are related to this: 3.1.5.1 Sending Upper Layer Messages ... If Connection.SendCredits is 1 and the CreditsGranted field of the message is 0, stop processing. ... 3.1.5.9 Managing Credits Prior to Sending ... If Connection.ReceiveCredits is zero, or if Connection.SendCredits is one and the Connection.SendQueue is not empty, the sender MUST allocate and post at least one receive of size Connection.MaxReceiveSize and MUST increment Connection.ReceiveCredits by the number allocated and posted. If no receives are posted, the processing MUST return a value of zero to indicate to the caller that no Send message can be currently performed. ... This is a similar logic as we have in the server. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: make use of smbdirect_socket.send_io.bcreditsStefan Metzmacher1-3/+55
commit 21538121efe6c8c5b51c742fa02cbe820bc48714 upstream. It turns out that our code will corrupt the stream of reassabled data transfer messages when we trigger an immendiate (empty) send. In order to fix this we'll have a single 'batch' credit per connection. And code getting that credit is free to use as much messages until remaining_length reaches 0, then the batch credit it given back and the next logical send can happen. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: use smbdirect_send_batch processingStefan Metzmacher1-14/+135
commit 2c1ac39ce9cd4112f406775c626eef7f3eb4c481 upstream. This will allow us to use similar logic as we have in the server soon, so that we can share common code later. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: introduce and use smbd_{alloc, free}_send_io()Stefan Metzmacher1-29/+58
commit dc77da0373529d43175984b390106be2d8f03609 upstream. This is basically a copy of smb_direct_{alloc,free}_sendmsg() in the server, with just using ib_dma_unmap_page() in all cases, which is the same as ib_dma_unmap_single(). We'll use this logic in common code in future. (I basically backported it from my branch that as already has everything in common). Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: split out smbd_ib_post_send()Stefan Metzmacher1-16/+17
commit bf30515caec590316e0d08208e4252eed4c160df upstream. This is like smb_direct_post_send() in the server and will simplify porting the smbdirect_send_batch and credit related logic from the server. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: port and use the wait_for_credits logic used by serverStefan Metzmacher1-27/+43
commit bb848d205f7ac0141af52a5acb6dd116d9b71177 upstream. This simplifies the logic and prepares the use of smbdirect_send_batch in order to make sure all messages in a multi fragment send are grouped together. We'll add the smbdirect_send_batch processin in a later patch. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: remove pointless sc->send_io.pending handling in ↵Stefan Metzmacher1-5/+0
smbd_post_send_iter() commit 8bfe3fd33f36b987c8200b112646732b5f5cd8b3 upstream. If we reach this the connection is already broken as smbd_post_send() already called smbd_disconnect_rdma_connection(). This will also simplify further changes. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: remove pointless sc->recv_io.credits.count rollbackStefan Metzmacher1-3/+0
commit 6858531e5e8d68828eec349989cefce3f45a487f upstream. We either reach this code path before we call new_credits = manage_credits_prior_sending(sc), which means new_credits is still 0 or the connection is already broken as smbd_post_send() already called smbd_disconnect_rdma_connection(). This will also simplify further changes. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: let smbd_post_send() make use of request->wrStefan Metzmacher1-8/+7
commit bf1656e12a9db2add716c7fb57b56967f69599fa upstream. We don't need a stack variable in addition. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: let recv_done() queue a refill when the peer is low on creditsStefan Metzmacher1-2/+5
commit defb3c05fee94b296eebe05aaea16d2664b00252 upstream. In captures I saw that Windows was granting 191 credits in a batch when its peer posted a lot of messages. We are asking for a credit target of 255 and 191 is 252*3/4. So we also use that logic in order to fill the recv buffers available to the peer. Fixes: 02548c477a90 ("smb: client: queue post_recv_credits_work also if the peer raises the credit target") Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: make use of smbdirect_socket.recv_io.credits.availableStefan Metzmacher1-6/+28
commit 9911b1ed187a770a43950bf51f340ad4b7beecba upstream. The logic off managing recv credits by counting posted recv_io and granted credits is racy. That's because the peer might already consumed a credit, but between receiving the incoming recv at the hardware and processing the completion in the 'recv_done' functions we likely have a window where we grant credits, which don't really exist. So we better have a decicated counter for the available credits, which will be incremented when we posted new recv buffers and drained when we grant the credits to the peer. Fixes: 5fb9b459b368 ("smb: client: count the number of posted recv_io messages in order to calculated credits") Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: let send_done handle a completion without IB_SEND_SIGNALEDStefan Metzmacher1-0/+26
commit 9da82dc73cb03e85d716a2609364572367a5ff47 upstream. With smbdirect_send_batch processing we likely have requests without IB_SEND_SIGNALED, which will be destroyed in the final request that has IB_SEND_SIGNALED set. If the connection is broken all requests are signaled even without explicit IB_SEND_SIGNALED. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: fix last send credit problem causing disconnectsStefan Metzmacher1-2/+30
commit 8cf2bbac6281434065f5f3aeab19c9c08ff755a2 upstream. When we are about to use the last send credit that was granted to us by the peer, we need to wait until we are ourself able to grant at least one credit to the peer. Otherwise it might not be possible for the peer to grant more credits. The following sections in MS-SMBD are related to this: 3.1.5.1 Sending Upper Layer Messages ... If Connection.SendCredits is 1 and the CreditsGranted field of the message is 0, stop processing. ... 3.1.5.9 Managing Credits Prior to Sending ... If Connection.ReceiveCredits is zero, or if Connection.SendCredits is one and the Connection.SendQueue is not empty, the sender MUST allocate and post at least one receive of size Connection.MaxReceiveSize and MUST increment Connection.ReceiveCredits by the number allocated and posted. If no receives are posted, the processing MUST return a value of zero to indicate to the caller that no Send message can be currently performed. ... This problem was found by running this on Windows 2025 against ksmbd with required smb signing: 'frametest.exe -r 4k -t 20 -n 2000' after 'frametest.exe -w 4k -t 20 -n 2000'. Link: https://lore.kernel.org/linux-cifs/b58fa352-2386-4145-b42e-9b4b1d484e17@samba.org/ Cc: <stable@vger.kernel.org> # 6.18.x Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: make use of smbdirect_socket.send_io.bcreditsStefan Metzmacher1-2/+51
commit 34abd408c8ba24d7c97bd02ba874d8c714f49db1 upstream. It turns out that our code will corrupt the stream of reassabled data transfer messages when we trigger an immendiate (empty) send. In order to fix this we'll have a single 'batch' credit per connection. And code getting that credit is free to use as much messages until remaining_length reaches 0, then the batch credit it given back and the next logical send can happen. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: let recv_done() queue a refill when the peer is low on creditsStefan Metzmacher1-2/+4
commit 8106978d400cc88a99fb94927afe8fec7391ca3e upstream. In captures I saw that Windows was granting 191 credits in a batch when its peer posted a lot of messages. We are asking for a credit target of 255 and 191 is 252*3/4. So we also use that logic in order to fill the recv buffers available to the peer. Fixes: a7eef6144c97 ("smb: server: queue post_recv_credits_work in put_recvmsg() and avoid count_avail_recvmsg") Cc: <stable@vger.kernel.org> # 6.18.x Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: make use of smbdirect_socket.recv_io.credits.availableStefan Metzmacher1-5/+25
commit 26ad87a2cfb8c1384620d1693a166ed87303046e upstream. The logic off managing recv credits by counting posted recv_io and granted credits is racy. That's because the peer might already consumed a credit, but between receiving the incoming recv at the hardware and processing the completion in the 'recv_done' functions we likely have a window where we grant credits, which don't really exist. So we better have a decicated counter for the available credits, which will be incremented when we posted new recv buffers and drained when we grant the credits to the peer. This fixes regression Namjae reported with the 6.18 release. Fixes: 89b021a72663 ("smb: server: manage recv credits by counting posted recv_io and granted credits") Cc: <stable@vger.kernel.org> # 6.18.x Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: smbdirect: introduce smbdirect_socket.send_io.bcredits.*Stefan Metzmacher1-0/+16
commit 8e94268b21c8235d430ce1aa6dc0b15952744b9b upstream. It turns out that our code will corrupt the stream of reassabled data transfer messages when we trigger an immendiate (empty) send. In order to fix this we'll have a single 'batch' credit per connection. And code getting that credit is free to use as much messages until remaining_length reaches 0, then the batch credit it given back and the next logical send can happen. Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: smbdirect: introduce smbdirect_socket.recv_io.credits.availableStefan Metzmacher1-0/+2
commit 6e3c5052f9686192e178806e017b7377155f4bab upstream. The logic off managing recv credits by counting posted recv_io and granted credits is racy. That's because the peer might already consumed a credit, but between receiving the incoming recv at the hardware and processing the completion in the 'recv_done' functions we likely have a window where we grant credits, which don't really exist. So we better have a decicated counter for the available credits, which will be incremented when we posted new recv buffers and drained when we grant the credits to the peer. Fixes: 5fb9b459b368 ("smb: client: count the number of posted recv_io messages in order to calculated credits") Fixes: 89b021a72663 ("smb: server: manage recv credits by counting posted recv_io and granted credits") Cc: <stable@vger.kernel.org> # 6.18.x Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: server: fix leak of active_num_conn in ksmbd_tcp_new_connection()Henrique Carvalho1-1/+2
commit 77ffbcac4e569566d0092d5f22627dfc0896b553 upstream. On kthread_run() failure in ksmbd_tcp_new_connection(), the transport is freed via free_transport(), which does not decrement active_num_conn, leaking this counter. Replace free_transport() with ksmbd_tcp_disconnect(). Fixes: 0d0d4680db22e ("ksmbd: add max connections parameter") Cc: stable@vger.kernel.org Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysksmbd: add chann_lock to protect ksmbd_chann_list xarrayNamjae Jeon3-1/+17
commit 4f3a06cc57976cafa8c6f716646be6c79a99e485 upstream. ksmbd_chann_list xarray lacks synchronization, allowing use-after-free in multi-channel sessions (between lookup_chann_list() and ksmbd_chann_del). Adds rw_semaphore chann_lock to struct ksmbd_session and protects all xa_load/xa_store/xa_erase accesses. Cc: stable@vger.kernel.org Reported-by: Igor Stepansky <igor.stepansky@orca.security> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 daysksmbd: fix infinite loop caused by next_smb2_rcv_hdr_off reset in error pathsNamjae Jeon1-3/+3
commit 010eb01ce23b34b50531448b0da391c7f05a72af upstream. The problem occurs when a signed request fails smb2 signature verification check. In __process_request(), if check_sign_req() returns an error, set_smb2_rsp_status(work, STATUS_ACCESS_DENIED) is called. set_smb2_rsp_status() set work->next_smb2_rcv_hdr_off as zero. By resetting next_smb2_rcv_hdr_off to zero, the pointer to the next command in the chain is lost. Consequently, is_chained_smb2_message() continues to point to the same request header instead of advancing. If the header's NextCommand field is non-zero, the function returns true, causing __handle_ksmbd_work() to repeatedly process the same failed request in an infinite loop. This results in the kernel log being flooded with "bad smb2 signature" messages and high CPU usage. This patch fixes the issue by changing the return value from SERVER_HANDLER_CONTINUE to SERVER_HANDLER_ABORT. This ensures that the processing loop terminates immediately rather than attempting to continue from an invalidated offset. Reported-by: tianshuo han <hantianshuo233@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
4 dayssmb: client: split cached_fid bitfields to avoid shared-byte RMW racesHenrique Carvalho1-4/+4
commit ec306600d5ba7148c9dbf8f5a8f1f5c1a044a241 upstream. is_open, has_lease and on_list are stored in the same bitfield byte in struct cached_fid but are updated in different code paths that may run concurrently. Bitfield assignments generate byte read–modify–write operations (e.g. `orb $mask, addr` on x86_64), so updating one flag can restore stale values of the others. A possible interleaving is: CPU1: load old byte (has_lease=1, on_list=1) CPU2: clear both flags (store 0) CPU1: RMW store (old | IS_OPEN) -> reintroduces cleared bits To avoid this class of races, convert these flags to separate bool fields. Cc: stable@vger.kernel.org Fixes: ebe98f1447bbc ("cifs: enable caching of directories for which a lease is held") Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
13 daysMerge tag 'mm-hotfixes-stable-2026-02-06-12-37' of ↵Linus Torvalds1-15/+27
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "A couple of late-breaking MM fixes. One against a new-in-this-cycle patch and the other addresses a locking issue which has been there for over a year" * tag 'mm-hotfixes-stable-2026-02-06-12-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/memory-failure: reject unsupported non-folio compound page procfs: avoid fetching build ID while holding VMA lock
13 daysMerge tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-clientLinus Torvalds5-15/+42
Pull ceph fixes from Ilya Dryomov: "One RBD and two CephFS fixes which address potential oopses. The RBD thing is more of a rare edge case that pops up in our CI, while the two CephFS scenarios are regressions that were reported by users and can be triggered trivially in normal operation. All marked for stable" * tag 'ceph-for-6.19-rc9' of https://github.com/ceph/ceph-client: ceph: fix NULL pointer dereference in ceph_mds_auth_match() ceph: fix oops due to invalid pointer for kfree() in parse_longname() rbd: check for EOD after exclusive lock is ensured to be held
2026-02-06procfs: avoid fetching build ID while holding VMA lockAndrii Nakryiko1-15/+27
Fix PROCMAP_QUERY to fetch optional build ID only after dropping mmap_lock or per-VMA lock, whichever was used to lock VMA under question, to avoid deadlock reported by syzbot: -> #1 (&mm->mmap_lock){++++}-{4:4}: __might_fault+0xed/0x170 _copy_to_iter+0x118/0x1720 copy_page_to_iter+0x12d/0x1e0 filemap_read+0x720/0x10a0 blkdev_read_iter+0x2b5/0x4e0 vfs_read+0x7f4/0xae0 ksys_read+0x12a/0x250 do_syscall_64+0xcb/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&sb->s_type->i_mutex_key#8){++++}-{4:4}: __lock_acquire+0x1509/0x26d0 lock_acquire+0x185/0x340 down_read+0x98/0x490 blkdev_read_iter+0x2a7/0x4e0 __kernel_read+0x39a/0xa90 freader_fetch+0x1d5/0xa80 __build_id_parse.isra.0+0xea/0x6a0 do_procmap_query+0xd75/0x1050 procfs_procmap_ioctl+0x7a/0xb0 __x64_sys_ioctl+0x18e/0x210 do_syscall_64+0xcb/0xf80 entry_SYSCALL_64_after_hwframe+0x77/0x7f other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(&mm->mmap_lock); lock(&sb->s_type->i_mutex_key#8); lock(&mm->mmap_lock); rlock(&sb->s_type->i_mutex_key#8); *** DEADLOCK *** This seems to be exacerbated (as we haven't seen these syzbot reports before that) by the recent: 777a8560fd29 ("lib/buildid: use __kernel_read() for sleepable context") To make this safe, we need to grab file refcount while VMA is still locked, but other than that everything is pretty straightforward. Internal build_id_parse() API assumes VMA is passed, but it only needs the underlying file reference, so just add another variant build_id_parse_file() that expects file passed directly. [akpm@linux-foundation.org: fix up kerneldoc] Link: https://lkml.kernel.org/r/20260129215340.3742283-1-andrii@kernel.org Fixes: ed5d583a88a9 ("fs/procfs: implement efficient VMA querying API for /proc/<pid>/maps") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Reported-by: <syzbot+4e70c8e0a2017b432f7a@syzkaller.appspotmail.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Suren Baghdasaryan <surenb@google.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Eduard Zingerman <eddyz87@gmail.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Stanislav Fomichev <sdf@fomichev.me> Cc: Yonghong Song <yonghong.song@linux.dev> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-05ceph: fix NULL pointer dereference in ceph_mds_auth_match()Viacheslav Dubeyko4-11/+37
The CephFS kernel client has regression starting from 6.18-rc1. We have issue in ceph_mds_auth_match() if fs_name == NULL: const char fs_name = mdsc->fsc->mount_options->mds_namespace; ... if (auth->match.fs_name && strcmp(auth->match.fs_name, fs_name)) { / fsname mismatch, try next one */ return 0; } Patrick Donnelly suggested that: In summary, we should definitely start decoding `fs_name` from the MDSMap and do strict authorizations checks against it. Note that the `-o mds_namespace=foo` should only be used for selecting the file system to mount and nothing else. It's possible no mds_namespace is specified but the kernel will mount the only file system that exists which may have name "foo". This patch reworks ceph_mdsmap_decode() and namespace_equals() with the goal of supporting the suggested concept. Now struct ceph_mdsmap contains m_fs_name field that receives copy of extracted FS name by ceph_extract_encoded_string(). For the case of "old" CephFS file systems, it is used "cephfs" name. [ idryomov: replace redundant %*pE with %s in ceph_mdsmap_decode(), get rid of a series of strlen() calls in ceph_namespace_match(), drop changes to namespace_equals() body to avoid treating empty mds_namespace as equal, drop changes to ceph_mdsc_handle_fsmap() as namespace_equals() isn't an equivalent substitution there ] Cc: stable@vger.kernel.org Fixes: 22c73d52a6d0 ("ceph: fix multifs mds auth caps issue") Link: https://tracker.ceph.com/issues/73886 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Patrick Donnelly <pdonnell@ibm.com> Tested-by: Patrick Donnelly <pdonnell@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-02-04Merge tag 'v6.19rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds2-1/+4
Pull smb client fixes from Steve French: "Two small client memory leak fixes" * tag 'v6.19rc8-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb/client: fix memory leak in SendReceive() smb/client: fix memory leak in smb2_open_file()
2026-02-03ceph: fix oops due to invalid pointer for kfree() in parse_longname()Daniel Vogelbacher1-4/+5
This fixes a kernel oops when reading ceph snapshot directories (.snap), for example by simply running `ls /mnt/my_ceph/.snap`. The variable str is guarded by __free(kfree), but advanced by one for skipping the initial '_' in snapshot names. Thus, kfree() is called with an invalid pointer. This patch removes the need for advancing the pointer so kfree() is called with correct memory pointer. Steps to reproduce: 1. Create snapshots on a cephfs volume (I've 63 snaps in my testcase) 2. Add cephfs mount to fstab $ echo "samba-fileserver@.files=/volumes/datapool/stuff/3461082b-ecc9-4e82-8549-3fd2590d3fb6 /mnt/test/stuff ceph acl,noatime,_netdev 0 0" >> /etc/fstab 3. Reboot the system $ systemctl reboot 4. Check if it's really mounted $ mount | grep stuff 5. List snapshots (expected 63 snapshots on my system) $ ls /mnt/test/stuff/.snap Now ls hangs forever and the kernel log shows the oops. Cc: stable@vger.kernel.org Fixes: 101841c38346 ("[ceph] parse_longname(): strrchr() expects NUL-terminated string") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220807 Suggested-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vogelbacher <daniel@chaospixel.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-02-03Merge tag 'for-6.19-rc8-tag' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A regression fix for a memory leak when raid56 is used" * tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
2026-02-03btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmapFilipe Manana1-0/+1
We allocate the bitmap but we never free it in free_raid_bio_pointers(). Fix this by adding a bitmap_free() call against the stripe_uptodate_bitmap of a raid bio. Fixes: 1810350b04ef ("btrfs: raid56: move sector_ptr::uptodate into a dedicated bitmap") Reported-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/linux-btrfs/20260126045315.GA31641@lst.de/ Reviewed-by: Qu Wenruo <wqu@suse.com> Tested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-02-02smb/client: fix memory leak in SendReceive()ChenXiaoSong1-1/+3
Reproducer: 1. server: supports SMB1, directories are exported read-only 2. client: mount -t cifs -o vers=1.0 //${server_ip}/export /mnt 3. client: dd if=/dev/zero of=/mnt/file bs=512 count=1000 oflag=direct 4. client: umount /mnt 5. client: sleep 1 6. client: modprobe -r cifs The error message is as follows: ============================================================================= BUG cifs_small_rq (Not tainted): Objects remaining on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Object 0x00000000d34491e6 @offset=896 Object 0x00000000bde9fab3 @offset=4480 Object 0x00000000104a1f70 @offset=6272 Object 0x0000000092a51bb5 @offset=7616 Object 0x000000006714a7db @offset=13440 ... WARNING: mm/slub.c:1251 at __kmem_cache_shutdown+0x379/0x3f0, CPU#7: modprobe/712 ... Call Trace: <TASK> kmem_cache_destroy+0x69/0x160 cifs_destroy_request_bufs+0x39/0x40 [cifs] cleanup_module+0x43/0xfc0 [cifs] __se_sys_delete_module+0x1d5/0x300 __x64_sys_delete_module+0x1a/0x30 x64_sys_call+0x2299/0x2ff0 do_syscall_64+0x6e/0x270 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... kmem_cache_destroy cifs_small_rq: Slab cache still has objects when called from cifs_destroy_request_bufs+0x39/0x40 [cifs] WARNING: mm/slab_common.c:532 at kmem_cache_destroy+0x142/0x160, CPU#7: modprobe/712 Link: https://lore.kernel.org/linux-cifs/9751f02d-d1df-4265-a7d6-b19761b21834@linux.dev/T/#mf14808c144448b715f711ce5f0477a071f08eaf6 Fixes: 6be09580df5c ("cifs: Make smb1's SendReceive() wrap cifs_send_recv()") Reported-by: Paulo Alcantara <pc@manguebit.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-02-02smb/client: fix memory leak in smb2_open_file()ChenXiaoSong1-0/+1
Reproducer: 1. server: directories are exported read-only 2. client: mount -t cifs //${server_ip}/export /mnt 3. client: dd if=/dev/zero of=/mnt/file bs=512 count=1000 oflag=direct 4. client: umount /mnt 5. client: sleep 1 6. client: modprobe -r cifs The error message is as follows: ============================================================================= BUG cifs_small_rq (Not tainted): Objects remaining on __kmem_cache_shutdown() ----------------------------------------------------------------------------- Object 0x00000000d47521be @offset=14336 ... WARNING: mm/slub.c:1251 at __kmem_cache_shutdown+0x34e/0x440, CPU#0: modprobe/1577 ... Call Trace: <TASK> kmem_cache_destroy+0x94/0x190 cifs_destroy_request_bufs+0x3e/0x50 [cifs] cleanup_module+0x4e/0x540 [cifs] __se_sys_delete_module+0x278/0x400 __x64_sys_delete_module+0x5f/0x70 x64_sys_call+0x2299/0x2ff0 do_syscall_64+0x89/0x350 entry_SYSCALL_64_after_hwframe+0x76/0x7e ... kmem_cache_destroy cifs_small_rq: Slab cache still has objects when called from cifs_destroy_request_bufs+0x3e/0x50 [cifs] WARNING: mm/slab_common.c:532 at kmem_cache_destroy+0x16b/0x190, CPU#0: modprobe/1577 Link: https://lore.kernel.org/linux-cifs/9751f02d-d1df-4265-a7d6-b19761b21834@linux.dev/T/#mf14808c144448b715f711ce5f0477a071f08eaf6 Fixes: e255612b5ed9 ("cifs: Add fallback for SMB2 CREATE without FILE_READ_ATTRIBUTES") Reported-by: Paulo Alcantara <pc@manguebit.org> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-01-31Merge tag 'efi-fixes-for-v6.19-3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: - Fix regression in efivarfs error propagation * tag 'efi-fixes-for-v6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efivarfs: fix error propagation in efivar_entry_get()
2026-01-29Merge tag 'for-6.19-rc7-tag' of ↵Linus Torvalds4-26/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix leaked folio refcount on s390x when using hw zlib compression acceleration - remove own threshold from ->writepages() which could collide with cgroup limits and lead to a deadlock when metadadata are not written because the amount is under the internal limit * tag 'for-6.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: zlib: fix the folio leak on S390 hardware acceleration btrfs: do not strictly require dirty metadata threshold for metadata writepages
2026-01-26Merge tag 'vfs-6.19-rc8.fixes' of ↵Linus Torvalds13-39/+75
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Fix the the buggy conversion of fuse_reverse_inval_entry() introduced during the creation rework - Disallow nfs delegation requests for directories by setting simple_nosetlease() - Require an opt-in for getting readdir flag bits outside of S_DT_MASK set in d_type - Fix scheduling delayed writeback work by only scheduling when the dirty time expiry interval is non-zero and cancel the delayed work if the interval is set to zero - Use rounded_jiffies_interval for dirty time work - Check the return value of sb_set_blocksize() for romfs - Wait for batched folios to be stable in __iomap_get_folio() - Use private naming for fuse hash size - Fix the stale dentry cleanup to prevent a race that causes a UAF * tag 'vfs-6.19-rc8.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: vfs: document d_dispose_if_unused() fuse: shrink once after all buckets have been scanned fuse: clean up fuse_dentry_tree_work() fuse: add need_resched() before unlocking bucket fuse: make sure dentry is evicted if stale fuse: fix race when disposing stale dentries fuse: use private naming for fuse hash size writeback: use round_jiffies_relative for dirtytime_work iomap: wait for batched folios to be stable in __iomap_get_folio romfs: check sb_set_blocksize() return value docs: clarify that dirtytime_expire_seconds=0 disables writeback writeback: fix 100% CPU usage when dirtytime_expire_interval is 0 readdir: require opt-in for d_type flags vboxsf: don't allow delegations to be set on directories ceph: don't allow delegations to be set on directories gfs2: don't allow delegations to be set on directories 9p: don't allow delegations to be set on directories smb/client: properly disallow delegations on directories nfs: properly disallow delegation requests on directories fuse: fix conversion of fuse_reverse_inval_entry() to start_removing()
2026-01-24Merge tag 'v6.19-rc6-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds2-9/+9
Pull smb server fixes from Steve French: - Use the original nents value for ib_dma_unmap_sg(), preventing potential memory corruption in the RDMA transport layer - Fix a naming discrepancy in the kernel-doc for ksmbd_vfs_kern_path_start_removing() as identified by sparse static analysis - Reset smb_direct_port to its default value during initialization to ensure the correct port is used when switching between different RDMA device types without module reload * tag 'v6.19-rc6-server-fixes' of git://git.samba.org/ksmbd: smb: server: reset smb_direct_port = SMB_DIRECT_PORT_INFINIBAND on init smb: server: fix comment for ksmbd_vfs_kern_path_start_removing() ksmbd: smbd: fix dma_unmap_sg() nents
2026-01-23smb: server: reset smb_direct_port = SMB_DIRECT_PORT_INFINIBAND on initStefan Metzmacher1-0/+1
This allows testing with different devices (iwrap vs. non-iwarp) without 'rmmod ksmbd && modprobe ksmbd', but instead 'ksmbd.control -s && ksmbd.mountd' is enough. In the long run we want to listen on iwarp and non-iwarp at the same time, but requires more changes, most likely also in the rdma layer. Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-01-23smb: server: fix comment for ksmbd_vfs_kern_path_start_removing()Stefan Metzmacher1-1/+1
This was found by sparse... Fixes: 1ead2213dd7d ("smb/server: use end_removing_noperm for for target of smb2_create_link()") Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: NeilBrown <neil@brown.name> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-01-23ksmbd: smbd: fix dma_unmap_sg() nentsThomas Fourier1-8/+7
The dma_unmap_sg() functions should be called with the same nents as the dma_map_sg(), not the value the map function returned. Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers") Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-01-21btrfs: zlib: fix the folio leak on S390 hardware accelerationQu Wenruo1-0/+1
[BUG] After commit aa60fe12b4f4 ("btrfs: zlib: refactor S390x HW acceleration buffer preparation"), we no longer release the folio of the page cache of folio returned by btrfs_compress_filemap_get_folio() for S390 hardware acceleration path. [CAUSE] Before that commit, we call kumap_local() and folio_put() after handling each folio. Although the timing is not ideal (it release previous folio at the beginning of the loop, and rely on some extra cleanup out of the loop), it at least handles the folio release correctly. Meanwhile the refactored code is easier to read, it lacks the call to release the filemap folio. [FIX] Add the missing folio_put() for copy_data_into_buffer(). CC: linux-s390@vger.kernel.org # 6.18+ Fixes: aa60fe12b4f4 ("btrfs: zlib: refactor S390x HW acceleration buffer preparation") Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-21btrfs: do not strictly require dirty metadata threshold for metadata writepagesQu Wenruo3-26/+2
[BUG] There is an internal report that over 1000 processes are waiting at the io_schedule_timeout() of balance_dirty_pages(), causing a system hang and trigger a kernel coredump. The kernel is v6.4 kernel based, but the root problem still applies to any upstream kernel before v6.18. [CAUSE] From Jan Kara for his wisdom on the dirty page balance behavior first. This cgroup dirty limit was what was actually playing the role here because the cgroup had only a small amount of memory and so the dirty limit for it was something like 16MB. Dirty throttling is responsible for enforcing that nobody can dirty (significantly) more dirty memory than there's dirty limit. Thus when a task is dirtying pages it periodically enters into balance_dirty_pages() and we let it sleep there to slow down the dirtying. When the system is over dirty limit already (either globally or within a cgroup of the running task), we will not let the task exit from balance_dirty_pages() until the number of dirty pages drops below the limit. So in this particular case, as I already mentioned, there was a cgroup with relatively small amount of memory and as a result with dirty limit set at 16MB. A task from that cgroup has dirtied about 28MB worth of pages in btrfs btree inode and these were practically the only dirty pages in that cgroup. So that means the only way to reduce the dirty pages of that cgroup is to writeback the dirty pages of btrfs btree inode, and only after that those processes can exit balance_dirty_pages(). Now back to the btrfs part, btree_writepages() is responsible for writing back dirty btree inode pages. The problem here is, there is a btrfs internal threshold that if the btree inode's dirty bytes are below the 32M threshold, it will not do any writeback. This behavior is to batch as much metadata as possible so we won't write back those tree blocks and then later re-COW them again for another modification. This internal 32MiB is higher than the existing dirty page size (28MiB), meaning no writeback will happen, causing a deadlock between btrfs and cgroup: - Btrfs doesn't want to write back btree inode until more dirty pages - Cgroup/MM doesn't want more dirty pages for btrfs btree inode Thus any process touching that btree inode is put into sleep until the number of dirty pages is reduced. Thanks Jan Kara a lot for the analysis of the root cause. [ENHANCEMENT] Since kernel commit b55102826d7d ("btrfs: set AS_KERNEL_FILE on the btree_inode"), btrfs btree inode pages will only be charged to the root cgroup which should have a much larger limit than btrfs' 32MiB threshold. So it should not affect newer kernels. But for all current LTS kernels, they are all affected by this problem, and backporting the whole AS_KERNEL_FILE may not be a good idea. Even for newer kernels I still think it's a good idea to get rid of the internal threshold at btree_writepages(), since for most cases cgroup/MM has a better view of full system memory usage than btrfs' fixed threshold. For internal callers using btrfs_btree_balance_dirty() since that function is already doing internal threshold check, we don't need to bother them. But for external callers of btree_writepages(), just respect their requests and write back whatever they want, ignoring the internal btrfs threshold to avoid such deadlock on btree inode dirty page balancing. CC: stable@vger.kernel.org CC: Jan Kara <jack@suse.cz> Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-21Merge tag 'for-6.19-rc6-tag' of ↵Linus Torvalds5-2/+73
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - protect reading super block vs setting block size externally (found by syzbot) - make sure no transaction is started in read-only mode even with some rescue mount option combinations - fix checksum calculation of backup super blocks when block-group-tree is enabled - more extensive mount-time checks of device items that could be left after device replace and attempting degraded mount - fix build warning with -Wmaybe-uninitialized on loongarch64-gcc 12 * tag 'for-6.19-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add extra device item checks at mount btrfs: fix missing fields in superblock backup with BLOCK_GROUP_TREE btrfs: reject new transactions if the fs is fully read-only btrfs: sync read disk super and set block size btrfs: fix Wmaybe-uninitialized warning in replay_one_buffer()
2026-01-20btrfs: add extra device item checks at mountQu Wenruo3-0/+48
[BUG] There is a bug report where after a dev-replace, the replace source device with devid 4 is properly erased (dump tree shows it's the old devid 4), but the target device is still using devid 0. When the user tries to mount the fs degraded, the mount failed with the following errors: BTRFS: device fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 devid 5 transid 1394395 /dev/sda (8:0) scanned by btrfs (261) BTRFS: device fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 devid 6 transid 1394395 /dev/sde (8:64) scanned by btrfs (261) BTRFS: device fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 devid 0 transid 1394395 /dev/sdd (8:48) scanned by btrfs (261) BTRFS: device fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 devid 3 transid 1394395 /dev/sdf (8:80) scanned by btrfs (261) BTRFS info (device sdd): first mount of filesystem 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 BTRFS info (device sdd): using crc32c (crc32c-intel) checksum algorithm BTRFS warning (device sdd): devid 4 uuid 01e2081c-9c2a-4071-b9f4-e1b27e571ff5 is missing BTRFS info (device sdd): bdev <missing disk> errs: wr 84994544, rd 15567, flush 65872, corrupt 0, gen 0 BTRFS info (device sdd): bdev /dev/sdd errs: wr 71489901, rd 0, flush 30001, corrupt 0, gen 0 BTRFS error (device sdd): replace without active item, run 'device scan --forget' on the target device BTRFS error (device sdd): failed to init dev_replace: -117 BTRFS error (device sdd): open_ctree failed: -117 [CAUSE] The devid 0 didn't get its devid updated is its own problem, here I'm only focusing on the mount failure itself. The mount is not caused by the missing device, as the fs has RAID1C3 for metadata and RAID10 for data, thus is completely able to tolerate one missing device. The device tree shows the dev-replace has properly finished: item 7 key (0 DEV_REPLACE 0) itemoff 15931 itemsize 72 src devid -1 cursor left 11091821199360 cursor right 11091821199360 mode ALWAYS state FINISHED write errors 0 uncorrectable read errors 0 ^^^^^^^^ And the chunk tree shows there is no devid 0: leaf 37980736602112 items 23 free space 12548 generation 1394388 owner CHUNK_TREE leaf 37980736602112 flags 0x1(WRITTEN) backref revision 1 fs uuid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 chunk uuid d074c661-6311-4570-b59f-a5c83fd37f8e item 0 key (DEV_ITEMS DEV_ITEM 3) itemoff 16185 itemsize 98 devid 3 total_bytes 20000588955648 bytes_used 8282877984768 io_align 4096 io_width 4096 sector_size 4096 type 0 generation 0 start_offset 0 dev_group 0 seek_speed 0 bandwidth 0 uuid 0d596b69-fb0d-4031-b4af-a301d0868b8b fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 ... Which shows the first device is devid 3. But there is indeed /dev/sdd with devid 0: superblock: bytenr=65536, device=/dev/sdd --------------------------------------------------------- csum_type 0 (crc32c) csum_size 4 csum 0xd4bed87e [match] bytenr 65536 flags 0x1 ( WRITTEN ) magic _BHRfS_M [match] fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 ... uuid_tree_generation 1394388 dev_item.uuid ee6532ad-5442-45f7-87fb-7703e29ed934 dev_item.fsid 84a1ed4a-365c-45c3-a9ee-a7df525dc3c9 [match] dev_item.type 0 dev_item.total_bytes 20000588955648 dev_item.bytes_used 8292541661184 dev_item.io_align 0 dev_item.io_width 0 dev_item.sector_size 0 dev_item.devid 0 <<< So this means device scan will register sdd as devid 0 into the fs, then during btrfs_init_dev_replace(), we located the replace progress item, found the previous replace is finished, but we still need to check if the dev-replace target device (devid 0) exists. If that device exists, we error out showing that error message. But to be honest the end user may not really remember which device is the replace target device, thus not sure what to do in the next step. [ENHANCEMENT] To make the error more obvious, and tell the end user which devices should be unregistered: - Introduce BTRFS_DEV_STATE_ITEM_FOUND flag During device item read from the chunk tree, set the flag for each found device item. - Verify there is no device without the above flag during mount Even missing device should have that flag set. If we found a device without that flag set, it means it's an unexpected one and should be rejected. - More detailed error message on what to do next This will show all unexpected devices and tell the end user to use 'btrfs dev scan --forget' to forget them or remove them before mount. There is an example dmesg where a device of a valid filesystem is modified to have devid 0, then try degraded mount: BTRFS info (device dm-6): first mount of filesystem 7c873869-844c-4b39-bd75-a96148bf4656 BTRFS info (device dm-6): using crc32c checksum algorithm BTRFS warning (device dm-6): devid 3 uuid b4a9f35b-db42-4ac4-b55a-cbf81d3b9683 is missing BTRFS error (device dm-6): devid 0 path /dev/mapper/test-scratch3 is registered but not found in chunk tree BTRFS error (device dm-6): please remove above devices or use 'btrfs device scan --forget <dev>' to unregister them before mount BTRFS error (device dm-6): open_ctree failed: -117 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-01-20btrfs: fix missing fields in superblock backup with BLOCK_GROUP_TREEMark Harmstone1-1/+1
When the BLOCK_GROUP_TREE compat_ro flag is set, the extent root and csum root fields are getting missed. This is because EXTENT_TREE_V2 treated these differently, and when they were split off this special-casing was mistakenly assigned to BGT rather than the rump EXTENT_TREE_V2. There's no reason why the existence of the block group tree should mean that we don't record the details of the last commit's extent root and csum root. Fix the code in backup_super_roots() so that the correct check gets made. Fixes: 1c56ab991903 ("btrfs: separate BLOCK_GROUP_TREE compat RO flag from EXTENT_TREE_V2") Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Mark Harmstone <mark@harmstone.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>