summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)AuthorFilesLines
2025-05-29NFSv4: Allow FREE_STATEID to clean up delegationsBenjamin Coddington3-15/+25
The NFS client's list of delegations can grow quite large (well beyond the delegation watermark) if the server is revoking or there are repeated events that expire state. Once this happens, the revoked delegations can cause a performance problem for subsequent walks of the servers->delegations list when the client tries to test and free state. If we can determine that the FREE_STATEID operation has completed without error, we can prune the delegation from the list. Since the NFS client combines TEST_STATEID with FREE_STATEID in its minor version operations, there isn't an easy way to communicate success of FREE_STATEID. Rather than re-arrange quite a number of calling paths to break out the separate procedures, let's signal the success of FREE_STATEID by setting the stateid's type. Set NFS4_FREED_STATEID_TYPE for stateids that have been successfully discarded from the server, and use that type to signal that the delegation can be cleaned up. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-29NFSv4: Don't check for OPEN feature support in v4.1Scott Mayhew1-2/+3
fattr4_open_arguments is a v4.2 recommended attribute, so we shouldn't be sending it to v4.1 servers. Fixes: cb78f9b7d0c0 ("nfs: fix the fetch of FATTR4_OPEN_ARGUMENTS") Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # 6.11+ Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-29NFSv4.2: fix listxattr to return selinux security labelOlga Kornievskaia1-2/+10
Currently, when NFS is queried for all the labels present on the file via a command example "getfattr -d -m . /mnt/testfile", it does not return the security label. Yet when asked specifically for the label (getfattr -n security.selinux) it will be returned. Include the security label when all attributes are queried. Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-29NFSv4.2: fix setattr caching of TIME_[MODIFY|ACCESS]_SET when timestamps are ↵Sagi Grimberg2-8/+49
delegated nfs_setattr will flush all pending writes before updating a file time attributes. However when the client holds delegated timestamps, it can update its timestamps locally as it is the authority for the file times attributes. The client will later set the file attributes by adding a setattr to the delegreturn compound updating the server time attributes. Fix nfs_setattr to avoid flushing pending writes when the file time attributes are delegated and the mtime/atime are set to a fixed timestamp (ATTR_[MODIFY|ACCESS]_SET. Also, when sending the setattr procedure over the wire, we need to clear the correct attribute bits from the bitmask. I was able to measure a noticable speedup when measuring untar performance. Test: $ time tar xzf ~/dir.tgz Baseline: 1m13.072s Patched: 0m49.038s Which is more than 30% latency improvement. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-29NFS: Add support for fallocate(FALLOC_FL_ZERO_RANGE)Anna Schumaker6-3/+103
This implements a suggestion from Trond that we can mimic FALLOC_FL_ZERO_RANGE by sending a compound that first does a DEALLOCATE to punch a hole in a file, and then an ALLOCATE to fill the hole with zeroes. There might technically be a race here, but once the DEALLOCATE finishes any reads from the region would return zeroes anyway, so I don't expect it to cause problems. Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-29fs/nfs/read: fix double-unlock bug in nfs_return_empty_folio()Max Kellermann1-1/+2
Sometimes, when a file was read while it was being truncated by another NFS client, the kernel could deadlock because folio_unlock() was called twice, and the second call would XOR back the `PG_locked` flag. Most of the time (depending on the timing of the truncation), nobody notices the problem because folio_unlock() gets called three times, which flips `PG_locked` back off: 1. vfs_read, nfs_read_folio, ... nfs_read_add_folio, nfs_return_empty_folio 2. vfs_read, nfs_read_folio, ... netfs_read_collection, netfs_unlock_abandoned_read_pages 3. vfs_read, ... nfs_do_read_folio, nfs_read_add_folio, nfs_return_empty_folio The problem is that nfs_read_add_folio() is not supposed to unlock the folio if fscache is enabled, and a nfs_netfs_folio_unlock() check is missing in nfs_return_empty_folio(). Rarely this leads to a warning in netfs_read_collection(): ------------[ cut here ]------------ R=0000031c: folio 10 is not locked WARNING: CPU: 0 PID: 29 at fs/netfs/read_collect.c:133 netfs_read_collection+0x7c0/0xf00 [...] Workqueue: events_unbound netfs_read_collection_worker RIP: 0010:netfs_read_collection+0x7c0/0xf00 [...] Call Trace: <TASK> netfs_read_collection_worker+0x67/0x80 process_one_work+0x12e/0x2c0 worker_thread+0x295/0x3a0 Most of the time, however, processes just get stuck forever in folio_wait_bit_common(), waiting for `PG_locked` to disappear, which never happens because nobody is really holding the folio lock. Fixes: 000dbe0bec05 ("NFS: Convert buffered read paths to use netfs when fscache is enabled") Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Reviewed-by: Dave Wysochanski <dwysocha@redhat.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-05-28Merge tag 'jfs-6.16' of github.com:kleikamp/linux-shaggyLinus Torvalds3-5/+22
Pull jfs updates from David Kleikamp: "A few small fixes for jfs" * tag 'jfs-6.16' of github.com:kleikamp/linux-shaggy: jfs: fix array-index-out-of-bounds read in add_missing_indices jfs: Fix null-ptr-deref in jfs_ioc_trim jfs: validate AG parameters in dbMount() to prevent crashes
2025-05-28Merge tag 'dlm-6.16' of ↵Linus Torvalds3-3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm Pull dlm updates from David Teigland: "This fixes delays when shutting down SCTP connections, and updates dlm Kconfig for SCTP" * tag 'dlm-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: dlm: drop SCTP Kconfig dependency dlm: reject SCTP configuration if not enabled dlm: use SHUT_RDWR for SCTP shutdown dlm: mask sk_shutdown value
2025-05-28Merge tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds21-260/+703
Pull nfsd updates from Chuck Lever: "The marquee feature for this release is that the limit on the maximum rsize and wsize has been raised to 4MB. The default remains at 1MB, but risk-seeking administrators now have the ability to try larger I/O sizes with NFS clients that support them. Eventually the default setting will be increased when we have confidence that this change will not have negative impact. With v6.16, NFSD now has its own debugfs file system where we can add experimental features and make them available outside of our development community without impacting production deployments. The first experimental setting added is one that makes all NFS READ operations use vfs_iter_read() instead of the NFSD splice actor. The plan is to eventually retire the splice actor, as that will enable a number of new capabilities such as the use of struct bio_vec from the top to the bottom of the NFSD stack. Jeff Layton contributed a number of observability improvements. The use of dprintk() in a number of high-traffic code paths has been replaced with static trace points. This release sees the continuation of efforts to harden the NFSv4.2 COPY operation. Soon, the restriction on async COPY operations can be lifted. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.16 development cycle" * tag 'nfsd-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (60 commits) xdrgen: Fix code generated for counted arrays SUNRPC: Bump the maximum payload size for the server NFSD: Add a "default" block size NFSD: Remove NFSSVC_MAXBLKSIZE_V2 macro NFSD: Remove NFSD_BUFSIZE sunrpc: Remove the RPCSVC_MAXPAGES macro svcrdma: Adjust the number of entries in svc_rdma_send_ctxt::sc_pages svcrdma: Adjust the number of entries in svc_rdma_recv_ctxt::rc_pages sunrpc: Adjust size of socket's receive page array dynamically SUNRPC: Remove svc_rqst :: rq_vec SUNRPC: Remove svc_fill_write_vector() NFSD: Use rqstp->rq_bvec in nfsd_iter_write() SUNRPC: Export xdr_buf_to_bvec() NFSD: De-duplicate the svc_fill_write_vector() call sites NFSD: Use rqstp->rq_bvec in nfsd_iter_read() sunrpc: Replace the rq_bvec array with dynamically-allocated memory sunrpc: Replace the rq_pages array with dynamically-allocated memory sunrpc: Remove backchannel check in svc_init_buffer() sunrpc: Add a helper to derive maxpages from sv_max_mesg svcrdma: Reduce the number of rdma_rw contexts per-QP ...
2025-05-28Merge tag 'ext4_for_linus-6.16-rc1' of ↵Linus Torvalds24-471/+1062
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "New ext4 features and performance improvements: - Fast commit performance improvements - Multi-fsblock atomic write support for bigalloc file systems - Large folio support for regular files This last can result in really stupendous performance for the right workloads. For example, see [1] where the Kernel Test Robot reported over 37% improvement on a large sequential I/O workload. There are also the usual bug fixes and cleanups. Of note are cleanups of the extent status tree to fix potential races that could result in the extent status tree getting corrupted under heavy simultaneous allocation and deallocation to a single file" Link: https://lore.kernel.org/all/202505161418.ec0d753f-lkp@intel.com/ [1] * tag 'ext4_for_linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (52 commits) ext4: Add a WARN_ON_ONCE for querying LAST_IN_LEAF instead ext4: Simplify flags in ext4_map_query_blocks() ext4: Rename and document EXT4_EX_FILTER to EXT4_EX_QUERY_FILTER ext4: Simplify last in leaf check in ext4_map_query_blocks ext4: Unwritten to written conversion requires EXT4_EX_NOCACHE ext4: only dirty folios when data journaling regular files ext4: Add atomic block write documentation ext4: Enable support for ext4 multi-fsblock atomic write using bigalloc ext4: Add multi-fsblock atomic write support with bigalloc ext4: Add support for EXT4_GET_BLOCKS_QUERY_LEAF_BLOCKS ext4: Make ext4_meta_trans_blocks() non-static for later use ext4: Check if inode uses extents in ext4_inode_can_atomic_write() ext4: Document an edge case for overwrites jbd2: remove journal_t argument from jbd2_superblock_csum() jbd2: remove journal_t argument from jbd2_chksum() ext4: remove sb argument from ext4_superblock_csum() ext4: remove sbi argument from ext4_chksum() ext4: enable large folio for regular file ext4: make online defragmentation support large folios ext4: make the writeback path support large folios ...
2025-05-28Merge tag 'ntfs3_for_6.16' of ↵Linus Torvalds8-257/+28
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs updates from Konstantin Komarov: "Added: - missing direct_IO in ntfs_aops_cmpr - handling of hdr_first_de() return value Fixed: - handling of InitializeFileRecordSegment operation. Removed: - ability to change compression on mounted volume - redundant NULL check" * tag 'ntfs3_for_6.16' of https://github.com/Paragon-Software-Group/linux-ntfs3: fs/ntfs3: remove ability to change compression on mounted volume fs/ntfs3: Fix handling of InitializeFileRecordSegment fs/ntfs3: Add missing direct_IO in ntfs_aops_cmpr fs/ntfs3: handle hdr_first_de() return value fs/ntfs3: Drop redundant NULL check
2025-05-28Merge tag 'for-linus-6.16-ofs1' of ↵Linus Torvalds3-98/+102
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux Pull orangefs update from Mike Marshall: "Convert to use the new mount API. Code from Eric Sandeen at redhat that converts orangefs over to the new mount API" * tag 'for-linus-6.16-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: Convert to use the new mount API
2025-05-28Merge tag 'exfat-for-6.16-rc1' of ↵Linus Torvalds2-23/+8
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat updates from Namjae Jeon: - Fix xfstests generic/482 test failure - Fix double free in delayed_free * tag 'exfat-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: do not clear volume dirty flag during sync exfat: fix double free in delayed_free
2025-05-28Merge tag 'for-6.16-tag' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A fixup to the xarray conversion sent in the main 6.16 batch. It was not included because it would cause rebase/refresh of like 80 patches, right before sending the early pull request last week. It's fixing a bug when zoned mode is enabled on btrfs so it's not affecting most people" * tag 'for-6.16-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: don't drop a reference if btrfs_check_write_meta_pointer() fails
2025-05-28smb: client: Remove an unused function and variableDr. David Alan Gilbert3-69/+0
SMB2_QFS_info() has been unused since 2018's commit 730928c8f4be ("cifs: update smb2_queryfs() to use compounding") sign_CIFS_PDUs has been unused since 2009's commit 2edd6c5b0517 ("[CIFS] NTLMSSP support moving into new file, old dead code removed") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-28f2fs: fix to correct check conditions in f2fs_cross_renameZhiguo Niu1-1/+1
Should be "old_dir" here. Fixes: 5c57132eaf52 ("f2fs: support project quota") Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: use d_inode(dentry) cleanup dentry->d_inodeZhiguo Niu2-6/+6
no logic changes. Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: fix to skip f2fs_balance_fs() if checkpoint is disabledChao Yu1-1/+1
Syzbot reports a f2fs bug as below: INFO: task syz-executor328:5856 blocked for more than 144 seconds. Not tainted 6.15.0-rc6-syzkaller-00208-g3c21441eeffc #0 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor328 state:D stack:24392 pid:5856 tgid:5832 ppid:5826 task_flags:0x400040 flags:0x00004006 Call Trace: <TASK> context_switch kernel/sched/core.c:5382 [inline] __schedule+0x168f/0x4c70 kernel/sched/core.c:6767 __schedule_loop kernel/sched/core.c:6845 [inline] schedule+0x165/0x360 kernel/sched/core.c:6860 io_schedule+0x81/0xe0 kernel/sched/core.c:7742 f2fs_balance_fs+0x4b4/0x780 fs/f2fs/segment.c:444 f2fs_map_blocks+0x3af1/0x43b0 fs/f2fs/data.c:1791 f2fs_expand_inode_data+0x653/0xaf0 fs/f2fs/file.c:1872 f2fs_fallocate+0x4f5/0x990 fs/f2fs/file.c:1975 vfs_fallocate+0x6a0/0x830 fs/open.c:338 ioctl_preallocate fs/ioctl.c:290 [inline] file_ioctl fs/ioctl.c:-1 [inline] do_vfs_ioctl+0x1b8f/0x1eb0 fs/ioctl.c:885 __do_sys_ioctl fs/ioctl.c:904 [inline] __se_sys_ioctl+0x82/0x170 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xf6/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f The root cause is after commit 84b5bb8bf0f6 ("f2fs: modify f2fs_is_checkpoint_ready logic to allow more data to be written with the CP disable"), we will get chance to allow f2fs_is_checkpoint_ready() to return true once below conditions are all true: 1. checkpoint is disabled 2. there are not enough free segments 3. there are enough free blocks Then it will cause f2fs_balance_fs() to trigger foreground GC. void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need) ... if (!f2fs_is_checkpoint_ready(sbi)) return; And the testcase mounts f2fs image w/ gc_merge,checkpoint=disable, so deadloop will happen through below race condition: - f2fs_do_shutdown - vfs_fallocate - gc_thread_func - file_start_write - __sb_start_write(SB_FREEZE_WRITE) - f2fs_fallocate - f2fs_expand_inode_data - f2fs_map_blocks - f2fs_balance_fs - prepare_to_wait - wake_up(gc_wait_queue_head) - io_schedule - bdev_freeze - freeze_super - sb->s_writers.frozen = SB_FREEZE_WRITE; - sb_wait_write(sb, SB_FREEZE_WRITE); - if (sbi->sb->s_writers.frozen >= SB_FREEZE_WRITE) continue; : cause deadloop This patch fix to add check condition in f2fs_balance_fs(), so that if checkpoint is disabled, we will just skip trigger foreground GC to avoid such deadloop issue. Meanwhile let's remove f2fs_is_checkpoint_ready() check condition in f2fs_balance_fs(), since it's redundant, due to the main logic in the function is to check: a) whether checkpoint is disabled b) there is enough free segments f2fs_balance_fs() still has all logics after f2fs_is_checkpoint_ready()'s removal. Reported-by: syzbot+aa5bb5f6860e08a60450@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/682d743a.a00a0220.29bc26.0289.GAE@google.com Fixes: 84b5bb8bf0f6 ("f2fs: modify f2fs_is_checkpoint_ready logic to allow more data to be written with the CP disable") Cc: Qi Han <hanqi@vivo.com> Signed-off-by: Chao Yu <chao@kernel.org> Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: clean up to check bi_status w/ BLK_STS_OKChao Yu2-4/+4
Check bi_status w/ BLK_STS_OK instead of 0 for cleanup. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: introduce is_{meta,node}_folioChao Yu5-15/+24
Just cleanup, no changes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: add ckpt_valid_blocks to the section entryyohan.joung2-18/+85
when performing buffered writes in a large section, overhead is incurred due to the iteration through ckpt_valid_blocks within the section. when SEGS_PER_SEC is 128, this overhead accounts for 20% within the f2fs_write_single_data_page routine. as the size of the section increases, the overhead also grows. to handle this problem ckpt_valid_blocks is added within the section entries. Test insmod null_blk.ko nr_devices=1 completion_nsec=1 submit_queues=8 hw_queue_depth=64 max_sectors=512 bs=4096 memory_backed=1 make_f2fs /dev/block/nullb0 make_f2fs -s 128 /dev/block/nullb0 fio --bs=512k --size=1536M --rw=write --name=1 --filename=/mnt/test_dir/seq_write --ioengine=io_uring --iodepth=64 --end_fsync=1 before SEGS_PER_SEC 1 2556MiB/s SEGS_PER_SEC 128 2145MiB/s after SEGS_PER_SEC 1 2556MiB/s SEGS_PER_SEC 128 2556MiB/s Signed-off-by: yohan.joung <yohan.joung@sk.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: add a method for calculating the remaining blocks in the current ↵yohan.joung1-4/+19
segment in LFS mode. In LFS mode, the previous segment cannot use invalid blocks, so the remaining blocks from the next_blkoff of the current segment to the end of the section are calculated. Signed-off-by: yohan.joung <yohan.joung@sk.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28iomap: don't lose folio dropbehind state for overwritesJens Axboe2-2/+22
DONTCACHE I/O must have the completion punted to a workqueue, just like what is done for unwritten extents, as the completion needs task context to perform the invalidation of the folio(s). However, if writeback is started off filemap_fdatawrite_range() off generic_sync() and it's an overwrite, then the DONTCACHE marking gets lost as iomap_add_to_ioend() don't look at the folio being added and no further state is passed down to help it know that this is a dropbehind/DONTCACHE write. Check if the folio being added is marked as dropbehind, and set IOMAP_IOEND_DONTCACHE if that is the case. Then XFS can factor this into the decision making of completion context in xfs_submit_ioend(). Additionally include this ioend flag in the NOMERGE flags, to avoid mixing it with unrelated IO. Since this is the 3rd flag that will cause XFS to punt the completion to a workqueue, add a helper so that each one of them can get appropriately commented. This fixes extra page cache being instantiated when the write performed is an overwrite, rather than newly instantiated blocks. Fixes: b2cd5ae693a3 ("iomap: make buffered writes work with RWF_DONTCACHE") Signed-off-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/5153f6e8-274d-4546-bf55-30a5018e0d03@kernel.dk Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-28squashfs: add optional full compressed block cachingChanho Min2-0/+49
The commit 93e72b3c612adcaca1 ("squashfs: migrate from ll_rw_block usage to BIO") removed caching of compressed blocks in SquashFS, causing fio performance regression in workloads with repeated file reads. Without caching, every read triggers disk I/O, severely impacting performance in tools like fio. This patch introduces a new CONFIG_SQUASHFS_COMP_CACHE_FULL Kconfig option to enable caching of all compressed blocks, restoring performance to pre-BIO migration levels. When enabled, all pages in a BIO are cached in the page cache, reducing disk I/O for repeated reads. The fio test results with this patch confirm the performance restoration: For example, fio tests (iodepth=1, numjobs=1, ioengine=psync) show a notable performance restoration: Disable CONFIG_SQUASHFS_COMP_CACHE_FULL: IOPS=815, BW=102MiB/s (107MB/s)(6113MiB/60001msec) Enable CONFIG_SQUASHFS_COMP_CACHE_FULL: IOPS=2223, BW=278MiB/s (291MB/s)(16.3GiB/59999msec) The tradeoff is increased memory usage due to caching all compressed blocks. The CONFIG_SQUASHFS_COMP_CACHE_FULL option allows users to enable this feature selectively, balancing performance and memory usage for workloads with frequent repeated reads. Link: https://lkml.kernel.org/r/20250521072559.2389-1-chanho.min@lge.com Signed-off-by: Chanho Min <chanho.min@lge.com> Reviewed-by Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-28crash_dump, nvme: select CONFIGFS_FS as built-inArnd Bergmann1-1/+0
Configfs can be configured as a loadable module, which causes a link-time failure for dm-crypt crash dump support: crash_dump_dm_crypt.c:(.text+0x3a4): undefined reference to `config_item_init_type_name' aarch64-linux-ld: kernel/crash_dump_dm_crypt.o: in function `configfs_dmcrypt_keys_init': crash_dump_dm_crypt.c:(.init.text+0x90): undefined reference to `config_group_init' aarch64-linux-ld: crash_dump_dm_crypt.c:(.init.text+0xb4): undefined reference to `configfs_register_subsystem' aarch64-linux-ld: crash_dump_dm_crypt.c:(.init.text+0xd8): undefined reference to `configfs_unregister_subsystem' This could be avoided with a dependency on CONFIGFS_FS=y, but the dependency has an additional problem of causing Kconfig dependency loops since most other uses select the symbol. Using a simple 'select CONFIGFS_FS' here in turn fails with CONFIG_DM_CRYPT=m, because that still only causes configfs to be a loadable module. The only version I found that fixes this reliably uses an additional Kconfig symbol to ensure the 'select' actually turns on configfs as builtin, with two additional changes to avoid dependency loops with nvme and sysfs. There is no compile-time dependency between configfs and sysfs, so selecting configfs from a driver with sysfs disabled does not cause link failures, only the default /sys/kernel/config mount point will not be created. Link: https://lkml.kernel.org/r/20250521160359.2132363-1-arnd@kernel.org Fixes: 6b23858fd63b ("crash_dump: make dm crypt keys persist for the kdump kernel") Fixes: 1fb470408497 ("nvme-loop: add configfs dependency") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Breno Leitao <leitao@debian.org> Cc: Chaitanya Kulkarni <kch@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Coiby Xu <coxu@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-28f2fs: introduce FAULT_VMALLOCChao Yu3-5/+11
Introduce a new fault type FAULT_VMALLOC to simulate no memory error in f2fs_vmalloc(). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: use vmalloc instead of kvmalloc in .init_{,de}compress_ctxChao Yu2-13/+15
.init_{,de}compress_ctx uses kvmalloc() to alloc memory, it will try to allocate physically continuous page first, it may cause more memory allocation pressure, let's use vmalloc instead to mitigate it. [Test] cd /data/local/tmp touch file f2fs_io setflags compression file f2fs_io getflags file for i in $(seq 1 10); do sync; echo 3 > /proc/sys/vm/drop_caches;\ time f2fs_io write 512 0 4096 zero osync file; truncate -s 0 file;\ done [Result] Before After Delta 21.243 21.694 -2.12% For compression, we recommend to use ioctl to compress file data in background for workaround. For decompression, only zstd will be affected. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: add f2fs_bug_on() in f2fs_quota_read()Chao Yu1-0/+6
mapping_read_folio_gfp() will return a folio, it should always be uptodate, let's check folio uptodate status to detect any potenial bug. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: add f2fs_bug_on() to detect potential bugChao Yu2-3/+8
Add f2fs_bug_on() to check whether memory preallocation will fail or not after radix_tree_preload(GFP_NOFS | __GFP_NOFAIL). Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28f2fs: remove unused sbi argument from checksum functionsEric Biggers5-35/+24
Since __f2fs_crc32() now calls crc32() directly, it no longer uses its sbi argument. Remove that, and simplify its callers accordingly. Signed-off-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-27Merge tag 'x86_cache_for_v6.16_rc1' of ↵Linus Torvalds10-0/+7554
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 resource control updates from Borislav Petkov: "Carve out the resctrl filesystem-related code into fs/resctrl/ so that multiple architectures can share the fs API for manipulating their respective hw resource control implementation. This is the second step in the work towards sharing the resctrl filesystem interface, the next one being plugging ARM's MPAM into the aforementioned fs API" * tag 'x86_cache_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) MAINTAINERS: Add reviewers for fs/resctrl x86,fs/resctrl: Move the resctrl filesystem code to live in /fs/resctrl x86/resctrl: Always initialise rid field in rdt_resources_all[] x86/resctrl: Relax some asm #includes x86/resctrl: Prefer alloc(sizeof(*foo)) idiom in rdt_init_fs_context() x86/resctrl: Squelch whitespace anomalies in resctrl core code x86/resctrl: Move pseudo lock prototypes to include/linux/resctrl.h x86/resctrl: Fix types in resctrl_arch_mon_ctx_{alloc,free}() stubs x86/resctrl: Move enum resctrl_event_id to resctrl.h x86/resctrl: Move the filesystem bits to headers visible to fs/resctrl fs/resctrl: Add boiler plate for external resctrl code x86/resctrl: Add 'resctrl' to the title of the resctrl documentation x86/resctrl: Split trace.h x86/resctrl: Expand the width of domid by replacing mon_data_bits x86/resctrl: Add end-marker to the resctrl_event_id enum x86/resctrl: Move is_mba_sc() out of core.c x86/resctrl: Drop __init/__exit on assorted symbols x86/resctrl: Resctrl_exit() teardown resctrl but leave the mount point x86/resctrl: Check all domains are offline in resctrl_exit() x86/resctrl: Rename resctrl_sched_in() to begin with "resctrl_arch_" ...
2025-05-27Merge tag 'timers-cleanups-2025-05-25' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "Another set of timer API cleanups: - Convert init_timer*(), try_to_del_timer_sync() and destroy_timer_on_stack() over to the canonical timer_*() namespace convention. There is another large conversion pending, which has not been included because it would have caused a gazillion of merge conflicts in next. The conversion scripts will be run towards the end of the merge window and a pull request sent once all conflict dependencies have been merged" * tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack() treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try() timers: Rename init_timers() as timers_init() timers: Rename NEXT_TIMER_MAX_DELTA as TIMER_NEXT_MAX_DELTA timers: Rename __init_timer_on_stack() as __timer_init_on_stack() timers: Rename __init_timer() as __timer_init() timers: Rename init_timer_on_stack_key() as timer_init_key_on_stack() timers: Rename init_timer_key() as timer_init_key()
2025-05-27ksmbd: allow a filename to contain special characters on SMB3.1.1 posix ↵Namjae Jeon1-26/+27
extension If client send SMB2_CREATE_POSIX_CONTEXT to ksmbd, Allow a filename to contain special characters. Reported-by: Philipp Kerling <pkerling@casix.org> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-27ksmbd: provide zero as a unique ID to the Mac clientNamjae Jeon3-2/+21
The Mac SMB client code seems to expect the on-disk file identifier to have the semantics of HFS+ Catalog Node Identifier (CNID). ksmbd provides the inode number as a unique ID to the client, but in the case of subvolumes of btrfs, there are cases where different files have the same inode number, so the mac smb client treats it as an error. There is a report that a similar problem occurs when the share is ZFS. Returning UniqueId of zero will make the Mac client to stop using and trusting the file id returned from the server. Reported-by: Justin Turner Arthur <justinarthur@gmail.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-27btrfs: don't drop a reference if btrfs_check_write_meta_pointer() failsJosef Bacik1-1/+0
In the zoned mode there's a bug in the extent buffer tree conversion to xarray. The reference for eb is dropped and code continues but the references get dropped by releasing the batch. Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/linux-btrfs/202505191521.435b97ac-lkp@intel.com/ Fixes: 19d7f65f032f ("btrfs: convert the buffer_radix to an xarray") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-27btrfs: don't drop a reference if btrfs_check_write_meta_pointer() failsJosef Bacik1-1/+0
In the zoned mode there's a bug in the extent buffer tree conversion to xarray. The reference for eb is dropped and code continues but the references get dropped by releasing the batch. Reported-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Fixes: 19d7f65f032f ("btrfs: convert the buffer_radix to an xarray") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-27bcachefs: Don't rewind to run a recovery pass we already ranKent Overstreet1-1/+3
Fix a small regression from the "run recovery passes" rewrite, which enabled async recovery passes. This fixes getting stuck in a loop in recovery. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Move unicode message to after the startup messageKent Overstreet1-4/+6
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Fix missing commit in check_direntsKent Overstreet1-1/+2
Other repair code seems to be doing commits themselves, but check_key_has_snapshot() does not. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Fix lost rebalance wakeupsKent Overstreet4-8/+13
Fix a missing wakeup in 'bcachefs set-file-option' -> xattr option update -> inode_write this was missing because the wakeup needs to happen after transaction commit. Also, add a 'kick' counter, to make sure we don't miss a wakeup that occured right after we finished checking the rebalance_work btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: bch2_kthread_io_clock_wait_once()Kent Overstreet2-29/+19
Add a version of bch2_kthread_io_clock_wait() that only schedules once - behaving more like schedule_timeout(). This will be used for fixing rebalance wakeups. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Ensure we print output of run_recovery_pass if it errorsKent Overstreet1-6/+15
Also, don't error out in bucket_ref_update_err(): we don't want to return -BCH_ERR_cannot_rewind_recovery if it's not an insert, if it's an overwrite we continue. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27Don't propagate mounts into detached treesAl Viro3-20/+4
All versions up to 6.14 did not propagate mount events into detached tree. Shortly after 6.14 a merge of vfs-6.15-rc1.mount.namespace (130e696aa68b) has changed that. Unfortunately, that has caused userland regressions (reported in https://lore.kernel.org/all/CAOYeF9WQhFDe+BGW=Dp5fK8oRy5AgZ6zokVyTj1Wp4EUiYgt4w@mail.gmail.com/) Straight revert wouldn't be an option - in particular, the variant in 6.14 had a bug that got fixed in d1ddc6f1d9f0 ("fix IS_MNT_PROPAGATING uses") and we don't want to bring the bug back. This is a modification of manual revert posted by Christian, with changes needed to avoid reintroducing the breakage in scenario described in d1ddc6f1d9f0. Cc: stable@vger.kernel.org Reported-by: Allison Karlitskaya <lis@redhat.com> Tested-by: Allison Karlitskaya <lis@redhat.com> Acked-by: Christian Brauner <brauner@kernel.org> Co-developed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-05-26Merge tag 'v6.16-p1' of ↵Linus Torvalds2-142/+123
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix memcpy_sglist to handle partially overlapping SG lists - Use memcpy_sglist to replace null skcipher - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS - Hide CRYPTO_MANAGER - Add delayed freeing of driver crypto_alg structures Compression: - Allocate large buffers on first use instead of initialisation in scomp - Drop destination linearisation buffer in scomp - Move scomp stream allocation into acomp - Add acomp scatter-gather walker - Remove request chaining - Add optional async request allocation Hashing: - Remove request chaining - Add optional async request allocation - Move partial block handling into API - Add ahash support to hmac - Fix shash documentation to disallow usage in hard IRQs Algorithms: - Remove unnecessary SIMD fallback code on x86 and arm/arm64 - Drop avx10_256 xts(aes)/ctr(aes) on x86 - Improve avx-512 optimisations for xts(aes) - Move chacha arch implementations into lib/crypto - Move poly1305 into lib/crypto and drop unused Crypto API algorithm - Disable powerpc/poly1305 as it has no SIMD fallback - Move sha256 arch implementations into lib/crypto - Convert deflate to acomp - Set block size correctly in cbcmac Drivers: - Do not use sg_dma_len before mapping in sun8i-ss - Fix warm-reboot failure by making shutdown do more work in qat - Add locking in zynqmp-sha - Remove cavium/zip - Add support for PCI device 0x17D8 to ccp - Add qat_6xxx support in qat - Add support for RK3576 in rockchip-rng - Add support for i.MX8QM in caam Others: - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up - Add new SEV/SNP platform shutdown API in ccp" * tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits) x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining crypto: qat - add missing header inclusion crypto: api - Redo lookup on EEXIST Revert "crypto: testmgr - Add hash export format testing" crypto: marvell/cesa - Do not chain submitted requests crypto: powerpc/poly1305 - add depends on BROKEN for now Revert "crypto: powerpc/poly1305 - Add SIMD fallback" crypto: ccp - Add missing tee info reg for teev2 crypto: ccp - Add missing bootloader info reg for pspv5 crypto: sun8i-ce - move fallback ahash_request to the end of the struct crypto: octeontx2 - Use dynamic allocated memory region for lmtst crypto: octeontx2 - Initialize cptlfs device info once crypto: xts - Only add ecb if it is not already there crypto: lrw - Only add ecb if it is not already there crypto: testmgr - Add hash export format testing crypto: testmgr - Use ahash for generic tfm crypto: hmac - Add ahash support crypto: testmgr - Ignore EEXIST on shash allocation crypto: algapi - Add driver template support to crypto_inst_setname crypto: shash - Set reqsize in shash_alg ...
2025-05-26Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linuxLinus Torvalds6-65/+257
Pull fscrypt update from Eric Biggers: "Add support for 'hardware-wrapped inline encryption keys' to fscrypt. When enabled on supported platforms, this feature protects file contents keys from certain attacks, such as cold boot attacks. This feature uses the block layer support for wrapped keys which was merged in 6.15. Wrapped key support has existed out-of-tree in Android for a long time, and it's finally ready for upstream now that there is a platform on which it works end-to-end with upstream. Specifically, it works on the Qualcomm SM8650 HDK, using the Qualcomm ICE (Inline Crypto Engine) and HWKM (Hardware Key Manager). The corresponding driver support is included in the SCSI tree for 6.16. Validation for this feature includes two new tests that were already merged into xfstests (generic/368 and generic/369)" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux: fscrypt: add support for hardware-wrapped keys
2025-05-26Merge tag 'xfs-merge-6.16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds43-184/+1499
Pull xfs updates from Carlos Maiolino: - Atomic writes for XFS - Remove experimental warnings for pNFS, scrub and parent pointers * tag 'xfs-merge-6.16' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (26 commits) xfs: add inode to zone caching for data placement xfs: free the item in xfs_mru_cache_insert on failure xfs: remove the EXPERIMENTAL warning for pNFS xfs: remove some EXPERIMENTAL warnings xfs: Remove deprecated xfs_bufd sysctl parameters xfs: stop using set_blocksize xfs: allow sysadmins to specify a maximum atomic write limit at mount time xfs: update atomic write limits xfs: add xfs_calc_atomic_write_unit_max() xfs: add xfs_file_dio_write_atomic() xfs: commit CoW-based atomic writes atomically xfs: add large atomic writes checks in xfs_direct_write_iomap_begin() xfs: add xfs_atomic_write_cow_iomap_begin() xfs: refine atomic write size check in xfs_file_write_iter() xfs: refactor xfs_reflink_end_cow_extent() xfs: allow block allocator to take an alignment hint xfs: ignore HW which cannot atomic write a single block xfs: add helpers to compute transaction reservation for finishing intent items xfs: add helpers to compute log item overhead xfs: separate out setting buftarg atomic writes limits ...
2025-05-26Merge tag 'erofs-for-6.16-rc1' of ↵Linus Torvalds11-63/+387
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "In this cycle, Intel QAT hardware accelerators are supported to improve DEFLATE decompression performance. I've tested it with the enwik9 dataset of 1 MiB pclusters on our Intel Sapphire Rapids bare-metal server and a PL0 ESSD, and the sequential read performance even surpasses LZ4 software decompression on this setup. In addition, a `fsoffset` mount option is introduced for file-backed mounts to specify the filesystem offset in order to adapt customized container formats. And other improvements and minor cleanups. Summary: - Add a `fsoffset` mount option to specify the filesystem offset - Support Intel QAT accelerators to boost up the DEFLATE algorithm - Initialize per-CPU workers and CPU hotplug hooks lazily to avoid unnecessary overhead when EROFS is not mounted - Fix file handle encoding for 64-bit NIDs - Minor cleanups" * tag 'erofs-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: support DEFLATE decompression by using Intel QAT erofs: clean up erofs_{init,exit}_sysfs() erofs: add 'fsoffset' mount option to specify filesystem offset erofs: lazily initialize per-CPU workers and CPU hotplug hooks erofs: refine readahead tracepoint erofs: avoid using multiple devices with different type erofs: fix file handle encoding for 64-bit NIDs
2025-05-26Merge tag 'bcachefs-2025-05-24' of git://evilpiepirate.org/bcachefsLinus Torvalds138-3542/+7026
Pull bcachefs updates from Kent Overstreet: - Poisoned extents can now be moved: this lets us handle bitrotted data without deleting it. For now, reading from poisoned extents only returns -EIO: in the future we'll have an API for specifying "read this data even if there were bitflips". - Incompatible features may now be enabled at runtime, via "opts/version_upgrade" in sysfs. Toggle it to incompatible, and then toggle it back - option changes via the sysfs interface are persistent. - Various changes to support deployable disk images: - RO mounts now use less memory - Images may be stripped of alloc info, particularly useful for slimming them down if they will primarily be mounted RO. Alloc info will be automatically regenerated on first RW mount, and this is quite fast - Filesystem images generated with 'bcachefs image' will be automatically resized the first time they're mounted on a larger device The images 'bcachefs image' generates with compression enabled have been comparable in size to those generated by squashfs and erofs - but you get a full RW capable filesystem - Major error message improvements for btree node reads, data reads, and elsewhere. We now build up a single error message that lists all the errors encountered, actions taken to repair, and success/failure of the IO. This extends to other error paths that may kick off other actions, e.g. scheduling recovery passes: actions we took because of an error are included in that error message, with grouping/indentation so we can see what caused what. - New option, 'rebalance_on_ac_only'. Does exactly what the name suggests, quite handy with background compression. - Repair/self healing: - We can now kick off recovery passes and run them in the background if we detect errors. Currently, this is just used by code that walks backpointers. We now also check for missing backpointers at runtime and run check_extents_to_backpointers if required. The messy 6.14 upgrade left missing backpointers for some users, and this will correct that automatically instead of requiring a manual fsck - some users noticed this as copygc spinning and not making progress. In the future, as more recovery passes come online, we'll be able to repair and recover from nearly anything - except for unreadable btree nodes, and that's why you're using replication, of course - without shutting down the filesystem. - There's a new recovery pass, for checking the rebalance_work btree, which tracks extents that rebalance will process later. - Hardening: - Close the last known hole in btree iterator/btree locking assertions: path->should_be_locked paths must stay locked until the end of the transaction. This shook out a few bugs, including a performance issue that was causing unnecessary path_upgrade transaction restarts. - Performance: - Faster snapshot deletion: this is an incompatible feature, as it requires new sentinal values, for safety. Snapshot deletion no longer has to do a full metadata scan, it now just scans the inodes btree: if an extent/dirent/xattr is present for a given snapshot ID, we already require that an inode be present with that same snapshot ID. If/when users hit scalability limits again (ridiculously huge filesystems with lots of inodes, and many sparse snapshots), let me know - the next step will be to add an index from snapshot ID -> inode number, which won't be too hard. - Faster device removal: the "scan for pointers to this device" no longer does a full metadata scan, instead it walks backpointers. Like fast snapshot deletion this is another incompat feature: it also requires a new sentinal value, because we don't want to reuse these device IDs until after a fsck. - We're now coalescing redundant accounting updates prior to transaction commit, taking some pressure off the journal. Shortly we'll also be doing multiple extent updates in a transaction in the main write path, which combined with the previous should drastically cut down on the amount of metadata updates we have to journal. - Stack usage improvements: All allocator state has been moved off the stack - Debug improvements: - enumerated refcounts: The debug code previously used for filesystem write refs is now a small library, and used for other heavily used refcounts. Different users of a refcount are enumerated, making it much easier to debug refcount issues. - Async object debugging: There's a new kconfig option that makes various async objects (different types of bios, data updates, write ops, etc.) visible in debugfs, and it should be fast enough to leave on in production. - Various sets of assertions no longer require CONFIG_BCACHEFS_DEBUG, instead they're controlled by module parameters and static keys, meaning users won't need to compile custom kernels as often to help debug issues. - bch2_trans_kmalloc() calls can be tracked (there's a new kconfig option). With it on you can check the btree_transaction_stats in debugfs to see the bch2_trans_kmalloc() calls a transaction did when it used the most memory. * tag 'bcachefs-2025-05-24' of git://evilpiepirate.org/bcachefs: (218 commits) bcachefs: Don't mount bs > ps without TRANSPARENT_HUGEPAGE bcachefs: Fix btree_iter_next_node() for new locking asserts bcachefs: Ensure we don't use a blacklisted journal seq bcachefs: Small check_fix_ptr fixes bcachefs: Fix opts.recovery_pass_last bcachefs: Fix allocate -> self healing path bcachefs: Fix endianness in casefold check/repair bcachefs: Path must be locked if trans->locked && should_be_locked bcachefs: Simplify bch2_path_put() bcachefs: Plumb btree_trans for more locking asserts bcachefs: Clear trans->locked before unlock bcachefs: Clear should_be_locked before unlock in key_cache_drop() bcachefs: bch2_path_get() reuses paths if upgrade_fails & !should_be_locked bcachefs: Give out new path if upgrade fails bcachefs: Fix btree_path_get_locks when not doing trans restart bcachefs: btree_node_locked_type_nowrite() bcachefs: Kill bch2_path_put_nokeep() bcachefs: bch2_journal_write_checksum() bcachefs: Reduce stack usage in data_update_index_update() bcachefs: bch2_trans_log_str() ...
2025-05-26Merge tag 'gfs2-for-6.16' of ↵Linus Torvalds24-213/+263
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Fix the long-standing warnings in inode_to_wb() when CONFIG_LOCKDEP is enabled: gfs2 doesn't support cgroup writeback and so inode->i_wb will never change. This is the counterpart of commit 9e888998ea4d ("writeback: fix false warning in inode_to_wb()") - Fix a hang introduced by commit 8d391972ae2d ("gfs2: Remove __gfs2_writepage()"): prevent gfs2_logd from creating transactions for jdata pages while trying to flush the log - Fix a race between gfs2_create_inode() and gfs2_evict_inode() by deallocating partially created inodes on the gfs2_create_inode() error path - Fix a bug in the journal head lookup code that could cause mount to fail after successful recovery - Various smaller fixes and cleanups from various people * tag 'gfs2-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (23 commits) gfs2: No more gfs2_find_jhead caching gfs2: Get rid of duplicate log head lookup gfs2: Simplify clean_journal gfs2: Simplify gfs2_log_pointers_init gfs2: Move gfs2_log_pointers_init gfs2: Minor comments fix gfs2: Don't start unnecessary transactions during log flush gfs2: Move gfs2_trans_add_databufs gfs2: Rename jdata_dirty_folio to gfs2_jdata_dirty_folio gfs2: avoid inefficient use of crc32_le_shift() gfs2: Do not call iomap_zero_range beyond eof gfs: don't check for AOP_WRITEPAGE_ACTIVATE in gfs2_write_jdata_batch gfs2: Fix usage of bio->bi_status in gfs2_end_log_write gfs2: deallocate inodes in gfs2_create_inode gfs2: Move GIF_ALLOC_FAILED check out of gfs2_ea_dealloc gfs2: Move gfs2_dinode_dealloc gfs2: Don't reread inodes unnecessarily gfs2: gfs2_create_inode error handling fix gfs2: Remove unnecessary NULL check before free_percpu() gfs2: check sb_min_blocksize return value ...
2025-05-26Merge tag 'configfs-for-v6.16' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux Pull configfs updates from Andreas Hindborg: - Allow creation of rw files with custom permissions. This allows drivers to better protect secrets written through configfs - Fix a bug where an error condition did not cause an early return while populating attributes - Report ENOMEM rather than EFAULT when kvasprintf() fails in config_item_set_name() - Add a Rust API for configfs. This allows Rust drivers to use configfs through a memory safe interface * tag 'configfs-for-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/linux: MAINTAINERS: add configfs Rust abstractions rust: configfs: add a sample demonstrating configfs usage rust: configfs: introduce rust support for configfs configfs: Correct error value returned by API config_item_set_name() configfs: Do not override creating attribute file failure in populate_attrs() configfs: Delete semicolon from macro type_print() definition configfs: Add CONFIGFS_ATTR_PERM helper