path: root/fs
Age | Commit message | Author | Files | Lines
2025-05-22 | ubifs: Fix grammar in error message | Thorsten Blum | 1 | -1/+1
s/much/many/ Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Richard Weinberger <richard@nod.at>
2025-05-22 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski | 29 | -151/+332
Cross-merge networking fixes after downstream PR (net-6.15-rc8).
Conflicts:
  80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
  4bcc063939a5 ("ice, irdma: fix an off by one in error handling code")
  c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers")
  https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au
No extra adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22 | gfs2: No more gfs2_find_jhead caching | Andreas Gruenbacher | 6 | -10/+7
We are no longer calling gfs2_find_jhead() on the same log twice, so there is no more reason for keeping the log contents cached across those calls. In addition, log head lookup and log header writing didn't go through the same address space and so the caching wasn't even fully working, anyway. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Get rid of duplicate log head lookup | Andreas Gruenbacher | 2 | -12/+6
Currently at mount time, the recovery code looks up the current log head and, if necessary, replays the log and writes a recovery header to indicate that the log is clean. It does that for each log that may need recovery. We also know that our own log will always be checked as part of that process. Then, the mount code looks up the log head of our own log again. The double log head lookup can be costly, but more importantly, it is unnecessary because we can trivially compute the position of the log head after recovery; all we need to do for that is bump the position and lh_sequence by one when writing a recovery header. With that in mind, move the call to gfs2_log_pointers_init() into gfs2_recover_func() and get rid of the double lookup in gfs2_make_fs_rw(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
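The computation described in this message (bump the position and the sequence number by one when a recovery header is written) can be sketched as follows. This is only an illustration under stated assumptions: `struct log_head`, `journal_next_blk()` and `head_after_recovery()` are hypothetical names, and the wrap-around at the end of a circular journal is an assumption of the sketch rather than something the message spells out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified journal head; not the actual gfs2 structure. */
struct log_head {
	uint32_t blkno;    /* block index of the head within the journal */
	uint64_t sequence; /* sequence number of the header at that block */
};

/* Advance a block index by one, wrapping at the end of a circular journal
 * (the wrap-around is an assumption of this sketch). */
static uint32_t journal_next_blk(uint32_t blk, uint32_t journal_blocks)
{
	assert(journal_blocks > 0 && blk < journal_blocks);
	return (blk + 1 == journal_blocks) ? 0 : blk + 1;
}

/* Head position once a recovery header has been written right after the
 * old head: both the block position and the sequence move on by one. */
static struct log_head head_after_recovery(struct log_head old,
					   uint32_t journal_blocks)
{
	struct log_head new_head = {
		.blkno = journal_next_blk(old.blkno, journal_blocks),
		.sequence = old.sequence + 1,
	};
	return new_head;
}
```

Because the new head follows directly from the old one, no second scan of the log is needed once recovery has written its header.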
2025-05-22 | gfs2: Simplify clean_journal | Andreas Gruenbacher | 1 | -3/+3
In function clean_journal(), update @head to point at the log header that indicates successful recovery: this is where logging needs to resume. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Simplify gfs2_log_pointers_init | Andreas Gruenbacher | 4 | -16/+12
Move the initialization of sdp->sd_log_sequence and sdp->sd_log_flush_head inside gfs2_log_pointers_init(). Use gfs2_replay_incr_blk(). Before this change, the log head lookup code in freeze_go_xmote_bh() didn't update sdp->sd_log_flush_head. This is now fixed, but the code in freeze_go_xmote_bh() appears to be pretty useless in the first place: on a frozen filesystem, the log head will not change. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Move gfs2_log_pointers_init | Andreas Gruenbacher | 3 | -11/+10
Move gfs2_log_pointers_init to recovery.c: there is no need for inlining this function. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Minor comments fix | Andreas Gruenbacher | 1 | -2/+2
Commit 40829760096df ("gfs2: Convert gfs2_find_jhead() to use a folio") replaced grab_cache_page() by filemap_grab_folio(), but the comments were still referring to grab_cache_page(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Don't start unnecessary transactions during log flush | Andreas Gruenbacher | 3 | -1/+38
Commit 8d391972ae2d ("gfs2: Remove __gfs2_writepage()") changed the log flush code in gfs2_ail1_start_one() to call aops->writepages() instead of aops->writepage(). For jdata inodes, this means that we will now try to reserve log space and start a transaction before we can determine that the pages in question have already been journaled. When this happens in the context of gfs2_logd(), it can now appear that not enough log space is available for freeing up log space, and we will lock up. Fix that by issuing journal writes directly instead of going through aops->writepages() in the log flush code. Fixes: 8d391972ae2d ("gfs2: Remove __gfs2_writepage()") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Move gfs2_trans_add_databufs | Andreas Gruenbacher | 5 | -25/+26
Move gfs2_trans_add_databufs() to trans.c. Pass in a glock instead of a gfs2_inode. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: Rename jdata_dirty_folio to gfs2_jdata_dirty_folio | Andreas Gruenbacher | 1 | -2/+2
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | gfs2: avoid inefficient use of crc32_le_shift() | Eric Biggers | 1 | -1/+2
__get_log_header() was using crc32_le_shift() to update a CRC with four zero bytes. However, this is about 5x slower than just CRC'ing four zero bytes in the normal way. Just do that instead. (We could instead make crc32_le_shift() faster on short lengths. But all its callers do just fine without it, so I'd like to just remove it.) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
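A minimal sketch of the "normal way" mentioned here: extend a running CRC over four zero bytes by feeding an actual zero buffer to the CRC routine. `crc32_le_sketch()` and `crc32_le_four_zeros()` are hypothetical names, and the bit-at-a-time loop is a plain reference implementation of the reflected CRC-32 polynomial, not the kernel's optimized code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise little-endian CRC-32 (reflected polynomial 0xEDB88320).
 * No pre- or post-inversion is applied; the caller supplies the seed. */
static uint32_t crc32_le_sketch(uint32_t crc, const unsigned char *p, size_t len)
{
	assert(p != NULL || len == 0);
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0);
	}
	return crc;
}

/* Extend a running CRC over four zero bytes by hashing them like any
 * other data, instead of using a dedicated "shift" helper. */
static uint32_t crc32_le_four_zeros(uint32_t crc)
{
	static const unsigned char zeros[4]; /* zero-initialized */

	return crc32_le_sketch(crc, zeros, sizeof(zeros));
}
```

With the conventional all-ones seed and final inversion, the reference routine reproduces the standard CRC-32 check value 0xCBF43926 for the ASCII string "123456789", which is a convenient sanity check for the sketch.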
2025-05-22 | gfs2: Do not call iomap_zero_range beyond eof | Andreas Gruenbacher | 1 | -2/+4
Since commit eb65540aa9fc ("iomap: warn on zero range of a post-eof folio"), iomap_zero_range() warns when asked to zero a folio beyond eof. The warning triggers on the following code path:
  gfs2_fallocate(FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)
    __gfs2_punch_hole()
      gfs2_block_zero_range()
        iomap_zero_range()
In __gfs2_punch_hole(), gfs2 zeroes out partial folios at the beginning and at the end of the specified range, whether those folios are beyond eof or not. This may add folios to the page cache which are entirely beyond eof, which isn't of any use. Avoid that by truncating the range to zero out at eof. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
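The idea of truncating the range at eof can be illustrated with a small clamp helper. This is a sketch, not the code from the patch: `clamp_len_to_eof()` is a hypothetical name, and sizes are plain 64-bit byte counts.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Clamp the byte range [offset, offset + length) so that it does not
 * extend past i_size. Returns the clamped length; 0 means the whole
 * range lies at or beyond eof and there is nothing to zero.
 */
static uint64_t clamp_len_to_eof(uint64_t offset, uint64_t length,
				 uint64_t i_size)
{
	assert(length <= UINT64_MAX - offset); /* range must not overflow */

	if (offset >= i_size)
		return 0;
	if (length > i_size - offset)
		return i_size - offset;
	return length;
}
```

A caller would skip the zeroing step entirely when the helper returns 0, so no folio beyond eof is ever instantiated just to be zeroed.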
2025-05-22 | gfs: don't check for AOP_WRITEPAGE_ACTIVATE in gfs2_write_jdata_batch | Christoph Hellwig | 1 | -18/+10
__gfs2_jdata_write_folio can't return AOP_WRITEPAGE_ACTIVATE, so don't check for it in gfs2_write_jdata_batch. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-05-22 | erofs: add 'fsoffset' mount option to specify filesystem offset | Sheng Yong | 5 | -6/+24
When attempting to use an archive file, such as APEX on Android, as a file-backed mount source, the mount fails because the EROFS image within the archive file does not start at offset 0. As a result, a loop or a dm device is still needed to attach the image file at an appropriate offset first. Similarly, if an EROFS image within a block device does not start at offset 0, it cannot be mounted directly either. To address this issue, this patch adds a new mount option `fsoffset=x' to accept a start offset for the primary device. The offset should be aligned to the block size. EROFS will add this offset before performing read requests. Signed-off-by: Sheng Yong <shengyong1@xiaomi.com> Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20250517090544.2687651-1-shengyong1@xiaomi.com [ Gao Xiang: minor update on documentation and the error message. ] Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
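The two rules stated in this message (the offset should be block-aligned, and it is added to each read position) can be sketched like this. The helper names are hypothetical and the power-of-two block size is an assumption of the sketch; the actual EROFS code paths are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if off is a multiple of blksz. blksz is assumed to be a power of
 * two, which lets the check use a mask instead of a division. */
static bool fsoffset_is_aligned(uint64_t off, uint32_t blksz)
{
	assert(blksz != 0 && (blksz & (blksz - 1)) == 0);
	return (off & ((uint64_t)blksz - 1)) == 0;
}

/* Translate a position inside the filesystem image into a position in
 * the backing file or device by adding the configured start offset. */
static uint64_t image_to_device_pos(uint64_t fs_pos, uint64_t fsoffset)
{
	return fs_pos + fsoffset;
}
```

For example, with a 4096-byte block size an image starting at byte 8192 of its container passes the alignment check, and a read of image position 4096 would be issued at container position 12288.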
2025-05-22 | ksmbd: use list_first_entry_or_null for opinfo_get_list() | Namjae Jeon | 1 | -5/+2
The list_first_entry() macro never returns NULL. If the list is empty then it returns an invalid pointer. Use list_first_entry_or_null() to check if the list is empty. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202505080231.7OXwq4Te-lkp@intel.com/ Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
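The difference between the two macros is easier to see in a small userspace re-creation of the kernel's circular list. The macros below are simplified stand-ins (they evaluate `head` more than once and omit the kernel's READ_ONCE handling), and `struct op_entry` and `first_id_or_minus_one()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace re-creation of a circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* On an empty list, head->next == head, so this yields a pointer
 * computed from the head itself: non-NULL but not a valid entry. */
#define list_first_entry(head, type, member) \
	container_of((head)->next, type, member)

/* Simplified variant: returns NULL when the list is empty. */
#define list_first_entry_or_null(head, type, member) \
	((head)->next != (head) ? list_first_entry(head, type, member) : NULL)

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	assert(item != NULL && head != NULL);
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Hypothetical element type for the example. */
struct op_entry {
	int id;
	struct list_head link;
};

/* Return the id of the first entry, or -1 if the list is empty. */
static int first_id_or_minus_one(struct list_head *head)
{
	struct op_entry *e =
		list_first_entry_or_null(head, struct op_entry, link);

	return e ? e->id : -1;
}
```

A NULL check after plain `list_first_entry()` can never fire, which is why the empty case has to be handled either with `list_empty()` beforehand or with the `_or_null` variant.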
2025-05-22 | ksmbd: fix rename failure | Namjae Jeon | 1 | -1/+1
I found that rename fails after cifs mount due to the update of lookup_one_qstr_excl().
  mv a/c b/
  mv: cannot move 'a/c' to 'b/c': No such file or directory
In order to rename to a new name regardless of whether the dentry is negative, we need to get the dentry through lookup_one_qstr_excl(). So it will not return an error if the name doesn't exist. Fixes: 204a575e91f3 ("VFS: add common error checks to lookup_one_qstr_excl()") Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-22 | bcachefs: Drop empty accounting updates | Kent Overstreet | 1 | -0/+10
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Improve trace_trans_restart_upgrade | Kent Overstreet | 2 | -47/+24
- Convert to a 'fs_str' tracepoint that just emits as a string: this lets us build up the tracepoint with a printbuf, using our pretty printers, and they're much easier to manage
- Include locks_held, before and after
- Include the btree node pointer we failed on (error pointer, null, or real node)
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: fix bch2_inum_snapshot_to_path() | Kent Overstreet | 1 | -22/+28
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: fix duplicate printk | Kent Overstreet | 1 | -2/+0
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: BCH_INODE_has_case_insensitive | Kent Overstreet | 8 | -13/+196
Add a flag for tracking whether a directory has case-insensitive descendants - so that overlayfs can disallow mounting, even though the filesystem supports case insensitivity. This is a new on disk format version, with a (cheap) upgrade to ensure the flag is correctly set on existing inodes. Create, rename and fssetxattr are all plumbed to ensure the new flag is set, and we've got new fsck code that hooks into check_inode(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_inode_find_by_inum_snapshot() | Kent Overstreet | 3 | -95/+96
Move a fsck.c helper into inode.c, eliminate some duplication, and organize the inode lookup helpers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_inum_snapshot_to_path() | Kent Overstreet | 5 | -30/+43
Add a better helper for printing out paths of inodes when we don't know the subvolume, for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_rename_trans() only runs rename-to-dir code if needed | Kent Overstreet | 1 | -22/+24
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: subvol_inum_eq() | Kent Overstreet | 4 | -12/+10
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Don't set bi_casefold on non directories | Kent Overstreet | 1 | -0/+3
bi_casefold only makes sense for directories, and since it's one of the variable length fields setting it unnecessarily wastes space. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Remove duplicate call to bch2_trans_begin() | Alan Huang | 1 | -2/+0
There is one in for_each_btree_key_max(). Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Call bch2_bkey_set_needs_rebalance() earlier in write path | Kent Overstreet | 1 | -3/+9
There's no reason to be running this inside our transaction; it forces us to copy the key we're updating to a temporary, which we'd like to skip. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Simplify bch2_extent_atomic_end() | Kent Overstreet | 3 | -51/+22
It used to be that we had a fixed maximum number of btree paths to work with - 64. That's no longer the case, so bch2_extent_atomic_end() doesn't have to be as strict. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Coalesce accounting in trans commit | Kent Overstreet | 1 | -9/+27
Accounting has gotten quite heavy, and there's lots of redundancy in accounting updates within a transaction, as we often add/delete multiple extents that touch the same accounting counters. This will reduce the amount of data that we journal, and reduce pressure downstream on the btree write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
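The general idea of coalescing (several updates to the same counter collapse into one update carrying the summed delta) can be sketched as below. This is only an illustration: `struct acct_update` and `coalesce_updates()` are hypothetical, the counter is reduced to a plain integer id, and the real code works on btree keys inside the transaction commit path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accounting update: a counter id plus a signed delta. */
struct acct_update {
	uint32_t counter;
	int64_t delta;
};

/*
 * Merge updates that touch the same counter by summing their deltas,
 * in place, keeping first-seen order. Returns the number of entries
 * left. Quadratic in n, which is acceptable for a short list.
 */
static size_t coalesce_updates(struct acct_update *u, size_t n)
{
	size_t out = 0;

	assert(u != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		size_t j;

		/* Look for an earlier, already-kept entry for this counter. */
		for (j = 0; j < out; j++)
			if (u[j].counter == u[i].counter)
				break;

		if (j < out)
			u[j].delta += u[i].delta; /* fold into existing entry */
		else
			u[out++] = u[i];          /* first update for this counter */
	}
	return out;
}
```

An entry whose summed delta ends up as zero is kept by this sketch; dropping such entries would be a separate filtering step.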
2025-05-22 | bcachefs: Split out accounting in transaction commit | Kent Overstreet | 6 | -34/+52
There can be a lot of redundancy in accounting updates within a single btree transaction. Split out accounting updates so that they can be deduped, in the next commit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: btree_trans_subbuf | Kent Overstreet | 8 | -54/+82
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Make accounting mismatch errors more readable | Kent Overstreet | 1 | -2/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: async objs now support bch_write_ops | Kent Overstreet | 5 | -1/+17
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: fix bch2_debugfs_flush_buf() when tabstops are in use | Kent Overstreet | 1 | -0/+5
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: fsck: Include loops in error messages | Kent Overstreet | 1 | -9/+16
This fixes the subvol loop checking and directory loop checking to print the loop. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_check_bucket_backpointer_mismatch() | Kent Overstreet | 5 | -11/+98
Detect buckets with missing backpointers, and run repair on demand. __bch2_move_data_phys() now calls bch2_check_bucket_backpointer_mismatch() as it walks buckets, which checks for missing backpointers by comparing backpointers against bucket sector counts. When missing backpointers are detected, we kick off bch2_check_extents_to_backpointers() asynchronously - right away if we're trying to evacuate, or with a threshold if we're just running copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Improve bucket_bitmap code | Kent Overstreet | 6 | -80/+92
Add some more helpers, and mismatches is now a superset of the empty bitmap - simplifies most checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Run recovery passes asynchronously | Kent Overstreet | 4 | -26/+113
When we request a recovery pass to be run online, i.e. not during recovery, if it's an online pass it'll now be run in the background, instead of waiting for the next mount. To avoid situations where recovery passes are running continuously, this also includes ratelimiting: if the RUN_RECOVERY_PASS_ratelimit flag is passed, the pass may be deferred until later - depending on the runtime and last run stats in the recovery_passes superblock section. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_run_explicit_recovery_pass() cleanup | Kent Overstreet | 9 | -97/+117
Consolidate the run_explicit_recovery_pass() interfaces by adding a flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit flag. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: bch2_recovery_pass_status_to_text() | Kent Overstreet | 3 | -0/+32
Show recovery pass status in sysfs - important now that we're running them automatically in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Reduce usage of recovery.curr_pass | Kent Overstreet | 8 | -13/+13
We want recovery.curr_pass to be private to the recovery passes code, for better showing recovery pass status; also, it may rewind and is generally not the correct member to use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: __bch2_run_recovery_passes() | Kent Overstreet | 5 | -91/+104
Consolidate bch2_run_recovery_passes() and bch2_run_online_recovery_passes(), prep work for automatically scheduling and running recovery passes in the background.
- Now takes a mask of which passes to run, automatic background repair will pass in sb.recovery_passes_required.
- Skips passes that are failing: a pass that failed may be reattempted after another pass succeeds (some passes depend on repair done by other passes for successful completion).
- bch2_recovery_passes_match() helper to skip alloc passes on a filesystem without alloc info.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: struct bch_fs_recovery | Kent Overstreet | 15 | -69/+81
bch_fs has gotten obnoxiously big, let's start organizing things a bit better. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: kill copy in bch2_disk_accounting_mod() | Kent Overstreet | 1 | -5/+13
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: Optimize bch2_trans_start_alloc_update() | Kent Overstreet | 1 | -3/+18
Avoid doing more updates if we already have one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: btree key cache asserts | Kent Overstreet | 1 | -6/+19
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: journal path now uses discard_opt_enabled() | Kent Overstreet | 3 | -15/+21
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22 | bcachefs: relock_fail tracepoint now includes btree | Kent Overstreet | 1 | -1/+4
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>