path: root/fs/bcachefs
Age | Commit message | Author | Files | Lines
2024-12-21 | bcachefs: Simplify btree_iter_peek() filter_snapshots | Kent Overstreet | 1 | -67/+62
Collapse all the BTREE_ITER_filter_snapshots handling down into a single block; btree iteration is much simpler in the !filter_snapshots case. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Rename btree_iter_peek_upto() -> btree_iter_peek_max() | Kent Overstreet | 23 | -80/+80
We'll be introducing btree_iter_peek_prev_min(), so rename for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Assert that we're not violating key cache coherency rules | Kent Overstreet | 1 | -3/+10
We're not allowed to have a dirty key in the key cache if the key doesn't exist at all in the btree - creation has to bypass the key cache, so that iteration over the btree can check if the key is present in the key cache. Things break in subtle ways if cache coherency is broken, so this needs an assert. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_trans_verify_not_unlocked_or_in_restart() | Kent Overstreet | 6 | -40/+32
Fold two asserts into one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Better in_restart error | Kent Overstreet | 3 | -0/+19
We're ramping up on checking transaction restart handling correctness - so, in debug mode we now save a backtrace for where the restart was emitted, which makes it much easier to track down the incorrect handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Assert we're not in a restart in bch2_trans_put() | Kent Overstreet | 1 | -0/+3
This always indicates a transaction restart handling bug. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Fix unhandled transaction restart in evacuate_bucket() | Kent Overstreet | 1 | -0/+7
Generally, releasing a transaction within a transaction restart means an unhandled transaction restart: but this can happen legitimately within the move code, e.g. when bch2_move_ratelimit() tells us to exit before we've retried. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Improved check_topology() assert | Kent Overstreet | 1 | -10/+17
On interior btree node updates, we always verify that we're not introducing topology errors: child nodes should exactly span the range of the parent node. single_device.ktest small_nodes has been popping this assert: change it to give us more information. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
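The invariant in this message, that child nodes exactly span the parent's range, can be illustrated with a minimal userspace sketch. It assumes keys are plain 64-bit integers and ranges are inclusive. The struct and function names are hypothetical and are not the bcachefs implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified node: covers the inclusive key range [min, max]. */
struct node_range_sketch {
	uint64_t min;
	uint64_t max;
};

/*
 * Returns true if the children, taken in order, exactly tile the parent:
 * the first child starts at parent.min, every child starts right after the
 * previous one ends, and the last child ends at parent.max.
 */
static bool children_span_parent_sketch(struct node_range_sketch parent,
					const struct node_range_sketch *child,
					size_t nr)
{
	if (!nr || child[0].min != parent.min)
		return false;

	for (size_t i = 0; i < nr; i++) {
		/* An inverted range is always a topology error. */
		if (child[i].min > child[i].max)
			return false;

		if (i + 1 < nr) {
			/* Guard the +1 below against wraparound. */
			if (child[i].max == UINT64_MAX)
				return false;
			/* No gap and no overlap between neighbours. */
			if (child[i + 1].min != child[i].max + 1)
				return false;
		}
	}

	return child[nr - 1].max == parent.max;
}
```

A failing check is most useful when it also reports which pair of neighbours disagreed, which appears to be the kind of extra information the improved assert is after.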
2024-12-21 | bcachefs: Kill BCH_TRANS_COMMIT_lazy_rw | Kent Overstreet | 9 | -46/+16
We unconditionally go read-write, if we're going to do so, before journal replay: lazy_rw is obsolete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Add assert for use of journal replay keys for updates | Kent Overstreet | 3 | -0/+13
The journal replay keys mechanism can only be used for updates in early recovery, when still single threaded. Add some asserts to make sure we never accidentally use it elsewhere. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: use attribute define helper for sysfs attribute | Hongbo Li | 1 | -7/+3
The sysfs attribute definitions are wrapped in the macros rw_attribute, read_attribute and write_attribute; use these helpers to make the attribute definitions uniform. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: remove write permission for gc_gens_pos sysfs interface | Hongbo Li | 1 | -1/+1
gc_gens_pos shows the status of bucket gen gc, so there is no need to assign write permission to this attribute. Use the read_attribute helper to define it.
```
[Before]
$ ll internal/gc_gens_pos
-rw-r--r-- 1 root root 4096 Oct 28 15:27 internal/gc_gens_pos

[After]
$ ll internal/gc_gens_pos
-r--r--r-- 1 root root 4096 Oct 28 17:27 internal/gc_gens_pos
```
Fixes: ac516d0e7db7 ("bcachefs: Add the status of bucket gen gc to sysfs")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Move bch_extent_rebalance code to rebalance.c | Kent Overstreet | 8 | -238/+251
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Improve trace_rebalance_extent | Kent Overstreet | 3 | -37/+73
We now say explicitly which pointers are being moved or compressed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Simplify option logic in rebalance | Kent Overstreet | 3 | -44/+26
Since bch2_move_get_io_opts() now synchronizes io_opts with options from bch_extent_rebalance, delete the ad-hoc logic in rebalance.c that previously did this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: get_update_rebalance_opts() | Kent Overstreet | 5 | -28/+91
bch2_move_get_io_opts() now synchronizes options loaded from the filesystem and inode (if present, i.e. not walking the reflink btree directly) with options from the bch_extent_rebalance_entry, updating the extent if necessary. Since bch_extent_rebalance tracks where its option came from we can preserve "inode options override filesystem options", even for indirect extents where we don't have access to the inode the options came from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
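The "inode options override filesystem options" rule, and why recording an option's origin matters when the inode is not reachable, can be sketched as follows. This is an illustration under assumptions: the struct, field and function names are hypothetical, and the real bch_extent_rebalance layout is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for one option stored alongside an extent. */
struct stored_opt_sketch {
	unsigned value;
	bool from_inode;	/* true if the value was set on the inode */
};

/* Inode available: an option set on the inode wins over the filesystem's. */
static struct stored_opt_sketch
resolve_with_inode_sketch(bool inode_has_opt, unsigned inode_val, unsigned fs_val)
{
	struct stored_opt_sketch r = {
		.value = inode_has_opt ? inode_val : fs_val,
		.from_inode = inode_has_opt,
	};
	return r;
}

/*
 * Inode not available (e.g. walking indirect extents directly): a value
 * that came from an inode is kept, a value that came from the filesystem
 * is refreshed from the current filesystem option.
 */
static struct stored_opt_sketch
resolve_without_inode_sketch(struct stored_opt_sketch stored, unsigned fs_val)
{
	if (!stored.from_inode)
		stored.value = fs_val;
	return stored;
}
```

Without the origin flag, the second function could not tell whether changing the filesystem-wide option should affect the extent.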
2024-12-21 | bcachefs: bch2_write_inode() now checks for changing rebalance options | Kent Overstreet | 5 | -14/+32
Previously, BCHFS_IOC_REINHERIT_ATTRS didn't trigger rebalance scans when changing rebalance options; this had been missed, and only the xattr interface triggered them. Ideally they'd be done by the transactional trigger, but unpacking the inode to get the options is too heavy to be done in the low level trigger: the inode trigger is run on every extent update, since bch_inode.bi_journal_seq has to be updated for fsync. bch2_write_inode() is a good compromise: it already unpacks and repacks, and is not run in any super-fast paths. Additionally, creating the new rebalance entry to trigger the scan is now done in the same transaction as the inode update that changed the options. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: New bch_extent_rebalance fields | Kent Overstreet | 3 | -15/+87
- Add more io path options to bch_extent_rebalance
- For each option, track whether it came from the filesystem or the inode
This will be used for improved rebalance support for reflinked data.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_prt_csum_opt() | Kent Overstreet | 4 | -6/+8
bounds checking helper Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: copygc_enabled, rebalance_enabled now opts.h options | Kent Overstreet | 7 | -35/+23
They can now be set at mount time. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Add bch_io_opts fields for indicating whether the opts came from the inode | Kent Overstreet | 2 | -1/+10
This is going to be used in the bch_extent_rebalance improvements, which propagate io_path options into the extent (important for rebalance, which needs something present in the extent for transactionally tagging them in the rebalance_work btree, and also for indirect extents). By tracking in bch_extent_rebalance whether the option came from the filesystem or the inode we can correctly handle options being changed on indirect extents. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: io_opts_to_rebalance_opts() | Kent Overstreet | 8 | -50/+39
New helper to simplify bch2_bkey_set_needs_rebalance() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: rename bch_extent_rebalance fields to match other opts structs | Kent Overstreet | 3 | -28/+28
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: kill __bch2_bkey_sectors_need_rebalance() | Kent Overstreet | 1 | -13/+10
Single caller, fold into bch2_bkey_sectors_need_rebalance() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: kill bch2_bkey_needs_rebalance() | Kent Overstreet | 2 | -18/+0
Dead code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: small cleanup for extent ptr bitmasks | Kent Overstreet | 3 | -24/+24
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_io_opts_fixups() | Kent Overstreet | 6 | -10/+21
Centralize some io path option fixups - they weren't always being applied correctly:
- background_compression uses compression if unset
- background_target uses foreground_target if unset
- nocow disables most fancy io path options
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
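The three fixups listed in this message can be sketched as one small function applied wherever options are loaded. This is a simplified illustration, not the kernel code: the struct is hypothetical, 0 is assumed to mean "unset", and which options count as "fancy" under nocow (checksumming and compression here) is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified option struct; 0 means "unset" or "disabled". */
struct io_opts_sketch {
	unsigned compression;
	unsigned background_compression;
	unsigned foreground_target;
	unsigned background_target;
	unsigned data_checksum;
	bool nocow;
};

/* Apply all fixups in one place so every caller sees consistent options. */
static void io_opts_fixups_sketch(struct io_opts_sketch *o)
{
	/* background_compression falls back to compression if unset. */
	if (!o->background_compression)
		o->background_compression = o->compression;

	/* background_target falls back to foreground_target if unset. */
	if (!o->background_target)
		o->background_target = o->foreground_target;

	/* Assumption: nocow turns off checksumming and compression. */
	if (o->nocow) {
		o->data_checksum = 0;
		o->compression = 0;
		o->background_compression = 0;
	}
}
```

Keeping the fallbacks in a single helper avoids the situation the message describes, where some call sites applied them and others did not.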
2024-12-21 | bcachefs: use bch2_data_update_opts_to_text() in trace_move_extent_fail() | Kent Overstreet | 1 | -19/+14
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: avoid 'unsigned flags' | Kent Overstreet | 1 | -9/+15
flags should have actual types, where possible: fix btree_update.h helpers Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Annotate struct bucket_gens with __counted_by() | Thorsten Blum | 2 | -6/+9
Add the __counted_by compiler attribute to the flexible array member b to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and CONFIG_FORTIFY_SOURCE. Use struct_size() to calculate the number of bytes to be allocated. Update bucket_gens->nbuckets and bucket_gens->nbuckets_minus_first when resizing. Compile-tested only. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
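The pattern described here, a flexible array member annotated with its element count plus an overflow-aware size calculation, can be approximated in userspace. The sketch below is not the kernel's `__counted_by()` or `struct_size()`: the macro and function names are hypothetical, the attribute is only applied when the compiler reports support for `counted_by`, and the size helper saturates to `SIZE_MAX` in the spirit of `struct_size()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute only where the compiler supports it; otherwise no-op. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define counted_by_sketch(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef counted_by_sketch
# define counted_by_sketch(member)
#endif

/* Hypothetical analogue of a struct with a counted flexible array. */
struct gens_sketch {
	size_t nbuckets;
	uint8_t b[] counted_by_sketch(nbuckets);
};

/* Header plus n one-byte elements, saturating to SIZE_MAX on overflow. */
static size_t gens_size_sketch(size_t n)
{
	if (n > SIZE_MAX - sizeof(struct gens_sketch))
		return SIZE_MAX;
	return sizeof(struct gens_sketch) + n * sizeof(uint8_t);
}

static struct gens_sketch *gens_alloc_sketch(size_t n)
{
	size_t bytes = gens_size_sketch(n);
	struct gens_sketch *g;

	if (bytes == SIZE_MAX)
		return NULL;

	g = calloc(1, bytes);
	if (g)
		g->nbuckets = n;	/* set the count before touching b[] */
	return g;
}
```

The count has to be kept in sync whenever the array is resized, which is why the commit also updates the count fields on resize.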
2024-12-21 | bcachefs: Use str_write_read() helper in write_super_endio() | Thorsten Blum | 1 | -1/+2
Remove hard-coded strings by using the str_write_read() helper function. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
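As a rough illustration of what such a string-choice helper replaces, the sketch below uses a userspace stand-in. The function names and the message format are hypothetical; only the idea of mapping a boolean to "write" or "read" instead of open-coding a ternary at each call site is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-in for a string-choice helper. */
static inline const char *write_or_read_sketch(bool is_write)
{
	return is_write ? "write" : "read";
}

/* Hypothetical call site: formats an I/O error message into buf. */
static int format_io_error_sketch(char *buf, size_t len, bool is_write, int err)
{
	return snprintf(buf, len, "superblock %s error: %d",
			write_or_read_sketch(is_write), err);
}
```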
2024-12-21 | bcachefs: Use str_write_read() helper in ec_block_endio() | Thorsten Blum | 1 | -1/+2
Remove hard-coded strings by using the helper function str_write_read(). Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Use str_write_read() helper function | Thorsten Blum | 1 | -1/+3
Remove hard-coded strings by using the helper function str_write_read(). Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Add version check for bch_btree_ptr_v2.sectors_written validate | Kent Overstreet | 1 | -1/+2
A user popped up with a very old (0.11) filesystem that needed repair and wasn't recently backed up. Reported-by: Manoa <manoa@mail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Add block plugging to read paths | Kent Overstreet | 2 | -1/+23
This will help with some of the btree_trans srcu lock hold time warnings that are still turning up; submit_bio() can block for a while if the device is sufficiently congested. It's not a perfect solution since blk_plug bios are submitted when scheduling; we might want a way to disable the "submit on context switch" behaviour, or switch to our own plugging in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Fix warning about passing flex array member by value | Kent Overstreet | 1 | -5/+5
This showed up when building in userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_journal_meta() takes ref on c->writes | Kent Overstreet | 3 | -13/+19
This is part of addressing https://github.com/koverstreet/bcachefs/issues/656 where we're getting stuck in bch2_journal_meta() in the dump tool. We shouldn't be invoking the journal without a ref on c->writes (if we're not RW), and there's no reason for the dump tool to be going read-write. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: -o norecovery now bails out of recovery earlier | Kent Overstreet | 1 | -2/+7
-o norecovery (used by the dump tool) should be doing the absolute minimum amount of work to get the filesystem up and readable; we shouldn't be running check and repair code, or going read-write. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Refactor new stripe path to reduce dependencies on ec_stripe_head | Kent Overstreet | 1 | -92/+104
We need to add a path for reshaping existing stripes (for e.g. device removal), and this new path won't necessarily use ec_stripe_head. Refactor the code to avoid unnecessary references to it for clarity. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Avoid bch2_btree_id_str() | Kent Overstreet | 15 | -92/+140
Prefer bch2_btree_id_to_text() - it prints out the integer ID when unknown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
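The advantage of a "to_text" style helper over a plain name lookup is that an out-of-range ID can still be reported with its numeric value. A minimal sketch of that idea follows. The name table, function name and output format are hypothetical and do not reproduce bch2_btree_id_to_text().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name table, for illustration only. */
static const char * const id_names_sketch[] = { "extents", "inodes", "dirents" };
#define NR_IDS_SKETCH (sizeof(id_names_sketch) / sizeof(id_names_sketch[0]))

/* Writes the name for a known ID, or the integer itself for an unknown one. */
static void id_to_text_sketch(char *buf, size_t len, unsigned id)
{
	if (id < NR_IDS_SKETCH)
		snprintf(buf, len, "%s", id_names_sketch[id]);
	else
		snprintf(buf, len, "(unknown btree %u)", id);
}
```

A function that returns a constant string cannot carry the number, so an unknown ID would otherwise collapse into a generic "unknown" message.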
2024-12-21 | bcachefs: better error message in check_snapshot_tree() | Kent Overstreet | 1 | -3/+15
If we find a snapshot node and it didn't match the snapshot tree, we should print it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Factor out jset_entry_log_msg_bytes() | Kent Overstreet | 2 | -2/+10
Needed for improved userspace cmd_list_journal Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: improved bkey_val_copy() | Kent Overstreet | 1 | -15/+13
Factor out some common code, add typechecking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_btree_lost_data() now uses run_explicit_recovery_pass_persistent() | Kent Overstreet | 4 | -23/+54
Also get a bit more fine grained about which passes to run for which btrees. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Add locking for bch_fs.curr_recovery_pass | Kent Overstreet | 3 | -19/+59
Recovery can rewind in certain situations - when we discover we need to run a pass that doesn't normally run. This can happen from another thread for btree node read errors, so we need a bit of locking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
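Why a rewind requested from another thread needs locking can be shown with a small userspace sketch. It uses a pthread mutex and hypothetical names; the kernel code uses its own locking primitives and a more involved pass-selection policy than the "rewind to the earliest requested pass" rule assumed here.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical recovery state; pass numbers start at 1, 0 means "none". */
struct recovery_sketch {
	pthread_mutex_t lock;
	unsigned curr_pass;
	unsigned rewind_to;
};

/* May be called from any thread, e.g. on a btree node read error. */
static void request_pass_sketch(struct recovery_sketch *r, unsigned pass)
{
	pthread_mutex_lock(&r->lock);
	/* Only an already-completed pass needs a rewind; keep the earliest. */
	if (pass < r->curr_pass && (!r->rewind_to || pass < r->rewind_to))
		r->rewind_to = pass;
	pthread_mutex_unlock(&r->lock);
}

/* Called by the recovery thread after a pass; returns the next pass. */
static unsigned advance_pass_sketch(struct recovery_sketch *r)
{
	unsigned next;

	pthread_mutex_lock(&r->lock);
	if (r->rewind_to) {
		r->curr_pass = r->rewind_to;
		r->rewind_to = 0;
	} else {
		r->curr_pass++;
	}
	next = r->curr_pass;
	pthread_mutex_unlock(&r->lock);
	return next;
}
```

Without the lock, a request could race with the increment and either be lost or rewind to a stale position.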
2024-12-21 | bcachefs: lru, accounting are alloc btrees | Kent Overstreet | 1 | -0/+2
They can be regenerated by fsck and don't require a btree node scan, like other alloc btrees. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: bch2_run_explicit_recovery_pass() returns different error when not in recovery | Kent Overstreet | 2 | -1/+6
If we're not in recovery then there's no way to rewind recovery; give this a different errcode so that any error messages will give us a better idea of what happened. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: add more path idx debug asserts | Kent Overstreet | 1 | -0/+2
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 | bcachefs: Use FOREACH_ACL_ENTRY() macro to iterate over acl entries | Thorsten Blum | 1 | -8/+3
Use the existing FOREACH_ACL_ENTRY() macro to iterate over POSIX acl entries and remove the custom acl_for_each_entry() macro. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
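The iteration pattern behind such a macro is a pointer walk from the first entry to one past the last. The sketch below mimics that shape in userspace with hypothetical type, field and macro names; it is modelled on the general form of the kernel macro but is not the kernel's definition.

```c
#include <assert.h>

/* Hypothetical simplified ACL types, for illustration only. */
struct acl_entry_sketch {
	short tag;
	unsigned short perm;
};

struct acl_sketch {
	unsigned count;
	struct acl_entry_sketch entries[4];
};

/* Walk pa over all entries of acl; pe holds the one-past-the-end pointer. */
#define FOREACH_ENTRY_SKETCH(pa, acl, pe) \
	for (pa = (acl)->entries, pe = pa + (acl)->count; pa < pe; pa++)

/* Example user: count entries carrying a given tag. */
static unsigned count_tag_sketch(const struct acl_sketch *acl, short tag)
{
	const struct acl_entry_sketch *pa, *pe;
	unsigned n = 0;

	FOREACH_ENTRY_SKETCH(pa, acl, pe)
		if (pa->tag == tag)
			n++;
	return n;
}
```

Reusing one shared macro instead of a file-local copy keeps the loop bounds defined in a single place.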
2024-12-21 | bcachefs: Remove duplicate included headers | Thorsten Blum | 1 | -2/+0
The header files dirent_format.h and disk_groups_format.h are included twice. Remove the redundant includes and the following warnings reported by make includecheck:
  disk_groups_format.h is included more than once
  dirent_format.h is included more than once
Reviewed-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>