diff options
author | Josef Bacik <josef@toxicpanda.com> | 2017-10-19 21:15:55 +0300 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2017-11-01 22:45:35 +0300 |
commit | 8b62f87bad9cf06e536799bf8cb942ab95f6bfa4 (patch) | |
tree | c3c069a451a3cc3f29637ed540583b914f05f5ad /fs/btrfs/ordered-data.c | |
parent | b115e3bc81aeb624fe7c4eccecbd094601ebde84 (diff) | |
download | linux-8b62f87bad9cf06e536799bf8cb942ab95f6bfa4.tar.xz |
Btrfs: rework outstanding_extents
Right now we do a lot of weird hoops around outstanding_extents in order
to keep the extent count consistent. This is because we logically
transfer the outstanding_extent count from the initial reservation
through the set_delalloc_bits. This makes it pretty difficult to get a
handle on how and when we need to mess with outstanding_extents.
Fix this by revamping the rules of how we deal with outstanding_extents.
Now instead everybody that is holding on to a delalloc extent is
required to increase the outstanding extents count for itself. This
means we'll have something like this
btrfs_delalloc_reserve_metadata - outstanding_extents = 1
btrfs_set_extent_delalloc - outstanding_extents = 2
btrfs_release_delalloc_extents - outstanding_extents = 1
for an initial file write. Now take the append write where we extend an
existing delalloc range but still under the maximum extent size
btrfs_delalloc_reserve_metadata - outstanding_extents = 2
btrfs_set_extent_delalloc
btrfs_set_bit_hook - outstanding_extents = 3
btrfs_merge_extent_hook - outstanding_extents = 2
btrfs_delalloc_release_extents - outstanding_extnets = 1
In order to make the ordered extent transition we of course must now
make ordered extents carry their own outstanding_extent reservation, so
for cow_file_range we end up with
btrfs_add_ordered_extent - outstanding_extents = 2
clear_extent_bit - outstanding_extents = 1
btrfs_remove_ordered_extent - outstanding_extents = 0
This makes all manipulations of outstanding_extents much more explicit.
Every successful call to btrfs_delalloc_reserve_metadata _must_ now be
combined with btrfs_release_delalloc_extents, even in the error case, as
that is the only function that actually modifies the
outstanding_extents counter.
The drawback to this is now we are much more likely to have transient
cases where outstanding_extents is much larger than it actually should
be. This could happen before as we manipulated the delalloc bits, but
now it happens basically at every write. This may put more pressure on
the ENOSPC flushing code, but I think making this code simpler is worth
the cost. I have another change coming to mitigate this side-effect
somewhat.
I also added trace points for the counter manipulation. These were used
by a bpf script I wrote to help track down leak issues.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs/ordered-data.c')
-rw-r--r-- | fs/btrfs/ordered-data.c | 21 |
1 files changed, 19 insertions, 2 deletions
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index a3aca495e33e..5b311aeddcc8 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -242,6 +242,15 @@ static int __btrfs_add_ordered_extent(struct inode *inode, u64 file_offset, } spin_unlock(&root->ordered_extent_lock); + /* + * We don't need the count_max_extents here, we can assume that all of + * that work has been done at higher layers, so this is truly the + * smallest the extent is going to get. + */ + spin_lock(&BTRFS_I(inode)->lock); + btrfs_mod_outstanding_extents(BTRFS_I(inode), 1); + spin_unlock(&BTRFS_I(inode)->lock); + return 0; } @@ -591,11 +600,19 @@ void btrfs_remove_ordered_extent(struct inode *inode, { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ordered_inode_tree *tree; - struct btrfs_root *root = BTRFS_I(inode)->root; + struct btrfs_inode *btrfs_inode = BTRFS_I(inode); + struct btrfs_root *root = btrfs_inode->root; struct rb_node *node; bool dec_pending_ordered = false; - tree = &BTRFS_I(inode)->ordered_tree; + /* This is paired with btrfs_add_ordered_extent. */ + spin_lock(&btrfs_inode->lock); + btrfs_mod_outstanding_extents(btrfs_inode, -1); + spin_unlock(&btrfs_inode->lock); + if (root != fs_info->tree_root) + btrfs_delalloc_release_metadata(btrfs_inode, entry->len); + + tree = &btrfs_inode->ordered_tree; spin_lock_irq(&tree->lock); node = &entry->rb_node; rb_erase(node, &tree->tree); |