diff options
author | Qu Wenruo <wqu@suse.com> | 2021-04-06 03:27:18 +0300 |
---|---|---|
committer | David Sterba <dsterba@suse.com> | 2021-06-21 16:19:08 +0300 |
commit | 266a258678b9f254647f4297843cfbfbddde220a (patch) | |
tree | db635d27ef532030806f5cef10d29ff4362ed66f /fs/btrfs | |
parent | e65f152e43484807b4caf7300e70d882e4652566 (diff) | |
download | linux-266a258678b9f254647f4297843cfbfbddde220a.tar.xz |
btrfs: update comments in btrfs_invalidatepage()
The existing comments in btrfs_invalidatepage() don't really get to the
point, especially for what Private2 is really representing and how the
race avoidance is done.
The truth is, there are only three entrances to do ordered extent
accounting:
- btrfs_writepage_endio_finish_ordered()
- __endio_write_update_ordered()
Those two entrance are just endio functions for dio and buffered
write.
- btrfs_invalidatepage()
But there is a pitfall, in endio functions there is no check on whether
the ordered extent is already accounted.
They just blindly clear the Private2 bit and do the accounting.
So it's all btrfs_invalidatepage()'s responsibility to make sure we
won't do double account for the same sector.
That's why in btrfs_invalidatepage() we have to wait for page writeback,
this will ensure all submitted bios have finished, thus their endio
functions have finished the accounting on the ordered extent.
Then we also check page Private2 to ensure that, we only run ordered
extent accounting on pages who has no bio submitted.
This patch will rework related comments to make it more clear on the
race and how we use wait_on_page_writeback() and Private2 to prevent
double accounting on ordered extent.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Diffstat (limited to 'fs/btrfs')
-rw-r--r-- | fs/btrfs/inode.c | 21 |
1 files changed, 15 insertions, 6 deletions
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index c6243d242bc9..e86a6113e149 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8329,11 +8329,16 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, bool completed_ordered = false; /* - * we have the page locked, so new writeback can't start, - * and the dirty bit won't be cleared while we are here. + * We have page locked so no new ordered extent can be created on this + * page, nor bio can be submitted for this page. * - * Wait for IO on this page so that we can safely clear - * the PagePrivate2 bit and do ordered accounting + * But already submitted bio can still be finished on this page. + * Furthermore, endio function won't skip page which has Private2 + * already cleared, so it's possible for endio and invalidatepage to do + * the same ordered extent accounting twice on one page. + * + * So here we wait for any submitted bios to finish, so that we won't + * do double ordered extent accounting on the same page. */ wait_on_page_writeback(page); @@ -8363,8 +8368,12 @@ again: EXTENT_LOCKED | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 0, &cached_state); /* - * whoever cleared the private bit is responsible - * for the finish_ordered_io + * A page with Private2 bit means no bio has been submitted + * covering the page, thus we have to manually do the ordered + * extent accounting. + * + * For page without Private2, the ordered extent accounting is + * done in its endio function of the submitted bio. */ if (TestClearPagePrivate2(page)) { spin_lock_irq(&inode->ordered_tree.lock); |