| author | Qu Wenruo <wqu@suse.com> | 2025-11-11 01:41:59 +0300 |
|---|---|---|
| committer | David Sterba <dsterba@suse.com> | 2025-11-25 00:42:23 +0300 |
| commit | 2574e9011018a1d6d3da8d03d0bfc4e2675dee2a | |
| tree | 9c410d043b981d9aaaf8d5b102dcc0d3a95f47a3 | |
| parent | 62bcbdca0ea9b1add9c22f400b51c56184902053 | |
| download | linux-2574e9011018a1d6d3da8d03d0bfc4e2675dee2a.tar.xz | |
btrfs: make btrfs_repair_io_failure() handle bs > ps cases without large folios
Currently btrfs_repair_io_failure() only accepts a single @paddr
parameter, and for bs > ps cases @paddr must be backed by a large
folio.
That assumption imposes quite a few limitations, preventing us from
utilizing true zero-copy direct-io and encoded read/writes.
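For context, the pre-patch interface looks roughly like the sketch below; the parameter list is approximated from the description here rather than copied from the tree:

```c
/*
 * Approximate pre-patch shape: a single @paddr must cover the whole
 * @length, so for bs > ps the backing memory has to be physically
 * contiguous, i.e. a large folio.
 */
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
			    u64 length, u64 logical, phys_addr_t paddr,
			    int mirror_num);
```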
To address the problem, enhance btrfs_repair_io_failure() by:
- Accept an array of paddrs, up to 64K / PAGE_SIZE entries
  This acts a bit like a bio_vec, but with very few entries, as the
  function is only used to repair one fs data block or one tree block.
  Both have an upper size limit (BTRFS_MAX_BLOCK_SIZE, i.e. 64K), so we
  don't need a full bio_vec to handle it (see the prototype sketch after
  this list).
- Allocate a bio with multiple slots
  Previously, even for bs > ps cases we only passed in a contiguous
  physical address range, so a single slot was enough.
  That is no longer the case, so we have to allocate a bio structure
  rather than using the on-stack one.
- Use on-stack memory for the @paddrs array
  It's at most 16 entries (4K page size, 64K block size), taking up at
  most 128 bytes, so the on-stack cost is still acceptable.
- Add one extra check to make sure the repair bio is exactly one block
- Utilize btrfs_repair_io_failure() to submit a single bio for metadata
  This should improve read-repair performance for metadata, as we now
  submit one node-sized bio and wait once, rather than submitting and
  waiting for each block of the metadata separately (see the caller
  sketch after this list).
- Add one extra parameter indicating the step
  This is because the metadata step can be as large as nodesize instead
  of sectorsize, so we need a way to distinguish metadata repair from
  data repair.
- Reduce the width of the @length parameter of btrfs_repair_io_failure()
  Since we only call btrfs_repair_io_failure() on a single data or
  metadata block, u64 is overkill.
  Use u32 instead and add an extra ASSERT() to make sure the length
  never exceeds BTRFS_MAX_BLOCK_SIZE.
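Below is a minimal sketch of what the reworked interface could look like. The macro name BTRFS_REPAIR_MAX_PADDRS, the parameter order and the @step parameter name are assumptions for illustration, not taken from the patch:

```c
/*
 * At most 64K / PAGE_SIZE entries: with 4K pages and a 64K block that is
 * 16 phys_addr_t entries, i.e. 128 bytes on the caller's stack.
 * (Hypothetical macro name, for illustration only.)
 */
#define BTRFS_REPAIR_MAX_PADDRS	(BTRFS_MAX_BLOCK_SIZE / PAGE_SIZE)

/*
 * @paddrs:  one physical address per page-sized chunk of the block
 * @length:  exactly one data or tree block, hence u32 instead of u64
 * @step:    sectorsize for data repair, nodesize for metadata repair
 */
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
			    u32 length, u64 logical, phys_addr_t paddrs[],
			    u32 step, int mirror_num)
{
	/* The repair range must never exceed one block. */
	ASSERT(length <= BTRFS_MAX_BLOCK_SIZE);

	/* ... allocate a bio with enough slots, add every paddr, submit ... */
	return 0;
}
```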
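And a hypothetical metadata caller built on top of it. The extent buffer folio layout and helpers such as num_extent_folios() are assumptions about the surrounding code, and sub-page offset handling is omitted for brevity:

```c
/*
 * Hypothetical caller: repair one tree block with a single node-sized
 * submission, assuming each folio backing the eb is a single page (the
 * non-large-folio bs > ps case this patch enables).
 */
static int repair_one_eb_mirror(struct extent_buffer *eb, int mirror_num)
{
	struct btrfs_fs_info *fs_info = eb->fs_info;
	const u32 nodesize = fs_info->nodesize;
	/* On-stack array, at most 128 bytes with 4K pages and 64K blocks. */
	phys_addr_t paddrs[BTRFS_REPAIR_MAX_PADDRS] = { 0 };
	int i;

	for (i = 0; i < num_extent_folios(eb); i++)
		paddrs[i] = page_to_phys(folio_page(eb->folios[i], 0));

	/* One node-sized bio and one wait, instead of one wait per block. */
	return btrfs_repair_io_failure(fs_info, btrfs_header_owner(eb),
				       eb->start, nodesize, eb->start,
				       paddrs, nodesize, mirror_num);
}
```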
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
