| author | Qu Wenruo <wqu@suse.com> | 2025-11-11 01:41:59 +0300 |
|---|---|---|
| committer | David Sterba <dsterba@suse.com> | 2025-11-25 00:42:23 +0300 |
| commit | 2574e9011018a1d6d3da8d03d0bfc4e2675dee2a | |
| tree | 9c410d043b981d9aaaf8d5b102dcc0d3a95f47a3 | |
| parent | 62bcbdca0ea9b1add9c22f400b51c56184902053 | |
| download | linux-2574e9011018a1d6d3da8d03d0bfc4e2675dee2a.tar.xz | |
btrfs: make btrfs_repair_io_failure() handle bs > ps cases without large folios
Currently btrfs_repair_io_failure() only accepts a single @paddr
parameter, and for bs > ps cases @paddr must be backed by a large
folio.
That assumption imposes quite a few limitations, preventing us from
utilizing true zero-copy direct-io and encoded read/writes.
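For context, the pre-patch interface looks roughly like the sketch below; the parameter list is approximated from the description here rather than copied from the tree:

```c
/*
 * Approximate pre-patch shape: a single @paddr must cover the whole
 * @length, so for bs > ps the backing memory has to be physically
 * contiguous, i.e. a large folio.
 */
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
			    u64 length, u64 logical, phys_addr_t paddr,
			    int mirror_num);
```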
To address the problem, enhance btrfs_repair_io_failure() by:
- Accept an array of paddrs, up to 64K / PAGE_SIZE entries
  This acts a bit like a bio_vec, but with very few entries, as the
  function is only used to repair one fs data block or one tree block.
  Both have an upper size limit (BTRFS_MAX_BLOCK_SIZE, i.e. 64K), so we
  don't need a full bio_vec to handle it (see the prototype sketch after
  this list).
- Allocate a bio with multiple slots
  Previously, even for bs > ps cases we only passed in a contiguous
  physical address range, so a single slot was enough.
  That is no longer the case, so we have to allocate a bio structure
  rather than using the on-stack one.
- Use on-stack memory for the @paddrs array
  It's at most 16 entries (4K page size, 64K block size), taking up at
  most 128 bytes, so the on-stack cost is still acceptable.
- Add one extra check to make sure the repair bio is exactly one block
- Utilize btrfs_repair_io_failure() to submit a single bio for metadata
  This should improve read-repair performance for metadata, as we now
  submit one node-sized bio and wait once, rather than submitting and
  waiting for each block of the metadata separately (see the caller
  sketch after this list).
- Add one extra parameter indicating the step
  This is because the metadata step can be as large as nodesize instead
  of sectorsize, so we need a way to distinguish metadata repair from
  data repair.
- Reduce the width of the @length parameter of btrfs_repair_io_failure()
  Since we only call btrfs_repair_io_failure() on a single data or
  metadata block, u64 is overkill.
  Use u32 instead and add an extra ASSERT() to make sure the length
  never exceeds BTRFS_MAX_BLOCK_SIZE.
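Below is a minimal sketch of what the reworked interface could look like. The macro name BTRFS_REPAIR_MAX_PADDRS, the parameter order and the @step parameter name are assumptions for illustration, not taken from the patch:

```c
/*
 * At most 64K / PAGE_SIZE entries: with 4K pages and a 64K block that is
 * 16 phys_addr_t entries, i.e. 128 bytes on the caller's stack.
 * (Hypothetical macro name, for illustration only.)
 */
#define BTRFS_REPAIR_MAX_PADDRS	(BTRFS_MAX_BLOCK_SIZE / PAGE_SIZE)

/*
 * @paddrs:  one physical address per page-sized chunk of the block
 * @length:  exactly one data or tree block, hence u32 instead of u64
 * @step:    sectorsize for data repair, nodesize for metadata repair
 */
int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
			    u32 length, u64 logical, phys_addr_t paddrs[],
			    u32 step, int mirror_num)
{
	/* The repair range must never exceed one block. */
	ASSERT(length <= BTRFS_MAX_BLOCK_SIZE);

	/* ... allocate a bio with enough slots, add every paddr, submit ... */
	return 0;
}
```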
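And a hypothetical metadata caller built on top of it. The extent buffer folio layout and helpers such as num_extent_folios() are assumptions about the surrounding code, and sub-page offset handling is omitted for brevity:

```c
/*
 * Hypothetical caller: repair one tree block with a single node-sized
 * submission, assuming each folio backing the eb is a single page (the
 * non-large-folio bs > ps case this patch enables).
 */
static int repair_one_eb_mirror(struct extent_buffer *eb, int mirror_num)
{
	struct btrfs_fs_info *fs_info = eb->fs_info;
	const u32 nodesize = fs_info->nodesize;
	/* On-stack array, at most 128 bytes with 4K pages and 64K blocks. */
	phys_addr_t paddrs[BTRFS_REPAIR_MAX_PADDRS] = { 0 };
	int i;

	for (i = 0; i < num_extent_folios(eb); i++)
		paddrs[i] = page_to_phys(folio_page(eb->folios[i], 0));

	/* One node-sized bio and one wait, instead of one wait per block. */
	return btrfs_repair_io_failure(fs_info, btrfs_header_owner(eb),
				       eb->start, nodesize, eb->start,
				       paddrs, nodesize, mirror_num);
}
```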
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
