<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/fs/btrfs/delalloc-space.c, branch linux-6.0.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.0.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-6.0.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2022-07-25T15:45:41+00:00</updated>
<entry>
<title>btrfs: convert count_max_extents() to use fs_info-&gt;max_extent_size</title>
<updated>2022-07-25T15:45:41+00:00</updated>
<author>
<name>Naohiro Aota</name>
<email>naohiro.aota@wdc.com</email>
</author>
<published>2022-07-08T23:18:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7d7672bc5d1038c745716c397d892d21e29de71c'/>
<id>urn:sha1:7d7672bc5d1038c745716c397d892d21e29de71c</id>
<content type='text'>
If count_max_extents() uses BTRFS_MAX_EXTENT_SIZE to calculate the number
of extents needed, btrfs release the metadata reservation too much on its
way to write out the data.

Now that BTRFS_MAX_EXTENT_SIZE is replaced with fs_info-&gt;max_extent_size,
convert count_max_extents() to use it instead, and fix the calculation of
the metadata reservation.

CC: stable@vger.kernel.org # 5.12+
Fixes: d8e3fb106f39 ("btrfs: zoned: use ZONE_APPEND write for zoned mode")
Signed-off-by: Naohiro Aota &lt;naohiro.aota@wdc.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: avoid blocking on space revervation when doing nowait dio writes</title>
<updated>2022-05-16T15:03:10+00:00</updated>
<author>
<name>Filipe Manana</name>
<email>fdmanana@suse.com</email>
</author>
<published>2022-03-23T16:19:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d4135134ab8feb994369d44884733e8031b0f800'/>
<id>urn:sha1:d4135134ab8feb994369d44884733e8031b0f800</id>
<content type='text'>
When doing a NOWAIT direct IO write, if we can NOCOW then it means we can
proceed with the non-blocking, NOWAIT path. However reserving the metadata
space and qgroup meta space can often result in blocking - flushing
delalloc, wait for ordered extents to complete, trigger transaction
commits, etc, going against the semantics of a NOWAIT write.

So make the NOWAIT write path to try to reserve all the metadata it needs
without resulting in a blocking behaviour - if we get -ENOSPC or -EDQUOT
then return -EAGAIN to make the caller fallback to a blocking direct IO
write.

This is part of a patchset comprised of the following patches:

  btrfs: avoid blocking on page locks with nowait dio on compressed range
  btrfs: avoid blocking nowait dio when locking file range
  btrfs: avoid double nocow check when doing nowait dio writes
  btrfs: stop allocating a path when checking if cross reference exists
  btrfs: free path at can_nocow_extent() before checking for checksum items
  btrfs: release path earlier at can_nocow_extent()
  btrfs: avoid blocking when allocating context for nowait dio read/write
  btrfs: avoid blocking on space revervation when doing nowait dio writes

The following test was run before and after applying this patchset:

  $ cat io-uring-nodatacow-test.sh
  #!/bin/bash

  DEV=/dev/sdc
  MNT=/mnt/sdc

  MOUNT_OPTIONS="-o ssd -o nodatacow"
  MKFS_OPTIONS="-R free-space-tree -O no-holes"

  NUM_JOBS=4
  FILE_SIZE=8G
  RUN_TIME=300

  cat &lt;&lt;EOF &gt; /tmp/fio-job.ini
  [io_uring_rw]
  rw=randrw
  fsync=0
  fallocate=posix
  group_reporting=1
  direct=1
  ioengine=io_uring
  iodepth=64
  bssplit=4k/20:8k/20:16k/20:32k/10:64k/10:128k/5:256k/5:512k/5:1m/5
  filesize=$FILE_SIZE
  runtime=$RUN_TIME
  time_based
  filename=foobar
  directory=$MNT
  numjobs=$NUM_JOBS
  thread
  EOF

  echo performance | \
     tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

  umount $MNT &amp;&gt; /dev/null
  mkfs.btrfs -f $MKFS_OPTIONS $DEV &amp;&gt; /dev/null
  mount $MOUNT_OPTIONS $DEV $MNT

  fio /tmp/fio-job.ini

  umount $MNT

The test was run a 12 cores box with 64G of ram, using a non-debug kernel
config (Debian's default config) and a spinning disk.

Result before the patchset:

 READ: bw=407MiB/s (427MB/s), 407MiB/s-407MiB/s (427MB/s-427MB/s), io=119GiB (128GB), run=300175-300175msec
WRITE: bw=407MiB/s (427MB/s), 407MiB/s-407MiB/s (427MB/s-427MB/s), io=119GiB (128GB), run=300175-300175msec

Result after the patchset:

 READ: bw=436MiB/s (457MB/s), 436MiB/s-436MiB/s (457MB/s-457MB/s), io=128GiB (137GB), run=300044-300044msec
WRITE: bw=435MiB/s (456MB/s), 435MiB/s-435MiB/s (456MB/s-456MB/s), io=128GiB (137GB), run=300044-300044msec

That's about +7.2% throughput for reads and +6.9% for writes.

Signed-off-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: support different disk extent size for delalloc</title>
<updated>2022-03-14T12:13:51+00:00</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2019-11-19T06:45:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=28c9b1e75aa1794897a9cb709f20f27f4aadebac'/>
<id>urn:sha1:28c9b1e75aa1794897a9cb709f20f27f4aadebac</id>
<content type='text'>
Currently, we always reserve the same extent size in the file and extent
size on disk for delalloc because the former is the worst case for the
latter. For BTRFS_IOC_ENCODED_WRITE writes, we know the exact size of
the extent on disk, which may be less than or greater than (for
bookends) the size in the file. Add a disk_num_bytes parameter to
btrfs_delalloc_reserve_metadata() so that we can reserve the correct
amount of csum bytes. No functional change.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: change root to fs_info for btrfs_reserve_metadata_bytes</title>
<updated>2022-01-03T14:09:45+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2021-11-09T15:12:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9270501c163b67f48fd5fb2f98b1b48d6530f460'/>
<id>urn:sha1:9270501c163b67f48fd5fb2f98b1b48d6530f460</id>
<content type='text'>
We used to need the root for btrfs_reserve_metadata_bytes to check the
orphan cleanup state, but we no longer need that, we simply need the
fs_info.  Change btrfs_reserve_metadata_bytes() to use the fs_info, and
change both btrfs_block_rsv_refill() and btrfs_block_rsv_add() to do the
same as they simply call btrfs_reserve_metadata_bytes() and then
manipulate the block_rsv that is being used.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: free exchange changeset on failures</title>
<updated>2021-12-08T14:42:32+00:00</updated>
<author>
<name>Johannes Thumshirn</name>
<email>johannes.thumshirn@wdc.com</email>
</author>
<published>2021-12-03T10:55:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=da5e817d9d75422eaaa05490d0b9a5e328fc1a51'/>
<id>urn:sha1:da5e817d9d75422eaaa05490d0b9a5e328fc1a51</id>
<content type='text'>
Fstests runs on my VMs have show several kmemleak reports like the following.

  unreferenced object 0xffff88811ae59080 (size 64):
    comm "xfs_io", pid 12124, jiffies 4294987392 (age 6.368s)
    hex dump (first 32 bytes):
      00 c0 1c 00 00 00 00 00 ff cf 1c 00 00 00 00 00  ................
      90 97 e5 1a 81 88 ff ff 90 97 e5 1a 81 88 ff ff  ................
    backtrace:
      [&lt;00000000ac0176d2&gt;] ulist_add_merge+0x60/0x150 [btrfs]
      [&lt;0000000076e9f312&gt;] set_state_bits+0x86/0xc0 [btrfs]
      [&lt;0000000014fe73d6&gt;] set_extent_bit+0x270/0x690 [btrfs]
      [&lt;000000004f675208&gt;] set_record_extent_bits+0x19/0x20 [btrfs]
      [&lt;00000000b96137b1&gt;] qgroup_reserve_data+0x274/0x310 [btrfs]
      [&lt;0000000057e9dcbb&gt;] btrfs_check_data_free_space+0x5c/0xa0 [btrfs]
      [&lt;0000000019c4511d&gt;] btrfs_delalloc_reserve_space+0x1b/0xa0 [btrfs]
      [&lt;000000006d37e007&gt;] btrfs_dio_iomap_begin+0x415/0x970 [btrfs]
      [&lt;00000000fb8a74b8&gt;] iomap_iter+0x161/0x1e0
      [&lt;0000000071dff6ff&gt;] __iomap_dio_rw+0x1df/0x700
      [&lt;000000002567ba53&gt;] iomap_dio_rw+0x5/0x20
      [&lt;0000000072e555f8&gt;] btrfs_file_write_iter+0x290/0x530 [btrfs]
      [&lt;000000005eb3d845&gt;] new_sync_write+0x106/0x180
      [&lt;000000003fb505bf&gt;] vfs_write+0x24d/0x2f0
      [&lt;000000009bb57d37&gt;] __x64_sys_pwrite64+0x69/0xa0
      [&lt;000000003eba3fdf&gt;] do_syscall_64+0x43/0x90

In case brtfs_qgroup_reserve_data() or btrfs_delalloc_reserve_metadata()
fail the allocated extent_changeset will not be freed.

So in btrfs_check_data_free_space() and btrfs_delalloc_reserve_space()
free the allocated extent_changeset to get rid of the allocated memory.

The issue currently only happens in the direct IO write path, but only
after 65b3c08606e5 ("btrfs: fix ENOSPC failure when attempting direct IO
write into NOCOW range"), and also at defrag_one_locked_target(). Every
other place is always calling extent_changeset_free() even if its call
to btrfs_delalloc_reserve_space() or btrfs_check_data_free_space() has
failed.

CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Filipe Manana &lt;fdmanana@suse.com&gt;
Signed-off-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: fix typos in comments</title>
<updated>2021-06-22T12:11:57+00:00</updated>
<author>
<name>David Sterba</name>
<email>dsterba@suse.com</email>
</author>
<published>2021-05-21T15:42:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1a9fd4172d5c8ba64735b3aef7eed643d398ce05'/>
<id>urn:sha1:1a9fd4172d5c8ba64735b3aef7eed643d398ce05</id>
<content type='text'>
Fix typos that have snuck in since the last round. Found by codespell.

Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: fix parameter description of btrfs_inode_rsv_release/btrfs_delalloc_release_space</title>
<updated>2021-02-08T21:58:54+00:00</updated>
<author>
<name>Nikolay Borisov</name>
<email>nborisov@suse.com</email>
</author>
<published>2021-01-22T09:58:01+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b762d1d08dacdc444ffd6417fc17805408da7af4'/>
<id>urn:sha1:b762d1d08dacdc444ffd6417fc17805408da7af4</id>
<content type='text'>
Fixes following warnings:

fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'inode' not described in 'btrfs_inode_rsv_release'
fs/btrfs/delalloc-space.c:205: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_inode_rsv_release'
fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'reserved' not described in 'btrfs_delalloc_release_space'
fs/btrfs/delalloc-space.c:472: warning: Function parameter or member 'qgroup_free' not described in 'btrfs_delalloc_release_space'
fs/btrfs/delalloc-space.c:472: warning: Excess function parameter 'release_bytes' description in 'btrfs_delalloc_release_space'

Signed-off-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: add btrfs_reserve_data_bytes and use it</title>
<updated>2020-10-07T10:06:52+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-07-21T14:22:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8698fc4eb7884eeba29adc4193f9d05fa2bed16b'/>
<id>urn:sha1:8698fc4eb7884eeba29adc4193f9d05fa2bed16b</id>
<content type='text'>
Create a new function btrfs_reserve_data_bytes() in order to handle data
reservations.  This uses the new flush types and flush states to handle
making data reservations.

This patch specifically does not change any functionality, and is
purposefully not cleaned up in order to make bisection easier for the
future patches.  The new helper is identical to the old helper in how it
handles data reservations.  We first try to force a chunk allocation,
and then we run through the flush states all at once and in the same
order that they were done with the old helper.

Subsequent patches will clean this up and change the behavior of the
flushing, and it is important to keep those changes separate so we can
easily bisect down to the patch that caused the regression, rather than
the patch that made us start using the new infrastructure.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Tested-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: use the btrfs_space_info_free_bytes_may_use helper for delalloc</title>
<updated>2020-10-07T10:06:52+00:00</updated>
<author>
<name>Josef Bacik</name>
<email>josef@toxicpanda.com</email>
</author>
<published>2020-07-21T14:22:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=39753e4a3a43c7d14a17f626f32d4a53ee59d563'/>
<id>urn:sha1:39753e4a3a43c7d14a17f626f32d4a53ee59d563</id>
<content type='text'>
We are going to use the ticket infrastructure for data, so use the
btrfs_space_info_free_bytes_may_use() helper in
btrfs_free_reserved_data_space_noquota() so we get the
btrfs_try_granting_tickets call when we free our reservation.

Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Tested-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Signed-off-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>btrfs: make btrfs_delalloc_reserve_space take btrfs_inode</title>
<updated>2020-07-27T10:55:36+00:00</updated>
<author>
<name>Nikolay Borisov</name>
<email>nborisov@suse.com</email>
</author>
<published>2020-06-03T05:55:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5b7231e2009967b63a39ce0ec6a022307616b82'/>
<id>urn:sha1:e5b7231e2009967b63a39ce0ec6a022307616b82</id>
<content type='text'>
All of its children take btrfs_inode so bubble up this requirement to
btrfs_delalloc_reserve_space's interface and stop calling BTRFS_I
internally.

Signed-off-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
</feed>
