<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/block/ioctl.c, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-05-29T09:02:01+00:00</updated>
<entry>
<title>block: fix race between set_blocksize and read paths</title>
<updated>2025-05-29T09:02:01+00:00</updated>
<author>
<name>Darrick J. Wong</name>
<email>djwong@kernel.org</email>
</author>
<published>2025-04-23T19:53:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=64f505b08e0cfd8163491c8c082d4f47a88e51d4'/>
<id>urn:sha1:64f505b08e0cfd8163491c8c082d4f47a88e51d4</id>
<content type='text'>
[ Upstream commit c0e473a0d226479e8e925d5ba93f751d8df628e9 ]

With the new large sector size support, it's now the case that
set_blocksize can change i_blksize and the folio order in a manner that
conflicts with a concurrent reader and causes a kernel crash.

Specifically, let's say that udev-worker calls libblkid to detect the
labels on a block device.  The read call can create an order-0 folio to
read the first 4096 bytes from the disk.  But then udev is preempted.

Next, someone tries to mount an 8k-sectorsize filesystem from the same
block device.  The filesystem calls set_blksize, which sets i_blksize to
8192 and the minimum folio order to 1.

Now udev resumes, still holding the order-0 folio it allocated.  It then
tries to schedule a read bio and do_mpage_readahead tries to create
bufferheads for the folio.  Unfortunately, blocks_per_folio == 0 because
the page size is 4096 but the blocksize is 8192 so no bufferheads are
attached and the bh walk never sets bdev.  We then submit the bio with a
NULL block device and crash.

Therefore, truncate the page cache after flushing but before updating
i_blksize.  However, that's not enough -- we also need to lock out file
IO and page faults during the update.  Take both the i_rwsem and the
invalidate_lock in exclusive mode for invalidations, and in shared mode
for read/write operations.

I don't know if this is the correct fix, but xfs/259 found it.

Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Tested-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Link: https://lore.kernel.org/r/174543795699.4139148.2086129139322431423.stgit@frogsfrogsfrogs
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: implement async io_uring discard cmd</title>
<updated>2024-09-11T16:45:28+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2024-09-11T16:34:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=50c52250e2d74b098465841163c18f4b4e9ad430'/>
<id>urn:sha1:50c52250e2d74b098465841163c18f4b4e9ad430</id>
<content type='text'>
io_uring allows implementing custom file specific asynchronous
operations via the fops-&gt;uring_cmd callback, a.k.a. IORING_OP_URING_CMD
requests or just io_uring commands. Use it to add support for async
discards.

Normally, it first tries to queue up bios in a non-blocking context,
and if that fails, we'd retry from a blocking context by returning
-EAGAIN to the core io_uring. We always get the result from bios
asynchronously by setting a custom bi_end_io callback, at which point
we drag the request into the task context to either reissue or complete
it and post a completion to the user.

Unlike ioctl(BLKDISCARD) with stronger guarantees against races, we only
do a best effort attempt to invalidate page cache, and it can race with
any writes and reads and leave page cache stale. It's the same kind of
races we allow to direct writes.

Also, apart from cases where discarding is not allowed at all, e.g.
discards are not supported or the file/device is read only, the user
should assume that the sector range on disk is not valid anymore, even
when an error was returned to the user.

Suggested-by: Conrad Meyer &lt;conradmeyer@meta.com&gt;
Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/2b5210443e4fa0257934f73dfafcc18a77cd0e09.1726072086.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: introduce blk_validate_byte_range()</title>
<updated>2024-09-11T16:45:07+00:00</updated>
<author>
<name>Pavel Begunkov</name>
<email>asml.silence@gmail.com</email>
</author>
<published>2024-09-11T16:34:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a07210bbcb35c0075830bb94b5321978164cb60'/>
<id>urn:sha1:7a07210bbcb35c0075830bb94b5321978164cb60</id>
<content type='text'>
In preparation to further changes extract a helper function out of
blk_ioctl_discard() that validates if we can do IO against the given
range of disk byte addresses.

Signed-off-by: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Link: https://lore.kernel.org/r/19a7779323c71e742a2f511e4cf49efcfd68cfd4.1726072086.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix integer overflow in BLKSECDISCARD</title>
<updated>2024-09-04T14:01:30+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2024-09-03T19:48:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=697ba0b6ec4ae04afb67d3911799b5e2043b4455'/>
<id>urn:sha1:697ba0b6ec4ae04afb67d3911799b5e2043b4455</id>
<content type='text'>
I independently rediscovered

	commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155
	block: fix overflow in blk_ioctl_discard()

but for secure erase.

Same problem:

	uint64_t r[2] = {512, 18446744073709551104ULL};
	ioctl(fd, BLKSECDISCARD, r);

will enter near infinite loop inside blkdev_issue_secure_erase():

	a.out: attempt to access beyond end of device
	loop0: rw=5, sector=3399043073, nr_sectors = 1024 limit=2048
	bio_check_eod: 3286214 callbacks suppressed

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Link: https://lore.kernel.org/r/9e64057f-650a-46d1-b9f7-34af391536ef@p183
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blk-lib: check for kill signal in ioctl BLKZEROOUT</title>
<updated>2024-07-05T06:53:15+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-07-01T16:51:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf86bcdb40123ee99669ee91b67e023669433a1a'/>
<id>urn:sha1:bf86bcdb40123ee99669ee91b67e023669433a1a</id>
<content type='text'>
Zeroout can access a significant capacity and take longer than the user
expected.  A user may change their mind about wanting to run that
command and attempt to kill the process and do something else with their
device. But since the task is uninterruptable, they have to wait for it
to finish, which could be many hours.

Add a new BLKDEV_ZERO_KILLABLE flag for blkdev_issue_zeroout that checks
for a fatal signal at each iteration so the user doesn't have to wait for
their regretted operation to complete naturally.

Heavily based on an earlier patch from Keith Busch.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Link: https://lore.kernel.org/r/20240701165219.1571322-11-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-05-21T20:02:56+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-05-21T20:02:56+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3413efa8885d7a714c54c6752eaf49fd17d351c9'/>
<id>urn:sha1:3413efa8885d7a714c54c6752eaf49fd17d351c9</id>
<content type='text'>
Pull bdev flags update from Al Viro:
 "Compactifying bdev flags.

  We can easily have up to 24 flags with sane atomicity, _without_
  pushing anything out of the first cacheline of struct block_device"

* tag 'pull-bd_flags-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  bdev: move -&gt;bd_make_it_fail to -&gt;__bd_flags
  bdev: move -&gt;bd_ro_warned to -&gt;__bd_flags
  bdev: move -&gt;bd_has_subit_bio to -&gt;__bd_flags
  bdev: move -&gt;bd_write_holder into -&gt;__bd_flags
  bdev: move -&gt;bd_read_only to -&gt;__bd_flags
  bdev: infrastructure for flags
  wrapper for access to -&gt;bd_partno
  Use bdev_is_paritition() instead of open-coding it
</content>
</entry>
<entry>
<title>Merge tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-05-21T16:51:42+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-05-21T16:51:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=38da32ee70b876f5b8bea7c4135eff46339c18f2'/>
<id>urn:sha1:38da32ee70b876f5b8bea7c4135eff46339c18f2</id>
<content type='text'>
Pull bdev bd_inode updates from Al Viro:
 "Replacement of bdev-&gt;bd_inode with sane(r) set of primitives by me and
  Yu Kuai"

* tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RIP -&gt;bd_inode
  dasd_format(): killing the last remaining user of -&gt;bd_inode
  nilfs_attach_log_writer(): use -&gt;bd_mapping-&gt;host instead of -&gt;bd_inode
  block/bdev.c: use the knowledge of inode/bdev coallocation
  gfs2: more obvious initializations of mapping-&gt;host
  fs/buffer.c: massage the remaining users of -&gt;bd_inode to -&gt;bd_mapping
  blk_ioctl_{discard,zeroout}(): we only want -&gt;bd_inode-&gt;i_mapping here...
  grow_dev_folio(): we only want -&gt;bd_inode-&gt;i_mapping there
  use -&gt;bd_mapping instead of -&gt;bd_inode-&gt;i_mapping
  block_device: add a pointer to struct address_space (page cache of bdev)
  missing helpers: bdev_unhash(), bdev_drop()
  block: move two helpers into bdev.c
  block2mtd: prevent direct access of bd_inode
  dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev-&gt;bd_inode)
  blkdev_write_iter(): saner way to get inode and bdev
  bcachefs: remove dead function bdev_sectors()
  ext4: remove block_device_ejected()
  erofs_buf: store address_space instead of inode
  erofs: switch erofs_bread() to passing offset instead of block number
</content>
</entry>
<entry>
<title>Merge tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2024-05-21T15:34:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-05-21T15:34:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5ad8b6ad9a08abdbc8c57a51a5faaf2ef1afc547'/>
<id>urn:sha1:5ad8b6ad9a08abdbc8c57a51a5faaf2ef1afc547</id>
<content type='text'>
Pull vfs blocksize updates from Al Viro:
 "This gets rid of bogus set_blocksize() uses, switches it over
  to be based on a 'struct file *' and verifies that the caller
  has the device opened exclusively"

* tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  make set_blocksize() fail unless block device is opened exclusive
  set_blocksize(): switch to passing struct file *
  btrfs_get_bdev_and_sb(): call set_blocksize() only for exclusive opens
  swsusp: don't bother with setting block size
  zram: don't bother with reopening - just use O_EXCL for open
  swapon(2): open swap with O_EXCL
  swapon(2)/swapoff(2): don't bother with block size
  pktcdvd: sort set_blocksize() calls out
  bcache_register(): don't bother with set_blocksize()
</content>
</entry>
<entry>
<title>Merge tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux</title>
<updated>2024-05-13T20:03:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-05-13T20:03:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0c9f4ac808b017a0013cee92a30de980550145d5'/>
<id>urn:sha1:0c9f4ac808b017a0013cee92a30de980550145d5</id>
<content type='text'>
Pull block updates from Jens Axboe:

 - Add a partscan attribute in sysfs, fixing an issue with systemd
   relying on an internal interface that went away.

 - Attempt #2 at making long running discards interruptible. The
   previous attempt went into 6.9, but we ended up mostly reverting it
   as it had issues.

 - Remove old ida_simple API in bcache

 - Support for zoned write plugging, greatly improving the performance
   on zoned devices.

 - Remove the old throttle low interface, which has been experimental
   since 2017 and never made it beyond that and isn't being used.

 - Remove page-&gt;index debugging checks in brd, as it hasn't caught
   anything and prepares us for removing in struct page.

 - MD pull request from Song

 - Don't schedule block workers on isolated CPUs

* tag 'for-6.10/block-20240511' of git://git.kernel.dk/linux: (84 commits)
  blk-throttle: delay initialization until configuration
  blk-throttle: remove CONFIG_BLK_DEV_THROTTLING_LOW
  block: fix that util can be greater than 100%
  block: support to account io_ticks precisely
  block: add plug while submitting IO
  bcache: fix variable length array abuse in btree_iter
  bcache: Remove usage of the deprecated ida_simple_xx() API
  md: Revert "md: Fix overflow in is_mddev_idle"
  blk-lib: check for kill signal in ioctl BLKDISCARD
  block: add a bio_await_chain helper
  block: add a blk_alloc_discard_bio helper
  block: add a bio_chain_and_submit helper
  block: move discard checks into the ioctl handler
  block: remove the discard_granularity check in __blkdev_issue_discard
  block/ioctl: prefer different overflow check
  null_blk: Fix the WARNING: modpost: missing MODULE_DESCRIPTION()
  block: fix and simplify blkdevparts= cmdline parsing
  block: refine the EOF check in blkdev_iomap_begin
  block: add a partscan sysfs attribute for disks
  block: add a disk_has_partscan helper
  ...
</content>
</entry>
<entry>
<title>blk-lib: check for kill signal in ioctl BLKDISCARD</title>
<updated>2024-05-07T13:29:42+00:00</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-05-06T04:20:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=719c15a75ebf3bda3ca718fe8e0ce63d262ec7ae'/>
<id>urn:sha1:719c15a75ebf3bda3ca718fe8e0ce63d262ec7ae</id>
<content type='text'>
Discards can access a significant capacity and take longer than the user
expected.  A user may change their mind about wanting to run that command
and attempt to kill the process and do something else with their device.
But since the task is uninterruptable, they have to wait for it to
finish, which could be many hours.

Open code blkdev_issue_discard in the BLKDISCARD ioctl handler and check
for a fatal signal at each iteration so the user doesn't have to wait
for their regretted operation to complete naturally.

Heavily based on an earlier patch from Keith Busch.

Reported-by: Conrad Meyer &lt;conradmeyer@meta.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20240506042027.2289826-7-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
