<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/block/rbd.c, branch v7.2-rc1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v7.2-rc1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-06-26T23:15:53+00:00</updated>
<entry>
<title>Merge tag 'ceph-for-7.2-rc1' of https://github.com/ceph/ceph-client</title>
<updated>2026-06-26T23:15:53+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-26T23:15:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5422e496b313b9b0b2f6df068902d6c79925d5e9'/>
<id>urn:sha1:5422e496b313b9b0b2f6df068902d6c79925d5e9</id>
<content type='text'>
Pull ceph updates from Ilya Dryomov:
 "This adds support for manual client session reset in CephFS, allowing
  operators to get out of tricky livelock situations involving caps and
  file locks without evicting the problematic client instance on the MDS
  side or rebooting the client node both of which can be disruptive"

* tag 'ceph-for-7.2-rc1' of https://github.com/ceph/ceph-client:
  ceph: add manual reset debugfs control and tracepoints
  ceph: add client reset state machine and session teardown
  ceph: add diagnostic timeout loop to wait_caps_flush()
  ceph: harden send_mds_reconnect and handle active-MDS peer reset
  ceph: use proper endian conversion for flock_len in reconnect
  ceph: convert inode flags to named bit positions and atomic bitops
  rbd: switch to dynamic root device
</content>
</entry>
<entry>
<title>rbd: switch to dynamic root device</title>
<updated>2026-06-22T20:27:12+00:00</updated>
<author>
<name>Johan Hovold</name>
<email>johan@kernel.org</email>
</author>
<published>2026-04-24T10:39:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e5c6b06a97c88227da49874d805393f960b661c6'/>
<id>urn:sha1:e5c6b06a97c88227da49874d805393f960b661c6</id>
<content type='text'>
Driver core expects devices to be dynamically allocated and will, for
example, complain loudly when no release function has been provided.

Use root_device_register() to allocate and register the root device
instead of open coding using a static device.

Signed-off-by: Johan Hovold &lt;johan@kernel.org&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux</title>
<updated>2026-06-16T07:32:47+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-06-16T07:32:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ba9c792c824fff732df85119011d399d9b6d9155'/>
<id>urn:sha1:ba9c792c824fff732df85119011d399d9b6d9155</id>
<content type='text'>
Pull block updates from Jens Axboe:

 - NVMe pull request via Keith:
     - Per-controller admin and IO timeout sysfs attributes, and
       letting the block layer set request timeouts (Maurizio,
       Maximilian)
     - Multipath passthrough iostats, and PCI P2PDMA enablement for
       multipath devices (Keith, Kiran)
     - A new diag sysfs attribute group exporting per-controller
       counters (retries, multipath failover, error counters, requeue
       and failure counts, reset and reconnect events) (Nilay)
     - FDP configuration validation and bounds check fixes (liuxixin)
     - Various nvmet fixes, including a pre-auth out-of-bounds read in
       the Discovery Get Log Page handler, auth payload bounds
       validation, and tcp error-path leak fixes (Bryam, Tianchu,
       Geliang)
     - nvme-tcp lockdep and workqueue fixes (Shin'ichiro, Kuniyuki,
       Eric)
     - Assorted other fixes and cleanups (John, Yao, Chao, Mateusz,
       Achkinazi, Wentao)

 - MD pull request via Yu Kuai:
     - raid1/raid10 fixes for a deadlock in the read error recovery
       path, error-path detection and bio accounting with cloned bios,
       and an nr_pending leak in the REQ_ATOMIC bad-block error path
       (Abd-Alrhman)
     - PCI P2PDMA propagation from member devices to the RAID device
       (Kiran)
     - dm-raid bio requeue fix, and various smaller fixes and cleanups
       (Benjamin, Chen, Li, Thorsten)

 - Enable Clang lock context analysis for the block layer, with the
   accompanying annotations across queue limits, the blk_holder_ops
   callbacks, crypto, cgroup, iocost, kyber and mq-deadline (Bart)

 - Block status code infrastructure work: a tagged status table, a
   str_to_blk_op() helper, a bio_endio_status() helper, and on top of
   that a new configurable block-layer error injection facility
   (Christoph)

 - DRBD netlink rework, replacing the genl_magic machinery with explicit
   netlink serialization and moving the DRBD UAPI headers to
   include/uapi/linux/ (Christoph Böhmwalder)

 - bvec improvements: a bvec_folio() helper and making the bvec_iter
   helpers proper inline functions (Willy, Christoph)

 - ublk cleanups and a canceling-flag fix for the disk-not-allocated
   case (Caleb, Ming)

 - Partition handling fixes: bound the AIX pp_count scan, fix an of_node
   refcount leak, and replace __get_free_page() with kmalloc() (Bryam,
   Wentao, Mike)

 - Convert numa_node to int in blk_mq_hw_ctx and -&gt;init_request, and add
   WQ_PERCPU to the block workqueue users (Mateusz, Marco)

 - Block statistics and tracing: propagate in-flight to the whole disk
   on partition IO, export passthrough stats, and a new
   block_rq_tag_wait tracepoint (Tang, Keith, Aaron)

 - A round of removals, unexports and cleanups across bio, direct-io and
   the bvec helpers (Christoph)

 - Various driver fixes (mtip32xx use-after-free, rbd snap_count
   validation and strscpy conversion, nbd socket lockdep reclassify,
   virtio-blk zone report clamp, floppy) and a batch of MAINTAINERS
   email/list updates (Coly, Li, Yu, Christoph Böhmwalder)

 - Other little fixes and cleanups all over

* tag 'for-7.2/block-20260615' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (117 commits)
  MAINTAINERS: Update Coly Li's email address
  block: check bio split for unaligned bvec
  nbd: Reclassify sockets to avoid lockdep circular dependency
  block: add configurable error injection
  block: add a str_to_blk_op helper
  block: add a "tag" for block status codes
  block: add a macro to initialize the status table
  floppy: Drop unused pnp driver data
  block: propagate in_flight to whole disk on partition I/O
  virtio-blk: clamp zone report to the report buffer capacity
  block: optimize I/O merge hot path with unlikely() hints
  drivers/block/rbd: Use strscpy() to copy strings into arrays
  partitions: aix: bound the pp_count scan to the ppe array
  block: Enable lock context analysis
  block/mq-deadline: Make the lock context annotations compatible with Clang
  block/Kyber: Make the lock context annotations compatible with Clang
  block/blk-mq-debugfs: Improve lock context annotations
  block/blk-iocost: Inline iocg_lock() and iocg_unlock()
  block/blk-iocost: Split ioc_rqos_throttle()
  block/crypto: Annotate the crypto functions
  ...
</content>
</entry>
<entry>
<title>drivers/block/rbd: Use strscpy() to copy strings into arrays</title>
<updated>2026-06-08T13:45:56+00:00</updated>
<author>
<name>David Laight</name>
<email>david.laight.linux@gmail.com</email>
</author>
<published>2026-06-06T20:27:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5ef1b0194b382fafe5023b5b014e4db3b948ee15'/>
<id>urn:sha1:5ef1b0194b382fafe5023b5b014e4db3b948ee15</id>
<content type='text'>
Replacing strcpy() with strscpy() ensures than overflow of the target
buffer cannot happen.

Signed-off-by: David Laight &lt;david.laight.linux@gmail.com&gt;
Reviewed-by: Alex Elder &lt;elder@riscstar.com&gt;
Link: https://patch.msgid.link/20260606202744.5113-5-david.laight.linux@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>rbd: check snap_count against RBD_MAX_SNAP_COUNT</title>
<updated>2026-06-01T01:48:53+00:00</updated>
<author>
<name>Rosen Penev</name>
<email>rosenp@gmail.com</email>
</author>
<published>2026-05-30T01:12:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2e1b3f4c51ace14f67201bd2a92ca6312a3c3724'/>
<id>urn:sha1:2e1b3f4c51ace14f67201bd2a92ca6312a3c3724</id>
<content type='text'>
snap_count is u32 but the comparison is against a SIZE_MAX-derived value
(~2^61 on 64-bit), which clang flags as always false with
-Wtautological-constant-out-of-range-compare.

The proper check here should be that snap_count does not go over
RBD_MAX_SNAP_COUNT.

Assisted-by: Opencode:Big-pickle
Signed-off-by: Rosen Penev &lt;rosenp@gmail.com&gt;
Reviewed-by: Alex Elder &lt;elder@riscstar.com&gt;
Link: https://patch.msgid.link/20260530011255.52916-1-rosenp@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>rbd: eliminate a race in lock_dwork draining on unmap</title>
<updated>2026-05-20T20:09:08+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2026-05-19T21:07:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9fc75b71fdd38465c76c6f6a884cdd4ae3c72d90'/>
<id>urn:sha1:9fc75b71fdd38465c76c6f6a884cdd4ae3c72d90</id>
<content type='text'>
Given how rbd_lock_add_request() and rbd_img_exclusive_lock() are
written, lock_dwork may be (re)queued more than it's actually needed:
for example in case a new I/O request comes in while we are in the
middle of rbd_acquire_lock() on behalf of another I/O request.  This is
expected and with rbd_release_lock() preemptively canceling lock_dwork
is benign under normal operation.

A more problematic example is maybe_kick_acquire():

    if (have_requests || delayed_work_pending(&amp;rbd_dev-&gt;lock_dwork)) {
            dout("%s rbd_dev %p kicking lock_dwork\n", __func__, rbd_dev);
            mod_delayed_work(rbd_dev-&gt;task_wq, &amp;rbd_dev-&gt;lock_dwork, 0);
    }

It's not unrealistic for lock_dwork to get canceled right after
delayed_work_pending() returns true and for mod_delayed_work() to
requeue it right there anyway.  This is a classic TOCTOU race.

When it comes to unmapping the image, there is an implicit assumption
of no self-initiated exclusive lock activity past the point of return
from rbd_dev_image_unlock() which unlocks the lock if it happens to be
held.  This unlock is assumed to be final and lock_dwork (as well as
all other exclusive lock tasks, really) isn't expected to get queued
again.  However, lock_dwork is canceled only in cancel_tasks_sync()
(i.e. later in the unmap sequence) and on top of that the cancellation
can get in effect nullified by maybe_kick_acquire().  This may result
in rbd_acquire_lock() executing after rbd_dev_device_release() and
rbd_dev_image_release() run and free and/or reset a bunch of things.
One of the possible failure modes then is a violated

    rbd_assert(rbd_image_format_valid(rbd_dev-&gt;image_format));

in rbd_dev_header_info() which is called via rbd_dev_refresh() from
rbd_post_acquire_action().

Redo exclusive lock task draining to provide saner semantics and try
to meet the assumptions around rbd_dev_image_unlock().

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Viacheslav Dubeyko &lt;Slava.Dubeyko@ibm.com&gt;
</content>
</entry>
<entry>
<title>rbd: fix null-ptr-deref when device_add_disk() fails</title>
<updated>2026-04-21T23:40:23+00:00</updated>
<author>
<name>Dawei Feng</name>
<email>dawei.feng@seu.edu.cn</email>
</author>
<published>2026-04-19T09:03:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d1fef92e414433ca7b89abf85cb0df42b8d475eb'/>
<id>urn:sha1:d1fef92e414433ca7b89abf85cb0df42b8d475eb</id>
<content type='text'>
do_rbd_add() publishes the device with device_add() before calling
device_add_disk(). If device_add_disk() fails after device_add()
succeeds, the error path calls rbd_free_disk() directly and then later
falls through to rbd_dev_device_release(), which calls rbd_free_disk()
again. This double teardown can leave blk-mq cleanup operating on
invalid state and trigger a null-ptr-deref in
__blk_mq_free_map_and_rqs(), reached from blk_mq_free_tag_set().

Fix this by following the normal remove ordering: call device_del()
before rbd_dev_device_release() when device_add_disk() fails after
device_add(). That keeps the teardown sequence consistent and avoids
re-entering disk cleanup through the wrong path.

The bug was first flagged by an experimental analysis tool we are
developing for kernel memory-management bugs while analyzing
v6.13-rc1. The tool is still under development and is not yet publicly
available.

We reproduced the bug on v7.0 with a real Ceph backend and a QEMU x86_64
guest booted with KASAN and CONFIG_FAILSLAB enabled. The reproducer
confines failslab injections to the __add_disk() range and injects
fail-nth while mapping an RBD image through
/sys/bus/rbd/add_single_major.

On the unpatched kernel, fail-nth=4 reliably triggered the fault:

	Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
	KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
	CPU: 0 UID: 0 PID: 273 Comm: bash Not tainted 7.0.0-01247-gd60bc1401583 #6 PREEMPT(lazy)
	Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
	RIP: 0010:__blk_mq_free_map_and_rqs+0x8c/0x240
	Code: 00 00 48 8b 6b 60 41 89 f4 49 c1 e4 03 4c 01 e5 45 85 ed 0f 85 0a 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 e9 48 c1 e9 03 &lt;80&gt; 3c 01 00 0f 85 31 01 00 00 4c 8b 6d 00 4d 85 ed 0f 84 e2 00 00
	RSP: 0018:ff1100000ab0fac8 EFLAGS: 00000246
	RAX: dffffc0000000000 RBX: ff1100000c4806a0 RCX: 0000000000000000
	RDX: 0000000000000002 RSI: 0000000000000000 RDI: ff1100000c4806f4
	RBP: 0000000000000000 R08: 0000000000000001 R09: ffe21c000189001b
	R10: ff1100000c4800df R11: ff1100006cf37be0 R12: 0000000000000000
	R13: 0000000000000000 R14: ff1100000c480700 R15: ff1100000c480004
	FS:  00007f0fbe8fe740(0000) GS:ff110000e5851000(0000) knlGS:0000000000000000
	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 00007fe53473b2e0 CR3: 0000000012eef000 CR4: 00000000007516f0
	PKRU: 55555554
	Call Trace:
	 &lt;TASK&gt;
	 blk_mq_free_tag_set+0x77/0x460
	 do_rbd_add+0x1446/0x2b80
	 ? __pfx_do_rbd_add+0x10/0x10
	 ? lock_acquire+0x18c/0x300
	 ? find_held_lock+0x2b/0x80
	 ? sysfs_file_kobj+0xb6/0x1b0
	 ? __pfx_sysfs_kf_write+0x10/0x10
	 kernfs_fop_write_iter+0x2f4/0x4a0
	 vfs_write+0x98e/0x1000
	 ? expand_files+0x51f/0x850
	 ? __pfx_vfs_write+0x10/0x10
	 ksys_write+0xf2/0x1d0
	 ? __pfx_ksys_write+0x10/0x10
	 do_syscall_64+0x115/0x690
	 entry_SYSCALL_64_after_hwframe+0x77/0x7f
	RIP: 0033:0x7f0fbea15907
	Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 &lt;48&gt; 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
	RSP: 002b:00007ffe22346ea8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
	RAX: ffffffffffffffda RBX: 0000000000000058 RCX: 00007f0fbea15907
	RDX: 0000000000000058 RSI: 0000563ace6c0ef0 RDI: 0000000000000001
	RBP: 0000563ace6c0ef0 R08: 0000563ace6c0ef0 R09: 6b6435726d694141
	R10: 5250337279762f78 R11: 0000000000000246 R12: 0000000000000058
	R13: 00007f0fbeb1c780 R14: ff1100000c480700 R15: ff1100000c480004
	 &lt;/TASK&gt;

With this fix applied, rerunning the reproducer over fail-nth=1..256
yields no KASAN reports.

[ idryomov: rename err_out_device_del -&gt; err_out_device ]

Cc: stable@vger.kernel.org
Fixes: 27c97abc30e2 ("rbd: add add_disk() error handling")
Signed-off-by: Zilin Guan &lt;zilin@seu.edu.cn&gt;
Signed-off-by: Dawei Feng &lt;dawei.feng@seu.edu.cn&gt;
Reviewed-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
</content>
</entry>
<entry>
<title>Convert 'alloc_obj' family to use the new default GFP_KERNEL argument</title>
<updated>2026-02-22T01:09:51+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-22T00:37:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43'/>
<id>urn:sha1:bf4afc53b77aeaa48b5409da5c8da6bb4eff7f43</id>
<content type='text'>
This was done entirely with mindless brute force, using

    git grep -l '\&lt;k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate
conversion.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Replace kmalloc with kmalloc_obj for non-scalar types</title>
<updated>2026-02-21T09:02:28+00:00</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2026-02-21T07:49:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=69050f8d6d075dc01af7a5f2f550a8067510366f'/>
<id>urn:sha1:69050f8d6d075dc01af7a5f2f550a8067510366f</id>
<content type='text'>
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:

Single allocations:	kmalloc(sizeof(TYPE), ...)
are replaced with:	kmalloc_obj(TYPE, ...)

Array allocations:	kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with:	kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations:	kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with:	kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning
"TYPE *".

Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>rbd: check for EOD after exclusive lock is ensured to be held</title>
<updated>2026-02-03T20:00:22+00:00</updated>
<author>
<name>Ilya Dryomov</name>
<email>idryomov@gmail.com</email>
</author>
<published>2026-01-07T21:37:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=bd3884a204c3b507e6baa9a4091aa927f9af5404'/>
<id>urn:sha1:bd3884a204c3b507e6baa9a4091aa927f9af5404</id>
<content type='text'>
Similar to commit 870611e4877e ("rbd: get snapshot context after
exclusive lock is ensured to be held"), move the "beyond EOD" check
into the image request state machine so that it's performed after
exclusive lock is ensured to be held.  This avoids various race
conditions which can arise when the image is shrunk under I/O (in
practice, mostly readahead).  In one such scenario

    rbd_assert(objno &lt; rbd_dev-&gt;object_map_size);

can be triggered if a close-to-EOD read gets queued right before the
shrink is initiated and the EOD check is performed against an outdated
mapping_size.  After the resize is done on the server side and exclusive
lock is (re)acquired bringing along the new (now shrunk) object map, the
read starts going through the state machine and rbd_obj_may_exist() gets
invoked on an object that is out of bounds of rbd_dev-&gt;object_map array.

Cc: stable@vger.kernel.org
Signed-off-by: Ilya Dryomov &lt;idryomov@gmail.com&gt;
Reviewed-by: Dongsheng Yang &lt;dongsheng.yang@linux.dev&gt;
</content>
</entry>
</feed>
