| author | Ming Lei <tom.leiming@gmail.com> | 2026-05-27 17:40:42 +0300 |
|---|---|---|
| committer | Jens Axboe <axboe@kernel.dk> | 2026-05-27 18:03:20 +0300 |
| commit | 1133b93fc7f63defaa2c07d5f49873c14bb74681 (patch) | |
| tree | 35371d9637fe34c807bcc589146ea7a529befd26 | |
| parent | 6235ea3f8b8ffca0333ade0863992f3cd69592ea (diff) | |

ublk: set canceling flag even when disk is not allocated

ublk_start_cancel() previously bailed out early when ublk_get_disk()
returned NULL, treating it as "our disk has been dead". That is correct
for the post-teardown case, but it also wrongly covers the pre-start
case: ublk_ctrl_start_dev() has not assigned ub->ub_disk yet, while
io_uring is already tearing down the daemon's uring_cmds via
ublk_uring_cmd_cancel_fn().

In that window, the cancel path skips ublk_set_canceling(), so
ubq->canceling stays false, even though ublk_cancel_cmd() goes on to
NULL out every io->cmd. ublk_ctrl_start_dev() then proceeds to set
ub->ub_disk, call add_disk(), and schedule partition_scan_work. When
ublk_partition_scan_work() runs bdev_disk_changed() and the resulting
read reaches ublk_queue_rq() -> ublk_queue_cmd(), the ubq->canceling
check passes and the code dereferences the NULL io->cmd:

    BUG: kernel NULL pointer dereference, address: 0000000000000018
    RIP: ublk_queue_cmd drivers/block/ublk_drv.c [inline]
    RIP: ublk_queue_rq+0x73/0x100
    Call Trace:
    blk_mq_dispatch_rq_list+0x1c5/0xca0
    ...
    bdev_disk_changed+0x3d4/0x5e0
    ublk_partition_scan_work+0x89/0xe0
    process_one_work+0x344/0x8a0

Fix it by always setting ub->canceling / ubq->canceling under
cancel_mutex. When the disk is allocated, keep the existing
quiesce/unquiesce dance so the flag is observed across the
ublk_queue_rq() barrier. When the disk is not yet allocated, there is
no request_queue and ublk_queue_rq() cannot be running concurrently, so
simply flipping the flag is sufficient: any subsequent I/O - including
the partition scan started by ublk_ctrl_start_dev() - will see
canceling set and be aborted via __ublk_queue_rq_common().

Fixes: 7fc4da6a304b ("ublk: scan partition in async way")
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Link: https://patch.msgid.link/20260527144042.2095194-1-tom.leiming@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
| -rw-r--r-- | drivers/block/ublk_drv.c | 32 |
1 files changed, 18 insertions, 14 deletions
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 49624e65fe75..0c6b9b34b255 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -2704,23 +2704,27 @@ static void ublk_start_cancel(struct ublk_device *ub)
 {
 	struct gendisk *disk = ublk_get_disk(ub);
 
-	/* Our disk has been dead */
-	if (!disk)
-		return;
-
 	mutex_lock(&ub->cancel_mutex);
 	if (ub->canceling)
 		goto out;
-	/*
-	 * Now we are serialized with ublk_queue_rq()
-	 *
-	 * Make sure that ubq->canceling is set when queue is frozen,
-	 * because ublk_queue_rq() has to rely on this flag for avoiding to
-	 * touch completed uring_cmd
-	 */
-	blk_mq_quiesce_queue(disk->queue);
-	ublk_set_canceling(ub, true);
-	blk_mq_unquiesce_queue(disk->queue);
+
+	if (disk) {
+		/*
+		 * Quiesce to serialize with ublk_queue_rq(), ensuring
+		 * ubq->canceling is visible when the queue resumes.
+		 */
+		blk_mq_quiesce_queue(disk->queue);
+		ublk_set_canceling(ub, true);
+		blk_mq_unquiesce_queue(disk->queue);
+	} else {
+		/*
+		 * Disk not yet allocated by ublk_ctrl_start_dev(), so
+		 * there is no request queue and ublk_queue_rq() cannot
+		 * be running. Just set the flag; if start_dev proceeds
+		 * later, new I/O will see canceling and be aborted.
+		 */
+		ublk_set_canceling(ub, true);
+	}
 out:
 	mutex_unlock(&ub->cancel_mutex);
 	ublk_put_disk(disk);
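
The ordering described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every name prefixed with `model_`, the `rq_result` enum, and `run_scenario()` are hypothetical stand-ins invented for illustration. The model is single-threaded, so it leaves out cancel_mutex and the quiesce/unquiesce step, and it collapses ub->canceling and ubq->canceling into one flag. It only replays the sequence "cancel before the disk exists, then start the device, then dispatch the partition-scan read".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's structures. */
struct model_cmd { int tag; };

struct model_dev {
	bool has_disk;          /* models ub->ub_disk being assigned */
	bool canceling;         /* models ub->canceling / ubq->canceling */
	struct model_cmd *cmd;  /* models a single io->cmd slot */
};

enum rq_result {
	RQ_DISPATCHED,  /* request handed to a live command */
	RQ_ABORTED,     /* canceling was observed, request rejected */
	RQ_NULL_CMD,    /* the kernel would dereference a NULL io->cmd here */
};

/* Cancel path; 'fixed' selects the behaviour before or after the patch. */
static void model_start_cancel(struct model_dev *d, bool fixed)
{
	if (!d->has_disk && !fixed)
		return; /* old behaviour: bail out early, flag is never set */

	/* With a disk, the real driver quiesces the queue around this store. */
	d->canceling = true;
}

/* Models ublk_cancel_cmd() clearing the command pointer. */
static void model_cancel_cmd(struct model_dev *d)
{
	d->cmd = NULL;
}

/* Models the dispatch-side check: canceling is tested before cmd is used. */
static enum rq_result model_queue_rq(const struct model_dev *d)
{
	if (d->canceling)
		return RQ_ABORTED;
	if (d->cmd == NULL)
		return RQ_NULL_CMD;
	return RQ_DISPATCHED;
}

/* Replays the pre-start window from the commit message. */
enum rq_result run_scenario(bool fixed)
{
	struct model_cmd cmd = { .tag = 0 };
	struct model_dev d = { .has_disk = false, .canceling = false, .cmd = &cmd };

	model_start_cancel(&d, fixed); /* io_uring cancels before the disk exists */
	model_cancel_cmd(&d);          /* command pointer is cleared either way */
	d.has_disk = true;             /* start_dev assigns the disk afterwards */
	return model_queue_rq(&d);     /* read issued by the partition scan */
}
```

In this model, `run_scenario(false)` ends in `RQ_NULL_CMD`, the analogue of the reported NULL dereference, while `run_scenario(true)` ends in `RQ_ABORTED`, because the flag is set even though no disk was present at cancel time.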
