kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
2023-01-12	virtio_blk: Fix signedness bug in virtblk_prep_rq()	Rafael Mendonca	1	-2/+4
	[ Upstream commit a26116c1e74028914f281851488546c91cbae57d ] The virtblk_map_data() function returns negative error codes, however, the 'nents' field of vbr->sg_table is an unsigned int, which causes the error handling not to work correctly. Cc: stable@vger.kernel.org Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Message-Id: <20221021204126.927603-1-rafaelmendsr@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Suwan Kim <suwan.kim027@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12	virtio-blk: use a helper to handle request queuing errors	Dmitry Fomichev	1	-13/+16
	[ Upstream commit 258896fcc786b4e7db238eba26f6dd080e0ff41e ] Define a new helper function, virtblk_fail_to_queue(), to clean up the error handling code in virtio_queue_rq(). Signed-off-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Message-Id: <20221016034127.330942-2-dmitry.fomichev@wdc.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: a26116c1e740 ("virtio_blk: Fix signedness bug in virtblk_prep_rq()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-12	ublk: honor IO_URING_F_NONBLOCK for handling control command	Ming Lei	1	-0/+3
	[ Upstream commit fa8e442e832a3647cdd90f3e606c473a51bc1b26 ] Most control command handlers may sleep, so return -EAGAIN in case of IO_URING_F_NONBLOCK to defer the handling to io wq context. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reported-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20230104133235.836536-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31	floppy: Fix memory leak in do_floppy_init()	Yuan Can	1	-1/+3
	commit f8ace2e304c5dd8a7328db9cd2b8a4b1b98d83ec upstream. A memory leak was reported when floppy_alloc_disk() failed in do_floppy_init(). unreferenced object 0xffff888115ed25a0 (size 8): comm "modprobe", pid 727, jiffies 4295051278 (age 25.529s) hex dump (first 8 bytes): 00 ac 67 5b 81 88 ff ff ..g[.... backtrace: [<000000007f457abb>] __kmalloc_node+0x4c/0xc0 [<00000000a87bfa9e>] blk_mq_realloc_tag_set_tags.part.0+0x6f/0x180 [<000000006f02e8b1>] blk_mq_alloc_tag_set+0x573/0x1130 [<0000000066007fd7>] 0xffffffffc06b8b08 [<0000000081f5ac40>] do_one_initcall+0xd0/0x4f0 [<00000000e26d04ee>] do_init_module+0x1a4/0x680 [<000000001bb22407>] load_module+0x6249/0x7110 [<00000000ad31ac4d>] __do_sys_finit_module+0x140/0x200 [<000000007bddca46>] do_syscall_64+0x35/0x80 [<00000000b5afec39>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 unreferenced object 0xffff88810fc30540 (size 32): comm "modprobe", pid 727, jiffies 4295051278 (age 25.529s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000007f457abb>] __kmalloc_node+0x4c/0xc0 [<000000006b91eab4>] blk_mq_alloc_tag_set+0x393/0x1130 [<0000000066007fd7>] 0xffffffffc06b8b08 [<0000000081f5ac40>] do_one_initcall+0xd0/0x4f0 [<00000000e26d04ee>] do_init_module+0x1a4/0x680 [<000000001bb22407>] load_module+0x6249/0x7110 [<00000000ad31ac4d>] __do_sys_finit_module+0x140/0x200 [<000000007bddca46>] do_syscall_64+0x35/0x80 [<00000000b5afec39>] entry_SYSCALL_64_after_hwframe+0x46/0xb0 If the floppy_alloc_disk() failed, disks of current drive will not be set, thus the lastest allocated set->tag cannot be freed in the error handling path. 
A simple call graph is shown below: floppy_module_init() floppy_init() do_floppy_init() for (drive = 0; drive < N_DRIVE; drive++) blk_mq_alloc_tag_set() blk_mq_alloc_tag_set_tags() blk_mq_realloc_tag_set_tags() # set->tag allocated floppy_alloc_disk() blk_mq_alloc_disk() # error occurred, disks failed to allocate ->out_put_disk: for (drive = 0; drive < N_DRIVE; drive++) if (!disks[drive][0]) # the last disks entry is not set and the loop breaks break; blk_mq_free_tag_set() # the latest allocated set->tag leaked Fix this problem by freeing the set->tag of the current drive before jumping to the error handling path. Cc: stable@vger.kernel.org Fixes: 302cfee15029 ("floppy: use a separate gendisk for each media format") Signed-off-by: Yuan Can <yuancan@huawei.com> [efremov: added stable list, changed title] Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31	loop: Fix the max_loop commandline argument treatment when it is set to 0	Isaac J. Manjarres	1	-16/+12
	commit 85c50197716c60fe57f411339c579462e563ac57 upstream. Currently, the max_loop commandline argument can be used to specify how many loop block devices are created at init time. If it is not specified on the commandline, CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block devices will be created. The max_loop commandline argument can be used to override the value of CONFIG_BLK_DEV_LOOP_MIN_COUNT. However, when max_loop is set to 0 through the commandline, the current logic treats it as if it had not been set, and creates CONFIG_BLK_DEV_LOOP_MIN_COUNT devices anyway. Fix this by starting max_loop off as set to CONFIG_BLK_DEV_LOOP_MIN_COUNT. This preserves the intended behavior of creating CONFIG_BLK_DEV_LOOP_MIN_COUNT loop block devices if the max_loop commandline parameter is not specified, and allowing max_loop to be respected for all values, including 0. This allows environments that can create all of their required loop block devices on demand to not have to unnecessarily preallocate loop block devices. Fixes: 732850827450 ("remove artificial software max_loop limit") Cc: stable@vger.kernel.org Cc: Ken Chen <kenchen@google.com> Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com> Link: https://lore.kernel.org/r/20221208212902.765781-1-isaacmanjarres@google.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
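The difference between the old and new treatment of max_loop can be sketched in plain C. This is a simplified model, not the loop driver's code: `DEMO_LOOP_MIN_COUNT` stands in for CONFIG_BLK_DEV_LOOP_MIN_COUNT, and `demo_nr_old()` / `demo_nr_new()` are hypothetical names for the two ways of choosing how many devices to create at init time.

```c
#include <assert.h>

/* Stand-in for CONFIG_BLK_DEV_LOOP_MIN_COUNT (value chosen for illustration). */
#define DEMO_LOOP_MIN_COUNT 8

/* Old behaviour as described: max_loop starts at 0, and 0 doubles as
 * "not specified", so an explicit max_loop=0 is ignored. */
static int demo_nr_old(int param_given, int value)
{
	int max_loop = 0;

	assert(value >= 0);
	if (param_given)
		max_loop = value;
	return max_loop ? max_loop : DEMO_LOOP_MIN_COUNT;
}

/* New behaviour as described: max_loop starts at the configured minimum,
 * so every explicit value, including 0, is respected. */
static int demo_nr_new(int param_given, int value)
{
	int max_loop = DEMO_LOOP_MIN_COUNT;

	assert(value >= 0);
	if (param_given)
		max_loop = value;
	return max_loop;
}
```

With the parameter given as 0, the old model still returns the configured minimum, while the new model returns 0. Both models agree when the parameter is absent or nonzero.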
2022-12-31	drbd: destroy workqueue when drbd device was freed	Wang ShaoBo	1	-1/+5
	[ Upstream commit 8692814b77ca4228a99da8a005de0acf40af6132 ] A submitter workqueue is dynamically allocated by init_submitter(), which is called by drbd_create_device(). We should destroy it when the device is no longer needed or is destroyed. Fixes: 113fef9e20e0 ("drbd: prepare to queue write requests on a submit worker") Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lore.kernel.org/r/20221124015817.2729789-3-bobo.shaobowang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31	drbd: remove call to memset before free device/resource/connection	Wang ShaoBo	1	-3/+0
	[ Upstream commit 6e7b854e4c1b02dba00760dfa79d8dbf6cce561e ] This reverts c2258ffc56f2 ("drbd: poison free'd device, resource and connection structs"). Adding a memset here for debugging is odd; there are methods that show more accurately what happened, such as kdump. Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lore.kernel.org/r/20221124015817.2729789-2-bobo.shaobowang@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: 8692814b77ca ("drbd: destroy workqueue when drbd device was freed") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31	drbd: use blk_queue_max_discard_sectors helper	Christoph Böhmwalder	1	-5/+5
	[ Upstream commit 258bea6388ac93f34561fd91064232d14e174bff ] We currently only set q->limits.max_discard_sectors, but that is not enough. Another field, max_hw_discard_sectors, was introduced in commit 0034af036554 ("block: make /sys/block/<dev>/queue/discard_max_bytes writeable"). The difference is that max_discard_sectors can be changed from user space via sysfs, while max_hw_discard_sectors is the "hardware" upper limit. So use this helper, which sets both. This is also a fixup for commit 998e9cbcd615 ("drbd: cleanup decide_on_discard_support"): if discards are not supported, that does not necessarily mean we also want to disable write_zeroes. Fixes: 998e9cbcd615 ("drbd: cleanup decide_on_discard_support") Reviewed-by: Joel Colledge <joel.colledge@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/20221109133453.51652-2-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-26	drbd: use after free in drbd_create_device()	Dan Carpenter	1	-2/+2
	[ Upstream commit a7a1598189228b5007369a9622ccdf587be0730f ] The drbd_destroy_connection() frees the "connection" so use the _safe() iterator to prevent a use after free. Fixes: b6f85ef9538b ("drbd: Iterate over all connections") Signed-off-by: Dan Carpenter <error27@gmail.com> Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Link: https://lore.kernel.org/r/Y3Jd5iZRbNQ9w6gm@kili Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
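The idea behind a _safe() iterator can be shown with a minimal userspace list. This sketch does not use the kernel's list API; `demo_conn`, `demo_push()` and `demo_destroy_all()` are hypothetical. It only demonstrates the principle that the link to the next element is read before the current element is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical connection object on a singly linked list. */
struct demo_conn {
	int id;
	struct demo_conn *next;
};

/* Prepend a new connection and return the new list head. */
static struct demo_conn *demo_push(struct demo_conn *head, int id)
{
	struct demo_conn *c = malloc(sizeof(*c));

	assert(c != NULL);
	c->id = id;
	c->next = head;
	return c;
}

/* Free every connection and return how many were freed.
 * Advancing with "c = c->next" after free(c) would read freed memory;
 * saving the next pointer first avoids that. */
static int demo_destroy_all(struct demo_conn *head)
{
	struct demo_conn *c, *tmp;
	int freed = 0;

	for (c = head; c != NULL; c = tmp) {
		tmp = c->next;	/* read the link before the node is freed */
		free(c);
		freed++;
	}
	return freed;
}
```

The kernel's list_for_each_entry_safe() follows the same pattern by carrying a second cursor that already points at the next entry.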
2022-11-10	ublk_drv: return flag of UBLK_F_URING_CMD_COMP_IN_TASK in case of module	Ming Lei	1	-0/+3
	[ Upstream commit 224e858f215a3d6304f95a92357a1753475ca9cf ] UBLK_F_URING_CMD_COMP_IN_TASK needs to be set and returned to userspace if ublk driver is built as module, otherwise userspace may get wrong flags shown. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Signed-off-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Link: https://lore.kernel.org/r/20221029010432.598367-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-29	drbd: only clone bio if we have a backing device	Christoph Böhmwalder	1	-8/+6
	[ Upstream commit 6d42ddf7f27b6723549ee6d4c8b1b418b59bf6b5 ] Commit c347a787e34cb (drbd: set ->bi_bdev in drbd_req_new) moved a bio_set_dev call (which has since been removed) to "earlier", from drbd_request_prepare to drbd_req_new. The problem is that this accesses device->ldev->backing_bdev, which is not NULL-checked at this point. When we don't have an ldev (i.e. when the DRBD device is diskless), this leads to a null pointer deref. So, only allocate the private_bio if we actually have a disk. This is also a small optimization, since we don't clone the bio only to immediately free it again in the diskless case. Fixes: c347a787e34cb ("drbd: set ->bi_bdev in drbd_req_new") Co-developed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> Co-developed-by: Joel Colledge <joel.colledge@linbit.com> Signed-off-by: Joel Colledge <joel.colledge@linbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20221020085205.129090-1-christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-21	nbd: Fix hung when signal interrupts nbd_start_device_ioctl()	Shigeru Yoshida	1	-2/+4
	[ Upstream commit 1de7c3cf48fc41cd95adb12bd1ea9033a917798a ] syzbot reported a hung task [1]. The following program is a simplified version of the reproducer: int main(void) { int sv[2], fd; if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return 1; if ((fd = open("/dev/nbd0", 0)) < 0) return 1; if (ioctl(fd, NBD_SET_SIZE_BLOCKS, 0x81) < 0) return 1; if (ioctl(fd, NBD_SET_SOCK, sv[0]) < 0) return 1; if (ioctl(fd, NBD_DO_IT) < 0) return 1; return 0; } When a signal interrupts nbd_start_device_ioctl() while it waits for the condition atomic_read(&config->recv_threads) == 0, the task can hang because it waits for the completion of the in-flight I/Os. This patch fixes the issue by clearing the queue, not just shutting it down, when a signal interrupts nbd_start_device_ioctl(). Link: https://syzkaller.appspot.com/bug?id=7d89a3ffacd2b83fdd39549bc4d8e0a89ef21239 [1] Reported-by: syzbot+38e6c55d4969a14c1534@syzkaller.appspotmail.com Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220907163502.577561-1-syoshida@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28	virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq()	Suwan Kim	1	-6/+5
	If a request fails at virtio_queue_rqs(), it is inserted to requeue_list and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs() despite the failure. [ 1.890468] ------------[ cut here ]------------ [ 1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143 blk_mq_start_request+0x8a/0xe0 [ 1.891045] Modules linked in: [ 1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44 [ 1.891504] Hardware name: ChromiumOS crosvm, BIOS 0 [ 1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0 [ 1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53 20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b 41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b 01 0f [ 1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202 [ 1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000 [ 1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000 [ 1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100 [ 1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000 [ 1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00 [ 1.894060] FS: 00007ffa143a4640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 1.894412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0 [ 1.894953] Call Trace: [ 1.895139] <TASK> [ 1.895303] virtblk_prep_rq+0x1e5/0x280 [ 1.895509] virtio_queue_rq+0x5c/0x310 [ 1.895710] ? virtqueue_add_sgs+0x95/0xb0 [ 1.895905] ? _raw_spin_unlock_irqrestore+0x16/0x30 [ 1.896133] ? virtio_queue_rqs+0x340/0x390 [ 1.896453] ? 
sbitmap_get+0xfa/0x220 [ 1.896678] __blk_mq_issue_directly+0x41/0x180 [ 1.896906] blk_mq_plug_issue_direct+0xd8/0x2c0 [ 1.897115] blk_mq_flush_plug_list+0x115/0x180 [ 1.897342] blk_add_rq_to_plug+0x51/0x130 [ 1.897543] blk_mq_submit_bio+0x3a1/0x570 [ 1.897750] submit_bio_noacct_nocheck+0x418/0x520 [ 1.897985] ? submit_bio_noacct+0x1e/0x260 [ 1.897989] ext4_bio_write_page+0x222/0x420 [ 1.898000] mpage_process_page_bufs+0x178/0x1c0 [ 1.899451] mpage_prepare_extent_to_map+0x2d2/0x440 [ 1.899603] ext4_writepages+0x495/0x1020 [ 1.899733] do_writepages+0xcb/0x220 [ 1.899871] ? __seccomp_filter+0x171/0x7e0 [ 1.900006] file_write_and_wait_range+0xcd/0xf0 [ 1.900167] ext4_sync_file+0x72/0x320 [ 1.900308] __x64_sys_fsync+0x66/0xa0 [ 1.900449] do_syscall_64+0x31/0x50 [ 1.900595] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 1.900747] RIP: 0033:0x7ffa16ec96ea [ 1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b 44 24 [ 1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea [ 1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016 [ 1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640 [ 1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0 [ 1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000 [ 1.902763] </TASK> [ 1.902877] ---[ end trace 0000000000000000 ]--- To avoid calling blk_mq_start_request() twice, This patch moves the execution of blk_mq_start_request() to the end of virtblk_prep_rq(). And instead of requeuing failed request to plug list in the error path of virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed request state to MQ_RQ_IDLE. Then virtblk can safely handle the request on the next trial. 
Fixes: 0e9911fa768f ("virtio-blk: support mq_ops->queue_rqs()") Reported-by: Alexandre Courbot <acourbot@chromium.org> Tested-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Message-Id: <20220830150153.12627-1-suwan.kim027@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
2022-09-03	Merge tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip	Linus Torvalds	3	-10/+19
	Pull xen fixes from Juergen Gross: - a minor fix for the Xen grant driver - a small series fixing a recently introduced problem in the Xen blkfront/blkback drivers with negotiation of feature usage * tag 'for-linus-6.0-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/grants: prevent integer overflow in gnttab_dma_alloc_pages() xen-blkfront: Cache feature_persistent value before advertisement xen-blkfront: Advertise feature-persistent as user requested xen-blkback: Advertise feature-persistent as user requested
2022-09-02	xen-blkfront: Cache feature_persistent value before advertisement	SeongJae Park	1	-7/+7
	Xen blkfront advertises its support of the persistent grants feature when it is first set up and when it resumes, in 'talk_to_blkback()'. Blkback then reads the advertised value when it connects with blkfront, decides whether it will use the persistent grants feature, and advertises its decision to blkfront. Blkfront reads blkback's decision and makes its own decision on using the feature. Commit 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect"), however, moved blkfront's read of the parameter for disabling the advertisement, namely 'feature_persistent', to the negotiation step rather than the advertisement step. Therefore blkfront advertises without reading the parameter. As the field that caches the parameter value is zero-initialized, blkfront always advertises the feature as disabled, so the persistent grants feature is always disabled. This commit fixes the issue by making blkfront cache the parameter just before the advertisement. Fixes: 402c43ea6b34 ("xen-blkfront: Apply 'feature_persistent' parameter when connect") Cc: <stable@vger.kernel.org> # 5.10.x Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220831165824.94815-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-02	xen-blkfront: Advertise feature-persistent as user requested	SeongJae Park	1	-2/+6
	The advertisement of the persistent grants feature (writing 'feature-persistent' to xenbus) should signal only the availability of the feature, not the decision to use it. However, commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") made a blkfront field that had stored only the negotiation result serve a second purpose: caching the 'feature_persistent' parameter value. As a result, the advertisement, which should follow only the parameter value, becomes inconsistent. This commit fixes the misuse by making blkfront save the parameter value in a separate place and advertise support based only on the saved value. Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Juergen Gross <jgross@suse.com> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220831165824.94815-3-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2022-09-02	xen-blkback: Advertise feature-persistent as user requested	SeongJae Park	2	-2/+7
	The advertisement of the persistent grants feature (writing 'feature-persistent' to xenbus) should signal only the availability of the feature, not the decision to use it. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") made a blkback field that had stored only the negotiation result serve a second purpose: caching the 'feature_persistent' parameter value. As a result, the advertisement, which should follow only the parameter value, becomes inconsistent. This commit fixes the misuse by making blkback save the parameter value in a separate place and advertise support based only on the saved value. Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Suggested-by: Juergen Gross <jgross@suse.com> Signed-off-by: SeongJae Park <sj@kernel.org> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220831165824.94815-2-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-26	Merge tag 'block-6.0-2022-08-26' of git://git.kernel.dk/linux-block	Linus Torvalds	1	-0/+5
	Pull block fixes from Jens Axboe: - MD pull request via Song: - Fix for clustered raid (Guoqing Jiang) - req_op fix (Bart Van Assche) - Fix race condition in raid recreate (David Sloan) - loop configuration overflow fix (Siddh) - Fix missing commit_rqs call for certain conditions (Yu) * tag 'block-6.0-2022-08-26' of git://git.kernel.dk/linux-block: md: call __md_stop_writes in md_stop Revert "md-raid: destroy the bitmap after destroying the thread" md: Flush workqueue md_rdev_misc_wq in md_alloc() md/raid10: Fix the data type of an r10_sync_page_io() argument loop: Check for overflow while configuring loop blk-mq: fix io hung due to missing commit_rqs
2022-08-24	loop: Check for overflow while configuring loop	Siddh Raman Pant	1	-0/+5
	Userspace can configure a loop device using an ioctl call, wherein a configuration of type loop_config is passed (see lo_ioctl()'s case on line 1550 of drivers/block/loop.c). This proceeds to call loop_configure() which in turn calls loop_set_status_from_info() (see line 1050 of loop.c), passing &config->info which is of type loop_info64*. This function then sets the appropriate values, like the offset. loop_device has lo_offset of type loff_t (see line 52 of loop.c), which is typedef-chained to long long, whereas loop_info64 has lo_offset of type __u64 (see line 56 of include/uapi/linux/loop.h). The function directly copies offset from info to the device as follows (See line 980 of loop.c): lo->lo_offset = info->lo_offset; This results in an overflow, which triggers a warning in iomap_iter() due to a call to iomap_iter_done() which has: WARN_ON_ONCE(iter->iomap.offset > iter->pos); Thus, check for negative value during loop_set_status_from_info(). Bug report: https://syzkaller.appspot.com/bug?id=c620fe14aac810396d3c3edc9ad73848bf69a29e Reported-and-tested-by: syzbot+a8e049cd3abd342936b6@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Siddh Raman Pant <code@siddh.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220823160810.181275-1-code@siddh.me Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-23	Merge tag 'mm-hotfixes-stable-2022-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm	Linus Torvalds	2	-10/+33
	Pull misc fixes from Andrew Morton: "Thirteen fixes, almost all for MM. Seven of these are cc:stable and the remainder fix up the changes which went into this -rc cycle" * tag 'mm-hotfixes-stable-2022-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: kprobes: don't call disarm_kprobe() for disabled kprobes mm/shmem: shmem_replace_page() remember NR_SHMEM mm/shmem: tmpfs fallocate use file_modified() mm/shmem: fix chattr fsflags support in tmpfs mm/hugetlb: support write-faults in shared mappings mm/hugetlb: fix hugetlb not supporting softdirty tracking mm/uffd: reset write protection when unregister with wp-mode mm/smaps: don't access young/dirty bit if pte unpresent mm: add DEVICE_ZONE to FOR_ALL_ZONES kernel/sys_ni: add compat entry for fadvise64_64 mm/gup: fix FOLL_FORCE COW security issue and remove FOLL_COW Revert "zram: remove double compression logic" get_maintainer: add Alan to .get_maintainer.ignore
2022-08-21	Revert "zram: remove double compression logic"	Jiri Slaby	2	-10/+33
	This reverts commit e7be8d1dd983156b ("zram: remove double compression logic") as it causes zram failures. It does not revert cleanly, PTR_ERR handling was introduced in the meantime. This is handled by appropriate IS_ERR. When under memory pressure, zs_malloc() can fail. Before the above commit, the allocation was retried with direct reclaim enabled (GFP_NOIO). After the commit, it is not -- only __GFP_KSWAPD_RECLAIM is tried. So when the failure occurs under memory pressure, the overlaying filesystem such as ext2 (mounted by ext4 module in this case) can emit failures, making the (file)system unusable: EXT4-fs warning (device zram0): ext4_end_bio:343: I/O error 10 writing to inode 16386 starting block 159744) Buffer I/O error on device zram0, logical block 159744 With direct reclaim, memory is really reclaimed and allocation succeeds, eventually. In the worst case, the oom killer is invoked, which is proper outcome if user sets up zram too large (in comparison to available RAM). This very diff doesn't apply to 5.19 (stable) cleanly (see PTR_ERR note above). Use revert of e7be8d1dd983 directly. Link: https://bugzilla.suse.com/show_bug.cgi?id=1202203 Link: https://lkml.kernel.org/r/20220810070609.14402-1-jslaby@suse.cz Fixes: e7be8d1dd983 ("zram: remove double compression logic") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Alexey Romanov <avromanov@sberdevices.ru> Cc: Dmitry Rokosov <ddrokosov@sberdevices.ru> Cc: Lukas Czerner <lczerner@redhat.com> Cc: <stable@vger.kernel.org> [5.19] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-20	Merge tag 'block-6.0-2022-08-19' of git://git.kernel.dk/linux-block	Linus Torvalds	1	-6/+27
	Pull block fixes from Jens Axboe: "A few fixes that should go into this release: - Small series of patches for ublk (ZiyangZhang) - Remove dead function (Yu) - Fix for running a block queue in case of resource starvation (Yufen)" * tag 'block-6.0-2022-08-19' of git://git.kernel.dk/linux-block: blk-mq: run queue no matter whether the request is the last request blk-mq: remove unused function blk_mq_queue_stopped() ublk_drv: do not add a re-issued request aborted previously to ioucmd's task_work ublk_drv: update comment for __ublk_fail_req() ublk_drv: check ubq_daemon_is_dying() in __ublk_rq_task_work() ublk_drv: update iod->addr for UBLK_IO_NEED_GET_DATA
2022-08-16	ublk_drv: do not add a re-issued request aborted previously to ioucmd's task_work	ZiyangZhang	1	-1/+17
	In ublk_queue_rq(), assume the current request is a re-issued request that was previously aborted in monitor_work because the ubq_daemon (ioucmd's task) is PF_EXITING. For this request, we cannot call io_uring_cmd_complete_in_task() anymore, because at that moment the io_uring context may already be freed if no inflight ioucmd exists. Otherwise, we may cause a null-deref in ctx->fallback_work. Add a check on UBLK_IO_FLAG_ABORTED to prevent the above situation. This check is safe and makes sense. Note: monitor_work sets UBLK_IO_FLAG_ABORTED and ends this request (releasing the tag). Then the request is restarted (allocating the tag) and we are here. Since releasing/allocating a tag implies smp_mb(), finding UBLK_IO_FLAG_ABORTED guarantees that this is a re-issued request that was aborted previously. Suggested-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220815023633.259825-4-ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-16	ublk_drv: update comment for __ublk_fail_req()	ZiyangZhang	1	-2/+3
	Since __ublk_rq_task_work always fails requests immediately during exiting, __ublk_fail_req() is only called from abort context during exiting. So lock is unnecessary. Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220815023633.259825-3-ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-16	ublk_drv: check ubq_daemon_is_dying() in __ublk_rq_task_work()	ZiyangZhang	1	-3/+2
	Replace direct check on PF_EXITING in __ublk_rq_task_work() by the existing wrapper. Also inline ubq_daemon_is_dying(). Reviewed-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Link: https://lore.kernel.org/r/20220815023633.259825-2-ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-14	Merge tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip	Linus Torvalds	2	-15/+9
	Pull more xen updates from Juergen Gross: - fix the handling of the "persistent grants" feature negotiation between Xen blkfront and Xen blkback drivers - a cleanup of xen.config and adding xen.config to Xen section in MAINTAINERS - support HVMOP_set_evtchn_upcall_vector, which is more compliant to "normal" interrupt handling than the global callback used up to now - further small cleanups * tag 'for-linus-6.0-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: MAINTAINERS: add xen config fragments to XEN HYPERVISOR sections xen: remove XEN_SCRUB_PAGES in xen.config xen/pciback: Fix comment typo xen/xenbus: fix return type in xenbus_file_read() xen-blkfront: Apply 'feature_persistent' parameter when connect xen-blkback: Apply 'feature_persistent' parameter when connect xen-blkback: fix persistent grants negotiation x86/xen: Add support for HVMOP_set_evtchn_upcall_vector
2022-08-13	ublk_drv: update iod->addr for UBLK_IO_NEED_GET_DATA	ZiyangZhang	1	-0/+5
	If ublksrv sends UBLK_IO_NEED_GET_DATA with a newly allocated io buffer, we have to update iod->addr in task_work before calling io_uring_cmd_done(). Then the userspace target can handle the (write) io request with the new io buffer read from the updated iod. Without this change, the userspace target may touch a wrong io buffer! Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220810055212.66417-1-ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-12	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds	1	-14/+10
	Pull virtio updates from Michael Tsirkin: - A huge patchset supporting vq resize using the new vq reset capability - Features, fixes, and cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits) vdpa/mlx5: Fix possible uninitialized return value vdpa_sim_blk: add support for discard and write-zeroes vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests vdpa_sim_blk: check if sector is 0 for commands other than read or write vdpa_sim: Implement suspend vdpa op vhost-vdpa: uAPI to suspend the device vhost-vdpa: introduce SUSPEND backend feature bit vdpa: Add suspend operation virtio-blk: Avoid use-after-free on suspend/resume virtio_vdpa: support the arg sizes of find_vqs() vhost-vdpa: Call ida_simple_remove() when failed vDPA: fix 'cast to restricted le16' warnings in vdpa.c vDPA: !FEATURES_OK should not block querying device config space vDPA/ifcvf: support userspace to query features and MQ of a management device vDPA/ifcvf: get_config_size should return a value no greater than dev implementation vhost scsi: Allow user to control num virtqueues vhost-scsi: Fix max number of virtqueues vdpa/mlx5: Support different address spaces for control and data vdpa/mlx5: Implement susupend virtqueue callback ...
2022-08-12	xen-blkfront: Apply 'feature_persistent' parameter when connect	SeongJae Park	1	-3/+1
	In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if the frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grants or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check not be repeated once it has shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. A similar behavioral change was made on 'blkfront' by commit 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants"). This commit changes the behavior of the parameter to take effect for every connect, so that the previous behavior of 'blkfront' can be restored. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-4-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-12	xen-blkback: Apply 'feature_persistent' parameter when connect	Maximilian Heyne	1	-6/+3
	In some use cases[1], the backend is created while the frontend doesn't support the persistent grants feature, but later the frontend can be changed to support the feature and reconnect. In the past, 'blkback' enabled the persistent grants feature since it unconditionally checked if the frontend supports the persistent grants feature for every connect ('connect_ring()') and decided whether it should use persistent grants or not. However, commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has mistakenly changed the behavior. It made the frontend feature support check not be repeated once it has shown the 'feature_persistent' as 'false', or the frontend doesn't support persistent grants. This commit changes the behavior of the parameter to take effect for every connect, so that the previous workflow can work again as expected. [1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/ Reported-by: Andrii Chepurnyi <andrii.chepurnyi82@gmail.com> Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-3-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
2022-08-12	xen-blkback: fix persistent grants negotiation	SeongJae Park	1	-8/+7
	The persistent grants feature can be used only when both the backend and the frontend support the feature. The feature was always supported by 'blkback', but commit aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") has introduced a parameter for disabling it at runtime. To avoid the parameter being updated while it is in use by 'blkback', the commit caches the parameter into 'vbd->feature_gnt_persistent' in 'xen_vbd_create()', then checks whether the guest also supports the feature, and finally updates the field in 'connect_ring()'. However, 'connect_ring()' could be called before 'xen_vbd_create()', so a later execution of 'xen_vbd_create()' can wrongly overwrite 'vbd->feature_gnt_persistent' with 'true'. As a result, 'blkback' could try to use the 'persistent grants' feature even if the guest doesn't support the feature. This commit fixes the issue by moving the parameter value caching to 'xen_blkif_alloc()', which allocates the 'blkif'. Because the struct embeds the 'vbd' object, which will be used by 'connect_ring()' later, this should be called before 'connect_ring()' and therefore this should be the right and safe place to do the caching. Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Signed-off-by: SeongJae Park <sj@kernel.org> Reviewed-by: Maximilian Heyne <mheyne@amazon.de> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20220715225108.193398-2-sj@kernel.org Signed-off-by: Juergen Gross <jgross@suse.com>
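	The ordering problem described in this commit message can be illustrated with a small userspace sketch. It is not the blkback code: `vbd_sketch`, `sketch_alloc()` and `sketch_connect()` are hypothetical names, and the module parameter and the frontend's advertised support are modelled as plain booleans. The point is only that the parameter is cached once, at allocation time, so a later step can never overwrite the result negotiated at connect time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-device backend state. */
struct vbd_sketch {
	bool feature_gnt_persistent;
};

/*
 * Cache the (hypothetical) module parameter at allocation time.
 * Allocation always happens before any connect, so nothing can
 * later clobber the negotiated value.
 */
static void sketch_alloc(struct vbd_sketch *vbd, bool param_enabled)
{
	vbd->feature_gnt_persistent = param_enabled;
}

/*
 * On connect, keep the feature only if the frontend also supports it:
 * persistent grants need both sides to agree.
 */
static void sketch_connect(struct vbd_sketch *vbd, bool frontend_supports)
{
	vbd->feature_gnt_persistent =
		vbd->feature_gnt_persistent && frontend_supports;
}
```

	With this ordering, a frontend that does not advertise the feature always ends up with it disabled, regardless of the parameter value.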
2022-08-11	Merge tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client	Linus Torvalds	1	-3/+3
	Pull ceph updates from Ilya Dryomov: "We have a good pile of various fixes and cleanups from Xiubo, Jeff, Luis and others, almost exclusively in the filesystem. Several patches touch files outside of our normal purview to set the stage for bringing in Jeff's long awaited ceph+fscrypt series in the near future. All of them have appropriate acks and sat in linux-next for a while" * tag 'ceph-for-5.20-rc1' of https://github.com/ceph/ceph-client: (27 commits) libceph: clean up ceph_osdc_start_request prototype libceph: fix ceph_pagelist_reserve() comment typo ceph: remove useless check for the folio ceph: don't truncate file in atomic_open ceph: make f_bsize always equal to f_frsize ceph: flush the dirty caps immediatelly when quota is approaching libceph: print fsid and epoch with osd id libceph: check pointer before assigned to "c->rules[]" ceph: don't get the inline data for new creating files ceph: update the auth cap when the async create req is forwarded ceph: make change_auth_cap_ses a global symbol ceph: fix incorrect old_size length in ceph_mds_request_args ceph: switch back to testing for NULL folio->private in ceph_dirty_folio ceph: call netfs_subreq_terminated with was_async == false ceph: convert to generic_file_llseek ceph: fix the incorrect comment for the ceph_mds_caps struct ceph: don't leak snap_rwsem in handle_cap_grant ceph: prevent a client from exceeding the MDS maximum xattr size ceph: choose auth MDS for getxattr with the Xs caps ceph: add session already open notify support ...
2022-08-11	virtio-blk: Avoid use-after-free on suspend/resume	Shigeru Yoshida	1	-14/+10
	hctx->user_data is set to vq in virtblk_init_hctx(). However, vq is freed on suspend and reallocated on resume. So, hctx->user_data is invalid after resume, and it will cause use-after-free accessing which will result in the kernel crash something like below: [ 22.428391] Call Trace: [ 22.428899] <TASK> [ 22.429339] virtqueue_add_split+0x3eb/0x620 [ 22.430035] ? __blk_mq_alloc_requests+0x17f/0x2d0 [ 22.430789] ? kvm_clock_get_cycles+0x14/0x30 [ 22.431496] virtqueue_add_sgs+0xad/0xd0 [ 22.432108] virtblk_add_req+0xe8/0x150 [ 22.432692] virtio_queue_rqs+0xeb/0x210 [ 22.433330] blk_mq_flush_plug_list+0x1b8/0x280 [ 22.434059] __blk_flush_plug+0xe1/0x140 [ 22.434853] blk_finish_plug+0x20/0x40 [ 22.435512] read_pages+0x20a/0x2e0 [ 22.436063] ? folio_add_lru+0x62/0xa0 [ 22.436652] page_cache_ra_unbounded+0x112/0x160 [ 22.437365] filemap_get_pages+0xe1/0x5b0 [ 22.437964] ? context_to_sid+0x70/0x100 [ 22.438580] ? sidtab_context_to_sid+0x32/0x400 [ 22.439979] filemap_read+0xcd/0x3d0 [ 22.440917] xfs_file_buffered_read+0x4a/0xc0 [ 22.441984] xfs_file_read_iter+0x65/0xd0 [ 22.442970] __kernel_read+0x160/0x2e0 [ 22.443921] bprm_execve+0x21b/0x640 [ 22.444809] do_execveat_common.isra.0+0x1a8/0x220 [ 22.446008] __x64_sys_execve+0x2d/0x40 [ 22.446920] do_syscall_64+0x37/0x90 [ 22.447773] entry_SYSCALL_64_after_hwframe+0x63/0xcd This patch fixes this issue by getting vq from vblk, and removes virtblk_init_hctx(). Fixes: 4e0400525691 ("virtio-blk: support polling I/O") Cc: "Suwan Kim" <suwan.kim027@gmail.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Message-Id: <20220810160948.959781-1-syoshida@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
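	The general pattern behind this fix, looking an object up through its owner on each use instead of caching a pointer that a suspend/resume cycle invalidates, can be sketched in userspace C. All names below (`vq_sketch`, `vblk_sketch`, `hctx_sketch`, `sketch_get_vq()`) are hypothetical and only mirror the shape of the description above; this is not the virtio-blk implementation.

```c
#include <stdlib.h>

struct vq_sketch {
	unsigned int id;
};

/* Owner of the queue array; the array is freed on suspend. */
struct vblk_sketch {
	struct vq_sketch *vqs;
	unsigned int num_vqs;
};

/* The hardware context keeps only a stable index, never a pointer. */
struct hctx_sketch {
	unsigned int queue_num;
};

/* (Re)allocate the queues, as would happen on probe or resume. */
static int sketch_resume(struct vblk_sketch *vblk, unsigned int n)
{
	unsigned int i;

	vblk->vqs = calloc(n, sizeof(*vblk->vqs));
	if (!vblk->vqs)
		return -1;
	vblk->num_vqs = n;
	for (i = 0; i < n; i++)
		vblk->vqs[i].id = i;
	return 0;
}

/* Free the queues, as would happen on suspend. */
static void sketch_suspend(struct vblk_sketch *vblk)
{
	free(vblk->vqs);
	vblk->vqs = NULL;
	vblk->num_vqs = 0;
}

/* Resolve the queue through the owner on every use. */
static struct vq_sketch *sketch_get_vq(struct vblk_sketch *vblk,
				       const struct hctx_sketch *hctx)
{
	if (!vblk->vqs || hctx->queue_num >= vblk->num_vqs)
		return NULL;
	return &vblk->vqs[hctx->queue_num];
}
```

	Because the index stays valid across a free and reallocation while a cached pointer does not, the lookup returns the new queue after resume rather than dangling memory.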
2022-08-06	Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm	Linus Torvalds	2	-8/+9
	Pull MM updates from Andrew Morton: "Most of the MM queue. A few things are still pending. Liam's maple tree rework didn't make it. This has resulted in a few other minor patch series being held over for next time. Multi-gen LRU still isn't merged as we were waiting for mapletree to stabilize. The current plan is to merge MGLRU into -mm soon and to later reintroduce mapletree, with a view to hopefully getting both into 6.1-rc1. Summary: - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe Lin, Yang Shi, Anshuman Khandual and Mike Rapoport - Some kmemleak fixes from Patrick Wang and Waiman Long - DAMON updates from SeongJae Park - memcg debug/visibility work from Roman Gushchin - vmalloc speedup from Uladzislau Rezki - more folio conversion work from Matthew Wilcox - enhancements for coherent device memory mapping from Alex Sierra - addition of shared pages tracking and CoW support for fsdax, from Shiyang Ruan - hugetlb optimizations from Mike Kravetz - Mel Gorman has contributed some pagealloc changes to improve latency and realtime behaviour. - mprotect soft-dirty checking has been improved by Peter Xu - Many other singleton patches all over the place" [ XFS merge from hell as per Darrick Wong in https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ] * tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits) tools/testing/selftests/vm/hmm-tests.c: fix build mm: Kconfig: fix typo mm: memory-failure: convert to pr_fmt() mm: use is_zone_movable_page() helper hugetlbfs: fix inaccurate comment in hugetlbfs_statfs() hugetlbfs: cleanup some comments in inode.c hugetlbfs: remove unneeded header file hugetlbfs: remove unneeded hugetlbfs_ops forward declaration hugetlbfs: use helper macro SZ_1{K,M} mm: cleanup is_highmem() mm/hmm: add a test for cross device private faults selftests: add soft-dirty into run_vmtests.sh selftests: soft-dirty: add test for mprotect mm/mprotect: fix soft-dirty check in can_change_pte_writable() mm: memcontrol: fix potential oom_lock recursion deadlock mm/gup.c: fix formatting in check_and_migrate_movable_page() xfs: fail dax mount if reflink is enabled on a partition mm/memcontrol.c: remove the redundant updating of stats_flush_threshold userfaultfd: don't fail on unrecognized features hugetlb_cgroup: fix wrong hugetlb cgroup numa stat ...
2022-08-03	libceph: clean up ceph_osdc_start_request prototype	Jeff Layton	1	-3/+3
	This function always returns 0, and ignores the nofail boolean. Drop the nofail argument, make the function void return and fix up the callers. Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-08-03	ublk_drv: add support for UBLK_IO_NEED_GET_DATA	ZiyangZhang	1	-12/+94
	UBLK_IO_NEED_GET_DATA is one ublk IO command. It is designed for a user application that wants to allocate the IO buffer and set the IO buffer address only after it receives an IO request from ublksrv. This is a reasonable scenario because these users may use an RPC framework as one IO backend to handle IO requests passed from ublksrv, and an RPC framework may allocate its own buffer (or memory pool). This new feature (UBLK_F_NEED_GET_DATA) is optional for ublk users. Related userspace code has been added in ublksrv[1] as one pull request. Test cases for this feature are added in ublksrv and all the tests pass. The performance result shows that this new feature does bring additional latency because one IO is issued back to ublk_drv once again to copy data from bio vectors to the user-provided data buffer. UBLK_IO_NEED_GET_DATA is suitable for bigger block size such as 512B or 1MB. [1] https://github.com/ming1/ubdsrv Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/3a21007ea1be8304246e654cebbd581ab0012623.1659011443.git.ZiyangZhang@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	ublk_drv: cleanup ublksrv_ctrl_dev_info	Ming Lei	1	-11/+7
	Remove all block device related info from ublksrv_ctrl_dev_info, meantime reduce its size into 64 bytes because: 1) ublksrv_ctrl_dev_info becomes cleaner without including any block related info 2) generic set/get parameter command can be used to set block related setting easily and cleanly 3) generic set/get parameter command can be used for extending ublk without needing more info in ublksrv_ctrl_dev_info Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220730092750.1118167-5-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	ublk_drv: add SET_PARAMS/GET_PARAMS control command	Ming Lei	1	-18/+187
	Add two commands to set/get parameters generically. One important goal of ublk is to provide a generic framework for flexibly implementing block devices in userspace. As one generic block device, there are still lots of block parameters, such as max_sectors, write_cache/fua, discard related limits, zoned parameters, ..., so this patch starts to add a generic mechanism for setting/getting device parameters. Both generic block parameters (all kinds of queue settings) and ublk feature parameters can be covered this way, so it becomes quite easy to extend in future. Two parameter types are used so far: basic (covers basic queue settings and misc settings which can't be grouped easily) and discard. The basic type must be set, and the discard type is optional for now. This provides a mechanism to simulate any kind of generic block device from userspace easily, from both the block queue setting viewpoint and the ublk feature viewpoint. The style of putting all parameters together is suggested by Christoph. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220730092750.1118167-4-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
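	The "basic type must be set, discard type is optional" rule from this message can be sketched as a bitmask check. The constants, the struct layout and `sketch_validate_params()` below are hypothetical illustrations, not the ublk uAPI; rejecting unknown type bits is an extra assumption of this sketch that the commit message does not state.

```c
#include <stdint.h>

/* Hypothetical parameter-type bits, one per parameter group. */
#define SKETCH_PARAM_BASIC	(1u << 0)
#define SKETCH_PARAM_DISCARD	(1u << 1)
#define SKETCH_PARAM_ALL	(SKETCH_PARAM_BASIC | SKETCH_PARAM_DISCARD)

/* Hypothetical container: 'types' says which groups are filled in. */
struct sketch_params {
	uint32_t types;
	uint32_t max_sectors;		/* part of the basic group */
	uint32_t max_discard_sectors;	/* part of the discard group */
};

/* Return 0 if the parameter set is acceptable, -1 otherwise. */
static int sketch_validate_params(const struct sketch_params *p)
{
	/* Assumption of this sketch: unknown groups are rejected. */
	if (p->types & ~SKETCH_PARAM_ALL)
		return -1;

	/* The basic group is mandatory. */
	if (!(p->types & SKETCH_PARAM_BASIC))
		return -1;

	/* The discard group may be present or absent. */
	return 0;
}
```

	Keeping every group behind one type bit is what makes the scheme extensible: a new group only needs a new bit and a new member.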
2022-08-03	ublk_drv: fix ublk device leak in case that add_disk fails	Ming Lei	1	-0/+5
	->free_disk is only called after disk is added successfully, so drop ublk device reference in case of add_disk() failure. Fixes: 6d9e6dfdf3b2 ("ublk: defer disk allocation") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220730092750.1118167-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	ublk_drv: cancel device even though disk isn't up	Ming Lei	1	-6/+12
	Each ublk queue is started before adding disk, we have to cancel queues in ublk_stop_dev() so that ubq daemon can be exited, otherwise DEL_DEV command may hang forever. Also avoid to cancel queues two times by checking if queue is ready, otherwise use-after-free on io_uring may be triggered because ublk_stop_dev is called by ublk_remove() too. Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20220730092750.1118167-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	block: change the blk_queue_split calling convention	Christoph Hellwig	3	-3/+3
	The double indirect bio leads to somewhat suboptimal code generation. Instead return the (original or split) bio, and make sure the request_queue arguments to the lower level helpers is passed after the bio to avoid constant reshuffling of the argument passing registers. Also give it and the helpers used to implement it more descriptive names. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220727162300.3089193-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	remove the sx8 block driver	Christoph Hellwig	3	-1593/+0
	This driver is for fairly obscure hardware, and has only seen random drive-by changes after the maintainer stopped working on it in 2005 (about a year and a half after it was introduced). It has some "interesting" block layer interactions, so let's just drop it unless anyone complains. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220721064102.1715460-1-hch@lst.de [axboe: fix date typo, it was in 2005, not 2015] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	nbd: add missing definition of pr_fmt	Yu Kuai	1	-2/+4
	commit 1243172d5894 ("nbd: use pr_err to output error message") tries to define pr_fmt and use short pr_err() to output error messages; however, the definition is missing. This patch also removes the existing "nbd:" inside pr_err(). Fixes: 1243172d5894 ("nbd: use pr_err to output error message") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220723082427.3890655-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	null_blk: fix ida error handling in null_add_dev()	Dan Carpenter	1	-3/+11
	There needs to be some error checking if ida_simple_get() fails. Also call ida_free() if there are errors later. Fixes: 94bc02e30fb8 ("nullb: use ida to manage index") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Link: https://lore.kernel.org/r/YtEhXsr6vJeoiYhd@kili Signed-off-by: Jens Axboe <axboe@kernel.dk>
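	The two halves of this fix, checking the ID allocation result and releasing the ID when a later step fails, follow a common pattern that can be sketched without kernel headers. The tiny allocator below is a hypothetical stand-in for the kernel's ida (whose allocation call returns a negative errno on failure); `sketch_add_dev()` and its `fail_later` flag are likewise invented for illustration.

```c
#include <stdbool.h>

#define SKETCH_MAX_IDS 4

static bool sketch_used[SKETCH_MAX_IDS];

/* Return the lowest free id, or -1 when none is left. */
static int sketch_id_get(void)
{
	int i;

	for (i = 0; i < SKETCH_MAX_IDS; i++) {
		if (!sketch_used[i]) {
			sketch_used[i] = true;
			return i;
		}
	}
	return -1;
}

static void sketch_id_free(int id)
{
	if (id >= 0 && id < SKETCH_MAX_IDS)
		sketch_used[id] = false;
}

/*
 * Allocate an id and run a later setup step; 'fail_later' simulates
 * that step failing. Returns the id on success, -1 on any failure.
 */
static int sketch_add_dev(bool fail_later)
{
	int id = sketch_id_get();

	if (id < 0)
		return id;		/* check before using the id */

	if (fail_later) {
		sketch_id_free(id);	/* do not leak the id on error */
		return -1;
	}
	return id;
}
```

	Without the release on the error path, every failed add would permanently consume one ID.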
2022-08-03	null_blk: add configfs variables for 2 options	Vincent Fu	2	-19/+50
	Allow setting via configfs these two options: no_sched shared_tag_bitmap Previously these could only be activated as module parameters. Still missing are: shared_tags timeout requeue init_hctx Signed-off-by: Vincent Fu <vincent.fu@samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220708174943.87787-3-vincent.fu@samsung.com [axboe: fold in nullb == NULL fix] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	null_blk: add module parameters for 4 options	Vincent Fu	1	-0/+20
	Add as module parameters these options: memory_backed discard mbps cache_size Previously these could only be set via configfs. Still missing is bad_blocks. The kernel test robot found a documentation formatting issue in v1 of this patch. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Vincent Fu <vincent.fu@samsung.com> Link: https://lore.kernel.org/r/20220708174943.87787-2-vincent.fu@samsung.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	block/rnbd-srv: Replace sess_dev_list with index_idr	Md Haris Iqbal	2	-14/+7
	The structure rnbd_srv_session maintains a list and an xarray of rnbd_srv_dev. There is no need to keep both as one of them can serve the purpose. Since one of the places where the lookup of rnbd_srv_dev using rnbd_srv_session is IO path, an xarray would serve us better than a list traversal. Hence remove sess_dev_list from rnbd_srv_session, and replace its uses from xarray. Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Reviewed-by: Aleksei Marov <aleksei.marov@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Link: https://lore.kernel.org/r/20220707143122.460362-3-haris.iqbal@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	block/rnbd-srv: Set keep_id to true after mutex_trylock	Md Haris Iqbal	1	-1/+2
	If keep_id is set before mutex_trylock and the trylock then fails, keep_id stays set for the rest of the sess_dev lifetime. Therefore, set keep_id to true only after mutex_trylock succeeds, so that a failure of trylock doesn't touch keep_id. Fixes: b168e1d85cf3 ("block/rnbd-srv: Prevent a deadlock generated by accessing sysfs in parallel") Cc: gi-oh.kim@ionos.com Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Jack Wang <jinpu.wang@ionos.com> Link: https://lore.kernel.org/r/20220707143122.460362-2-haris.iqbal@ionos.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
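	The ordering this commit describes, modifying state only once the trylock has succeeded, can be sketched in portable C11. A C11 `atomic_flag` stands in for the kernel mutex, and `sess_dev_sketch`, `sketch_trylock()` and `sketch_begin_close()` are hypothetical names, not the rnbd-srv code.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct sess_dev_sketch {
	atomic_flag lock;	/* stand-in for the mutex */
	bool keep_id;
};

static void sketch_init(struct sess_dev_sketch *sd)
{
	atomic_flag_clear(&sd->lock);
	sd->keep_id = false;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool sketch_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set(l);
}

static void sketch_unlock(atomic_flag *l)
{
	atomic_flag_clear(l);
}

/*
 * Set keep_id only after the lock has been taken, so a failed trylock
 * leaves the object exactly as it was. On success the caller holds
 * the lock and must call sketch_unlock().
 */
static bool sketch_begin_close(struct sess_dev_sketch *sd)
{
	if (!sketch_trylock(&sd->lock))
		return false;	/* keep_id is untouched */

	sd->keep_id = true;
	return true;
}
```

	Setting the flag before the trylock would leave it set even when the lock is contended, which is the bug the message describes.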
2022-08-03	rnbd-clt: make rnbd_clt_change_capacity return void	Guoqing Jiang	1	-4/+3
	There is no need to check the return value, so make it return void. Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Link: https://lore.kernel.org/r/20220706133152.12058-9-guoqing.jiang@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-08-03	rnbd-clt: pass sector_t type for resize capacity	Guoqing Jiang	3	-5/+5
	Let's change the parameter type to 'sector_t' then we don't need to cast it from rnbd_clt_resize_dev_store, and update rnbd_clt_resize_disk too. Acked-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Link: https://lore.kernel.org/r/20220706133152.12058-8-guoqing.jiang@linux.dev Signed-off-by: Jens Axboe <axboe@kernel.dk>