Age | Commit message | Author | Files | Lines
2023-06-14 | md/raid10: improve code of mrdev in raid10_sync_request | Li Nan | 1 | -11/+12
'need_recover' and 'mrdev' are equivalent in raid10_sync_request(), and incrementing mrdev->nr_pending is unreasonable when recovery is not needed. Replace 'need_recover' with 'mrdev', and only increment nr_pending when needed. Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230527072218.2365857-3-linan666@huaweicloud.com
2023-06-14 | md/raid10: fix null-ptr-deref of mreplace in raid10_sync_request | Li Nan | 1 | -9/+5
There are two checks of 'mreplace' in raid10_sync_request(). In the first check, 'need_replace' is set, and 'mreplace' will be used later, if a non-Faulty 'mreplace' exists. In the second check, 'mreplace' is set to NULL if it is Faulty, but 'need_replace' is not changed accordingly. A null-ptr-deref occurs if Faulty is set between the two checks. Fix it by merging the two checks into one, and replace 'need_replace' with 'mreplace' because their values are always the same. Fixes: ee37d7314a32 ("md/raid10: Fix raid10 replace hang when new added disk faulty") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230527072218.2365857-2-linan666@huaweicloud.com
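The merged check can be pictured as making one decision about the replacement device and carrying the pointer itself forward instead of a separate flag. The following is a minimal userspace sketch, not the kernel code: `struct rdev_sketch`, `FAULTY_BIT`, `is_faulty()` and `pick_replacement()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's rdev structure. */
struct rdev_sketch {
    unsigned long flags;
};

#define FAULTY_BIT 0x1UL /* assumed flag bit, for illustration only */

static bool is_faulty(const struct rdev_sketch *rdev)
{
    assert(rdev != NULL);
    return (rdev->flags & FAULTY_BIT) != 0;
}

/*
 * One merged check: return the replacement only if it exists and is not
 * Faulty, otherwise NULL. Callers test the returned pointer instead of a
 * separate 'need_replace' flag, so the flag and the pointer cannot disagree.
 */
static struct rdev_sketch *pick_replacement(struct rdev_sketch *mreplace)
{
    if (mreplace == NULL || is_faulty(mreplace))
        return NULL;
    return mreplace;
}
```

A caller would store the result once and test that local pointer everywhere a flag was tested before. The sketch is single-threaded and does not model the locking or RCU rules of the real function.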
2023-06-14 | md/raid5: don't start reshape when recovery or replace is in progress | Yu Kuai | 1 | -0/+8
When recovery is interrupted (reboot, etc.), checking MD_RECOVERY_RUNNING is not enough to tell whether recovery is in progress. Also check recovery_cp before starting reshape. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230529133410.2125914-1-yukuai1@huaweicloud.com
2023-06-14 | md: protect md_thread with rcu | Yu Kuai | 12 | -81/+97
Currently, md_thread can be accessed without protection in many places. The following known scenarios can cause a null-ptr-dereference or use-after-free (UAF): 1) sync_thread that is allocated and started from md_start_sync(); 2) mddev->thread can be accessed directly from timeout_store() and md_bitmap_daemon_work(); 3) md_unregister_thread() from action_store(). Currently, a global spinlock 'pers_lock' is borrowed to protect 'mddev->thread' in some places. This problem could be fixed the same way, but using a global lock for all the cases is not good. Fix this problem by protecting all md_thread with rcu. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-6-yukuai1@huaweicloud.com
2023-06-14 | md/bitmap: factor out a helper to set timeout | Yu Kuai | 1 | -16/+19
Registering and unregistering 'mddev->thread' both happen under 'reconfig_mutex'; however, some contexts access mddev->thread without holding the mutex, which can cause a null-ptr-dereference: 1) md_bitmap_daemon_work() can be called from md_check_recovery() where 'reconfig_mutex' is not held; dereferencing 'mddev->thread' might cause a null-ptr-dereference, because md_unregister_thread() resets the pointer before stopping the thread. 2) timeout_store() accesses 'mddev->thread' multiple times; a null-ptr-dereference can be triggered if 'mddev->thread' is reset in the middle. This patch factors out a helper to set the timeout. The new helper always checks whether 'mddev->thread' is NULL first, so that problem 1 can be fixed. The helper now accesses 'mddev->thread' only once, but 'mddev->thread' can still be freed while the helper is in progress, hence the problem is not fixed yet. Follow-up patches will fix this by protecting md_thread with rcu. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-5-yukuai1@huaweicloud.com
2023-06-14 | md/bitmap: always wake up md_thread in timeout_store | Yu Kuai | 1 | -3/+3
md_wakeup_thread() can handle the case where the md_thread passed in is NULL. The only difference is that md_wakeup_thread() will now also be called when the current timeout is 'MAX_SCHEDULE_TIMEOUT'. This should not matter because timeout_store() is not a hot path, and the daemon process is already woken up more often than needed from other contexts. Prepare to factor out a helper to set timeout. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-4-yukuai1@huaweicloud.com
2023-06-14 | dm-raid: remove useless checking in raid_message() | Yu Kuai | 1 | -2/+2
md_wakeup_thread() handles the case where the md_thread passed in is NULL, so there is no need to check this. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-3-yukuai1@huaweicloud.com
2023-06-14 | md: factor out a helper to wake up md_thread directly | Yu Kuai | 1 | -8/+19
md_wakeup_thread() can't wake up md_thread->tsk if md_thread->run is still in progress, and in some cases md_thread->tsk needs to be woken up directly, as in md_set_readonly() and do_md_stop(). Commit 9dfbdafda3b3 ("md: unlock mddev before reap sync_thread in action_store") introduced a new scenario where unregistering sync_thread is not protected by 'reconfig_mutex'. This can cause a null-ptr-dereference in theory:

  t1: md_set_readonly                       t2: action_store
                                            md_unregister_thread
                                            // 'reconfig_mutex' is not held
  // 'reconfig_mutex' is held by caller
  if (mddev->sync_thread)
                                            thread = *threadp
                                            *threadp = NULL
  wake_up_process(mddev->sync_thread->tsk)
  // null-ptr-dereference

Fix this problem by factoring out a helper to wake up md_thread directly, so that 'sync_thread' won't be accessed multiple times from the reader side. This helper also prepares for protecting md_thread with rcu. Note that later patches are going to fix the fact that unregistering sync_thread from action_store() is not protected by 'reconfig_mutex'.
Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523021017.3048783-2-yukuai1@huaweicloud.com
2023-06-14 | md: fix duplicate filename for rdev | Yu Kuai | 2 | -44/+52
Commit 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device from an md array via sysfs") delays the deletion of rdev. However, this introduces a window in which rdev can be added again while the deletion is not done yet, and sysfs will complain about a duplicate filename.

Follow-up patches tried to fix this problem by flushing the workqueue. However, flush_rdev_wq() is just dead code. The process in md_kick_rdev_from_array() is:
  1) list_del_rcu(&rdev->same_set);
  2) synchronize_rcu();
  3) queue_work(md_rdev_misc_wq, &rdev->del_work);
So in flush_rdev_wq(), if rdev is found in the list, work_pending() can never pass; conversely, if the work is queued, rdev can never be found in the list.

flush_rdev_wq() could be replaced by flush_workqueue() directly, but this approach is not good:
  - the workqueue is global, and this synchronization for all raid disks is not necessary.
  - flush_workqueue() can't be called under 'reconfig_mutex', so there is still a small window between flush_workqueue() and mddev_lock() in which other contexts can queue new work, hence the problem is not solved completely.

sysfs already has APIs that let an attribute delete itself from its own writer, and these APIs, specifically sysfs_break/unbreak_active_protection(), are used to support deleting rdev synchronously. Therefore, the above commit can be reverted, and the sysfs duplicate filename can be avoided. A new mdadm regression test is proposed as well ([1]).
[1] https://lore.kernel.org/linux-raid/20230428062845.1975462-1-yukuai1@huaweicloud.com/ Fixes: 5792a2856a63 ("[PATCH] md: avoid a deadlock when removing a device from an md array via sysfs") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230523012727.3042247-1-yukuai1@huaweicloud.com
2023-06-14 | md/raid10: fix wrong setting of max_corr_read_errors | Li Nan | 1 | -0/+2
There is no input check when writing to md/max_read_errors, so overflow might occur. Add a check of the input number. Fixes: 1e50915fe0bb ("raid: improve MD/raid10 handling of correctable read errors.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230522072535.1523740-3-linan666@huaweicloud.com
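As an illustration of this kind of input check, here is a minimal userspace sketch that parses a decimal string and rejects anything that would not fit the destination type. The function name `parse_max_read_errors()` and the choice of `INT_MAX` as the upper bound are assumptions made for this sketch; the commit message does not state the exact bound used in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a decimal string (optionally newline-terminated, as written via
 * echo) and reject values that do not fit in an int. Returns 0 on success
 * and -EINVAL otherwise; *out is only written on success.
 */
static int parse_max_read_errors(const char *buf, int *out)
{
    char *end;
    unsigned long long n;

    assert(buf != NULL && out != NULL);

    /* strtoull() would silently accept "-1" or leading blanks. */
    if (buf[0] < '0' || buf[0] > '9')
        return -EINVAL;

    errno = 0;
    n = strtoull(buf, &end, 10);
    if (errno != 0 || end == buf)
        return -EINVAL;
    if (*end != '\0' && *end != '\n')
        return -EINVAL;
    if (n > INT_MAX)            /* assumed bound: value must fit in an int */
        return -EINVAL;

    *out = (int)n;
    return 0;
}
```

Without the range check, a value such as 4294967296 would be truncated when stored in a narrower field, which is the kind of overflow the commit describes.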
2023-06-14 | md/raid10: fix overflow of md/safe_mode_delay | Li Nan | 1 | -3/+4
There is no input check when writing to md/safe_mode_delay in safe_delay_store(), and msec might also overflow when HZ < 1000 in safe_delay_show(). Fix it by checking for overflow in safe_delay_store() and using an unsigned long conversion in safe_delay_show(). Fixes: 72e02075a33f ("md: factor out parsing of fixed-point numbers") Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230522072535.1523740-2-linan666@huaweicloud.com
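Both directions of the conversion can be sketched in userspace. The helpers below, `msec_to_ticks()` and `ticks_to_msec()`, are hypothetical names, and the `hz` parameter stands in for the kernel's HZ constant; the bounds chosen are assumptions for illustration rather than the exact checks in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/*
 * Store side: convert milliseconds to ticks at 'hz' ticks per second.
 * Oversized input is rejected up front so neither the multiplication nor
 * the final narrowing can overflow. A non-zero delay is rounded up to at
 * least one tick.
 */
static int msec_to_ticks(unsigned long long msec, unsigned int hz,
                         unsigned int *ticks)
{
    unsigned long long t;

    assert(ticks != NULL);
    if (msec > UINT_MAX || hz == 0 || hz > 1000000)
        return -EINVAL;

    t = msec * hz / 1000;       /* product stays far below 2^64 */
    if (msec != 0 && t == 0)
        t = 1;
    if (t > UINT_MAX)
        return -EINVAL;

    *ticks = (unsigned int)t;
    return 0;
}

/* Show side: widen before multiplying so ticks * 1000 cannot wrap. */
static unsigned long long ticks_to_msec(unsigned int ticks, unsigned int hz)
{
    assert(hz != 0);
    return (unsigned long long)ticks * 1000 / hz;
}
```

With `hz` below 1000 the millisecond value is larger than the tick count, which is why the show side needs the wider type even when the stored tick count itself is in range.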
2023-06-14 | md/raid5: fix a deadlock in the case that reshape is interrupted | Yu Kuai | 2 | -1/+44
If reshape is in progress and io across reshape_position is issued, such io will wait for reshape to make progress (see the case where make_stripe_request() returns STRIPE_SCHEDULE_AND_RETRY). It has been reported several times that if the system reboots while growing raid5 to raid6, array assembly hangs forever ([1, 2]). This is because the following deadlock is triggered:
  1) a normal io is waiting for reshape to progress; this io can come from systemd-udevd or mdadm.
  2) during assembly, mdadm tries to suspend the array, hence 'reconfig_mutex' is held and mddev_suspend() must wait for normal io to be done.
  3) the daemon thread can't start reshape because 'reconfig_mutex' can't be acquired.
1) and 3) can't be broken because they are part of the fundamental design. In order to break 2), the following are the possible solutions that I can think of:
  a) Letting mddev_suspend() fail is not a good option, because this would break many scenarios since mddev_suspend() did not fail before.
  b) Fail the io that is waiting for reshape to make progress from mddev_suspend().
  c) Return false for the io that is waiting for reshape to make progress from raid5_make_request(), and let this io wait for suspend to be done in md_handle_request(), where 'active_io' is not grabbed.
c) sounds better than b); however, b) is used because it's easy and straightforward, and it's verified that mdadm can assemble in this case. On the other hand, c) breaks the logic that mddev_suspend() will wait for submitted io to be completely handled. Fix the problem by checking reshape in mddev_suspend(): if reshape can't make progress and there is still some io waiting for reshape, fail that io.
[1] https://lore.kernel.org/all/CAFig2csUV2QiomUhj_t3dPOgV300dbQ6XtM9ygKPdXJFSH__Nw@mail.gmail.com/ [2] https://lore.kernel.org/all/CAO2ABipzbw6QL5eNa44CQHjiVa-LTvS696Mh9QaTw+qsUKFUCw@mail.gmail.com/ Reported-by: Jove <jovetoo@gmail.com> Reported-by: David Gilmour <dgilmour76@gmail.com> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230512015610.821290-6-yukuai1@huaweicloud.com
2023-06-14 | md: add a new api prepare_suspend() in md_personality | Yu Kuai | 2 | -0/+5
There are no functional changes; the new api will be used later to do special handling for raid456 in mddev_suspend(). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230512015610.821290-5-yukuai1@huaweicloud.com
2023-06-14 | md: export md_is_rdwr() and is_md_suspended() | Yu Kuai | 2 | -16/+17
The two apis will be used later to fix a deadlock in raid456; there are no functional changes. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230512015610.821290-4-yukuai1@huaweicloud.com
2023-06-14 | md: fix data corruption for raid456 when reshape restart while grow up | Yu Kuai | 1 | -2/+12
Currently, if reshape is interrupted, echoing "reshape" to sync_action will restart reshape from scratch, for example:
  echo frozen > sync_action
  echo reshape > sync_action
This will corrupt data before reshape_position if the array is growing. Fix the problem by continuing reshape from reshape_position.
Reported-by: Peter Neuwirth <reddunur@online.de> Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@online.de/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230512015610.821290-3-yukuai1@huaweicloud.com
2023-06-14 | md/raid5: don't allow replacement while reshape is in progress | Yu Kuai | 1 | -0/+1
If reshape is interrupted (for example, by echoing frozen to sync_action), then rdev replacement can be set. That is safe at runtime because reshape always takes priority over resync in md_check_recovery(). However, if the system reboots, the kernel will complain that it cannot handle concurrent replacement and reshape, and the array can no longer be assembled. Fix this problem by not allowing replacement until reshape is done. Reported-by: Peter Neuwirth <reddunur@online.de> Link: https://lore.kernel.org/linux-raid/e2f96772-bfbc-f43b-6da1-f520e5164536@online.de/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230512015610.821290-2-yukuai1@huaweicloud.com
2023-06-14 | md/raid10: check slab-out-of-bounds in md_bitmap_get_counter | Li Nan | 1 | -8/+9
If we write a large number to md/bitmap_set_bits, md_bitmap_checkpage() will return -EINVAL because 'page >= bitmap->pages'. However, md_bitmap_get_counter() does not check the return value immediately, because it first sets the *blocks value, and a slab-out-of-bounds access occurs. Move the check of 'page >= bitmap->pages' to md_bitmap_get_counter() and return directly if it is true. Fixes: ef4256733506 ("md/bitmap: optimise scanning of empty bitmaps.") Signed-off-by: Li Nan <linan122@huawei.com> Reviewed-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230515134808.3936750-2-linan666@huaweicloud.com
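The general shape of the fix, validating the page index before anything is read through it, can be sketched as follows. `struct counter_table`, `COUNTERS_PER_PAGE` and `get_counter()` are hypothetical and much simpler than the real bitmap structures; they only illustrate the ordering of the check.

```c
#include <assert.h>
#include <stddef.h>

#define COUNTERS_PER_PAGE 8 /* assumed page size for this sketch */

/* Hypothetical, simplified stand-in for a paged counter table. */
struct counter_table {
    unsigned short (*pages)[COUNTERS_PER_PAGE];
    size_t nr_pages;        /* number of valid entries in 'pages' */
};

/*
 * Look up the counter for 'index'. The page number is range-checked
 * first and the function returns immediately when it is out of bounds,
 * so no page is touched before the check has passed.
 */
static unsigned short *get_counter(struct counter_table *t, size_t index)
{
    size_t page = index / COUNTERS_PER_PAGE;

    assert(t != NULL);
    if (page >= t->nr_pages)
        return NULL;
    return &t->pages[page][index % COUNTERS_PER_PAGE];
}
```

Callers treat a NULL return as "no such counter", which mirrors returning early on the failed bounds check instead of continuing with an invalid page.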
2023-06-14 | raid6: neon: add missing prototypes | Arnd Bergmann | 4 | -7/+25
The raid6 syndrome functions are generated for different sizes and have no generic prototype, while the inner functions have a prototype in a header that cannot be included from the correct file. In both cases, the compiler warns about missing prototypes:
  lib/raid6/recov_neon_inner.c:27:6: warning: no previous prototype for '__raid6_2data_recov_neon' [-Wmissing-prototypes]
  lib/raid6/recov_neon_inner.c:77:6: warning: no previous prototype for '__raid6_datap_recov_neon' [-Wmissing-prototypes]
  lib/raid6/neon1.c:56:6: warning: no previous prototype for 'raid6_neon1_gen_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon1.c:86:6: warning: no previous prototype for 'raid6_neon1_xor_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon2.c:56:6: warning: no previous prototype for 'raid6_neon2_gen_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon2.c:97:6: warning: no previous prototype for 'raid6_neon2_xor_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon4.c:56:6: warning: no previous prototype for 'raid6_neon4_gen_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon4.c:119:6: warning: no previous prototype for 'raid6_neon4_xor_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon8.c:56:6: warning: no previous prototype for 'raid6_neon8_gen_syndrome_real' [-Wmissing-prototypes]
  lib/raid6/neon8.c:163:6: warning: no previous prototype for 'raid6_neon8_xor_syndrome_real' [-Wmissing-prototypes]
Add a new header file that contains the prototypes for both to avoid the warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20230517132220.937200-1-arnd@kernel.org
2023-06-13 | swim3: fix the floppy_locked_ioctl prototype | Christoph Hellwig | 1 | -1/+1
Add back the accidentally dropped mode parameter. Fixes: b60f7635788a ("swim3: fix the floppy_locked_ioctl prototype") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20230613154309.327557-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | blk-mq: fix potential io hang by wrong 'wake_batch' | Yu Kuai | 4 | -10/+13
In __blk_mq_tag_busy/idle(), updating 'active_queues' and calculating 'wake_batch' is not atomic:

  t1:                                  t2:
  _blk_mq_tag_busy                     blk_mq_tag_busy
  inc active_queues
  // assume 1->2
                                       inc active_queues
                                       // 2 -> 3
                                       blk_mq_update_wake_batch
                                       // calculate based on 3
  blk_mq_update_wake_batch
  /* calculate based on 2, while active_queues is actually 3. */

Fix this problem by protecting them with 'tags->lock'. This is not a hot path, so performance should not be a concern. And now that all writers are inside the lock, switch 'active_queues' from atomic to unsigned int.
Fixes: 180dccb0dba4 ("blk-mq: fix tag_get wait task can't be awakened") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230610023043.2559121-1-yukuai1@huaweicloud.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | fs: remove the now unused FMODE_* flags | Christoph Hellwig | 1 | -7/+0
FMODE_NDELAY, FMODE_EXCL and FMODE_WRITE_IOCTL were only used for block-internal purposes and are now entirely unused, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-31-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: store the holder in file->private_data | Christoph Hellwig | 1 | -6/+8
Store the file struct used as the holder in file->private_data as an indicator that this file descriptor was opened exclusively to remove the last use of FMODE_EXCL. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20230608110258.189493-30-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: always use I_BDEV on file->f_mapping->host to find the bdev | Christoph Hellwig | 1 | -10/+8
Always use I_BDEV(file->f_mapping->host) to find the bdev for a file to free up file->private_data for other uses. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-29-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: replace fmode_t with a block-specific type for block open flags | Christoph Hellwig | 82 | -315/+334
The only overlap between the block open flags mapped into the fmode_t and other uses of fmode_t are FMODE_READ and FMODE_WRITE. Define a new blk_mode_t instead for use in blkdev_get_by_{dev,path}, ->open and ->ioctl and stop abusing fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-28-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: remove unused fmode_t arguments from ioctl handlers | Christoph Hellwig | 3 | -12/+12
A few ioctl handlers have fmode_t arguments that are entirely unused, remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20230608110258.189493-27-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: move a few internal definitions out of blkdev.h | Christoph Hellwig | 2 | -29/+21
All these helpers are only used in core block code, so move them out of the public header. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-26-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | ubd: remove commented out code in ubd_open | Christoph Hellwig | 1 | -7/+0
This code has been dead forever, make sure it doesn't show up in code searches. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230608110258.189493-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | rnbd-srv: replace sess->open_flags with a "bool readonly" | Christoph Hellwig | 3 | -11/+9
Stop passing the fmode_t around and just use a simple bool to track if an export is read-only. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> Link: https://lore.kernel.org/r/20230608110258.189493-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | mtd: block: use a simple bool to track open for write | Christoph Hellwig | 3 | -3/+3
Instead of propagating the fmode_t, just use a bool to track if a mtd block device was opened for writing. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230608110258.189493-23-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | nvme: replace the fmode_t argument to the nvme ioctl handlers with a simple bool | Christoph Hellwig | 1 | -28/+34
Instead of passing a fmode_t and only checking it for FMODE_WRITE, pass a bool open_for_write to prepare for callers that won't have the fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-22-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | scsi: replace the fmode_t argument to ->sg_io_fn with a simple bool | Christoph Hellwig | 4 | -7/+9
Instead of passing a fmode_t and only checking it for FMODE_WRITE, pass a bool open_for_write to prepare for callers that won't have the fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-21-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | scsi: replace the fmode_t argument to scsi_ioctl with a simple bool | Christoph Hellwig | 7 | -24/+26
Instead of passing a fmode_t and only checking it for FMODE_WRITE, pass a bool open_for_write to prepare for callers that won't have the fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-20-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | scsi: replace the fmode_t argument to scsi_cmd_allowed with a simple bool | Christoph Hellwig | 4 | -7/+7
Instead of passing a fmode_t and only checking it for FMODE_WRITE, pass a bool open_for_write to prepare for callers that won't have the fmode_t. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-19-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
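The fmode_t-to-bool conversions in the nvme and scsi entries above all follow one pattern: the handler needs a single fact (was the device opened for writing?), so it takes a bool, and a caller that still has a mode word reduces it at the boundary. A minimal sketch of that pattern, with hypothetical mode bits, command codes and function names that are not the kernel's:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical mode bits and command codes, for illustration only. */
#define SKETCH_MODE_READ  0x1u
#define SKETCH_MODE_WRITE 0x2u

enum sketch_cmd { SKETCH_CMD_QUERY, SKETCH_CMD_FORMAT };

/* The permission check depends on one fact, so it takes just a bool. */
static int sketch_cmd_allowed(enum sketch_cmd cmd, bool open_for_write)
{
    if (cmd == SKETCH_CMD_FORMAT && !open_for_write)
        return -EPERM;
    return 0;
}

/* A caller that still has a mode word reduces it to the bool here. */
static int sketch_ioctl(unsigned int mode, enum sketch_cmd cmd)
{
    /* Unknown mode bits would be a caller bug in this sketch. */
    assert((mode & ~(SKETCH_MODE_READ | SKETCH_MODE_WRITE)) == 0u);
    return sketch_cmd_allowed(cmd, (mode & SKETCH_MODE_WRITE) != 0);
}
```

A caller with no mode word at all can call `sketch_cmd_allowed()` directly with whatever it knows about write access, which is the situation the commit messages say they are preparing for.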
2023-06-12 | fs: remove sb->s_mode | Christoph Hellwig | 4 | -8/+6
There is no real need to store the open mode in the super_block now. It is only used by f2fs, which can easily recalculate it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-18-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: add a sb_open_mode helper | Christoph Hellwig | 4 | -20/+14
Add a helper to return the open flags for blkdev_get_by* for passed in super block flags instead of open coding the logic in many places. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-17-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: use the holder as indication for exclusive opens | Christoph Hellwig | 37 | -192/+183
The current interface for exclusive opens is rather confusing as it requires both the FMODE_EXCL flag and a holder. Remove the need to pass FMODE_EXCL and just key the exclusive open off a non-NULL holder. For blkdev_put this requires adding the holder argument, which provides better debug checking that only the holder actually releases the hold, but at the same time allows removing the now superfluous mode argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-16-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
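The holder-keyed interface can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the block layer's implementation: `struct sketch_bdev`, `sketch_open()` and `sketch_put()` are hypothetical, and the rule that non-exclusive opens are always admitted is a simplification chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified device state. */
struct sketch_bdev {
    void *holder;   /* NULL when nobody holds the device exclusively */
    int holders;    /* number of exclusive opens by 'holder' */
    int openers;    /* all opens, exclusive or not */
};

/* A non-NULL holder requests an exclusive open; NULL is a plain open. */
static int sketch_open(struct sketch_bdev *bdev, void *holder)
{
    assert(bdev != NULL);
    if (holder != NULL) {
        if (bdev->holder != NULL && bdev->holder != holder)
            return -EBUSY;      /* claimed by someone else */
        bdev->holder = holder;
        bdev->holders++;
    }
    bdev->openers++;
    return 0;
}

/* Releasing takes the holder too, so only the holder can drop the hold. */
static int sketch_put(struct sketch_bdev *bdev, void *holder)
{
    assert(bdev != NULL);
    if (bdev->openers == 0)
        return -EINVAL;
    if (holder != NULL) {
        if (bdev->holder != holder)
            return -EINVAL;     /* not the current holder */
        if (--bdev->holders == 0)
            bdev->holder = NULL;
    }
    bdev->openers--;
    return 0;
}
```

Because the holder is compared by address, it should be a pointer that stays valid and unique for as long as the device is held, such as a static object. That is the same concern behind the bcache and swsusp entries in this log, which stop passing stack addresses as holders.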
2023-06-12 | btrfs: don't pass a holder for non-exclusive blkdev_get_by_path | Christoph Hellwig | 3 | -21/+15
Passing a holder to blkdev_get_by_path when FMODE_EXCL isn't set doesn't make sense, so pass NULL instead and remove the holder argument from the call chains that only end up in non-FMODE_EXCL blkdev_get_by_path calls. Exclusive mode for device scanning is not used since commit 50d281fc434c ("btrfs: scan device in non-exclusive mode"). Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: David Sterba <dsterba@suse.com> Link: https://lore.kernel.org/r/20230608110258.189493-15-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | rnbd-srv: don't pass a holder for non-exclusive blkdev_get_by_path | Christoph Hellwig | 1 | -1/+1
Passing a holder to blkdev_get_by_path when FMODE_EXCL isn't set doesn't make sense, so pass NULL instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> Link: https://lore.kernel.org/r/20230608110258.189493-14-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | bcache: don't pass a stack address to blkdev_get_by_path | Christoph Hellwig | 1 | -1/+1
sb is just an on-stack pointer that can easily be reused by other calls. Switch to use the bcache-wide bcache_kobj instead as there is no need to claim per-bcache device anyway. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20230608110258.189493-13-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | swsusp: don't pass a stack address to blkdev_get_by_path | Christoph Hellwig | 1 | -2/+3
holder is just an on-stack pointer that can easily be reused by other calls, replace it with a static variable that doesn't change. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: rename blkdev_close to blkdev_release | Christoph Hellwig | 1 | -2/+2
Make the function name match the method name. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: remove the unused mode argument to ->release | Christoph Hellwig | 30 | -43/+41
The mode argument to the ->release block_device_operation is never used, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-10-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: pass a gendisk to ->open | Christoph Hellwig | 31 | -107/+102
->open is only called on the whole device. Make that explicit by passing a gendisk instead of the block_device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Acked-by: Jack Wang <jinpu.wang@ionos.com> [rnbd] Link: https://lore.kernel.org/r/20230608110258.189493-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: pass a gendisk on bdev_check_media_change | Christoph Hellwig | 11 | -32/+31
bdev_check_media_change should only ever be called for the whole device. Pass a gendisk to make that explicit and rename the function to disk_check_media_change. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-8-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | cdrom: remove the unused mode argument to cdrom_release | Christoph Hellwig | 4 | -4/+4
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | cdrom: track if a cdrom_device_info was opened for data | Christoph Hellwig | 2 | -7/+6
Set a flag when a cdrom_device_info is opened for writing, instead of trying to figure this out at release time. This will eventually allow removing the mode argument to the ->release block_device_operation, as nothing but the CDROM drivers use that argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Acked-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20230608110258.189493-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | cdrom: remove the unused cdrom_close_write release code | Christoph Hellwig | 2 | -16/+0
cdrom_close_write is empty, and the for_data flag it is keyed off is never set. Remove all this clutter. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | cdrom: remove the unused mode argument to cdrom_ioctl | Christoph Hellwig | 4 | -5/+5
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-4-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | cdrom: remove the unused bdev argument to cdrom_open | Christoph Hellwig | 4 | -6/+4
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2023-06-12 | block: also call ->open for incremental partition opens | Christoph Hellwig | 1 | -10/+8
For whole devices ->open is called for each open, but for partitions it is only called on the first open of a partition, e.g.:

  open("/dev/vdb", ...)
  open("/dev/vdb", ...)
    - 2 calls to ->open

  open("/dev/vdb1", ...)
  open("/dev/vdb", ...)
    - 2 calls to ->open

  open("/dev/vdb", ...)
  open("/dev/vdb", ...)
    - just one call to ->open

This is problematic as various block drivers look at open flags and might not do all the required setup if the earlier open was with an odd flag like O_NDELAY or the magic 3 ioctl-only open mode.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Potter <phil@philpotter.co.uk> Reviewed-by: Hannes Reinecke <hare@suse.de> Acked-by: Christian Brauner <brauner@kernel.org> Link: https://lore.kernel.org/r/20230608110258.189493-2-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>