path: root/drivers/nvme/host/core.c
Age Commit message Author Files Lines
2021-05-04nvme: move the fabrics queue ready check routines to coreTao Chiu1-0/+60
queue_rq() in pci only checks if the dispatched queue (nvmeq) is ready, e.g. not being suspended. Since nvme_alloc_admin_tags() in reset flow restarts the admin queue, users are able to submit admin commands to a controller before reset_work() completes. Commands submitted under this condition may interfere with commands that perform identify, IO queue setup in reset_work(), and may result in a hang described in the following patch. As seen in the fabrics, user commands are prevented from being executed under improper controller states. We may reuse this logic to maintain a clear admin queue during reset_work(). Signed-off-by: Tao Chiu <taochiu@synology.com> Signed-off-by: Cody Wong <codywong@synology.com> Reviewed-by: Leon Chien <leonchien@synology.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
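The idea in the commit above, gating user commands on the controller state while letting the driver's own initialization commands through, can be sketched in plain C. This is a simplified userspace illustration, not the kernel routine: the state enum, the `internal` field, and the name `ctrl_ready_for` are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of controller states; the kernel has more. */
enum ctrl_state { CTRL_NEW, CTRL_LIVE, CTRL_RESETTING, CTRL_CONNECTING, CTRL_DELETING };

struct ctrl {
	enum ctrl_state state;
};

struct req {
	bool internal; /* assumed marker: true if issued by the driver's own reset path */
};

/* Return true if the request may be dispatched to the queue right now. */
bool ctrl_ready_for(const struct ctrl *c, const struct req *r, bool queue_live)
{
	/* A live controller accepts everything. */
	if (c->state == CTRL_LIVE)
		return true;

	/* User-submitted commands must wait until the controller is live. */
	if (!r->internal)
		return false;

	/* Driver-internal commands may pass while the controller is being
	 * (re)initialized, provided the queue itself is usable. */
	return queue_live &&
	       (c->state == CTRL_RESETTING || c->state == CTRL_CONNECTING);
}
```

A dispatch path would call such a check before queuing the command and fail or requeue the request when it returns false.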
2021-05-04nvme: avoid memset for passthrough requestsKanchan Joshi1-4/+3
nvme_clear_nvme_request() clears the nvme_command, which is unnecessary for passthrough requests as nvme_command is overwritten immediately. Move the clearing part from this helper to the caller, so that a double memset for passthrough requests is avoided. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-05-04nvme: add nvme_get_ns helperKanchan Joshi1-2/+7
Add a helper to avoid opencoding ns->kref increment. Decrement is already done via nvme_put_ns helper. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
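A paired get/put helper of the kind the commit above describes might look like the following userspace sketch. It uses C11 atomics in place of the kernel's kref, and the names `ns_get` and `ns_put` are illustrative assumptions rather than the kernel's signatures.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for a reference-counted namespace object. */
struct ns {
	atomic_int refs;
	int nsid;
};

/* Take an additional reference; the caller must already hold one. */
void ns_get(struct ns *ns)
{
	atomic_fetch_add(&ns->refs, 1);
}

/* Drop a reference; returns true when the last reference went away
 * and the object may be freed by the caller. */
bool ns_put(struct ns *ns)
{
	return atomic_fetch_sub(&ns->refs, 1) == 1;
}
```

Wrapping the increment in a helper keeps every reference acquisition greppable and symmetrical with the existing put helper.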
2021-05-04nvme: fix controller ioctl through ns_headMinwoo Im1-22/+0
In multipath case, we should consider namespace attachment with controllers in a subsystem when we find out the live controller for the namespace. This patch manually reverted the commit 3557a4409701 ("nvme: don't bother to look up a namespace for controller ioctls") with few more updates to nvme_ns_head_chr_ioctl which has been newly updated. Fixes: 3557a4409701 ("nvme: don't bother to look up a namespace for controller ioctls") Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-29Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-blockLinus Torvalds1-657/+419
Pull block driver updates from Jens Axboe: - MD changes via Song: - raid5 POWER fix - raid1 failure fix - UAF fix for md cluster - mddev_find_or_alloc() clean up - Fix NULL pointer deref with external bitmap - Performance improvement for raid10 discard requests - Fix missing information of /proc/mdstat - rsxx const qualifier removal (Arnd) - Expose allocated brd pages (Calvin) - rnbd via Gioh Kim: - Change maintainer - Change domain address of maintainers' email - Add polling IO mode and document update - Fix memory leak and some bug detected by static code analysis tools - Code refactoring - Series of floppy cleanups/fixes (Denis) - s390 dasd fixes (Julian) - kerneldoc fixes (Lee) - null_blk double free (Lv) - null_blk virtual boundary addition (Max) - Remove xsysace driver (Michal) - umem driver removal (Davidlohr) - ataflop fixes (Dan) - Revalidate disk removal (Christoph) - Bounce buffer cleanups (Christoph) - Mark lightnvm as deprecated (Christoph) - mtip32xx init cleanups (Shixin) - Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang) * tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits) async_xor: increase src_offs when dropping destination page drivers/block/null_blk/main: Fix a double free in null_init. 
md/raid1: properly indicate failure when ending a failed write request md-cluster: fix use-after-free issue when removing rdev nvme: introduce generic per-namespace chardev nvme: cleanup nvme_configure_apst nvme: do not try to reconfigure APST when the controller is not live nvme: add 'kato' sysfs attribute nvme: sanitize KATO setting nvmet: avoid queuing keep-alive timer if it is disabled brd: expose number of allocated pages in debugfs ataflop: fix off by one in ataflop_probe() ataflop: potential out of bounds in do_format() drbd: Fix fall-through warnings for Clang block/rnbd: Use strscpy instead of strlcpy block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name block/rnbd-clt: Remove max_segment_size block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev Documentation/ABI/rnbd-clt: Add description for nr_poll_queues ...
2021-04-22nvme: introduce generic per-namespace chardevMinwoo Im1-0/+87
Userspace has not been allowed to do I/O to a device that has failed to initialize. This patch introduces a generic per-namespace character device to allow userspace to do I/O regardless of whether the block device is there or not. The chardev naming convention will be similar to the existing blkdev naming, using an ng prefix instead of nvme, i.e. /dev/ngXnY. It also supports multipath, which means it will not expose a chardev for the hidden namespace blkdevs (e.g., nvmeXcYnZ). If /dev/ngXnY is created for a ns_head, then I/O requests will be routed to a specific controller selected by the iopolicy of the subsystem. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> Signed-off-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Tested-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-21nvme: cleanup nvme_configure_apstChristoph Hellwig1-80/+69
Remove a level of indentation from the main code implementing the table search by using a goto for the APST not supported case. Also move the main comment above the function. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
2021-04-21nvme: do not try to reconfigure APST when the controller is not liveChristoph Hellwig1-1/+2
Do not call nvme_configure_apst when the controller is not live, given that nvme_configure_apst will fail due to the lack of an admin queue when the controller is being torn down and nvme_set_latency_tolerance is called from dev_pm_qos_hide_latency_tolerance. Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") Reported-by: Peng Liu <liupeng17@lenovo.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org>
2021-04-21nvme: add 'kato' sysfs attributeHannes Reinecke1-0/+2
Add a 'kato' controller sysfs attribute to display the current keep-alive timeout value (if any). This allows userspace to identify persistent discovery controllers, as these will have a non-zero KATO value. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-21nvme: sanitize KATO settingHannes Reinecke1-3/+14
According to the NVMe base spec the KATO commands should be sent at half of the KATO interval, to properly account for round-trip times. As we now will only ever send one KATO command per connection we can easily use the recommended values. This also fixes a potential issue where the request timeout for the KATO command does not match the value in the connect command, which might be causing spurious connection drops from the target. Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
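The timing rule in the commit above (send the keep-alive at half of the KATO interval) is simple arithmetic. A minimal sketch, assuming KATO is given in seconds and the result is wanted in milliseconds; the function name is hypothetical:

```c
/* Period between keep-alive commands: half of the keep-alive timeout.
 * kato_seconds is the configured KATO in seconds; the result is in
 * milliseconds so that odd KATO values do not lose the half second. */
unsigned long long keep_alive_period_ms(unsigned int kato_seconds)
{
	return (unsigned long long)kato_seconds * 1000ULL / 2ULL;
}
```

With a KATO of 5 seconds this gives a period of 2500 ms, which leaves the remaining half of the interval as slack for round-trip delays.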
2021-04-15nvme: fix NULL derefence in nvme_ctrl_fast_io_fail_tmo_show/storeGopal Tiwari1-0/+2
Adding entry for dev_attr_fast_io_fail_tmo to avoid the kernel crash while reading and writing the fast_io_fail_tmo. Fixes: 09fbed636382 (nvme: export fast_io_fail_tmo to sysfs) Signed-off-by: Gopal Tiwari <gtiwari@redhat.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-15nvme: let namespace probing continue for unsupported featuresChristoph Hellwig1-1/+10
Instead of failing to scan the namespace entirely when unsupported features are detected, just mark the gendisk hidden but allow other access like the upcoming per-namespace character device. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: factor out nvme_ns_open and nvme_ns_release helpersChristoph Hellwig1-4/+12
These will be reused for the per-namespace character devices. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2021-04-15nvme: move nvme_ns_head_ops to multipath.cChristoph Hellwig1-27/+4
Move the multipath block_device_operations to multipath.c, where they belong. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2021-04-15nvme: factor out a nvme_tryget_ns_head helperChristoph Hellwig1-4/+7
Add a helper to avoid opencoding ns_head->ref manipulations. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
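A "tryget" differs from a plain get in that it must fail once the count has already dropped to zero, because the object is then on its way to being freed. A userspace sketch of that pattern with C11 atomics follows; `ref_tryget` is a hypothetical name, not the kernel helper.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Increment *ref only if it is currently non-zero.
 * Returns true if a reference was taken. */
bool ref_tryget(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value,
		 * so the loop re-checks for zero before retrying. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}
```

Callers that look an object up in a shared list use this form so that they never resurrect an object whose last reference was just dropped.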
2021-04-15nvme: move the ioctl code to a separate fileChristoph Hellwig1-447/+3
Split out the ioctl code from core.c into a new file. Also update copyrights while we're at it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2021-04-15nvme: don't bother to look up a namespace for controller ioctlsChristoph Hellwig1-24/+42
Don't bother to look up a namespace just to drop it after retrieving the controller for the multipath case. Just look up a live controller for the subsystem directly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: simplify block device ioctl handling for the !multipath caseChristoph Hellwig1-36/+47
Only use the existing ioctl handler for the multipath case, and add a simpler one that reverts to the pre-multipath behavior for the non-shared use case. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: simplify the compat ioctl handlingChristoph Hellwig1-43/+26
Don't bother defining a separate compat_ioctl handler, and just handle the NVME_IOCTL_SUBMIT_IO32 case inline. Also only define it for those ABIs (currently just i386 vs x86_64) that are affected. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: factor out a nvme_ns_ioctl helperChristoph Hellwig1-21/+21
Factor out a helper for the namespace based ioctls. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2021-04-15nvme: pass a user pointer to nvme_nvm_ioctlChristoph Hellwig1-1/+1
Pass the proper user pointer instead of the not all that useful integer representation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: cleanup setting the disk nameChristoph Hellwig1-6/+11
Return false from nvme_set_disk_name and let the caller set the non-multipath name instead of duplicating the naming information in two places. Also remove the pointless local variables for the disk name and flags and the not needed ctrl argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com>
2021-04-15nvme: add a nvme_ns_head_multipath helperMinwoo Im1-6/+2
Move the multipath gendisk out of #ifdef CONFIG_NVME_MULTIPATH and add a new nvme_ns_head_multipath that uses it to check if a ns_head has a multipath device associated with it. Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com> [hch: added the IS_ENABLED, converted a few existing users] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Javier González <javier.gonz@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
2021-04-15nvme: remove single trailing whitespaceNiklas Cassel1-1/+1
There is a single trailing whitespace in core.c. Since this is just a single whitespace, the chances of this affecting backports to stable should be quite low, so let's just remove it. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-09treewide: Change list_sort to use const pointersSami Tolvanen1-1/+2
list_sort() internally casts the comparison function passed to it to a different type with constant struct list_head pointers, and uses this pointer to call the functions, which trips indirect call Control-Flow Integrity (CFI) checking. Instead of removing the consts, this change defines the list_cmp_func_t type and changes the comparison function types of all list_sort() callers to use const pointers, thus avoiding type mismatches. Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Nick Desaulniers <ndesaulniers@google.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
2021-04-06nvme: fix handling of large MDTS valuesBart Van Assche1-2/+4
Instead of triggering an integer overflow and undefined behavior if MDTS is large, set max_hw_sectors to UINT_MAX. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Keith Busch <kbusch@kernel.org> [hch: rebased to account for the new nvme_mps_to_sectors helper] Signed-off-by: Christoph Hellwig <hch@lst.de>
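The saturating conversion described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the kernel helper: the function name and parameters are hypothetical, `page_shift` is assumed to be at least 9, and `UINT32_MAX` stands in for the commit's `UINT_MAX`.

```c
#include <stdint.h>

/* Convert a limit expressed as 2^units memory pages into 512-byte
 * sectors. page_shift is log2 of the page size in bytes (assumed >= 9).
 * Saturates at UINT32_MAX instead of shifting out of range. */
uint32_t mps_to_sectors(uint32_t page_shift, uint32_t units)
{
	uint32_t shift = units + page_shift - 9; /* 9 == log2(512) */

	/* Shifting a 32-bit value by 32 or more is undefined in C. */
	if (shift >= 32)
		return UINT32_MAX;
	return UINT32_C(1) << shift;
}
```

For a 4 KiB page (`page_shift` of 12) and `units` of 5, the result is 256 sectors, i.e. 128 KiB; a very large `units` value yields the saturated maximum.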
2021-04-06nvme: implement non-mdts command limitsKeith Busch1-34/+72
Commands that access LBA contents without a data transfer between the host historically have not had a spec defined upper limit. The driver set the queue constraints for such commands to the max data transfer size just to be safe, but this artificial constraint frequently limits devices below their capabilities. The NVMe Workgroup ratified TP4040 defines how a controller may advertise their non-MDTS limits. Use these if provided and default to the current constraints if not. Since the Dataset Management command limits are defined in logical blocks, but without a namespace to tell us the logical block size, the code defaults to the safe 512b size. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-06nvme: disallow passthru cmd from targeting a nsid != nsid of the block devNiklas Cassel1-0/+12
When a passthru command targets a specific namespace, the ns parameter to nvme_user_cmd()/nvme_user_cmd64() is set. However, there is currently no validation that the nsid specified in the passthru command targets the namespace/nsid represented by the block device that the ioctl was performed on. Add a check that validates that the nsid in the passthru command matches that of the supplied namespace. Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Javier González <javier@javigon.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Kanchan Joshi <joshi.k@samsung.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
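The validation added by the commit above amounts to comparing the nsid in the user's command with the nsid of the namespace the ioctl was issued on. A minimal sketch, with a hypothetical `struct ns` and function name; a NULL namespace models a controller-level ioctl, where there is nothing to compare against:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the namespace behind a block device. */
struct ns {
	uint32_t nsid;
};

/* Returns 0 if the command may proceed, -EINVAL on an nsid mismatch. */
int check_passthru_nsid(const struct ns *ns, uint32_t cmd_nsid)
{
	if (ns != NULL && cmd_nsid != ns->nsid)
		return -EINVAL;
	return 0;
}
```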
2021-04-02nvme: export fast_io_fail_tmo to sysfsDaniel Wagner1-0/+31
Commit 8c4dfea97f15 ("nvme-fabrics: reject I/O to offline device") introduced fast_io_fail_tmo but didn't export the value to sysfs. The value can be set during the 'nvme connect'. Export the timeout value to user space via sysfs to allow runtime configuration. Cc: Victor Gladkov <Victor.Gladkov@kioxia.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Himanshu Madhani <himanshu.madhaani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: remove superfluous else in nvme_ctrl_loss_tmo_storeDaniel Wagner1-1/+1
If there is an error we will leave the function early. So there is no need for an else. Remove it. Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: use sysfs_emit instead of sprintfDaniel Wagner1-20/+20
sysfs_emit is the recommended API to use for formatting strings to be returned to user space. It is equivalent to scnprintf and aware of the PAGE_SIZE buffer size. Suggested-by: Chaitanya Kulkarni <Chaitanya.Kulkarni@wdc.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: warn of unhandled effects only onceKeith Busch1-3/+3
We don't need to repeatedly spam the kernel logs with the same warning about unhandled passthrough IO effects. Just one warning is sufficient to observe this condition occurs. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: use driver pdu command for passthroughKeith Busch1-13/+10
All nvme transport drivers preallocate an nvme command for each request. Assume to use that command for nvme_setup_cmd() instead of requiring drivers pass a pointer to it. All nvme drivers must initialize the generic nvme_request 'cmd' to point to the transport's preallocated nvme_command. The generic nvme_request cmd pointer had previously been used only as a temporary copy for passthrough commands. Since it now points to the command that gets dispatched, passthrough commands must directly set it up prior to executing the request. Signed-off-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: add new line after variable declatationChaitanya Kulkarni1-0/+3
Add a new line in functions nvme_pr_preempt(), nvme_pr_clear(), and nvme_pr_release() after variable declaration which follows the rest of the code in the nvme/host/core.c. No functional change(s) in this patch. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: don't check nvme_req flags for new reqChaitanya Kulkarni1-6/+5
nvme_clear_nvme_request() has a check for the flag RQF_DONTPREP and it is called from nvme_init_request() and nvme_setup_cmd(). The function nvme_init_request() is called from nvme_alloc_request() and nvme_alloc_request_qid(). Both of these callers allocate a new request every time, and RQF_DONTPREP is never set on a newly allocated request: after getting a tag, the block layer sets req->rq_flags to 0 and never sets RQF_DONTPREP when returning the request:

nvme_alloc_request()
    blk_mq_alloc_request()
        blk_mq_rq_ctx_init()
            rq->rq_flags = 0 <----

nvme_alloc_request_qid()
    blk_mq_alloc_request_hctx()
        blk_mq_rq_ctx_init()
            rq->rq_flags = 0 <----

The block layer does set req->rq_flags but RQF_DONTPREP is not one of them and that is set by the driver. That means we can unconditionally set the RQF_DONTPREP value to the rq->rq_flags when nvme_init_request()->nvme_clear_nvme_request() is called from above two callers. Move the check for RQF_DONTPREP from nvme_clear_nvme_request() into nvme_setup_cmd(). This is needed since nvme_alloc_request() now gets called from fast path when NVMeOF target is configured with passthru backend to avoid unnecessary checks in the fast path. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: mark nvme_setup_passsthru() inlineChaitanya Kulkarni1-1/+1
Since nvmet_setup_passthru() function falls in fast path when called from the NVMeOF passthru backend, make it inline. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: split init identify into helperChaitanya Kulkarni1-23/+32
The function nvme_init_ctrl_finish() (formerly nvme_init_identify()) has grown over time to about 200 lines, given the size of the nvme id ctrl data structure. Move the nvme_id_ctrl data structure related initialization into helper nvme_init_identify() and call it from nvme_init_ctrl_finish(). When we move the code into nvme_init_identify() change the local variable i from int to unsigned int and remove the duplicate kfree() after nvme_mpath_init() and jump to the label out_free if nvme_mpath_init() fails. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: rename nvme_init_identify()Chaitanya Kulkarni1-4/+4
This is a prep patch so that we can move the identify data structure related code initialization from nvme_init_identify() into a helper. Rename the function nvme_init_identify() to nvme_init_ctrl_finish(). Next patch will move the nvme_id_ctrl related initialization from newly renamed function nvme_init_ctrl_finish() into the nvme_init_identify() helper. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-04-02nvme: reduce checks for zero command effectsKanchan Joshi1-1/+2
For passthrough I/O commands, effects are usually zero. nvme_passthrough_end() does three checks in futility for this case. Bail out of function-call/checks. Signed-off-by: Kanchan Joshi <joshi.k@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-18nvme: fix Write Zeroes limitationsChristoph Hellwig1-24/+12
We voluntarily limit the Write Zeroes sizes to the MDTS value provided by the hardware, but currently get the units wrong, so fix that. Fixes: 6e02318eaea5 ("nvme: add support for the Write Zeroes command") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Tested-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
2021-03-18nvme: allocate the keep alive request using BLK_MQ_REQ_NOWAITChristoph Hellwig1-2/+2
To avoid an error recovery deadlock where the keep alive work is waiting for a request and thus can't be flushed to make progress for tearing down the controller. Also print the error code returned from blk_mq_alloc_request to help debugging any future issues in this code. Based on an earlier patch from Hannes Reinecke <hare@suse.de>. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de>
2021-03-18nvme: merge nvme_keep_alive into nvme_keep_alive_workChristoph Hellwig1-18/+8
Merge nvme_keep_alive into its only caller to prepare for additional changes to this code. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de>
2021-03-12nvme: fix the nsid value to print in nvme_validate_or_alloc_nsChristoph Hellwig1-1/+1
ns can be NULL at this point, and my move of the check from the original patch by Chaitanya broke this. Fixes: 0ec84df4953b ("nvme-core: check ctrl css before setting up zns") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-11nvme-core: check ctrl css before setting up znsChaitanya Kulkarni1-0/+6
Ensure multiple Command Sets are supported before starting to setup a ZNS namespace. Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> [hch: move the check around a bit] Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-11nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()Hannes Reinecke1-0/+1
NVME_REQ_CANCELLED is translated into -EINTR in nvme_submit_sync_cmd(), so we should be setting this flag during nvme_cancel_request() to ensure that the callers to nvme_submit_sync_cmd() will get the correct error code when the controller is reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Chao Leng <lengchao@huawei.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
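The interaction described in the commit above, where the cancel path sets a flag that the synchronous submit path later maps to -EINTR, can be illustrated with a small userspace sketch. The struct, the flag name `REQ_CANCELLED`, and the status constant are hypothetical stand-ins for the driver's own definitions.

```c
#include <errno.h>

enum { REQ_CANCELLED = 1 << 0 };   /* stand-in for the driver's flag */
enum { STATUS_ABORTED = 0x7 };     /* arbitrary placeholder status value */

struct req {
	unsigned int flags;
	int status;
};

/* Cancel path: record an aborted status and mark the request as
 * cancelled so that waiters can tell the two cases apart. */
void cancel_request(struct req *r)
{
	r->status = STATUS_ABORTED;
	r->flags |= REQ_CANCELLED;
}

/* Synchronous submit path: a cancelled request reports -EINTR,
 * otherwise the device status is passed through. */
int sync_result(const struct req *r)
{
	if (r->flags & REQ_CANCELLED)
		return -EINTR;
	return r->status;
}
```

Without the flag being set in the cancel path, a caller would see only the raw status and could not distinguish a reset-induced cancellation from a device error.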
2021-03-11nvme: simplify error logic in nvme_validate_ns()Hannes Reinecke1-4/+4
We should only remove namespaces when we get a fatal error back from the device or when the namespace IDs have changed. So instead of painfully masking out error numbers which might indicate that the error should be ignored we could use an NVME status code to indicate when the namespace should be removed. That simplifies the final logic and makes it less error-prone. Signed-off-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Keith Busch <kbusch@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-21Merge tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-blockLinus Torvalds1-14/+49
Pull block driver updates from Jens Axboe: - Remove the skd driver. It's been EOL for a long time (Damien) - NVMe pull requests - fix multipath handling of ->queue_rq errors (Chao Leng) - nvmet cleanups (Chaitanya Kulkarni) - add a quirk for buggy Amazon controller (Filippo Sironi) - avoid devm allocations in nvme-hwmon that don't interact well with fabrics (Hannes Reinecke) - sysfs cleanups (Jiapeng Chong) - fix nr_zones for multipath (Keith Busch) - nvme-tcp crash fix for no-data commands (Sagi Grimberg) - nvmet-tcp fixes (Sagi Grimberg) - add a missing __rcu annotation (Christoph) - failed reconnect fixes (Chao Leng) - various tracing improvements (Michal Krakowiak, Johannes Thumshirn) - switch the nvmet-fc assoc_list to use RCU protection (Leonid Ravich) - resync the status codes with the latest spec (Max Gurtovoy) - minor nvme-tcp improvements (Sagi Grimberg) - various cleanups (Rikard Falkeborn, Minwoo Im, Chaitanya Kulkarni, Israel Rukshin) - Floppy O_NDELAY fix (Denis) - MD pull request - raid5 chunk_sectors fix (Guoqing) - Use lore links (Kees) - Use DEFINE_SHOW_ATTRIBUTE for nbd (Liao) - loop lock scaling (Pavel) - mtip32xx PCI fixes (Bjorn) - bcache fixes (Kai, Dongdong) - Misc fixes (Tian, Yang, Guoqing, Joe, Andy) * tag 'for-5.12/drivers-2021-02-17' of git://git.kernel.dk/linux-block: (64 commits) lightnvm: pblk: Replace guid_copy() with export_guid()/import_guid() lightnvm: fix unnecessary NULL check warnings nvme-tcp: fix crash triggered with a dataless request submission block: Replace lkml.org links with lore nbd: Convert to DEFINE_SHOW_ATTRIBUTE nvme: add 48-bit DMA address quirk for Amazon NVMe controllers nvme-hwmon: rework to avoid devm allocation nvmet: remove else at the end of the function nvmet: add nvmet_req_subsys() helper nvmet: use min of device_path and disk len nvmet: use invalid cmd opcode helper nvmet: use invalid cmd opcode helper nvmet: add helper to report invalid opcode nvmet: remove extra variable in id-ns handler nvmet: make 
nvmet_find_namespace() req based nvmet: return uniform error for invalid ns nvmet: set status to 0 in case for invalid nsid nvmet-fc: add a missing __rcu annotation to nvmet_fc_tgt_assoc.queues nvme-multipath: set nr_zones for zoned namespaces nvmet-tcp: fix potential race of tcp socket closing accept_work ...
2021-02-21Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-blockLinus Torvalds1-15/+16
Pull core block updates from Jens Axboe: "Another nice round of removing more code than what is added, mostly due to Christoph's relentless pursuit of tech debt removal/cleanups. This pull request contains: - Two series of BFQ improvements (Paolo, Jan, Jia) - Block iov_iter improvements (Pavel) - bsg error path fix (Pan) - blk-mq scheduler improvements (Jan) - -EBUSY discard fix (Jan) - bvec allocation improvements (Ming, Christoph) - bio allocation and init improvements (Christoph) - Store bdev pointer in bio instead of gendisk + partno (Christoph) - Block trace point cleanups (Christoph) - hard read-only vs read-only split (Christoph) - Block based swap cleanups (Christoph) - Zoned write granularity support (Damien) - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)" * tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits) mm: simplify swapdev_block sd_zbc: clear zone resources for non-zoned case block: introduce blk_queue_clear_zone_settings() zonefs: use zone write granularity as block size block: introduce zone_write_granularity limit block: use blk_queue_set_zoned in add_partition() nullb: use blk_queue_set_zoned() to setup zoned devices nvme: cleanup zone information initialization block: document zone_append_max_bytes attribute block: use bi_max_vecs to find the bvec pool md/raid10: remove dead code in reshape_request block: mark the bio as cloned in bio_iov_bvec_set block: set BIO_NO_PAGE_REF in bio_iov_bvec_set block: remove a layer of indentation in bio_iov_iter_get_pages block: turn the nr_iovecs argument to bio_alloc* into an unsigned short block: remove the 1 and 4 vec bvec_slabs entries block: streamline bvec_alloc block: factor out a bvec_alloc_gfp helper block: move struct biovec_slab to bio.c block: reuse BIO_INLINE_VECS for integrity bvecs ...
2021-02-10nvme-hwmon: rework to avoid devm allocationHannes Reinecke1-0/+1
The original design to use device-managed resource allocation doesn't really work as the NVMe controller has a vastly different lifetime than the hwmon sysfs attributes, causing warning about duplicate sysfs entries upon reconnection. This patch reworks the hwmon allocation to avoid device-managed resource allocation, and uses the NVMe controller as parent for the sysfs attributes. Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Hannes Reinecke <hare@suse.de> Tested-by: Enzo Matsumiya <ematsumiya@suse.de> Tested-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-10nvme: introduce a nvme_host_path_error helperChao Leng1-0/+15
When using nvme native multipathing, if a path related error occurs during ->queue_rq, the request needs to be completed with NVME_SC_HOST_PATH_ERROR so that the request can be failed over. Introduce a helper to complete the command from ->queue_rq in a way that invokes nvme_complete_rq. Signed-off-by: Chao Leng <lengchao@huawei.com> [hch: renamed, added a return value to clean up the callers a bit] Signed-off-by: Christoph Hellwig <hch@lst.de>