kernel/linux.git/drivers/nvme/host, branch v6.1.87

drivers/nvme: Add quirks for device 126f:2262

2024-04-13T11:05:20+00:00

[ Upstream commit e89086c43f0500bc7c4ce225495b73b8ce234c1f ] This commit adds NVME_QUIRK_NO_DEEPEST_PS and NVME_QUIRK_BOGUS_NID for device [126f:2262], which appears to be a generic VID:PID pair used for many SSDs based on the Silicon Motion SM2262/SM2262EN controller. Two of my SSDs with this VID:PID pair exhibit the same behavior: * They frequently have trouble exiting the deepest power state (5), resulting in the entire disk unresponsive. Verified by setting nvme_core.default_ps_max_latency_us=10000 and observing them behaving normally. * They produce all-zero nguid and eui64 with `nvme id-ns` command. The offending products are: * HP SSD EX950 1TB * HIKVISION C2000Pro 2TB Signed-off-by: Jiawei Fu Reviewed-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme: fix miss command type check

2024-04-10T14:28:35+00:00

commit 31a5978243d24d77be4bacca56c78a0fbc43b00d upstream. In the function nvme_passthru_end(), only the value of the command opcode is checked, without checking the command type (IO command or Admin command). When we send a Dataset Management command (The opcode of the Dataset Management command is the same as the Set Feature command), kernel thinks it is a set feature command, then sets the controller's keep alive interval, and calls nvme_keep_alive_work(). Signed-off-by: min15.li Reviewed-by: Kanchan Joshi Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch Fixes: b58da2d270db ("nvme: update keep alive interval when kato is modified") Signed-off-by: Tokunori Ikegami Signed-off-by: Greg Kroah-Hartman

nvme: fix reconnection fail due to reserved tag allocation

2024-03-26T22:20:59+00:00

[ Upstream commit de105068fead55ed5c07ade75e9c8e7f86a00d1d ] We found a issue on production environment while using NVMe over RDMA, admin_q reconnect failed forever while remote target and network is ok. After dig into it, we found it may caused by a ABBA deadlock due to tag allocation. In my case, the tag was hold by a keep alive request waiting inside admin_q, as we quiesced admin_q while reset ctrl, so the request maked as idle and will not process before reset success. As fabric_q shares tagset with admin_q, while reconnect remote target, we need a tag for connect command, but the only one reserved tag was held by keep alive command which waiting inside admin_q. As a result, we failed to reconnect admin_q forever. In order to fix this issue, I think we should keep two reserved tags for admin queue. Fixes: ed01fee283a0 ("nvme-fabrics: only reserve a single tag") Signed-off-by: Chunguang Xu Reviewed-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set

2024-03-26T22:20:59+00:00

[ Upstream commit 93b24f579c392bac2e491fee79ad5ce5a131992e ] Add the apple shared tag workaround to nvme_alloc_io_tag_set to prepare for using that helper in the PCIe driver. Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Stable-dep-of: de105068fead ("nvme: fix reconnection fail due to reserved tag allocation") Signed-off-by: Sasha Levin

nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers

2024-03-26T22:20:59+00:00

[ Upstream commit b794d1c2ad6d7921f2867ce393815ad31b5b5a83 ] The reserved_tags are only needed for fabrics controllers. Right now only fabrics drivers call this helper, so this is harmless, but we'll use it in the PCIe driver soon. Signed-off-by: Christoph Hellwig Reviewed-by: Sagi Grimberg Reviewed-by: Chaitanya Kulkarni Stable-dep-of: de105068fead ("nvme: fix reconnection fail due to reserved tag allocation") Signed-off-by: Sasha Levin

nvme-fc: do not wait in vain when unloading module

2024-03-01T12:26:27+00:00

[ Upstream commit 70fbfc47a392b98e5f8dba70c6efc6839205c982 ] The module exit path has race between deleting all controllers and freeing 'left over IDs'. To prevent double free a synchronization between nvme_delete_ctrl and ida_destroy has been added by the initial commit. There is some logic around trying to prevent from hanging forever in wait_for_completion, though it does not handling all cases. E.g. blktests is able to reproduce the situation where the module unload hangs forever. If we completely rely on the cleanup code executed from the nvme_delete_ctrl path, all IDs will be freed eventually. This makes calling ida_destroy unnecessary. We only have to ensure that all nvme_delete_ctrl code has been executed before we leave nvme_fc_exit_module. This is done by flushing the nvme_delete_wq workqueue. While at it, remove the unused nvme_fc_wq workqueue too. Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke Signed-off-by: Daniel Wagner Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme: introduce helper function to get ctrl state

2024-01-20T10:50:06+00:00

[ Upstream commit 5c687c287c46fadb14644091823298875a5216aa ] The controller state is typically written by another CPU, so reading it should ensure no optimizations are taken. This is a repeated pattern in the driver, so start with adding a convenience function that returns the controller state with READ_ONCE(). Reviewed-by: Sagi Grimberg Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme-core: check for too small lba shift

2024-01-20T10:50:04+00:00

[ Upstream commit 74fbc88e161424b3b96a22b23a8e3e1edab9d05c ] The block layer doesn't support logical block sizes smaller than 512 bytes. The nvme spec doesn't support that small either, but the driver isn't checking to make sure the device responded with usable data. Failing to catch this will result in a kernel bug, either from a division by zero when stacking, or a zero length bio. Reviewed-by: Jens Axboe Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme-core: fix a memory leak in nvme_ns_info_from_identify()

2024-01-20T10:50:04+00:00

[ Upstream commit e3139cef8257fcab1725441e2fd5fd0ccb5481b1 ] In case of error, free the nvme_id_ns structure that was allocated by nvme_identify_ns(). Signed-off-by: Maurizio Lombardi Reviewed-by: Sagi Grimberg Reviewed-by: Kanchan Joshi Signed-off-by: Keith Busch Signed-off-by: Sasha Levin

nvme-pci: fix sleeping function called from interrupt context

2024-01-01T12:38:59+00:00

[ Upstream commit f6fe0b2d35457c10ec37acc209d19726bdc16dbd ] the nvme_handle_cqe() interrupt handler calls nvme_complete_async_event() but the latter may call nvme_auth_stop() which is a blocking function. Sleeping functions can't be called in interrupt context BUG: sleeping function called from invalid context in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/15 Call Trace: __cancel_work_timer+0x31e/0x460 ? nvme_change_ctrl_state+0xcf/0x3c0 [nvme_core] ? nvme_change_ctrl_state+0xcf/0x3c0 [nvme_core] nvme_complete_async_event+0x365/0x480 [nvme_core] nvme_poll_cq+0x262/0xe50 [nvme] Fix the bug by moving nvme_auth_stop() to fw_act_work (executed by the nvme_wq workqueue) Fixes: f50fff73d620 ("nvme: implement In-Band authentication") Signed-off-by: Maurizio Lombardi Reviewed-by: Jens Axboe Reviewed-by: Sagi Grimberg Signed-off-by: Keith Busch Signed-off-by: Sasha Levin