kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-01-29	scsi: iscsi: Avoid potential deadlock in iscsi_if_rx func	Bo Wu	1	-0/+7
	commit bba340c79bfe3644829db5c852fdfa9e33837d6d upstream. In iscsi_if_rx func, after receiving one request through iscsi_if_recv_msg func, iscsi_if_send_reply will be called to try to reply to the request in a do-while loop. If the iscsi_if_send_reply function keeps returning -EAGAIN, a deadlock will occur. For example, a client only send msg without calling recvmsg func, then it will result in the watchdog soft lockup. The details are given as follows: sock_fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ISCSI); retval = bind(sock_fd, (struct sock addr*) & src_addr, sizeof(src_addr); while (1) { state_msg = sendmsg(sock_fd, &msg, 0); //Note: recvmsg(sock_fd, &msg, 0) is not processed here. } close(sock_fd); watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [netlink_test:253305] Sample time: 4000897528 ns(HZ: 250) Sample stat: curr: user: 675503481560, nice: 321724050, sys: 448689506750, idle: 4654054240530, iowait: 40885550700, irq: 14161174020, softirq: 8104324140, st: 0 deta: user: 0, nice: 0, sys: 3998210100, idle: 0, iowait: 0, irq: 1547170, softirq: 242870, st: 0 Sample softirq: TIMER: 992 SCHED: 8 Sample irqstat: irq 2: delta 1003, curr: 3103802, arch_timer CPU: 7 PID: 253305 Comm: netlink_test Kdump: loaded Tainted: G OE Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 40400005 (nZcv daif +PAN -UAO) pc : __alloc_skb+0x104/0x1b0 lr : __alloc_skb+0x9c/0x1b0 sp : ffff000033603a30 x29: ffff000033603a30 x28: 00000000000002dd x27: ffff800b34ced810 x26: ffff800ba7569f00 x25: 00000000ffffffff x24: 0000000000000000 x23: ffff800f7c43f600 x22: 0000000000480020 x21: ffff0000091d9000 x20: ffff800b34eff200 x19: ffff800ba7569f00 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0001000101000100 x13: 0000000101010000 x12: 0101000001010100 x11: 0001010101010001 x10: 00000000000002dd x9 : ffff000033603d58 x8 : ffff800b34eff400 x7 : ffff800ba7569200 x6 : ffff800b34eff400 x5 : 0000000000000000 x4 : 00000000ffffffff x3 : 0000000000000000 x2 : 0000000000000001 x1 : ffff800b34eff2c0 x0 : 0000000000000300 Call trace: __alloc_skb+0x104/0x1b0 iscsi_if_rx+0x144/0x12bc [scsi_transport_iscsi] netlink_unicast+0x1e0/0x258 netlink_sendmsg+0x310/0x378 sock_sendmsg+0x4c/0x70 sock_write_iter+0x90/0xf0 __vfs_write+0x11c/0x190 vfs_write+0xac/0x1c0 ksys_write+0x6c/0xd8 __arm64_sys_write+0x24/0x30 el0_svc_common+0x78/0x130 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E3D4D2@dggeml505-mbx.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-29	sd: Fix REQ_OP_ZONE_REPORT completion handling	Masato Suzuki	1	-2/+6
	ZBC/ZAC report zones command may return less bytes than requested if the number of matching zones for the report request is small. However, unlike read or write commands, the remainder of incomplete report zones commands cannot be automatically requested by the block layer: the start sector of the next report cannot be known, and the report reply may not be 512B aligned for SAS drives (a report zone reply size is always a multiple of 64B). The regular request completion code executing bio_advance() and restart of the command remainder part currently causes invalid zone descriptor data to be reported to the caller if the report zone size is smaller than 512B (a case that can happen easily for a report of the last zones of a SAS drive for example). Since blkdev_report_zones() handles report zone command processing in a loop until completion (no more zones are being reported), we can safely avoid that the block layer performs an incorrect bio_advance() call and restart of the remainder of incomplete report zone BIOs. To do so, always indicate a full completion of REQ_OP_ZONE_REPORT by setting good_bytes to the request buffer size and by setting the command resid to 0. This does not affect the post processing of the report zone reply done by sd_zbc_complete() since the reply header indicates the number of zones reported. Fixes: 89d947561077 ("sd: Implement support for ZBC devices") Cc: <stable@vger.kernel.org> # 4.19 Cc: <stable@vger.kernel.org> # 4.14 Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-27	scsi: fnic: fix msix interrupt allocation	Govindarajulu Varadarajan	1	-2/+2
	[ Upstream commit 3ec24fb4c035e9cbb2f02a48640a09aa913442a2 ] pci_alloc_irq_vectors() returns number of vectors allocated. Fix the check for error condition. Fixes: cca678dfbad49 ("scsi: fnic: switch to pci_alloc_irq_vectors") Link: https://lore.kernel.org/r/20190827211340.1095-1-gvaradar@cisco.com Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com> Acked-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: libfc: fix null pointer dereference on a null lport	Colin Ian King	1	-1/+1
	[ Upstream commit 41a6bf6529edd10a6def42e3b2c34a7474bcc2f5 ] Currently if lport is null then the null lport pointer is dereference when printing out debug via the FC_LPORT_DB macro. Fix this by using the more generic FC_LIBFC_DBG debug macro instead that does not use lport. Addresses-Coverity: ("Dereference after null check") Fixes: 7414705ea4ae ("libfc: Add runtime debugging with debug_logging module parameter") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory	Bart Van Assche	1	-6/+6
	[ Upstream commit a861b49273578e255426a499842cf7f465456351 ] The "(&ctio->u.status1.sense_data)[i]" where i >= 0 expressions in qlt_send_resp_ctio() are probably typos and should have been "(&ctio->u.status1.sense_data[4 * i])" instead. Instead of only fixing these typos, modify the code for storing sense data such that it becomes easy to read. This patch fixes a Coverity complaint about accessing an array outside its bounds. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Fixes: be25152c0d9e ("qla2xxx: Improve T10-DIF/PI handling in driver.") # v4.11. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd()	Bart Van Assche	1	-5/+2
	[ Upstream commit c04466c17142d5eb566984372b9a5003d1900fe3 ] The test "if (!cmd)" is not useful because it is guaranteed that cmd != NULL. Instead of testing the cmd pointer, rely on the tag to decide whether or not command allocation failed. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Fixes: 33e799775593 ("qla2xxx: Add support for QFull throttling and Term Exchange retry") # v3.18. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: qla2xxx: Fix a format specifier	Bart Van Assche	1	-1/+1
	[ Upstream commit 19ce192cd718e02f880197c0983404ca48236807 ] Since mcmd->sess->port_name is eight bytes long, use %8phC to format that port name instead of %phC. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <gmalavali@marvell.com> Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") # v4.11. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: qla2xxx: Unregister chrdev if module initialization fails	Bart Van Assche	1	-13/+21
	[ Upstream commit c794d24ec9eb6658909955772e70f34bef5b5b91 ] If module initialization fails after the character device has been registered, unregister the character device. Additionally, avoid duplicating error path code. Cc: Himanshu Madhani <hmadhani@marvell.com> Cc: Giridhar Malavali <giridhar.malavali@qlogic.com> Fixes: 6a03b4cd78f3 ("[SCSI] qla2xxx: Add char device to increase driver use count") # v2.6.35. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-27	scsi: megaraid_sas: reduce module load time	Steve Sistare	1	-2/+2
	[ Upstream commit 31b6a05f86e690e1818116fd23c3be915cc9d9ed ] megaraid_sas takes 1+ seconds to load while waiting for firmware: [2.822603] megaraid_sas 0000:03:00.0: Waiting for FW to come to ready state [3.871003] megaraid_sas 0000:03:00.0: FW now in Ready state This is due to the following loop in megasas_transition_to_ready(), which waits a minimum of 1 second, even though the FW becomes ready in tens of millisecs: /* * The cur_state should not last for more than max_wait secs */ for (i = 0; i < max_wait; i++) { ... msleep(1000); ... dev_info(&instance->pdev->dev, "FW now in Ready state\n"); This is a regression, caused by a change of the msleep granularity from 1 to 1000 due to concern about waiting too long on systems with coarse jiffies. To fix, increase iterations and use msleep(20), which results in: [2.670627] megaraid_sas 0000:03:00.0: Waiting for FW to come to ready state [2.739386] megaraid_sas 0000:03:00.0: FW now in Ready state Fixes: fb2f3e96d80f ("scsi: megaraid_sas: Fix msleep granularity") Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-23	scsi: core: scsi_trace: Use get_unaligned_be*()	Bart Van Assche	1	-80/+33
	commit b1335f5b0486f61fb66b123b40f8e7a98e49605d upstream. This patch fixes an unintended sign extension on left shifts. From Colin King: "Shifting a u8 left will cause the value to be promoted to an integer. If the top bit of the u8 is set then the following conversion to an u64 will sign extend the value causing the upper 32 bits to be set in the result." Fix this by using get_unaligned_be*() instead. Fixes: bf8162354233 ("[SCSI] add scsi trace core functions and put trace points") Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20191101211447.187151-1-bvanassche@acm.org Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan	Martin Wilck	1	-3/+3
	commit d341e9a8f2cffe4000c610225c629f62c7489c74 upstream. In qla2x00_find_all_fabric_devs(), fcport->flags & FCF_LOGIN_NEEDED is a necessary condition for logging into new rports, but not for dropping lost ones. Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery") Link: https://lore.kernel.org/r/20191122221912.20100-2-martin.wilck@suse.com Tested-by: David Bond <dbond@suse.com> Signed-off-by: Martin Wilck <mwilck@suse.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI	Huacai Chen	1	-3/+3
	commit 45dc8f2d9c94ed74a5e31e63e9136a19a7e16081 upstream. Commit 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/ pci_free_irq_vectors calls.") use pci_alloc_irq_vectors() to replace pci_enable_msi() but it didn't handle the return value correctly. This bug make qla2x00 always fail to setup MSI if MSI-X fail, so fix it. BTW, improve the log message of return value in qla2x00_request_irqs() to avoid confusion. Fixes: 4fa183455988 ("scsi: qla2xxx: Utilize pci_alloc_irq_vectors/pci_free_irq_vectors calls.") Cc: Michael Hernandez <michael.hernandez@cavium.com> Link: https://lore.kernel.org/r/1574314847-14280-1-git-send-email-chenhc@lemote.com Signed-off-by: Huacai Chen <chenhc@lemote.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: bnx2i: fix potential use after free	Pan Bian	1	-1/+1
	commit 29d28f2b8d3736ac61c28ef7e20fda63795b74d9 upstream. The member hba->pcidev may be used after its reference is dropped. Move the put function to where it is never used to avoid potential use after free issues. Fixes: a77171806515 ("[SCSI] bnx2i: Removed the reference to the netdev->base_addr") Link: https://lore.kernel.org/r/1573043541-19126-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian <bianpan2016@163.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: qla4xxx: fix double free bug	Pan Bian	1	-3/+0
	commit 3fe3d2428b62822b7b030577cd612790bdd8c941 upstream. The variable init_fw_cb is released twice, resulting in a double free bug. The call to the function dma_free_coherent() before goto is removed to get rid of potential double free. Fixes: 2a49a78ed3c8 ("[SCSI] qla4xxx: added IPv6 support.") Link: https://lore.kernel.org/r/1572945927-27796-1-git-send-email-bianpan2016@163.com Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: esas2r: unlock on error in esas2r_nvram_read_direct()	Dan Carpenter	1	-0/+1
	commit 906ca6353ac09696c1bf0892513c8edffff5e0a6 upstream. This error path is missing an unlock. Fixes: 26780d9e12ed ("[SCSI] esas2r: ATTO Technology ExpressSAS 6G SAS/SATA RAID Adapter Driver") Link: https://lore.kernel.org/r/20191022102324.GA27540@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-23	scsi: fnic: fix invalid stack access	Arnd Bergmann	1	-10/+10
	commit 42ec15ceaea74b5f7a621fc6686cbf69ca66c4cf upstream. gcc -O3 warns that some local variables are not properly initialized: drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_hang_notify': drivers/scsi/fnic/vnic_dev.c:511:16: error: 'a0' is used uninitialized in this function [-Werror=uninitialized] vdev->args[0] = a0; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:691:6: note: 'a0' was declared here u64 a0, a1; ^~ drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized] vdev->args[1] = a1; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:691:10: note: 'a1' was declared here u64 a0, a1; ^~ drivers/scsi/fnic/vnic_dev.c: In function 'fnic_dev_mac_addr': drivers/scsi/fnic/vnic_dev.c:512:16: error: 'a1' is used uninitialized in this function [-Werror=uninitialized] vdev->args[1] = *a1; ~~~~~~~~~~~~~~^~~~~ drivers/scsi/fnic/vnic_dev.c:698:10: note: 'a1' was declared here u64 a0, a1; ^~ Apparently the code relies on the local variables occupying adjacent memory locations in the same order, but this is of course not guaranteed. Use an array of two u64 variables where needed to make it work correctly. I suspect there is also an endianness bug here, but have not digged in deep enough to be sure. Fixes: 5df6d737dd4b ("[SCSI] fnic: Add new Cisco PCI-Express FCoE HBA") Fixes: mmtom ("init/Kconfig: enable -O3 for all arches") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200107201602.4096790-1-arnd@arndb.de Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17	scsi: libcxgbi: fix NULL pointer dereference in cxgbi_device_destroy()	Varun Prakash	1	-1/+2
	[ Upstream commit 71482fde704efdd8c3abe0faf34d922c61e8d76b ] If cxgb4i_ddp_init() fails then cdev->cdev2ppm will be NULL, so add a check for NULL pointer before dereferencing it. Link: https://lore.kernel.org/r/1576676731-3068-1-git-send-email-varun@chelsio.com Signed-off-by: Varun Prakash <varun@chelsio.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-17	scsi: sd: enable compat ioctls for sed-opal	Arnd Bergmann	1	-2/+12
	commit 142b2ac82e31c174936c5719fa12ae28f51a55b7 upstream. The sed_ioctl() function is written to be compatible between 32-bit and 64-bit processes, however compat mode is only wired up for nvme, not for sd. Add the missing call to sed_ioctl() in sd_compat_ioctl(). Fixes: d80210f25ff0 ("sd: add support for TCG OPAL self encrypting disks") Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-17	scsi: sd: Clear sdkp->protection_type if disk is reformatted without PI	Xiang Chen	1	-1/+3
	commit 465f4edaecc6c37f81349233e84d46246bcac11a upstream. If an attached disk with protection information enabled is reformatted to Type 0 the revalidation code does not clear the original protection type and subsequent accesses will keep setting RDPROTECT/WRPROTECT. Set the protection type to 0 if the disk reports PROT_EN=0 in READ CAPACITY(16). [mkp: commit desc] Fixes: fe542396da73 ("[SCSI] sd: Ensure we correctly disable devices with unknown protection type") Link: https://lore.kernel.org/r/1578532344-101668-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-14	scsi: bfa: release allocated memory in case of error	Navid Emamdoost	1	-1/+3
	commit 0e62395da2bd5166d7c9e14cbc7503b256a34cb0 upstream. In bfad_im_get_stats if bfa_port_get_stats fails, allocated memory needs to be released. Link: https://lore.kernel.org/r/20190910234417.22151-1-navid.emamdoost@gmail.com Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-09	scsi: qedf: Do not retry ELS request if qedf_alloc_cmd fails	Chad Dupuis	1	-12/+4
	[ Upstream commit f1c43590365bac054d753d808dbbd207d09e088d ] If we cannot allocate an ELS middlepath request, simply fail instead of trying to delay and then reallocate. This delay logic is causing soft lockup messages: NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:7639] Modules linked in: xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun devlink ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_service_time vfat fat rpcrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support qedr(OE) ib_core joydev ipmi_ssif pcspkr hpilo hpwdt sg ipmi_si ipmi_devintf ipmi_msghandler ioatdma shpchp lpc_ich wmi dca acpi_power_meter dm_multipath ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic qedf(OE) libfcoe mgag200 libfc i2c_algo_bit drm_kms_helper scsi_transport_fc qede(OE) syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qed(OE) drm crct10dif_pclmul e1000e crct10dif_common crc32c_intel scsi_tgt hpsa i2c_core ptp scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 7639 Comm: kworker/2:1 Kdump: loaded Tainted: G OEL ------------ 3.10.0-861.el7.x86_64 #1 Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 07/21/2016 Workqueue: qedf_2_dpc qedf_handle_rrq [qedf] task: ffff959edd628fd0 ti: ffff959ed6f08000 task.ti: ffff959ed6f08000 RIP: 0010:[<ffffffff8355913a>] [<ffffffff8355913a>] delay_tsc+0x3a/0x60 RSP: 0018:ffff959ed6f0bd30 EFLAGS: 00000246 RAX: 000000008ef5f791 RBX: 5f646d635f666465 RCX: 0000025b8ededa2f RDX: 000000000000025b RSI: 0000000000000002 RDI: 0000000000217d1e RBP: ffff959ed6f0bd30 R08: ffffffffc079aae8 R09: 0000000000000200 R10: ffffffffc07952c6 R11: 0000000000000000 R12: 6c6c615f66646571 R13: ffff959ed6f0bcc8 R14: ffff959ed6f0bd08 R15: ffff959e00000028 FS: 0000000000000000(0000) GS:ffff959eff480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f4117fa1eb0 CR3: 0000002039e66000 CR4: 00000000003607e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: [<ffffffff8355907d>] __const_udelay+0x2d/0x30 [<ffffffffc079444a>] qedf_initiate_els+0x13a/0x450 [qedf] [<ffffffffc0794210>] ? qedf_srr_compl+0x2a0/0x2a0 [qedf] [<ffffffffc0795337>] qedf_send_rrq+0x127/0x230 [qedf] [<ffffffffc078ed55>] qedf_handle_rrq+0x15/0x20 [qedf] [<ffffffff832b2dff>] process_one_work+0x17f/0x440 [<ffffffff832b3ac6>] worker_thread+0x126/0x3c0 [<ffffffff832b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0 [<ffffffff832bae31>] kthread+0xd1/0xe0 [<ffffffff832bad60>] ? insert_kthread_work+0x40/0x40 [<ffffffff8391f637>] ret_from_fork_nospec_begin+0x21/0x21 [<ffffffff832bad60>] ? insert_kthread_work+0x40/0x40 Signed-off-by: Chad Dupuis <cdupuis@marvell.com> Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: libsas: stop discovering if oob mode is disconnected	Jason Yan	1	-1/+10
	[ Upstream commit f70267f379b5e5e11bdc5d72a56bf17e5feed01f ] The discovering of sas port is driven by workqueue in libsas. When libsas is processing port events or phy events in workqueue, new events may rise up and change the state of some structures such as asd_sas_phy. This may cause some problems such as follows: ==>thread 1 ==>thread 2 ==>phy up ==>phy_up_v3_hw() ==>oob_mode = SATA_OOB_MODE; ==>phy down quickly ==>hisi_sas_phy_down() ==>sas_ha->notify_phy_event() ==>sas_phy_disconnected() ==>oob_mode = OOB_NOT_CONNECTED ==>workqueue wakeup ==>sas_form_port() ==>sas_discover_domain() ==>sas_get_port_device() ==>oob_mode is OOB_NOT_CONNECTED and device is wrongly taken as expander This at last lead to the panic when libsas trying to issue a command to discover the device. [183047.614035] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 [183047.622896] Mem abort info: [183047.625762] ESR = 0x96000004 [183047.628893] Exception class = DABT (current EL), IL = 32 bits [183047.634888] SET = 0, FnV = 0 [183047.638015] EA = 0, S1PTW = 0 [183047.641232] Data abort info: [183047.644189] ISV = 0, ISS = 0x00000004 [183047.648100] CM = 0, WnR = 0 [183047.651145] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000b7df67be [183047.657834] [0000000000000058] pgd=0000000000000000 [183047.662789] Internal error: Oops: 96000004 [#1] SMP [183047.667740] Process kworker/u16:2 (pid: 31291, stack limit = 0x00000000417c4974) [183047.675208] CPU: 0 PID: 3291 Comm: kworker/u16:2 Tainted: G W OE 4.19.36-vhulk1907.1.0.h410.eulerosv2r8.aarch64 #1 [183047.687015] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10, BIOS 0.15 10/22/2019 [183047.695007] Workqueue: 0000:74:02.0_disco_q sas_discover_domain [183047.700999] pstate: 20c00009 (nzCv daif +PAN +UAO) [183047.705864] pc : prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw] [183047.711510] lr : prep_ata_v3_hw+0xb0/0x230 [hisi_sas_v3_hw] [183047.717153] sp : ffff00000f28ba60 [183047.720541] x29: ffff00000f28ba60 x28: ffff8026852d7228 [183047.725925] x27: ffff8027dba3e0a8 x26: ffff8027c05fc200 [183047.731310] x25: 0000000000000000 x24: ffff8026bafa8dc0 [183047.736695] x23: ffff8027c05fc218 x22: ffff8026852d7228 [183047.742079] x21: ffff80007c2f2940 x20: ffff8027c05fc200 [183047.747464] x19: 0000000000f80800 x18: 0000000000000010 [183047.752848] x17: 0000000000000000 x16: 0000000000000000 [183047.758232] x15: ffff000089a5a4ff x14: 0000000000000005 [183047.763617] x13: ffff000009a5a50e x12: ffff8026bafa1e20 [183047.769001] x11: ffff0000087453b8 x10: ffff00000f28b870 [183047.774385] x9 : 0000000000000000 x8 : ffff80007e58f9b0 [183047.779770] x7 : 0000000000000000 x6 : 000000000000003f [183047.785154] x5 : 0000000000000040 x4 : ffffffffffffffe0 [183047.790538] x3 : 00000000000000f8 x2 : 0000000002000007 [183047.795922] x1 : 0000000000000008 x0 : 0000000000000000 [183047.801307] Call trace: [183047.803827] prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw] [183047.809127] hisi_sas_task_prep+0x750/0x888 [hisi_sas_main] [183047.814773] hisi_sas_task_exec.isra.7+0x88/0x1f0 [hisi_sas_main] [183047.820939] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] [183047.826757] smp_execute_task_sg+0xec/0x218 [183047.831013] smp_execute_task+0x74/0xa0 [183047.834921] sas_discover_expander.part.7+0x9c/0x5f8 [183047.839959] sas_discover_root_expander+0x90/0x160 [183047.844822] sas_discover_domain+0x1b8/0x1e8 [183047.849164] process_one_work+0x1b4/0x3f8 [183047.853246] worker_thread+0x54/0x470 [183047.856981] kthread+0x134/0x138 [183047.860283] ret_from_fork+0x10/0x18 [183047.863931] Code: f9407a80 528000e2 39409281 72a04002 (b9405800) [183047.870097] kernel fault(0x1) notification starting on CPU 0 [183047.875828] kernel fault(0x1) notification finished on CPU 0 [183047.881559] Modules linked in: unibsp(OE) hns3(OE) hclge(OE) hnae3(OE) mem_drv(OE) hisi_sas_v3_hw(OE) hisi_sas_main(OE) [183047.892418] ---[ end trace 4cc26083fc11b783 ]--- [183047.897107] Kernel panic - not syncing: Fatal exception [183047.902403] kernel fault(0x5) notification starting on CPU 0 [183047.908134] kernel fault(0x5) notification finished on CPU 0 [183047.913865] SMP: stopping secondary CPUs [183047.917861] Kernel Offset: disabled [183047.921422] CPU features: 0x2,a2a00a38 [183047.925243] Memory Limit: none [183047.928372] kernel reboot(0x2) notification starting on CPU 0 [183047.934190] kernel reboot(0x2) notification finished on CPU 0 [183047.940008] ---[ end Kernel panic - not syncing: Fatal exception ]--- Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com Reported-by: Gao Chuan <gaochuan4@huawei.com> Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: iscsi: qla4xxx: fix double free in probe	Dan Carpenter	1	-1/+0
	[ Upstream commit fee92f25777789d73e1936b91472e9c4644457c8 ] On this error path we call qla4xxx_mem_free() and then the caller also calls qla4xxx_free_adapter() which calls qla4xxx_mem_free(). It leads to a couple double frees: drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->chap_dma_pool' double freed drivers/scsi/qla4xxx/ql4_os.c:8856 qla4xxx_probe_adapter() warn: 'ha->fw_ddb_dma_pool' double freed Fixes: afaf5a2d341d ("[SCSI] Initial Commit of qla4xxx") Link: https://lore.kernel.org/r/20191203094421.hw7ex7qr3j2rbsmx@kili.mountain Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Ignore PORT UPDATE after N2N PLOGI	Roman Bolshakov	1	-1/+2
	[ Upstream commit af22f0c7b052c5c203207f1e5ebd6aa65f87c538 ] PORT UPDATE asynchronous event is generated on the host that issues PLOGI ELS (in the case of higher WWPN). In that case, the event shouldn't be handled as it sets unwanted DPC flags (i.e. LOOP_RESYNC_NEEDED) that trigger link flap. Ignore the event if the host has higher WWPN, but handle otherwise. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-13-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Send Notify ACK after N2N PLOGI	Roman Bolshakov	1	-0/+1
	[ Upstream commit 5e6b01d84b9d20bcd77fc7c4733a2a4149bf220a ] qlt_handle_login schedules session for deletion even if a login is in progress. That causes login bouncing, i.e. a few logins are made before it settles down. Complete the first login by sending Notify Acknowledge IOCB via qlt_plogi_ack_unref if the session is pending login completion. Fixes: 9cd883f07a54 ("scsi: qla2xxx: Fix session cleanup for N2N") Cc: Krishna Kant <krishna.kant@purestorage.com> Cc: Alexei Potashnik <alexei@purestorage.com> Link: https://lore.kernel.org/r/20191125165702.1013-11-r.bolshakov@yadro.com Acked-by: Quinn Tran <qutran@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Configure local loop for N2N target	Roman Bolshakov	1	-8/+2
	[ Upstream commit fd1de5830a5abaf444cc4312871e02c41e24fdc1 ] qla2x00_configure_local_loop initializes PLOGI payload for PLOGI ELS using Get Parameters mailbox command. In the case when the driver is running in target mode, the topology is N2N and the target port has higher WWPN, LOCAL_LOOP_UPDATE bit is cleared too early and PLOGI payload is not initialized by the Get Parameters command. That causes a failure of ELS IOCB carrying the PLOGI with 0x15 aka Data Underrun error. LOCAL_LOOP_UPDATE has to be set to initialize PLOGI payload. Fixes: 48acad099074 ("scsi: qla2xxx: Fix N2N link re-connect") Link: https://lore.kernel.org/r/20191125165702.1013-10-r.bolshakov@yadro.com Acked-by: Quinn Tran <qutran@marvell.com> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Fix PLOGI payload and ELS IOCB dump length	Roman Bolshakov	1	-2/+4
	[ Upstream commit 0334cdea1fba36fad8bdf9516f267ce01de625f7 ] The size of the buffer is hardcoded as 0x70 or 112 bytes, while the size of ELS IOCB is 0x40 and the size of PLOGI payload returned by Get Parameters command is 0x74. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-9-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Don't call qlt_async_event twice	Roman Bolshakov	1	-4/+0
	[ Upstream commit 2c2f4bed9b6299e6430a65a29b5d27b8763fdf25 ] MBA_PORT_UPDATE generates duplicate log lines in target mode because qlt_async_event is called twice. Drop the calls within the case as the function will be called right after the switch statement. Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-8-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvel.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Acked-by: Himanshu Madhani <hmadhani@marvell.com> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: qla2xxx: Drop superfluous INIT_WORK of del_work	Roman Bolshakov	1	-1/+0
	[ Upstream commit 600954e6f2df695434887dfc6a99a098859990cf ] del_work is already initialized inside qla2x00_alloc_fcport, there's no need to overwrite it. Indeed, it might prevent complete traversal of workqueue list. Fixes: a01c77d2cbc45 ("scsi: qla2xxx: Move session delete to driver work queue") Cc: Quinn Tran <qutran@marvell.com> Link: https://lore.kernel.org/r/20191125165702.1013-5-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	scsi: lpfc: Fix memory leak on lpfc_bsg_write_ebuf_set func	Bo Wu	1	-6/+9
	[ Upstream commit 9a1b0b9a6dab452fb0e39fe96880c4faf3878369 ] When phba->mbox_ext_buf_ctx.seqNum != phba->mbox_ext_buf_ctx.numBuf, dd_data should be freed before return SLI_CONFIG_HANDLED. When lpfc_sli_issue_mbox func return fails, pmboxq should be also freed in job_error tag. Link: https://lore.kernel.org/r/EDBAAA0BBBA2AC4E9C8B6B81DEEE1D6915E7A966@DGGEML525-MBS.china.huawei.com Signed-off-by: Bo Wu <wubo40@huawei.com> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-09	nvme_fc: add module to ops template to allow module references	James Smart	2	-0/+3
	[ Upstream commit 863fbae929c7a5b64e96b8a3ffb34a29eefb9f8f ] In nvme-fc: it's possible to have connected active controllers and as no references are taken on the LLDD, the LLDD can be unloaded. The controller would enter a reconnect state and as long as the LLDD resumed within the reconnect timeout, the controller would resume. But if a namespace on the controller is the root device, allowing the driver to unload can be problematic. To reload the driver, it may require new io to the boot device, and as it's no longer connected we get into a catch-22 that eventually fails, and the system locks up. Fix this issue by taking a module reference for every connected controller (which is what the core layer did to the transport module). Reference is cleared when the controller is removed. Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: iscsi: Don't send data to unbound connection	Anatol Pomazau	1	-0/+8
	[ Upstream commit 238191d65d7217982d69e21c1d623616da34b281 ] If a faulty initiator fails to bind the socket to the iSCSI connection before emitting a command, for instance, a subsequent send_pdu, it will crash the kernel due to a null pointer dereference in sock_sendmsg(), as shown in the log below. This patch makes sure the bind succeeded before trying to use the socket. BUG: kernel NULL pointer dereference, address: 0000000000000018 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI CPU: 3 PID: 7 Comm: kworker/u8:0 Not tainted 5.4.0-rc2.iscsi+ #13 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 24.158246] Workqueue: iscsi_q_0 iscsi_xmitworker [ 24.158883] RIP: 0010:apparmor_socket_sendmsg+0x5/0x20 [...] [ 24.161739] RSP: 0018:ffffab6440043ca0 EFLAGS: 00010282 [ 24.162400] RAX: ffffffff891c1c00 RBX: ffffffff89d53968 RCX: 0000000000000001 [ 24.163253] RDX: 0000000000000030 RSI: ffffab6440043d00 RDI: 0000000000000000 [ 24.164104] RBP: 0000000000000030 R08: 0000000000000030 R09: 0000000000000030 [ 24.165166] R10: ffffffff893e66a0 R11: 0000000000000018 R12: ffffab6440043d00 [ 24.166038] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9d5575a62e90 [ 24.166919] FS: 0000000000000000(0000) GS:ffff9d557db80000(0000) knlGS:0000000000000000 [ 24.167890] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 24.168587] CR2: 0000000000000018 CR3: 000000007a838000 CR4: 00000000000006e0 [ 24.169451] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 24.170320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 24.171214] Call Trace: [ 24.171537] security_socket_sendmsg+0x3a/0x50 [ 24.172079] sock_sendmsg+0x16/0x60 [ 24.172506] iscsi_sw_tcp_xmit_segment+0x77/0x120 [ 24.173076] iscsi_sw_tcp_pdu_xmit+0x58/0x170 [ 24.173604] ? iscsi_dbg_trace+0x63/0x80 [ 24.174087] iscsi_tcp_task_xmit+0x101/0x280 [ 24.174666] iscsi_xmit_task+0x83/0x110 [ 24.175206] iscsi_xmitworker+0x57/0x380 [ 24.175757] ? __schedule+0x2a2/0x700 [ 24.176273] process_one_work+0x1b5/0x360 [ 24.176837] worker_thread+0x50/0x3c0 [ 24.177353] kthread+0xf9/0x130 [ 24.177799] ? process_one_work+0x360/0x360 [ 24.178401] ? kthread_park+0x90/0x90 [ 24.178915] ret_from_fork+0x35/0x40 [ 24.179421] Modules linked in: [ 24.179856] CR2: 0000000000000018 [ 24.180327] ---[ end trace b4b7674b6df5f480 ]--- Signed-off-by: Anatol Pomazau <anatol@google.com> Co-developed-by: Frank Mayhar <fmayhar@google.com> Signed-off-by: Frank Mayhar <fmayhar@google.com> Co-developed-by: Bharath Ravi <rbharath@google.com> Signed-off-by: Bharath Ravi <rbharath@google.com> Co-developed-by: Khazhimsel Kumykov <khazhy@google.com> Signed-off-by: Khazhimsel Kumykov <khazhy@google.com> Co-developed-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: NCR5380: Add disconnect_mask module parameter	Finn Thain	1	-1/+5
	[ Upstream commit 0b7a223552d455bcfba6fb9cfc5eef2b5fce1491 ] Add a module parameter to inhibit disconnect/reselect for individual targets. This gains compatibility with Aztec PowerMonster SCSI/SATA adapters with buggy firmware. (No fix is available from the vendor.) Apparently these adapters pass-through the product/vendor of the attached SATA device. Since they can't be identified from the response to an INQUIRY command, a device blacklist flag won't work. Cc: Michael Schmitz <schmitzmic@gmail.com> Link: https://lore.kernel.org/r/993b17545990f31f9fa5a98202b51102a68e7594.1573875417.git.fthain@telegraphics.com.au Reviewed-and-tested-by: Michael Schmitz <schmitzmic@gmail.com> Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: scsi_debug: num_tgts must be >= 0	Maurizio Lombardi	1	-0/+5
	[ Upstream commit aa5334c4f3014940f11bf876e919c956abef4089 ] Passing the parameter "num_tgts=-1" will start an infinite loop that exhausts the system memory Link: https://lore.kernel.org/r/20191115163727.24626-1-mlombard@redhat.com Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: ufs: Fix error handing during hibern8 enter	Subhash Jadavani	1	-5/+14
	[ Upstream commit 6d303e4b19d694cdbebf76bcdb51ada664ee953d ] During clock gating (ufshcd_gate_work()), we first put the link hibern8 by calling ufshcd_uic_hibern8_enter() and if ufshcd_uic_hibern8_enter() returns success (0) then we gate all the clocks. Now let’s zoom in to what ufshcd_uic_hibern8_enter() does internally: It calls __ufshcd_uic_hibern8_enter() and if failure is encountered, link recovery shall put the link back to the highest HS gear and returns success (0) to ufshcd_uic_hibern8_enter() which is the issue as link is still in active state due to recovery! Now ufshcd_uic_hibern8_enter() returns success to ufshcd_gate_work() and hence it goes ahead with gating the UFS clock while link is still in active state hence I believe controller would raise UIC error interrupts. But when we service the interrupt, clocks might have already been disabled! This change fixes for this by returning failure from __ufshcd_uic_hibern8_enter() if recovery succeeds as link is still not in hibern8, upon receiving the error ufshcd_hibern8_enter() would initiate retry to put the link state back into hibern8. Link: https://lore.kernel.org/r/1573798172-20534-8-git-send-email-cang@codeaurora.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: pm80xx: Fix for SATA device discovery	peter chang	1	-0/+2
	[ Upstream commit ce21c63ee995b7a8b7b81245f2cee521f8c3c220 ] Driver was missing complete() call in mpi_sata_completion which result in SATA abort error handling timing out. That causes the device to be left in the in_recovery state so subsequent commands sent to the device fail and the OS removes access to it. Link: https://lore.kernel.org/r/20191114100910.6153-2-deepak.ukey@microchip.com Acked-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: peter chang <dpf@google.com> Signed-off-by: Deepak Ukey <deepak.ukey@microchip.com> Signed-off-by: Viswas G <Viswas.G@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: atari_scsi: sun3_scsi: Set sg_tablesize to 1 instead of SG_NONE	Finn Thain	3	-6/+6
	[ Upstream commit 79172ab20bfd8437b277254028efdb68484e2c21 ] Since the scsi subsystem adopted the blk-mq API, a host with zero sg_tablesize crashes with a NULL pointer dereference. blk_queue_max_segments: set to minimum 1 scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5 scsi target0:0:0: Beginning Domain Validation scsi target0:0:0: Domain Validation skipping write tests scsi target0:0:0: Ending Domain Validation blk_queue_max_segments: set to minimum 1 scsi 0:0:1:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5 scsi target0:0:1: Beginning Domain Validation scsi target0:0:1: Domain Validation skipping write tests scsi target0:0:1: Ending Domain Validation blk_queue_max_segments: set to minimum 1 scsi 0:0:2:0: CD-ROM QEMU QEMU CD-ROM 2.5+ PQ: 0 ANSI: 5 scsi target0:0:2: Beginning Domain Validation scsi target0:0:2: Domain Validation skipping write tests scsi target0:0:2: Ending Domain Validation blk_queue_max_segments: set to minimum 1 blk_queue_max_segments: set to minimum 1 blk_queue_max_segments: set to minimum 1 blk_queue_max_segments: set to minimum 1 sr 0:0:2:0: Power-on or device reset occurred sd 0:0:0:0: Power-on or device reset occurred sd 0:0:1:0: Power-on or device reset occurred sd 0:0:0:0: [sda] 10485762 512-byte logical blocks: (5.37 GB/5.00 GiB) sd 0:0:0:0: [sda] Write Protect is off sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA Unable to handle kernel NULL pointer dereference at virtual address (ptrval) Oops: 00000000 Modules linked in: PC: [<001cd874>] blk_mq_free_request+0x66/0xe2 SR: 2004 SP: (ptrval) a2: 00874520 d0: 00000000 d1: 00000000 d2: 009ba800 d3: 00000000 d4: 00000000 d5: 08000002 a0: 0087be68 a1: 009a81e0 Process kworker/u2:2 (pid: 15, task=(ptrval)) Frame format=7 eff addr=0000007a ssw=0505 faddr=0000007a wb 1 stat/addr/data: 0000 00000000 00000000 wb 2 stat/addr/data: 0000 00000000 00000000 wb 3 stat/addr/data: 0000 0000007a 00000000 push data: 00000000 00000000 00000000 00000000 Stack from 0087bd98: 00000002 00000000 0087be72 009a7820 0087bdb4 001c4f6c 009a7820 0087bdd4 0024d200 009a7820 0024d0dc 0087be72 009baa00 0087be68 009a5000 0087be7c 00265d10 009a5000 0087be72 00000003 00000000 00000000 00000000 0087be68 00000bb8 00000005 00000000 00000000 00000000 00000000 00265c56 00000000 009ba60c 0036ddf4 00000002 ffffffff 009baa00 009ba600 009a50d6 0087be74 00227ba0 009baa08 00000001 009baa08 009ba60c 0036ddf4 00000000 00000000 Call Trace: [<001c4f6c>] blk_put_request+0xe/0x14 [<0024d200>] __scsi_execute+0x124/0x174 [<0024d0dc>] __scsi_execute+0x0/0x174 [<00265d10>] sd_revalidate_disk+0xba/0x1f02 [<00265c56>] sd_revalidate_disk+0x0/0x1f02 [<0036ddf4>] strlen+0x0/0x22 [<00227ba0>] device_add+0x3da/0x604 [<0036ddf4>] strlen+0x0/0x22 [<00267e64>] sd_probe+0x30c/0x4b4 [<0002da44>] process_one_work+0x0/0x402 [<0022b978>] really_probe+0x226/0x354 [<0022bc34>] driver_probe_device+0xa4/0xf0 [<0002da44>] process_one_work+0x0/0x402 [<0022bcd0>] __driver_attach_async_helper+0x50/0x70 [<00035dae>] async_run_entry_fn+0x36/0x130 [<0002db88>] process_one_work+0x144/0x402 [<0002e1aa>] worker_thread+0x0/0x570 [<0002e29a>] worker_thread+0xf0/0x570 [<0002e1aa>] worker_thread+0x0/0x570 [<003768d8>] schedule+0x0/0xb8 [<0003f58c>] __init_waitqueue_head+0x0/0x12 [<00033e92>] kthread+0xc2/0xf6 [<000331e8>] kthread_parkme+0x0/0x4e [<003768d8>] schedule+0x0/0xb8 [<00033dd0>] kthread+0x0/0xf6 [<00002c10>] ret_from_kernel_thread+0xc/0x14 Code: 0280 0006 0800 56c0 4400 0280 0000 00ff <52b4> 0c3a 082b 0006 0013 6706 2042 53a8 00c4 4ab9 0047 3374 6640 202d 000c 670c Disabling lock debugging due to kernel taint Avoid this by setting sg_tablesize = 1. Link: https://lore.kernel.org/r/4567bcae94523b47d6f3b77450ba305823bca479.1572656814.git.fthain@telegraphics.com.au Reported-and-tested-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Michael Schmitz <schmitzmic@gmail.com> References: commit 68ab2d76e4be ("scsi: cxlflash: Set sg_tablesize to 1 instead of SG_NONE") Signed-off-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: ufs: fix potential bug which ends in system hang	Bean Huo	1	-1/+1
	[ Upstream commit cfcbae3895b86c390ede57b2a8f601dd5972b47b ] In function __ufshcd_query_descriptor(), in the event of an error happening, we directly goto out_unlock and forget to invaliate hba->dev_cmd.query.descriptor pointer. This results in this pointer still valid in ufshcd_copy_query_response() for other query requests which go through ufshcd_exec_raw_upiu_cmd(). This will cause __memcpy() crash and system hangs. Log as shown below: Unable to handle kernel paging request at virtual address ffff000012233c40 Mem abort info: ESR = 0x96000047 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000047 CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000028cc735c [ffff000012233c40] pgd=00000000bffff003, pud=00000000bfffe003, pmd=00000000ba8b8003, pte=0000000000000000 Internal error: Oops: 96000047 [#2] PREEMPT SMP ... Call trace: __memcpy+0x74/0x180 ufshcd_issue_devman_upiu_cmd+0x250/0x3c0 ufshcd_exec_raw_upiu_cmd+0xfc/0x1a8 ufs_bsg_request+0x178/0x3b0 bsg_queue_rq+0xc0/0x118 blk_mq_dispatch_rq_list+0xb0/0x538 blk_mq_sched_dispatch_requests+0x18c/0x1d8 __blk_mq_run_hw_queue+0xb4/0x118 blk_mq_run_work_fn+0x28/0x38 process_one_work+0x1ec/0x470 worker_thread+0x48/0x458 kthread+0x130/0x138 ret_from_fork+0x10/0x1c Code: 540000ab a8c12027 a88120c7 a8c12027 (a88120c7) ---[ end trace 793e1eb5dff69f2d ]--- note: kworker/0:2H[2054] exited with preempt_count 1 This patch is to move "descriptor = NULL" down to below the label "out_unlock". Fixes: d44a5f98bb49b2(ufs: query descriptor API) Link: https://lore.kernel.org/r/20191112223436.27449-3-huobean@gmail.com Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: lpfc: fix: Coverity: lpfc_cmpl_els_rsp(): Null pointer dereferences	James Smart	1	-1/+1
	[ Upstream commit 6c6d59e0fe5b86cf273d6d744a6a9768c4ecc756 ] Coverity reported the following: *** CID 101747: Null pointer dereferences (FORWARD_NULL) /drivers/scsi/lpfc/lpfc_els.c: 4439 in lpfc_cmpl_els_rsp() 4433 kfree(mp); 4434 } 4435 mempool_free(mbox, phba->mbox_mem_pool); 4436 } 4437 out: 4438 if (ndlp && NLP_CHK_NODE_ACT(ndlp)) { vvv CID 101747: Null pointer dereferences (FORWARD_NULL) vvv Dereferencing null pointer "shost". 4439 spin_lock_irq(shost->host_lock); 4440 ndlp->nlp_flag &= ~(NLP_ACC_REGLOGIN \| NLP_RM_DFLT_RPI); 4441 spin_unlock_irq(shost->host_lock); 4442 4443 /* If the node is not being used by another discovery thread, 4444 * and we are sending a reject, we are done with it. Fix by adding a check for non-null shost in line 4438. The scenario when shost is set to null is when ndlp is null. As such, the ndlp check present was sufficient. But better safe than sorry so add the shost check. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 101747 ("Null pointer dereferences") Fixes: 2e0fef85e098 ("[SCSI] lpfc: NPIV: split ports") CC: James Bottomley <James.Bottomley@SteelEye.com> CC: "Gustavo A. R. Silva" <gustavo@embeddedor.com> CC: linux-next@vger.kernel.org Link: https://lore.kernel.org/r/20191111230401.12958-3-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: lpfc: Fix duplicate unreg_rpi error in port offline flow	James Smart	1	-0/+7
	[ Upstream commit 7cfd5639d99bec0d27af089d0c8c114330e43a72 ] If the driver receives a login that is later then LOGO'd by the remote port (aka ndlp), the driver, upon the completion of the LOGO ACC transmission, will logout the node and unregister the rpi that is being used for the node. As part of the unreg, the node's rpi value is replaced by the LPFC_RPI_ALLOC_ERROR value. If the port is subsequently offlined, the offline walks the nodes and ensures they are logged out, which possibly entails unreg'ing their rpi values. This path does not validate the node's rpi value, thus doesn't detect that it has been unreg'd already. The replaced rpi value is then used when accessing the rpi bitmask array which tracks active rpi values. As the LPFC_RPI_ALLOC_ERROR value is not a valid index for the bitmask, it may fault the system. Revise the rpi release code to detect when the rpi value is the replaced RPI_ALLOC_ERROR value and ignore further release steps. Link: https://lore.kernel.org/r/20191105005708.7399-2-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: tracing: Fix handling of TRANSFER LENGTH == 0 for READ(6) and WRITE(6)	Bart Van Assche	1	-4/+7
	[ Upstream commit f6b8540f40201bff91062dd64db8e29e4ddaaa9d ] According to SBC-2 a TRANSFER LENGTH field of zero means that 256 logical blocks must be transferred. Make the SCSI tracing code follow SBC-2. Fixes: bf8162354233 ("[SCSI] add scsi trace core functions and put trace points") Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Link: https://lore.kernel.org/r/20191105215553.185018-1-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: hisi_sas: Replace in_softirq() check in hisi_sas_task_exec()	Xiang Chen	1	-1/+7
	[ Upstream commit 550c0d89d52d3bec5c299f69b4ed5d2ee6b8a9a6 ] For IOs from upper layer, preemption may be disabled as it may be called by function __blk_mq_delay_run_hw_queue which will call get_cpu() (it disables preemption). So if flags HISI_SAS_REJECT_CMD_BIT is set in function hisi_sas_task_exec(), it may disable preempt twice after down() and up() which will cause following call trace: BUG: scheduling while atomic: fio/60373/0x00000002 Call trace: dump_backtrace+0x0/0x150 show_stack+0x24/0x30 dump_stack+0xa0/0xc4 __schedule_bug+0x68/0x88 __schedule+0x4b8/0x548 schedule+0x40/0xd0 schedule_timeout+0x200/0x378 __down+0x78/0xc8 down+0x54/0x70 hisi_sas_task_exec.isra.10+0x598/0x8d8 [hisi_sas_main] hisi_sas_queue_command+0x28/0x38 [hisi_sas_main] sas_queuecommand+0x168/0x1b0 [libsas] scsi_queue_rq+0x2ac/0x980 blk_mq_dispatch_rq_list+0xb0/0x550 blk_mq_do_dispatch_sched+0x6c/0x110 blk_mq_sched_dispatch_requests+0x114/0x1d8 __blk_mq_run_hw_queue+0xb8/0x130 __blk_mq_delay_run_hw_queue+0x1c0/0x220 blk_mq_run_hw_queue+0xb0/0x128 blk_mq_sched_insert_requests+0xdc/0x208 blk_mq_flush_plug_list+0x1b4/0x3a0 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x3c/0x50 blkdev_direct_IO+0x404/0x550 generic_file_read_iter+0x9c/0x848 blkdev_read_iter+0x50/0x78 aio_read+0xc8/0x170 io_submit_one+0x1fc/0x8d8 __arm64_sys_io_submit+0xdc/0x280 el0_svc_common.constprop.0+0xe0/0x1e0 el0_svc_handler+0x34/0x90 el0_svc+0x10/0x14 ... To solve the issue, check preemptible() to avoid disabling preempt multiple when flag HISI_SAS_REJECT_CMD_BIT is set. Link: https://lore.kernel.org/r/1571926105-74636-5-git-send-email-john.garry@huawei.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: csiostor: Don't enable IRQs too early	Dan Carpenter	1	-6/+9
	[ Upstream commit d6c9b31ac3064fbedf8961f120a4c117daa59932 ] These are called with IRQs disabled from csio_mgmt_tmo_handler() so we can't call spin_unlock_irq() or it will enable IRQs prematurely. Fixes: a3667aaed569 ("[SCSI] csiostor: Chelsio FCoE offload driver") Link: https://lore.kernel.org/r/20191019085913.GA14245@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: lpfc: Fix SLI3 hba in loop mode not discovering devices	James Smart	1	-1/+3
	[ Upstream commit feff8b3d84d3d9570f893b4d83e5eab6693d6a52 ] When operating in private loop mode, PLOGI exchanges are racing and the driver tries to abort it's PLOGI. But the PLOGI abort ends up terminating the login with the other end causing the other end to abort its PLOGI as well. Discovery never fully completes. Fix by disabling the PLOGI abort when private loop and letting the state machine play out. Link: https://lore.kernel.org/r/20191018211832.7917-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: lpfc: Fix locking on mailbox command completion	James Smart	1	-1/+7
	[ Upstream commit 07b8582430370097238b589f4e24da7613ca6dd3 ] Symptoms were seen of the driver not having valid data for mailbox commands. After debugging, the following sequence was found: The driver maintains a port-wide pointer of the mailbox command that is currently in execution. Once finished, the port-wide pointer is cleared (done in lpfc_sli4_mq_release()). The next mailbox command issued will set the next pointer and so on. The mailbox response data is only copied if there is a valid port-wide pointer. In the failing case, it was seen that a new mailbox command was being attempted in parallel with the completion. The parallel path was seeing the mailbox no long in use (flag check under lock) and thus set the port pointer. The completion path had cleared the active flag under lock, but had not touched the port pointer. The port pointer is cleared after the lock is released. In this case, the completion path cleared the just-set value by the parallel path. Fix by making the calls that clear mbox state/port pointer while under lock. Also slightly cleaned up the error path. Link: https://lore.kernel.org/r/20190922035906.10977-8-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: mpt3sas: Fix clear pending bit in ioctl status	Sreekanth Reddy	1	-1/+2
	[ Upstream commit 782b281883caf70289ba6a186af29441a117d23e ] When user issues diag register command from application with required size, and if driver unable to allocate the memory, then it will fail the register command. While failing the register command, driver is not currently clearing MPT3_CMD_PENDING bit in ctl_cmds.status variable which was set before trying to allocate the memory. As this bit is set, subsequent register command will be failed with BUSY status even when user wants to register the trace buffer will less memory. Clear MPT3_CMD_PENDING bit in ctl_cmds.status before returning the diag register command with no memory status. Link: https://lore.kernel.org/r/1568379890-18347-4-git-send-email-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-04	scsi: lpfc: Fix discovery failures when target device connectivity bounces	James Smart	1	-1/+6
	[ Upstream commit 3f97aed6117c7677eb16756c4ec8b86000fd5822 ] An issue was seen discovering all SCSI Luns when a target device undergoes link bounce. The driver currently does not qualify the FC4 support on the target. Therefore it will send a SCSI PRLI and an NVMe PRLI. The expectation is that the target will reject the PRLI if it is not supported. If a PRLI times out, the driver will retry. The driver will not proceed with the device until both SCSI and NVMe PRLIs are resolved. In the failure case, the device is FCP only and does not respond to the NVMe PRLI, thus initiating the wait/retry loop in the driver. During that time, a RSCN is received (device bounced) causing the driver to issue a GID_FT. The GID_FT response comes back before the PRLI mess is resolved and it prematurely cancels the PRLI retry logic and leaves the device in a STE_PRLI_ISSUE state. Discovery with the target never completes or resets. Fix by resetting the node state back to STE_NPR_NODE when GID_FT completes, thereby restarting the discovery process for the node. Link: https://lore.kernel.org/r/20190922035906.10977-10-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-12-21	scsi: qla2xxx: Change discovery state before PLOGI	Roman Bolshakov	1	-0/+1
	commit 58e39a2ce4be08162c0368030cdc405f7fd849aa upstream. When a port sends PLOGI, discovery state should be changed to login pending, otherwise RELOGIN_NEEDED bit is set in qla24xx_handle_plogi_done_event(). RELOGIN_NEEDED triggers another PLOGI, and it never goes out of the loop until login timer expires. Fixes: 8777e4314d397 ("scsi: qla2xxx: Migrate NVME N2N handling into state machine") Fixes: 8b5292bcfcacf ("scsi: qla2xxx: Fix Relogin to prevent modifying scan_state flag") Cc: Quinn Tran <qutran@marvell.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20191125165702.1013-6-r.bolshakov@yadro.com Acked-by: Himanshu Madhani <hmadhani@marvell.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Tested-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-21	scsi: iscsi: Fix a potential deadlock in the timeout handler	Bart Van Assche	1	-2/+2
	commit 5480e299b5ae57956af01d4839c9fc88a465eeab upstream. Some time ago the block layer was modified such that timeout handlers are called from thread context instead of interrupt context. Make it safe to run the iSCSI timeout handler in thread context. This patch fixes the following lockdep complaint: ================================ WARNING: inconsistent lock state 5.5.1-dbg+ #11 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. kworker/7:1H/206 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff88802d9827e8 (&(&session->frwd_lock)->rlock){+.?.}, at: iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] {IN-SOFTIRQ-W} state was registered at: lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_check_transport_timeouts+0x3e/0x210 [libiscsi] call_timer_fn+0x132/0x470 __run_timers.part.0+0x39f/0x5b0 run_timer_softirq+0x63/0xc0 __do_softirq+0x12d/0x5fd irq_exit+0xb3/0x110 smp_apic_timer_interrupt+0x131/0x3d0 apic_timer_interrupt+0xf/0x20 default_idle+0x31/0x230 arch_cpu_idle+0x13/0x20 default_idle_call+0x53/0x60 do_idle+0x38a/0x3f0 cpu_startup_entry+0x24/0x30 start_secondary+0x222/0x290 secondary_startup_64+0xa4/0xb0 irq event stamp: 1383705 hardirqs last enabled at (1383705): [<ffffffff81aace5c>] _raw_spin_unlock_irq+0x2c/0x50 hardirqs last disabled at (1383704): [<ffffffff81aacb98>] _raw_spin_lock_irq+0x18/0x50 softirqs last enabled at (1383690): [<ffffffffa0e2efea>] iscsi_queuecommand+0x76a/0xa20 [libiscsi] softirqs last disabled at (1383682): [<ffffffffa0e2e998>] iscsi_queuecommand+0x118/0xa20 [libiscsi] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&session->frwd_lock)->rlock); <Interrupt> lock(&(&session->frwd_lock)->rlock); * DEADLOCK * 2 locks held by kworker/7:1H/206: #0: ffff8880d57bf928 ((wq_completion)kblockd){+.+.}, at: process_one_work+0x472/0xab0 #1: ffff88802b9c7de8 ((work_completion)(&q->timeout_work)){+.+.}, at: process_one_work+0x476/0xab0 stack backtrace: CPU: 7 PID: 206 Comm: kworker/7:1H Not tainted 5.5.1-dbg+ #11 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kblockd blk_mq_timeout_work Call Trace: dump_stack+0xa5/0xe6 print_usage_bug.cold+0x232/0x23b mark_lock+0x8dc/0xa70 __lock_acquire+0xcea/0x2af0 lock_acquire+0x106/0x240 _raw_spin_lock+0x38/0x50 iscsi_eh_cmd_timed_out+0xa6/0x6d0 [libiscsi] scsi_times_out+0xf4/0x440 [scsi_mod] scsi_timeout+0x1d/0x20 [scsi_mod] blk_mq_check_expired+0x365/0x3a0 bt_iter+0xd6/0xf0 blk_mq_queue_tag_busy_iter+0x3de/0x650 blk_mq_timeout_work+0x1af/0x380 process_one_work+0x56d/0xab0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 Fixes: 287922eb0b18 ("block: defer timeouts to a workqueue") Cc: Christoph Hellwig <hch@lst.de> Cc: Keith Busch <keith.busch@intel.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Chris Leech <cleech@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20191209173457.187370-1-bvanassche@acm.org Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-12-17	scsi: zorro_esp: Limit DMA transfers to 65536 bytes (except on Fastlane)	Kars de Jong	1	-2/+9
	[ Upstream commit 02f7e9f351a9de95577eafdc3bd413ed1c3b589f ] When using this driver on a Blizzard 1260, there were failures whenever DMA transfers from the SCSI bus to memory of 65535 bytes were followed by a DMA transfer of 1 byte. This caused the byte at offset 65535 to be overwritten with 0xff. The Blizzard hardware can't handle single byte DMA transfers. Besides this issue, limiting the DMA length to something that is not a multiple of the page size is very inefficient on most file systems. It seems this limit was chosen because the DMA transfer counter of the ESP by default is 16 bits wide, thus limiting the length to 65535 bytes. However, the value 0 means 65536 bytes, which is handled by the ESP and the Blizzard just fine. It is also the default maximum used by esp_scsi when drivers don't provide their own dma_length_limit() function. The limit of 65536 bytes can be used by all boards except the Fastlane. The old driver used a limit of 65532 bytes (0xfffc), which is reintroduced in this patch. Fixes: b7ded0e8b0d1 ("scsi: zorro_esp: Limit DMA transfers to 65535 bytes") Link: https://lore.kernel.org/r/20191112175523.23145-1-jongk@linux-m68k.org Signed-off-by: Kars de Jong <jongk@linux-m68k.org> Reviewed-by: Finn Thain <fthain@telegraphics.com.au> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>