Age | Commit message (Collapse) | Author | Files | Lines |
|
commit 5e420fe635813e5746b296cfc8fff4853ae205a2 upstream.
Add missing break statement and fix identation issue.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: 9cb62fa24e0d ("aacraid: Log firmware AIF messages")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d8f6382a7d026989029e2e50c515df954488459b ]
This reverts commit bbc0f8bd88abefb0f27998f40a073634a3a2db89.
It added a warning whose intent was to check whether the rport was still
linked into the peer list. It doesn't work as intended and gives false
positive warnings for two reasons:
1) If the rport is never linked into the peer list it will not be
considered empty since the list_head is never initialized.
2) If the rport is deleted from the peer list using list_del_rcu(), then
the list_head is in an undefined state and it is not considered empty.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8437fcf14deed67e5ad90b5e8abf62fb20f30881 ]
The "hostdata->dev" pointer is NULL here. We set "hostdata->dev = dev;"
later in the function and we also use "hostdata->dev" when we call
dma_free_attrs() in NCR_700_release().
This bug predates git version control.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b2d3492fc591b1fb46b81d79ca1fc44cac6ae0ae ]
There are two issues here. First if cmgr->hba is not set early enough then
it leads to a NULL dereference. Second if we don't completely initialize
cmgr->io_bdt_pool[] then we end up dereferencing uninitialized pointers.
Fixes: 853e2bd2103a ("[SCSI] bnx2fc: Broadcom FCoE offload driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40d07b523cf434f252b134c86b1f8f2d907ffb0b ]
The WRITE SAME(10) and (16) implementations didn't take account of the
buffer wrap required when the virtual_gb parameter is greater than 0.
Fix that and rename the fake_store() function to lba2fake_store() to lessen
confusion with the global fake_storep pointer. Bump version date.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reported-by: Bart Van Assche <bvanassche@acm.org>
Tested by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5d8fc4a9f0eec20b6c07895022a6bea3fb6dfb38 ]
The issue to be fixed in this commit is when libfc found it received a
invalid FLOGI response from FC switch, it would return without freeing the
fc frame, which is just the skb data. This would cause memory leak if FC
switch keeps sending invalid FLOGI responses.
This fix is just to make it execute `fc_frame_free(fp)` before returning
from function `fc_lport_flogi_resp`.
Signed-off-by: Ming Lu <ming.lu@citrix.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 4a067cf823d9d8e50d41cfb618011c0d4a969c72 upstream.
Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to
DID_OK for various cases including DID_NEXUS_FAILURE. Commit
2a842acab109 ("block: introduce new block status code type") replaced this
function with scsi_result_to_blk_status() and removed the host-byte
resetting code for the DID_NEXUS_FAILURE case. As the line
set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose
this was an editing mistake.
The fact that the host byte remains set after 4.13 is causing problems with
the sg_persist tool, which now returns success rather then exit status 24
when a RESERVATION CONFLICT error is encountered.
Fixes: 2a842acab109 "block: introduce new block status code type"
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit fe35a40e675473eb65f2f5462b82770f324b5689 ]
Assign fc_vport to ln->fc_vport before calling csio_fcoe_alloc_vnp() to
avoid a NULL pointer dereference in csio_vport_set_state().
ln->fc_vport is dereferenced in csio_vport_set_state().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c41f59884be5cca293ed61f3d64637dbba3a6381 ]
We cannot wait on a completion object in the lpfc_nvme_targetport structure
in the _destroy_targetport() code path because the NVMe/fc transport will
free that structure immediately after the .targetport_delete() callback.
This results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7961cba6f7d8215fa632df3d220e5154bb825249 ]
We cannot wait on a completion object in the lpfc_nvme_lport structure in
the _destroy_localport() code path because the NVMe/fc transport will free
that structure immediately after the .localport_delete() callback. This
results in a use-after-free, and a hang if slub_debug=FZPU is enabled.
Fix this by putting the completion on the stack.
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit ffeafdd2bf0b280d67ec1a47ea6287910d271f3f upstream.
The sysfs phy_identifier attribute for a sas_end_device comes from the rphy
phy_identifier value.
Currently this is not being set for rphys with an end device attached, so
we see incorrect symlinks from systemd disk/by-path:
root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3
Indeed, each sas_end_device phy_identifier value is 0:
root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier
0
root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier
0
This patch fixes the discovery code to set the phy_identifier. With this,
we now get proper symlinks:
root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3
lrwxrwxrwx 1 root root 9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3
Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
Reported-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9e8f1c79831424d30c0e3df068be7f4a244157c9 ]
In case of ->set_param() and ->bind_conn() cxgb4i driver does not wait for
cmd completion, this can create race conditions, to avoid this add
wait_for_completion().
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9be9db9f78f52ef03ee90063730cb9d730e7032b ]
Albeit we no longer rely on those hard-coded descriptor sizes, we still use
them as our defaults, so better get it right. While adding its sysfs
entries, we forgot to update the geometry descriptor size. It is 0x48
according to UFS2.1, and wasn't changed in UFS3.0.
[mkp: typo]
Fixes: c720c091222e (scsi: ufs: sysfs: geometry descriptor)
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 34a2ce887668db9dda4b56e6f155c49ac13f3e54 ]
When the driver finds invalid destination MAC for the first un-reachable
target, and before completes the PATH_REQ operation, set new ep_state to
OFFLDCONN_NONE so that as part of driver ep_poll mechanism, the upper
open-iscsi layer is notified to complete the login process on the first
un-reachable target and thus proceed login to other reachable targets.
Signed-off-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ce9e7bce43526626f7cffe2e657953997870197e ]
hba->is_sys_suspended is set after successful system suspend but
not clear after successful system resume.
According to current behavior, hba->is_sys_suspended will not be set if
host is runtime-suspended but not system-suspended. Thus we shall aligh the
same policy: clear this flag even if host remains runtime-suspended after
ufshcd_system_resume is successfully returned.
Simply fix this flag to correct host status logs.
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cc29a1b0a3f2597ce887d339222fa85b9307706d ]
scsi_mq_setup_tags(), which is called by scsi_add_host(), calculates the
command size to allocate based on the prot_capabilities. In the isci
driver, scsi_host_set_prot() is called after scsi_add_host() so the command
size gets calculated to be smaller than it needs to be. Eventually,
scsi_mq_init_request() locates the 'prot_sdb' after the command assuming it
was sized correctly and a buffer overrun may occur.
However, seeing blk_mq_alloc_rqs() rounds up to the nearest cache line
size, the mistake can go unnoticed.
The bug was noticed after the struct request size was reduced by commit
9d037ad707ed ("block: remove req->timeout_list")
Which likely reduced the allocated space for the request by an entire cache
line, enough that the overflow could be hit and it caused a panic, on boot,
at:
RIP: 0010:t10_pi_complete+0x77/0x1c0
Call Trace:
<IRQ>
sd_done+0xf5/0x340
scsi_finish_command+0xc3/0x120
blk_done_softirq+0x83/0xb0
__do_softirq+0xa1/0x2e6
irq_exit+0xbc/0xd0
call_function_single_interrupt+0xf/0x20
</IRQ>
sd_done() would call scsi_prot_sg_count() which reads the number of
entities in 'prot_sdb', but seeing 'prot_sdb' is located after the end of
the allocated space it reads a garbage number and erroneously calls
t10_pi_complete().
To prevent this, the calls to scsi_host_set_prot() are moved into
isci_host_alloc() before the call to scsi_add_host(). Out of caution, also
move the similar call to scsi_host_set_guard().
Fixes: 3d2d75254915 ("[SCSI] isci: T10 DIF support")
Link: http://lkml.kernel.org/r/da851333-eadd-163a-8c78-e1f4ec5ec857@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Intel SCU Linux support <intel-linux-scu@intel.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 72b4a0465f995175a2e22cf4a636bf781f1f28a7 ]
The return code should be check while qla4xxx_copy_from_fwddb_param fails.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit e4a056987c86f402f1286e050b1dee3f4ce7c7eb upstream.
The problem is that the default for MQ is not to gather entropy, whereas
the default for the legacy queue was always to gather it. The original
attempt to fix entropy gathering for rotational disks under MQ added an
else branch in sd_read_block_characteristics(). Unfortunately, the entire
check isn't reached if the device has no characteristics VPD page. Since
this page was only introduced in SBC-3 and its optional anyway, most less
expensive rotational disks don't have one, meaning they all stopped
gathering entropy when we made MQ the default. In a wholly unrelated
change, openssl and openssh won't function until the random number
generator is initialised, meaning lots of people have been seeing large
delays before they could log into systems with default MQ kernels due to
this lack of entropy, because it now can take tens of minutes to initialise
the kernel random number generator.
The fix is to set the non-rotational and add-randomness flags
unconditionally early on in the disk initialization path, so they can be
reset only if the device actually reports being non-rotational via the VPD
page.
Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Fixes: 83e32a591077 ("scsi: sd: Contribute to randomness when running rotational device")
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Xuewei Zhang <xueweiz@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 42caa0edabd6a0a392ec36a5f0943924e4954311 upstream.
The aic94xx driver is currently failing to load with errors like
sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:03.0/0000:02:00.3/0000:07:02.0/revision'
Because the PCI code had recently added a file named 'revision' to every
PCI device. Fix this by renaming the aic94xx revision file to
aic_revision. This is safe to do for us because as far as I can tell,
there's nothing in userspace relying on the current aic94xx revision file
so it can be renamed without breaking anything.
Fixes: 702ed3be1b1b (PCI: Create revision file in sysfs)
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bb61b843ffd46978d7ca5095453e572714934eeb upstream.
Presently when an error is encountered during probe of the cxlflash
adapter, a deadlock is seen with cpu thread stuck inside
cxlflash_remove(). Below is the trace of the deadlock as logged by
khungtaskd:
cxlflash 0006:00:00.0: cxlflash_probe: init_afu failed rc=-16
INFO: task kworker/80:1:890 blocked for more than 120 seconds.
Not tainted 5.0.0-rc4-capi2-kexec+ #2
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/80:1 D 0 890 2 0x00000808
Workqueue: events work_for_cpu_fn
Call Trace:
0x4d72136320 (unreliable)
__switch_to+0x2cc/0x460
__schedule+0x2bc/0xac0
schedule+0x40/0xb0
cxlflash_remove+0xec/0x640 [cxlflash]
cxlflash_probe+0x370/0x8f0 [cxlflash]
local_pci_probe+0x6c/0x140
work_for_cpu_fn+0x38/0x60
process_one_work+0x260/0x530
worker_thread+0x280/0x5d0
kthread+0x1a8/0x1b0
ret_from_kernel_thread+0x5c/0x80
INFO: task systemd-udevd:5160 blocked for more than 120 seconds.
The deadlock occurs as cxlflash_remove() is called from cxlflash_probe()
without setting 'cxlflash_cfg->state' to STATE_PROBED and the probe thread
starts to wait on 'cxlflash_cfg->reset_waitq'. Since the device was never
successfully probed the 'cxlflash_cfg->state' never changes from
STATE_PROBING hence the deadlock occurs.
We fix this deadlock by setting the variable 'cxlflash_cfg->state' to
STATE_PROBED in case an error occurs during cxlflash_probe() and just
before calling cxlflash_remove().
Cc: stable@vger.kernel.org
Fixes: c21e0bbfc485("cxlflash: Base support for IBM CXL Flash Adapter")
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 65111785acccb836ec75263b03b0e33f21e74f47 ]
Problem:
- during the driver initialization, driver will poll fw
for KERNEL_UP in a 30 seconds timeout.
- if the firmware is not ready after 30 seconds,
driver will not be loaded.
Fix:
- change timeout from 30 seconds to 3 minutes.
Reported-by: Feng Li <lifeng1519@gmail.com>
Reviewed-by: Ajish Koshy <ajish.koshy@microsemi.com>
Reviewed-by: Murthy Bhat <Murthy.Bhat@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7ff44499bafbd376115f0bb6b578d980f56ee13b ]
- fix race condition when a unit is deleted after an RLL,
and before we have gotten the LV_STATUS page of the unit.
- In this case we will get a standard inquiry, rather than
the desired page. This will result in a unit presented
which no longer exists.
- If we ask for LV_STATUS, insure we get LV_STATUS
Reviewed-by: Murthy Bhat <murthy.bhat@microsemi.com>
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b2346b5030cf9458f30a84028d9fe904b8c942a7 ]
Reviewed-by: Scott Benesh <scott.benesh@microsemi.com>
Reviewed-by: Ajish Koshy <ajish.koshy@microsemi.com>
Reviewed-by: Murthy Bhat <murthy.bhat@microsemi.com>
Reviewed-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 15bc43f31a074076f114e0b87931e3b220b7bff1 ]
Currently the time of SAS SSP connection is 1ms, which means the link
connection will fail if no IO response after this period.
For some disks handling large IO (such as 512k), 1ms is not enough, so
change it to 5ms.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 30e196cacefdd9a38c857caed23cefc9621bc5c1 ]
After a LOGO in response to an ABTS timeout, a PLOGI wasn't issued to
re-establish the login. An nlp_type check in the LOGO completion
handler failed to restart discovery for NVME targets. Revised the
nlp_type check for NVME as well as SCSI.
While reviewing the LOGO handling a few other issues were seen and
were addressed:
- Better lock synchronization around ndlp data types
- When the ABTS times out, unregister the RPI before sending the LOGO
so that all local exchange contexts are cleared and nothing received
while awaiting LOGO/PLOGI handling will be accepted.
- LOGO handling optimized to:
Wait only R_A_TOV for a response.
It doesn't need to be retried on timeout. If there wasn't a
response, a PLOGI will be sent, thus an implicit logout
applies as well when the other port sees it.
If there is a response, any kind of response is considered "good"
and the XRI quarantined for a exchange qualifier window.
- PLOGI is issued as soon a LOGO state is resolved.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc730212e8a378763cb182b889f90c8101331332 ]
Call sas_remove_host() before removing the target devices in the driver's
.remove() callback function(i.e. during driver unload time). So that
driver can provide a way to allow SYNC CACHE, START STOP unit commands
etc. (which are issued from SML) to the target drives during driver unload
time.
Once sas_remove_host() is called before removing the target drives then
driver can just clean up the resources allocated for target devices and no
need to call sas_port_delete_phy(), sas_port_delete() API's as these API's
internally called from sas_remove_host().
Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b114d9009d386276bfc3352289fc235781ae3353 ]
When LCB's are rejected, if beaconing was already in progress, the
Reason Code Explanation was not being set. Should have been set to
command in progress.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit d67247566450cf89a693307c9bc9f05a32d96cea upstream.
memcpy_fromio() doesn't provide any control over access size. For example,
on arm64, it is implemented using readb and readq. This may trigger a
synchronous external abort:
[ 3.729943] Internal error: synchronous external abort: 96000210 [#1] PREEMPT SMP
[ 3.737000] Modules linked in:
[ 3.744371] CPU: 2 PID: 1 Comm: swapper/0 Tainted: G S 4.20.0-rc4 #16
[ 3.747413] Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
[ 3.755295] pstate: 00000005 (nzcv daif -PAN -UAO)
[ 3.761978] pc : __memcpy_fromio+0x68/0x80
[ 3.766718] lr : ufshcd_dump_regs+0x50/0xb0
[ 3.770767] sp : ffff00000807ba00
[ 3.774830] x29: ffff00000807ba00 x28: 00000000fffffffb
[ 3.778344] x27: ffff0000089db068 x26: ffff8000f6e58000
[ 3.783728] x25: 000000000000000e x24: 0000000000000800
[ 3.789023] x23: ffff8000f6e587c8 x22: 0000000000000800
[ 3.794319] x21: ffff000008908368 x20: ffff8000f6e1ab80
[ 3.799615] x19: 000000000000006c x18: ffffffffffffffff
[ 3.804910] x17: 0000000000000000 x16: 0000000000000000
[ 3.810206] x15: ffff000009199648 x14: ffff000089244187
[ 3.815502] x13: ffff000009244195 x12: ffff0000091ab000
[ 3.820797] x11: 0000000005f5e0ff x10: ffff0000091998a0
[ 3.826093] x9 : 0000000000000000 x8 : ffff8000f6e1ac00
[ 3.831389] x7 : 0000000000000000 x6 : 0000000000000068
[ 3.836676] x5 : ffff8000f6e1abe8 x4 : 0000000000000000
[ 3.841971] x3 : ffff00000928c868 x2 : ffff8000f6e1abec
[ 3.847267] x1 : ffff00000928c868 x0 : ffff8000f6e1abe8
[ 3.852567] Process swapper/0 (pid: 1, stack limit = 0x(____ptrval____))
[ 3.857900] Call trace:
[ 3.864473] __memcpy_fromio+0x68/0x80
[ 3.866683] ufs_qcom_dump_dbg_regs+0x1c0/0x370
[ 3.870522] ufshcd_print_host_regs+0x168/0x190
[ 3.874946] ufshcd_init+0xd4c/0xde0
[ 3.879459] ufshcd_pltfrm_init+0x3c8/0x550
[ 3.883264] ufs_qcom_probe+0x24/0x60
[ 3.887188] platform_drv_probe+0x50/0xa0
Assuming aligned 32-bit registers, let's use readl, after making sure
that 'offset' and 'len' are indeed multiples of 4.
Fixes: ba80917d9932d ("scsi: ufs: ufshcd_dump_regs to use memcpy_fromio")
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Jeffrey Hugo <jhugo@codeaurora.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Tested-by: Evan Green <evgreen@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit c7a082e4242fd8cd21a441071e622f87c16bdacc ]
UBSAN reported those with MegaRAID SAS-3 3108,
[ 77.467308] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
[ 77.475402] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
[ 77.481677] CPU: 16 PID: 333 Comm: kworker/16:1 Not tainted 4.20.0-rc5+ #1
[ 77.488556] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
[ 77.495791] Workqueue: events work_for_cpu_fn
[ 77.500154] Call trace:
[ 77.502610] dump_backtrace+0x0/0x2c8
[ 77.506279] show_stack+0x24/0x30
[ 77.509604] dump_stack+0x118/0x19c
[ 77.513098] ubsan_epilogue+0x14/0x60
[ 77.516765] __ubsan_handle_out_of_bounds+0xfc/0x13c
[ 77.521767] mr_update_load_balance_params+0x150/0x158 [megaraid_sas]
[ 77.528230] MR_ValidateMapInfo+0x2cc/0x10d0 [megaraid_sas]
[ 77.533825] megasas_get_map_info+0x244/0x2f0 [megaraid_sas]
[ 77.539505] megasas_init_adapter_fusion+0x9b0/0xf48 [megaraid_sas]
[ 77.545794] megasas_init_fw+0x1ab4/0x3518 [megaraid_sas]
[ 77.551212] megasas_probe_one+0x2c4/0xbe0 [megaraid_sas]
[ 77.556614] local_pci_probe+0x7c/0xf0
[ 77.560365] work_for_cpu_fn+0x34/0x50
[ 77.564118] process_one_work+0x61c/0xf08
[ 77.568129] worker_thread+0x534/0xa70
[ 77.571882] kthread+0x1c8/0x1d0
[ 77.575114] ret_from_fork+0x10/0x1c
[ 89.240332] UBSAN: Undefined behaviour in drivers/scsi/megaraid/megaraid_sas_fp.c:117:32
[ 89.248426] index 255 is out of range for type 'MR_LD_SPAN_MAP [1]'
[ 89.254700] CPU: 16 PID: 95 Comm: kworker/u130:0 Not tainted 4.20.0-rc5+ #1
[ 89.261665] Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.50 06/01/2018
[ 89.268903] Workqueue: events_unbound async_run_entry_fn
[ 89.274222] Call trace:
[ 89.276680] dump_backtrace+0x0/0x2c8
[ 89.280348] show_stack+0x24/0x30
[ 89.283671] dump_stack+0x118/0x19c
[ 89.287167] ubsan_epilogue+0x14/0x60
[ 89.290835] __ubsan_handle_out_of_bounds+0xfc/0x13c
[ 89.295828] MR_LdRaidGet+0x50/0x58 [megaraid_sas]
[ 89.300638] megasas_build_io_fusion+0xbb8/0xd90 [megaraid_sas]
[ 89.306576] megasas_build_and_issue_cmd_fusion+0x138/0x460 [megaraid_sas]
[ 89.313468] megasas_queue_command+0x398/0x3d0 [megaraid_sas]
[ 89.319222] scsi_dispatch_cmd+0x1dc/0x8a8
[ 89.323321] scsi_request_fn+0x8e8/0xdd0
[ 89.327249] __blk_run_queue+0xc4/0x158
[ 89.331090] blk_execute_rq_nowait+0xf4/0x158
[ 89.335449] blk_execute_rq+0xdc/0x158
[ 89.339202] __scsi_execute+0x130/0x258
[ 89.343041] scsi_probe_and_add_lun+0x2fc/0x1488
[ 89.347661] __scsi_scan_target+0x1cc/0x8c8
[ 89.351848] scsi_scan_channel.part.3+0x8c/0xc0
[ 89.356382] scsi_scan_host_selected+0x130/0x1f0
[ 89.361002] do_scsi_scan_host+0xd8/0xf0
[ 89.364927] do_scan_async+0x9c/0x320
[ 89.368594] async_run_entry_fn+0x138/0x420
[ 89.372780] process_one_work+0x61c/0xf08
[ 89.376793] worker_thread+0x13c/0xa70
[ 89.380546] kthread+0x1c8/0x1d0
[ 89.383778] ret_from_fork+0x10/0x1c
This is because when populating Driver Map using firmware raid map, all
non-existing VDs set their ldTgtIdToLd to 0xff, so it can be skipped later.
From drivers/scsi/megaraid/megaraid_sas_base.c ,
memset(instance->ld_ids, 0xff, MEGASAS_MAX_LD_IDS);
From drivers/scsi/megaraid/megaraid_sas_fp.c ,
/* For non existing VDs, iterate to next VD*/
if (ld >= (MAX_LOGICAL_DRIVES_EXT - 1))
continue;
However, there are a few places that failed to skip those non-existing VDs
due to off-by-one errors. Then, those 0xff leaked into MR_LdRaidGet(0xff,
map) and triggered the out-of-bound accesses.
Fixes: 51087a8617fe ("megaraid_sas : Extended VD support")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e57b2945aa654e48f85a41e8917793c64ecb9de8 ]
We must free all irqs during shutdown, else kexec's 2nd kernel would hang
in pqi_wait_for_completion_io() as below:
Call trace:
pqi_wait_for_completion_io
pqi_submit_raid_request_synchronous.constprop.78+0x23c/0x310 [smartpqi]
pqi_configure_events+0xec/0x1f8 [smartpqi]
pqi_ctrl_init+0x814/0xca0 [smartpqi]
pqi_pci_probe+0x400/0x46c [smartpqi]
local_pci_probe+0x48/0xb0
pci_device_probe+0x14c/0x1b0
really_probe+0x218/0x3fc
driver_probe_device+0x70/0x140
__driver_attach+0x11c/0x134
bus_for_each_dev+0x70/0xc8
driver_attach+0x30/0x38
bus_add_driver+0x1f0/0x294
driver_register+0x74/0x12c
__pci_register_driver+0x64/0x70
pqi_init+0xd0/0x10000 [smartpqi]
do_one_initcall+0x60/0x1d8
do_init_module+0x64/0x1f8
load_module+0x10ec/0x1350
__se_sys_finit_module+0xd4/0x100
__arm64_sys_finit_module+0x28/0x34
el0_svc_handler+0x104/0x160
el0_svc+0x8/0xc
This happens only in the following combinations:
1. smartpqi is built as module, not built-in;
2. We have a disk connected to smartpqi card;
3. Both kexec's 1st and 2nd kernels use this disk as Rootfs' mount point.
Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2ba55c9851d74eb015a554ef69ddf2ef061d5780 ]
Problem:
The Linux kernel takes a logical volume offline after a LUN reset. This is
generally accompanied by this message in the dmesg output:
Device offlined - not ready after error recovery
Root Cause:
The root cause is a "quirk" in the timeout handling in the Linux SCSI
layer. The Linux kernel places a 30-second timeout on most media access
commands (reads and writes) that it send to device drivers. When a media
access command times out, the Linux kernel goes into error recovery mode
for the LUN that was the target of the command that timed out. Every
command that timed out is kept on a list inside of the Linux kernel to be
retried later. The kernel attempts to recover the command(s) that timed out
by issuing a LUN reset followed by a TEST UNIT READY. If the LUN reset and
TEST UNIT READY commands are successful, the kernel retries the command(s)
that timed out.
Each SCSI command issued by the kernel has a result field associated with
it. This field indicates the final result of the command (success or
error). When a command times out, the kernel places a value in this result
field indicating that the command timed out.
The "quirk" is that after the LUN reset and TEST UNIT READY commands are
completed, the kernel checks each command on the timed-out command list
before retrying it. If the result field is still "timed out", the kernel
treats that command as not having been successfully recovered for a
retry. If the number of commands that are in this state are greater than
two, the kernel takes the LUN offline.
Fix:
When our RAIDStack receives a LUN reset, it simply waits until all
outstanding commands complete. Generally, all of these outstanding commands
complete successfully. Therefore, the fix in the smartpqi driver is to
always set the command result field to indicate success when a request
completes successfully. This normally isn’t necessary because the result
field is always initialized to success when the command is submitted to the
driver. So when the command completes successfully, the result field is
left untouched. But in this case, the kernel changes the result field
behind the driver’s back and then expects the field to be changed by the
driver as the commands that timed-out complete.
Reviewed-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 23c3828aa2f84edec7020c7397a22931e7a879e1 ]
With commit 09c2f95ad404 ("scsi: mpt3sas: Swap I/O memory read value back
to cpu endianness"), 64bit writes in _base_writeq() were rewritten to use
__raw_writeq() instad of writeq().
This introduced a bug apparent on powerpc64 systems such as the Raptor
Talos II that causes the HBA to drop from the PCIe bus under heavy load and
being reinitialized after a couple of seconds.
It can easily be triggered on affacted systems by using something like
fio --name=random-write --iodepth=4 --rw=randwrite --bs=4k --direct=0 \
--size=128M --numjobs=64 --end_fsync=1
fio --name=random-write --iodepth=4 --rw=randwrite --bs=64k --direct=0 \
--size=128M --numjobs=64 --end_fsync=1
a couple of times. In my case I tested it on both a ZFS raidz2 and a btrfs
raid6 using LSI 9300-8i and 9400-8i controllers.
The fix consists in resembling the write ordering of writeq() by adding a
mandatory write memory barrier before device access and a compiler barrier
afterwards. The additional MMIO barrier is superfluous.
Signed-off-by: Stephan Günther <moepi@moepi.net>
Reported-by: Matt Corallo <linux@bluematt.me>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d5632b11f0a17efa6356311e535ae135d178438d ]
The kernel panic was observed after switch side perturbation,
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff8132b5a0>] strcmp+0x20/0x40
PGD 0 Oops: 0000 [#1] SMP
CPU: 8 PID: 647 Comm: kworker/8:1 Tainted: G W OE ------------ 3.10.0-693.el7.x86_64 #1
Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10, BIOS U30 06/20/2018
Workqueue: slowpath-13:00. qed_slowpath_task [qed]
task: ffff880429eb8fd0 ti: ffff880429190000 task.ti: ffff880429190000
RIP: 0010:[<ffffffff8132b5a0>] [<ffffffff8132b5a0>] strcmp+0x20/0x40
RSP: 0018:ffff880429193c68 EFLAGS: 00010202
RAX: 000000000000000a RBX: 0000000000000002 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88042bda7a41
RBP: ffff880429193c68 R08: 000000000000ffff R09: 000000000000ffff
R10: 0000000000000007 R11: ffff88042b3af338 R12: ffff880420b007a0
R13: ffff88081aa56af8 R14: 0000000000000001 R15: ffff88081aa50410
FS: 0000000000000000(0000) GS:ffff88042fe00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000019f2000 CR4: 00000000003407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Stack:
ffff880429193d20 ffffffffc02a0c90 ffffc90004b32000 ffff8803fd3ec600
ffff88042bda7800 ffff88042bda7a00 ffff88042bda7840 ffff88042bda7a40
0000000129193d10 2e3836312e323931 ff000a342e363232 ffffffffc01ad99d
Call Trace:
[<ffffffffc02a0c90>] qedi_get_protocol_tlv_data+0x270/0x470 [qedi]
[<ffffffffc01ad99d>] ? qed_mfw_process_tlv_req+0x24d/0xbf0 [qed]
[<ffffffffc01653ae>] qed_mfw_fill_tlv_data+0x5e/0xd0 [qed]
[<ffffffffc01ad9b9>] qed_mfw_process_tlv_req+0x269/0xbf0 [qed]
Fix kernel NULL pointer deref by checking for session is online before
getting iSCSI TLV data.
Signed-off-by: Manish Rangankar <manish.rangankar@cavium.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 44759979a49bfd2d20d789add7fa81a21eb1a4ab upstream.
Changing of caching mode via /sys/devices/.../scsi_disk/.../cache_type may
fail if device responds to MODE SENSE command with DPOFUA flag set, and
then checks this flag to be not set on MODE SELECT command.
In this scenario, when trying to change cache_type, write always fails:
# echo "none" >cache_type
bash: echo: write error: Invalid argument
And following appears in dmesg:
[13007.865745] sd 1:0:1:0: [sda] Sense Key : Illegal Request [current]
[13007.865753] sd 1:0:1:0: [sda] Add. Sense: Invalid field in parameter list
From SBC-4 r15, 6.5.1 "Mode pages overview", description of DEVICE-SPECIFIC
PARAMETER field in the mode parameter header:
...
The write protect (WP) bit for mode data sent with a MODE SELECT
command shall be ignored by the device server.
...
The DPOFUA bit is reserved for mode data sent with a MODE SELECT
command.
...
The remaining bits in the DEVICE-SPECIFIC PARAMETER byte are also reserved
and shall be set to zero.
[mkp: shuffled commentary to commit description]
Cc: stable@vger.kernel.org
Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3f7e62bba0003f9c68f599f5997c4647ef5b4f4e upstream.
The commit 356fd2663cff ("scsi: Set request queue runtime PM status back to
active on resume") fixed up the inconsistent RPM status between request
queue and device. However changing request queue RPM status shall be done
only on successful resume, otherwise status may be still inconsistent as
below,
Request queue: RPM_ACTIVE
Device: RPM_SUSPENDED
This ends up soft lockup because requests can be submitted to underlying
devices but those devices and their required resource are not resumed.
For example,
After above inconsistent status happens, IO request can be submitted to UFS
device driver but required resource (like clock) is not resumed yet thus
lead to warning as below call stack,
WARN_ON(hba->clk_gating.state != CLKS_ON);
ufshcd_queuecommand
scsi_dispatch_cmd
scsi_request_fn
__blk_run_queue
cfq_insert_request
__elv_add_request
blk_flush_plug_list
blk_finish_plug
jbd2_journal_commit_transaction
kjournald2
We may see all behind IO requests hang because of no response from storage
host or device and then soft lockup happens in system. In the end, system
may crash in many ways.
Fixes: 356fd2663cff (scsi: Set request queue runtime PM status back to active on resume)
Cc: stable@vger.kernel.org
Signed-off-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
invalid
commit 4e87eb2f46ea547d12a276b2e696ab934d16cfb6 upstream.
Certain older adapters such as the OneConnect OCe10100 may not have a valid
wqpcnt value. In this case, do not set queue->page_count to 0 in
lpfc_sli4_queue_alloc() as this will prevent the driver from initializing.
Fixes: 895427bd01 ("scsi: lpfc: NVME Initiator: Base modifications")
Cc: stable@vger.kernel.org # 4.11+
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 9ae4f8420ed7be4b13c96600e3568c144d101a23 ]
If "interface" is NULL then we can't release it and trying to will only
lead to an Oops.
Fixes: aea71a024914 ("[SCSI] bnx2fc: Introduce interface structure for each vlan interface")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c64a87f9518409d0a439895f09f6149ffdd427b8 ]
This reverts commit db186382af21e926e90df19499475f2552192b77.
This commit introduced regression with FCP discovery so revert it to fix
discovery for FCP luns.
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 61cce6f6eeced5ddd9cac55e807fe28b4f18c1ba upstream.
When boxes are run near (or to) OOM, we have a problem with the discard
page allocation in sd. If we fail allocating the special page, we return
busy, and it'll get retried. But since ordering is honored for dispatch
requests, we can keep retrying this same IO and failing. Behind that IO
could be requests that want to free memory, but they never get the
chance. This means you get repeated spews of traces like this:
[1201401.625972] Call Trace:
[1201401.631748] dump_stack+0x4d/0x65
[1201401.639445] warn_alloc+0xec/0x190
[1201401.647335] __alloc_pages_slowpath+0xe84/0xf30
[1201401.657722] ? get_page_from_freelist+0x11b/0xb10
[1201401.668475] ? __alloc_pages_slowpath+0x2e/0xf30
[1201401.679054] __alloc_pages_nodemask+0x1f9/0x210
[1201401.689424] alloc_pages_current+0x8c/0x110
[1201401.699025] sd_setup_write_same16_cmnd+0x51/0x150
[1201401.709987] sd_init_command+0x49c/0xb70
[1201401.719029] scsi_setup_cmnd+0x9c/0x160
[1201401.727877] scsi_queue_rq+0x4d9/0x610
[1201401.736535] blk_mq_dispatch_rq_list+0x19a/0x360
[1201401.747113] blk_mq_sched_dispatch_requests+0xff/0x190
[1201401.758844] __blk_mq_run_hw_queue+0x95/0xa0
[1201401.768653] blk_mq_run_work_fn+0x2c/0x30
[1201401.777886] process_one_work+0x14b/0x400
[1201401.787119] worker_thread+0x4b/0x470
[1201401.795586] kthread+0x110/0x150
[1201401.803089] ? rescuer_thread+0x320/0x320
[1201401.812322] ? kthread_park+0x90/0x90
[1201401.820787] ? do_syscall_64+0x53/0x150
[1201401.829635] ret_from_fork+0x29/0x40
Ensure that the discard page allocation has a mempool backing, so we
know we can make progress.
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
unload
[ Upstream commit 02f425f811cefcc4d325d7a72272651e622dc97e ]
Currently pvscsi_remove calls free_irq more than once as
pvscsi_release_resources and __pvscsi_shutdown both call
pvscsi_shutdown_intr. This results in a 'Trying to free already-free IRQ'
warning and stack trace. To solve the problem pvscsi_shutdown_intr has been
moved out of pvscsi_release_resources.
Signed-off-by: Cathy Avery <cavery@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5db6dd14b31397e8cccaaddab2ff44ebec1acf25 ]
This commit addresses NULL pointer dereference in iscsi_eh_session_reset.
Reference should not be made to session->leadconn when session->state is
set to ISCSI_STATE_TERMINATE.
Signed-off-by: Fred Herard <fred.herard@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Commit 0eeec01488da9b1403c8c29e73eacac8af9e4bf2 upstream.
I ran into a new warning on randconfig kernels:
drivers/scsi/raid_class.c: In function 'raid_match':
drivers/scsi/raid_class.c:64:24: error: unused variable 'i' [-Werror=unused-variable]
This looks like a very old problem that for some reason was very hard to
run into, but it is very easy to fix, by replacing the incorrect #ifdef
with a simpler IS_ENABLED() check.
Fixes: fac829fdcaf4 ("[SCSI] raid_attrs: fix dependency problems")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8e4829c6f7470722c1f5a1dc5769ebe09ef036d6 ]
Hynix ufs has deviations on hi36xx platform which will result in ufs bursts
transfer failures.
To fix the problem, the Hynix device must set the register
VS_DebugSaveConfigTime to 0x10, which will set time reference for
SaveConfigTime is 250 ns. The time reference for SaveConfigTime is 40 ns by
default.
This patch is necessary to boot on HiKey960 boards that use Hynix UFS chips
(H28U62301AMR model: hB8aL1).
Cc: Vinayak Holikatti <vinholikatti@gmail.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Wei Li <liwei213@huawei.com>
Signed-off-by: Dmitry Shmidt <dimitrysh@google.com>
[jstultz: Forward ported from older code, slight tweak to commit message]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit dfb7513374c1f8e7cd595106fbdba3fd07ebaf30 upstream.
Since f44ac12f1dcc, BG enablement is tracked with the LPFC_SLI3_BG_ENABLED
bit, which is set in lpfc_get_cfgparam before lpfc_sli_config_sli_port() is
called. The bit shouldn't be cleared before checking the feature. Based on
problem analysis by David Bond.
Fixes: f44ac12f1dcc "scsi: lpfc: Memory allocation error during driver start-up on power8"
Tested-by: David Bond <dbond@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # 4.17.x
Cc: stable@vger.kernel.org # 4.18.x
Cc: stable@vger.kernel.org # 4.19.x
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e34ff8edcae89922d187425ab0b82e6a039aa371 ]
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c: In function 'start_delivery_v1_hw':
drivers/scsi/hisi_sas/hisi_sas_v1_hw.c:907:20: warning:
variable 'dq_list' set but not used [-Wunused-but-set-variable]
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c: In function 'start_delivery_v2_hw':
drivers/scsi/hisi_sas/hisi_sas_v2_hw.c:1671:20: warning:
variable 'dq_list' set but not used [-Wunused-but-set-variable]
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c: In function 'start_delivery_v3_hw':
drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:889:20: warning:
variable 'dq_list' set but not used [-Wunused-but-set-variable]
It never used since introduction in commit
fa222db0b036 ("scsi: hisi_sas: Don't lock DQ for complete task sending")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f8d294324598ec85bea2779512e48c94cbe4d7c6 ]
The addition of a spinlock in lpfc_debugfs_nodelist_data() introduced
a bug that lets us not skip NULL pointers correctly, as noticed by
gcc-8:
drivers/scsi/lpfc/lpfc_debugfs.c: In function 'lpfc_debugfs_nodelist_data.constprop':
drivers/scsi/lpfc/lpfc_debugfs.c:728:13: error: 'nrport' may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (nrport->port_role & FC_PORT_ROLE_NVME_INITIATOR)
This changes the logic back to what it was, while keeping the added
spinlock.
Fixes: 9e210178267b ("scsi: lpfc: Synchronize access to remoteport via rport")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 8dc765d438f1e42b3e8227b3b09fad7d73f4ec9a upstream.
c2856ae2f315d ("blk-mq: quiesce queue before freeing queue") has
already fixed this race, however the implied synchronize_rcu()
in blk_mq_quiesce_queue() can slow down LUN probe a lot, so caused
performance regression.
Then 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
tried to quiesce queue for avoiding unnecessary synchronize_rcu()
only when queue initialization is done, because it is usual to see
lots of inexistent LUNs which need to be probed.
However, turns out it isn't safe to quiesce queue only when queue
initialization is done. Because when one SCSI command is completed,
the user of sending command can be waken up immediately, then the
scsi device may be removed, meantime the run queue in scsi_end_request()
is still in-progress, so kernel panic can be caused.
In Red Hat QE lab, there are several reports about this kind of kernel
panic triggered during kernel booting.
This patch tries to address the issue by grabing one queue usage
counter during freeing one request and the following run queue.
Fixes: 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
Cc: Andrew Jones <drjones@redhat.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: linux-scsi@vger.kernel.org
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: stable <stable@vger.kernel.org>
Cc: jianchao.wang <jianchao.w.wang@oracle.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f635e48e866ee1a47d2d42ce012fdcc07bf55853 upstream.
This patch initializes port speed so that firmware does not set lower
operating speed. Setting lower speed in firmware impacts WRITE perfomance.
Fixes: 726b85487067 ("qla2xxx: Add framework for async fabric discovery")
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 39553065f77c297239308470ee313841f4e07db4 upstream.
This patch fixes multiple call for qla_nvme_unregister_remote_port() as part
of qlt_schedule_session_for_deletion(), Do not call it again during
qla_nvme_delete()
Fixes: e473b3074104 ("scsi: qla2xxx: Add FC-NVMe abort processing")
Cc: <stable@vger.kernel.org>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 732ee9a912cf2d9a50c5f9c4213cdc2f885d6aa6 upstream.
The response data buffer used in switch scan is reused 4 times. (For example,
for commands GPN_FT, GNN_FT for FCP and FC-NVME) Before driver reuses this
buffer, clear it to prevent duplicate entries in our database.
Fixes: a4239945b8ad1 ("scsi: qla2xxx: Add switch command to simplify fabric discovery"
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|