summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)AuthorFilesLines
2021-01-08scsi: storvsc: Validate length of incoming packet in ↵Andrea Parri (Microsoft)1-0/+6
storvsc_on_channel_callback() Check that the packet is of the expected size at least, don't copy data past the packet. Link: https://lore.kernel.org/r/20201217203321.4539-4-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Reported-by: Saruhan Karademir <skarade@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: storvsc: Resolve data race in storvsc_probe()Andrea Parri (Microsoft)1-20/+25
vmscsi_size_delta can be written concurrently by multiple instances of storvsc_probe(), corresponding to multiple synthetic IDE/SCSI devices; cf. storvsc_drv's probe_type == PROBE_PREFER_ASYNCHRONOUS. Change the global variable vmscsi_size_delta to per-synthetic-IDE/SCSI-device. Link: https://lore.kernel.org/r/20201217203321.4539-3-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Suggested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: storvsc: Fix max_outstanding_req_per_channel for Win8 and newerAndrea Parri (Microsoft)1-2/+5
Current code overestimates the value of max_outstanding_req_per_channel for Win8 and newer hosts, since vmscsi_size_delta is set to the initial value of sizeof(vmscsi_win8_extension) rather than zero. This may lead to wrong decisions when using ring_avail_percent_lowater equals to zero. The estimate of max_outstanding_req_per_channel is 'exact' for Win7 and older hosts. A better choice, keeping the algorithm for the estimation simple, is to err the other way around, i.e., to underestimate for Win7 and older but to use the exact value for Win8 and newer. Link: https://lore.kernel.org/r/20201217203321.4539-2-parri.andrea@gmail.com Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Suggested-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Update lpfc version to 12.8.0.7James Smart1-1/+1
Update lpfc version to 12.8.0.7 Link: https://lore.kernel.org/r/20210104180240.46824-16-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Enhancements to LOG_TRACE_EVENT for better readabilityJames Smart2-2/+20
While testing recent discovery node rework, several items were seen that could be done better with respect to the new trace event logic. 1) in the following msg: kernel: lpfc 0000:44:00.0: start 35 end 35 cnt 0 If cnt is zero in the 1st message, there is no reason to display the 1st message, which is just giving start/end positioning. Fix by not displaying message if cnt is 0. 2) If the driver is loaded with module log verbosity off, and later a single NPIV host instance verbosity is enabled via sysfs, it enables messages on all instances. This is due to the trace log verbosity checks (lpfc_dmp_dbg) looking at the phba only. It should look at the phba and the vport. Fix by enabling a check on both phba and vport. 3) in the following messages: 2904 Firmware Dump Image Present on Adapter 2887 Reset Needed: Attempting Port Recovery... These messages are not necessary for the trace event log, which is primarily for discovery. Fix by changing log level on these 2 messages to LOG_SLI. Link: https://lore.kernel.org/r/20210104180240.46824-15-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Implement health checking when aborting I/OJames Smart10-75/+178
Several errors have occurred where the adapter stops or fails but does not raise the register values for the driver to detect failure. Thus driver is unaware of the failure. The failure typically results in I/O timeouts, the I/O timeout handler failing (after several seconds), and the error handler escalating recovery policy and resulting in more errors. Eventually, the driver is in a position where things have spiraled and it can't do recovery because other recovery ops are still outstanding and it becomes unusable. Resolve the situation by having the I/O timeout handler (actually a els, SCSI I/O, NVMe ls, or NVMe I/O timeout), in addition to aborting the I/O, perform a mailbox command and look for a response from the hardware. If the mailbox command fails, it will mark the adapter offline and then invoke the adapter reset handler to clean up. The new I/O timeout test will be limited to a test every 5s. If there are multiple I/O timeouts concurrently, only the 1st I/O timeout will generate the mailbox command. Further testing will only occur once a timeout occurs after a 5s delay from the last mailbox command has expired. Link: https://lore.kernel.org/r/20210104180240.46824-14-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix crash when nvmet transport calls host_releaseJames Smart3-15/+63
When lpfc is running in NVMET mode and supports the NVME-1 addendum changes, a LIP on a bound NVME Initiator or lipping the lpfc NVMET's link resulted in an Oops in lpfc_nvmet_host_release. The fix requires lpfc NVMET to maintain an additional reference on any node structure that acts as the hosthandle for the NVMET transport. This reference get is a one-time addition, is taken prior to the upcall of an unsolicited LS_REQ, and is released when the NVMET transport releases the hosthandle during the host_release downcall. Link: https://lore.kernel.org/r/20210104180240.46824-13-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix vport create loggingJames Smart1-1/+1
When with testing with large numbers of npiv vports and link bounces, the driver is flooding the messages file, even with log_verbose = 0. The new LOG_TRACE_EVENT messages are still generating events to the messages files. Fix by converting the vport create msg from LOG_TRACE_EVENT to LOG_VPORT. Link: https://lore.kernel.org/r/20210104180240.46824-12-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix NVMe recovery after mailbox timeoutJames Smart4-28/+37
If a mailbox command times out, the SLI port is deemed in error and the port is reset. The HBA cleanup is not returning I/Os to the NVMe layer before the port is unregistered. This is due to the HBA being marked offline (!SLI_ACTIVE) and cleanup being done by the mailbox timeout handler rather than an general adapter reset routine. The mailbox timeout handler mailbox handler only cleaned up SCSI I/Os. Fix by reworking the mailbox handler to: - After handling the mailbox error, detect the board is already in failure (may be due to another error), and leave cleanup to the other handler. - If the mailbox command timeout is initial detector of the port error, continue with the board cleanup and marking the adapter offline (!SLI_ACTIVE). Remove the SCSI-only I/O cleanup routine. The generic reset adapter routine that is subsequently invoked, will clean up the I/Os. - Have the reset adapter routine flush all NVMe and SCSI I/Os if the adapter has been marked failed (!SLI_ACTIVE). - Rework the NVMe I/O terminate routine to take a status code to fail the I/O with and update so that cleaned up I/O calls the wqe completion routine. Currently it is bypassing the wqe cleanup and calling the NVMe I/O completion directly. The wqe completion routine will take care of data structure and node cleanup then call the NVMe I/O completion handler. Link: https://lore.kernel.org/r/20210104180240.46824-11-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix target reset failingJames Smart3-2/+46
Target reset is failed by the target as an invalid command. The Target Reset TMF has been obsoleted in T10 for a while, but continues to be used. On (newer) devices, the TMF is rejected causing the reset handler to escalate to adapter resets. Fix by having Target Reset TMF rejections be translated into a LOGO and re-PLOGI with the target device. This provides the same semantic action (although, if the device also supports nvme traffic, it will terminate nvme traffic as well - but it's still recoverable). Link: https://lore.kernel.org/r/20210104180240.46824-10-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix error log messages being logged following SCSI task mgntJames Smart1-4/+13
A successful task mgmt command is logging errors, making it look like problems were encountered. This is due to log messages for the device/target and bus reset handlers having the LOG_TRACE_EVENT flag set. Fix by adjusting the event flag such that the call to the logging routine only receives a LOG_TRACE_EVENT if a prior call actually failed. Link: https://lore.kernel.org/r/20210104180240.46824-9-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Prevent duplicate requests to unregister with cpuhp frameworkJames Smart1-1/+5
In the lpfc offline routine, called for various reasons such as sysfs attribute, driver unload, or port error, the driver is calling __lpfc_cpuhp_remove() to destroy the hot plug data. If the offline routine is called while the driver is in the process of being unloaded, a request using lpfc_cpuhp_remove() is also made from lpfc_sli4_hba_unset(). The cpuhp elements are no longer valid when the second removal request is made. Fix by only calling the cpuhp removal once when the adapter is in the process of unloading. Link: https://lore.kernel.org/r/20210104180240.46824-8-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix FW reset action if I/Os are outstandingJames Smart1-0/+10
If the port is configured for NVME and has any outstanding IOs when a FW reset is requesteed, outstanding I/Os are not properly cleaned up. This causes the fw download request to fail. Fix by clearing the LPFC_SLI_ACTIVE flag to signify the I/O must be manually flushed by the driver on port reset. Link: https://lore.kernel.org/r/20210104180240.46824-7-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Use the nvme-fc transport supplied timeout for LS requestsJames Smart1-2/+2
When lpfc generates a GEN_REQUEST wqe for the nvme LS (such as Create Association), the timeout is set to R_A_TOV without regard to the timeout value supplied by the nvme-fc transport. The driver should be setting the timeout to the value passed into the routine. Additionally the caller should be setting the timeout value to the value in the ls request set by the nvme transport. Instead, it unconditionally is setting it to a driver defined value. So the driver actually overrode the value twice. Fix by using the timeout provided to the routine, and for the caller, set the timeout to the ls request timeout value. Link: https://lore.kernel.org/r/20210104180240.46824-6-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix crash when a fabric node is released prematurelyJames Smart2-6/+20
The driver's management of the fabric controller (aka pseudo-scsi initiator) node in SLI3 mode is causing this crash. The crash occurs because of a node reference imbalance that frees the fabric controller node while devloss is outstanding from the SCSI transport. This is triggered by an odd behavior where the switch reacts to a rejected RDP request with a PLOGI and nothing else, not even a LOGO. The driver ACKS the PLOGI and after successfully registering the RPI, incorrectly registers the fabric controller node because it has the NLP_FC4_FCP flag still set from the fabric controller PRLI. If a LIP is issued, the driver attempts to cleanup on Link Up and ends up executing too many puts. Fix by detecting the fabric node type and clearing out the nodes internal flags that triggered a SCSI transport registration and subsequence dev_loss event. The driver cannot count on any persistence from fabric controller nodes. Link: https://lore.kernel.org/r/20210104180240.46824-5-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Refresh ndlp when a new PRLI is received in the PRLI issue stateJames Smart1-0/+11
Testing with target ports coming and going, the driver eventually reached a state where it no longer discovered the target. When the driver has issued a PRLI and receives a PRLI from the target, it is not properly updating the node's initiator/target role flags. Thus, when a subsequent RSCN is received for a target loss, the driver mis-identifies the target as an initiator and does not initiate LUN scanning. Fix by always refreshing the ndlp with the latest PRLI state information whenever a PRLI is processed. Also clear the ndlp flags when processing a PLOGI so that there is no carry over through a re-login. Link: https://lore.kernel.org/r/20210104180240.46824-4-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix auto sli_mode and its effect on CONFIG_PORT for SLI3James Smart4-48/+26
A very long time ago, there was a feature: auto sli mode. It gave the user the ability to auto select the SLI mode (SLI2 or SLI3) to run the port in, or even force SLI2 mode if configured. Because of the convoluted logic, the CONFIG_PORT mbox command ends up being called 2 or 3 times. It should have been called only once. Additionally, the driver no longer supports SLI-2, so only SLI-3 mode should be allowed. The following changes were made: - Force module parameter to SLI3 only. - Rip out redundant CONFIG_PORT mbox commands. - Force CONFIG_PORT mbox command to be in beginning of enable ISR routine. - Added changes for offline to online behavior Link: https://lore.kernel.org/r/20210104180240.46824-3-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: lpfc: Fix PLOGI S_ID of 0 on pt2pt configJames Smart1-26/+5
Under some pt2pt situations, the other end of the link may issue a LOGO after successfully completing PLOGI and assigning addresses to the port. Thus the driver may attempt a new PLOGI to re-create the login, but the LOGO handling cleared the address back to 0. Once this happens, the other end, which may be address 0, gets all confused and this cannot be resolved without an administrative action to bounce the link. Fix by assuming that address assignment only occurs on the 1st PLOGI after link up, and regardless of login state, the address assignment sticks. The FC standards aren't particularly clear in this situation (it only describes initial PLOGI), but there is nothing that contradicts this and behaviors on the devices tested appears to conform to the understanding. Thus, don't reset the port address to 0 as part of LOGO handling. Port addresses will only reset on link down. Link: https://lore.kernel.org/r/20210104180240.46824-2-jsmart2021@gmail.com Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: hisi_sas: Remove auto_affine_msi_experimental module_paramJohn Garry1-5/+0
Now that the driver always uses managed interrupts, delete auto_affine_msi_experimental module param. Link: https://lore.kernel.org/r/1609763622-34119-2-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Fix tm request when non-fatal error happensJaegeuk Kim1-5/+13
When non-fatal error like line-reset happens, ufshcd_err_handler() starts to abort tasks by ufshcd_try_to_abort_task(). When it tries to issue a task management request, we hit two warnings: WARNING: CPU: 7 PID: 7 at block/blk-core.c:630 blk_get_request+0x68/0x70 WARNING: CPU: 4 PID: 157 at block/blk-mq-tag.c:82 blk_mq_get_tag+0x438/0x46c After fixing the above warnings we hit another tm_cmd timeout which may be caused by unstable controller state: __ufshcd_issue_tm_cmd: task management cmd 0x80 timed-out Then, ufshcd_err_handler() enters full reset, and kernel gets stuck. It turned out ufshcd_print_trs() printed too many messages on console which requires CPU locks. Likewise hba->silence_err_logs, we need to avoid too verbose messages. This is actually not an error case. Link: https://lore.kernel.org/r/20210107185316.788815-3-jaegeuk@kernel.org Fixes: 69a6c269c097 ("scsi: ufs: Use blk_{get,put}_request() to allocate and free TMFs") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Fix livelock of ufshcd_clear_ua_wluns()Jaegeuk Kim1-7/+12
When gate_work/ungate_work experience an error during hibern8_enter or exit we can livelock: ufshcd_err_handler() ufshcd_scsi_block_requests() ufshcd_reset_and_restore() ufshcd_clear_ua_wluns() -> stuck ufshcd_scsi_unblock_requests() In order to avoid this, ufshcd_clear_ua_wluns() can be called per recovery flows such as suspend/resume, link_recovery, and error_handler. Link: https://lore.kernel.org/r/20210107185316.788815-2-jaegeuk@kernel.org Fixes: 1918651f2d7e ("scsi: ufs: Clear UAC for RPMB after ufshcd resets") Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Replace sprintf and snprintf with sysfs_emitBean Huo1-15/+15
sprintf and snprintf may cause output defect in sysfs content, it is better to use new added sysfs_emit function which knows the size of the temporary buffer. Link: https://lore.kernel.org/r/20210106211541.23039-1-huobean@gmail.com Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Fix all Kconfig help text indentationRandy Dunlap1-7/+7
Use consistent and expected indentation for all Kconfig text. Link: https://lore.kernel.org/r/20210106205554.18082-1-rdunlap@infradead.org Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Avri Altman <avri.altman@wdc.com> Cc: linux-scsi@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Relax locking around ibmvfc_queuecommand()Tyrel Datwyler1-8/+4
The driver's queuecommand routine is still wrapped to hold the host lock for the duration of the call. This will become problematic when moving to multiple queues due to the lock contention preventing asynchronous submissions to mulitple queues. There is no real legitimate reason to hold the host lock, and previous patches have insured proper protection of moving ibmvfc_event objects between free and sent lists. Link: https://lore.kernel.org/r/20210106201835.1053593-6-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Complete commands outside the host/queue lockTyrel Datwyler2-14/+47
Drain the command queue and place all commands on a completion list. Perform command completion on that list outside the host/queue locks. Further, move purged command compeletions outside the host_lock as well. Link: https://lore.kernel.org/r/20210106201835.1053593-5-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Define per-queue state/list locksTyrel Datwyler2-20/+80
Define per-queue locks for protecting queue state and event pool sent/free lists. The evt list lock is initially redundant but it allows the driver to be modified in the follow-up patches to relax the queue locking around submissions and completions. Link: https://lore.kernel.org/r/20210106201835.1053593-4-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Make command event pool queue specificTyrel Datwyler2-50/+55
There is currently a single command event pool per host. In anticipation of providing multiple queues add a per-queue event pool definition and reimplement the existing CRQ to use its queue defined event pool for command submission and completion. Link: https://lore.kernel.org/r/20210106201835.1053593-3-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Define generic queue structure for CRQsTyrel Datwyler2-62/+107
The primary and async CRQs are nearly identical outside of the format and length of each message entry in the dma mapped page that represents the queue data. These queues can be represented with a generic queue structure that uses a union to differentiate between message format of the mapped page. This structure will further be leveraged in a followup patcheset that introduces Sub-CRQs. Link: https://lore.kernel.org/r/20210106201835.1053593-2-tyreld@linux.ibm.com Reviewed-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ibmvfc: Fix missing cast of ibmvfc_event pointer to u64 handleTyrel Datwyler1-2/+2
Commit 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands") sets the vfcFrame correlation token to the pointer handle of the associated ibmvfc_event. However, that commit failed to cast the pointer to an appropriate type which in this case is a u64. As such sparse warnings are generated for both correlation token assignments. ibmvfc.c:2375:36: sparse: incorrect type in argument 1 (different base types) ibmvfc.c:2375:36: sparse: expected unsigned long long [usertype] val ibmvfc.c:2375:36: sparse: got struct ibmvfc_event *[assigned] evt Add the appropriate u64 casts when assigning an ibmvfc_event as a correlation token. Link: https://lore.kernel.org/r/20210106203721.1054693-1-tyreld@linux.ibm.com Fixes: 2aa0102c6688 ("scsi: ibmvfc: Use correlation token to tag commands") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: ufshcd-pltfrm depends on HAS_IOMEMRandy Dunlap1-0/+1
Building ufshcd-pltfrm.c on arch/s390/ has a linker error since S390 does not support IOMEM, so add a dependency on HAS_IOMEM. s390-linux-ld: drivers/scsi/ufs/ufshcd-pltfrm.o: in function `ufshcd_pltfrm_init': ufshcd-pltfrm.c:(.text+0x38e): undefined reference to `devm_platform_ioremap_resource' where that devm_ function is inside an #ifdef CONFIG_HAS_IOMEM/#endif block. Link: lore.kernel.org/r/202101031125.ZEFCUiKi-lkp@intel.com Link: https://lore.kernel.org/r/20210106040822.933-1-rdunlap@infradead.org Fixes: 03b1781aa978 ("[SCSI] ufs: Add Platform glue driver for ufshcd") Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Avri Altman <avri.altman@wdc.com> Cc: linux-scsi@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Make UPIU trace easier differentiate among CDB, OSF, and TMBean Huo2-4/+12
Transaction Specific Fields (TSF) in the UPIU package could be CDB (SCSI/UFS Command Descriptor Block), OSF (Opcode Specific Field), and TM I/O parameter (Task Management Input/Output Parameter). But, currently, we take all of these as CDB in the UPIU trace. Thus makes user confuse among CDB, OSF, and TM message. So fix it with this patch. Link: https://lore.kernel.org/r/20210105113446.16027-7-huobean@gmail.com Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Distinguish between TM request UPIU and response UPIU in TM UPIU ↵Bean Huo1-2/+6
trace Distinguish between TM request UPIU and response UPIU in TM UPIU trace, for the TM response, let TM UPIU trace print its TM response UPIU. Link: https://lore.kernel.org/r/20210105113446.16027-6-huobean@gmail.com Acked-by: Avri Altman <avri.altman@wdc.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Distinguish between query REQ and query RSP in query traceBean Huo1-2/+8
Currently, in the query completion trace print, since we use hba->lrb[tag].ucd_req_ptr and didn't differentiate UPIU between request and response, thus header and transaction-specific field in UPIU printed by query trace are identical. This is not very practical. As below: query_send: HDR:16 00 00 0e 00 81 00 00 00 00 00 00, CDB:06 0e 03 00 00 00 00 00 00 00 00 00 00 00 00 00 query_complete: HDR:16 00 00 0e 00 81 00 00 00 00 00 00, CDB:06 0e 03 00 00 00 00 00 00 00 00 00 00 00 00 00 For the failure analysis, we want to understand the real response reported by the UFS device, however, the current query trace tells us nothing. After this patch, the query trace on the query_send, and the above a pair of query_send and query_complete will be: query_send: HDR:16 00 00 0e 00 81 00 00 00 00 00 00, CDB:06 0e 03 00 00 00 00 00 00 00 00 00 00 00 00 00 ufshcd_upiu: HDR:36 00 00 0e 00 81 00 00 00 00 00 00, CDB:06 0e 03 00 00 00 00 00 00 00 00 01 00 00 00 00 Link: https://lore.kernel.org/r/20210105113446.16027-5-huobean@gmail.com Acked-by: Avri Altman <avri.altman@wdc.com> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Don't call trace_ufshcd_upiu() in case trace poit is disabledBean Huo1-0/+9
Don't call trace_ufshcd_upiu() in case ufshba_upiu trace poit is not enabled. Link: https://lore.kernel.org/r/20210105113446.16027-4-huobean@gmail.com Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: ufs: Use __print_symbolic() for UFS trace string printBean Huo2-24/+34
__print_symbolic() is designed for exporting the print formatting table to userspace and allows parsing tool, such as trace-cmd and perf, to analyze trace log according to this print formatting table, meanwhile, by using __print_symbolic()s, save space in the trace ring buffer. original print format: print fmt: "%s: %s: HDR:%s, CDB:%s", __get_str(str), __get_str(dev_name), __print_hex(REC->hdr, sizeof(REC->hdr)), __print_hex(REC->tsf, sizeof(REC->tsf)) after this change: print fmt: "%s: %s: HDR:%s, CDB:%s", print_symbolic(REC->str_t, {0, "send"}, {1, "complete"}, {2, "dev_complete"}, {3, "query_send"}, {4, "query_complete"}, {5, "query_complete_err"}, {6, "tm_send"}, {7, "tm_complete"}, {8, "tm_complete_err"}), __get_str(dev_name), __print_hex(REC->hdr, sizeof(REC->hdr)), __print_hex(REC->tsf, sizeof(REC->tsf)) Note: This patch just converts current __get_str(str) to __print_symbolic(), the original tracing log will not be affected by this change, so it doesn't break what current parsers expect. Link: https://lore.kernel.org/r/20210105113446.16027-3-huobean@gmail.com Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-08scsi: megaraid_sas: Fix MEGASAS_IOC_FIRMWARE regressionArnd Bergmann1-4/+2
Phil Oester reported that a fix for a possible buffer overrun that I sent caused a regression that manifests in this output: Event Message: A PCI parity error was detected on a component at bus 0 device 5 function 0. Severity: Critical Message ID: PCI1308 The original code tried to handle the sense data pointer differently when using 32-bit 64-bit DMA addressing, which would lead to a 32-bit dma_addr_t value of 0x11223344 to get stored 32-bit kernel: 44 33 22 11 ?? ?? ?? ?? 64-bit LE kernel: 44 33 22 11 00 00 00 00 64-bit BE kernel: 00 00 00 00 44 33 22 11 or a 64-bit dma_addr_t value of 0x1122334455667788 to get stored as 32-bit kernel: 88 77 66 55 ?? ?? ?? ?? 64-bit kernel: 88 77 66 55 44 33 22 11 In my patch, I tried to ensure that the same value is used on both 32-bit and 64-bit kernels, and picked what seemed to be the most sensible combination, storing 32-bit addresses in the first four bytes (as 32-bit kernels already did), and 64-bit addresses in eight consecutive bytes (as 64-bit kernels already did), but evidently this was incorrect. Always storing the dma_addr_t pointer as 64-bit little-endian, i.e. initializing the second four bytes to zero in case of 32-bit addressing, apparently solved the problem for Phil, and is consistent with what all 64-bit little-endian machines did before. I also checked in the history that in previous versions of the code, the pointer was always in the first four bytes without padding, and that previous attempts to fix 64-bit user space, big-endian architectures and 64-bit DMA were clearly flawed and seem to have introduced made this worse. Link: https://lore.kernel.org/r/20210104234137.438275-1-arnd@kernel.org Fixes: 381d34e376e3 ("scsi: megaraid_sas: Check user-provided offsets") Fixes: 107a60dd71b5 ("scsi: megaraid_sas: Add support for 64bit consistent DMA") Fixes: 94cd65ddf4d7 ("[SCSI] megaraid_sas: addded support for big endian architecture") Fixes: 7b2519afa1ab ("[SCSI] megaraid_sas: fix 64 bit sense pointer truncation") Reported-by: Phil Oester <kernel@linuxace.com> Tested-by: Phil Oester <kernel@linuxace.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: sd: Remove obsolete variable in sd_remove()Lukas Bulwahn1-2/+0
Commit 996e509bbc95 ("sd: use __register_blkdev to avoid a modprobe for an unregistered dev_t") removed blk_register_region(devt, ...) in sd_remove() and since then, devt is unused in sd_remove(). Hence, make W=1 warns: drivers/scsi/sd.c:3516:8: warning: variable 'devt' set but not used [-Wunused-but-set-variable] Simply remove this obsolete variable. [mkp: fixed commit sha] Link: https://lore.kernel.org/r/20201214095424.12479-1-lukas.bulwahn@gmail.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: sd: Suppress spurious errors when WRITE SAME is being disabledEwan D. Milne1-1/+3
The block layer code will split a large zeroout request into multiple bios and if WRITE SAME is disabled because the storage device reports that it does not support it (or support the length used), we can get an error message from the block layer despite the setting of RQF_QUIET on the first request. This is because more than one request may have already been submitted. Fix this by setting RQF_QUIET when BLK_STS_TARGET is returned to fail the request early, we don't need to log a message because we did not actually submit the command to the device, and the block layer code will handle the error by submitting individual write bios. Link: https://lore.kernel.org/r/20201207221021.28243-1-emilne@redhat.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: scsi_debug: Fix memleak in scsi_debug_init()Dinghao Liu1-2/+3
When sdeb_zbc_model does not match BLK_ZONED_NONE, BLK_ZONED_HA or BLK_ZONED_HM, we should free sdebug_q_arr to prevent memleak. Also there is no need to execute sdebug_erase_store() on failure of sdeb_zbc_model_str(). Link: https://lore.kernel.org/r/20201226061503.20050-1-dinghao.liu@zju.edu.cn Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility"Colin Ian King1-1/+1
There is a spelling mistake in the Kconfig help text. Fix it. Link: https://lore.kernel.org/r/20201217172019.57768-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: qedi: Correct max length of CHAP secretNilesh Javali1-2/+2
The CHAP secret displayed garbage characters causing iSCSI login authentication failure. Correct the CHAP password max length. Link: https://lore.kernel.org/r/20201217105144.8055-1-njavali@marvell.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: Correct the LUN used in eh_device_reset_handler() callbackCan Guo1-7/+4
Users can initiate resets to specific SCSI device/target/host through IOCTL. When this happens, the SCSI cmd passed to eh_device/target/host _reset_handler() callbacks is initialized with a request whose tag is -1. In this case it is not right for eh_device_reset_handler() callback to count on the LUN get from hba->lrb[-1]. Fix it by getting LUN from the SCSI device associated with the SCSI cmd. Link: https://lore.kernel.org/r/1609157080-26283-1-git-send-email-cang@codeaurora.org Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: ufs-exynos: Apply vendor-specific values for three timeoutsKiwoong Kim1-1/+7
Set optimized values for the following timeouts: - FC0_PROTECTION_TIMER - TC0_REPLAY_TIMER - AFC0_REQUEST_TIMER Exynos doesn't yet use traffic class #1. Link: https://lore.kernel.org/r/a0ff44f665a4f31d2f945fd71de03571204c576c.1608513782.git.kwmad.kim@samsung.com Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: Add a quirk to permit overriding UniPro defaultsKiwoong Kim2-19/+27
The UniPro specification states that attribute IDs of the following parameters are vendor-specific so some SoCs could have no regions at the defined addresses: - DME_LocalFC0ProtectionTimeOutVal - DME_LocalTC0ReplayTimeOutVal - DME_LocalAFC0ReqTimeOutVal In addition, the following parameters should be set considering the compatibility between host and device. - PA_PWRMODEUSERDATA0 - PA_PWRMODEUSERDATA1 - PA_PWRMODEUSERDATA2 - PA_PWRMODEUSERDATA3 - PA_PWRMODEUSERDATA4 - PA_PWRMODEUSERDATA5 Introduce a quirk to allow vendor drivers to override the UniPro defaults. Link: https://lore.kernel.org/r/1fedd3dea0ccc980913a5995a10510d86a5b01b9.1608513782.git.kwmad.kim@samsung.com Acked-by: Avri Altman <Avri.Altman@wdc.com> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: Relocate flush of exceptional eventKiwoong Kim1-2/+2
The current flush location does not guarantee disabling BKOPS for the case of requesting device power off. 1) The exceptional event handler is queued 2) ufs suspend starts with a request of device power off 3) BKOPS is disabled in ufs suspend 4) The queued work for the handler is done and BKOPS is re-enabled Relocate the flush statement to ensure BKOPS remain disabled. Link: https://lore.kernel.org/r/1608360039-16390-1-git-send-email-kwmad.kim@samsung.com Reviewed-by: Can Guo <cang@codeaurora.org> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs-mediatek: Enable UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRLStanley Chu1-0/+1
Flush during hibern8 is sufficient on MediaTek platforms, thus enable UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL to skip enabling fWriteBoosterBufferFlush during WriteBooster initialization. Link: https://lore.kernel.org/r/20201222072928.32328-1-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: Relax the condition of UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRLStanley Chu1-4/+2
UFSHCI_QUIRK_SKIP_MANUAL_WB_FLUSH_CTRL is intended to skip enabling fWriteBoosterBufferFlushEn while WriteBooster is initializing. Therefore it is better to apply the checking during WriteBooster initialization only. Link: https://lore.kernel.org/r/20201222072905.32221-3-stanley.chu@mediatek.com Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-06scsi: ufs: Fix possible power drain during system suspendStanley Chu1-1/+2
Currently if device needs to do flush or BKOP operations, the device VCC power is kept during runtime-suspend period. However, if system suspend is happening while device is runtime-suspended, such power may not be disabled successfully. The reasons may be, 1. If current PM level is the same as SPM level, device will keep runtime-suspended by ufshcd_system_suspend(). 2. Flush recheck work may not be scheduled successfully during system suspend period. If it can wake up the system, this is also not the intention of the recheck work. To fix this issue, simply runtime-resume the device if the flush is allowed during runtime suspend period. Flush capability will be disabled while leaving runtime suspend, and also not be allowed in system suspend period. Link: https://lore.kernel.org/r/20201222072905.32221-2-stanley.chu@mediatek.com Fixes: 51dd905bd2f6 ("scsi: ufs: Fix WriteBooster flush during runtime suspend") Reviewed-by: Chaotian Jing <chaotian.jing@mediatek.com> Reviewed-by: Can Guo <cang@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-04Merge branch '5.11/scsi-postmerge' into 5.11/scsi-fixesMartin K. Petersen5-26/+123
Merge two commits that had dependencies on other 5.11 trees (the block and the irq trees respectively). - We reverted a megaraid_sas change in 5.10 due to missing block layer plumbing. Now that this is in place, reinstate the change. - The hisi_sas driver had a dependency on a driver core irq change that went in through Thomas' tree. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-01Merge tag 'scsi-fixes' of ↵Linus Torvalds11-51/+164
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a load of driver fixes (12 ufs, 1 mpt3sas, 1 cxgbi). The big core two fixes are for power management ("block: Do not accept any requests while suspended" and "block: Fix a race in the runtime power management code") which finally sorts out the resume problems we've occasionally been having. To make the resume fix, there are seven necessary precursors which effectively renames REQ_PREEMPT to REQ_PM, so every "special" request in block is automatically a power management exempt one. All of the non-PM preempt cases are removed except for the one in the SCSI Parallel Interface (spi) domain validation which is a genuine case where we have to run requests at high priority to validate the bus so this becomes an autopm get/put protected request" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (22 commits) scsi: cxgb4i: Fix TLS dependency scsi: ufs: Un-inline ufshcd_vops_device_reset function scsi: ufs: Re-enable WriteBooster after device reset scsi: ufs-mediatek: Use correct path to fix compile error scsi: mpt3sas: Signedness bug in _base_get_diag_triggers() scsi: block: Do not accept any requests while suspended scsi: block: Remove RQF_PREEMPT and BLK_MQ_REQ_PREEMPT scsi: core: Only process PM requests if rpm_status != RPM_ACTIVE scsi: scsi_transport_spi: Set RQF_PM for domain validation commands scsi: ide: Mark power management requests with RQF_PM instead of RQF_PREEMPT scsi: ide: Do not set the RQF_PREEMPT flag for sense requests scsi: block: Introduce BLK_MQ_REQ_PM scsi: block: Fix a race in the runtime power management code scsi: ufs-pci: Enable UFSHCD_CAP_RPM_AUTOSUSPEND for Intel controllers scsi: ufs-pci: Fix recovery from hibernate exit errors for Intel controllers scsi: ufs-pci: Ensure UFS device is in PowerDown mode for suspend-to-disk ->poweroff() scsi: ufs-pci: Fix restore from S4 for Intel controllers scsi: ufs-mediatek: Keep VCC always-on for specific devices scsi: ufs: Allow regulators being always-on scsi: ufs: Clear UAC for RPMB after ufshcd resets ...