scsi: ufs: Fix concurrency of error handler and other error recovery paths

Error recovery can be invoked from multiple code paths, including hibern8 enter/exit (from ufshcd_link_recovery), ufshcd_eh_host_reset_handler() and eh_work scheduled from IRQ context. Ultimately, these paths are all trying to invoke ufshcd_reset_and_restore() in either a synchronous or asynchronous manner. This causes problems: - If link recovery happens during ungate work, ufshcd_hold() would be called recursively. Although commit 53c12d0ef6fc ("scsi: ufs: fix error recovery after the hibern8 exit failure") fixed a deadlock due to recursive calls of ufshcd_hold() by adding a check of eh_in_progress into ufshcd_hold, this check allows eh_work to run in parallel while link recovery is running. - Similar concurrency can also happen when error recovery is invoked from ufshcd_eh_host_reset_handler and ufshcd_link_recovery. - Concurrency can even happen between eh_works. eh_work, currently queued on system_wq, is allowed to have multiple instances running in parallel, but we don't have proper protection for that. If any of above concurrency scenarios happen, error recovery would fail and lead ufs device and host into bad states. To fix the concurrency problem, this change queues eh_work on a single threaded workqueue and removes link recovery calls from the hibern8 enter/exit path. In addition, make use of eh_work in eh_host_reset_handler instead of calling ufshcd_reset_and_restore. This unifies the UFS error recovery mechanism. According to the UFSHCI JEDEC spec, hibern8 enter/exit error occurs when the link is broken. This essentially applies to any power mode change operations (since they all use PACP_PWR cmds in UniPro layer). So, if a power mode change operation (including AH8 enter/exit) fails, mark link state as UIC_LINK_BROKEN_STATE and schedule the eh_work. In this case, error handler needs to do a full reset and restore to recover the link back to active. Before the link state is recovered to active, ufshcd_uic_pwr_ctrl simply returns -ENOLINK to avoid more errors. Link: https://lore.kernel.org/r/1596975355-39813-6-git-send-email-cang@codeaurora.org Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Can Guo <cang@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
author: Can Guo <cang@codeaurora.org> 2020-08-09 15:15:51 +0300
committer: Martin K. Petersen <martin.petersen@oracle.com> 2020-08-18 03:54:54 +0300
commit: 4db7a23605973ac4a2cb24e759bebaaa82054f21 (patch)
tree: 4784909ee4d155b85dbf72eb9d8f681e48b32ab2 /drivers/scsi/ufs/ufshcd.h
parent: 3f8af6044713fab0aeb884e1be6bcc2cb276725e (diff)
download: linux-4db7a23605973ac4a2cb24e759bebaaa82054f21.tar.xz
1 files changed, 9 insertions, 0 deletions
diff --git a/drivers/scsi/ufs/ufshcd.h b/drivers/scsi/ufs/ufshcd.h
index b7f54af38b68..618b253e5ec8 100644
--- a/drivers/scsi/ufs/ufshcd.h
+++ b/drivers/scsi/ufs/ufshcd.h
@@ -90,6 +90,7 @@ enum uic_link_state {
 	UIC_LINK_OFF_STATE	= 0, /* Link powered down or disabled */
 	UIC_LINK_ACTIVE_STATE	= 1, /* Link is in Fast/Slow/Sleep state */
 	UIC_LINK_HIBERN8_STATE	= 2, /* Link is in Hibernate state */
+	UIC_LINK_BROKEN_STATE	= 3, /* Link is in broken state */
 };
 
 #define ufshcd_is_link_off(hba) ((hba)->uic_link_state == UIC_LINK_OFF_STATE)
@@ -97,11 +98,15 @@ enum uic_link_state {
 				    UIC_LINK_ACTIVE_STATE)
 #define ufshcd_is_link_hibern8(hba) ((hba)->uic_link_state == \
 				    UIC_LINK_HIBERN8_STATE)
+#define ufshcd_is_link_broken(hba) ((hba)->uic_link_state == \
+				   UIC_LINK_BROKEN_STATE)
 #define ufshcd_set_link_off(hba) ((hba)->uic_link_state = UIC_LINK_OFF_STATE)
 #define ufshcd_set_link_active(hba) ((hba)->uic_link_state = \
 				    UIC_LINK_ACTIVE_STATE)
 #define ufshcd_set_link_hibern8(hba) ((hba)->uic_link_state = \
 				    UIC_LINK_HIBERN8_STATE)
+#define ufshcd_set_link_broken(hba) ((hba)->uic_link_state = \
+				    UIC_LINK_BROKEN_STATE)
 
 #define ufshcd_set_ufs_dev_active(h) \
 	((h)->curr_dev_pwr_mode = UFS_ACTIVE_PWR_MODE)
@@ -616,12 +621,14 @@ struct ufs_hba_variant_params {
  * @intr_mask: Interrupt Mask Bits
  * @ee_ctrl_mask: Exception event control mask
  * @is_powered: flag to check if HBA is powered
+ * @eh_wq: Workqueue that eh_work works on
  * @eh_work: Worker to handle UFS errors that require s/w attention
  * @eeh_work: Worker to handle exception events
  * @errors: HBA errors
  * @uic_error: UFS interconnect layer error status
  * @saved_err: sticky error mask
  * @saved_uic_err: sticky UIC error mask
+ * @force_reset: flag to force eh_work perform a full reset
  * @silence_err_logs: flag to silence error logs
  * @dev_cmd: ufs device management command information
  * @last_dme_cmd_tstamp: time stamp of the last completed DME command
@@ -710,6 +717,7 @@ struct ufs_hba {
 	bool is_powered;
 
 	/* Work Queues */
+	struct workqueue_struct *eh_wq;
 	struct work_struct eh_work;
 	struct work_struct eeh_work;
 
@@ -719,6 +727,7 @@ struct ufs_hba {
 	u32 saved_err;
 	u32 saved_uic_err;
 	struct ufs_stats ufs_stats;
+	bool force_reset;
 	bool silence_err_logs;
 
 	/* Device management request data */
author	Can Guo <cang@codeaurora.org>	2020-08-09 15:15:51 +0300
committer	Martin K. Petersen <martin.petersen@oracle.com>	2020-08-18 03:54:54 +0300
commit	4db7a23605973ac4a2cb24e759bebaaa82054f21 (patch)
tree	4784909ee4d155b85dbf72eb9d8f681e48b32ab2 /drivers/scsi/ufs/ufshcd.h
parent	3f8af6044713fab0aeb884e1be6bcc2cb276725e (diff)
download	linux-4db7a23605973ac4a2cb24e759bebaaa82054f21.tar.xz