diff options
author | Saurav Kashyap <skashyap@marvell.com> | 2022-01-10 08:02:03 +0300 |
---|---|---|
committer | Martin K. Petersen <martin.petersen@oracle.com> | 2022-01-25 07:57:30 +0300 |
commit | 31e6cdbe0eae37badceb5e0d4f06cf051432fd77 (patch) | |
tree | d4f668a402e788be59cf39d988f894ed0ae96d62 /drivers/scsi/qla2xxx/qla_edif.c | |
parent | d4523bd6fd5d3afa9f08a86038a8a92176089f5b (diff) | |
download | linux-31e6cdbe0eae37badceb5e0d4f06cf051432fd77.tar.xz |
scsi: qla2xxx: Implement ref count for SRB
The timeout handler and the done function are racing. When
qla2x00_async_iocb_timeout() starts to run it can be preempted by the
normal response path (via the firmware?). qla24xx_async_gpsc_sp_done()
releases the SRB unconditionally. When scheduling back to
qla2x00_async_iocb_timeout() qla24xx_async_abort_cmd() will access an freed
sp->qpair pointer:
qla2xxx [0000:83:00.0]-2871:0: Async-gpsc timeout - hdl=63d portid=234500 50:06:0e:80:08:77:b6:21.
qla2xxx [0000:83:00.0]-2853:0: Async done-gpsc res 0, WWPN 50:06:0e:80:08:77:b6:21
qla2xxx [0000:83:00.0]-2854:0: Async-gpsc OUT WWPN 20:45:00:27:f8:75:33:00 speeds=2c00 speed=0400.
qla2xxx [0000:83:00.0]-28d8:0: qla24xx_handle_gpsc_event 50:06:0e:80:08:77:b6:21 DS 7 LS 6 rc 0 login 1|1 rscn 1|0 lid 5
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: qla24xx_async_abort_cmd+0x1b/0x1c0 [qla2xxx]
Obvious solution to this is to introduce a reference counter. One reference
is taken for the normal code path (the 'good' case) and one for the timeout
path. As we always race between the normal good case and the timeout/abort
handler we need to serialize it. Also we cannot assume any order between
the handlers. Since this is slow path we can use proper synchronization via
locks.
When we are able to cancel a timer (del_timer returns 1) we know there
can't be any error handling in progress because the timeout handler hasn't
expired yet, thus we can safely decrement the refcounter by one.
If we are not able to cancel the timer, we know an abort handler is
running. We have to make sure we call sp->done() in the abort handlers
before calling kref_put().
Link: https://lore.kernel.org/r/20220110050218.3958-3-njavali@marvell.com
Cc: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Co-developed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Diffstat (limited to 'drivers/scsi/qla2xxx/qla_edif.c')
-rw-r--r-- | drivers/scsi/qla2xxx/qla_edif.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c index 53d2b8562027..c04957c363d8 100644 --- a/drivers/scsi/qla2xxx/qla_edif.c +++ b/drivers/scsi/qla2xxx/qla_edif.c @@ -2146,7 +2146,8 @@ edif_doorbell_show(struct device *dev, struct device_attribute *attr, static void qla_noop_sp_done(srb_t *sp, int res) { - sp->free(sp); + /* ref: INIT */ + kref_put(&sp->cmd_kref, qla2x00_sp_release); } /* |