summaryrefslogtreecommitdiff
path: root/drivers/scsi/mpi3mr
AgeCommit message (Collapse)AuthorFilesLines
2025-08-28scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systemsRanjan Kumar3-4/+17
[ Upstream commit c91e140c82eb58724c435f623702e51cc7896646 ] On 32-bit systems, 64-bit BAR writes to admin queue registers are performed as two 32-bit writes. Without locking, this can cause partial writes when accessed concurrently. Updated per-queue spinlocks is used to serialize these writes and prevent race conditions. Fixes: 824a156633df ("scsi: mpi3mr: Base driver code") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250627194539.48851-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28scsi: mpi3mr: Drop unnecessary volatile from __iomem pointersRanjan Kumar2-3/+3
[ Upstream commit 6853885b21cb1d7157cc14c9d30cc17141565bae ] The volatile qualifier is redundant for __iomem pointers. Cleaned up usage in mpi3mr_writeq() and sysif_regs pointer as per Upstream compliance. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250627194539.48851-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: c91e140c82eb ("scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28scsi: mpi3mr: Fix race between config read submit and interrupt completionRanjan Kumar1-1/+1
commit e6327c4acf925bb6d6d387d76fc3bd94471e10d8 upstream. The "is_waiting" flag was updated after calling complete(), which could lead to a race where the waiting thread wakes up before the flag is cleared. This may cause a missed wakeup or stale state check. Reorder the operations to update "is_waiting" before signaling completion to ensure consistent state. Fixes: 824a156633df ("scsi: mpi3mr: Base driver code") Cc: stable@vger.kernel.org Co-developed-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250627194539.48851-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-20scsi: mpi3mr: Correctly handle ATA device errorsDamien Le Moal1-1/+19
[ Upstream commit 04caad5a7ba86e830d04750417a15bad8ac2613c ] With the ATA error model, an NCQ command failure always triggers an abort (termination) of all NCQ commands queued on the device. In such case, the SAT or the host must handle the failed command according to the command sense data and immediately retry all other NCQ commands that were aborted due to the failed NCQ command. For SAS HBAs controlled by the mpi3mr driver, NCQ command aborts are not handled by the HBA SAT and sent back to the host, with an ioc log information equal to 0x31080000 (IOC_LOGINFO_PREFIX_PL with the PL code PL_LOGINFO_CODE_SATA_NCQ_FAIL_ALL_CMDS_AFTR_ERR). The function mpi3mr_process_op_reply_desc() always forces a retry of commands terminated with the status MPI3_IOCSTATUS_SCSI_IOC_TERMINATED using the SCSI result DID_SOFT_ERROR, regardless of the ioc_loginfo for the command. This correctly forces the retry of collateral NCQ abort commands, but with the retry counter for the command being incremented. If a command to an ATA device is subject to too many retries due to other NCQ commands failing (e.g. read commands trying to access unreadable sectors), the collateral NCQ abort commands may be terminated with an error as they run out of retries. This violates the SAT specification and causes hard-to-debug command errors. Solve this issue by modifying the handling of the MPI3_IOCSTATUS_SCSI_IOC_TERMINATED status to check if a command is for an ATA device and if the command ioc_loginfo indicates an NCQ collateral abort. If that is the case, force the command retry using the SCSI result DID_IMM_RETRY to avoid incrementing the command retry count. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250606052747.742998-2-dlemoal@kernel.org Tested-by: Yafang Shao <laoar.shao@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: mpi3mr: Update timestamp only for supervisor IOCsRanjan Kumar1-1/+4
[ Upstream commit 83a9d30d29f275571f6e8f879f04b2379be7eb6c ] The driver issues the time stamp update command periodically. Even if the command fails with supervisor only IOC Status. Instead check the Non-Supervisor capability bit reported by IOC as part of IOC Facts. Co-developed-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250220142528.20837-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29scsi: mpi3mr: Add level check to control event loggingRanjan Kumar1-0/+3
[ Upstream commit b0b7ee3b574a72283399b9232f6190be07f220c0 ] Ensure event logs are only generated when the debug logging level MPI3_DEBUG_EVENT is enabled. This prevents unnecessary logging. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250415101546.204018-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02scsi: mpi3mr: Fix pending I/O counterRanjan Kumar1-1/+1
commit cdd445258db9919e9dde497a6d5c3477ea7faf4d upstream. Commit 199510e33dea ("scsi: mpi3mr: Update consumer index of reply queues after every 100 replies") introduced a regression with the per-reply queue pending I/O counter which was erroneously decremented, leading to the counter going negative. Drop the incorrect atomic decrement for the pending I/O counter. Fixes: 199510e33dea ("scsi: mpi3mr: Update consumer index of reply queues after every 100 replies") Cc: stable@vger.kernel.org Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250411111419.135485-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-25scsi: replace blk_mq_pci_map_queues with blk_mq_map_hw_queuesDaniel Wagner2-2/+1
[ Upstream commit bd326a5ad6397ccfc67af862606be107c15a43e6 ] Replace all users of blk_mq_pci_map_queues with the more generic blk_mq_map_hw_queues. This in preparation to retire blk_mq_pci_map_queues. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Signed-off-by: Daniel Wagner <wagi@kernel.org> Link: https://lore.kernel.org/r/20241202-refactor-blk-affinity-helpers-v6-5-27211e9c2cd5@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: a2d5a0072235 ("scsi: smartpqi: Use is_kdump_kernel() to check for kdump") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-20scsi: mpi3mr: Synchronous access b/w reset and tm thread for reply queueRanjan Kumar2-3/+66
[ Upstream commit f195fc060c738d303a21fae146dbf85e1595fb4c ] When the task management thread processes reply queues while the reset thread resets them, the task management thread accesses an invalid queue ID (0xFFFF), set by the reset thread, which points to unallocated memory, causing a crash. Add flag 'io_admin_reset_sync' to synchronize access between the reset, I/O, and admin threads. Before a reset, the reset handler sets this flag to block I/O and admin processing threads. If any thread bypasses the initial check, the reset thread waits up to 10 seconds for processing to finish. If the wait exceeds 10 seconds, the controller is marked as unrecoverable. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-20scsi: mpi3mr: Avoid reply queue full conditionRanjan Kumar3-5/+63
[ Upstream commit f08b24d82749117ce779cc66689e8594341130d3 ] To avoid reply queue full condition, update the driver to check IOCFacts capabilities for qfull. Update the operational reply queue's Consumer Index after processing 100 replies. If pending I/Os on a reply queue exceeds a threshold (reply_queue_depth - 200), then return I/O back to OS to retry. Also increase default admin reply queue size to 2K. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20250129100850.25430-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08scsi: mpi3mr: Fix possible crash when setting up bsg failsGuixin Liu1-2/+6
[ Upstream commit 295006f6e8c17212d3098811166e29627d19e05c ] If bsg_setup_queue() fails, the bsg_queue is assigned a non-NULL value. Consequently, in mpi3mr_bsg_exit(), the condition "if(!mrioc->bsg_queue)" will not be satisfied, preventing execution from entering bsg_remove_queue(), which could lead to the following crash: BUG: kernel NULL pointer dereference, address: 000000000000041c Call Trace: <TASK> mpi3mr_bsg_exit+0x1f/0x50 [mpi3mr] mpi3mr_remove+0x6f/0x340 [mpi3mr] pci_device_remove+0x3f/0xb0 device_release_driver_internal+0x19d/0x220 unbind_store+0xa4/0xb0 kernfs_fop_write_iter+0x11f/0x200 vfs_write+0x1fc/0x3e0 ksys_write+0x67/0xe0 do_syscall_64+0x38/0x80 entry_SYSCALL_64_after_hwframe+0x78/0xe2 Fixes: 4268fa751365 ("scsi: mpi3mr: Add bsg device support") Signed-off-by: Guixin Liu <kanie@linux.alibaba.com> Link: https://lore.kernel.org/r/20250107022032.24006-1-kanie@linux.alibaba.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-02scsi: mpi3mr: Handling of fault code for insufficient powerRanjan Kumar1-0/+40
[ Upstream commit fb6eb98f3965e2ee92cbcb466051d2f2acf552d1 ] Before retrying initialization, check and abort if the fault code indicates insufficient power. Also mark the controller as unrecoverable instead of issuing reset in the watch dog timer if the fault code indicates insufficient power. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-02scsi: mpi3mr: Start controller indexing from 0Ranjan Kumar1-1/+1
[ Upstream commit 0d32014f1e3e7a7adf1583c45387f26b9bb3a49d ] Instead of displaying the controller index starting from '1' make the driver display the controller index starting from '0'. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-02scsi: mpi3mr: Fix corrupt config pages PHY state is switched in sysfsRanjan Kumar2-77/+13
[ Upstream commit 711201a8b8334a397440ac0b859df0054e174bc9 ] The driver, through the SAS transport, exposes a sysfs interface to enable/disable PHYs in a controller/expander setup. When multiple PHYs are disabled and enabled in rapid succession, the persistent and current config pages related to SAS IO unit/SAS Expander pages could get corrupted. Use separate memory for each config request. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-02scsi: mpi3mr: Synchronize access to ioctl data bufferRanjan Kumar1-11/+25
[ Upstream commit 367ac16e5ff2dcd6b7f00a8f94e6ba98875cb397 ] The driver serializes ioctls through a mutex lock but access to the ioctl data buffer is not guarded by the mutex. This results in multiple user threads being able to write to the driver's ioctl buffer simultaneously. Protect the ioctl buffer with the ioctl mutex. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241110194405.10108-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-19Merge tag 'scsi-fixes' of ↵Linus Torvalds2-17/+29
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Fixes all in drivers. The largest is the mpi3mr which corrects a phy count limit that should only apply to the controller but was being incorrectly applied to expander phys" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: target: core: Fix null-ptr-deref in target_alloc_device() scsi: mpi3mr: Validate SAS port assignments scsi: ufs: core: Set SDEV_OFFLINE when UFS is shut down scsi: ufs: core: Requeue aborted request scsi: ufs: core: Fix the issue of ICU failure
2024-10-16scsi: mpi3mr: Validate SAS port assignmentsRanjan Kumar2-17/+29
A sanity check on phy_mask was added in commit 3668651def2c ("scsi: mpi3mr: Sanitise num_phys"). This causes warning messages when more than 64 phys are detected and devices connected to phys greater than 64 are dropped. The phy_mask bitmap is only needed for controller phys and not required for expander phys. Controller phys can go up to a maximum of 64 and therefore u64 is good enough to contain phy_mask bitmap. To suppress those warnings and allow devices to be discovered as before the offending commit, restrict the phy_mask setting and lowest phy setting only to the controller phys. Fixes: 3668651def2c ("scsi: mpi3mr: Sanitise num_phys") Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202410051943.Mp9o5DlF-lkp@intel.com/ Reported-by: Alexander Motin <mav@ixsystems.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20241008074353.200379-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-10-03move asm/unaligned.h to linux/unaligned.hAl Viro1-1/+1
asm/unaligned.h is always an include of asm-generic/unaligned.h; might as well move that thing to linux/unaligned.h and include that - there's nothing arch-specific in that header. auto-generated by the following: for i in `git grep -l -w asm/unaligned.h`; do sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i done for i in `git grep -l -w asm-generic/unaligned.h`; do sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i done git mv include/asm-generic/unaligned.h include/linux/unaligned.h git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-09-29Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds6-28/+121
Pull more SCSI updates from James Bottomley: "These are mostly minor updates. There are two drivers (lpfc and mpi3mr) which missed the initial pull and a core change to retry a start/stop unit which affect suspend/resume" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits) scsi: lpfc: Update lpfc version to 14.4.0.5 scsi: lpfc: Support loopback tests with VMID enabled scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() scsi: mpi3mr: Update driver version to 8.12.0.0.50 scsi: mpi3mr: Improve wait logic while controller transitions to READY state scsi: mpi3mr: Update MPI Headers to revision 34 scsi: mpi3mr: Use firmware-provided timestamp update interval scsi: mpi3mr: Enhance the Enable Controller retry logic scsi: sd: Fix off-by-one error in sd_read_block_characteristics() scsi: pm8001: Do not overwrite PCI queue mapping scsi: scsi_debug: Remove a useless memset() scsi: pmcraid: Convert comma to semicolon scsi: sd: Retry START STOP UNIT commands scsi: mpi3mr: A performance fix scsi: ufs: qcom: Update MODE_MAX cfg_bw value ...
2024-09-19Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds5-32/+35
Pull SCSI updates from James Bottomley: "Updates to the usual drivers (ufs, smartpqi, NCR5380, mac_scsi, lpfc, mpi3mr). There are no user visible core changes and a whole series of minor updates and fixes. The largest core change is probably the simplification of the workqueue allocation path" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (86 commits) scsi: smartpqi: update driver version to 2.1.30-031 scsi: smartpqi: fix volume size updates scsi: smartpqi: fix rare system hang during LUN reset scsi: smartpqi: add new controller PCI IDs scsi: smartpqi: add counter for parity write stream requests scsi: smartpqi: correct stream detection scsi: smartpqi: Add fw log to kdump scsi: bnx2fc: Remove some unused fields in struct bnx2fc_rport scsi: qla2xxx: Remove the unused 'del_list_entry' field in struct fc_port scsi: ufs: core: Remove ufshcd_urgent_bkops() scsi: core: Remove obsoleted declaration for scsi_driverbyte_string() scsi: bnx2i: Remove unused declarations scsi: core: Simplify an alloc_workqueue() invocation scsi: ufs: Simplify alloc*_workqueue() invocation scsi: stex: Simplify an alloc_ordered_workqueue() invocation scsi: scsi_transport_fc: Simplify alloc_workqueue() invocations scsi: snic: Simplify alloc_workqueue() invocations scsi: qedi: Simplify an alloc_workqueue() invocation scsi: qedf: Simplify alloc_workqueue() invocations scsi: myrs: Simplify an alloc_ordered_workqueue() invocation ...
2024-09-13Merge patch series "mpi3mr: Few Enhancements and minor fix"Martin K. Petersen6-26/+119
Ranjan Kumar <ranjan.kumar@broadcom.com> says: Few Enhancements and minor fix of mpi3mr driver. Link: https://lore.kernel.org/r/20240905102753.105310-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: Update driver version to 8.12.0.0.50Ranjan Kumar1-2/+2
Update driver version to 8.12.0.0.50. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240905102753.105310-6-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: Improve wait logic while controller transitions to READY stateRanjan Kumar1-14/+11
During controller transitioning to READY state, if the controller is found in transient states ("becoming ready" or "reset requested"), driver waits for 510 secs even if the controller transitions out of these states early. This causes an unnecessary delay of 510 secs in the overall firmware initialization sequence. Poll the controller state periodically (every 100 milliseconds) while waiting for the controller to come out of the mentioned transient states. Once the controller transits out of the transient states, come out of the wait loop. Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240905102753.105310-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: Update MPI Headers to revision 34Ranjan Kumar4-7/+52
Update MPI Headers to revision 34. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240905102753.105310-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: Use firmware-provided timestamp update intervalRanjan Kumar3-2/+30
Make driver use the timestamp update interval value provided by firmware in the driver page 1. If firmware fails to provide non-zero value, then the driver will fall back to the driver defined macro. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240905102753.105310-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: Enhance the Enable Controller retry logicRanjan Kumar1-1/+24
When enabling the IOC request and polling for controller ready status, poll for controller fault and reset history bit. If the controller is faulted or the reset history bit is set, retry the initialization a maximum of three times (2 retries) or if the cumulative time taken for all retries exceeds 510 seconds. Signed-off-by: Prayas Patel <prayas.patel@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240905102753.105310-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-13scsi: mpi3mr: A performance fixTomas Henzl2-2/+2
Commit 0c52310f2600 ("hrtimer: Ignore slack time for RT tasks in schedule_hrtimeout_range()") effectivelly shortens a sleep in a polling function in the driver. That is causing a performance regression as the new value of just 2us is too low, in certain tests the perf drop is ~30%. Fix this by adjusting the sleep to 20us (close to the previous value). Reported-by: Jan Jurca <jjurca@redhat.com> Signed-off-by: Tomas Henzl <thenzl@redhat.com> Acked-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240903144729.37218-1-thenzl@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-23Merge patch series "Simplify multiple create*_workqueue() invocations"Martin K. Petersen3-7/+3
Bart Van Assche <bvanassche@acm.org> says: Hi Martin, Multiple SCSI drivers use snprintf() to format a workqueue name before invoking one of the create*_workqueue() macros. This patch series simplifies such code by passing the format string and arguments to alloc_workqueue(). Additionally, the structure members that are only used as a temporary buffer for formatting workqueue names are removed. Please consider this patch series for the next merge window. Thanks, Bart. Link: https://lore.kernel.org/r/20240822195944.654691-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-23scsi: mpi3mr: Simplify an alloc_ordered_workqueue() invocationBart Van Assche2-5/+1
Let alloc_ordered_workqueue() format the workqueue name instead of calling snprintf() explicitly. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240822195944.654691-9-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-23scsi: Expand all create*_workqueue() invocationsBart Van Assche1-2/+2
The workqueue maintainer wants to remove the create*_workqueue() macros because these macros always set the WQ_MEM_RECLAIM flag and because these only support literal workqueue names. Hence this patch that replaces the create*_workqueue() invocations with the definition of this macro. The WQ_MEM_RECLAIM flag has been retained because I think that flag is necessary for workqueues created by storage drivers. This patch has been generated by running spatch and git clang-format. spatch has been invoked as follows: spatch --in-place --sp-file expand-create-workqueue.spatch $(git grep -lEw 'create_(freezable_|singlethread_|)workqueue' */scsi */ufs) The contents of the expand-create-workqueue.spatch file is as follows: @@ expression name; @@ -create_workqueue(name) +alloc_workqueue("%s", WQ_MEM_RECLAIM, 1, name) @@ expression name; @@ -create_freezable_workqueue(name) +alloc_workqueue("%s", WQ_FREEZABLE | WQ_UNBOUND | WQ_MEM_RECLAIM, 1, name) @@ expression name; @@ -create_singlethread_workqueue(name) +alloc_ordered_workqueue("%s", WQ_MEM_RECLAIM, name) Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240822195944.654691-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-13scsi: mpi3mr: Driver version update to 8.10.0.5.50Ranjan Kumar1-2/+2
Update driver version to 8.10.0.5.50. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240808125418.8832-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-13scsi: mpi3mr: Update consumer index of reply queues after every 100 repliesRanjan Kumar2-2/+17
Instead of updating the ConsumerIndex of the Admin and Operational ReplyQueues after processing all replies in the queue, the index will now be periodically updated after processing every 100 replies. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240808125418.8832-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-13scsi: mpi3mr: Return complete ioc_status for ioctl commandsRanjan Kumar1-5/+9
The driver masked the loginfo available bit in the iocstatus before passing it to the applications, causing a mismatch in error messages between Linux and other operating systems. Modify driver to return unmasked (complete) iocstatus, including the loginfo available bit, for the MPI commands sent through the ioctl interface. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240808125418.8832-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-13scsi: mpi3mr: Avoid MAX_PAGE_ORDER WARNING for buffer allocationsShin'ichiro Kawasaki1-3/+8
Commit fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers") added mpi3mr_alloc_diag_bufs() which calls dma_alloc_coherent() to allocate the trace buffer and the firmware buffer. mpi3mr_alloc_diag_bufs() decides the buffer sizes from the driver configuration. In my environment, the sizes are 8MB. With the sizes, dma_alloc_coherent() fails and report this WARNING: WARNING: CPU: 4 PID: 438 at mm/page_alloc.c:4676 __alloc_pages_noprof+0x52f/0x640 The WARNING indicates that the order of the allocation size is larger than MAX_PAGE_ORDER. After this failure, mpi3mr_alloc_diag_bufs() reduces the buffer sizes and retries dma_alloc_coherent(). In the end, the buffer allocations succeed with 4MB size in my environment, which corresponds to MAX_PAGE_ORDER=10. Though the allocations succeed, the WARNING message is misleading and should be avoided. To avoid the WARNING, check the orders of the buffer allocation sizes before calling dma_alloc_coherent(). If the orders are larger than MAX_PAGE_ORDER, fall back to the retry path. Fixes: fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240810042701.661841-3-shinichiro.kawasaki@wdc.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-13scsi: mpi3mr: Add missing spin_lock_init() for mrioc->trigger_lockShin'ichiro Kawasaki1-0/+1
Commit fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers") added the spinlock trigger_lock to the struct mpi3mr_ioc. However, spin_lock_init() call was not added for it, then the lock does not work as expected. Also, the kernel reports the message below when lockdep is enabled. INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? To fix the issue and to avoid the INFO message, add the missing spin_lock_init() call. Fixes: fc4444941140 ("scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Link: https://lore.kernel.org/r/20240810042701.661841-2-shinichiro.kawasaki@wdc.com Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-03scsi: mpi3mr: struct mpi3_sas_io_unit_page1: Replace 1-element array with ↵Kees Cook1-4/+1
flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_sas_io_unit_page1 with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20240711155637.3757036-4-kees@kernel.org Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-03scsi: mpi3mr: struct mpi3_sas_io_unit_page0: Replace 1-element array with ↵Kees Cook1-4/+1
flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_sas_io_unit_page0 with a modern flexible array. No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20240711155637.3757036-3-kees@kernel.org Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-03scsi: mpi3mr: struct mpi3_event_data_pcie_topology_change_list: Replace ↵Kees Cook1-4/+1
1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_event_data_pcie_topology_change_list with a modern flexible array. Additionally add __counted_by annotation since port_entry is only ever accessed in loops controlled by num_entries. For example: for (i = 0; i < event_data->num_entries; i++) { handle = le16_to_cpu(event_data->port_entry[i].attached_dev_handle); No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20240711155637.3757036-2-kees@kernel.org Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-03scsi: mpi3mr: struct mpi3_event_data_sas_topology_change_list: Replace ↵Kees Cook1-4/+1
1-element array with flexible array Replace the deprecated[1] use of a 1-element array in struct mpi3_event_data_sas_topology_change_list with a modern flexible array. Additionally add __counted_by annotation since phy_entry is only ever accessed in loops controlled by num_entries. For example: for (i = 0; i < event_data->num_entries; i++) { ... handle = le16_to_cpu(event_data->phy_entry[i].attached_dev_handle); No binary differences are present after this conversion. Link: https://github.com/KSPP/linux/issues/79 [1] Signed-off-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20240711155637.3757036-1-kees@kernel.org Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-23scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONESDamien Le Moal1-0/+11
Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT ZONES command reply buffer from ATA-ZAC devices by directly accessing the buffer in the host memory. This does not respect the default command DMA direction and causes IOMMU page faults on architectures with an IOMMU enforcing write-only mappings for DMA_FROM_DEVICE DMA direction (e.g. AMD hosts), leading to the device capacity to be dropped to 0: scsi 18:0:58:0: Direct-Access-ZBC ATA WDC WSH722626AL W930 PQ: 0 ANSI: 7 scsi 18:0:58:0: Power-on or device reset occurred sd 18:0:58:0: Attached scsi generic sg9 type 20 sd 18:0:58:0: [sdj] Host-managed zoned block device mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c400 flags=0x0050] mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c500 flags=0x0050] sd 18:0:58:0: [sdj] REPORT ZONES start lba 0 failed sd 18:0:58:0: [sdj] REPORT ZONES: Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK sd 18:0:58:0: [sdj] 0 4096-byte logical blocks: (0 B/0 B) sd 18:0:58:0: [sdj] Write Protect is off sd 18:0:58:0: [sdj] Mode Sense: 6b 00 10 08 sd 18:0:58:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA sd 18:0:58:0: [sdj] Attached SCSI disk Avoid this issue by always mapping the buffer of REPORT ZONES commands using DMA_BIDIRECTIONAL, that is, using a read-write IOMMU mapping. Suggested-by: Christoph Hellwig <hch@lst.de> Fixes: 023ab2a9b4ed ("scsi: mpi3mr: Add support for queue command processing") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240719073913.179559-2-dlemoal@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-11Merge branch '6.10/scsi-fixes' into 6.11/scsi-stagingMartin K. Petersen2-1/+63
Pull in my fixes branch to resolve an mpi3mr merge conflict reported by sfr. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-05Merge patch series "mpi3mr: Support PCI Error Recovery"Martin K. Petersen5-19/+311
Sumit Saxena <sumit.saxena@broadcom.com> says: This patch series contains the changes done in the driver to support PCI error recovery. It is rework of older patch series from Ranjan Kumar, see [1]. [1] https://lore.kernel.org/all/20231214205900.270488-1-ranjan.kumar@broadcom.com/ Link: https://lore.kernel.org/r/20240627101735.18286-1-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-05scsi: mpi3mr: Driver version updateSumit Saxena1-2/+2
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-4-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-05scsi: mpi3mr: Prevent PCI writes from driver during PCI error recoverySumit Saxena5-17/+104
Prevent interaction with the hardware while the error recovery in progress. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-05scsi: mpi3mr: Support PCI Error Recovery callback handlersSumit Saxena2-0/+205
PCI Error recovery support is required to recover the controller upon detection of PCI errors. Add support for the PCI error recovery callback handlers in mpi3mr driver. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-2-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-05scsi: mpi3mr: Correct a test in mpi3mr_sas_port_add()Tomas Henzl1-2/+2
The test for a possible shift overflow is not correct. Fix it by replacing the '>' with a '>='. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Link: https://lore.kernel.org/r/20240627074827.13672-1-thenzl@redhat.com Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-27scsi: mpi3mr: Update driver version to 8.9.1.0.50Ranjan Kumar1-2/+2
Update driver version to 8.9.1.0.50 Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-5-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-27scsi: mpi3mr: Add ioctl support for HDBRanjan Kumar2-0/+277
Add interface for applications to manage the host diagnostic buffers and update the automatic diag buffer capture triggers. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-4-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-27scsi: mpi3mr: Trigger supportRanjan Kumar4-8/+564
Add functions to process automatic diag triggers. If a condition defined in the triggers is met, the driver will call appropriate controller functions to save the diagnostic information. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405151955.BiAWI1SY-lkp@intel.com/ Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-3-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-27scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffersRanjan Kumar4-1/+790
To be able to debug controller problems it is beneficial to allocate and configure system/host memory buffers which can be used to capture hardware and firmware diagnostic information. Add functions required to allocate and post firmware and hardware diagnostic buffers to the controller and to set up automatic diagnostic capture triggers. Captures will be triggered under the following circumstances: 1. Firmware is in FAULT state. 2. Admin commands time out. 3. Controller reset caused due to I/O timeout Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202405151758.7xrJz6rp-lkp@intel.com/ Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/r/20240626102646.14298-2-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>