summaryrefslogtreecommitdiff
path: root/drivers/scsi
AgeCommit message (Collapse)AuthorFilesLines
4 daysscsi: sg: Fix occasional bogus elapsed time that exceeds timeoutMichal Rábek1-7/+13
[ Upstream commit 0e1677654259a2f3ccf728de1edde922a3c4ba57 ] A race condition was found in sg_proc_debug_helper(). It was observed on a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring /proc/scsi/sg/debug every second. A very large elapsed time would sometimes appear. This is caused by two race conditions. We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64 architecture. A patched kernel was built, and the race condition could not be observed anymore after the application of this patch. A reproducer C program utilising the scsi_debug module was also built by Changhui Zhong and can be viewed here: https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c The first race happens between the reading of hp->duration in sg_proc_debug_helper() and request completion in sg_rq_end_io(). The hp->duration member variable may hold either of two types of information: #1 - The start time of the request. This value is present while the request is not yet finished. #2 - The total execution time of the request (end_time - start_time). If sg_proc_debug_helper() executes *after* the value of hp->duration was changed from #1 to #2, but *before* srp->done is set to 1 in sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the elapsed time (value type #2) is subtracted from a timestamp, which cannot yield a valid elapsed time (which is a type #2 value as well). To fix this issue, the value of hp->duration must change under the protection of the sfp->rq_list_lock in sg_rq_end_io(). Since sg_proc_debug_helper() takes this read lock, the change to srp->done and srp->header.duration will happen atomically from the perspective of sg_proc_debug_helper() and the race condition is thus eliminated. The second race condition happens between sg_proc_debug_helper() and sg_new_write(). Even though hp->duration is set to the current time stamp in sg_add_request() under the write lock's protection, it gets overwritten by a call to get_sg_io_hdr(), which calls copy_from_user() to copy struct sg_io_hdr from userspace into kernel space. hp->duration is set to the start time again in sg_common_write(). If sg_proc_debug_helper() is called between these two calls, an arbitrary value set by userspace (usually zero) is used to compute the elapsed time. To fix this issue, hp->duration must be set to the current timestamp again after get_sg_io_hdr() returns successfully. A small race window still exists between get_sg_io_hdr() and setting hp->duration, but this window is only a few instructions wide and does not result in observable issues in practice, as confirmed by testing. Additionally, we fix the format specifier from %d to %u for printing unsigned int values in sg_proc_debug_helper(). Signed-off-by: Michal Rábek <mrabek@redhat.com> Suggested-by: Tomas Henzl <thenzl@redhat.com> Tested-by: Changhui Zhong <czhong@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: John Meneghini <jmeneghi@redhat.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Link: https://patch.msgid.link/20251212160900.64924-1-mrabek@redhat.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysscsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure ↵Xingui Yang1-14/+0
scanned in again after probe failed" [ Upstream commit 278712d20bc8ec29d1ad6ef9bdae9000ef2c220c ] This reverts commit ab2068a6fb84751836a84c26ca72b3beb349619d. When probing the exp-attached sata device, libsas/libata will issue a hard reset in sas_probe_sata() -> ata_sas_async_probe(), then a broadcast event will be received after the disk probe fails, and this commit causes the probe will be re-executed on the disk, and a faulty disk may get into an indefinite loop of probe. Therefore, revert this commit, although it can fix some temporary issues with disk probe failure. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251202065627.140361-1-yangxingui@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysscsi: ipr: Enable/disable IRQD_NO_BALANCING during resetWen Xiong1-1/+27
[ Upstream commit 6ac3484fb13b2fc7f31cfc7f56093e7d0ce646a5 ] A dynamic remove/add storage adapter test hits EEH on PowerPC: EEH: [c00000000004f75c] __eeh_send_failure_event+0x7c/0x160 EEH: [c000000000048444] eeh_dev_check_failure.part.0+0x254/0x650 EEH: [c008000001650678] eeh_readl+0x60/0x90 [ipr] EEH: [c00800000166746c] ipr_cancel_op+0x2b8/0x524 [ipr] EEH: [c008000001656524] ipr_eh_abort+0x6c/0x130 [ipr] EEH: [c000000000ab0d20] scmd_eh_abort_handler+0x140/0x440 EEH: [c00000000017e558] process_one_work+0x298/0x590 EEH: [c00000000017eef8] worker_thread+0xa8/0x620 EEH: [c00000000018be34] kthread+0x124/0x130 EEH: [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64 A PCIe bus trace reveals that a vector of MSI-X is cleared to 0 by irqbalance daemon. If we disable irqbalance daemon, we won't see the issue. With debug enabled in ipr driver: [ 44.103071] ipr: Entering __ipr_remove [ 44.103083] ipr: Entering ipr_initiate_ioa_bringdown [ 44.103091] ipr: Entering ipr_reset_shutdown_ioa [ 44.103099] ipr: Leaving ipr_reset_shutdown_ioa [ 44.103105] ipr: Leaving ipr_initiate_ioa_bringdown [ 44.149918] ipr: Entering ipr_reset_ucode_download [ 44.149935] ipr: Entering ipr_reset_alert [ 44.150032] ipr: Entering ipr_reset_start_timer [ 44.150038] ipr: Leaving ipr_reset_alert [ 44.244343] scsi 1:2:3:0: alua: Detached [ 44.254300] ipr: Entering ipr_reset_start_bist [ 44.254320] ipr: Entering ipr_reset_start_timer [ 44.254325] ipr: Leaving ipr_reset_start_bist [ 44.364329] scsi 1:2:4:0: alua: Detached [ 45.134341] scsi 1:2:5:0: alua: Detached [ 45.860949] ipr: Entering ipr_reset_shutdown_ioa [ 45.860962] ipr: Leaving ipr_reset_shutdown_ioa [ 45.860966] ipr: Entering ipr_reset_alert [ 45.861028] ipr: Entering ipr_reset_start_timer [ 45.861035] ipr: Leaving ipr_reset_alert [ 45.964302] ipr: Entering ipr_reset_start_bist [ 45.964309] ipr: Entering ipr_reset_start_timer [ 45.964313] ipr: Leaving ipr_reset_start_bist [ 46.264301] ipr: Entering ipr_reset_bist_done [ 46.264309] ipr: Leaving ipr_reset_bist_done During adapter reset, ipr device driver blocks config space access but can't block MMIO access for MSI-X entries. There is very small window: irqbalance daemon kicks in during adapter reset before ipr driver calls pci_restore_state(pdev) to restore MSI-X table. irqbalance daemon reads back all 0 for that MSI-X vector in __pci_read_msi_msg(). irqbalance daemon: msi_domain_set_affinity() ->irq_chip_set_affinity_patent() ->xive_irq_set_affinity() ->irq_chip_compose_msi_msg() ->pseries_msi_compose_msg() ->__pci_read_msi_msg(): read all 0 since didn't call pci_restore_state ->irq_chip_write_msi_msg() -> pci_write_msg_msi(): write 0 to the msix vector entry When ipr driver calls pci_restore_state(pdev) in ipr_reset_restore_cfg_space(), the MSI-X vector entry has been cleared by irqbalance daemon in pci_write_msg_msix(). pci_restore_state() ->__pci_restore_msix_state() Below is the MSI-X table for ipr adapter after irqbalance daemon kicked in during adapter reset: Dump MSIx table: index=0 address_lo=c800 address_hi=10000000 msg_data=0 Dump MSIx table: index=1 address_lo=c810 address_hi=10000000 msg_data=0 Dump MSIx table: index=2 address_lo=c820 address_hi=10000000 msg_data=0 Dump MSIx table: index=3 address_lo=c830 address_hi=10000000 msg_data=0 Dump MSIx table: index=4 address_lo=c840 address_hi=10000000 msg_data=0 Dump MSIx table: index=5 address_lo=c850 address_hi=10000000 msg_data=0 Dump MSIx table: index=6 address_lo=c860 address_hi=10000000 msg_data=0 Dump MSIx table: index=7 address_lo=c870 address_hi=10000000 msg_data=0 Dump MSIx table: index=8 address_lo=0 address_hi=0 msg_data=0 ---------> Hit EEH since msix vector of index=8 are 0 Dump MSIx table: index=9 address_lo=c890 address_hi=10000000 msg_data=0 Dump MSIx table: index=10 address_lo=c8a0 address_hi=10000000 msg_data=0 Dump MSIx table: index=11 address_lo=c8b0 address_hi=10000000 msg_data=0 Dump MSIx table: index=12 address_lo=c8c0 address_hi=10000000 msg_data=0 Dump MSIx table: index=13 address_lo=c8d0 address_hi=10000000 msg_data=0 Dump MSIx table: index=14 address_lo=c8e0 address_hi=10000000 msg_data=0 Dump MSIx table: index=15 address_lo=c8f0 address_hi=10000000 msg_data=0 [ 46.264312] ipr: Entering ipr_reset_restore_cfg_space [ 46.267439] ipr: Entering ipr_fail_all_ops [ 46.267447] ipr: Leaving ipr_fail_all_ops [ 46.267451] ipr: Leaving ipr_reset_restore_cfg_space [ 46.267454] ipr: Entering ipr_ioa_bringdown_done [ 46.267458] ipr: Leaving ipr_ioa_bringdown_done [ 46.267467] ipr: Entering ipr_worker_thread [ 46.267470] ipr: Leaving ipr_worker_thread IRQ balancing is not required during adapter reset. Enable "IRQ_NO_BALANCING" flag before starting adapter reset and disable it after calling pci_restore_state(). The irqbalance daemon is disabled for this short period of time (~2s). Co-developed-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com> Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com> Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com> Link: https://patch.msgid.link/20251028142427.3969819-2-wenxiong@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
4 daysscsi: mpi3mr: Prevent duplicate SAS/SATA device entries in channel 1Suganath Prabu S2-3/+5
[ Upstream commit 4588e65cfd66fc8bbd9969ea730db39b60a36a30 ] Avoid scanning SAS/SATA devices in channel 1 when SAS transport is enabled, as the SAS/SATA devices are exposed through channel 0. Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Link: https://lore.kernel.org/stable/20251120071955.463475-1-suganath-prabu.subramani%40broadcom.com Link: https://patch.msgid.link/20251120071955.463475-1-suganath-prabu.subramani@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: mpi3mr: Read missing IOCFacts flag for reply queue full overflowChandrakanth Patil2-0/+3
commit d373163194982f43b92c552c138c29d9f0b79553 upstream. The driver was not reading the MAX_REQ_PER_REPLY_QUEUE_LIMIT IOCFacts flag, so the reply-queue-full handling was never enabled, even on firmware that supports it. Reading this flag enables the feature and prevents reply queue overflow. Fixes: f08b24d82749 ("scsi: mpi3mr: Avoid reply queue full condition") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com> Link: https://patch.msgid.link/20251211002929.22071-1-chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-02scsi: aic94xx: fix use-after-free in device removal pathJunrui Luo1-0/+3
commit f6ab594672d4cba08540919a4e6be2e202b60007 upstream. The asd_pci_remove() function fails to synchronize with pending tasklets before freeing the asd_ha structure, leading to a potential use-after-free vulnerability. When a device removal is triggered (via hot-unplug or module unload), race condition can occur. The fix adds tasklet_kill() before freeing the asd_ha structure, ensuring all scheduled tasklets complete before cleanup proceeds. Reported-by: Yuhao Jiang <danisjiang@gmail.com> Reported-by: Junrui Luo <moonafterrain@outlook.com> Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver") Cc: stable@vger.kernel.org Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Link: https://patch.msgid.link/ME2PR01MB3156AB7DCACA206C845FC7E8AFFDA@ME2PR01MB3156.ausprd01.prod.outlook.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-02scsi: Revert "scsi: qla2xxx: Perform lockless command completion in abort path"Tony Battersby1-6/+0
commit b57fbc88715b6d18f379463f48a15b560b087ffe upstream. This reverts commit 0367076b0817d5c75dfb83001ce7ce5c64d803a9. The commit being reverted added code to __qla2x00_abort_all_cmds() to call sp->done() without holding a spinlock. But unlike the older code below it, this new code failed to check sp->cmd_type and just assumed TYPE_SRB, which results in a jump to an invalid pointer in target-mode with TYPE_TGT_CMD: qla2xxx [0000:65:00.0]-d034:8: qla24xx_do_nack_work create sess success 0000000009f7a79b qla2xxx [0000:65:00.0]-5003:8: ISP System Error - mbx1=1ff5h mbx2=10h mbx3=0h mbx4=0h mbx5=191h mbx6=0h mbx7=0h. qla2xxx [0000:65:00.0]-d01e:8: -> fwdump no buffer qla2xxx [0000:65:00.0]-f03a:8: qla_target(0): System error async event 0x8002 occurred qla2xxx [0000:65:00.0]-00af:8: Performing ISP error recovery - ha=0000000058183fda. BUG: kernel NULL pointer dereference, address: 0000000000000000 PF: supervisor instruction fetch in kernel mode PF: error_code(0x0010) - not-present page PGD 0 P4D 0 Oops: 0010 [#1] SMP CPU: 2 PID: 9446 Comm: qla2xxx_8_dpc Tainted: G O 6.1.133 #1 Hardware name: Supermicro Super Server/X11SPL-F, BIOS 4.2 12/15/2023 RIP: 0010:0x0 Code: Unable to access opcode bytes at 0xffffffffffffffd6. RSP: 0018:ffffc90001f93dc8 EFLAGS: 00010206 RAX: 0000000000000282 RBX: 0000000000000355 RCX: ffff88810d16a000 RDX: ffff88810dbadaa8 RSI: 0000000000080000 RDI: ffff888169dc38c0 RBP: ffff888169dc38c0 R08: 0000000000000001 R09: 0000000000000045 R10: ffffffffa034bdf0 R11: 0000000000000000 R12: ffff88810800bb40 R13: 0000000000001aa8 R14: ffff888100136610 R15: ffff8881070f7400 FS: 0000000000000000(0000) GS:ffff88bf80080000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 000000010c8ff006 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> ? __die+0x4d/0x8b ? page_fault_oops+0x91/0x180 ? trace_buffer_unlock_commit_regs+0x38/0x1a0 ? exc_page_fault+0x391/0x5e0 ? asm_exc_page_fault+0x22/0x30 __qla2x00_abort_all_cmds+0xcb/0x3e0 [qla2xxx_scst] qla2x00_abort_all_cmds+0x50/0x70 [qla2xxx_scst] qla2x00_abort_isp_cleanup+0x3b7/0x4b0 [qla2xxx_scst] qla2x00_abort_isp+0xfd/0x860 [qla2xxx_scst] qla2x00_do_dpc+0x581/0xa40 [qla2xxx_scst] kthread+0xa8/0xd0 </TASK> Then commit 4475afa2646d ("scsi: qla2xxx: Complete command early within lock") added the spinlock back, because not having the lock caused a race and a crash. But qla2x00_abort_srb() in the switch below already checks for qla2x00_chip_is_down() and handles it the same way, so the code above the switch is now redundant and still buggy in target-mode. Remove it. Cc: stable@vger.kernel.org Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/3a8022dc-bcfd-4b01-9f9b-7a9ec61fa2a3@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-02scsi: scsi_debug: Fix atomic write enable module param descriptionJohn Garry1-1/+1
[ Upstream commit 1f7d6e2efeedd8f545d3e0e9bf338023bf4ea584 ] The atomic write enable module param is "atomic_wr", and not "atomic_write", so fix the module param description. Fixes: 84f3a3c01d70 ("scsi: scsi_debug: Atomic write support") Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251211100651.9056-1-john.g.garry@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: qla2xxx: Use reinit_completion on mbx_intr_compTony Battersby1-0/+2
[ Upstream commit 957aa5974989fba4ae4f807ebcb27f12796edd4d ] If a mailbox command completes immediately after wait_for_completion_timeout() times out, ha->mbx_intr_comp could be left in an inconsistent state, causing the next mailbox command not to wait for the hardware. Fix by reinitializing the completion before use. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/11b6485e-0bfd-4784-8f99-c06a196dad94@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: qla2xxx: Fix initiator mode with qlini_mode=exclusiveTony Battersby1-7/+1
[ Upstream commit 8f58fc64d559b5fda1b0a5e2a71422be61e79ab9 ] When given the module parameter qlini_mode=exclusive, qla2xxx in initiator mode is initially unable to successfully send SCSI commands to devices it finds while scanning, resulting in an escalating series of resets until an adapter reset clears the issue. Fix by checking the active mode instead of the module parameter. Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/1715ec14-ba9a-45dc-9cf2-d41aa6b81b5e@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: qla2xxx: Fix lost interrupts with qlini_mode=disabledTony Battersby4-34/+5
[ Upstream commit 4f6aaade2a22ac428fa99ed716cf2b87e79c9837 ] When qla2xxx is loaded with qlini_mode=disabled, ha->flags.disable_msix_handshake is used before it is set, resulting in the wrong interrupt handler being used on certain HBAs (qla2xxx_msix_rsp_q_hs() is used when qla2xxx_msix_rsp_q() should be used). The only difference between these two interrupt handlers is that the _hs() version writes to a register to clear the "RISC" interrupt, whereas the other version does not. So this bug results in the RISC interrupt being cleared when it should not be. This occasionally causes a different interrupt handler qla24xx_msix_default() for a different vector to see ((stat & HSRX_RISC_INT) == 0) and ignore its interrupt, which then causes problems like: qla2xxx [0000:02:00.0]-d04c:6: MBX Command timeout for cmd 20, iocontrol=8 jiffies=1090c0300 mb[0-3]=[0x4000 0x0 0x40 0xda] mb7 0x500 host_status 0x40000010 hccr 0x3f00 qla2xxx [0000:02:00.0]-101e:6: Mailbox cmd timeout occurred, cmd=0x20, mb[0]=0x20. Scheduling ISP abort (the cmd varies; sometimes it is 0x20, 0x22, 0x54, 0x5a, 0x5d, or 0x6a) This problem can be reproduced with a 16 or 32 Gbps HBA by loading qla2xxx with qlini_mode=disabled and running a high IOPS test while triggering frequent RSCN database change events. While analyzing the problem I discovered that even with disable_msix_handshake forced to 0, it is not necessary to clear the RISC interrupt from qla2xxx_msix_rsp_q_hs() (more below). So just completely remove qla2xxx_msix_rsp_q_hs() and the logic for selecting it, which also fixes the bug with qlini_mode=disabled. The test below describes the justification for not needing qla2xxx_msix_rsp_q_hs(): Force disable_msix_handshake to 0: qla24xx_config_rings(): if (0 && (ha->fw_attributes & BIT_6) && (IS_MSIX_NACK_CAPABLE(ha)) && (ha->flags.msix_enabled)) { In qla24xx_msix_rsp_q() and qla2xxx_msix_rsp_q_hs(), check: (rd_reg_dword(&reg->host_status) & HSRX_RISC_INT) Count the number of calls to each function with HSRX_RISC_INT set and the number with HSRX_RISC_INT not set while performing some I/O. If qla2xxx_msix_rsp_q_hs() clears the RISC interrupt (original code): qla24xx_msix_rsp_q: 50% of calls have HSRX_RISC_INT set qla2xxx_msix_rsp_q_hs: 5% of calls have HSRX_RISC_INT set (# of qla2xxx_msix_rsp_q_hs interrupts) = (# of qla24xx_msix_rsp_q interrupts) * 3 If qla2xxx_msix_rsp_q_hs() does not clear the RISC interrupt (patched code): qla24xx_msix_rsp_q: 100% of calls have HSRX_RISC_INT set qla2xxx_msix_rsp_q_hs: 9% of calls have HSRX_RISC_INT set (# of qla2xxx_msix_rsp_q_hs interrupts) = (# of qla24xx_msix_rsp_q interrupts) * 3 In the case of the original code, qla24xx_msix_rsp_q() was seeing HSRX_RISC_INT set only 50% of the time because qla2xxx_msix_rsp_q_hs() was clearing it when it shouldn't have been. In the patched code, qla24xx_msix_rsp_q() sees HSRX_RISC_INT set 100% of the time, which makes sense if that interrupt handler needs to clear the RISC interrupt (which it does). qla2xxx_msix_rsp_q_hs() sees HSRX_RISC_INT only 9% of the time, which is just overlap from the other interrupt during the high IOPS test. Tested with SCST on: QLE2742 FW:v9.08.02 (32 Gbps 2-port) QLE2694L FW:v9.10.11 (16 Gbps 4-port) QLE2694L FW:v9.08.02 (16 Gbps 4-port) QLE2672 FW:v8.07.12 (16 Gbps 2-port) both initiator and target mode Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/56d378eb-14ad-49c7-bae9-c649b6c7691e@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: lpfc: Fix reusing an ndlp that is marked NLP_DROPPED during FLOGIJustin Tee2-8/+32
[ Upstream commit 07caedc6a3887938813727beafea40f07c497705 ] It's possible for an unstable link to repeatedly bounce allowing a FLOGI retry, but then bounce again forcing an abort of the FLOGI. Ensure that the initial reference count on the FLOGI ndlp is restored in this faulty link scenario. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://patch.msgid.link/20251106224639.139176-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-02scsi: smartpqi: Add support for Hurray Data new controller PCI deviceDavid Strahan1-0/+4
[ Upstream commit 48e6b7e708029cea451e53a8c16fc8c16039ecdc ] Add support for new Hurray Data controller. All entries are in HEX. Add PCI IDs for Hurray Data controllers: VID / DID / SVID / SDID ---- ---- ---- ---- 9005 028f 207d 4840 Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: David Strahan <David.Strahan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: imm: Fix use-after-free bug caused by unfinished delayed workDuoming Zhou1-0/+1
[ Upstream commit ab58153ec64fa3fc9aea09ca09dc9322e0b54a7c ] The delayed work item 'imm_tq' is initialized in imm_attach() and scheduled via imm_queuecommand() for processing SCSI commands. When the IMM parallel port SCSI host adapter is detached through imm_detach(), the imm_struct device instance is deallocated. However, the delayed work might still be pending or executing when imm_detach() is called, leading to use-after-free bugs when the work function imm_interrupt() accesses the already freed imm_struct memory. The race condition can occur as follows: CPU 0(detach thread) | CPU 1 | imm_queuecommand() | imm_queuecommand_lck() imm_detach() | schedule_delayed_work() kfree(dev) //FREE | imm_interrupt() | dev = container_of(...) //USE dev-> //USE Add disable_delayed_work_sync() in imm_detach() to guarantee proper cancellation of the delayed work item before imm_struct is deallocated. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Link: https://patch.msgid.link/20251028100149.40721-1-duoming@zju.edu.cn Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: qla2xxx: Fix improper freeing of purex itemZilin Guan1-1/+1
[ Upstream commit 78b1a242fe612a755f2158fd206ee6bb577d18ca ] In qla2xxx_process_purls_iocb(), an item is allocated via qla27xx_copy_multiple_pkt(), which internally calls qla24xx_alloc_purex_item(). The qla24xx_alloc_purex_item() function may return a pre-allocated item from a per-adapter pool for small allocations, instead of dynamically allocating memory with kzalloc(). An error handling path in qla2xxx_process_purls_iocb() incorrectly uses kfree() to release the item. If the item was from the pre-allocated pool, calling kfree() on it is a bug that can lead to memory corruption. Fix this by using the correct deallocation function, qla24xx_free_purex_item(), which properly handles both dynamically allocated and pre-allocated items. Fixes: 875386b98857 ("scsi: qla2xxx: Add Unsolicited LS Request and Response Support for NVMe") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com> Link: https://patch.msgid.link/20251113151246.762510-1-zilin@seu.edu.cn Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: sim710: Fix resource leak by adding missing ioport_unmap() callsHaotian Zhang1-0/+2
[ Upstream commit acd194d9b5bac419e04968ffa44351afabb50bac ] The driver calls ioport_map() to map I/O ports in sim710_probe_common() but never calls ioport_unmap() to release the mapping. This causes resource leaks in both the error path when request_irq() fails and in the normal device removal path via sim710_device_remove(). Add ioport_unmap() calls in the out_release error path and in sim710_device_remove(). Fixes: 56fece20086e ("[PATCH] finally fix 53c700 to use the generic iomem infrastructure") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251029032555.1476-1-vulab@iscas.ac.cn Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: qla2xxx: Clear cmds after chip resetTony Battersby3-6/+20
[ Upstream commit d46c69a087aa3d1513f7a78f871b80251ea0c1ae ] Commit aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling") caused two problems: 1. Commands sent to FW, after chip reset got stuck and never freed as FW is not going to respond to them anymore. 2. BUG_ON(cmd->sg_mapped) in qlt_free_cmd(). Commit 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands") attempted to fix this, but introduced another bug under different circumstances when two different CPUs were racing to call qlt_unmap_sg() at the same time: BUG_ON(!valid_dma_direction(dir)) in dma_unmap_sg_attrs(). So revert "scsi: qla2xxx: Fix missed DMA unmap for aborted commands" and partially revert "scsi: qla2xxx: target: Fix offline port handling and host reset handling" at __qla2x00_abort_all_cmds. Fixes: aefed3e5548f ("scsi: qla2xxx: target: Fix offline port handling and host reset handling") Fixes: 26f9ce53817a ("scsi: qla2xxx: Fix missed DMA unmap for aborted commands") Co-developed-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com> Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Link: https://patch.msgid.link/0e7e5d26-e7a0-42d1-8235-40eeb27f3e98@cybernetics.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: smartpqi: Fix device resources accessed after device removalMike McGowen1-0/+19
[ Upstream commit b518e86d1a70a88f6592a7c396cf1b93493d1aab ] Correct possible race conditions during device removal. Previously, a scheduled work item to reset a LUN could still execute after the device was removed, leading to use-after-free and other resource access issues. This race condition occurs because the abort handler may schedule a LUN reset concurrently with device removal via sdev_destroy(), leading to use-after-free and improper access to freed resources. - Check in the device reset handler if the device is still present in the controller's SCSI device list before running; if not, the reset is skipped. - Cancel any pending TMF work that has not started in sdev_destroy(). - Ensure device freeing in sdev_destroy() is done while holding the LUN reset mutex to avoid races with ongoing resets. Fixes: 2d80f4054f7f ("scsi: smartpqi: Update deleting a LUN via sysfs") Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://patch.msgid.link/20251106163823.786828-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-18scsi: stex: Fix reboot_notifier leak in probe error pathHaotian Zhang1-0/+1
[ Upstream commit 20da637eb545b04753e20c675cfe97b04c7b600b ] In stex_probe(), register_reboot_notifier() is called at the beginning, but if any subsequent initialization step fails, the function returns without unregistering the notifier, resulting in a resource leak. Add unregister_reboot_notifier() in the out_disable error path to ensure proper cleanup on all failure paths. Fixes: 61b745fa63db ("scsi: stex: Add S6 support") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251104094847.270-1-vulab@iscas.ac.cn Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-20scsi: sg: Do not sleep in atomic contextBart Van Assche1-1/+9
sg_finish_rem_req() calls blk_rq_unmap_user(). The latter function may sleep. Hence, call sg_finish_rem_req() with interrupts enabled instead of disabled. Reported-by: syzbot+c01f8e6e73f20459912e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-scsi/691560c4.a70a0220.3124cb.001a.GAE@google.com/ Cc: Hannes Reinecke <hare@suse.de> Cc: stable@vger.kernel.org Fixes: 97d27b0dd015 ("scsi: sg: close race condition in sg_remove_sfp_usercontext()") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251113181643.1108973-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-22scsi: core: Fix the unit attention counter implementationBart Van Assche1-2/+2
scsi_decide_disposition() may call scsi_check_sense(). scsi_decide_disposition() calls are not serialized. Hence, counter updates by scsi_check_sense() must be serialized. Hence this patch that makes the counters updated by scsi_check_sense() atomic. Cc: Kai Mäkisara <Kai.Makisara@kolumbus.fi> Fixes: a5d518cd4e3e ("scsi: core: Add counters for New Media and Power On/Reset UNIT ATTENTIONs") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Link: https://patch.msgid.link/20251014220244.3689508-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-20scsi: core: Fix a regression triggered by scsi_host_busy()Bart Van Assche1-2/+3
Commit 995412e23bb2 ("blk-mq: Replace tags->lock with SRCU for tag iterators") introduced the following regression: Call trace: __srcu_read_lock+0x30/0x80 (P) blk_mq_tagset_busy_iter+0x44/0x300 scsi_host_busy+0x38/0x70 ufshcd_print_host_state+0x34/0x1bc ufshcd_link_startup.constprop.0+0xe4/0x2e0 ufshcd_init+0x944/0xf80 ufshcd_pltfrm_init+0x504/0x820 ufs_rockchip_probe+0x2c/0x88 platform_probe+0x5c/0xa4 really_probe+0xc0/0x38c __driver_probe_device+0x7c/0x150 driver_probe_device+0x40/0x120 __driver_attach+0xc8/0x1e0 bus_for_each_dev+0x7c/0xdc driver_attach+0x24/0x30 bus_add_driver+0x110/0x230 driver_register+0x68/0x130 __platform_driver_register+0x20/0x2c ufs_rockchip_pltform_init+0x1c/0x28 do_one_initcall+0x60/0x1e0 kernel_init_freeable+0x248/0x2c4 kernel_init+0x20/0x140 ret_from_fork+0x10/0x20 Fix this regression by making scsi_host_busy() check whether the SCSI host tag set has already been initialized. tag_set->ops is set by scsi_mq_setup_tags() just before blk_mq_alloc_tag_set() is called. This fix is based on the assumption that scsi_host_busy() and scsi_mq_setup_tags() calls are serialized. This is the case in the UFS driver. Reported-by: Sebastian Reichel <sebastian.reichel@collabora.com> Closes: https://lore.kernel.org/linux-block/pnezafputodmqlpumwfbn644ohjybouveehcjhz2hmhtcf2rka@sdhoiivync4y/ Cc: Ming Lei <ming.lei@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Ming Lei <ming.lei@redhat.com> Tested-by: Sebastian Reichel <sebastian.reichel@collabora.com> Link: https://patch.msgid.link/20251007214800.1678255-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-13Merge branch '6.18/scsi-queue' into 6.18/scsi-fixesMartin K. Petersen3-56/+50
Pull in outstanding SCSI fixes for 6.18. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-11Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds5-19/+17
Pull SCSI fixes from James Bottomley: "Fixes only in drivers (ufs, mvsas, qla2xxx, target) that came in just before or during the merge window. The most important one is the qla2xxx which reverts a conversion to fix flexible array member warnings, that went up in this merge window but which turned out on further testing to be causing data corruption" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: core: Include UTP error in INT_FATAL_ERRORS scsi: ufs: sysfs: Make HID attributes visible scsi: mvsas: Fix use-after-free bugs in mvs_work_queue scsi: ufs: core: Fix PM QoS mutex initialization scsi: ufs: core: Fix runtime suspend error deadlock Revert "scsi: qla2xxx: Fix memcpy() field-spanning write issue" scsi: target: target_core_configfs: Add length check to avoid buffer overflow
2025-10-07Merge tag 'hyperv-next-signed-20251006' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Unify guest entry code for KVM and MSHV (Sean Christopherson) - Switch Hyper-V MSI domain to use msi_create_parent_irq_domain() (Nam Cao) - Add CONFIG_HYPERV_VMBUS and limit the semantics of CONFIG_HYPERV (Mukesh Rathor) - Add kexec/kdump support on Azure CVMs (Vitaly Kuznetsov) - Deprecate hyperv_fb in favor of Hyper-V DRM driver (Prasanna Kumar T S M) - Miscellaneous enhancements, fixes and cleanups (Abhishek Tiwari, Alok Tiwari, Nuno Das Neves, Wei Liu, Roman Kisel, Michael Kelley) * tag 'hyperv-next-signed-20251006' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: hyperv: Remove the spurious null directive line MAINTAINERS: Mark hyperv_fb driver Obsolete fbdev/hyperv_fb: deprecate this in favor of Hyper-V DRM driver Drivers: hv: Make CONFIG_HYPERV bool Drivers: hv: Add CONFIG_HYPERV_VMBUS option Drivers: hv: vmbus: Fix typos in vmbus_drv.c Drivers: hv: vmbus: Fix sysfs output format for ring buffer index Drivers: hv: vmbus: Clean up sscanf format specifier in target_cpu_store() x86/hyperv: Switch to msi_create_parent_irq_domain() mshv: Use common "entry virt" APIs to do work in root before running guest entry: Rename "kvm" entry code assets to "virt" to genericize APIs entry/kvm: KVM: Move KVM details related to signal/-EINTR into KVM proper mshv: Handle NEED_RESCHED_LAZY before transferring to guest x86/hyperv: Add kexec/kdump support on Azure CVMs Drivers: hv: Simplify data structures for VMBus channel close message Drivers: hv: util: Cosmetic changes for hv_utils_transport.c mshv: Add support for a new parent partition configuration clocksource: hyper-v: Skip unnecessary checks for the root partition hyperv: Add missing field to hv_output_map_device_interrupt
2025-10-07scsi: libfc: Prevent integer overflow in fc_fcp_recv_data()Dan Carpenter1-1/+1
The "offset" comes from the skb->data that we received. Here the code is verifying that "offset + len" is within bounds however it does not take integer overflows into account. Use size_add() to be safe. This would only be an issue on 32bit systems which are probably a very small percent of the users. Still, it's worth fixing just for correctness sake. Fixes: 42e9a92fe6a9 ("[SCSI] libfc: A modular Fibre Channel library") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Message-Id: <aNvPMet7TPtM9CY1@stanley.mountain> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-07scsi: qla4xxx: Fix typos in commentsAlok Tiwari1-4/+4
Fix several spelling mistakes in qla4xxx driver comments: "Unfortunely" -> "Unfortunately" "becase" -> "because" "funcions" -> "functions" "targer_id" -> "target_id" Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-07scsi: storvsc: Prefer returning channel with the same CPU as on the I/O ↵Long Li1-51/+45
issuing CPU When selecting an outgoing channel for I/O, storvsc tries to select a channel with a returning CPU that is not the same as issuing CPU. This worked well in the past, however it doesn't work well when the Hyper-V exposes a large number of channels (up to the number of all CPUs). Use a different CPU for returning channel is not efficient on Hyper-V. Change this behavior by preferring to the channel with the same CPU as the current I/O issuing CPU whenever possible. Tests have shown improvements in newer Hyper-V/Azure environment, and no regression with older Hyper-V/Azure environments. Tested-by: Raheel Abdul Faizy <rabdulfaizy@microsoft.com> Signed-off-by: Long Li <longli@microsoft.com> Message-Id: <1759381530-7414-1-git-send-email-longli@linux.microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-10-06Merge tag 'pci-v6.18-changes' of ↵Linus Torvalds2-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Add PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() macros that take config space accessor functions. Implement pci_find_capability(), pci_find_ext_capability(), and dwc, dwc endpoint, and cadence capability search interfaces with them (Hans Zhang) - Leave parent unit address 0 in 'interrupt-map' so that when we build devicetree nodes to describe PCI functions that contain multiple peripherals, we can build this property even when interrupt controllers lack 'reg' properties (Lorenzo Pieralisi) - Add a Xeon 6 quirk to disable Extended Tags and limit Max Read Request Size to 128B to avoid a performance issue (Ilpo Järvinen) - Add sysfs 'serial_number' file to expose the Device Serial Number (Matthew Wood) - Fix pci_acpi_preserve_config() memory leak (Nirmoy Das) Resource management: - Align m68k pcibios_enable_device() with other arches (Ilpo Järvinen) - Remove sparc pcibios_enable_device() implementations that don't do anything beyond what pci_enable_resources() does (Ilpo Järvinen) - Remove mips pcibios_enable_resources() and use pci_enable_resources() instead (Ilpo Järvinen) - Clean up bridge window sizing and assignment (Ilpo Järvinen), including: - Leave non-claimed bridge windows disabled - Enable bridges even if a window wasn't assigned because not all windows are required by downstream devices - Preserve bridge window type when releasing the resource, since the type is needed for reassignment - Consolidate selection of bridge windows into two new interfaces, pbus_select_window() and pbus_select_window_for_type(), so this is done consistently - Compute bridge window start and end earlier to avoid logging stale information MSI: - Add quirk to disable MSI on RDC PCI to PCIe bridges (Marcos Del Sol Vives) Error handling: - Align AER with EEH by allowing drivers to request a Bus Reset on Non-Fatal Errors (in addition to the reset on Fatal Errors that we already do) (Lukas Wunner) - If error recovery fails, emit FAILED_RECOVERY uevents for the devices, not for the bridge leading to them. This makes them correspond to BEGIN_RECOVERY uevents (Lukas Wunner) - Align AER with EEH by calling err_handler.error_detected() callbacks to notify drivers if error recovery fails (Lukas Wunner) - Align AER with EEH by restoring device error_state to pci_channel_io_normal before the err_handler.slot_reset() callback. This is earlier than before the err_handler.resume() callback (Lukas Wunner) - Emit a BEGIN_RECOVERY uevent when driver's err_handler.error_detected() requests a reset, as well as when it says recovery is complete or can be done without a reset (Niklas Schnelle) - Align s390 with AER and EEH by emitting uevents during error recovery (Niklas Schnelle) - Align EEH with AER and s390 by emitting BEGIN_RECOVERY, SUCCESSFUL_RECOVERY, or FAILED_RECOVERY uevents depending on the result of err_handler.error_detected() (Niklas Schnelle) - Fix a NULL pointer dereference in aer_ratelimit() when ACPI GHES error information identifies a device without an AER Capability (Breno Leitao) - Update error decoding and TLP Log printing for new errors in current PCIe base spec (Lukas Wunner) - Update error recovery documentation to match the current code and use consistent nomenclature (Lukas Wunner) ASPM: - Enable all ClockPM and ASPM states for devicetree platforms, since there's typically no firmware that enables ASPM This is a risky change that may uncover hardware or configuration defects at boot-time rather than when users enable ASPM via sysfs later. Booting with "pcie_aspm=off" prevents this enabling (Manivannan Sadhasivam) - Remove the qcom code that enabled ASPM (Manivannan Sadhasivam) Power management: - If a device has already been disconnected, e.g., by a hotplug removal, don't bother trying to resume it to D0 when detaching the driver. This avoids annoying "Unable to change power state from D3cold to D0" messages (Mario Limonciello) - Ensure devices are powered up before config reads for 'max_link_width', 'current_link_speed', 'current_link_width', 'secondary_bus_number', and 'subordinate_bus_number' sysfs files. This prevents using invalid data (~0) in drivers or lspci and, depending on how the PCIe controller reports errors, may avoid error interrupts or crashes (Brian Norris) Virtualization: - Add rescan/remove locking when enabling/disabling SR-IOV, which avoids list corruption on s390, where disabling SR-IOV also generates hotplug events (Niklas Schnelle) Peer-to-peer DMA: - Free struct p2p_pgmap, not a member within it, in the pci_p2pdma_add_resource() error path (Sungho Kim) Endpoint framework: - Document sysfs interface for BAR assignment of vNTB endpoint functions (Jerome Brunet) - Fix array underflow in endpoint BAR test case (Dan Carpenter) - Skip endpoint IRQ test if the IRQ is out of range to avoid false errors (Christian Bruel) - Fix endpoint test case for controllers with fixed-size BARs smaller than requested by the test (Marek Vasut) - Restore inbound translation when disabling doorbell so the endpoint doorbell test case can be run more than once (Niklas Cassel) - Avoid a NULL pointer dereference when releasing DMA channels in endpoint DMA test case (Shin'ichiro Kawasaki) - Convert tegra194 interrupt number to MSI vector to fix endpoint Kselftest MSI_TEST test case (Niklas Cassel) - Reset tegra194 BARs when running in endpoint mode so the BAR tests don't overwrite the ATU settings in BAR4 (Niklas Cassel) - Handle errors in tegra194 BPMP transactions so we don't mistakenly skip future PERST# assertion (Vidya Sagar) AMD MDB PCIe controller driver: - Update DT binding example to separate PERST# to a Root Port stanza to make multiple Root Ports possible in the future (Sai Krishna Musham) - Add driver support for PERST# being described in a Root Port stanza, falling back to the host bridge if not found there (Sai Krishna Musham) Freescale i.MX6 PCIe controller driver: - Enable the 3.3V Vaux supply if available so devices can request wakeup with either Beacon or WAKE# (Richard Zhu) MediaTek PCIe Gen3 controller driver: - Add optional sys clock ready time setting to avoid sys_clk_rdy signal glitching in MT6991 and MT8196 (AngeloGioacchino Del Regno) - Add DT binding and driver support for MT6991 and MT8196 (AngeloGioacchino Del Regno) NVIDIA Tegra PCIe controller driver: - When asserting PERST#, disable the controller instead of mistakenly disabling the PLL twice (Nagarjuna Kristam) - Convert struct tegra_msi mask_lock to raw spinlock to avoid a lock nesting error (Marek Vasut) Qualcomm PCIe controller driver: - Select PCI Power Control Slot driver so slot voltage rails can be turned on/off if described in Root Port devicetree node (Qiang Yu) - Parse only PCI bridge child nodes in devicetree, skipping unrelated nodes such as OPP (Operating Performance Points), which caused probe failures (Krishna Chaitanya Chundru) - Add 8.0 GT/s and 32.0 GT/s equalization settings (Ziyue Zhang) - Consolidate Root Port 'phy' and 'reset' properties in struct qcom_pcie_port, regardless of whether we got them from the Root Port node or the host bridge node (Manivannan Sadhasivam) - Fetch and map the ELBI register space in the DWC core rather than in each driver individually (Krishna Chaitanya Chundru) - Enable ECAM mechanism in DWC core by setting up iATU with 'CFG Shift Feature' and use this in the qcom driver (Krishna Chaitanya Chundru) - Add SM8750 compatible to qcom,pcie-sm8550.yaml (Krishna Chaitanya Chundru) - Update qcom,pcie-x1e80100.yaml to allow fifth PCIe host on Qualcomm Glymur, which is compatible with X1E80100 but doesn't have the cnoc_sf_axi clock (Qiang Yu) Renesas R-Car PCIe controller driver: - Fix a typo that prevented correct PHY initialization (Marek Vasut) - Add a missing 1ms delay after PWR reset assertion as required by the V4H manual (Marek Vasut) - Assure reset has completed before DBI access to avoid SError (Marek Vasut) - Fix inverted PHY initialization check, which sometimes led to timeouts and failure to start the controller (Marek Vasut) - Pass the correct IRQ domain to generic_handle_domain_irq() to fix a regression when converting to msi_create_parent_irq_domain() (Claudiu Beznea) - Drop the spinlock protecting the PMSR register - it's no longer required since pci_lock already serializes accesses (Marek Vasut) - Convert struct rcar_msi mask_lock to raw spinlock to avoid a lock nesting error (Marek Vasut) SOPHGO PCIe controller driver: - Check for existence of struct cdns_pcie.ops before using it to allow Cadence drivers that don't need to supply ops (Chen Wang) - Add DT binding and driver for the SOPHGO SG2042 PCIe controller (Chen Wang) STMicroelectronics STM32MP25 PCIe controller driver: - Update pinctrl documentation of initial states and use in runtime suspend/resume (Christian Bruel) - Add pinctrl_pm_select_init_state() for use by stm32 driver, which needs it during resume (Christian Bruel) - Add devicetree bindings and drivers for the STMicroelectronics STM32MP25 in host and endpoint modes (Christian Bruel) Synopsys DesignWare PCIe controller driver: - Add support for x16 in devicetree 'num-lanes' property (Konrad Dybcio) - Verify that if DT specifies a single IRQ for all eDMA channels, it is named 'dma' (Niklas Cassel) TI J721E PCIe driver: - Add MODULE_DEVICE_TABLE() so driver can be autoloaded (Siddharth Vadapalli) - Power controller off before configuring the glue layer so the controller latches the correct values on power-on (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Use devm_request_irq() so 'ks-pcie-error-irq' is freed when driver exits with error (Siddharth Vadapalli) - Add Peripheral Virtualization Unit (PVU), which restricts DMA from PCIe devices to specific regions of host memory, to the ti,am65 binding (Jan Kiszka) Xilinx NWL PCIe controller driver: - Clear bootloader E_ECAM_CONTROL before merging in the new driver value to avoid writing invalid values (Jani Nurminen)" * tag 'pci-v6.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (141 commits) PCI/AER: Avoid NULL pointer dereference in aer_ratelimit() MAINTAINERS: Add entry for ST STM32MP25 PCIe drivers PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25 dt-bindings: PCI: Add STM32MP25 PCIe Endpoint bindings PCI: stm32: Add PCIe host support for STM32MP25 PCI: xilinx-nwl: Fix ECAM programming PCI: j721e: Fix incorrect error message in probe() PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit dt-bindings: PCI: qcom,pcie-x1e80100: Set clocks minItems for the fifth Glymur PCIe Controller PCI: dwc: Support 16-lane operation PCI: Add lockdep assertion in pci_stop_and_remove_bus_device() PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlock PCI: tegra194: Rename 'root_bus' to 'root_port_bus' in tegra_pcie_downstream_dev_to_D0() PCI: tegra: Convert struct tegra_msi mask_lock into raw spinlock PCI: rcar-gen4: Fix inverted break condition in PHY initialization PCI: rcar-gen4: Assure reset occurs before DBI access PCI: rcar-gen4: Add missing 1ms delay after PWR reset assertion PCI: Set up bridge resources earlier PCI: rcar-host: Drop PMSR spinlock ...
2025-10-04Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds55-715/+559
Pull SCSI updates from James Bottomley: "Usual driver updates (ufs, mpi3mr, lpfc, pm80xx, mpt3sas) plus assorted cleanups and fixes. The only core update is to sd.c and is mostly cosmetic" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (105 commits) scsi: MAINTAINERS: Update FC element owners scsi: mpt3sas: Update driver version to 54.100.00.00 scsi: mpt3sas: Add support for 22.5 Gbps SAS link rate scsi: mpt3sas: Suppress unnecessary IOCLogInfo on CONFIG_INVALID_PAGE scsi: mpt3sas: Fix crash in transport port remove by using ioc_info() scsi: ufs: ufs-qcom: Add support for limiting HS gear and rate scsi: ufs: pltfrm: Add DT support to limit HS gear and gear rate scsi: ufs: ufs-qcom: Remove redundant re-assignment to hs_rate scsi: ufs: dt-bindings: Document gear and rate limit properties scsi: ufs: core: Fix data race in CPU latency PM QoS request handling scsi: libfc: Fix potential buffer overflow in fc_ct_ms_fill() scsi: storvsc: Remove redundant ternary operators scsi: ufs: core: Change MCQ interrupt enable flow scsi: smartpqi: Replace kmalloc() + copy_from_user() with memdup_user() scsi: hpsa: Replace kmalloc() + copy_from_user() with memdup_user() scsi: hpsa: Fix potential memory leak in hpsa_big_passthru_ioctl() scsi: lpfc: Copyright updates for 14.4.0.11 patches scsi: lpfc: Update lpfc version to 14.4.0.11 scsi: lpfc: Convert debugfs directory counts from atomic to unsigned int scsi: lpfc: Clean up extraneous phba dentries ...
2025-10-03Merge tag 'mm-stable-2025-10-01-19-00' of ↵Linus Torvalds2-4/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page*() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...
2025-10-02Merge tag 'for-6.18/block-20250929' of ↵Linus Torvalds38-65/+65
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - NVMe pull request via Keith: - FC target fixes (Daniel) - Authentication fixes and updates (Martin, Chris) - Admin controller handling (Kamaljit) - Target lockdep assertions (Max) - Keep-alive updates for discovery (Alastair) - Suspend quirk (Georg) - MD pull request via Yu: - Add support for a lockless bitmap. A key feature for the new bitmap are that the IO fastpath is lockless. If a user issues lots of write IO to the same bitmap bit in a short time, only the first write has additional overhead to update bitmap bit, no additional overhead for the following writes. By supporting only resync or recover written data, means in the case creating new array or replacing with a new disk, there is no need to do a full disk resync/recovery. - Switch ->getgeo() and ->bios_param() to using struct gendisk rather than struct block_device. - Rust block changes via Andreas. This series adds configuration via configfs and remote completion to the rnull driver. The series also includes a set of changes to the rust block device driver API: a few cleanup patches, and a few features supporting the rnull changes. The series removes the raw buffer formatting logic from `kernel::block` and improves the logic available in `kernel::string` to support the same use as the removed logic. - floppy arch cleanups - Reduce the number of dereferencing needed for ublk commands - Restrict supported sockets for nbd. Mostly done to eliminate a class of issues perpetually reported by syzbot, by using nonsensical socket setups. - A few s390 dasd block fixes - Fix a few issues around atomic writes - Improve DMA interation for integrity requests - Improve how iovecs are treated with regards to O_DIRECT aligment constraints. We used to require each segment to adhere to the constraints, now only the request as a whole needs to. - Clean up and improve p2p support, enabling use of p2p for metadata payloads - Improve locking of request lookup, using SRCU where appropriate - Use page references properly for brd, avoiding very long RCU sections - Fix ordering of recursively submitted IOs - Clean up and improve updating nr_requests for a live device - Various fixes and cleanups * tag 'for-6.18/block-20250929' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (164 commits) s390/dasd: enforce dma_alignment to ensure proper buffer validation s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request ublk: remove redundant zone op check in ublk_setup_iod() nvme: Use non zero KATO for persistent discovery connections nvmet: add safety check for subsys lock nvme-core: use nvme_is_io_ctrl() for I/O controller check nvme-core: do ioccsz/iorcsz validation only for I/O controllers nvme-core: add method to check for an I/O controller blk-cgroup: fix possible deadlock while configuring policy blk-mq: fix null-ptr-deref in blk_mq_free_tags() from error path blk-mq: Fix more tag iteration function documentation selftests: ublk: fix behavior when fio is not installed ublk: don't access ublk_queue in ublk_unmap_io() ublk: pass ublk_io to __ublk_complete_rq() ublk: don't access ublk_queue in ublk_need_complete_req() ublk: don't access ublk_queue in ublk_check_commit_and_fetch() ublk: don't pass ublk_queue to ublk_fetch() ublk: don't access ublk_queue in ublk_config_io_buf() ublk: don't access ublk_queue in ublk_check_fetch_buf() ublk: pass q_id and tag to __ublk_check_and_get_req() ...
2025-10-01Drivers: hv: Add CONFIG_HYPERV_VMBUS optionMukesh Rathor1-1/+1
At present VMBus driver is hinged off of CONFIG_HYPERV which entails lot of builtin code and encompasses too much. It's not always clear what depends on builtin hv code and what depends on VMBus. Setting CONFIG_HYPERV as a module and fudging the Makefile to switch to builtin adds even more confusion. VMBus is an independent module and should have its own config option. Also, there are scenarios like baremetal dom0/root where support is built in with CONFIG_HYPERV but without VMBus. Lastly, there are more features coming down that use CONFIG_HYPERV and add more dependencies on it. So, create a fine grained HYPERV_VMBUS option and update Kconfigs for dependency on VMBus. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-09-30scsi: mvsas: Fix use-after-free bugs in mvs_work_queueDuoming Zhou1-1/+1
During the detaching of Marvell's SAS/SATA controller, the original code calls cancel_delayed_work() in mvs_free() to cancel the delayed work item mwq->work_q. However, if mwq->work_q is already running, the cancel_delayed_work() may fail to cancel it. This can lead to use-after-free scenarios where mvs_free() frees the mvs_info while mvs_work_queue() is still executing and attempts to access the already-freed mvs_info. A typical race condition is illustrated below: CPU 0 (remove) | CPU 1 (delayed work callback) mvs_pci_remove() | mvs_free() | mvs_work_queue() cancel_delayed_work() | kfree(mvi) | | mvi-> // UAF Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the delayed work item is properly canceled and any executing delayed work item completes before the mvs_info is deallocated. This bug was found by static analysis. Fixes: 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-29Revert "scsi: qla2xxx: Fix memcpy() field-spanning write issue"John Meneghini4-18/+16
This reverts commit 6f4b10226b6b1e7d1ff3cdb006cf0f6da6eed71e. We've been testing this patch and it turns out there is a significant bug here. This leaks memory and causes a driver hang. Link: https://lore.kernel.org/linux-scsi/yq1zfajqpec.fsf@ca-mkp.ca.oracle.com/ Signed-off-by: John Meneghini <jmeneghi@redhat.com> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25Merge patch series "mpt3sas: Few Enhancements and minor fixes"Martin K. Petersen3-8/+15
Ranjan Kumar <ranjan.kumar@broadcom.com> says: Few Enhancements and minor fixes of mpt3sas driver. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: mpt3sas: Update driver version to 54.100.00.00Ranjan Kumar1-2/+2
Updated driver version to 54.100.00.00. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Message-Id: <20250922095113.281484-5-ranjan.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: mpt3sas: Add support for 22.5 Gbps SAS link rateRanjan Kumar1-0/+3
Add handling for MPI26_SAS_NEG_LINK_RATE_22_5 in _transport_convert_phy_link_rate(). This maps the new 22.5 Gbps negotiated rate to SAS_LINK_RATE_22_5_GBPS, to get correct PHY link speeds. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Message-Id: <20250922095113.281484-4-ranjan.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: mpt3sas: Suppress unnecessary IOCLogInfo on CONFIG_INVALID_PAGERanjan Kumar1-1/+7
Avoid unconditional IOCLogInfo prints for CONFIG_INVALID_PAGE. Log only if MPT_DEBUG_REPLY is enabled or when loginfo represents other errors. This reduces uncessary logging without losing useful error reporting. Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: mpt3sas: Fix crash in transport port remove by using ioc_info()Ranjan Kumar1-5/+3
During mpt3sas_transport_port_remove(), messages were logged with dev_printk() against &mpt3sas_port->port->dev. At this point the SAS transport device may already be partially unregistered or freed, leading to a crash when accessing its struct device. Using ioc_info(), which logs via the PCI device (ioc->pdev->dev), guaranteed to remain valid until driver removal. [83428.295776] Oops: general protection fault, probably for non-canonical address 0x6f702f323a33312d: 0000 [#1] SMP NOPTI [83428.295785] CPU: 145 UID: 0 PID: 113296 Comm: rmmod Kdump: loaded Tainted: G OE 6.16.0-rc1+ #1 PREEMPT(voluntary) [83428.295792] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [83428.295795] Hardware name: Dell Inc. Precision 7875 Tower/, BIOS 89.1.67 02/23/2024 [83428.295799] RIP: 0010:__dev_printk+0x1f/0x70 [83428.295805] Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 49 89 d1 48 85 f6 74 52 4c 8b 46 50 4d 85 c0 74 1f 48 8b 46 68 48 85 c0 74 22 <48> 8b 08 0f b6 7f 01 48 c7 c2 db e8 42 ad 83 ef 30 e9 7b f8 ff ff [83428.295813] RSP: 0018:ff85aeafc3137bb0 EFLAGS: 00010206 [83428.295817] RAX: 6f702f323a33312d RBX: ff4290ee81292860 RCX: 5000cca25103be32 [83428.295820] RDX: ff85aeafc3137bb8 RSI: ff4290eeb1966c00 RDI: ffffffffc1560845 [83428.295823] RBP: ff85aeafc3137c18 R08: 74726f702f303a33 R09: ff85aeafc3137bb8 [83428.295826] R10: ff85aeafc3137b18 R11: ff4290f5bd60fe68 R12: ff4290ee81290000 [83428.295830] R13: ff4290ee6e345de0 R14: ff4290ee81290000 R15: ff4290ee6e345e30 [83428.295833] FS: 00007fd9472a6740(0000) GS:ff4290f5ce96b000(0000) knlGS:0000000000000000 [83428.295837] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [83428.295840] CR2: 00007f242b4db238 CR3: 00000002372b8006 CR4: 0000000000771ef0 [83428.295844] PKRU: 55555554 [83428.295846] Call Trace: [83428.295848] <TASK> [83428.295850] _dev_printk+0x5c/0x80 [83428.295857] ? srso_alias_return_thunk+0x5/0xfbef5 [83428.295863] mpt3sas_transport_port_remove+0x1c7/0x420 [mpt3sas] [83428.295882] _scsih_remove_device+0x21b/0x280 [mpt3sas] [83428.295894] ? _scsih_expander_node_remove+0x108/0x140 [mpt3sas] [83428.295906] ? srso_alias_return_thunk+0x5/0xfbef5 [83428.295910] mpt3sas_device_remove_by_sas_address.part.0+0x8f/0x110 [mpt3sas] [83428.295921] _scsih_expander_node_remove+0x129/0x140 [mpt3sas] [83428.295933] _scsih_expander_node_remove+0x6a/0x140 [mpt3sas] [83428.295944] scsih_remove+0x3f0/0x4a0 [mpt3sas] [83428.295957] pci_device_remove+0x3b/0xb0 [83428.295962] device_release_driver_internal+0x193/0x200 [83428.295968] driver_detach+0x44/0x90 [83428.295971] bus_remove_driver+0x69/0xf0 [83428.295975] pci_unregister_driver+0x2a/0xb0 [83428.295979] _mpt3sas_exit+0x1f/0x300 [mpt3sas] [83428.295991] __do_sys_delete_module.constprop.0+0x174/0x310 [83428.295997] ? srso_alias_return_thunk+0x5/0xfbef5 [83428.296000] ? __x64_sys_getdents64+0x9a/0x110 [83428.296005] ? srso_alias_return_thunk+0x5/0xfbef5 [83428.296009] ? syscall_trace_enter+0xf6/0x1b0 [83428.296014] do_syscall_64+0x7b/0x2c0 [83428.296019] ? srso_alias_return_thunk+0x5/0xfbef5 [83428.296023] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: f92363d12359 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS") Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: libfc: Fix potential buffer overflow in fc_ct_ms_fill()Alok Tiwari1-1/+1
The fc_ct_ms_fill() helper currently formats the OS name and version into entry->value using "%s v%s". Since init_utsname()->sysname and ->release are unbounded strings, snprintf() may attempt to write more than FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN bytes, triggering a -Wformat-truncation warning with W=1. In file included from drivers/scsi/libfc/fc_elsct.c:18: drivers/scsi/libfc/fc_encode.h: In function ‘fc_ct_ms_fill.constprop’: drivers/scsi/libfc/fc_encode.h:359:30: error: ‘%s’ directive output may be truncated writing up to 64 bytes into a region of size between 62 and 126 [-Werror=format-truncation=] 359 | "%s v%s", | ^~ 360 | init_utsname()->sysname, 361 | init_utsname()->release); | ~~~~~~~~~~~~~~~~~~~~~~~ drivers/scsi/libfc/fc_encode.h:357:17: note: ‘snprintf’ output between 3 and 131 bytes into a destination of size 128 357 | snprintf((char *)&entry->value, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 358 | FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN, | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 359 | "%s v%s", | ~~~~~~~~~ 360 | init_utsname()->sysname, | ~~~~~~~~~~~~~~~~~~~~~~~~ 361 | init_utsname()->release); | ~~~~~~~~~~~~~~~~~~~~~~~~ Fix this by using "%.62s v%.62s", which ensures sysname and release are truncated to fit within the 128-byte field defined by FC_FDMI_HBA_ATTR_OSNAMEVERSION_LEN. [mkp: clarified commit description] Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: storvsc: Remove redundant ternary operatorsLiao Yuanhong1-2/+2
Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: smartpqi: Replace kmalloc() + copy_from_user() with memdup_user()Thorsten Blum1-9/+8
Replace kmalloc() followed by copy_from_user() with memdup_user() to simplify and improve pqi_passthru_ioctl(). Since memdup_user() already allocates memory, use kzalloc() in the else branch instead of manually zeroing 'kernel_buffer' using memset(0). Return early if an error occurs. No functional changes intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Don Brace <don.brace@microchip.com> Message-Id: <20250922201832.1697874-2-thorsten.blum@linux.dev> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: hpsa: Replace kmalloc() + copy_from_user() with memdup_user()Thorsten Blum1-11/+6
Replace kmalloc() followed by copy_from_user() with memdup_user() to improve and simplify hpsa_passthru_ioctl(). Since memdup_user() already allocates memory, use kzalloc() in the else branch instead of manually zeroing 'buff' using memset(0). Return early if an error occurs and remove the 'out_kfree' label. No functional changes intended. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-25scsi: hpsa: Fix potential memory leak in hpsa_big_passthru_ioctl()Thorsten Blum1-9/+12
Replace kmalloc() followed by copy_from_user() with memdup_user() to fix a memory leak that occurs when copy_from_user(buff[sg_used],,) fails and the 'cleanup1:' path does not free the memory for 'buff[sg_used]'. Using memdup_user() avoids this by freeing the memory internally. Since memdup_user() already allocates memory, use kzalloc() in the else branch instead of manually zeroing 'buff[sg_used]' using memset(0). Cc: stable@vger.kernel.org Fixes: edd163687ea5 ("[SCSI] hpsa: add driver for HP Smart Array controllers.") Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Acked-by: Don Brace <don.brace@microchip.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-09-24Merge patch series "Add generated modalias to modules.builtin.modinfo"Nathan Chancellor1-3/+1
Alexey Gladkov says: The modules.builtin.modinfo file is used by userspace (kmod to be specific) to get information about builtin modules. Among other information about the module, information about module aliases is stored. This is very important to determine that a particular modalias will be handled by a module that is inside the kernel. There are several mechanisms for creating modalias for modules: The first is to explicitly specify the MODULE_ALIAS of the macro. In this case, the aliases go into the '.modinfo' section of the module if it is compiled separately or into vmlinux.o if it is builtin into the kernel. The second is the use of MODULE_DEVICE_TABLE followed by the use of the modpost utility. In this case, vmlinux.o no longer has this information and does not get it into modules.builtin.modinfo. For example: $ modinfo pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 modinfo: ERROR: Module pci:v00008086d0000A36Dsv00001043sd00008694bc0Csc03i30 not found. $ modinfo xhci_pci name: xhci_pci filename: (builtin) license: GPL file: drivers/usb/host/xhci-pci description: xHCI PCI Host Controller Driver The builtin module is missing alias "pci:v*d*sv*sd*bc0Csc03i30*" which will be generated by modpost if the module is built separately. To fix this it is necessary to add the generated by modpost modalias to modules.builtin.modinfo. Fortunately modpost already generates .vmlinux.export.c for exported symbols. It is possible to add `.modinfo` for builtin modules and modify the build system so that `.modinfo` section is extracted from the intermediate vmlinux after modpost is executed. Link: https://patch.msgid.link/cover.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-24scsi: Always define blogic_pci_tbl structureAlexey Gladkov1-3/+1
The blogic_pci_tbl structure is used by the MODULE_DEVICE_TABLE macro. There is no longer a need to protect it with the MODULE condition, since this no longer causes the compiler to warn about an unused variable. To avoid warnings when -Wunused-const-variable option is used, mark it as __maybe_unused for such configuration. Cc: Khalid Aziz <khalid@gonehiking.org> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: linux-scsi@vger.kernel.org Suggested-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Alexey Gladkov <legion@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://patch.msgid.link/fd8e30de07de79a4923ae967eaee5ba2f2fcef00.1758182101.git.legion@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2025-09-22scsi: sg: drop nth_page() usage within SG entryDavid Hildenbrand1-2/+1
It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. Link: https://lkml.kernel.org/r/20250901150359.867252-32-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: Doug Gilbert <dgilbert@interlog.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-22scsi: scsi_lib: drop nth_page() usage within SG entryDavid Hildenbrand1-2/+1
It's no longer required to use nth_page() when iterating pages within a single SG entry, so let's drop the nth_page() usage. Link: https://lkml.kernel.org/r/20250901150359.867252-31-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-17Merge patch series "Update lpfc to revision 14.4.0.11"Martin K. Petersen10-507/+274
Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.4.0.11 This patch set contains clean up of unused members in various structs, bug fixes related to discovery and resource allocation, and updates to handling of debugfs entries. The patches were cut against Martin's 6.18/scsi-queue tree. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>