kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
5 days	PCI: brcmstb: Fix disabling L0s capability	Jim Quinlan	1	-7/+3
	[ Upstream commit 9583f9d22991d2cfb5cc59a2552040c4ae98d998 ] caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") set PCI_EXP_LNKCAP_ASPM_L1 and (optionally) PCI_EXP_LNKCAP_ASPM_L0S in PCI_EXP_LNKCAP (aka PCIE_RC_CFG_PRIV1_LINK_CAPABILITY in brcmstb). But instead of using PCI_EXP_LNKCAP_ASPM_L1 and PCI_EXP_LNKCAP_ASPM_L0S directly, it used PCIE_LINK_STATE_L1 and PCIE_LINK_STATE_L0S, which are Linux-created values that only coincidentally matched the PCIe spec. b478e162f227 ("PCI/ASPM: Consolidate link state defines") later changed them so they no longer matched the PCIe spec, so the bits ended up in the wrong place in PCI_EXP_LNKCAP. Use PCI_EXP_LNKCAP_ASPM_L0S to clear L0s support when there's an 'aspm-no-l0s' property. Rely on brcmstb hardware to advertise L0s and/or L1 support otherwise. Fixes: caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") Reported-by: Bjorn Helgaas <bhelgaas@google.com> Closes: https://lore.kernel.org/linux-pci/20250925194424.GA2197200@bhelgaas Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [mani: reworded subject and description, added closes tag and CCed stable] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251003170436.1446030-1-james.quinlan@broadcom.com [ Adjust context in variable declaration ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 days	PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths	Lukas Wunner	1	-0/+4
	commit 894f475f88e06c0f352c829849560790dbdedbe5 upstream. When a PCI device is suspended, it is normally the PCI core's job to save Config Space and put the device into a low power state. However drivers are allowed to assume these responsibilities. When they do, the PCI core can tell by looking at the state_saved flag in struct pci_dev: The flag is cleared before commencing the suspend sequence and it is set when pci_save_state() is called. If the PCI core finds the flag set late in the suspend sequence, it refrains from calling pci_save_state() itself. But there are two corner cases where the PCI core neglects to clear the flag before commencing the suspend sequence: * If a driver has legacy PCI PM callbacks, pci_legacy_suspend() neglects to clear the flag. The (stale) flag is subsequently queried by pci_legacy_suspend() itself and pci_legacy_suspend_late(). * If a device has no driver or its driver has no PCI PM callbacks, pci_pm_freeze() neglects to clear the flag. The (stale) flag is subsequently queried by pci_pm_freeze_noirq(). The flag may be set prior to suspend if the device went through error recovery: Drivers commonly invoke pci_restore_state() + pci_save_state() to restore Config Space after reset. The flag may also be set if drivers call pci_save_state() on probe to allow for recovery from subsequent errors. The result is that pci_legacy_suspend_late() and pci_pm_freeze_noirq() don't call pci_save_state() and so the state that will be restored on resume is the one recorded on last error recovery or on probe, not the one that the device had on suspend. If the two states happen to be identical, there's no problem. Reinstate clearing the flag in pci_legacy_suspend() and pci_pm_freeze(). The two functions used to do that until commit 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored") deemed it unnecessary because it assumed that it's sufficient to clear the flag on resume in pci_restore_state(). The commit seemingly did not take into account that pci_save_state() and pci_restore_state() are not only used by power management code, but also for error recovery. Devices without driver or whose driver has no PCI PM callbacks may be in runtime suspend when pci_pm_freeze() is called. Their state has already been saved, so don't clear the flag to skip a pointless pci_save_state() in pci_pm_freeze_noirq(). None of the drivers with legacy PCI PM callbacks seem to use runtime PM, so clear the flag unconditionally in their case. Fixes: 4b77b0a2ba27 ("PCI: Clear saved_state after the state has been restored") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Cc: stable@vger.kernel.org # v2.6.32+ Link: https://patch.msgid.link/094f2aad64418710daf0940112abe5a0afdc6bce.1763483367.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 days	PCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definition	Shawn Lin	1	-1/+1
	[ Upstream commit bcc9a4a0bca3aee4303fa4a20302e57b24ac8f68 ] As per DesignWare Cores PCI Express Controller Databook, section 5.50, SII: Debug Signals, cxpl_debug_info[63:0]: [5:0] smlh_ltssm_state: LTSSM current state. Encoding is same as the dedicated smlh_ltssm_state output. The mask should be 6 bits, from 0 to 5. Hence, fix the mask definition. Fixes: 23fe5bd4be90 ("PCI: keystone: Cleanup ks_pcie_link_up()") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [mani: reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1763122140-203068-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Sasha Levin <sashal@kernel.org>
5 days	PCI: keystone: Exit ks_pcie_probe() for invalid mode	Siddharth Vadapalli	1	-0/+2
	[ Upstream commit 95d9c3f0e4546eaec0977f3b387549a8463cd49f ] Commit under Fixes introduced support for PCIe EP mode on AM654x platforms. When the mode happens to be either "DW_PCIE_RC_TYPE" or "DW_PCIE_EP_TYPE", the PCIe Controller is configured accordingly. However, when the mode is neither of them, an error message is displayed, but the driver probe succeeds. Since this "invalid" mode is not associated with a functional PCIe Controller, the probe should fail. Fix the behavior by exiting "ks_pcie_probe()" with the return value of "-EINVAL" in addition to displaying the existing error message when the mode is invalid. Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-4-s-vadapalli@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	PCI: cadence: Check for the existence of cdns_pcie::ops before using it	Chen Wang	3	-6/+6
	[ Upstream commit 49a6c160ad4812476f8ae1a8f4ed6d15adfa6c09 ] cdns_pcie::ops might not be populated by all the Cadence glue drivers. This is going to be true for the upcoming Sophgo platform which doesn't set the ops. Hence, add a check to prevent NULL pointer dereference. Signed-off-by: Chen Wang <unicorn_wang@outlook.com> [mani: reworded subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/35182ee1d972dfcd093a964e11205efcebbdc044.1757643388.git.unicorn_wang@outlook.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	PCI/P2PDMA: Fix incorrect pointer usage in devm_kfree() call	Sungho Kim	1	-1/+1
	[ Upstream commit 6238784e502b6a9fbeb3a6b77284b29baa4135cc ] The error handling path in pci_p2pdma_add_resource() contains a bug in its `pgmap_free` label. Memory is allocated for the `p2p_pgmap` struct, and the pointer is stored in `p2p_pgmap`. However, the error path calls devm_kfree() with `pgmap`, which is a pointer to a member field within the `p2p_pgmap` struct, not the base pointer of the allocation. Correct the bug by passing the correct base pointer, `p2p_pgmap`, to devm_kfree(). Signed-off-by: Sungho Kim <sungho.kim@furiosa.ai> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Link: https://patch.msgid.link/20250820105714.2939896-1-sungho.kim@furiosa.ai Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07	PCI: Disable MSI on RDC PCI to PCIe bridges	Marcos Del Sol Vives	1	-0/+1
	[ Upstream commit ebc7086b39e5e4f3d3ca82caaea20538c9b62d42 ] RDC PCI to PCIe bridges, present on Vortex86DX3 and Vortex86EX2 SoCs, do not support MSIs. If enabled, interrupts generated by PCIe devices never reach the processor. I have contacted the manufacturer (DM&P) and they confirmed that PCI MSIs need to be disabled for them. Signed-off-by: Marcos Del Sol Vives <marcos@orca.pet> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250705233209.721507-1-marcos@orca.pet Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-29	PCI/sysfs: Ensure devices are powered for config reads	Brian Norris	1	-1/+19
	[ Upstream commit 48991e4935078b05f80616c75d1ee2ea3ae18e58 ] The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details. Wrap these access in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes. Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds"). Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") Signed-off-by: Brian Norris <briannorris@google.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a44f2684c668@changeid Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI/sysfs: Use sysfs_emit() and sysfs_emit_at() in "show" functions	Krzysztof Wilczyński	2	-42/+40
	[ Upstream commit ad025f8e46f3dbf09b1bf8d7a5b4ce858df74544 ] The sysfs_emit() and sysfs_emit_at() functions were introduced to make it less ambiguous which function is preferred when writing to the output buffer in a device attribute's "show" callback [1]. Convert the PCI sysfs object "show" functions from sprintf(), snprintf() and scnprintf() to sysfs_emit() and sysfs_emit_at() accordingly, as the latter is aware of the PAGE_SIZE buffer and correctly returns the number of bytes written into the buffer. No functional change intended. [1] Documentation/filesystems/sysfs.rst [bhelgaas: drop dsm_label_utf16s_to_utf8s(), link speed/width changes] Link: https://lore.kernel.org/r/20210416205856.3234481-10-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: 48991e493507 ("PCI/sysfs: Ensure devices are powered for config reads") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: Add sysfs attribute for device power state	Maximilian Luz	1	-0/+10
	[ Upstream commit 80a129afb75cba8434fc5071bd6919172442315c ] While PCI power states D0-D3hot can be queried from user-space via lspci, D3cold cannot. lspci cannot provide an accurate value when the device is in D3cold as it has to restore the device to D0 before it can access its power state via the configuration space, leading to it reporting D0 or another on-state. Thus lspci cannot be used to diagnose power consumption issues for devices that can enter D3cold or to ensure that devices properly enter D3cold at all. Add a new sysfs device attribute for the PCI power state, showing the current power state as seen by the kernel. [bhelgaas: drop READ_ONCE(), see discussion at the link] Link: https://lore.kernel.org/r/20201102141520.831630-1-luzmaximilian@gmail.com Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: 48991e493507 ("PCI/sysfs: Ensure devices are powered for config reads") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: j721e: Fix programming sequence of "strap" settings	Siddharth Vadapalli	1	-0/+25
	[ Upstream commit f842d3313ba179d4005096357289c7ad09cec575 ] The Cadence PCIe Controller integrated in the TI K3 SoCs supports both Root-Complex and Endpoint modes of operation. The Glue Layer allows "strapping" the Mode of operation of the Controller, the Link Speed and the Link Width. This is enabled by programming the "PCIEn_CTRL" register (n corresponds to the PCIe instance) within the CTRL_MMR memory-mapped register space. The "reset-values" of the registers are also different depending on the mode of operation. Since the PCIe Controller latches onto the "reset-values" immediately after being powered on, if the Glue Layer configuration is not done while the PCIe Controller is off, it will result in the PCIe Controller latching onto the wrong "reset-values". In practice, this will show up as a wrong representation of the PCIe Controller's capability structures in the PCIe Configuration Space. Some such capabilities which are supported by the PCIe Controller in the Root-Complex mode but are incorrectly latched onto as being unsupported are: - Link Bandwidth Notification - Alternate Routing ID (ARI) Forwarding Support - Next capability offset within Advanced Error Reporting (AER) capability Fix this by powering off the PCIe Controller before programming the "strap" settings and powering it on after that. The runtime PM APIs namely pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and increment the usage counter respectively, causing GENPD to power off and power on the PCIe Controller. Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250908120828.1471776-1-s-vadapalli@ti.com [ removed offset parameter from j721e_pcie_set_mode() and ACSPCIE refclk handling ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: tegra194: Handle errors in BPMP response	Vidya Sagar	1	-2/+16
	[ Upstream commit f8c9ad46b00453a8c075453f3745f8d263f44834 ] The return value from tegra_bpmp_transfer() indicates the success or failure of the IPC transaction with BPMP. If the transaction succeeded, we also need to check the actual command's result code. If we don't have error handling for tegra_bpmp_transfer(), we will set the pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer() command fails. Thus, the pcie->ep_state will get out of sync with reality, and any further PERST# assert + deassert will be a no-op and will not trigger the hardware initialization sequence. This is because pex_ep_event_pex_rst_deassert() checks the current pcie->ep_state, and does nothing if the current state is already EP_STATE_ENABLED. Thus, it is important to have error handling for tegra_bpmp_transfer(), such that the pcie->ep_state can not get out of sync with reality, so that we will try to initialize the hardware not only during the first PERST# assert + deassert, but also during any succeeding PERST# assert + deassert. One example where this fix is needed is when using a rock5b as host. During the initial PERST# assert + deassert (triggered by the bootloader on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails. Once Linux has been loaded on the rock5b, the PCIe driver will once again assert + deassert PERST#. However, without tegra_bpmp_transfer() error handling, this second PERST# assert + deassert will not trigger the hardware initialization sequence. With tegra_bpmp_transfer() error handling, the second PERST# assert + deassert will once again trigger the hardware to be initialized and this time the tegra_bpmp_transfer() succeeds. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [cassel: improve commit log] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: tegra194: Fix broken tegra_pcie_ep_raise_msi_irq()	Niklas Cassel	1	-2/+2
	commit b640d42a6ac9ba01abe65ec34f7c73aaf6758ab8 upstream. The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range (1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI. But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as the MSI vector. This causes wrong MSI vector to be triggered, leading to the failure of PCI endpoint Kselftest MSI_TEST test case. To fix this issue, convert the interrupt number to MSI vector. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit	Siddharth Vadapalli	1	-2/+2
	commit e51d05f523e43ce5d2bad957943a2b14f68078cd upstream. Commit under Fixes introduced the IRQ handler for "ks-pcie-error-irq". The interrupt is acquired using "request_irq()" but is never freed if the driver exits due to an error. Although the section in the driver that invokes "request_irq()" has moved around over time, the issue hasn't been addressed until now. Fix this by using "devm_request_irq()" which automatically frees the interrupt if the driver exits. Fixes: 025dd3daeda7 ("PCI: keystone: Add error IRQ handler") Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/r/3d3a4b52-e343-42f3-9d69-94c259812143@kernel.org Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250912100802.3136121-2-s-vadapalli@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI/AER: Support errors introduced by PCIe r6.0	Lukas Wunner	1	-6/+6
	commit 6633875250b38b18b8638cf01e695de031c71f02 upstream. PCIe r6.0 defined five additional errors in the Uncorrectable Error Status, Mask and Severity Registers (PCIe r7.0 sec 7.8.4.2ff). lspci has been supporting them since commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"): https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b Amend the AER driver to recognize them as well, instead of logging them as "Unknown Error Bit". Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/21f1875b18d4078c99353378f37dcd6b994f6d4e.1756301211.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI/AER: Fix missing uevent on recovery when a reset is requested	Niklas Schnelle	1	-0/+1
	commit bbf7d0468d0da71d76cc6ec9bc8a224325d07b6b upstream. Since commit 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for udev") AER uses the result of error_detected() as parameter to pci_uevent_ers(). As pci_uevent_ers() however does not handle PCI_ERS_RESULT_NEED_RESET this results in a missing uevent for the beginning of recovery if drivers request a reset. Fix this by treating PCI_ERS_RESULT_NEED_RESET as beginning recovery. Fixes: 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for udev") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250807-add_err_uevents-v5-1-adf85b0620b0@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI/ERR: Fix uevent on failure to recover	Lukas Wunner	1	-1/+7
	commit 1cbc5e25fb70e942a7a735a1f3d6dd391afc9b29 upstream. Upon failure to recover from a PCIe error through AER, DPC or EDR, a uevent is sent to inform user space about disconnection of the bridge whose subordinate devices failed to recover. However the bridge itself is not disconnected. Instead, a uevent should be sent for each of the subordinate devices. Only if the "bridge" happens to be a Root Complex Event Collector or Integrated Endpoint does it make sense to send a uevent for it (because there are no subordinate devices). Right now if there is a mix of subordinate devices with and without pci_error_handlers, a BEGIN_RECOVERY event is sent for those with pci_error_handlers but no FAILED_RECOVERY event is ever sent for them afterwards. Fix it. Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.16+ Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV	Niklas Schnelle	1	-0/+5
	commit 05703271c3cdcc0f2a8cf6ebdc45892b8ca83520 upstream. Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs. Since commit 9d16947b7583 ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls. On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed: PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56) GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001 00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480 0000000000000001 0000000000000000 0000000000000000 0000000180692828 00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8 #0 [3800313fb20] device_del at c9158ad5c #1 [3800313fb88] pci_remove_bus_device at c915105ba #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198 #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0 #4 [3800313fc60] zpci_bus_remove_device at c90fb6104 #5 [3800313fca0] __zpci_event_availability at c90fb3dca #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2 #7 [3800313fd60] crw_collect_info at c91905822 #8 [3800313fe10] kthread at c90feb390 #9 [3800313fe68] __ret_from_fork at c90f6aa64 #10 [3800313fe98] ret_from_fork at c9194f3f2. This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy. Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper. Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking. Fixes: 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Julian Ruess <julianr@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29	PCI: tegra: Fix devm_kcalloc() argument order for port->phys allocation	Alok Tiwari	1	-1/+1
	[ Upstream commit e1a8805e5d263453ad76a4f50ab3b1c18ea07560 ] Fix incorrect argument order in devm_kcalloc() when allocating port->phys. The original call used sizeof(phy) as the number of elements and port->lanes as the element size, which is reversed. While this happens to produce the correct total allocation size with current pointer size and lane counts, the argument order is wrong. Fixes: 6fe7c187e026 ("PCI: tegra: Support per-lane PHYs") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> [mani: added Fixes tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250819150436.3105973-1-alok.a.tiwari@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports	Lukas Wunner	3	-6/+8
	[ Upstream commit 6cff20ce3b92ffbf2fc5eb9e5a030b3672aa414a ] pci_bridge_d3_possible() is called from both pcie_portdrv_probe() and pcie_portdrv_remove() to determine whether runtime power management shall be enabled (on probe) or disabled (on remove) on a PCIe port. The underlying assumption is that pci_bridge_d3_possible() always returns the same value, else a runtime PM reference imbalance would occur. That assumption is not given if the PCIe port is inaccessible on remove due to hot-unplug: pci_bridge_d3_possible() calls pciehp_is_native(), which accesses Config Space to determine whether the port is Hot-Plug Capable. An inaccessible port returns "all ones", which is converted to "all zeroes" by pcie_capability_read_dword(). Hence the port no longer seems Hot-Plug Capable on remove even though it was on probe. The resulting runtime PM ref imbalance causes warning messages such as: pcieport 0000:02:04.0: Runtime PM usage count underflow! Avoid the Config Space access (and thus the runtime PM ref imbalance) by caching the Hot-Plug Capable bit in struct pci_dev. The struct already contains an "is_hotplug_bridge" flag, which however is not only set on Hot-Plug Capable PCIe ports, but also Conventional PCI Hot-Plug bridges and ACPI slots. The flag identifies bridges which are allocated additional MMIO and bus number resources to allow for hierarchy expansion. The kernel is somewhat sloppily using "is_hotplug_bridge" in a number of places to identify Hot-Plug Capable PCIe ports, even though the flag encompasses other devices. Subsequent commits replace these occurrences with the new flag to clearly delineate Hot-Plug Capable PCIe ports from other kinds of hotplug bridges. Document the existing "is_hotplug_bridge" and the new "is_pciehp" flag and document the (non-obvious) requirement that pci_bridge_d3_possible() always returns the same value across the entire lifetime of a bridge, including its hot-removal. Fixes: 5352a44a561d ("PCI: pciehp: Make pciehp_is_native() stricter") Reported-by: Laurent Bigonville <bigon@bigon.be> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220216 Reported-by: Mario Limonciello <mario.limonciello@amd.com> Closes: https://lore.kernel.org/r/20250609020223.269407-3-superm1@kernel.org/ Link: https://lore.kernel.org/all/20250620025535.3425049-3-superm1@kernel.org/T/#u Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Cc: stable@vger.kernel.org # v4.18+ Link: https://patch.msgid.link/fe5dcc3b2e62ee1df7905d746bde161eb1b3291c.1752390101.git.lukas@wunner.de [ Adjust surrounding documentation changes ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: endpoint: Fix configfs group removal on driver teardown	Damien Le Moal	2	-0/+2
	commit 910bdb8197f9322790c738bb32feaa11dba26909 upstream. An endpoint driver configfs attributes group is added to the epf_group list of struct pci_epf_driver by pci_epf_add_cfs() but an added group is not removed from this list when the attribute group is unregistered with pci_ep_cfs_remove_epf_group(). Add the missing list_del() call in pci_ep_cfs_remove_epf_group() to correctly remove the attribute group from the driver list. With this change, once the loop over all attribute groups in pci_epf_remove_cfs() completes, the driver epf_group list should be empty. Add a WARN_ON() to make sure of that. Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250624114544.342159-3-dlemoal@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: endpoint: Fix configfs group list head handling	Damien Le Moal	1	-1/+0
	commit d79123d79a8154b4318529b7b2ff7e15806f480b upstream. Doing a list_del() on the epf_group field of struct pci_epf_driver in pci_epf_remove_cfs() is not correct as this field is a list head, not a list entry. This list_del() call triggers a KASAN warning when an endpoint function driver which has a configfs attribute group is torn down: ================================================================== BUG: KASAN: slab-use-after-free in pci_epf_remove_cfs+0x17c/0x198 Write of size 8 at addr ffff00010f4a0d80 by task rmmod/319 CPU: 3 UID: 0 PID: 319 Comm: rmmod Not tainted 6.16.0-rc2 #1 NONE Hardware name: Radxa ROCK 5B (DT) Call trace: show_stack+0x2c/0x84 (C) dump_stack_lvl+0x70/0x98 print_report+0x17c/0x538 kasan_report+0xb8/0x190 __asan_report_store8_noabort+0x20/0x2c pci_epf_remove_cfs+0x17c/0x198 pci_epf_unregister_driver+0x18/0x30 nvmet_pci_epf_cleanup_module+0x24/0x30 [nvmet_pci_epf] __arm64_sys_delete_module+0x264/0x424 invoke_syscall+0x70/0x260 el0_svc_common.constprop.0+0xac/0x230 do_el0_svc+0x40/0x58 el0_svc+0x48/0xdc el0t_64_sync_handler+0x10c/0x138 el0t_64_sync+0x198/0x19c ... Remove this incorrect list_del() call from pci_epf_remove_cfs(). Fixes: ef1433f717a2 ("PCI: endpoint: Create configfs entry for each pci_epf_device_id table entry") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250624114544.342159-2-dlemoal@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28	PCI: pnv_php: Fix surprise plug detection and recovery	Timothy Pearson	1	-3/+107
	[ Upstream commit a2a2a6fc2469524caa713036297c542746d148dc ] The existing PowerNV hotplug code did not handle surprise plug events correctly, leading to a complete failure of the hotplug system after device removal and a required reboot to detect new devices. This comes down to two issues: 1) When a device is surprise removed, often the bridge upstream port will cause a PE freeze on the PHB. If this freeze is not cleared, the MSI interrupts from the bridge hotplug notification logic will not be received by the kernel, stalling all plug events on all slots associated with the PE. 2) When a device is removed from a slot, regardless of surprise or programmatic removal, the associated PHB/PE ls left frozen. If this freeze is not cleared via a fundamental reset, skiboot is unable to clear the freeze and cannot retrain / rescan the slot. This also requires a reboot to clear the freeze and redetect the device in the slot. Issue the appropriate unfreeze and rescan commands on hotplug events, and don't oops on hotplug if pci_bus_to_OF_node() returns NULL. Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [bhelgaas: tidy comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/171044224.1359864.1752615546988.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	PCI: pnv_php: Work around switches with broken presence detection	Timothy Pearson	1	-0/+27
	[ Upstream commit 80f9fc2362797538ebd4fd70a1dfa838cc2c2cdb ] The Microsemi Switchtec PM8533 PFX 48xG3 [11f8:8533] PCIe switch system was observed to incorrectly assert the Presence Detect Set bit in its capabilities when tested on a Raptor Computing Systems Blackbird system, resulting in the hot insert path never attempting a rescan of the bus and any downstream devices not being re-detected. Work around this by additionally checking whether the PCIe data link is active or not when performing presence detection on downstream switches' ports, similar to the pciehp_hpc.c driver. Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/505981576.1359853.1752615415117.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	PCI: pnv_php: Clean up allocated IRQs on unplug	Timothy Pearson	1	-19/+77
	[ Upstream commit 4668619092554e1b95c9a5ac2941ca47ba6d548a ] When the root of a nested PCIe bridge configuration is unplugged, the pnv_php driver leaked the allocated IRQ resources for the child bridges' hotplug event notifications, resulting in a panic. Fix this by walking all child buses and deallocating all its IRQ resources before calling pci_hp_remove_devices(). Also modify the lifetime of the workqueue at struct pnv_php_slot::wq so that it is only destroyed in pnv_php_free_slot(), instead of pnv_php_disable_irq(). This is required since pnv_php_disable_irq() will now be called by workers triggered by hot unplug interrupts, so the workqueue needs to stay allocated. The abridged kernel panic that occurs without this patch is as follows: WARNING: CPU: 0 PID: 687 at kernel/irq/msi.c:292 msi_device_data_release+0x6c/0x9c CPU: 0 UID: 0 PID: 687 Comm: bash Not tainted 6.14.0-rc5+ #2 Call Trace: msi_device_data_release+0x34/0x9c (unreliable) release_nodes+0x64/0x13c devres_release_all+0xc0/0x140 device_del+0x2d4/0x46c pci_destroy_dev+0x5c/0x194 pci_hp_remove_devices+0x90/0x128 pci_hp_remove_devices+0x44/0x128 pnv_php_disable_slot+0x54/0xd4 power_write_file+0xf8/0x18c pci_slot_attr_store+0x40/0x5c sysfs_kf_write+0x64/0x78 kernfs_fop_write_iter+0x1b0/0x290 vfs_write+0x3bc/0x50c ksys_write+0x84/0x140 system_call_exception+0x124/0x230 system_call_vectored_common+0x15c/0x2ec Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [bhelgaas: tidy comments] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/2013845045.1359852.1752615367790.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28	PCI: rockchip-host: Fix "Unexpected Completion" log message	Hans Zhang	1	-1/+1
	[ Upstream commit fcc5f586c4edbcc10de23fb9b8c0972a84e945cd ] Fix the debug message for the PCIE_CORE_INT_UCR interrupt to clearly indicate "Unexpected Completion" instead of a duplicate "malformed TLP" message. Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support") Signed-off-by: Hans Zhang <18255117159@163.com> [mani: added fixes tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://patch.msgid.link/20250607160201.807043-2-18255117159@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-17	PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot time	Dexuan Cui	1	-6/+11
	commit 23e118a48acf7be223e57d98e98da8ac5a4071ac upstream. Currently when the pci-hyperv driver finishes probing and initializing the PCI device, it sets the PCI_COMMAND_MEMORY bit; later when the PCI device is registered to the core PCI subsystem, the core PCI driver's BAR detection and initialization code toggles the bit multiple times, and each toggling of the bit causes the hypervisor to unmap/map the virtual BARs from/to the physical BARs, which can be slow if the BAR sizes are huge, e.g., a Linux VM with 14 GPU devices has to spend more than 3 minutes on BAR detection and initialization, causing a long boot time. Reduce the boot time by not setting the PCI_COMMAND_MEMORY bit when we register the PCI device (there is no need to have it set in the first place). The bit stays off till the PCI device driver calls pci_enable_device(). With this change, the boot time of such a 14-GPU VM is reduced by almost 3 minutes. Link: https://lore.kernel.org/lkml/20220419220007.26550-1-decui@microsoft.com/ Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Jake Oshins <jakeo@microsoft.com> Link: https://lore.kernel.org/r/20220502074255.16901-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-07-17	PCI: cadence-ep: Correct PBA offset in .set_msix() callback	Niklas Cassel	1	-2/+3
	[ Upstream commit c8bcb01352a86bc5592403904109c22b66bd916e ] While cdns_pcie_ep_set_msix() writes the Table Size field correctly (N-1), the calculation of the PBA offset is wrong because it calculates space for (N-1) entries instead of N. This results in the following QEMU error when using PCI passthrough on a device which relies on the PCI endpoint subsystem: failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align Fix the calculation of PBA offset in the MSI-X capability. [bhelgaas: more specific subject and commit log] Fixes: 3ef5d16f50f8 ("PCI: cadence: Add MSI-X support to Endpoint driver") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250514074313.283156-10-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27	PCI: Fix lock symmetry in pci_slot_unlock()	Ilpo Järvinen	1	-1/+2
	commit f3efb9569b4a21354ef2caf7ab0608a3e14cc6e4 upstream. The commit a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") made the lock function to call depend on dev->subordinate but left pci_slot_unlock() unmodified creating locking asymmetry compared with pci_slot_lock(). Because of the asymmetric lock handling, the same bridge device is unlocked twice. First pci_bus_unlock() unlocks bus->self and then pci_slot_unlock() will unconditionally unlock the same bridge device. Move pci_dev_unlock() inside an else branch to match the logic in pci_slot_lock(). Fixes: a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250505115412.37628-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27	PCI: Add ACS quirk for Loongson PCIe	Huacai Chen	1	-0/+23
	commit 1f3303aa92e15fa273779acac2d0023609de30f1 upstream. Loongson PCIe Root Ports don't advertise an ACS capability, but they do not allow peer-to-peer transactions between Root Ports. Add an ACS quirk so each Root Port can be in a separate IOMMU group. Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250403040756.720409-1-chenhuacai@loongson.cn Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27	PCI/DPC: Initialize aer_err_info before using it	Bjorn Helgaas	1	-1/+1
	[ Upstream commit a424b598e6a6c1e69a2bb801d6fd16e805ab2c38 ] Previously the struct aer_err_info "info" was allocated on the stack without being initialized, so it contained junk except for the fields we explicitly set later. Initialize "info" at declaration so it starts as all zeros. Fixes: 8aefa9b0d910 ("PCI/DPC: Print AER status in DPC event handling") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20250522232339.1525671-2-helgaas@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27	PCI: cadence: Fix runtime atomic count underflow	Hans Zhang	1	-10/+1
	[ Upstream commit 8805f32a96d3b97cef07999fa6f52112678f7e65 ] If the call to pci_host_probe() in cdns_pcie_host_setup() fails, PM runtime count is decremented in the error path using pm_runtime_put_sync(). But the runtime count is not incremented by this driver, but only by the callers (cdns_plat_pcie_probe/j721e_pcie_probe). And the callers also decrement the runtime PM count in their error path. So this leads to the below warning from the PM core: "runtime PM usage count underflow!" So fix it by getting rid of pm_runtime_put_sync() in the error path and directly return the errno. Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'") Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://patch.msgid.link/20250419133058.162048-1-18255117159@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04	PCI: Fix old_size lower bound in calculate_iosize() too	Ilpo Järvinen	1	-4/+2
	[ Upstream commit ff61f380de5652e723168341480cc7adf1dd6213 ] Commit 903534fa7d30 ("PCI: Fix resource double counting on remove & rescan") fixed double counting of mem resources because of old_size being applied too early. Fix a similar counting bug on the io resource side. Link: https://lore.kernel.org/r/20241216175632.4175-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiaochun Lee <lixc17@lenovo.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04	PCI: brcmstb: Add a softdep to MIP MSI-X driver	Stanimir Varbanov	1	-0/+1
	[ Upstream commit 2294059118c550464dd8906286324d90c33b152b ] Then the brcmstb PCIe driver and MIP MSI-X interrupt controller drivers are built as modules there could be a race in probing. To avoid this, add a softdep to MIP driver to guarantee that MIP driver will be load first. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04	PCI: brcmstb: Expand inbound window size up to 64GB	Stanimir Varbanov	1	-2/+2
	[ Upstream commit 25a98c727015638baffcfa236e3f37b70cedcf87 ] The BCM2712 memory map can support up to 64GB of system memory, thus expand the inbound window size in calculation helper function. The change is safe for the currently supported SoCs that have smaller inbound window sizes. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04	PCI: imx6: Skip controller_id generation logic for i.MX7D	Richard Zhu	1	-3/+2
	commit f068ffdd034c93f0c768acdc87d4d2d7023c1379 upstream. The i.MX7D only has one PCIe controller, so controller_id should always be 0. The previous code is incorrect although yielding the correct result. Fix by removing "IMX7D" from the switch case branch. Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") Link: https://lore.kernel.org/r/20241126075702.4099164-5-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> [Because this switch case does more than just controller_id logic, move the "IMX7D" case label instead of removing it entirely.] Signed-off-by: Ryan Matthews <ryanmatthews@fastmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	PCI: Fix use-after-free in pci_bus_release_domain_nr()	Rob Herring	1	-2/+3
	commit 30ba2d09edb5ea857a1473ae3d820911347ada62 upstream. Commit c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") introduced a use-after-free bug in the bus removal cleanup. The issue was found with kfence: [ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70 [ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115): [ 19.309677] pci_bus_release_domain_nr+0x10/0x70 [ 19.309691] dw_pcie_host_deinit+0x28/0x78 [ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.309752] platform_probe+0x90/0xd8 ... [ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k [ 19.311469] allocated by task 96 on cpu 10 at 19.279323s: [ 19.311562] __kmem_cache_alloc_node+0x260/0x278 [ 19.311571] kmalloc_trace+0x24/0x30 [ 19.311580] pci_alloc_bus+0x24/0xa0 [ 19.311590] pci_register_host_bridge+0x48/0x4b8 [ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8 [ 19.311613] pci_host_probe+0x18/0xc0 [ 19.311623] dw_pcie_host_init+0x2c0/0x568 [ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194] [ 19.311647] platform_probe+0x90/0xd8 ... [ 19.311782] freed by task 96 on cpu 10 at 19.285833s: [ 19.311799] release_pcibus_dev+0x30/0x40 [ 19.311808] device_release+0x30/0x90 [ 19.311814] kobject_put+0xa8/0x120 [ 19.311832] device_unregister+0x20/0x30 [ 19.311839] pci_remove_bus+0x78/0x88 [ 19.311850] pci_remove_root_bus+0x5c/0x98 [ 19.311860] dw_pcie_host_deinit+0x28/0x78 [ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.311900] platform_probe+0x90/0xd8 ... [ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4 [ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022 [ 19.325852] Workqueue: events_unbound deferred_probe_work_func The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't directly call pci_bus_release_domain_nr(). The issue turns out to be in pci_remove_root_bus() which first calls pci_remove_bus() which frees the struct pci_bus when its struct device is released. Then pci_bus_release_domain_nr() is called and accesses the freed struct pci_bus. Reordering these fixes the issue. Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") Link: https://lore.kernel.org/r/20230329123835.2724518-1-robh@kernel.org Link: https://lore.kernel.org/r/b529cb69-0602-9eed-fc02-2f068707a006@nvidia.com Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v6.2+ Cc: Pali Rohár <pali@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	PCI: Fix reference leak in pci_register_host_bridge()	Ma Ke	1	-2/+7
	[ Upstream commit 804443c1f27883926de94c849d91f5b7d7d696e9 ] If device_register() fails, call put_device() to give up the reference to avoid a memory leak, per the comment at device_register(). Found by code review. Link: https://lore.kernel.org/r/20250225021440.3130264-1-make24@iscas.ac.cn Fixes: 37d6a0a6f470 ("PCI: Add pci_register_host_bridge() interface") Signed-off-by: Ma Ke <make24@iscas.ac.cn> [bhelgaas: squash Dan Carpenter's double free fix from https://lore.kernel.org/r/db806a6c-a91b-4e5a-a84b-6b7e01bdac85@stanley.mountain] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	PCI: Assign PCI domain IDs by ida_alloc()	Pali Rohár	3	-43/+73
	[ Upstream commit c14f7ccc9f5dcf9d06ddeec706f85405b2c80600 ] Replace assignment of PCI domain IDs from atomic_inc_return() to ida_alloc(). Use two IDAs, one for static domain allocations (those which are defined in device tree) and second for dynamic allocations (all other). During removal of root bus / host bridge, also release the domain ID. The released ID can be reused again, for example when dynamically loading and unloading native PCI host bridge drivers. This change also allows to mix static device tree assignment and dynamic by kernel as all static allocations are reserved in dynamic pool. [bhelgaas: set "err" if "bus->domain_nr < 0"] Link: https://lore.kernel.org/r/20220714184130.5436-1-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: 804443c1f278 ("PCI: Fix reference leak in pci_register_host_bridge()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	PCI: Coalesce host bridge contiguous apertures	Kai-Heng Feng	1	-4/+27
	[ Upstream commit 65db04053efea3f3e412a7e0cc599962999c96b4 ] Built-in graphics on HP EliteDesk 805 G6 doesn't work because graphics can't get the BAR it needs: pci_bus 0000:00: root bus resource [mem 0x10020200000-0x100303fffff window] pci_bus 0000:00: root bus resource [mem 0x10030400000-0x100401fffff window] pci 0000:00:08.1: bridge window [mem 0xd2000000-0xd23fffff] pci 0000:00:08.1: bridge window [mem 0x10030000000-0x100401fffff 64bit pref] pci 0000:00:08.1: can't claim BAR 15 [mem 0x10030000000-0x100401fffff 64bit pref]: no compatible bridge window pci 0000:00:08.1: [mem 0x10030000000-0x100401fffff 64bit pref] clipped to [mem 0x10030000000-0x100303fffff 64bit pref] pci 0000:00:08.1: bridge window [mem 0x10030000000-0x100303fffff 64bit pref] pci 0000:07:00.0: can't claim BAR 0 [mem 0x10030000000-0x1003fffffff 64bit pref]: no compatible bridge window pci 0000:07:00.0: can't claim BAR 2 [mem 0x10040000000-0x100401fffff 64bit pref]: no compatible bridge window However, the root bus has two contiguous apertures that can contain the child resource requested. Coalesce contiguous apertures so we can allocate from the entire contiguous region. [bhelgaas: fold in https://lore.kernel.org/r/20210528170242.1564038-1-kai.heng.feng@canonical.com] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=212013 Suggested-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20210401131252.531935-1-kai.heng.feng@canonical.com Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: 804443c1f278 ("PCI: Fix reference leak in pci_register_host_bridge()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	PCI: Introduce domain_nr in pci_host_bridge	Boqun Feng	1	-1/+5
	[ Upstream commit 15d82ca23c996d50062286d27ed6a42a8105c04a ] Currently we retrieve the PCI domain number of the host bridge from the bus sysdata (or pci_config_window if PCI_DOMAINS_GENERIC=y). Actually we have the information at PCI host bridge probing time, and it makes sense that we store it into pci_host_bridge. One benefit of doing so is the requirement for supporting PCI on Hyper-V for ARM64, because the host bridge of Hyper-V doesn't have pci_config_window, whereas ARM64 is a PCI_DOMAINS_GENERIC=y arch, so we cannot retrieve the PCI domain number from pci_config_window on ARM64 Hyper-V guest. As the preparation for ARM64 Hyper-V PCI support, we introduce the domain_nr in pci_host_bridge and a sentinel value to allow drivers to set domain numbers properly at probing time. Currently CONFIG_PCI_DOMAINS_GENERIC=y archs are only users of this newly-introduced field. Link: https://lore.kernel.org/r/20210726180657.142727-2-boqun.feng@gmail.com Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Stable-dep-of: 804443c1f278 ("PCI: Fix reference leak in pci_register_host_bridge()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-02	PCI: Fix reference leak in pci_alloc_child_bus()	Ma Ke	1	-1/+4
	commit 1f2768b6a3ee77a295106e3a5d68458064923ede upstream. If device_register(&child->dev) fails, call put_device() to explicitly release child->dev, per the comment at device_register(). Found by code review. Link: https://lore.kernel.org/r/20250202062357.872971-1-make24@iscas.ac.cn Fixes: 4f535093cf8f ("PCI: Put pci_dev in device tree as early as possible") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02	PCI: brcmstb: Fix missing of_node_put() in brcm_pcie_probe()	Stanimir Varbanov	1	-4/+9
	commit 2df181e1aea4628a8fd257f866026625d0519627 upstream. A call to of_parse_phandle() is incrementing the refcount, and as such, the of_node_put() must be called when the reference is no longer needed. Thus, refactor the existing code and add a missing of_node_put() call following the check to ensure that "msi_np" matches "pcie->np" and after MSI initialization, but only if the MSI support is enabled system-wide. Cc: stable@vger.kernel.org # v5.10+ Fixes: 40ca1bf580ef ("PCI: brcmstb: Add MSI support") Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250122222955.1752778-1-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-04-10	PCI: pciehp: Don't enable HPIE when resuming in poll mode	Ilpo Järvinen	1	-1/+3
	[ Upstream commit 527664f738afb6f2c58022cd35e63801e5dc7aec ] PCIe hotplug can operate in poll mode without interrupt handlers using a polling kthread only. eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend") failed to consider that and enables HPIE (Hot-Plug Interrupt Enable) unconditionally when resuming the Port. Only set HPIE if non-poll mode is in use. This makes pcie_enable_interrupt() match how pcie_enable_notification() already handles HPIE. Link: https://lore.kernel.org/r/20250321162114.3939-1-ilpo.jarvinen@linux.intel.com Fixes: eb34da60edee ("PCI: pciehp: Disable hotplug interrupt during suspend") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI: xilinx-cpm: Fix IRQ domain leak in error path of probe	Thippeswamy Havalige	1	-4/+6
	[ Upstream commit 57b0302240741e73fe51f88404b3866e0d2933ad ] The IRQ domain allocated for the PCIe controller is not freed if resource_list_first_type() returns NULL, leading to a resource leak. This fix ensures properly cleaning up the allocated IRQ domain in the error path. Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'") Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com> [kwilczynski: added missing Fixes: tag, refactored to use one of the goto labels] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Link: https://lore.kernel.org/r/20250224155025.782179-2-thippeswamy.havalige@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI: Remove stray put_device() in pci_register_host_bridge()	Dan Carpenter	1	-3/+2
	[ Upstream commit 6e8d06e5096c80cbf41313b4a204f43071ca42be ] This put_device() was accidentally left over from when we changed the code from using device_register() to calling device_add(). Delete it. Link: https://lore.kernel.org/r/55b24870-89fb-4c91-b85d-744e35db53c2@stanley.mountain Fixes: 9885440b16b8 ("PCI: Fix pci_host_bridge struct device release/free handling") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI/portdrv: Only disable pciehp interrupts early when needed	Feng Tang	1	-3/+5
	[ Upstream commit 9d7db4db19827380e225914618c0c1bf435ed2f5 ] Firmware developers reported that Linux issues two PCIe hotplug commands in very short intervals on an ARM server, which doesn't comply with the PCIe spec. According to PCIe r6.1, sec 6.7.3.2, if the Command Completed event is supported, software must wait for a command to complete or wait at least 1 second before sending a new command. In the failure case, the first PCIe hotplug command is from get_port_device_capability(), which sends a command to disable PCIe hotplug interrupts without waiting for its completion, and the second command comes from pcie_enable_notification() of pciehp driver, which enables hotplug interrupts again. Fix this by only disabling the hotplug interrupts when the pciehp driver is not enabled. Link: https://lore.kernel.org/r/20250303023630.78397-1-feng.tang@linux.alibaba.com Fixes: 2bd50dd800b5 ("PCI: PCIe: Disable PCIe port services during port initialization") Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI: brcmstb: Use internal register to change link capability	Jim Quinlan	1	-2/+2
	[ Upstream commit 0c97321e11e0e9e18546f828492758f6aaecec59 ] The driver has been mistakenly writing to a read-only (RO) configuration space register (PCI_EXP_LNKCAP) to change the PCIe link capability. Although harmless in this case, the proper write destination is an internal register that is reflected by PCI_EXP_LNKCAP. Thus, fix the brcm_pcie_set_gen() function to correctly update the link capability. Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver") Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250214173944.47506-3-james.quinlan@broadcom.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI: cadence-ep: Fix the driver to send MSG TLP for INTx without data payload	Hans Zhang	2	-3/+2
	[ Upstream commit 3ac47fbf4f6e8c3a7c3855fac68cc3246f90f850 ] Per the Cadence's "PCIe Controller IP for AX14" user guide, Version 1.04, Section 9.1.7.1, "AXI Subordinate to PCIe Address Translation Registers", Table 9.4, the bit 16 of the AXI Subordinate Address (axi_s_awaddr) when set corresponds to MSG with data, and when not set, to MSG without data. However, the driver is currently doing the opposite and due to this, the INTx is never received on the host. So, fix the driver to reflect the documentation and also make INTx work. Fixes: 37dddf14f1ae ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller") Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Hans Zhang <hans.zhang@cixtech.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250214165724.184599-1-18255117159@163.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10	PCI/ASPM: Fix link state exit during switch upstream function removal	Daniel Stodden	1	-8/+9
	[ Upstream commit cbf937dcadfd571a434f8074d057b32cd14fbea5 ] Before 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free"), we would free the ASPM link only after the last function on the bus pertaining to the given link was removed. That was too late. If function 0 is removed before sibling function, link->downstream would point to free'd memory after. After above change, we freed the ASPM parent link state upon any function removal on the bus pertaining to a given link. That is too early. If the link is to a PCIe switch with MFD on the upstream port, then removing functions other than 0 first would free a link which still remains parent_link to the remaining downstream ports. The resulting GPFs are especially frequent during hot-unplug, because pciehp removes devices on the link bus in reverse order. On that switch, function 0 is the virtual P2P bridge to the internal bus. Free exactly when function 0 is removed -- before the parent link is obsolete, but after all subordinate links are gone. Link: https://lore.kernel.org/r/e12898835f25234561c9d7de4435590d957b85d9.1734924854.git.dns@arista.com Fixes: 456d8aa37d0f ("PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free") Signed-off-by: Daniel Stodden <dns@arista.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>