summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)AuthorFilesLines
2026-03-12PCI: dwc: ep: Fix resizable BAR support for multi-PF configurationsAksh Garg1-16/+32
[ Upstream commit 43d67ec26b329f8aea34ba9dff23d69b84a8e564 ] The resizable BAR support added by the commit 3a3d4cabe681 ("PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs") incorrectly configures the resizable BARs only for the first Physical Function (PF0) in EP mode. The resizable BAR configuration functions use generic dw_pcie_*_dbi() operations instead of physical function specific dw_pcie_ep_*_dbi() operations. This causes resizable BAR configuration to always target PF0 regardless of the requested function number. Additionally, dw_pcie_ep_init_non_sticky_registers() only initializes resizable BAR registers for PF0, leaving other PFs unconfigured during the execution of this function. Fix this by using physical function specific configuration space access operations throughout the resizable BAR code path and initializing registers for all the physical functions that support resizable BARs. Fixes: 3a3d4cabe681 ("PCI: dwc: ep: Allow EPF drivers to configure the size of Resizable BARs") Signed-off-by: Aksh Garg <a-garg7@ti.com> [mani: added stable tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260130115516.515082-2-a-garg7@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dwc: Remove duplicate dw_pcie_ep_hide_ext_capability() functionQiang Yu3-49/+1
[ Upstream commit 86291f774fe8524178446cb2c792939640b4970c ] Remove dw_pcie_ep_hide_ext_capability() and replace its usage with dw_pcie_remove_ext_capability(). Both functions serve the same purpose of hiding PCIe extended capabilities, but dw_pcie_remove_ext_capability() provides a cleaner API that doesn't require the caller to specify the previous capability ID. Suggested-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20251224-remove_dw_pcie_ep_hide_ext_capability-v1-1-4302c9cdc316@oss.qualcomm.com Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dwc: Add new APIs to remove standard and extended CapabilityQiang Yu2-0/+55
[ Upstream commit 0183562f1e824c0ca6c918309a0978e9a269af3e ] On some platforms, certain PCIe Capabilities may be present in hardware but are not fully implemented as defined in PCIe spec. These incomplete capabilities should be hidden from the PCI framework to prevent unexpected behavior. Introduce two APIs to remove a specific PCIe Capability and Extended Capability by updating the previous capability's next offset field to skip over the unwanted capability. These APIs allow RC drivers to easily hide unsupported or partially implemented capabilities from software. Co-developed-by: Wenbin Yao <wenbin.yao@oss.qualcomm.com> Signed-off-by: Wenbin Yao <wenbin.yao@oss.qualcomm.com> Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251109-remove_cap-v1-2-2208f46f4dc2@oss.qualcomm.com Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: Add preceding capability position support in PCI_FIND_NEXT_*_CAP macrosQiang Yu5-14/+29
[ Upstream commit a2582e05e39adf9ab82a02561cd6f70738540ae0 ] Add support for finding the preceding capability position in PCI capability list by extending the capability finding macros with an additional parameter. This functionality is essential for modifying PCI capability list, as it provides the necessary information to update the "next" pointer of the predecessor capability when removing entries. Modify two macros to accept a new 'prev_ptr' parameter: - PCI_FIND_NEXT_CAP - Now accepts 'prev_ptr' parameter for standard capabilities - PCI_FIND_NEXT_EXT_CAP - Now accepts 'prev_ptr' parameter for extended capabilities When a capability is found, these macros: - Store the position of the preceding capability in *prev_ptr (if prev_ptr != NULL) - Maintain all existing functionality when prev_ptr is NULL Update current callers to accommodate this API change by passing NULL to 'prev_ptr' argument if they do not care about the preceding capability position. No functional changes to driver behavior result from this commit as it maintains the existing capability finding functionality while adding the infrastructure for future capability removal operations. Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251109-remove_cap-v1-1-2208f46f4dc2@oss.qualcomm.com Stable-dep-of: 43d67ec26b32 ("PCI: dwc: ep: Fix resizable BAR support for multi-PF configurations") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"Niklas Cassel1-56/+3
[ Upstream commit 180c3cfe36786d261a55da52a161f9e279b19a6f ] This reverts commit 0e0b45ab5d770a748487ba0ae8f77d1fb0f0de3e. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-10-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dw-rockchip: Change get_ltssm() to provide L1 Substates infoShawn Lin1-4/+25
[ Upstream commit f994bb8f1c94726e0124356ccd31c3c23a8a69f4 ] Rename rockchip_pcie_get_ltssm() to rockchip_pcie_get_ltssm_reg() and add rockchip_pcie_get_ltssm() to get_ltssm() callback in order to show the proper L1 Substates. The PCIE_CLIENT_LTSSM_STATUS[5:0] register returns the same LTSSM layout as enum dw_pcie_ltssm. So the driver just need to convey L1 PM Substates by returning the proper value defined in pcie-designware.h. cat /sys/kernel/debug/dwc_pcie_a40000000.pcie/ltssm_status L1_2 (0x142) Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/1765503205-22184-2-git-send-email-shawn.lin@rock-chips.com Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dwc: Add L1 Substates context to ltssm_status of debugfsShawn Lin2-0/+6
[ Upstream commit 679ec639f29cbdaf36bd79bf3e98240fffa335ee ] DWC core couldn't distinguish LTSSM state among L1.0, L1.1 and L1.2. But the vendor glue driver may implement additional logic to convey this information. So add two pseudo definitions for vendor glue drivers to translate their internal L1 Substates for debugfs to show. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/1765503205-22184-1-git-send-email-shawn.lin@rock-chips.com Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dw-rockchip: Configure L1SS supportShawn Lin1-0/+40
[ Upstream commit b5e719f26107f4a7f82946dc5be92dceb9b443cb ] L1 PM Substates for RC mode require support in the dw-rockchip driver including proper handling of the CLKREQ# sideband signal. It is mostly handled by hardware, but software still needs to set the clkreq fields in the PCIE_CLIENT_POWER_CON register to match the hardware implementation. For more details, see section '18.6.6.4 L1 Substate' in the RK3568 TRM 1.1 Part 2, or section '11.6.6.4 L1 Substate' in the RK3588 TRM 1.0 Part2. [bhelgaas: set pci->l1ss_support so DWC core preserves L1SS Capability bits; drop corresponding code here, include updates from https://lore.kernel.org/r/aRRG8wv13HxOCqgA@ryzen] Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1761187883-150120-1-git-send-email-shawn.lin@rock-chips.com Link: https://patch.msgid.link/20251118214312.2598220-4-helgaas@kernel.org Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dwc: Advertise L1 PM Substates only if driver requests itBjorn Helgaas5-0/+33
[ Upstream commit a00bba406b5a682764ecb507e580ca8159196aa3 ] L1 PM Substates require the CLKREQ# signal and may also require device-specific support. If CLKREQ# is not supported or driver support is lacking, enabling L1.1 or L1.2 may cause errors when accessing devices, e.g., nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10 If the kernel is built with CONFIG_PCIEASPM_POWER_SUPERSAVE=y or users enable L1.x via sysfs, users may trip over these errors even if L1 Substates haven't been enabled by firmware or the driver. To prevent such errors, disable advertising the L1 PM Substates unless the driver sets "dw_pcie.l1ss_support" to indicate that it knows CLKREQ# is present and any device-specific configuration has been done. Set "dw_pcie.l1ss_support" in tegra194 (if DT includes the "supports-clkreq' property) and qcom (for cfg_2_7_0, cfg_1_9_0, cfg_1_34_0, and cfg_sc8280xp controllers) so they can continue to use L1 Substates. Based on Niklas's patch: https://patch.msgid.link/20251017163252.598812-2-cassel@kernel.org [bhelgaas: drop hiding for endpoints] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251118214312.2598220-2-helgaas@kernel.org Stable-dep-of: 180c3cfe3678 ("Revert "PCI: dw-rockchip: Enumerate endpoints based on dll_link_up IRQ"") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: j721e: Add config guards for Cadence Host and Endpoint library APIsSiddharth Vadapalli1-16/+25
[ Upstream commit 4b361b1e92be255ff923453fe8db74086cc7cf66 ] Commit under Fixes enabled loadable module support for the driver under the assumption that it shall be the sole user of the Cadence Host and Endpoint library APIs. This assumption guarantees that we won't end up in a case where the driver is built-in and the library support is built as a loadable module. With the introduction of [1], this assumption is no longer valid. The SG2042 driver could be built as a loadable module, implying that the Cadence Host library is also selected as a loadable module. However, the pci-j721e.c driver could be built-in as indicated by CONFIG_PCI_J721E=y due to which the Cadence Endpoint library is built-in. Despite the library drivers being built as specified by their respective consumers, since the 'pci-j721e.c' driver has references to the Cadence Host library APIs as well, we run into a build error as reported at [0]. Fix this by adding config guards as a temporary workaround. The proper fix is to split the 'pci-j721e.c' driver into independent Host and Endpoint drivers as aligned at [2]. [0]: https://lore.kernel.org/r/202511111705.MZ7ls8Hm-lkp@intel.com/ [1]: commit 1c72774df028 ("PCI: sg2042: Add Sophgo SG2042 PCIe driver") [2]: https://lore.kernel.org/r/37f6f8ce-12b2-44ee-a94c-f21b29c98821@app.fastmail.com/ Fixes: a2790bf81f0f ("PCI: j721e: Add support to build as a loadable module") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511111705.MZ7ls8Hm-lkp@intel.com/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251117113246.1460644-1-s-vadapalli@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: j721e: Use devm_clk_get_optional_enabled() to get and enable the clockAnand Moon1-15/+5
[ Upstream commit 6fad11c61d0dbf87601ab9e2e37cba7a9a427f7b ] Use devm_clk_get_optional_enabled() helper instead of calling devm_clk_get_optional() and then clk_prepare_enable(). Assign the result of devm_clk_get_optional_enabled() directly to pcie->refclk to avoid using a local 'clk' variable. Signed-off-by: Anand Moon <linux.amoon@gmail.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Link: https://patch.msgid.link/20251028154229.6774-2-linux.amoon@gmail.com Stable-dep-of: 4b361b1e92be ("PCI: j721e: Add config guards for Cadence Host and Endpoint library APIs") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-12PCI: dwc: ep: Flush MSI-X write before unmapping its ATU entryNiklas Cassel1-0/+3
[ Upstream commit c22533c66ccae10511ad6a7afc34bb26c47577e3 ] Endpoint drivers use dw_pcie_ep_raise_msix_irq() to raise an MSI-X interrupt to the host using a writel(), which generates a PCI posted write transaction. There's no completion for posted writes, so the writel() may return before the PCI write completes. dw_pcie_ep_raise_msix_irq() also unmaps the outbound ATU entry used for the PCI write, so the write races with the unmap. If the PCI write loses the race with the ATU unmap, the write may corrupt host memory or cause IOMMU errors, e.g., these when running fio with a larger queue depth against nvmet-pci-epf: arm-smmu-v3 fc900000.iommu: 0x0000010000000010 arm-smmu-v3 fc900000.iommu: 0x0000020000000000 arm-smmu-v3 fc900000.iommu: 0x000000090000f040 arm-smmu-v3 fc900000.iommu: 0x0000000000000000 arm-smmu-v3 fc900000.iommu: event: F_TRANSLATION client: 0000:01:00.0 sid: 0x100 ssid: 0x0 iova: 0x90000f040 ipa: 0x0 arm-smmu-v3 fc900000.iommu: unpriv data write s1 "Input address caused fault" stag: 0x0 Flush the write by performing a readl() of the same address to ensure that the write has reached the destination before the ATU entry is unmapped. The same problem was solved for dw_pcie_ep_raise_msi_irq() in commit 8719c64e76bf ("PCI: dwc: ep: Cache MSI outbound iATU mapping"), but there it was solved by dedicating an outbound iATU only for MSI. We can't do the same for MSI-X because each vector can have a different msg_addr and the msg_addr may be changed while the vector is masked. Fixes: beb4641a787d ("PCI: dwc: Add MSI-X callbacks handler") Signed-off-by: Niklas Cassel <cassel@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260211175540.105677-2-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Fix pci_slot_trylock() error handlingJinhui Guo1-3/+1
[ Upstream commit 9368d1ee62829b08aa31836b3ca003803caf0b72 ] Commit a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") delegates the bridge device's pci_dev_trylock() to pci_bus_trylock() in pci_slot_trylock(), but it forgets to remove the corresponding pci_dev_unlock() when pci_bus_trylock() fails. Before a4e772898f8b, the code did: if (!pci_dev_trylock(dev)) /* <- lock bridge device */ goto unlock; if (dev->subordinate) { if (!pci_bus_trylock(dev->subordinate)) { pci_dev_unlock(dev); /* <- unlock bridge device */ goto unlock; } } After a4e772898f8b the bridge-device lock is no longer taken, but the pci_dev_unlock(dev) on the failure path was left in place, leading to the bug. This yields one of two errors: 1. A warning that the lock is being unlocked when no one holds it. 2. An incorrect unlock of a lock that belongs to another thread. Fix it by removing the now-redundant pci_dev_unlock(dev) on the failure path. [Same patch later posted by Keith at https://patch.msgid.link/20260116184150.3013258-1-kbusch@meta.com] Fixes: a4e772898f8b ("PCI: Add missing bridge lock to pci_bus_lock()") Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251212145528.2555-1-guojinhui.liam@bytedance.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Don't claim disabled bridge windowsIlpo Järvinen1-0/+2
[ Upstream commit 2ecc1bf14e2fdaff78bd1b8e7ed3dba336a3fad5 ] The commit 8278c6914306 ("PCI: Preserve bridge window resource type flags") changed bridge window resource behavior such that flags are no longer zero if the bridge window is not valid or is disabled (mainly to preserve the type flags for later use). If a bridge window has its limit smaller than base address, pci_read_bridge_*() sets both IORESOURCE_UNSET and IORESOURCE_DISABLED to indicate the bridge window exists but is not valid with the current base and limit configuration. The code in pci_claim_bridge_resources() still depends on the old behavior of checking validity of the bridge window solely based on !r->flags, whereas after 8278c6914306, also IORESOURCE_DISABLED may indicate bridge window addresses are not valid. While pci_claim_resource() does check IORESOURCE_UNSET, pci_claim_bridge_resource() attempts to clip the resource if pci_claim_resource() fails, which is not correct for bridge window resources that are not valid. As pci_bus_clip_resource() performs clipping regardless of flags and then clears IORESOURCE_UNSET, it should not be called unless the resource is valid. The problem is visible in this log: pci 0000:20:00.0: PCI bridge to [bus 21] pci 0000:20:00.0: bridge window [io size 0x0000 disabled]: can't claim; no address assigned pci 0000:20:00.0: [io 0x0000-0xffffffffffffffff disabled] clipped to [io 0x0000-0xffff disabled] Add IORESOURCE_DISABLED check in pci_claim_bridge_resources() to only claim bridge windows that appear to have a valid configuration. Fixes: 8278c6914306 ("PCI: Preserve bridge window resource type flags") Reported-by: Sizhe Liu <liusizhe5@huawei.com> Link: https://lore.kernel.org/all/20260203023545.2753811-1-liusizhe5@huawei.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/4d9228d6-a230-6ddf-e300-fbf42d523863@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: dwc: Fix msg_atu_index assignmentNiklas Cassel1-1/+8
[ Upstream commit 58fbf08935d9c4396417e5887df89a4e681fa7e3 ] When dw_pcie_iatu_setup() configures outbound address translation for both type PCIE_ATU_TYPE_MEM and PCIE_ATU_TYPE_IO, the iATU index to use is incremented before calling dw_pcie_prog_outbound_atu(). However for msg_atu_index, the index is not incremented before use, causing the iATU index to be the same as the last configured iATU index, which means that it will incorrectly use the same iATU index that is already in use, breaking outbound address translation. In total there are three problems with this code: -It assigns msg_atu_index the same index that was used for the last outbound address translation window, rather than incrementing the index before assignment. -The index should only be incremented (and msg_atu_index assigned) if the use_atu_msg feature is actually requested/in use (pp->use_atu_msg is set). -If the use_atu_msg feature is requested/in use, and there are no outbound iATUs available, the code should return an error, as otherwise when this this feature is used, it will use an iATU index that is out of bounds. Fixes: e1a4ec1a9520 ("PCI: dwc: Add generic MSG TLP support for sending PME_Turn_Off when system suspend") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Maciej W. Rozycki <macro@orcam.me.uk> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hans Zhang <zhanghuabing@ecosda.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260127151038.1484881-6-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/IOV: Fix race between SR-IOV enable/disable and hotplugNiklas Schnelle1-0/+4
[ Upstream commit a5338e365c4559d7b4d7356116b0eb95b12e08d5 ] Commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") tried to fix a race between the VF removal inside sriov_del_vfs() and concurrent hot unplug by taking the PCI rescan/remove lock in sriov_del_vfs(). Similarly the PCI rescan/remove lock was also taken in sriov_add_vfs() to protect addition of VFs. This approach however causes deadlock on trying to remove PFs with SR-IOV enabled because PFs disable SR-IOV during removal and this removal happens under the PCI rescan/remove lock. So the original fix had to be reverted. Instead of taking the PCI rescan/remove lock in sriov_add_vfs() and sriov_del_vfs(), fix the race that occurs with SR-IOV enable and disable vs hotplug higher up in the callchain by taking the lock in sriov_numvfs_store() before calling into the driver's sriov_configure() callback. Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") Reported-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251216-revert_sriov_lock-v3-2-dac4925a7621@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV"Niklas Schnelle1-5/+0
[ Upstream commit 2fa119c0e5e528453ebae9e70740e8d2d8c0ed5a ] This reverts commit 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV"), which causes a deadlock by recursively taking pci_rescan_remove_lock when sriov_del_vfs() is called as part of pci_stop_and_remove_bus_device(). For example with the following sequence of commands: $ echo <NUM> > /sys/bus/pci/devices/<pf>/sriov_numvfs $ echo 1 > /sys/bus/pci/devices/<pf>/remove A trimmed trace of the deadlock on a mlx5 device is as below: zsh/5715 is trying to acquire lock: 000002597926ef50 (pci_rescan_remove_lock){+.+.}-{3:3}, at: sriov_disable+0x34/0x140 but task is already holding lock: 000002597926ef50 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pci_stop_and_remove_bus_device_locked+0x24/0x80 ... Call Trace: [<00000259778c4f90>] dump_stack_lvl+0xc0/0x110 [<00000259779c844e>] print_deadlock_bug+0x31e/0x330 [<00000259779c1908>] __lock_acquire+0x16c8/0x32f0 [<00000259779bffac>] lock_acquire+0x14c/0x350 [<00000259789643a6>] __mutex_lock_common+0xe6/0x1520 [<000002597896413c>] mutex_lock_nested+0x3c/0x50 [<00000259784a07e4>] sriov_disable+0x34/0x140 [<00000258f7d6dd80>] mlx5_sriov_disable+0x50/0x80 [mlx5_core] [<00000258f7d5745e>] remove_one+0x5e/0xf0 [mlx5_core] [<00000259784857fc>] pci_device_remove+0x3c/0xa0 [<000002597851012e>] device_release_driver_internal+0x18e/0x280 [<000002597847ae22>] pci_stop_bus_device+0x82/0xa0 [<000002597847afce>] pci_stop_and_remove_bus_device_locked+0x5e/0x80 [<00000259784972c2>] remove_store+0x72/0x90 [<0000025977e6661a>] kernfs_fop_write_iter+0x15a/0x200 [<0000025977d7241c>] vfs_write+0x24c/0x300 [<0000025977d72696>] ksys_write+0x86/0x110 [<000002597895b61c>] __do_syscall+0x14c/0x400 [<000002597896e0ee>] system_call+0x6e/0x90 This alone is not a complete fix as it restores the issue the cited commit tried to solve. A new fix will be provided as a follow on. Fixes: 05703271c3cd ("PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV") Reported-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Acked-by: Gerd Bayer <gbayer@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251216-revert_sriov_lock-v3-1-dac4925a7621@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Fix bridge window alignment with optional resourcesIlpo Järvinen1-4/+6
[ Upstream commit 7e90360e6d4599795b6f4e094e20d0bdf3b2615f ] pbus_size_mem() has two alignments, one for required resources in min_align and another in add_align that takes account optional resources. The add_align is applied to the bridge window through the realloc_head list. It can happen, however, that add_align is larger than min_align but calculated size1 and size0 are equal due to extra tailroom (e.g., hotplug reservation, tail alignment), and therefore no entry is created to the realloc_head list. Without the bridge appearing in the realloc head, add_align is lost when pbus_size_mem() returns. The problem is visible in this log for 0000:05:00.0 which lacks add_size ... add_align ... line that would indicate it was added into the realloc_head list: pci 0000:05:00.0: PCI bridge to [bus 06-16] ... pci 0000:06:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 07] requires relaxed alignment rules pci 0000:06:06.0: bridge window [mem 0x00100000-0x001fffff] to [bus 0a] requires relaxed alignment rules pci 0000:06:07.0: bridge window [mem 0x00100000-0x003fffff] to [bus 0b] requires relaxed alignment rules pci 0000:06:08.0: bridge window [mem 0x00800000-0x00ffffff 64bit pref] to [bus 0c-14] requires relaxed alignment rules pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] requires relaxed alignment rules pci 0000:06:08.0: bridge window [mem 0x01000000-0x057fffff] to [bus 0c-14] add_size 100000 add_align 1000000 pci 0000:06:0c.0: bridge window [mem 0x00100000-0x001fffff] to [bus 15] requires relaxed alignment rules pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules pci 0000:06:0d.0: bridge window [mem 0x00100000-0x001fffff] to [bus 16] requires relaxed alignment rules pci 0000:05:00.0: bridge window [mem 0xd4800000-0xd97fffff]: assigned pci 0000:05:00.0: bridge window [mem 0x1060000000-0x10607fffff 64bit pref]: assigned pci 0000:06:08.0: bridge window [mem size 0x04900000]: can't assign; no space pci 0000:06:08.0: bridge window [mem size 0x04900000]: failed to assign While this bug itself seems old, it has likely become more visible after the relaxed tail alignment that does not grossly overestimate the size needed for the bridge window. Make sure add_align > min_align too results in adding an entry into the realloc head list. In addition, add handling to the cases where add_size is zero while only alignment differs. Fixes: d74b9027a4da ("PCI: Consider additional PF's IOV BAR alignment in sizing and assigning") Reported-by: Malte Schröder <malte+lkml@tnxip.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Malte Schröder <malte+lkml@tnxip.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251219174036.16738-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: endpoint: Fix swapped parameters in ↵Manikanta Maddireddy1-4/+4
pci_{primary/secondary}_epc_epf_unlink() functions [ Upstream commit 8754dd7639ab0fd68c3ab9d91c7bdecc3e5740a8 ] struct configfs_item_operations callbacks are defined like the following: int (*allow_link)(struct config_item *src, struct config_item *target); void (*drop_link)(struct config_item *src, struct config_item *target); While pci_primary_epc_epf_link() and pci_secondary_epc_epf_link() specify the parameters in the correct order, pci_primary_epc_epf_unlink() and pci_secondary_epc_epf_unlink() specify the parameters in the wrong order, leading to the below kernel crash when using the unlink command in configfs: Unable to handle kernel paging request at virtual address 0000000300000857 Mem abort info: ... pc : string+0x54/0x14c lr : vsnprintf+0x280/0x6e8 ... string+0x54/0x14c vsnprintf+0x280/0x6e8 vprintk_default+0x38/0x4c vprintk+0xc4/0xe0 pci_epf_unbind+0xdc/0x108 configfs_unlink+0xe0/0x208+0x44/0x74 vfs_unlink+0x120/0x29c __arm64_sys_unlinkat+0x3c/0x90 invoke_syscall+0x48/0x134 do_el0_svc+0x1c/0x30prop.0+0xd0/0xf0 Fixes: e85a2d783762 ("PCI: endpoint: Add support in configfs to associate two EPCs with EPF") Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [mani: cced stable, changed commit message as per https://lore.kernel.org/linux-pci/aV9joi3jF1R6ca02@ryzen] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260108062747.1870669-1-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/PM: Prevent runtime suspend until devices are fully initializedBrian Norris2-1/+15
[ Upstream commit 51c0996dadaea20d73eb0495aeda9cb0422243e8 ] Previously, it was possible for a PCI device to be runtime-suspended before it was fully initialized. When that happened, the suspend process could save invalid device state, for example, before BAR assignment. Restoring the invalid state during resume may leave the device non-functional. Prevent runtime suspend for PCI devices until they are fully initialized by deferring pm_runtime_enable(). More details on how exactly this may occur: 1. PCI device is created by pci_scan_slot() or similar 2. As part of pci_scan_slot(), pci_pm_init() puts the device in D0 and prevents runtime suspend prevented via pm_runtime_forbid() 3. pci_device_add() adds the underlying 'struct device' via device_add(), which means user space can allow runtime suspend, e.g., echo auto > /sys/bus/pci/devices/.../power/control 4. PCI device receives BAR configuration (pci_assign_unassigned_bus_resources(), etc.) 5. pci_bus_add_device() applies final fixups, saves device state, and tries to attach a driver The device may potentially be suspended between #3 and #5, so this is racy with user space (udev or similar). Many PCI devices are enumerated at subsys_initcall time and so will not race with user space, but devices created later by hotplug or modular pwrctrl or host controller drivers are susceptible to this race. More runtime PM details at the first Link: below. Link: https://lore.kernel.org/all/0e35a4e1-894a-47c1-9528-fc5ffbafd9e2@samsung.com/ Signed-off-by: Brian Norris <briannorris@chromium.org> [bhelgaas: update comments per https://lore.kernel.org/r/CAJZ5v0iBNOmMtqfqEbrYyuK2u+2J2+zZ-iQd1FvyCPjdvU2TJg@mail.gmail.com] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260122094815.v5.1.I60a53c170a8596661883bd2b4ef475155c7aa72b@changeid Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: dwc: Skip waiting for L2/L3 Ready if dw_pcie_rp::skip_l23_wait is trueRichard Zhu3-0/+16
[ Upstream commit 58a17b2647ba5aac47e3ffafd0a9b92bf4a9bcbe ] In NXP i.MX6QP and i.MX7D SoCs, LTSSM registers are not accessible once PME_Turn_Off message is broadcasted to the link. So there is no way to verify whether the link has entered L2/L3 Ready state or not. Hence, add a new flag 'dw_pcie_rp::skip_l23_ready' and set it to 'true' for the above mentioned SoCs. This flag when set, will allow the DWC core to skip polling for L2/L3 Ready state and just wait for 10ms as recommended in the PCIe spec r6.0, sec 5.3.3.2.1. Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: renamed flag to skip_l23_ready and reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260114083300.3689672-2-hongxing.zhu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Use resource_set_range() that correctly sets ->endIlpo Järvinen1-4/+2
[ Upstream commit 11721c45a8266a9d0c9684153d20e37159465f96 ] __pci_read_base() sets resource start and end addresses when resource is larger than 4G but pci_bus_addr_t or resource_size_t are not capable of representing 64-bit PCI addresses. This creates a problematic resource that has non-zero flags but the start and end addresses do not yield to resource size of 0 but 1. Replace custom resource addresses setup with resource_set_range() that correctly sets end address as -1 which results in resource_size() returning 0. For consistency, also use resource_set_range() in the other branch that does size based resource setup. Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB") Link: https://lore.kernel.org/all/20251207215359.28895-1-ansuelsmth@gmail.com/T/#m990492684913c5a158ff0e5fc90697d8ad95351b Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: stable@vger.kernel.org Cc: Christian Marangi <ansuelsmth@gmail.com> Link: https://patch.msgid.link/20251208145654.5294-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: dwc: Don't wait for link up if driver can detect Link Up event"Niklas Cassel2-9/+2
[ Upstream commit 142d5869f6eec3110adda0ad2d931f5b3c22371d ] This reverts commit 8d3bf19f1b585a3cc0027f508b64c33484db8d0d. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. So revert the change that skipped dw_pcie_wait_for_link() if the Link up IRQ was used by a vendor glue driver. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-14-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: qcom: Enumerate endpoints based on Link up event in ↵Niklas Cassel1-57/+1
'global_irq' interrupt" [ Upstream commit 9a9793b55854422652ea92625e48277c4651c0fd ] This reverts commit 4581403f67929d02c197cb187c4e1e811c9e762a. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-13-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: qcom: Enable MSI interrupts together with Link up if 'Global ↵Niklas Cassel1-3/+1
IRQ' is supported" [ Upstream commit 7ebdefb87942073679e56cfbc5a72e8fc5441bfc ] This reverts commit ba4a2e2317b9faeca9193ed6d3193ddc3cf2aba3. Since the Link up IRQ support is going away, revert the MSI logic that got added for it too. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> [mani: reworded the description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-12-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: qcom: Don't wait for link if we can detect Link Up"Niklas Cassel1-4/+1
[ Upstream commit e9ce5b3804436301ab343bc14203a4c14b336d1b ] This reverts commit 36971d6c5a9a134c15760ae9fd13c6d5f9a36abb. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-11-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04Revert "PCI: dw-rockchip: Don't wait for link since we can detect Link Up"Niklas Cassel1-1/+0
[ Upstream commit fc6298086bfacaa7003b0bd1da4e4f42b29f7d77 ] This reverts commit ec9fd499b9c60a187ac8d6414c3c343c77d32e42. While this fake hotplugging was a nice idea, it has shown that this feature does not handle PCIe switches correctly: pci_bus 0004:43: busn_res: can not insert [bus 43-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:43: busn_res: [bus 43-41] end is updated to 43 pci_bus 0004:43: busn_res: can not insert [bus 43] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:00.0: devices behind bridge are unusable because [bus 43] cannot be assigned for them pci_bus 0004:44: busn_res: can not insert [bus 44-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:44: busn_res: [bus 44-41] end is updated to 44 pci_bus 0004:44: busn_res: can not insert [bus 44] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:02.0: devices behind bridge are unusable because [bus 44] cannot be assigned for them pci_bus 0004:45: busn_res: can not insert [bus 45-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:45: busn_res: [bus 45-41] end is updated to 45 pci_bus 0004:45: busn_res: can not insert [bus 45] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:06.0: devices behind bridge are unusable because [bus 45] cannot be assigned for them pci_bus 0004:46: busn_res: can not insert [bus 46-41] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci_bus 0004:46: busn_res: [bus 46-41] end is updated to 46 pci_bus 0004:46: busn_res: can not insert [bus 46] under [bus 42-41] (conflicts with (null) [bus 42-41]) pci 0004:42:0e.0: devices behind bridge are unusable because [bus 46] cannot be assigned for them pci_bus 0004:42: busn_res: [bus 42-41] end is updated to 46 pci_bus 0004:42: busn_res: can not insert [bus 42-46] under [bus 41] (conflicts with (null) [bus 41]) pci 0004:41:00.0: devices behind bridge are unusable because [bus 42-46] cannot be assigned for them pcieport 0004:40:00.0: bridge has subordinate 41 but max busn 46 During the initial scan, PCI core doesn't see the switch and since the Root Port is not hot plug capable, the secondary bus number gets assigned as the subordinate bus number. This means, the PCI core assumes that only one bus will appear behind the Root Port since the Root Port is not hot plug capable. This works perfectly fine for PCIe endpoints connected to the Root Port, since they don't extend the bus. However, if a PCIe switch is connected, then there is a problem when the downstream busses starts showing up and the PCI core doesn't extend the subordinate bus number and bridge resources after initial scan during boot. The long term plan is to migrate this driver to the upcoming pwrctrl APIs that are supposed to handle this problem elegantly. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Shawn Lin <shawn.lin@rock-chips.com> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251222064207.3246632-9-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/bwctrl: Disable BW controller on Intel P45 using a quirkIlpo Järvinen2-0/+13
[ Upstream commit 46a9f70e93ef73860d1dbbec75ef840031f8f30a ] The commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller") was found to lead to a boot hang on a Intel P45 system. Testing without setting Link Bandwidth Management Interrupt Enable (LBMIE) and Link Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe r7.0, sec 7.5.3.7) in bwctrl allowed system to come up. P45 is a very old chipset and supports only up to gen2 PCIe, so not having bwctrl does not seem a huge deficiency. Add no_bw_notif in struct pci_dev and quirk Intel P45 Root Port with it. Reported-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://lore.kernel.org/linux-pci/aUCt1tHhm_-XIVvi@eggsbenedict/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Adam Stylinski <kungfujesus06@gmail.com> Link: https://patch.msgid.link/20260116131513.2359-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Mark Nvidia GB10 to avoid bus resetJohnny-CC Chang1-0/+8
[ Upstream commit c81a2ce6b6a844d1a57d2a69833a9d0f00403f00 ] After asserting Secondary Bus Reset to downstream devices via a GB10 Root Port, the link may not retrain correctly, e.g., the link may retrain with a lower lane count or config accesses to downstream devices may fail. Prevent use of Secondary Bus Reset for devices below GB10. Signed-off-by: Johnny-CC Chang <Johnny-CC.Chang@mediatek.com> [bhelgaas: drop pci_ids.h update (only used once), update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20251113084441.2124737-1-Johnny-CC.Chang@mediatek.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Add ACS quirk for Qualcomm Hamoa & GlymurKrishna Chaitanya Chundru1-0/+4
[ Upstream commit 44d2f70b1fd72c339c72983fcffa181beae3e113 ] The Qualcomm Hamoa & Glymur Root Ports don't advertise an ACS capability, but they do provide ACS-like features to disable peer transactions and validate bus numbers in requests. Add an ACS quirk for Hamoa & Glymur. Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260109-acs_quirk-v1-1-82adf95a89ae@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Enable ACS after configuring IOMMU for OF platformsManivannan Sadhasivam3-9/+10
[ Upstream commit c41e2fb67e26b04d919257875fa954aa5f6e392e ] Platform, ACPI, or IOMMU drivers call pci_request_acs(), which sets 'pci_acs_enable' to request that ACS be enabled for any devices enumerated in the future. OF platforms called pci_enable_acs() for the first device before of_iommu_configure() called pci_request_acs(), so ACS was never enabled for that device (typically a Root Port). Call pci_enable_acs() later, from pci_dma_configure(), after of_dma_configure() has had a chance to call pci_request_acs(). Here's the call path, showing the move of pci_enable_acs() from pci_acs_init() to pci_dma_configure(), where it always happens after pci_request_acs(): pci_device_add pci_init_capabilities pci_acs_init - pci_enable_acs - if (pci_acs_enable) <-- previous test - ... device_add bus_notify(BUS_NOTIFY_ADD_DEVICE) iommu_bus_notifier iommu_probe_device iommu_init_device dev->bus->dma_configure pci_dma_configure # pci_bus_type.dma_configure of_dma_configure of_iommu_configure pci_request_acs pci_acs_enable = 1 <-- set + pci_enable_acs + if (pci_acs_enable) <-- new test + ... bus_probe_device device_initial_probe ... really_probe dev->bus->dma_configure pci_dma_configure # pci_bus_type.dma_configure ... pci_enable_acs Note that we will now call pci_enable_acs() twice for every device, first from the iommu_probe_device() path and again from the really_probe() path. Presumably that's not an issue since we also call dev->bus->dma_configure() twice. For the ACPI platforms, pci_request_acs() is called during ACPI initialization time itself, independent of the IOMMU framework. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://patch.msgid.link/20260102-pci_acs-v3-1-72280b94d288@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Fix pci_slot_lock () device lockingKeith Busch1-6/+17
[ Upstream commit 1f5e57c622b4dc9b8e7d291d560138d92cfbe5bf ] Like pci_bus_lock(), pci_slot_lock() needs to lock the bridge device to prevent warnings like: pcieport 0000:e2:05.0: unlocked secondary bus reset via: pciehp_reset_slot+0x55/0xa0 Take and release the lock for the bridge providing the slot for the lock/trylock and unlock routines. Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20260130165953.751063-3-kbusch@meta.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/AER: Clear stale errors on reporting agents upon probeLukas Wunner1-1/+25
[ Upstream commit e242d09b58e869f86071b7889acace4cff215935 ] Correctable and Uncorrectable Error Status Registers on reporting agents are cleared upon PCI device enumeration in pci_aer_init() to flush past events. They're cleared again when an error is handled by the AER driver. If an agent reports a new error after pci_aer_init() and before the AER driver has probed on the corresponding Root Port or Root Complex Event Collector, that error is not handled by the AER driver: It clears the Root Error Status Register on probe, but neglects to re-clear the Correctable and Uncorrectable Error Status Registers on reporting agents. The error will eventually be reported when another error occurs. Which is irritating because to an end user it appears as if the earlier error has just happened. Amend the AER driver to clear stale errors on reporting agents upon probe. Skip reporting agents which have not invoked pci_aer_init() yet to avoid using an uninitialized pdev->aer_cap. They're recognizable by the error bits in the Device Control register still being clear. Reporting agents may execute pci_aer_init() after the AER driver has probed, particularly when devices are hotplugged or removed/rescanned via sysfs. For this reason, it continues to be necessary that pci_aer_init() clears Correctable and Uncorrectable Error Status Registers. Reported-by: Lucas Van <lucas.van@intel.com> # off-list Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Lucas Van <lucas.van@intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/3011c2ed30c11f858e35e29939add754adea7478.1769332702.git.lukas@wunner.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: Mark ASM1164 SATA controller to avoid bus resetAlex Williamson1-0/+10
[ Upstream commit beb2f81792a8a619e5122b6b24a374861309c54b ] User forums report issues when assigning ASM1164 SATA controllers to VMs, especially in configurations with multiple controllers. Logs show the device fails to retrain after bus reset. Reports suggest this is an issue across multiple platforms. The device indicates support for PM reset, therefore the device still has a viable function level reset mechanism. The reporting user confirms the device is well behaved in this use case with bus reset disabled. Reported-by: Patrick Bianchi <patrick.w.bianchi@gmail.com> Link: https://forum.proxmox.com/threads/problems-with-pcie-passthrough-with-two-identical-devices.149003/ Signed-off-by: Alex Williamson <alex.williamson@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260109000211.398300-1-alex.williamson@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: imx6: Add CLKREQ# override to enable REFCLK for i.MX95 PCIeRichard Zhu1-0/+20
[ Upstream commit 27a064aba2da6bc58fc36a6b8e889187ae3bf89d ] The CLKREQ# is an open drain, active low signal that is driven low by the card to request reference clock. It's an optional signal added in PCIe CEM r4.0, sec 2. Thus, this signal wouldn't be driven low if it's not exposed on the slot. On the i.MX95 EVK board, REFCLK to the host and endpoint is gated by this CLKREQ# signal. So if the CLKREQ# signal is not driven by the endpoint, it will gate the REFCLK to host too, leading to operational failure. Hence, enable the REFCLK on this SoC by enabling the CLKREQ# override using imx95_pcie_clkreq_override() helper during probe. This override should only be cleared when the CLKREQ# signal is exposed on the slot. Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20251015030428.2980427-11-hongxing.zhu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: dw-rockchip: Disable BAR 0 and BAR 1 for Root PortShawn Lin1-0/+8
[ Upstream commit b5d712e5b87fc56ff838684afb1bae359eb8069f ] Some Rockchip PCIe Root Ports report bogus size of 1GiB for the BAR memories and they cause below resource allocation issue during probe. pci 0000:00:00.0: [1d87:3588] type 01 class 0x060400 PCIe Root Port pci 0000:00:00.0: BAR 0 [mem 0x00000000-0x3fffffff] pci 0000:00:00.0: BAR 1 [mem 0x00000000-0x3fffffff] pci 0000:00:00.0: ROM [mem 0x00000000-0x0000ffff pref] ... pci 0000:00:00.0: BAR 0 [mem 0x900000000-0x93fffffff]: assigned pci 0000:00:00.0: BAR 1 [mem size 0x40000000]: can't assign; no space pci 0000:00:00.0: BAR 1 [mem size 0x40000000]: failed to assign pci 0000:00:00.0: ROM [mem 0xf0200000-0xf020ffff pref]: assigned pci 0000:00:00.0: BAR 0 [mem 0x900000000-0x93fffffff]: releasing pci 0000:00:00.0: ROM [mem 0xf0200000-0xf020ffff pref]: releasing pci 0000:00:00.0: BAR 0 [mem 0x900000000-0x93fffffff]: assigned pci 0000:00:00.0: BAR 1 [mem size 0x40000000]: can't assign; no space pci 0000:00:00.0: BAR 1 [mem size 0x40000000]: failed to assign Since there is no use of the Root Port BAR memories, disable both of them. Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [mani: reworded the description and comment] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/1766570461-138256-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI: dwc: Skip PME_Turn_Off broadcast and L2/L3 transition during suspend if ↵Manivannan Sadhasivam1-1/+5
link is not up [ Upstream commit cfd2fdfd0a8da2e5bbfdc4009b9c4b8bf164c937 ] During system suspend, if the PCIe link is not up, then there is no need to broadcast PME_Turn_Off message and wait for L2/L3 transition. So skip them. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://patch.msgid.link/20251218-pci-dwc-suspend-rework-v2-1-5a7778c6094a@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04PCI/MSI: Unmap MSI-X region on errorHaoxiang Li1-1/+3
[ Upstream commit 1a8d4c6ecb4c81261bcdf13556abd4a958eca202 ] msix_capability_init() fails to unmap the MSI-X region if msix_setup_interrupts() fails. Add the missing iounmap() for that error path. [ tglx: Massaged change log ] Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Thomas Gleixner <tglx@kernel.org> Link: https://patch.msgid.link/20260125144452.2103812-1-lihaoxiang@isrc.iscas.ac.cn Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Validate window resource type in pbus_select_window_for_type()Kai-Heng Feng1-3/+10
[ Upstream commit e5f72cb9cea599dc9f5a9b80a33560a1d06f01cc ] After ebe091ad81e1 ("PCI: Use pbus_select_window_for_type() during IO window sizing") and ae88d0b9c57f ("PCI: Use pbus_select_window_for_type() during mem window sizing"), many bridge windows can't get resources assigned: pci 0006:05:00.0: bridge window [??? 0x00001000-0x00001fff flags 0x20080000]: can't assign; no space pci 0006:05:00.0: bridge window [??? 0x00001000-0x00001fff flags 0x20080000]: failed to assign Those commits replace find_bus_resource_of_type() with pbus_select_window_for_type(), and the latter lacks resource type validation. Add the resource type validation back to pbus_select_window_for_type() to match the original behavior. Fixes: 74afce3dfcba ("PCI: Add bridge window selection functions") Link: https://bugzilla.kernel.org/show_bug.cgi?id=221072 Signed-off-by: Kai-Heng Feng <kaihengf@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://patch.msgid.link/20260210142058.82701-1-kaihengf@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Add ACS quirk for Pericom PI7C9X2G404 switches [12d8:b404]Nicolas Cavallari1-0/+4
[ Upstream commit 5907a90551e9f7968781f3a6ab8684458959beb3 ] 12d8:b404 is apparently another PCI ID for Pericom PI7C9X2G404 (as identified by the chip silkscreen and lspci). It is also affected by the PI7C9X2G errata (e.g. a network card attached to it fails under load when P2P Redirect Request is enabled), so apply the same quirk to this PCI ID too. PCI bridge [0604]: Pericom Semiconductor PI7C9X2G404 EV/SV PCIe2 4-Port/4-Lane Packet Switch [12d8:b404] (rev 01) Fixes: acd61ffb2f16 ("PCI: Add ACS quirk for Pericom PI7C9X2G switches") Closes: https://lore.kernel.org/all/a1d926f0-4cb5-4877-a4df-617902648d80@green-communications.fr/ Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260119160915.26456-1-nicolas.cavallari@green-communications.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI/ACPI: Restrict program_hpx_type2() to AER bitsHåkon Bugge3-38/+27
[ Upstream commit 9abf79c8d7b40db0e5a34aa8c744ea60ff9a3fcf ] Previously program_hpx_type2() applied PCIe settings unconditionally, which could incorrectly change bits like Extended Tag Field Enable and Enable Relaxed Ordering. When _HPX was added to ACPI r3.0, the intent of the PCIe Setting Record (Type 2) in sec 6.2.7.3 was to configure AER registers when the OS does not own the AER Capability: The PCI Express setting record contains ... [the AER] Uncorrectable Error Mask, Uncorrectable Error Severity, Correctable Error Mask ... to be used when configuring registers in the Advanced Error Reporting Extended Capability Structure ... OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if OSPM is not controlling the PCI Express Advanced Error Reporting capability. ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers in the PCIe Capability with AER-related bits, and the restriction that the OS use this only when it owns PCIe native hotplug: ... when configuring PCI Express registers in the Advanced Error Reporting Extended Capability Structure *or PCI Express Capability Structure* ... An OS that has assumed ownership of native hot plug but does not ... have ownership of the AER register set must use ... the Type 2 record to program the AER registers ... However, since the Type 2 record also includes register bits that have functions other than AER, the OS must ignore values ... that are not applicable. Restrict program_hpx_type2() to only the intended purpose: - Apply settings only when OS owns PCIe native hotplug but not AER, - Only touch the AER-related bits (Error Reporting Enables) in Device Control - Don't touch Link Control at all, since nothing there seems AER-related, but log _HPX settings for debugging purposes Note that Read Completion Boundary is now configured elsewhere, since it is unrelated to _HPX. [1] Operating System-directed configuration and Power Management Fixes: 40abb96c51bb ("[PATCH] pciehp: Fix programming hotplug parameters") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260129175237.727059-3-haakon.bugge@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Initialize RCB from pci_configure_device()Håkon Bugge1-0/+32
[ Upstream commit 1a6845aaa6de81f95959b380b45de8f10d6a8502 ] Commit e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") worked around a bogus _HPX type 2 record, which caused program_hpx_type2() to set the RCB in an endpoint even though the Root Port did not have the RCB bit set. e42010d8207f fixed that by setting the RCB in the endpoint only when it was set in the Root Port. In retrospect, program_hpx_type2() is intended for AER-related settings, and the RCB should be configured elsewhere so it doesn't depend on the presence or contents of an _HPX record. Explicitly program the RCB from pci_configure_device() so it matches the Root Port's RCB. The Root Port may not be visible to virtualized guests; in that case, leave RCB alone. Fixes: e42010d8207f ("PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)") Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260129175237.727059-2-haakon.bugge@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Check parent for NULL in of_pci_bus_release_domain_nr()Sergey Shtylyov1-1/+1
[ Upstream commit f7245901de8978d829f80b3d8e36ed9a8fd18049 ] of_pci_bus_find_domain_nr() allows its parent parameter to be NULL but of_pci_bus_release_domain_nr() (that undoes its effect) doesn't -- that means it's going to blow up while calling of_get_pci_domain_nr() if the parent parameter indeed happens to be NULL. Add the missing NULL check. Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") Signed-off-by: Sergey Shtylyov <s.shtylyov@auroraos.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260127203944.28588-1-s.shtylyov@auroraos.dev Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Remove old_size limit from bridge window sizingIlpo Järvinen1-8/+3
[ Upstream commit f909e3ee3ed1a44202f09ac7e637a0f9ec372225 ] calculate_memsize() applies lower bound to the resource size before aligning the resource size making it impossible to shrink bridge window resources. I've not found any justification for this lower bound and nothing indicated it was to work around some HW issue. Prior to the commit 3baeae36039a ("PCI: Use pci_release_resource() instead of release_resource()"), releasing a bridge window during BAR resize resulted in clearing start and end address of the resource. Clearing addresses destroys the resource size as a side-effect, therefore nullifying the effect of the old size lower bound. After the commit 3baeae36039a ("PCI: Use pci_release_resource() instead of release_resource()"), BAR resize uses the aligned old size, which results in exceeding what fits into the parent window in some cases: xe 0030:03:00.0: [drm] Attempting to resize bar from 256MiB -> 16384MiB xe 0030:03:00.0: BAR 0 [mem 0x620c000000000-0x620c000ffffff 64bit]: releasing xe 0030:03:00.0: BAR 2 [mem 0x6200000000000-0x620000fffffff 64bit pref]: releasing pci 0030:02:01.0: bridge window [mem 0x6200000000000-0x620001fffffff 64bit pref]: releasing pci 0030:01:00.0: bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref]: releasing pci 0030:00:00.0: bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref]: was not released (still contains assigned resources) pci 0030:00:00.0: Assigned bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref] to [bus 01-04] free space at [mem 0x6200400000000-0x62007ffffffff 64bit pref] pci 0030:00:00.0: Assigned bridge window [mem 0x6200000000000-0x6203fbff0ffff 64bit pref] to [bus 01-04] cannot fit 0x4000000000 required for 0030:01:00.0 bridging to [bus 02-04] The old size of 0x6200000000000-0x6203fbff0ffff resource was used as the lower bound which results in 0x4000000000 size request due to alignment. That exceeds what can fit into the parent window. Since the lower bound never even was enforced fully because the resource addresses were cleared when the bridge window is released, remove the old_size lower bound entirely and trust the calculated bridge window size is enough. This same problem may occur on io window side but seems less likely to cause issues due to general difference in alignment. Removing the lower bound may have other unforeseen consequences in case of io window so it's better to leave it as -next material if no problem is reported related to io window sizing (BAR resize shouldn't touch io windows anyway). Fixes: 3baeae36039a ("PCI: Use pci_release_resource() instead of release_resource()") Reported-by: Simon Richter <Simon.Richter@hogyros.de> Link: https://lore.kernel.org/r/f9a8c975-f5d3-4dd2-988e-4371a1433a60@hogyros.de/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251219174036.16738-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Stop over-estimating bridge window sizeIlpo Järvinen1-92/+5
[ Upstream commit 3958bf16e2fe1b1c95467e58694102122c951a31 ] New way to calculate the bridge window head alignment produces tight-fit, that is, it does not leave any gaps between the resources. Similarly, relaxed tail alignment does not leave extra tail room. Start to use bridge window calculation that does not over-estimate the size of the required window. pbus_upstream_space_available() can be removed. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Malte Schröder <malte+lkml@tnxip.de> Link: https://patch.msgid.link/20251219174036.16738-4-ilpo.jarvinen@linux.intel.com Stable-dep-of: f909e3ee3ed1 ("PCI: Remove old_size limit from bridge window sizing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Rewrite bridge window head alignment functionIlpo Järvinen1-9/+44
[ Upstream commit bc75c8e5071120e919beb39e69f0979cccfdf219 ] The calculation of bridge window head alignment is done by calculate_mem_align() [*]. With the default bridge window alignment, it is used for both head and tail alignment. The selected head alignment does not always result in tight-fitting resources (gap at d4f00000-d4ffffff): d4800000-dbffffff : PCI Bus 0000:06 d4800000-d48fffff : PCI Bus 0000:07 d4800000-d4803fff : 0000:07:00.0 d4800000-d4803fff : nvme d4900000-d49fffff : PCI Bus 0000:0a d4900000-d490ffff : 0000:0a:00.0 d4900000-d490ffff : r8169 d4910000-d4913fff : 0000:0a:00.0 d4a00000-d4cfffff : PCI Bus 0000:0b d4a00000-d4bfffff : 0000:0b:00.0 d4a00000-d4bfffff : 0000:0b:00.0 d4c00000-d4c07fff : 0000:0b:00.0 d4d00000-d4dfffff : PCI Bus 0000:15 d4d00000-d4d07fff : 0000:15:00.0 d4d00000-d4d07fff : xhci-hcd d4e00000-d4efffff : PCI Bus 0000:16 d4e00000-d4e7ffff : 0000:16:00.0 d4e80000-d4e803ff : 0000:16:00.0 d4e80000-d4e803ff : ahci d5000000-dbffffff : PCI Bus 0000:0c This has not caused problems (for years) with the default bridge window tail alignment that grossly over-estimates the required tail alignment leaving more tail room than necessary. With the introduction of relaxed tail alignment that leaves no extra tail room whatsoever, any gaps will immediately turn into assignment failures. Introduce head alignment calculation that ensures no gaps are left and apply the new approach when using relaxed alignment. We may want to consider using it for the normal alignment eventually, but as the first step, solve only the problem with the relaxed tail alignment. ([*] I don't understand the algorithm in calculate_mem_align().) Link: https://git.kernel.org/history/history/c/5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220775 Reported-by: Malte Schröder <malte+lkml@tnxip.de> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Malte Schröder <malte+lkml@tnxip.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251219174036.16738-3-ilpo.jarvinen@linux.intel.com Stable-dep-of: f909e3ee3ed1 ("PCI: Remove old_size limit from bridge window sizing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI: Mark 3ware-9650SA Root Port Extended Tags as brokenJörg Wedekind1-0/+1
[ Upstream commit 959ac08a2c2811305be8c2779779e8b0932e5a99 ] Per PCIe r7.0, sec 2.2.6.2.1 and 7.5.3.4, a Requester may not use 8-bit Tags unless its Extended Tag Field Enable is set, but all Receivers/Completers must handle 8-bit Tags correctly regardless of their Extended Tag Field Enable. Some devices do not handle 8-bit Tags as Completers, so add a quirk for them. If we find such a device, we disable Extended Tags for the entire hierarchy to make peer-to-peer DMA possible. The 3ware 9650SA seems to have issues with handling 8-bit tags. Mark it as broken. This fixes PCI Parity Errors like : 3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing. Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425 Signed-off-by: Jörg Wedekind <joerg@wedekind.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260119143114.21948-1-joerg@wedekind.de Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI/P2PDMA: Reset page reference count when page mapping failsAlistair Popple1-0/+7
[ Upstream commit 83014d82a1100abc89f7712ad67c3e5accaddc43 ] When mapping a p2pdma page the page reference count is initialised to 1 prior to calling vm_insert_page(). This is to avoid vm_insert_page() warning if the page refcount is zero. Prior to setting the page count there is a check to ensure the page is currently free (ie. has a zero reference count). However vm_insert_page() can fail. In this case the pages are freed back to the genalloc pool, but that does not reset the page refcount. So a future allocation of the same page will see the elevated page refcount from the previous set_page_count() call triggering the VM_WARN_ON_ONCE_PAGE checking that the page is free. Fix this by resetting the page refcount to zero using set_page_count(). Note that put_page() is not used because that would result in freeing the page twice due to implicitly calling p2pdma_folio_free(). Fixes: b7e282378773 ("mm/mm_init: move p2pdma page refcount initialisation to p2pdma") Signed-off-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Balbir Singh <balbirs@nvidia.com> Link: https://patch.msgid.link/20260112005440.998543-1-apopple@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI/PTM: Fix pcie_ptm_create_debugfs() memory leakAadityarangan Shridhar Iyengar1-1/+4
[ Upstream commit 62171369cf17794ddd88f602c2c84d008ecafcff ] In pcie_ptm_create_debugfs(), if devm_kasprintf() fails after successfully allocating ptm_debugfs with kzalloc(), the function returns without freeing the allocated memory, resulting in a memory leak. Free ptm_debugfs before returning in the devm_kasprintf() error path and in pcie_ptm_destroy_debugfs(). Fixes: 132833405e61 ("PCI: Add debugfs support for exposing PTM context") Signed-off-by: Aadityarangan Shridhar Iyengar <adiyenga@cisco.com> [bhelgaas: squash additional fix from Mani: https://lore.kernel.org/r/pdp4xc4d5ee3e547mmdro5riui3mclduqdl7j6iclfbozo2a4c@7m3qdm6yrhuv] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260111163650.33168-1-adiyenga@cisco.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-02-27PCI/portdrv: Fix potential resource leakUwe Kleine-König1-3/+3
[ Upstream commit 01464a3fdf91c041a381d93a1b6fefbdb819a46f ] pcie_port_probe_service() unconditionally calls get_device() (unless it fails). So drop that reference also unconditionally as it's fine for a PCIe driver to not have a remove callback. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/e1c68c3b3f1af8427e98ca5e2c79f8bf0ebe2ce4.1764688034.git.u.kleine-koenig@baylibre.com Signed-off-by: Sasha Levin <sashal@kernel.org>