summaryrefslogtreecommitdiff
path: root/drivers/pci/controller
AgeCommit message (Collapse)AuthorFilesLines
6 daysPCI: brcmstb: Fix disabling L0s capabilityJim Quinlan1-7/+3
[ Upstream commit 9583f9d22991d2cfb5cc59a2552040c4ae98d998 ] caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") set PCI_EXP_LNKCAP_ASPM_L1 and (optionally) PCI_EXP_LNKCAP_ASPM_L0S in PCI_EXP_LNKCAP (aka PCIE_RC_CFG_PRIV1_LINK_CAPABILITY in brcmstb). But instead of using PCI_EXP_LNKCAP_ASPM_L1 and PCI_EXP_LNKCAP_ASPM_L0S directly, it used PCIE_LINK_STATE_L1 and PCIE_LINK_STATE_L0S, which are Linux-created values that only coincidentally matched the PCIe spec. b478e162f227 ("PCI/ASPM: Consolidate link state defines") later changed them so they no longer matched the PCIe spec, so the bits ended up in the wrong place in PCI_EXP_LNKCAP. Use PCI_EXP_LNKCAP_ASPM_L0S to clear L0s support when there's an 'aspm-no-l0s' property. Rely on brcmstb hardware to advertise L0s and/or L1 support otherwise. Fixes: caab002d5069 ("PCI: brcmstb: Disable L0s component of ASPM if requested") Reported-by: Bjorn Helgaas <bhelgaas@google.com> Closes: https://lore.kernel.org/linux-pci/20250925194424.GA2197200@bhelgaas Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> [mani: reworded subject and description, added closes tag and CCed stable] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251003170436.1446030-1-james.quinlan@broadcom.com [ Adjust context in variable declaration ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
6 daysPCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definitionShawn Lin1-1/+1
[ Upstream commit bcc9a4a0bca3aee4303fa4a20302e57b24ac8f68 ] As per DesignWare Cores PCI Express Controller Databook, section 5.50, SII: Debug Signals, cxpl_debug_info[63:0]: [5:0] smlh_ltssm_state: LTSSM current state. Encoding is same as the dedicated smlh_ltssm_state output. The mask should be 6 bits, from 0 to 5. Hence, fix the mask definition. Fixes: 23fe5bd4be90 ("PCI: keystone: Cleanup ks_pcie_link_up()") Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com> [mani: reworded description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/1763122140-203068-1-git-send-email-shawn.lin@rock-chips.com Signed-off-by: Sasha Levin <sashal@kernel.org>
6 daysPCI: keystone: Exit ks_pcie_probe() for invalid modeSiddharth Vadapalli1-0/+2
[ Upstream commit 95d9c3f0e4546eaec0977f3b387549a8463cd49f ] Commit under Fixes introduced support for PCIe EP mode on AM654x platforms. When the mode happens to be either "DW_PCIE_RC_TYPE" or "DW_PCIE_EP_TYPE", the PCIe Controller is configured accordingly. However, when the mode is neither of them, an error message is displayed, but the driver probe succeeds. Since this "invalid" mode is not associated with a functional PCIe Controller, the probe should fail. Fix the behavior by exiting "ks_pcie_probe()" with the return value of "-EINVAL" in addition to displaying the existing error message when the mode is invalid. Fixes: 23284ad677a9 ("PCI: keystone: Add support for PCIe EP in AM654x Platforms") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20251029080547.1253757-4-s-vadapalli@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07PCI: cadence: Check for the existence of cdns_pcie::ops before using itChen Wang3-6/+6
[ Upstream commit 49a6c160ad4812476f8ae1a8f4ed6d15adfa6c09 ] cdns_pcie::ops might not be populated by all the Cadence glue drivers. This is going to be true for the upcoming Sophgo platform which doesn't set the ops. Hence, add a check to prevent NULL pointer dereference. Signed-off-by: Chen Wang <unicorn_wang@outlook.com> [mani: reworded subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/35182ee1d972dfcd093a964e11205efcebbdc044.1757643388.git.unicorn_wang@outlook.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-10-29PCI: rcar: Demote WARN() to dev_warn_ratelimited() in rcar_pcie_wakeup()Marek Vasut1-1/+5
commit c93637e6a4c4e1d0e85ef7efac78d066bbb24d96 upstream. Avoid large backtrace, it is sufficient to warn the user that there has been a link problem. Either the link has failed and the system is in need of maintenance, or the link continues to work and user has been informed. The message from the warning can be looked up in the sources. This makes an actual link issue less verbose. First of all, this controller has a limitation in that the controller driver has to assist the hardware with transition to L1 link state by writing L1IATN to PMCTRL register, the L1 and L0 link state switching is not fully automatic on this controller. In case of an ASMedia ASM1062 PCIe SATA controller which does not support ASPM, on entry to suspend or during platform pm_test, the SATA controller enters D3hot state and the link enters L1 state. If the SATA controller wakes up before rcar_pcie_wakeup() was called and returns to D0, the link returns to L0 before the controller driver even started its transition to L1 link state. At this point, the SATA controller did send an PM_ENTER_L1 DLLP to the PCIe controller and the PCIe controller received it, and the PCIe controller did set PMSR PMEL1RX bit. Once rcar_pcie_wakeup() is called, if the link is already back in L0 state and PMEL1RX bit is set, the controller driver has no way to determine if it should perform the link transition to L1 state, or treat the link as if it is in L0 state. Currently the driver attempts to perform the transition to L1 link state unconditionally, which in this specific case fails with a PMSR L1FAEG poll timeout, however the link still works as it is already back in L0 state. Reduce this warning verbosity. In case the link is really broken, the rcar_pcie_config_access() would fail, otherwise it will succeed and any system with this controller and ASM1062 can suspend without generating a backtrace. Fixes: 84b576146294 ("PCI: rcar: Finish transition to L1 state in rcar_pcie_config_access()") Link: https://lore.kernel.org/linux-pci/20240511235513.77301-1-marek.vasut+renesas@mailbox.org Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: tegra194: Reset BARs when running in PCIe endpoint modeNiklas Cassel2-0/+11
[ Upstream commit 42f9c66a6d0cc45758dab77233c5460e1cf003df ] Tegra already defines all BARs except BAR0 as BAR_RESERVED. This is sufficient for pci-epf-test to not allocate backing memory and to not call set_bar() for those BARs. However, marking a BAR as BAR_RESERVED does not mean that the BAR gets disabled. The host side driver, pci_endpoint_test, simply does an ioremap for all enabled BARs and will run tests against all enabled BARs, so it will run tests against the BARs marked as BAR_RESERVED. After running the BAR tests (which will write to all enabled BARs), the inbound address translation is broken. This is because the tegra controller exposes the ATU Port Logic Structure in BAR4, so when BAR4 is written, the inbound address translation settings get overwritten. To avoid this, implement the dw_pcie_ep_ops .init() callback and start off by disabling all BARs (pci-epf-test will later enable/configure BARs that are not defined as BAR_RESERVED). This matches the behavior of other PCIe endpoint drivers: dra7xx, imx6, layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and uniphier-ep. With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST (which was specifically made to detect address translation issues) passes. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-7-cassel@kernel.org [ changed dw_pcie_ep_ops .init to .ep_init and exported dw_pcie_ep_reset_bar ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: rcar-host: Drop PMSR spinlockMarek Vasut1-11/+2
[ Upstream commit 0a8f173d9dad13930d5888505dc4c4fd6a1d4262 ] The pmsr_lock spinlock used to be necessary to synchronize access to the PMSR register, because that access could have been triggered from either config space access in rcar_pcie_config_access() or an exception handler rcar_pcie_aarch32_abort_handler(). The rcar_pcie_aarch32_abort_handler() case is no longer applicable since commit 6e36203bc14c ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read which triggered an exception"), which performs more accurate, controlled invocation of the exception, and a fixup. This leaves rcar_pcie_config_access() as the only call site from which rcar_pcie_wakeup() is called. The rcar_pcie_config_access() can only be called from the controller struct pci_ops .read and .write callbacks, and those are serialized in drivers/pci/access.c using raw spinlock 'pci_lock' . It should be noted that CONFIG_PCI_LOCKLESS_CONFIG is never set on this platform. Since the 'pci_lock' is a raw spinlock , and the 'pmsr_lock' is not a raw spinlock, this constellation triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y . Remove the pmsr_lock to fix the locking. Fixes: a115b1bd3af0 ("PCI: rcar: Add L1 link state fix into data abort hook") Reported-by: Duy Nguyen <duy.nguyen.rh@renesas.com> Reported-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-1-marek.vasut+renesas@mailbox.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: rcar: Finish transition to L1 state in rcar_pcie_config_access()Marek Vasut1-31/+45
[ Upstream commit 84b576146294c2be702cfcd174eaa74167e276f9 ] In case the controller is transitioning to L1 in rcar_pcie_config_access(), any read/write access to PCIECDR triggers asynchronous external abort. This is because the transition to L1 link state must be manually finished by the driver. The PCIe IP can transition back from L1 state to L0 on its own. Avoid triggering the abort in rcar_pcie_config_access() by checking whether the controller is in the transition state, and if so, finish the transition right away. This prevents a lot of unnecessary exceptions, although not all of them. Link: https://lore.kernel.org/r/20220312212349.781799-1-marek.vasut@gmail.com Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Krzysztof Wilczyński <kw@linux.com> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: linux-renesas-soc@vger.kernel.org Stable-dep-of: 0a8f173d9dad ("PCI: rcar-host: Drop PMSR spinlock") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: tegra194: Handle errors in BPMP responseVidya Sagar1-2/+16
[ Upstream commit f8c9ad46b00453a8c075453f3745f8d263f44834 ] The return value from tegra_bpmp_transfer() indicates the success or failure of the IPC transaction with BPMP. If the transaction succeeded, we also need to check the actual command's result code. If we don't have error handling for tegra_bpmp_transfer(), we will set the pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer() command fails. Thus, the pcie->ep_state will get out of sync with reality, and any further PERST# assert + deassert will be a no-op and will not trigger the hardware initialization sequence. This is because pex_ep_event_pex_rst_deassert() checks the current pcie->ep_state, and does nothing if the current state is already EP_STATE_ENABLED. Thus, it is important to have error handling for tegra_bpmp_transfer(), such that the pcie->ep_state can not get out of sync with reality, so that we will try to initialize the hardware not only during the first PERST# assert + deassert, but also during any succeeding PERST# assert + deassert. One example where this fix is needed is when using a rock5b as host. During the initial PERST# assert + deassert (triggered by the bootloader on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails. Once Linux has been loaded on the rock5b, the PCIe driver will once again assert + deassert PERST#. However, without tegra_bpmp_transfer() error handling, this second PERST# assert + deassert will not trigger the hardware initialization sequence. With tegra_bpmp_transfer() error handling, the second PERST# assert + deassert will once again trigger the hardware to be initialized and this time the tegra_bpmp_transfer() succeeds. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [cassel: improve commit log] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlockMarek Vasut1-6/+6
[ Upstream commit 5ed35b4d490d8735021cce9b715b62a418310864 ] The rcar_msi_irq_unmask() function may be called from a PCI driver request_threaded_irq() function. This triggers kernel/irq/manage.c __setup_irq() which locks raw spinlock &desc->lock descriptor lock and with that descriptor lock held, calls rcar_msi_irq_unmask(). Since the &desc->lock descriptor lock is a raw spinlock, and the rcar_msi .mask_lock is not a raw spinlock, this setup triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y. Use scoped_guard() to simplify the locking. Fixes: 83ed8d4fa656 ("PCI: rcar: Convert to MSI domains") Reported-by: Duy Nguyen <duy.nguyen.rh@renesas.com> Reported-by: Thuan Nguyen <thuan.nguyen-hong@banvien.com.vn> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-2-marek.vasut+renesas@mailbox.org [ replaced scoped_guard() with explicit raw_spin_lock_irqsave()/raw_spin_unlock_irqrestore() calls ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: j721e: Fix programming sequence of "strap" settingsSiddharth Vadapalli1-0/+25
[ Upstream commit f842d3313ba179d4005096357289c7ad09cec575 ] The Cadence PCIe Controller integrated in the TI K3 SoCs supports both Root-Complex and Endpoint modes of operation. The Glue Layer allows "strapping" the Mode of operation of the Controller, the Link Speed and the Link Width. This is enabled by programming the "PCIEn_CTRL" register (n corresponds to the PCIe instance) within the CTRL_MMR memory-mapped register space. The "reset-values" of the registers are also different depending on the mode of operation. Since the PCIe Controller latches onto the "reset-values" immediately after being powered on, if the Glue Layer configuration is not done while the PCIe Controller is off, it will result in the PCIe Controller latching onto the wrong "reset-values". In practice, this will show up as a wrong representation of the PCIe Controller's capability structures in the PCIe Configuration Space. Some such capabilities which are supported by the PCIe Controller in the Root-Complex mode but are incorrectly latched onto as being unsupported are: - Link Bandwidth Notification - Alternate Routing ID (ARI) Forwarding Support - Next capability offset within Advanced Error Reporting (AER) capability Fix this by powering off the PCIe Controller before programming the "strap" settings and powering it on after that. The runtime PM APIs namely pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and increment the usage counter respectively, causing GENPD to power off and power on the PCIe Controller. Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250908120828.1471776-1-s-vadapalli@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-29PCI: j721e: Enable ACSPCIE Refclk if "ti,syscon-acspcie-proxy-ctrl" existsSiddharth Vadapalli1-1/+38
[ Upstream commit 82c4be4168e26a5593aaa1002b5678128a638824 ] The ACSPCIE module is capable of driving the reference clock required by the PCIe Endpoint device. It is an alternative to on-board and external reference clock generators. Enabling the output from the ACSPCIE module's PAD IO Buffers requires clearing the "PAD IO disable" bits of the ACSPCIE_PROXY_CTRL register in the CTRL_MMR register space. Add support to enable the ACSPCIE reference clock output using the optional device-tree property "ti,syscon-acspcie-proxy-ctrl". Link: https://lore.kernel.org/linux-pci/20240829105316.1483684-3-s-vadapalli@ti.com Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Stable-dep-of: f842d3313ba1 ("PCI: j721e: Fix programming sequence of "strap" settings") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19PCI: tegra194: Fix broken tegra_pcie_ep_raise_msi_irq()Niklas Cassel1-2/+2
commit b640d42a6ac9ba01abe65ec34f7c73aaf6758ab8 upstream. The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range (1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI. But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as the MSI vector. This causes wrong MSI vector to be triggered, leading to the failure of PCI endpoint Kselftest MSI_TEST test case. To fix this issue, convert the interrupt number to MSI vector. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exitSiddharth Vadapalli1-2/+2
commit e51d05f523e43ce5d2bad957943a2b14f68078cd upstream. Commit under Fixes introduced the IRQ handler for "ks-pcie-error-irq". The interrupt is acquired using "request_irq()" but is never freed if the driver exits due to an error. Although the section in the driver that invokes "request_irq()" has moved around over time, the issue hasn't been addressed until now. Fix this by using "devm_request_irq()" which automatically frees the interrupt if the driver exits. Fixes: 025dd3daeda7 ("PCI: keystone: Add error IRQ handler") Reported-by: Jiri Slaby <jirislaby@kernel.org> Closes: https://lore.kernel.org/r/3d3a4b52-e343-42f3-9d69-94c259812143@kernel.org Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250912100802.3136121-2-s-vadapalli@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-10-19PCI: tegra: Fix devm_kcalloc() argument order for port->phys allocationAlok Tiwari1-1/+1
[ Upstream commit e1a8805e5d263453ad76a4f50ab3b1c18ea07560 ] Fix incorrect argument order in devm_kcalloc() when allocating port->phys. The original call used sizeof(phy) as the number of elements and port->lanes as the element size, which is reversed. While this happens to produce the correct total allocation size with current pointer size and lane counts, the argument order is wrong. Fixes: 6fe7c187e026 ("PCI: tegra: Support per-lane PHYs") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> [mani: added Fixes tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250819150436.3105973-1-alok.a.tiwari@oracle.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28PCI: vmd: Assign VMD IRQ domain before enumerationNirmal Patel1-0/+3
commit 886e67100b904cb1b106ed1dfa8a60696aff519a upstream. During the boot process all the PCI devices are assigned default PCI-MSI IRQ domain including VMD endpoint devices. If interrupt-remapping is enabled by IOMMU, the PCI devices except VMD get new INTEL-IR-MSI IRQ domain. And VMD is supposed to create and assign a separate VMD-MSI IRQ domain for its child devices in order to support MSI-X remapping capabilities. Now when MSI-X remapping in VMD is disabled in order to improve performance, VMD skips VMD-MSI IRQ domain assignment process to its child devices. Thus the devices behind VMD get default PCI-MSI IRQ domain instead of INTEL-IR-MSI IRQ domain when VMD creates root bus and configures child devices. As a result host OS fails to boot and DMAR errors were observed when interrupt remapping was enabled on Intel Icelake CPUs. For instance: DMAR: DRHD: handling fault status reg 2 DMAR: [INTR-REMAP] Request device [0xe2:0x00.0] fault index 0xa00 [fault reason 0x25] Blocked a compatibility format interrupt request To fix this issue, dev_msi_info struct in dev struct maintains correct value of IRQ domain. VMD will use this information to assign proper IRQ domain to its child devices when it doesn't create a separate IRQ domain. Link: https://lore.kernel.org/r/20220511095707.25403-2-nirmal.patel@linux.intel.com Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> [ This patch has already been backported to the Ubuntu 5.15 kernel and fixes boot issues on Intel platforms with VMD rev 04, confirmed on version 5.15.189. ] Signed-off-by: Artur Piechocki <artur.piechocki@open-e.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28PCI: rockchip-host: Fix "Unexpected Completion" log messageHans Zhang1-1/+1
[ Upstream commit fcc5f586c4edbcc10de23fb9b8c0972a84e945cd ] Fix the debug message for the PCIE_CORE_INT_UCR interrupt to clearly indicate "Unexpected Completion" instead of a duplicate "malformed TLP" message. Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support") Signed-off-by: Hans Zhang <18255117159@163.com> [mani: added fixes tag] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Link: https://patch.msgid.link/20250607160201.807043-2-18255117159@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10PCI: hv: Do not set PCI_COMMAND_MEMORY to reduce VM boot timeDexuan Cui1-6/+11
commit 23e118a48acf7be223e57d98e98da8ac5a4071ac upstream. Currently when the pci-hyperv driver finishes probing and initializing the PCI device, it sets the PCI_COMMAND_MEMORY bit; later when the PCI device is registered to the core PCI subsystem, the core PCI driver's BAR detection and initialization code toggles the bit multiple times, and each toggling of the bit causes the hypervisor to unmap/map the virtual BARs from/to the physical BARs, which can be slow if the BAR sizes are huge, e.g., a Linux VM with 14 GPU devices has to spend more than 3 minutes on BAR detection and initialization, causing a long boot time. Reduce the boot time by not setting the PCI_COMMAND_MEMORY bit when we register the PCI device (there is no need to have it set in the first place). The bit stays off till the PCI device driver calls pci_enable_device(). With this change, the boot time of such a 14-GPU VM is reduced by almost 3 minutes. Link: https://lore.kernel.org/lkml/20220419220007.26550-1-decui@microsoft.com/ Tested-by: Boqun Feng (Microsoft) <boqun.feng@gmail.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Jake Oshins <jakeo@microsoft.com> Link: https://lore.kernel.org/r/20220502074255.16901-1-decui@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27PCI: dw-rockchip: Fix PHY function call sequence in rockchip_pcie_phy_deinit()Diederik de Haas1-1/+1
commit 286ed198b899739862456f451eda884558526a9d upstream. The documentation for the phy_power_off() function explicitly says that it must be called before phy_exit(). Hence, follow the same rule in rockchip_pcie_phy_deinit(). Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver") Signed-off-by: Diederik de Haas <didi.debian@cknow.org> [mani: commit message change] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Dragan Simic <dsimic@manjaro.org> Acked-by: Shawn Lin <shawn.lin@rock-chips.com> Cc: stable@vger.kernel.org # v5.15+ Link: https://patch.msgid.link/20250417142138.1377451-1-didi.debian@cknow.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27PCI: cadence-ep: Correct PBA offset in .set_msix() callbackNiklas Cassel1-2/+3
commit c8bcb01352a86bc5592403904109c22b66bd916e upstream. While cdns_pcie_ep_set_msix() writes the Table Size field correctly (N-1), the calculation of the PBA offset is wrong because it calculates space for (N-1) entries instead of N. This results in the following QEMU error when using PCI passthrough on a device which relies on the PCI endpoint subsystem: failed to add PCI capability 0x11[0x50]@0xb0: table & pba overlap, or they don't fit in BARs, or don't align Fix the calculation of PBA offset in the MSI-X capability. [bhelgaas: more specific subject and commit log] Fixes: 3ef5d16f50f8 ("PCI: cadence: Add MSI-X support to Endpoint driver") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250514074313.283156-10-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27PCI: cadence: Fix runtime atomic count underflowHans Zhang1-10/+1
[ Upstream commit 8805f32a96d3b97cef07999fa6f52112678f7e65 ] If the call to pci_host_probe() in cdns_pcie_host_setup() fails, PM runtime count is decremented in the error path using pm_runtime_put_sync(). But the runtime count is not incremented by this driver, but only by the callers (cdns_plat_pcie_probe/j721e_pcie_probe). And the callers also decrement the runtime PM count in their error path. So this leads to the below warning from the PM core: "runtime PM usage count underflow!" So fix it by getting rid of pm_runtime_put_sync() in the error path and directly return the errno. Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'") Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://patch.msgid.link/20250419133058.162048-1-18255117159@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04PCI: brcmstb: Add a softdep to MIP MSI-X driverStanimir Varbanov1-0/+1
[ Upstream commit 2294059118c550464dd8906286324d90c33b152b ] Then the brcmstb PCIe driver and MIP MSI-X interrupt controller drivers are built as modules there could be a race in probing. To avoid this, add a softdep to MIP driver to guarantee that MIP driver will be load first. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-5-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04PCI: brcmstb: Expand inbound window size up to 64GBStanimir Varbanov1-2/+2
[ Upstream commit 25a98c727015638baffcfa236e3f37b70cedcf87 ] The BCM2712 memory map can support up to 64GB of system memory, thus expand the inbound window size in calculation helper function. The change is safe for the currently supported SoCs that have smaller inbound window sizes. Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jim Quinlan <james.quinlan@broadcom.com> Tested-by: Ivan T. Ivanov <iivanov@suse.de> Link: https://lore.kernel.org/r/20250224083559.47645-7-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04PCI: vmd: Disable MSI remapping bypass under XenRoger Pau Monne1-0/+20
[ Upstream commit 6c4d5aadf5df31ea0ac025980670eee9beaf466b ] MSI remapping bypass (directly configuring MSI entries for devices on the VMD bus) won't work under Xen, as Xen is not aware of devices in such bus, and hence cannot configure the entries using the pIRQ interface in the PV case, and in the PVH case traps won't be setup for MSI entries for such devices. Until Xen is aware of devices in the VMD bus prevent the VMD_FEAT_CAN_BYPASS_MSI_REMAP capability from being used when running as any kind of Xen guest. The MSI remapping bypass is an optional feature of VMD bridges, and hence when running under Xen it will be masked and devices will be forced to redirect its interrupts from the VMD bridge. That mode of operation must always be supported by VMD bridges and works when Xen is not aware of devices behind the VMD bridge. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-ID: <20250219092059.90850-3-roger.pau@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04PCI: dwc: ep: Ensure proper iteration over outbound map windowsFrank Li1-1/+1
[ Upstream commit f3e1dccba0a0833fc9a05fb838ebeb6ea4ca0e1a ] Most systems' PCIe outbound map windows have non-zero physical addresses, but the possibility of encountering zero increased after following commit ("PCI: dwc: Use parent_bus_offset"). 'ep->outbound_addr[n]', representing 'parent_bus_address', might be 0 on some hardware, which trims high address bits through bus fabric before sending to the PCIe controller. Replace the iteration logic with 'for_each_set_bit()' to ensure only allocated map windows are iterated when determining the ATU index from a given address. Link: https://lore.kernel.org/r/20250315201548.858189-12-helgaas@kernel.org Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-09PCI: imx6: Skip controller_id generation logic for i.MX7DRichard Zhu1-3/+2
commit f068ffdd034c93f0c768acdc87d4d2d7023c1379 upstream. The i.MX7D only has one PCIe controller, so controller_id should always be 0. The previous code is incorrect although yielding the correct result. Fix by removing "IMX7D" from the switch case branch. Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") Link: https://lore.kernel.org/r/20241126075702.4099164-5-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> [Because this switch case does more than just controller_id logic, move the "IMX7D" case label instead of removing it entirely.] Signed-off-by: Ryan Matthews <ryanmatthews@fastmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02PCI: brcmstb: Fix missing of_node_put() in brcm_pcie_probe()Stanimir Varbanov1-4/+9
commit 2df181e1aea4628a8fd257f866026625d0519627 upstream. A call to of_parse_phandle() is incrementing the refcount, and as such, the of_node_put() must be called when the reference is no longer needed. Thus, refactor the existing code and add a missing of_node_put() call following the check to ensure that "msi_np" matches "pcie->np" and after MSI initialization, but only if the MSI support is enabled system-wide. Cc: stable@vger.kernel.org # v5.10+ Fixes: 40ca1bf580ef ("PCI: brcmstb: Add MSI support") Signed-off-by: Stanimir Varbanov <svarbanov@suse.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250122222955.1752778-1-svarbanov@suse.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02PCI: vmd: Make vmd_dev::cfg_lock a raw_spinlock_t typeRyo Takakura1-6/+6
[ Upstream commit 18056a48669a040bef491e63b25896561ee14d90 ] The access to the PCI config space via pci_ops::read and pci_ops::write is a low-level hardware access. The functions can be accessed with disabled interrupts even on PREEMPT_RT. The pci_lock is a raw_spinlock_t for this purpose. A spinlock_t becomes a sleeping lock on PREEMPT_RT, so it cannot be acquired with disabled interrupts. The vmd_dev::cfg_lock is accessed in the same context as the pci_lock. Make vmd_dev::cfg_lock a raw_spinlock_t type so it can be used with interrupts disabled. This was reported as: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 Call Trace: rt_spin_lock+0x4e/0x130 vmd_pci_read+0x8d/0x100 [vmd] pci_user_read_config_byte+0x6f/0xe0 pci_read_config+0xfe/0x290 sysfs_kf_bin_read+0x68/0x90 Signed-off-by: Ryo Takakura <ryotkkr98@gmail.com> Tested-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Acked-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> [bigeasy: reword commit message] Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-off-by: Luis Claudio R. Goncalves <lgoncalv@redhat.com> Link: https://lore.kernel.org/r/20250218080830.ufw3IgyX@linutronix.de [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: add back report info from https://lore.kernel.org/lkml/20241218115951.83062-1-ryotkkr98@gmail.com/] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10PCI: xilinx-cpm: Fix IRQ domain leak in error path of probeThippeswamy Havalige1-4/+6
[ Upstream commit 57b0302240741e73fe51f88404b3866e0d2933ad ] The IRQ domain allocated for the PCIe controller is not freed if resource_list_first_type() returns NULL, leading to a resource leak. This fix ensures properly cleaning up the allocated IRQ domain in the error path. Fixes: 49e427e6bdd1 ("Merge branch 'pci/host-probe-refactor'") Signed-off-by: Thippeswamy Havalige <thippeswamy.havalige@amd.com> [kwilczynski: added missing Fixes: tag, refactored to use one of the goto labels] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Link: https://lore.kernel.org/r/20250224155025.782179-2-thippeswamy.havalige@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10PCI: brcmstb: Use internal register to change link capabilityJim Quinlan1-2/+2
[ Upstream commit 0c97321e11e0e9e18546f828492758f6aaecec59 ] The driver has been mistakenly writing to a read-only (RO) configuration space register (PCI_EXP_LNKCAP) to change the PCIe link capability. Although harmless in this case, the proper write destination is an internal register that is reflected by PCI_EXP_LNKCAP. Thus, fix the brcm_pcie_set_gen() function to correctly update the link capability. Fixes: c0452137034b ("PCI: brcmstb: Add Broadcom STB PCIe host controller driver") Signed-off-by: Jim Quinlan <james.quinlan@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250214173944.47506-3-james.quinlan@broadcom.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-04-10PCI: cadence-ep: Fix the driver to send MSG TLP for INTx without data payloadHans Zhang2-3/+2
[ Upstream commit 3ac47fbf4f6e8c3a7c3855fac68cc3246f90f850 ] Per the Cadence's "PCIe Controller IP for AX14" user guide, Version 1.04, Section 9.1.7.1, "AXI Subordinate to PCIe Address Translation Registers", Table 9.4, the bit 16 of the AXI Subordinate Address (axi_s_awaddr) when set corresponds to MSG with data, and when not set, to MSG without data. However, the driver is currently doing the opposite and due to this, the INTx is never received on the host. So, fix the driver to reflect the documentation and also make INTx work. Fixes: 37dddf14f1ae ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller") Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Hans Zhang <hans.zhang@cixtech.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250214165724.184599-1-18255117159@163.com [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-03-13PCI: rcar-ep: Fix incorrect variable used when calling devm_request_mem_region()King Dix1-1/+1
[ Upstream commit 2d2da5a4c1b4509f6f7e5a8db015cd420144beb4 ] The rcar_pcie_parse_outbound_ranges() uses the devm_request_mem_region() macro to request a needed resource. A string variable that lives on the stack is then used to store a dynamically computed resource name, which is then passed on as one of the macro arguments. This can lead to undefined behavior. Depending on the current contents of the memory, the manifestations of errors may vary. One possible output may be as follows: $ cat /proc/iomem 30000000-37ffffff : 38000000-3fffffff : Sometimes, garbage may appear after the colon. In very rare cases, if no NULL-terminator is found in memory, the system might crash because the string iterator will overrun which can lead to access of unmapped memory above the stack. Thus, fix this by replacing outbound_name with the name of the previously requested resource. With the changes applied, the output will be as follows: $ cat /proc/iomem 30000000-37ffffff : memory2 38000000-3fffffff : memory3 Fixes: 2a6d0d63d999 ("PCI: rcar: Add endpoint mode support") Link: https://lore.kernel.org/r/tencent_DBDCC19D60F361119E76919ADAB25EC13C06@qq.com Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: King Dix <kingdix10@qq.com> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-23Revert "PCI: Use preserve_config in place of pci_flags"Terry Tritton1-0/+4
This reverts commit c1a1393f7844c645389e5f1a3f1f0350e0fb9316 which is commit 7246a4520b4bf1494d7d030166a11b5226f6d508 upstream. This patch causes a regression in cuttlefish/crossvm boot on arm64. The patch was part of a series that when applied will not cause a regression but this patch was backported to the 5.15 branch by itself. The other patches do not apply cleanly to the 5.15 branch. Signed-off-by: Terry Tritton <terry.tritton@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-09PCI: vmd: Create domain symlink before pci_bus_add_devices()Jiwei Sun1-4/+4
[ Upstream commit f24c9bfcd423e2b2bb0d198456412f614ec2030a ] The vmd driver creates a "domain" symlink in sysfs for each VMD bridge. Previously this symlink was created after pci_bus_add_devices() added devices below the VMD bridge and emitted udev events to announce them to userspace. This led to a race between userspace consumers of the udev events and the kernel creation of the symlink. One such consumer is mdadm, which assembles block devices into a RAID array, and for devices below a VMD bridge, mdadm depends on the "domain" symlink. If mdadm loses the race, it may be unable to assemble a RAID array, which may cause a boot failure or other issues, with complaints like this: (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: Unable to get real path for '/sys/bus/pci/drivers/vmd/0000:c7:00.5/domain/device'' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: No OROM/EFI properties for /dev/nvme1n1' (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: no RAID superblock on /dev/nvme1n1.' (udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1. This symptom prevents the OS from booting successfully. After a NVMe disk is probed/added by the nvme driver, udevd invokes mdadm to detect if there is a mdraid associated with this NVMe disk, and mdadm determines if a NVMe device is connected to a particular VMD domain by checking the "domain" symlink. For example: Thread A Thread B Thread mdadm vmd_enable_domain pci_bus_add_devices __driver_probe_device ... work_on_cpu schedule_work_on : wakeup Thread B nvme_probe : wakeup scan_work to scan nvme disk and add nvme disk then wakeup udevd : udevd executes mdadm command flush_work main : wait for nvme_probe done ... __driver_probe_device find_driver_devices : probe next nvme device : 1) Detect domain symlink ... 2) Find domain symlink ... from vmd sysfs ... 3) Domain symlink not ... created yet; failed sysfs_create_link : create domain symlink Create the VMD "domain" symlink before invoking pci_bus_add_devices() to avoid this race. Suggested-by: Adrian Huang <ahuang12@lenovo.com> Link: https://lore.kernel.org/linux-pci/20240605124844.24293-1-sjiwei@163.com Signed-off-by: Jiwei Sun <sunjw10@lenovo.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-01-09PCI: Use preserve_config in place of pci_flagsVidya Sagar1-4/+0
[ Upstream commit 7246a4520b4bf1494d7d030166a11b5226f6d508 ] Use preserve_config in place of checking for PCI_PROBE_ONLY flag to enable support for "linux,pci-probe-only" on a per host bridge basis. This also obviates the use of adding PCI_REASSIGN_ALL_BUS flag if !PCI_PROBE_ONLY, as pci_assign_unassigned_root_bus_resources() takes care of reassigning the resources that are not already claimed. Link: https://lore.kernel.org/r/20240508174138.3630283-5-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14PCI: rockchip-ep: Fix address translation unit programmingDamien Le Moal2-3/+17
commit 64f093c4d99d797b68b407a9d8767aadc3e3ea7a upstream. The Rockchip PCIe endpoint controller handles PCIe transfers addresses by masking the lower bits of the programmed PCI address and using the same number of lower bits masked from the CPU address space used for the mapping. For a PCI mapping of <size> bytes starting from <pci_addr>, the number of bits masked is the number of address bits changing in the address range [pci_addr..pci_addr + size - 1]. However, rockchip_pcie_prog_ep_ob_atu() calculates num_pass_bits only using the size of the mapping, resulting in an incorrect number of mask bits depending on the value of the PCI address to map. Fix this by introducing the helper function rockchip_pcie_ep_ob_atu_num_bits() to correctly calculate the number of mask bits to use to program the address translation unit. The number of mask bits is calculated depending on both the PCI address and size of the mapping, and clamped between 8 and 20 using the macros ROCKCHIP_PCIE_AT_MIN_NUM_BITS and ROCKCHIP_PCIE_AT_MAX_NUM_BITS. As defined in the Rockchip RK3399 TRM V1.3 Part2, Sections 17.5.5.1.1 and 17.6.8.2.1, this clamping is necessary because: 1) The lower 8 bits of the PCI address to be mapped by the outbound region are ignored. So a minimum of 8 address bits are needed and imply that the PCI address must be aligned to 256. 2) The outbound memory regions are 1MB in size. So while we can specify up to 63-bits for the PCI address (num_bits filed uses bits 0 to 5 of the outbound address region 0 register), we must limit the number of valid address bits to 20 to match the memory window maximum size (1 << 20 = 1MB). Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Link: https://lore.kernel.org/r/20241017015849.190271-2-dlemoal@kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14PCI: keystone: Add link up check to ks_pcie_other_map_bus()Kishon Vijay Abraham I1-0/+11
commit 9e9ec8d8692a6f64d81ef67d4fb6255af6be684b upstream. K2G forwards the error triggered by a link-down state (e.g., no connected endpoint device) on the system bus for PCI configuration transactions; these errors are reported as an SError at system level, which is fatal and hangs the system. So, apply fix similar to how it was done in the DesignWare Core driver commit 15b23906347c ("PCI: dwc: Add link up check in dw_child_pcie_ops.map_bus()"). Fixes: 10a797c6e54a ("PCI: dwc: keystone: Use pci_ops for config space accessors") Link: https://lore.kernel.org/r/20240524105714.191642-3-s-vadapalli@ti.com Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> [kwilczynski: commit log, added tag for stable releases] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17PCI: xilinx-nwl: Fix off-by-one in INTx IRQ handlerSean Anderson1-2/+2
commit 0199d2f2bd8cd97b310f7ed82a067247d7456029 upstream. MSGF_LEG_MASK is laid out with INTA in bit 0, INTB in bit 1, INTC in bit 2, and INTD in bit 3. Hardware IRQ numbers start at 0, and we register PCI_NUM_INTX IRQs. So to enable INTA (aka hwirq 0) we should set bit 0. Remove the subtraction of one. This bug would cause INTx interrupts not to be delivered, as enabling INTB would actually enable INTA, and enabling INTA wouldn't enable anything at all. It is likely that this got overlooked for so long since most PCIe hardware uses MSIs. This fixes the following UBSAN error: UBSAN: shift-out-of-bounds in ../drivers/pci/controller/pcie-xilinx-nwl.c:389:11 shift exponent 18446744073709551615 is too large for 32-bit type 'int' CPU: 1 PID: 61 Comm: kworker/u10:1 Not tainted 6.6.20+ #268 Hardware name: xlnx,zynqmp (DT) Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace (arch/arm64/kernel/stacktrace.c:235) show_stack (arch/arm64/kernel/stacktrace.c:242) dump_stack_lvl (lib/dump_stack.c:107) dump_stack (lib/dump_stack.c:114) __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:218 lib/ubsan.c:387) nwl_unmask_leg_irq (drivers/pci/controller/pcie-xilinx-nwl.c:389 (discriminator 1)) irq_enable (kernel/irq/internals.h:234 kernel/irq/chip.c:170 kernel/irq/chip.c:439 kernel/irq/chip.c:432 kernel/irq/chip.c:345) __irq_startup (kernel/irq/internals.h:239 kernel/irq/chip.c:180 kernel/irq/chip.c:250) irq_startup (kernel/irq/chip.c:270) __setup_irq (kernel/irq/manage.c:1800) request_threaded_irq (kernel/irq/manage.c:2206) pcie_pme_probe (include/linux/interrupt.h:168 drivers/pci/pcie/pme.c:348) Fixes: 9a181e1093af ("PCI: xilinx-nwl: Modify IRQ chip for legacy interrupts") Link: https://lore.kernel.org/r/20240531161337.864994-3-sean.anderson@linux.dev Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17PCI: xilinx-nwl: Clean up clock on probe failure/removalSean Anderson1-4/+19
[ Upstream commit cfd67903977b13f63340a4eb5a1cc890994f2c62 ] Make sure we turn off the clock on probe failure and device removal. Fixes: de0a01f52966 ("PCI: xilinx-nwl: Enable the clock through CCF") Link: https://lore.kernel.org/r/20240531161337.864994-6-sean.anderson@linux.dev Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17PCI: xilinx-nwl: Fix register misspellingSean Anderson1-6/+6
[ Upstream commit a437027ae1730b8dc379c75fa0dd7d3036917400 ] MSIC -> MISC Fixes: c2a7ff18edcd ("PCI: xilinx-nwl: Expand error logging") Link: https://lore.kernel.org/r/20240531161337.864994-4-sean.anderson@linux.dev Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17PCI: keystone: Fix if-statement expression in ks_pcie_quirk()Dan Carpenter1-1/+1
[ Upstream commit 6188a1c762eb9bbd444f47696eda77a5eae6207a ] This code accidentally uses && where || was intended. It potentially results in a NULL dereference. Thus, fix the if-statement expression to use the correct condition. Fixes: 86f271f22bbb ("PCI: keystone: Add workaround for Errata #i2037 (AM65x SR 1.0)") Link: https://lore.kernel.org/linux-pci/1b762a93-e1b2-4af3-8c04-c8843905c279@stanley.mountain Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12PCI: keystone: Add workaround for Errata #i2037 (AM65x SR 1.0)Kishon Vijay Abraham I1-1/+43
[ Upstream commit 86f271f22bbb6391410a07e08d6ca3757fda01fa ] Errata #i2037 in AM65x/DRA80xM Processors Silicon Revision 1.0 (SPRZ452D_July 2018_Revised December 2019 [1]) mentions when an inbound PCIe TLP spans more than two internal AXI 128-byte bursts, the bus may corrupt the packet payload and the corrupt data may cause associated applications or the processor to hang. The workaround for Errata #i2037 is to limit the maximum read request size and maximum payload size to 128 bytes. Add workaround for Errata #i2037 here. The errata and workaround is applicable only to AM65x SR 1.0 and later versions of the silicon will have this fixed. [1] -> https://www.ti.com/lit/er/sprz452i/sprz452i.pdf Link: https://lore.kernel.org/linux-pci/16e1fcae-1ea7-46be-b157-096e05661b15@siemens.com Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Achal Verma <a-verma1@ti.com> Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12PCI: al: Check IORESOURCE_BUS existence during probeAleksandr Mishin1-3/+13
[ Upstream commit a9927c2cac6e9831361e43a14d91277818154e6a ] If IORESOURCE_BUS is not provided in Device Tree it will be fabricated in of_pci_parse_bus_range(), so NULL pointer dereference should not happen here. But that's hard to verify, so check for NULL anyway. Found by Linux Verification Center (linuxtesting.org) with SVACE. Link: https://lore.kernel.org/linux-pci/20240503125705.46055-1-amishin@t-argos.ru Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-19PCI: dwc: Restore MSI Receiver mask during resumeJisheng Zhang1-1/+6
commit 815953dc2011ad7a34de355dfa703dcef1085219 upstream If a host that uses the IP's integrated MSI Receiver lost power during suspend, we call dw_pcie_setup_rc() to reinit the RC. But dw_pcie_setup_rc() always sets pp->irq_mask[ctrl] to ~0, so the mask register is always set as 0xffffffff incorrectly, thus the MSI can't work after resume. Fix this issue by moving pp->irq_mask[ctrl] initialization to dw_pcie_host_init() so we can correctly set the mask reg during both boot and resume. Tested-by: Richard Zhu <hongxing.zhu@nxp.com> Link: https://lore.kernel.org/r/20211226074019.2556-1-jszhang@kernel.org Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19PCI: rockchip: Use GPIOD_OUT_LOW flag while requesting ep_gpioManivannan Sadhasivam1-1/+1
commit 840b7a5edf88fe678c60dee88a135647c0ea4375 upstream. Rockchip platforms use 'GPIO_ACTIVE_HIGH' flag in the devicetree definition for ep_gpio. This means, whatever the logical value set by the driver for the ep_gpio, physical line will output the same logic level. For instance, gpiod_set_value_cansleep(rockchip->ep_gpio, 0); --> Level low gpiod_set_value_cansleep(rockchip->ep_gpio, 1); --> Level high But while requesting the ep_gpio, GPIOD_OUT_HIGH flag is currently used. Now, this also causes the physical line to output 'high' creating trouble for endpoint devices during host reboot. When host reboot happens, the ep_gpio will initially output 'low' due to the GPIO getting reset to its POR value. Then during host controller probe, it will output 'high' due to GPIOD_OUT_HIGH flag. Then during rockchip_pcie_host_init_port(), it will first output 'low' and then 'high' indicating the completion of controller initialization. On the endpoint side, each output 'low' of ep_gpio is accounted for PERST# assert and 'high' for PERST# deassert. With the above mentioned flow during host reboot, endpoint will witness below state changes for PERST#: (1) PERST# assert - GPIO POR state (2) PERST# deassert - GPIOD_OUT_HIGH while requesting GPIO (3) PERST# assert - rockchip_pcie_host_init_port() (4) PERST# deassert - rockchip_pcie_host_init_port() Now the time interval between (2) and (3) is very short as both happen during the driver probe(), and this results in a race in the endpoint. Because, before completing the PERST# deassertion in (2), endpoint got another PERST# assert in (3). A proper way to fix this issue is to change the GPIOD_OUT_HIGH flag in (2) to GPIOD_OUT_LOW. Because the usual convention is to request the GPIO with a state corresponding to its 'initial/default' value and let the driver change the state of the GPIO when required. As per that, the ep_gpio should be requested with GPIOD_OUT_LOW as it corresponds to the POR value of '0' (PERST# assert in the endpoint). Then the driver can change the state of the ep_gpio later in rockchip_pcie_host_init_port() as per the initialization sequence. This fixes the firmware crash issue in Qcom based modems connected to Rockpro64 based board. Fixes: e77f847df54c ("PCI: rockchip: Add Rockchip PCIe controller support") Closes: https://lore.kernel.org/mhi/20240402045647.GG2933@thinkpad/ Link: https://lore.kernel.org/linux-pci/20240416-pci-rockchip-perst-fix-v1-1-4800b1d4d954@linaro.org Reported-by: Slark Xiao <slark_xiao@163.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Cc: stable@vger.kernel.org # v4.9 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19PCI: dw-rockchip: Fix initial PERST# GPIO valueNiklas Cassel1-1/+1
commit 28b8d7793b8573563b3d45321376f36168d77b1e upstream. PERST# is active low according to the PCIe specification. However, the existing pcie-dw-rockchip.c driver does: gpiod_set_value(..., 0); msleep(100); gpiod_set_value(..., 1); when asserting + deasserting PERST#. This is of course wrong, but because all the device trees for this compatible string have also incorrectly marked this GPIO as ACTIVE_HIGH: $ git grep -B 10 reset-gpios arch/arm64/boot/dts/rockchip/rk3568* $ git grep -B 10 reset-gpios arch/arm64/boot/dts/rockchip/rk3588* The actual toggling of PERST# is correct, and we cannot change it anyway, since that would break device tree compatibility. However, this driver does request the GPIO to be initialized as GPIOD_OUT_HIGH, which does cause a silly sequence where PERST# gets toggled back and forth for no good reason. Fix this by requesting the GPIO to be initialized as GPIOD_OUT_LOW (which for this driver means PERST# asserted). This will avoid an unnecessary signal change where PERST# gets deasserted (by devm_gpiod_get_optional()) and then gets asserted (by rockchip_pcie_start_link()) just a few instructions later. Before patch, debug prints on EP side, when booting RC: [ 845.606810] pci: PERST# asserted by host! [ 852.483985] pci: PERST# de-asserted by host! [ 852.503041] pci: PERST# asserted by host! [ 852.610318] pci: PERST# de-asserted by host! After patch, debug prints on EP side, when booting RC: [ 125.107921] pci: PERST# asserted by host! [ 132.111429] pci: PERST# de-asserted by host! This extra, very short, PERST# assertion + deassertion has been reported to cause issues with certain WLAN controllers, e.g. RTL8822CE. Fixes: 0e898eb8df4e ("PCI: rockchip-dwc: Add Rockchip RK356X host controller driver") Link: https://lore.kernel.org/linux-pci/20240417164227.398901-1-cassel@kernel.org Tested-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Jianfeng Liu <liujianfeng1994@gmail.com> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-19PCI: hv: Return zero, not garbage, when reading PCI_INTERRUPT_PINWei Liu1-2/+2
commit fea93a3e5d5e6a09eb153866d2ce60ea3287a70d upstream. The intent of the code snippet is to always return 0 for both PCI_INTERRUPT_LINE and PCI_INTERRUPT_PIN. The check misses PCI_INTERRUPT_PIN. This patch fixes that. This is discovered by this call in VFIO: pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN, &pin); The old code does not set *val to 0 because it misses the check for PCI_INTERRUPT_PIN. Garbage is returned in that case. Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs") Link: https://lore.kernel.org/linux-pci/20240701202606.129606-1-wei.liu@kernel.org Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05PCI: rockchip-ep: Remove wrong mask on subsys_vendor_idRick Wertenbroek1-4/+2
commit 2dba285caba53f309d6060fca911b43d63f41697 upstream. Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem Vendor ID are u16 variables and are written to a u32 register of the controller. The Subsystem Vendor ID was always 0 because the u16 value was masked incorrectly with GENMASK(31,16) resulting in all lower 16 bits being set to 0 prior to the shift. Remove both masks as they are unnecessary and set the register correctly i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the Subsystem Vendor ID. This is documented in the RK3399 TRM section 17.6.7.1.17 [kwilczynski: removed unnecesary newline] Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@gmail.com Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16PCI: tegra194: Fix probe path for Endpoint modeVidya Sagar1-0/+3
[ Upstream commit 19326006a21da26532d982254677c892dae8f29b ] Tegra194 PCIe probe path is taking failure path in success case for Endpoint mode. Return success from the switch case instead of going into the failure path. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Link: https://lore.kernel.org/linux-pci/20240408093053.3948634-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-04-10PCI: dwc: endpoint: Fix advertised resizable BAR sizeNiklas Cassel1-1/+6
[ Upstream commit 72e34b8593e08a0ee759b7a038e0b178418ea6f8 ] The commit message in commit fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") claims that it modifies the Resizable BAR capability to only advertise support for 1 MB size BARs. However, the commit writes all zeroes to PCI_REBAR_CAP (the register which contains the possible BAR sizes that a BAR be resized to). According to the spec, it is illegal to not have a bit set in PCI_REBAR_CAP, and 1 MB is the smallest size allowed. Set bit 4 in PCI_REBAR_CAP, so that we actually advertise support for a 1 MB BAR size. Before: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB BAR 1: current size: 1MB BAR 2: current size: 1MB BAR 3: current size: 1MB BAR 4: current size: 1MB BAR 5: current size: 1MB After: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB, supported: 1MB BAR 1: current size: 1MB, supported: 1MB BAR 2: current size: 1MB, supported: 1MB BAR 3: current size: 1MB, supported: 1MB BAR 4: current size: 1MB, supported: 1MB BAR 5: current size: 1MB, supported: 1MB Fixes: fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") Link: https://lore.kernel.org/linux-pci/20240307111520.3303774-1-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@vger.kernel.org> # 5.2 Signed-off-by: Sasha Levin <sashal@kernel.org>