kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message	Author	Files	Lines
11 days	PCI: Initialize temporary device in new_id_store()	Samiullah Khawaja	1	-1/+8
	[ Upstream commit f45a49a2380a47332817b7248c61a0ebbc6f0d00 ]

	When setting new_id of a PCI device driver using sysfs a lockdep splat occurs. This is because new_id_store() builds a temporary pci_dev for pci_match_device(), which calls device_match_driver_override(). That depends on the driver_override.lock added by cb3d1049f4ea ("driver core: generalize driver_override in struct device").

	The new driver_override.lock was not initialized in the temporary pci_dev, resulting in this lockdep splat. Initialize the temporary pci_dev to fix this.

	Repro: Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:

	  # echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id
	  INFO: trying to register non-static key.
	  The code is fine but needs lockdep annotation, or maybe
	  you didn't initialize this object before use?
	  turning off the locking correctness validator.
	  CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
	  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
	  Call Trace:
	   <TASK>
	   dump_stack_lvl+0x5d/0x80
	   register_lock_class+0x77e/0x790
	   lock_acquire+0xbf/0x2e0
	   pci_match_device+0x24/0x180
	   new_id_store+0x189/0x1d0
	   kernfs_fop_write_iter+0x14f/0x210
	   vfs_write+0x263/0x5e0
	   ksys_write+0x79/0xf0
	   do_syscall_64+0x117/0xf80

	Fixes: 10a4206a2401 ("PCI: use generic driver_override infrastructure")
	Fixes: 8895d3bcb8ba ("PCI: Fail new_id for vendor/device values already built into driver")
	Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
	[bhelgaas: add commit log details and repro, trim backtrace]
	Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
	Reviewed-by: Danilo Krummrich <dakr@kernel.org>
	Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
	Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports	Yao Zi	1	-0/+2
	[ Upstream commit 988ef706cdd8a72e61dd90c0d0554eec4df7594a ] Since commit f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms") force enables ASPM on all device tree platforms, the SG2042 Root Ports are breaking as they advertise L0s and L1 capabilities without supporting them. Set ASPM quirks to disable the L0s and L1 capabilities for the Root Ports so that these broken link states won't be enabled. Fixes: 4e27aca4881a ("riscv: sophgo: dts: add PCIe controllers for SG2042") Co-developed-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Yao Zi <me@ziyao.cc> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Han Gao <gaohan@iscas.ac.cn> Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox Reviewed-by: Chen Wang <unicorn_wang@outlook.com> Link: https://patch.msgid.link/20260405154154.46829-3-me@ziyao.cc Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: cadence: Add flags for disabling ASPM capability for broken Root Ports	Yao Zi	2	-0/+26
	[ Upstream commit 5ccc76a87f1ec2422811e61be44165bfc9e7cf54 ] Add flags for disabling the ASPM L0s/L1 capability for broken Root Ports by clearing the corresponding bits in Link Capabilities Register through the local management bus. This allows ASPM to be disabled on platforms which don't support it. Signed-off-by: Yao Zi <me@ziyao.cc> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Han Gao <gaohan@iscas.ac.cn> Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox Reviewed-by: Chen Wang <unicorn_wang@outlook.com> Link: https://patch.msgid.link/20260405154154.46829-2-me@ziyao.cc Stable-dep-of: 988ef706cdd8 ("PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports") Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Fix CBB timeout caused by DBI access before core power-on	Manikanta Maddireddy	1	-4/+4
	[ Upstream commit 34b3eef48d980cd37b876e128bbf314f69fb5d70 ]

	When PERST# is deasserted twice (assert -> deassert -> assert -> deassert), a CBB (Control Backbone) timeout occurs at DBI register offset 0x8bc (PCIE_MISC_CONTROL_1_OFF). This happens because pci_epc_deinit_notify() and dw_pcie_ep_cleanup() are called before reset_control_deassert() powers on the controller core.

	The call chain that causes the timeout:

	  pex_ep_event_pex_rst_deassert()
	    pci_epc_deinit_notify()
	      pci_epf_test_epc_deinit()
	        pci_epf_test_clear_bar()
	          pci_epc_clear_bar()
	            dw_pcie_ep_clear_bar()
	              __dw_pcie_ep_reset_bar()
	                dw_pcie_dbi_ro_wr_en() <- Accesses 0x8bc DBI register
	    reset_control_deassert(pcie->core_rst) <- Core powered on HERE

	The DBI registers, including PCIE_MISC_CONTROL_1_OFF (0x8bc), are only accessible after the controller core is powered on via reset_control_deassert(pcie->core_rst). Accessing them before this point results in a CBB timeout because the hardware is not yet operational.

	Fix this by moving pci_epc_deinit_notify() and dw_pcie_ep_cleanup() to after reset_control_deassert(pcie->core_rst), ensuring the controller is fully powered on before any DBI register accesses occur.

	Fixes: 40e2125381dc ("PCI: tegra194: Move controller cleanups to pex_ep_event_pex_rst_deassert()")
	Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
	Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
	Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
	Tested-by: Jon Hunter <jonathanh@nvidia.com>
	Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
	Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
	Link: https://patch.msgid.link/20260324190755.1094879-15-mmaddireddy@nvidia.com
	Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Disable L1.2 capability of Tegra234 EP	Vidya Sagar	1	-0/+19
	[ Upstream commit f59df1d9e6bdb6bd7ef65fb5d200900ac40c20ba ] When Tegra234 is operating in the Endpoint mode with L1.2 enabled, PCIe link goes down during L1.2 exit. This is because Tegra234 powers up UPHY PLL immediately without making sure that the REFCLK is stable. This causes UPHY PLL to fail to lock to the correct frequency and leads to link going down. There is no hardware fix for this, hence do not advertise the L1.2 capability in the Endpoint mode. Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-14-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: Apply ECRC workaround to DesignWare 5.00a as well	Manikanta Maddireddy	1	-8/+8
	[ Upstream commit 40805f32dceadebb7381d911003100bec7b8cd51 ] The ECRC (TLP digest) workaround was originally added for DesignWare version 4.90a. Tegra234 SoC has 5.00a DWC HW version, which has the same ATU TD override behaviour, so apply the workaround for 5.00a too. Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support") Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-13-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Use DWC IP core version	Manikanta Maddireddy	2	-2/+4
	[ Upstream commit ea60ca067f0f098043610c96a915d162113c1aac ] Tegra194 PCIe driver used custom version numbers to detect Tegra194 and Tegra234 IPs. With version detect logic added, version check results in mismatch warnings: tegra194-pcie 14100000.pcie: Versions don't match (0000562a != 3536322a) Use HW version numbers which match to PORT_LOGIC.PCIE_VERSION_OFF in Tegra194 driver to avoid these kernel warnings. Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support") Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-12-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Free up Endpoint resources during remove()	Vidya Sagar	1	-0/+2
	[ Upstream commit 8870f02f7868209eb9bdc5dc53540a6262cf9227 ] Free up the resources during remove() that were acquired by the DesignWare driver for the Endpoint mode during probe(). Fixes: bb617cbd8151 ("PCI: tegra194: Clean up the exit path for Endpoint mode") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-11-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Allow system suspend when the Endpoint link is not up	Vidya Sagar	1	-6/+25
	[ Upstream commit c76f8eae7d4695b1176c4ea5eb93c17e16a20272 ] Host software initiates the L2 sequence. PCIe link is kept in L2 state during suspend. If Endpoint mode is enabled and the link is up, the software cannot proceed with suspend. However, when the PCIe Endpoint driver is probed, but the PCIe link is not up, Tegra can go into suspend state. So, allow system to suspend in this case. Fixes: de2bbf2b71bb ("PCI: tegra194: Don't allow suspend when Tegra PCIe is in EP mode") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-10-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Set LTR message request before PCIe link up in Endpoint mode	Vidya Sagar	1	-9/+9
	[ Upstream commit b256493bf8cacf0e524bf4c10b5c4901d0c6cefe ] LTR message should be sent as soon as the Root Port enables LTR in the Endpoint mode. So set snoop and no-snoop LTR timing and LTR message request before the PCIe link comes up, so that the LTR message is sent upstream as soon as LTR is enabled. Without programming these values, the Endpoint would send latencies of 0 to the host, which will be inaccurate. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-9-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Disable direct speed change for Endpoint mode	Vidya Sagar	1	-0/+4
	[ Upstream commit 976f6763f57970388bcd7118931f33f447916927 ] Pre-silicon simulation showed the controller operating in Endpoint mode initiating link speed change after completing Secondary Bus Reset. Ideally, the Root Port or the Switch Downstream Port should initiate the link speed change post SBR, not the Endpoint. So, as per the hardware team recommendation, disable direct speed change for the Endpoint mode to prevent it from initiating speed change after the physical layer link is up at Gen1, leaving speed change ownership with the host. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-8-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Use devm_gpiod_get_optional() to parse "nvidia,refclk-select"	Vidya Sagar	1	-3/+3
	[ Upstream commit f62bc7917de1374dce86a852ffba8baf9cb7a56a ] The GPIO DT property "nvidia,refclk-select", to select the PCIe reference clock is optional. Use devm_gpiod_get_optional() to get it. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-7-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Disable PERST# IRQ only in Endpoint mode	Manikanta Maddireddy	1	-1/+2
	[ Upstream commit 40658a31b6e134169c648041efc84944c4c71dcd ] The PERST# GPIO interrupt is only registered when the controller is operating in Endpoint mode. In Root Port mode, the PERST# GPIO is configured as an output to control downstream devices, and no interrupt is registered for it. Currently, tegra_pcie_dw_stop_link() unconditionally calls disable_irq() on pex_rst_irq, which causes issues in Root Port mode where this IRQ is not registered. Fix this by only disabling the PERST# IRQ when operating in Endpoint mode, where the interrupt is actually registered and used to detect PERST# assertion/deassertion from the host. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-6-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Don't force the device into the D0 state before L2	Vidya Sagar	1	-41/+0
	[ Upstream commit 71d9f67701e1affc82d18ca88ae798c5361beddf ] As per PCIe CEM r6.0, sec 2.3, the PCIe Endpoint device should be in D3cold to assert WAKE# pin. The previous workaround that forced downstream devices to D0 before taking the link to L2 cited PCIe r4.0, sec 5.2, "Link State Power Management"; however, that spec does not explicitly require putting the device into D0 and only indicates that power removal may be initiated without transitioning to D3hot. Remove the D0 workaround so that Endpoint devices can use wake functionality (WAKE# from D3). With some Endpoints the link may not enter L2 when they remain in D3, but the Root Port continues with the usual flow after PME timeout, so there is no functional issue. Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Vidya Sagar <vidyas@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-5-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Disable LTSSM after transition to Detect on surprise link down	Manikanta Maddireddy	1	-13/+16
	[ Upstream commit 9fa0c242f8d7acf1b124d4462d18f4023573ac1c ] After the link reaches a Detect-related LTSSM state, disable LTSSM so it does not keep toggling between Polling and Detect. Do this by polling for the Detect state first, then clearing APPL_CTRL_LTSSM_EN in both tegra_pcie_dw_pme_turnoff() and pex_ep_event_pex_rst_assert(). Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-4-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: tegra194: Increase LTSSM poll time on surprise link down	Manikanta Maddireddy	1	-15/+21
	[ Upstream commit 74dd8efe4d6cead433162147333af989a568aac7 ] On surprise link down, LTSSM state transits from L0 -> Recovery.RcvrLock -> Recovery.RcvrSpeed -> Gen1 Recovery.RcvrLock -> Detect. Recovery.RcvrLock and Recovery.RcvrSpeed transit times are 24 ms and 48 ms respectively, so the total time from L0 to Detect is ~96 ms. Increase the poll timeout to 120 ms to account for this. While at it, add LTSSM state defines for Detect-related states and use them in the poll condition. Use readl_poll_timeout() instead of readl_poll_timeout_atomic() in tegra_pcie_dw_pme_turnoff() since that path runs in non-atomic context. Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-3-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
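The timing budget in the message above can be checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for illustration, and it assumes the Gen1 Recovery.RcvrLock step costs the same 24 ms as the first Recovery.RcvrLock, which is what the quoted ~96 ms total implies.

```python
# Per-state transit times quoted in the commit message, in milliseconds.
RCVR_LOCK_MS = 24
RCVR_SPEED_MS = 48

# L0 -> Recovery.RcvrLock -> Recovery.RcvrSpeed -> Gen1 Recovery.RcvrLock -> Detect
# Assumption: the Gen1 Recovery.RcvrLock step also takes RCVR_LOCK_MS.
path_ms = [RCVR_LOCK_MS, RCVR_SPEED_MS, RCVR_LOCK_MS]
total_ms = sum(path_ms)

POLL_TIMEOUT_MS = 120  # new poll timeout named in the commit message
margin_ms = POLL_TIMEOUT_MS - total_ms

print(f"L0 to Detect: {total_ms} ms, headroom under the new timeout: {margin_ms} ms")
```

Under these assumptions the new 120 ms timeout leaves about 24 ms of headroom over the worst-case transition.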
11 days	PCI: tegra194: Fix polling delay for L2 state	Vidya Sagar	1	-5/+4
	[ Upstream commit adaffed907f14f954096555665ad6af2ae724d83 ] As per PCIe r7.0, sec 5.3.3.2.1, after sending PME_Turn_Off message, Root Port should wait for 1-10 msec for PME_TO_Ack message. Currently, driver is polling for 10 msec with 1 usec delay which is aggressive. Use existing macro PCIE_PME_TO_L2_TIMEOUT_US to poll for 10 msec with 1 msec delay. Since this function is used in non-atomic context only, use non-atomic poll function. Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Link: https://patch.msgid.link/20260324190755.1094879-2-mmaddireddy@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
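To see why the message calls the old polling "aggressive", the rough upper bound on register reads can be compared for the two delays. This is a simplified model, not the kernel's readl_poll_timeout() implementation; the helper name `max_polls` is hypothetical and the 10 msec window is taken from the message.

```python
TIMEOUT_US = 10_000  # 10 msec polling window, as stated in the commit message


def max_polls(timeout_us: int, delay_us: int) -> int:
    """Approximate upper bound on reads when sleeping delay_us between polls."""
    return timeout_us // delay_us


old_polls = max_polls(TIMEOUT_US, 1)      # 1 usec delay (old behaviour)
new_polls = max_polls(TIMEOUT_US, 1_000)  # 1 msec delay (new behaviour)

print(f"old: up to ~{old_polls} reads, new: up to ~{new_polls} reads")
```

In this model the change cuts the worst case from roughly ten thousand register reads to roughly ten over the same window.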
11 days	PCI/NPEM: Set LED_HW_PLUGGABLE for hotplug-capable ports	Richard Cheng	1	-1/+1
	[ Upstream commit 16d021c878dca22532c984668c9e8cf4722d6a49 ] NPEM registers LED classdevs on PCI endpoint that may be behind hotplug-capable ports. During hot-removal, led_classdev_unregister() calls led_set_brightness(LED_OFF) which leads to a PCI config read to a disconnected device, which fails and returns -ENODEV (topology details in msgid.link below): leds 0003:01:00.0:enclosure:ok: Setting an LED's brightness failed (-19) The LED core already suppresses this for devices with LED_HW_PLUGGABLE set, but NPEM never sets it. Add the flag since NPEM LEDs are on hot-pluggable hardware by nature. Fixes: 4e893545ef87 ("PCI/NPEM: Add Native PCIe Enclosure Management support") Signed-off-by: Richard Cheng <icheng@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Kai-Heng Feng <kaihengf@nvidia.com> Link: https://patch.msgid.link/20260402093850.23075-1-icheng@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: mediatek-gen3: Prevent leaking IRQ domains when IRQ not found	Chen-Yu Tsai	1	-4/+4
	[ Upstream commit 5573c44cb3fd01a9f62d569ae9ac870ef5f0e0ba ] In mtk_pcie_setup_irq(), the IRQ domains are allocated before the controller's IRQ is fetched. If the latter fails, the function directly returns an error, without cleaning up the allocated domains. Hence, reverse the order so that the IRQ domains are allocated after the controller's IRQ is found. This was flagged by Sashiko during a review of "[PATCH v6 0/7] PCI: mediatek-gen3: add power control support". Fixes: 814cceebba9b ("PCI: mediatek-gen3: Add INTx support") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://sashiko.dev/#/patchset/20260324052002.4072430-1-wenst%40chromium.org Link: https://patch.msgid.link/20260324093542.18523-1-wenst@chromium.org Signed-off-by: Sasha Levin <sashal@kernel.org>
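The ordering problem described above is a general pattern: allocate first, then hit an early return, and the allocation is never released. The toy model below is not the driver code; `Tracker`, `setup_old` and `setup_new` are hypothetical names used only to contrast the two orderings.

```python
class Tracker:
    """Records which resources are currently allocated."""

    def __init__(self):
        self.live = set()

    def alloc(self, name):
        self.live.add(name)


def setup_old(tracker, irq_found):
    # Old order: allocate the IRQ domains, then look up the IRQ.
    tracker.alloc("irq_domains")
    if not irq_found:
        return False  # early return without freeing the domains
    return True


def setup_new(tracker, irq_found):
    # New order: look up the IRQ first, allocate only once it is known to exist.
    if not irq_found:
        return False  # nothing allocated yet, so nothing to clean up
    tracker.alloc("irq_domains")
    return True


old_tracker, new_tracker = Tracker(), Tracker()
setup_old(old_tracker, irq_found=False)
setup_new(new_tracker, irq_found=False)
print("old order leaks:", old_tracker.live, "| new order leaks:", new_tracker.live)
```

Reordering removes the need for a cleanup path on that failure, which is the approach the message describes.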
11 days	PCI: aspeed: Fix IRQ domain leak on platform_get_irq() failure	Felix Gu	1	-4/+4
	[ Upstream commit c54d5f5b33990f2649c20f35407f340bcadb8a53 ] The aspeed_pcie_probe() function calls aspeed_pcie_init_irq_domain() which allocates pcie->intx_domain and initializes MSI. However, if platform_get_irq() fails afterwards, the cleanup action was not yet registered via devm_add_action_or_reset(), causing the IRQ domain resources to leak. Fix this by registering the devm cleanup action immediately after aspeed_pcie_init_irq_domain() succeeds, before calling platform_get_irq(). This ensures proper cleanup on any subsequent failure. Fixes: 9aa0cb68fcc1 ("PCI: aspeed: Add ASPEED PCIe RC driver") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Jacky Chou <jacky_chou@aspeedtech.com> Link: https://patch.msgid.link/20260324-aspeed-v1-1-354181624c00@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: imx6: Keep Root Port MSI capability with iMSI-RX to work around hardware bug	Richard Zhu	3	-1/+9
	[ Upstream commit 3a4e8302e72f83fd5cc8a916fc6f5c8fe5c8690e ] On NXP i.MX7D, i.MX8MM, and i.MX8MQ chipsets, MSIs from the endpoints won't be received by the iMSI-RX MSI controller if the Root Port MSI capability is disabled. Even though the Root Port MSIs won't be received by the iMSI-RX controller due to design, these chipsets have some weird hardware bug that prevents the endpoint MSIs from reaching when the Root Port MSI capability is disabled. Hence, introduce a new flag, 'dw_pcie_rp::keep_rp_msi_en', set it for the above mentioned SoCs, and always keep the Root Port MSI capability when this flag is set. Note that by keeping Root Port MSI capability, Root Port MSIs such as AER, PME and others won't be received by default. So users need to use workarounds such as passing 'pcie_pme=nomsi' cmdline param. Fixes: f5cd8a929c825 ("PCI: dwc: Remove MSI/MSIX capability for Root Port if iMSI-RX is used as MSI controller") Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> [bhelgaas: fix typos] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260331085252.1243108-1-hongxing.zhu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: Enable AtomicOps only if Root Port supports them	Gerd Bayer	1	-21/+20
	[ Upstream commit 1ae8c4ce157037e266184064a182af9ef9af278b ]

	When inspecting the config space of a Connect-X physical function in an s390 system after it was initialized by the mlx5_core device driver, we found the function to be enabled to request AtomicOps despite the Root Port lacking support for completing them:

	  00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
	    Subsystem: Mellanox Technologies Device 0002
	    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
	      AtomicOpsCtl: ReqEn+

	On s390 and many virtualized guests, the Endpoint is visible but the Root Port is not. In this case, pci_enable_atomic_ops_to_root() previously enabled AtomicOps in the Endpoint even though it can't tell whether the Root Port supports them as a completer.

	Change pci_enable_atomic_ops_to_root() to fail if there's no Root Port or the Root Port doesn't support AtomicOps.

	Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
	Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
	Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
	[bhelgaas: commit log, check RP first to simplify flow]
	Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
	Link: https://patch.msgid.link/20260330-fix_pciatops-v7-2-f601818417e8@linux.ibm.com
	Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: Fix type mismatch for kstrtou32_from_user() return value	Hans Zhang	1	-9/+12
	[ Upstream commit 445588a3b18bb0702d746cb61f7a443639027651 ] kstrtou32_from_user() returns int, but the return value was stored in a u32 variable 'val', risking sign loss. Use a dedicated int variable to correctly handle the return code. Fixes: 4fbfa17f9a07 ("PCI: dwc: Add debugfs based Silicon Debug support for DWC") Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260401023048.4182452-1-18255117159@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
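The "sign loss" mentioned above can be reproduced outside the kernel by forcing a negative errno into an unsigned 32-bit slot. This sketch uses Python's ctypes purely as a stand-in for C integer conversion; the helper names are hypothetical and -EINVAL is just one example of an error code the function might return.

```python
import ctypes

EINVAL = 22  # standard Linux errno value


def store_in_u32(ret: int) -> int:
    """Value seen after a C int is assigned to a u32 variable."""
    return ctypes.c_uint32(ret).value


def store_in_int(ret: int) -> int:
    """Value seen after a C int is assigned to an int variable."""
    return ctypes.c_int32(ret).value


ret = -EINVAL
as_u32 = store_in_u32(ret)
as_int = store_in_int(ret)

print(f"u32 holds {as_u32}, int holds {as_int}")
print("u32 looks like an error (< 0):", as_u32 < 0)
print("int looks like an error (< 0):", as_int < 0)
```

Once the code lands in an unsigned variable it becomes a large positive number, so a check of the form `val < 0` can never fire. Keeping the return code in a dedicated int, as the message describes, avoids that.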
11 days	PCI/DPC: Log AER error info for DPC/EDR uncorrectable errors	Kuppuswamy Sathyanarayanan	1	-0/+1
	[ Upstream commit 97970e7c694356e3386a10e3b936d61eafd06bce ] aer_print_error() skips printing if ratelimit_print[i] is not set. In the native AER path, ratelimit_print is initialized by add_error_device() during source device discovery, and is set to 1 for fatal errors to bypass rate limiting since fatal errors should always be logged. The DPC/EDR path uses the DPC-capable port as the error source and reads its AER uncorrectable error status registers directly in dpc_get_aer_uncorrect_severity(). Since it does not go through add_error_device(), ratelimit_print[0] is left uninitialized and zero. As a result, aer_print_error() silently drops all AER error messages for DPC/EDR triggered events. Set ratelimit_print[0] to 1 to bypass rate limiting and always print AER logs for uncorrectable errors detected by the DPC port. Fixes: a57f2bfb4a58 ("PCI/AER: Ratelimit correctable and non-fatal error logging") Co-developed-by: Goudar Manjunath Ramanagouda <manjunath.ramanagouda.goudar@intel.com> Signed-off-by: Goudar Manjunath Ramanagouda <manjunath.ramanagouda.goudar@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260318170449.2733581-1-sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: Fix alignment calculation for resource size larger than align	Ilpo Järvinen	1	-1/+8
	[ Upstream commit 8cb081667377709f4924ab6b3a88a0d7a761fe91 ] The commit bc75c8e50711 ("PCI: Rewrite bridge window head alignment function") did not use if (r_size <= align) check from pbus_size_mem() for the new head alignment bookkeeping structure (aligns2[]). In some configurations, this can result in producing a gap into the bridge window which the resource larger than its alignment cannot fill. The old alignment calculation algorithm was removed by the subsequent commit 3958bf16e2fe ("PCI: Stop over-estimating bridge window size") which renamed the aligns2[] array leaving only aligns[] array. Add the if (r_size <= align) check back to avoid this problem. Fixes: bc75c8e50711 ("PCI: Rewrite bridge window head alignment function") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/b05a6f14-979d-42c9-924c-d8408cb12ae7@roeck-us.net/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xifer <xiferdev@gmail.com> Link: https://patch.msgid.link/20260324165633.4583-11-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: Fix premature removal from realloc_head list during resource assignment	Ilpo Järvinen	1	-4/+4
	[ Upstream commit 1ee4716a5a28eaef81ae1f280d983258bee49623 ]

	reassign_resources_sorted() checks for two things:

	a) Resource assignment failures for mandatory resources by checking if the resource remains unassigned, which are known to always repeat, and does not attempt to assign them again.

	b) That resource is not among the ones being processed/assigned at this stage, leading to skip processing such resources in reassign_resources_sorted() as well (resource assignment progresses one PCI hierarchy level at a time).

	The problem here is that a) is checked before b), but b) also implies the resource is not being assigned yet, making also a) true. As a) only skips resource assignment but still removes the resource from realloc_head, the later stages that would need to process the information in realloc_head cannot obtain the optional size information anymore. This leads to considering only non-optional part for bridge windows deeper in the PCI hierarchy.

	This problem has been observed during rescan (add_size is not considered while attempting assignment for 0000:e2:00.0 indicating the corresponding entry was removed from realloc_head while processing resource assignments for 0000:e1):

	  pci_bus 0000:e1: scanning bus ...
	  pci 0000:e3:01.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e4] add_size 60c000000 add_align 800000000
	  pci 0000:e3:01.0: bridge window [mem 0x00100000-0x000fffff] to [bus e4] add_size 200000 add_align 200000
	  pci 0000:e3:02.0: disabling bridge window [mem 0x00000000-0x000fffff 64bit pref] to [bus e5] (unused)
	  pci 0000:e2:00.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e3-e5] add_size 60c000000 add_align 800000000
	  pci 0000:e2:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus e3-e5] add_size 200000 add_align 200000
	  pcieport 0000:e1:02.0: bridge window [io size 0x2000]: can't assign; no space
	  pcieport 0000:e1:02.0: bridge window [io size 0x2000]: failed to assign
	  pcieport 0000:e1:02.0: bridge window [io 0x1000-0x2fff]: resource restored
	  pcieport 0000:e1:02.0: bridge window [io 0x1000-0x2fff]: resource restored
	  pcieport 0000:e1:02.0: bridge window [io size 0x2000]: can't assign; no space
	  pcieport 0000:e1:02.0: bridge window [io size 0x2000]: failed to assign
	  pci 0000:e2:00.0: bridge window [mem 0x28f000000000-0x28f800ffffff 64bit pref]: assigned

	Fixes: 96336ec70264 ("PCI: Perform reset_resource() and build fail list in sync")
	Reported-by: Peter Nisbet <peter.nisbet@intel.com>
	Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
	Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
	Tested-by: Peter Nisbet <peter.nisbet@intel.com>
	Link: https://patch.msgid.link/20260313084551.1934-1-ilpo.jarvinen@linux.intel.com
	Signed-off-by: Sasha Levin <sashal@kernel.org>
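The effect of checking a) before b) can be shown with a small model. This is a sketch of the check ordering only, not the kernel's reassign_resources_sorted(); the function name, its parameters and the resource labels are hypothetical, and the resource names are borrowed loosely from the log above.

```python
def reassign_resources(realloc_head, in_this_stage, assigned, stage_check_first):
    """Return the realloc_head entries that survive for later stages."""
    kept = []
    for res in realloc_head:
        unassigned = res not in assigned         # check a)
        other_stage = res not in in_this_stage   # check b)
        if stage_check_first and other_stage:
            kept.append(res)  # not this stage's business: leave the entry alone
            continue
        if unassigned:
            continue          # skip assignment, entry is removed from the list
        if other_stage:
            kept.append(res)
            continue
        # Entry belongs to this stage and is assigned: its optional size
        # would be applied here, after which the entry is removed.
    return kept


realloc_head = ["0000:e1:02.0 io window", "0000:e2:00.0 mem window"]
in_this_stage = {"0000:e1:02.0 io window"}  # only the 0000:e1 level is processed
assigned = set()                             # nothing assigned successfully yet

old_order = reassign_resources(realloc_head, in_this_stage, assigned, False)
new_order = reassign_resources(realloc_head, in_this_stage, assigned, True)
print("a) before b):", old_order)
print("b) before a):", new_order)
```

With a) first, the deeper window's entry is dropped along with the failed one. With b) first, it stays on the list so a later stage can still use its add_size.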
11 days	PCI: Prevent shrinking bridge window from its required size	Ilpo Järvinen	1	-2/+40
	[ Upstream commit dc4b4d04e1caa3552f000d84d832779ebe51b093 ] Steve reported an eGPU (either Radeon Instinct MI50 32GB or NVIDIA 3080 10GB) connected via Thunderbolt was not assigned sufficient BAR space in v6.11, so the amdgpu and nvidia drivers were unable to initialize the device. pci_bridge_distribute_available_resources() -> ... -> adjust_bridge_window() is called between __pci_bus_size_bridges() and assigning the resources. Since the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") adjust_bridge_window() can also shrink the bridge window. The shrunken size, however, conflicts with what __pci_bus_size_bridges() -> pbus_size_mem() calculated as the required bridge window size. By shrinking the size, adjust_bridge_window() prevents the rest of the resource fitting algorithm from working as intended. Resource fitting logic is expecting assignment failures when bridge windows need resizing, but there are cases where failures are no longer happening after the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary"). The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") justifies the change by the extra reservation made due to hpmemsize parameter, however, the kernel code contradicts that statement. (For simplicity, finer-grained hpmmiosize and hpmmiopref parameters that can be used to the same effect as hpmemsize are ignored in this description.) pbus_size_mem() calls calculate_memsize() twice. First with add_size=0 to find out the minimal required resource size. The second call occurs with add_size=hpmemsize (effectively) but the result does not directly affect the resource size only resulting in an entry on the realloc_head list (a.k.a. add_list). Yet, adjust_bridge_window() directly changes the resource size which does not include what is reserved due to hpmemsize. 
Also, if the required size for the bridge window exceeds hpmemsize, the parameter does not have any effect even on the second size calculation made by pbus_size_mem(); from calculate_memsize(): size = max(size, add_size) + children_add_size; The commit ae4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()") that precedes the commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") is also related to causing this problem. Its changelog explicitly states adjust_bridge_window() wants to "guarantee" allocation success. Guaranteed allocations, however, are incompatible with how the other parts of the resource fitting algorithm work. The given justification fails to explain why guaranteed allocations at this stage are required or why forcing the window to a smaller value than what was calculated by pbus_size_mem() is correct. While the change might have worked by chance in some test scenario, a bridge window that is too small does not "guarantee" success from the point of view of the endpoint device resource assignments. No issue is mentioned within the changelog so it's unclear if the change was made to fix some observed issue or what that issue was. The unwanted shrinking of a bridge window occurs, e.g., when a device with large BARs such as an eGPU is attached using Thunderbolt and the Root Port holds less than enough resource space for the eGPU. The GPU resources are in the order of GBs and the default hotplug allocation is a mere 2MB (DEFAULT_HOTPLUG_MMIO_PREF_SIZE). 
The problem is illustrated by this log (filtered to the relevant content only): pci 0000:00:07.0: PCI bridge to [bus 03-2c] pci 0000:00:07.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref] pci 0000:03:00.0: PCI bridge to [bus 00] pci 0000:03:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref] pci 0000:03:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring pci 0000:03:00.0: PCI bridge to [bus 04-2c] pcieport 0000:00:07.0: Assigned bridge window [mem 0x6000000000-0x601bffffff 64bit pref] to [bus 03-2c] cannot fit 0xc00000000 required for 0000:03:00.0 bridging to [bus 04-2c] pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] to [bus 04-2c] add_size 100000 add_align 100000 pcieport 0000:00:07.0: distributing available resources pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] shrunken by 0x00000007e4400000 pci 0000:03:00.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref]: assigned The initial size of the Root Port's window is 448MB (0x601bffffff - 0x6000000000). __pci_bus_size_bridges() -> pbus_size_mem() calculates the required size to be 32772 MB (0x10003fffff - 0x800000000) which would fit the eGPU resources. adjust_bridge_window() then shrinks the bridge window down to what is guaranteed to fit into the Root Port's bridge window. The bridge window for 03:00.0 is also eliminated from the add_list (a.k.a. realloc_head) list by adjust_bridge_window(). After adjustment, the resources are assigned and as the bridge window for 03:00.0 is assigned successfully, no failure is recorded. Without a failure, no attempt to resize the window of the Root Port is required. The end result is eGPU not having large enough resources to work. The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") also claims nested bridge windows are sized the same, which is false. 
pbus_size_mem() calculates the size for the parent bridge window by summing all the downstream resources so the resource fitting calculates a larger bridge window for the parent to accommodate the children. That is, hpmemsize does not result in the same size for the case where there are nested bridge windows. In order to fix the most immediate problem, don't shrink the resource size in adjust_bridge_window() as hpmemsize had nothing to do with it. When considering add_size, only reduce it up to what is added due to hpmemsize (if the required size is larger than hpmemsize, the parameter has no impact, see calculate_memsize()). Unfortunately, if the tail of the bridge window was aligned in calculate_memsize() from below hpmemsize to above it, the size check will falsely match but the check at least errs on the side of caution. There's not enough information available in adjust_bridge_window() to know the calculated size precisely. This is not exactly a revert of the commits ae4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()") and 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") as shrinking still remains in place but is implemented differently, and the end result behaves very differently. It is possible that those two commits fixed some other issue that is not described with enough detail in the changelog and undoing parts of them results in another regression due to behavioral change. Nonetheless, as described above, the solution by those two commits was flawed and the issue, if one exists, should be solved in a way that is compatible with the rest of the resource fitting algorithm instead of working against it. Besides shrinking, the case where adjust_bridge_window() expands the bridge window is likely somewhat wrong as well because it removes the entry from add_list (a.k.a. 
realloc_head), but it is less damaging as that only impacts optional resources and may have no impact if expanding by hpmemsize is larger than what add_size was. Fixing it is left as further work. Fixes: 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary") Fixes: ae4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()") Reported-by: Steve Oswald <stevepeter.oswald@gmail.com> Closes: https://lore.kernel.org/linux-pci/CAN95MYEaO8QYYL=5cN19nv_qDGuuP5QOD17pD_ed6a7UqFVZ-g@mail.gmail.com/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260219153951.68869-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
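The window sizes quoted in the bridge-window commit above can be checked with a few lines of arithmetic. The following is only a sketch: `window_size()` and `sketch_memsize()` are hypothetical helpers written for this illustration, and `sketch_memsize()` models just the single line of calculate_memsize() quoted in the commit message (the real kernel function takes more parameters and also aligns the result).

```c
#include <stdint.h>

/* Hypothetical helper: size in bytes of an inclusive [start, end] window. */
static uint64_t window_size(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/*
 * Simplified stand-in for the quoted line
 *     size = max(size, add_size) + children_add_size;
 * It ignores every other input of the real calculate_memsize().
 */
static uint64_t sketch_memsize(uint64_t size, uint64_t add_size,
			       uint64_t children_add_size)
{
	uint64_t base = size > add_size ? size : add_size;

	return base + children_add_size;
}
```

With the addresses from the log, the Root Port window comes out at 448 MB and the required window at 32772 MB, and their difference is the 0x7e4400000 reported as "shrunken by". Passing a required size above the 2 MB default as `size` returns it unchanged, which is the "parameter has no effect" case the commit describes.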
11 days	PCI: qcom: Advertise Hotplug Slot Capability with no Command Completion support	Krishna Chaitanya Chundru	1	-6/+11
	[ Upstream commit 33a76fc3c3e61386524479b99f35423bd3d9a895 ] Qcom PCIe Root Ports advertise hotplug capability in hardware, but do not support hotplug command completion. As a result, the hotplug commands issued by the pciehp driver never get a completion notification, leading to repeated timeout warnings and multi-second delays during boot and suspend/resume. Commit a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0") mistakenly assumed that the Root Ports don't support Hotplug due to timeouts and disabled the Hotplug functionality altogether. But the Root Ports do support reporting Hotplug events like DL_Up/Down events. So to fix the command completion timeout issues, just set the No Command Completed Support (NCCS) bit and re-enable Hotplug in the Slot Capability field. Fixes: a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0") Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> [mani: renamed function, commit log and added comment] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # Hamoa CRD, tunneled link Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://patch.msgid.link/20260314-hotplug-v1-1-96ac87d93867@oss.qualcomm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: sky1: Fix missing cleanup of ECAM config on probe failure	Felix Gu	1	-1/+3
	[ Upstream commit 72e76b63d6ff6d1f96acccbfc6c118656f63e66a ] When devm_kzalloc() for reg_off fails, the code returns -ENOMEM without freeing pcie->cfg, which was allocated earlier by pci_ecam_create(). Add the missing pci_ecam_free() call to properly release the allocated ECAM configuration window on this error path. Fixes: a0d9f2c08f45 ("PCI: sky1: Add PCIe host support for CIX Sky1") Signed-off-by: Felix Gu <ustc.gu@gmail.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Hans Zhang <18255117159@163.com> Link: https://patch.msgid.link/20260324-sky1-v1-1-6a00cb2776b6@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: rcar-gen4: Change EPC BAR alignment to 4K as per the documentation	Koichiro Den	1	-1/+1
	[ Upstream commit 13f55a7ca773c731a1e645934c1ae48577f48785 ] R-Car S4 Series (R8A779F[4-7]*) EP controller uses a 4K minimum iATU region size (CX_ATU_MIN_REGION_SIZE = 4K) as per R19UH0161EJ0130 Rev.1.30. Also, the controller itself can only be configured in the range 4 KB to 64 KB, so the current 1 MB alignment requirement is incorrect. Hence, change the alignment to the min size 4K as per the documentation. This also fixes needless unusability of BAR4 on this platform when the target address is fixed, such as for doorbell targets. Fixes: e311b3834dfa ("PCI: rcar-gen4: Add endpoint mode support") Signed-off-by: Koichiro Den <den@valinux.co.jp> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260305151050.1834007-1-den@valinux.co.jp Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: rzg3s-host: Reorder reset assertion during suspend	John Madieu	1	-9/+9
	[ Upstream commit 34735f63748daa2ea27544259c3042b4948376bf ] Reorder the reset assertion sequence during suspend from power_resets -> cfg_resets to cfg_resets -> power_resets. This change ensures the suspend sequence follows the reverse order of the probe/init sequence, where power_resets are deasserted first followed by cfg_resets. Additionally, this ordering is required for RZ/G3E support where cfg resets are controlled through PCIe AXI registers (offset 0x310h). According to the RZ/G3E hardware manual (Rev.1.15, section 6.6.6.1.1 "Changing the Initial Values of the Registers"), AXI register access requires ARESETn to be de-asserted and the clock to be supplied. Since ARESETn is part of power_resets, cfg_resets must be asserted before power_resets, otherwise the AXI registers become inaccessible. Fixes: 7ef502fb35b2 ("PCI: Add Renesas RZ/G3S host controller driver") Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # RZ/V2N EVK Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://patch.msgid.link/20260306143423.19562-3-john.madieu.xa@bp.renesas.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: rzg3s-host: Fix reset handling in probe error path	John Madieu	1	-2/+1
	[ Upstream commit d284389d4576e7c8040dc4cbb66876e539c6d064 ] Fix incorrect reset_control_bulk_deassert() call in the probe error path. When unwinding from a failed pci_host_probe(), the configuration resets should be asserted to restore the hardware to its initial state, not deasserted again. Fixes: 7ef502fb35b2 ("PCI: Add Renesas RZ/G3S host controller driver") Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # RZ/V2N EVK Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Link: https://patch.msgid.link/20260306143423.19562-2-john.madieu.xa@bp.renesas.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: Perform cleanup in the error path of dw_pcie_resume_noirq()	Manivannan Sadhasivam	1	-3/+12
	[ Upstream commit edb5ca3262e2255cf938a5948709d3472d4871ad ] If the dw_pcie_resume_noirq() API fails, it just returns the errno without doing cleanup in the error path, leading to a resource leak. So perform cleanup in the error path. Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality") Reported-by: Senchuan Zhang <zhangsenchuan@eswincomputing.com> Closes: https://lore.kernel.org/linux-pci/78296255.3869.19c8eb694d6.Coremail.zhangsenchuan@eswincomputing.com Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260226133951.296743-1-mani@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: ep: Mirror the max link width and speed fields to all functions	Aksh Garg	1	-1/+28
	[ Upstream commit 94cbea0f636b55602a9a10583670976680ecea67 ] PCIe r7.0, section 7.5.3.6 states that for multi-function devices, the Max Link Width and Max Link Speed fields in the Link Capabilities Register must report the same values for all functions. Currently, dw_pcie_setup() programs these fields only for Function 0 via dw_pcie_link_set_max_speed() and dw_pcie_link_set_max_link_width(). For multi-function endpoint configurations, Function 1 and beyond retain their default values, violating the PCIe specification. Fix this by reading the Max Link Width and Max Link Speed fields from Link Capabilities Register of Function 0 after dw_pcie_setup() completes, then mirroring these values to all other functions. Fixes: 24ede430fa49 ("PCI: designware-ep: Add multiple PFs support for DWC") Fixes: 89db0793c9f2 ("PCI: dwc: Add missing PCI_EXP_LNKCAP_MLW handling") Signed-off-by: Aksh Garg <a-garg7@ti.com> [mani: renamed ref_lnkcap to func0_lnkcap] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260224083817.916782-3-a-garg7@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: dwc: ep: Fix MSI-X Table Size configuration in dw_pcie_ep_set_msix()	Aksh Garg	1	-1/+1
	[ Upstream commit 271d0b1f058ae9815e75233d04b23e3558c3e4f4 ] In dw_pcie_ep_set_msix(), while updating the MSI-X Table Size value for individual functions, Message Control register is read from the passed function number register space using dw_pcie_ep_readw_dbi(), but always written back to the Function 0's register space using dw_pcie_writew_dbi(). This causes incorrect MSI-X configuration for the rest of the functions, other than Function 0. Fix this by using dw_pcie_ep_writew_dbi() to write to the correct function's register space, matching the read operation. Fixes: 70fa02ca1446 ("PCI: dwc: Add dw_pcie_ep_{read,write}_dbi[2] helpers") Signed-off-by: Aksh Garg <a-garg7@ti.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260224083817.916782-2-a-garg7@ti.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: imx6: Fix device node reference leak in imx_pcie_probe()	Felix Gu	1	-2/+2
	[ Upstream commit 3b55079d6387805ede687e234d84669aeb0f7e98 ] In imx_pcie_probe(), of_parse_phandle() returns the device node pointer with an increased refcount. The reference must be dropped by the caller when it's no longer needed. However, imx_pcie_probe() doesn't drop the reference, causing a reference leak. Fix this by using the __free(device_node) cleanup handler to drop the reference when the function goes out of scope. Fixes: 1df82ec46600 ("PCI: imx: Add workaround for e10728, IMX7d PCIe PLL failure") Signed-off-by: Felix Gu <ustc.gu@gmail.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Acked-by: Richard Zhu <hongxing.zhu@nxp.com> Link: https://patch.msgid.link/20260124-pci_imx6-v2-1-acb8d5187683@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: endpoint: pci-ep-msi: Fix error unwind and prevent double alloc	Koichiro Den	1	-0/+5
	[ Upstream commit 1cba96c0a795124c3229293ed7b5b5765e66f259 ] pci_epf_alloc_doorbell() stores the allocated doorbell message array in epf->db_msg/epf->num_db before requesting MSI vectors. If MSI allocation fails, the array is freed but the EPF state may still point to freed memory. Clear epf->db_msg and epf->num_db on the MSI allocation failure path so that later cleanup cannot double-free the array and callers can retry allocation. Also return -EBUSY when doorbells have already been allocated to prevent leaking or overwriting an existing allocation. Fixes: 1c3b002c6bf6 ("PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260217063856.3759713-4-den@valinux.co.jp Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: endpoint: pci-epf-test: Don't free doorbell IRQ unless requested	Koichiro Den	1	-3/+5
	[ Upstream commit e81fa70179aac6ac3a6636565d5d35968dca3900 ] pci_epf_test_doorbell_cleanup() unconditionally calls free_irq() for the doorbell virq, which can trigger "Trying to free already-free IRQ" warnings when the IRQ was never requested or when request_threaded_irq() failed. Move free_irq() out of pci_epf_test_doorbell_cleanup() and invoke it only after a successful request, so that free_irq() is not called for an unrequested IRQ. Fixes: eff0c286aa91 ("PCI: endpoint: pci-epf-test: Add doorbell test support") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260217063856.3759713-3-den@valinux.co.jp Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: endpoint: pci-epf-vntb: Fix MSI doorbell IRQ unwind	Koichiro Den	1	-6/+6
	[ Upstream commit cc04f2bfb9dae60b6e34d6bff75c26d4ec3237ce ] epf_ntb_db_bar_init_msi_doorbell() requests ntb->db_count doorbell IRQs and then performs additional MSI doorbell setup that may still fail. The error path unwinds the requested IRQs, but it uses a loop variable that is reused later in the function. When a later step fails, the unwind can run with an unexpected index value and leave some IRQs requested. Track the number of successfully requested IRQs separately and use that counter for the unwind so all previously requested IRQs are freed on failure. Fixes: dc693d606644 ("PCI: endpoint: pci-epf-vntb: Add MSI doorbell support") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260217063856.3759713-2-den@valinux.co.jp Signed-off-by: Sasha Levin <sashal@kernel.org>
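The unwind pattern described in the pci-epf-vntb commit above can be sketched outside the kernel as follows. This is an illustration only: every `sketch_*` name is hypothetical, the IRQ request and free calls are replaced by a boolean array, and the "later step" is reduced to a flag that forces a failure.

```c
#include <stdbool.h>

#define SKETCH_MAX_IRQS 8

/* Stand-in for IRQ state: true means "currently requested". */
static bool sketch_requested[SKETCH_MAX_IRQS];

static int sketch_request_irq(int i)
{
	sketch_requested[i] = true;
	return 0;
}

static void sketch_free_irq(int i)
{
	sketch_requested[i] = false;
}

/* Caller must pass db_count <= SKETCH_MAX_IRQS. */
static int sketch_init(int db_count, bool later_step_fails)
{
	int req;	/* IRQs successfully requested so far; used only for unwind */
	int i;		/* scratch index that later setup is free to reuse */

	for (req = 0; req < db_count; req++)
		if (sketch_request_irq(req))
			goto err_free;

	/* Later setup reuses 'i'; a failure here must not confuse the unwind. */
	for (i = 0; i < db_count; i++)
		if (later_step_fails && i == 0)
			goto err_free;

	return 0;

err_free:
	/* Free exactly the IRQs that were requested, newest first. */
	while (--req >= 0)
		sketch_free_irq(req);
	return -1;
}

static int sketch_count_requested(void)
{
	int n = 0;

	for (int i = 0; i < SKETCH_MAX_IRQS; i++)
		n += sketch_requested[i];
	return n;
}
```

Because `req` is never touched by the later loop, the error path frees all previously requested entries regardless of where `i` happens to stand when the failure occurs.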
11 days	PCI/TPH: Allow TPH enable for RCiEPs	George Abraham P	1	-3/+6
	[ Upstream commit d3e996a596967a62c8a13a279221513461f6ab97 ] Previously, pcie_enable_tph() only enabled TLP Processing Hints (TPH) if both the Endpoint and its Root Port advertised TPH support. Root Complex Integrated Endpoints (RCiEPs) are directly integrated into a Root Complex and do not have an associated Root Port, so pcie_enable_tph() never enabled TPH for RCiEPs. PCIe r7.0 doesn't seem to include a way to learn whether a Root Complex supports TPH, but sec 2.2.7.1.1 says Functions that lack TPH support should ignore TPH, and maybe the same is true for Root Complexes: A Function that does not support the TPH Completer or Routing capability and receives a transaction with the TH bit [which indicates the presence of TPH in the TLP header] Set is required to ignore the TH bit and handle the Request in the same way as Requests of the same transaction type without the TH bit Set. Allow drivers to enable TPH for any RCiEP with a TPH Requester Capability. Fixes: f69767a1ada3 ("PCI: Add TLP Processing Hints (TPH) support") Signed-off-by: George Abraham P <george.abraham.p@intel.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260109052923.1170070-1-george.abraham.p@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
11 days	PCI: use generic driver_override infrastructure	Danilo Krummrich	3	-33/+7
	[ Upstream commit 10a4206a24013be4d558d476010cbf2eb4c9fa64 ] When a driver is probed through __driver_attach(), the bus' match() callback is called without the device lock held, thus accessing the driver_override field without a lock, which can cause a UAF. Fix this by using the driver-core driver_override infrastructure taking care of proper locking internally. Note that calling match() from __driver_attach() without the device lock held is intentional. [1] Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1] Reported-by: Gui-Dong Han <hanguidong02@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789 Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override") Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Williamson <alex@shazbot.org> Tested-by: Gui-Dong Han <hanguidong02@gmail.com> Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com> Link: https://patch.msgid.link/20260324005919.2408620-6-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-05-14	PCI/ASPM: Fix pci_clear_and_set_config_dword() usage	Lukas Wunner	1	-5/+12
	commit cc33985d26c92a5c908c0185239c59ec35b8637c upstream. When aspm_calc_l12_info() programs the L1 PM Substates Control 1 register fields Common_Mode_Restore_Time, LTR_L1.2_THRESHOLD_Value and _Scale, it invokes pci_clear_and_set_config_dword() in an incorrect way: For the bits to clear it selects those corresponding to the field. So far so good. But for the bits to set it passes a full register value. pci_clear_and_set_config_dword() performs a boolean OR operation which sets all bits of that value, not just the ones that were just cleared. Thus, when setting the LTR_L1.2_THRESHOLD_Value and _Scale on the child of an ASPM link, aspm_calc_l12_info() also sets the Common_Mode_Restore_Time. That's a spec violation: PCIe r7.0 sec 7.8.3.3 says this field is RsvdP for Upstream Ports. On Adrià's Pixelbook Eve, Common_Mode_Restore_Time of the Intel 7265 "Stone Peak" wifi card is zero, yet aspm_calc_l12_info() does not preserve the zero bits but instead programs the value calculated for the Root Port into the wifi card. Likewise, when setting the Common_Mode_Restore_Time on the Root Port, aspm_calc_l12_info() also changes the LTR_L1.2_THRESHOLD_Value and _Scale from the initial 163840 nsec to 237568 nsec (due to ORing those fields), only to reduce it afterwards to 106496 nsec. Amend all invocations of pci_clear_and_set_config_dword() to only set bits which are cleared. Finally, when setting the T_POWER_ON_Value and _Scale on the Root Port and the wifi card, aspm_calc_l12_info() fails to preserve bits declared RsvdP and instead overwrites them with zeroes. Replace pci_write_config_dword() with pci_clear_and_set_config_dword() to avoid this. 
Fixes: aeda9adebab8 ("PCI/ASPM: Configure L1 substate settings") Link: https://bugzilla.kernel.org/show_bug.cgi?id=220705#c22 Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Adrià Vilanova Martínez <me@avm99963.com> Cc: stable@vger.kernel.org # v4.11+ Link: https://patch.msgid.link/5c1752d7512eed0f4ea57b84b12d7ee08ca61fc5.1771226659.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
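The misuse described in the ASPM commit above comes down to how a clear-then-OR update treats its "set" argument. The sketch below models that with a plain integer instead of config space. The field masks are hypothetical and do not reflect the real L1 PM Substates Control 1 layout, and `clear_and_set()` is a local model of the clear-then-OR behaviour the commit describes, not the kernel function itself.

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define SKETCH_FIELD_A_MASK 0x0000ff00u
#define SKETCH_FIELD_B_MASK 0xffff0000u

/* Model of the update: clear the given bits, then OR in the set bits. */
static uint32_t clear_and_set(uint32_t reg, uint32_t clear, uint32_t set)
{
	return (reg & ~clear) | set;
}

/* Buggy usage: clears only field B but ORs in a full register value. */
static uint32_t update_field_b_buggy(uint32_t reg, uint32_t full_value)
{
	return clear_and_set(reg, SKETCH_FIELD_B_MASK, full_value);
}

/* Fixed usage: only bits that were just cleared are set. */
static uint32_t update_field_b_fixed(uint32_t reg, uint32_t full_value)
{
	return clear_and_set(reg, SKETCH_FIELD_B_MASK,
			     full_value & SKETCH_FIELD_B_MASK);
}
```

In the buggy variant a zero field A picks up whatever bits the full value carries in that position, which mirrors how a Common_Mode_Restore_Time calculated for the Root Port ended up in the wifi card. Masking the set value leaves field A untouched.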
2026-05-14	PCI/AER: Stop ruling out unbound devices as error source	Lukas Wunner	1	-2/+0
	commit 1ab4a3c805084d752ec571efc78272295a9f2f74 upstream. When searching for the error source, the AER driver rules out devices whose enable_cnt is zero. This was introduced in 2009 by commit 28eb27cf0839 ("PCI AER: support invalid error source IDs") without providing a rationale. Drivers typically call pci_enable_device() on probe, hence the enable_cnt check essentially filters out unbound devices. At the time of the commit, drivers had to opt in to AER by calling pci_enable_pcie_error_reporting() and so any AER-enabled device could be assumed to be bound to a driver. The check thus made sense because it allowed skipping config space accesses to devices which were known not to be the error source. But since 2022, AER is universally enabled on all devices when they are enumerated, cf. commit f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"). Errors may very well be reported by unbound devices, e.g. due to link instability. By ruling them out as error source, errors reported by them are neither logged nor cleared. When they do get bound and another error occurs, the earlier error is reported together with the new error, which may confuse users. Stop doing so. Fixes: f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stefan Roese <stefan.roese@mailbox.org> Cc: stable@vger.kernel.org # v6.0+ Link: https://patch.msgid.link/734338c2e8b669db5a5a3b45d34131b55ffebfca.1774605029.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-14	PCI/AER: Clear only error bits in PCIe Device Status	Shuai Xue	1	-4/+3
	commit a8aeea1bf3c80cc87983689e0118770e019bd4f3 upstream. Currently, pcie_clear_device_status() clears the entire PCIe Device Status register (PCI_EXP_DEVSTA) by writing back the value read from the register, which affects not only the error status bits but also other writable bits. According to PCIe r7.0, sec 7.5.3.5, this register contains: - RW1C error status bits (CED, NFED, FED, URD at bits 0-3): These are the four error status bits that need to be cleared. - Read-only bits (AUXPD at bit 4, TRPND at bit 5): Writing to these has no effect. - Emergency Power Reduction Detected (bit 6): A RW1C non-error bit introduced in PCIe r5.0 (2019). This is currently the only writable non-error bit in the Device Status register. Unconditionally clearing this bit can interfere with other software components that rely on this power management indication. - Reserved bits (RsvdZ): These bits are required to be written as zero. Writing 1s to them (as the current implementation may do) violates the specification. To prevent unintended side effects, modify pcie_clear_device_status() to only write 1s to the four error status bits (CED, NFED, FED, URD), leaving the Emergency Power Reduction Detected bit and reserved bits unaffected. Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL") Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260211124624.49656-1-xueshuai@linux.alibaba.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
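The effect of writing back the whole Device Status value, as opposed to only the four error bits, can be modelled on a plain 16-bit integer. This is a sketch under stated assumptions: the `SKETCH_*` masks follow the bit positions listed in the commit message above, and `rw1c_write()` is a hypothetical model of write-1-to-clear semantics, not a kernel API.

```c
#include <stdint.h>

#define SKETCH_DEVSTA_ERR_MASK	0x000fu	/* CED, NFED, FED, URD: bits 0-3 */
#define SKETCH_DEVSTA_EPRD	0x0040u	/* Emergency Power Reduction Detected: bit 6 */
#define SKETCH_DEVSTA_RW1C	(SKETCH_DEVSTA_ERR_MASK | SKETCH_DEVSTA_EPRD)

/* Model of a register write: a 1 written to an RW1C bit clears it. */
static uint16_t rw1c_write(uint16_t reg, uint16_t written)
{
	return (uint16_t)(reg & ~(written & SKETCH_DEVSTA_RW1C));
}

/* Old behaviour: write back whatever was read. */
static uint16_t clear_status_old(uint16_t reg)
{
	return rw1c_write(reg, reg);
}

/* New behaviour: write 1s only to the four error status bits. */
static uint16_t clear_status_new(uint16_t reg)
{
	return rw1c_write(reg, SKETCH_DEVSTA_ERR_MASK);
}
```

Starting from a value with CED, FED and bit 6 set, the old variant returns zero while the new one leaves bit 6 in place for whichever component consumes it.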
2026-05-14	PCI: Update saved_config_space upon resource assignment	Lukas Wunner	1	-0/+2
	commit 909f7bf9b080c10df3c3b38533906dbf09ff1d8b upstream. Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB adapter plugged into an ASRock X570S PG Riptide board with BIOS version P5.41 (09/07/2023): ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter ddbridge 0000:05:00.0: cannot read registers ddbridge 0000:05:00.0: fail BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the upstream bridge window. The kernel corrects the BAR assignment: pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned Correction of the BAR assignment happens in an x86-specific fs_initcall, pcibios_assign_resources(), after device enumeration in a subsys_initcall. This order was introduced at the behest of Linus in 2004: https://git.kernel.org/tglx/history/c/a06a30144bbc No other architecture performs such a late BAR correction. Bernd bisected the issue to commit a2f1e22390ac ("PCI/ERR: Ensure error recoverability at all times"), but it only occurs in the absence of commit 4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing"). This combination exists in stable kernel v6.12.70, but not in mainline, hence Bernd cannot reproduce the issue with mainline. Since a2f1e22390ac, config space is saved on enumeration, prior to BAR correction. Upon passthrough, the corrected BAR is overwritten with the incorrect saved value by: vfio_pci_core_register_device() vfio_pci_set_power_state() pci_restore_state() But only if the device's current_state is PCI_UNKNOWN, as it was prior to commit 4d4c10f763d7. Since the commit, it is PCI_D0, which changes the behavior of vfio_pci_set_power_state() to no longer restore the state without saving it first. Alexandre is reporting the same issue as Bernd, but in his case, mainline is affected as well. 
The difference is that on Alexandre's system, the host kernel binds a driver to the device which is unbound prior to passthrough, whereas on Bernd's system no driver gets bound by the host kernel. Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when vfio-pci is subsequently bound to the device, pci_restore_state() is once again called without invoking pci_save_state() first. To robustly fix the issue, always update saved_config_space upon resource assignment. Reported-by: Bernd Schumacher <bernd@bschu.de> Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/ Reported-by: Alexandre N. <an.tech@mailo.com> Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/ Fixes: a2f1e22390ac ("PCI/ERR: Ensure error recoverability at all times") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Bernd Schumacher <bernd@bschu.de> Tested-by: Alexandre N. <an.tech@mailo.com> Cc: stable@vger.kernel.org # v6.12+ Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-07	PCI: imx6: Fix reference clock source selection for i.MX95	Franz Schnyder	1	-2/+2
	commit 88cc4cbe08bba27bb58888d25d336774aa0ccab1 upstream. In the PCIe PHY init for the i.MX95, the reference clock source selection uses a conditional instead of always passing the mask. This currently breaks functionality if the internal refclk is used. To fix this issue, always pass IMX95_PCIE_REF_USE_PAD as the mask and clear the bit if the external refclk is not used. This essentially swaps the parameters. Fixes: d8574ce57d76 ("PCI: imx6: Add external reference clock input mode support") Signed-off-by: Franz Schnyder <franz.schnyder@toradex.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Richard Zhu <hongxing.zhu@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260325093118.684142-1-fra.schnyder@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
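Why swapping the mask and value arguments matters in the i.MX95 commit above can be shown with a small model of a masked register update. This is a sketch only: it assumes the driver uses an update of the form `(old & ~mask) | (val & mask)`, which the commit message does not spell out, and `SKETCH_REF_USE_PAD` is a hypothetical bit position rather than the real IMX95_PCIE_REF_USE_PAD value.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_REF_USE_PAD (1u << 0)	/* hypothetical bit position */

/* Model of a masked update: only bits in 'mask' may change. */
static uint32_t update_bits(uint32_t old, uint32_t mask, uint32_t val)
{
	return (old & ~mask) | (val & mask);
}

/* Conditional mask: with the internal refclk the mask is 0, so nothing changes. */
static uint32_t select_refclk_buggy(uint32_t reg, bool ext_refclk)
{
	return update_bits(reg, ext_refclk ? SKETCH_REF_USE_PAD : 0,
			   SKETCH_REF_USE_PAD);
}

/* Constant mask: the value decides whether the bit ends up set or cleared. */
static uint32_t select_refclk_fixed(uint32_t reg, bool ext_refclk)
{
	return update_bits(reg, SKETCH_REF_USE_PAD,
			   ext_refclk ? SKETCH_REF_USE_PAD : 0);
}
```

If the bit is already set when the function runs, the conditional-mask variant cannot clear it for the internal refclk case, while the constant-mask variant can.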
2026-05-07	PCI: cadence: Use cdns_pcie_read_sz() for byte or word read access	Aksh Garg	1	-31/+25
	commit d9cf7154deed71a4f23e81101571c79cdc77be00 upstream. The commit 18ac51ae9df9 ("PCI: cadence: Implement capability search using PCI core APIs") assumed all the platforms using Cadence PCIe controller support byte and word register accesses. This is not true for all platforms (e.g., TI J721E SoC, which only supports dword register accesses). This causes capability searches via cdns_pcie_find_capability() to fail on such platforms. Fix this by using cdns_pcie_read_sz() for config read functions, which properly handles size-aligned accesses. Remove the now-unused byte and word read wrapper functions (cdns_pcie_readw and cdns_pcie_readb). Fixes: 18ac51ae9df9 ("PCI: cadence: Implement capability search using PCI core APIs") Signed-off-by: Aksh Garg <a-garg7@ti.com> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260402085545.284457-1-a-garg7@ti.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-07	PCI: epf-mhi: Return 0, not remaining timeout, when eDMA ops complete	Daniel Hodges	1	-0/+4
	commit 36bfc3642b19a98f1302aed4437c331df9b481f0 upstream. pci_epf_mhi_edma_read() and pci_epf_mhi_edma_write() start DMA operations and wait for completion with a timeout. On successful completion, they previously returned the remaining timeout, which callers may treat as an error. In particular, mhi_ep_ring_add_element(), which calls pci_epf_mhi_edma_write() via mhi_cntrl->write_sync(), interprets any non-zero return value as failure. Return 0 on success instead of the remaining timeout to prevent mhi_ep_ring_add_element() from treating successful completion as an error. Fixes: 7b99aaaddabb ("PCI: epf-mhi: Add eDMA support") Signed-off-by: Daniel Hodges <git@danielhodges.dev> [mani: changed commit log as per https://lore.kernel.org/linux-pci/20260227191510.GA3904799@bhelgaas] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260206200529.10784-1-git@danielhodges.dev Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
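	Note: the return-value mismatch described above can be shown with a small user-space model. In the kernel, wait_for_completion_timeout() returns 0 on timeout and a positive remaining time on completion. The functions below (wait_stub(), edma_write_buggy(), edma_write_fixed(), ring_add_element()) are hypothetical stand-ins written for this sketch, not the driver's code.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for a timed wait: 0 on timeout, remaining ticks (>= 1) otherwise. */
unsigned long wait_stub(unsigned long budget, unsigned long needed)
{
	unsigned long left;

	if (needed > budget)
		return 0;
	left = budget - needed;
	return left ? left : 1;
}

/* Leaks the remaining time to the caller on success. */
int edma_write_buggy(unsigned long budget, unsigned long needed)
{
	unsigned long left = wait_stub(budget, needed);

	if (!left)
		return -ETIMEDOUT;
	return (int)left;
}

/* Normalizes success to 0, so "non-zero means error" holds. */
int edma_write_fixed(unsigned long budget, unsigned long needed)
{
	if (!wait_stub(budget, needed))
		return -ETIMEDOUT;
	return 0;
}

/* Caller convention from the commit message: any non-zero value is a failure. */
int ring_add_element(int (*write_sync)(unsigned long, unsigned long))
{
	int ret = write_sync(100, 30);

	return ret ? -1 : 0;
}

void edma_example(void)
{
	assert(ring_add_element(edma_write_buggy) == -1); /* success misread as error */
	assert(ring_add_element(edma_write_fixed) == 0);
}
```

	The fix belongs in the callee rather than the caller because other users of the write_sync() hook may rely on the same zero-on-success convention.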
2026-05-07	PCI: endpoint: pci-epf-ntb: Remove duplicate resource teardown	Koichiro Den	1	-54/+2
	commit 3446beddba450c8d6f9aca2f028712ac527fead3 upstream. epf_ntb_epc_destroy() duplicates the teardown that the caller is supposed to do later. This leads to an oops when .allow_link fails or when .drop_link is performed. Remove the helper. Also drop pci_epc_put(). EPC device refcounting is tied to configfs EPC group lifetime, and pci_epc_put() in the .drop_link path is sufficient. Fixes: 8b821cf76150 ("PCI: endpoint: Add EP function driver to provide NTB functionality") Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260226084142.2226875-3-den@valinux.co.jp Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-05-07	PCI: imx6: Skip waiting for L2/L3 Ready on i.MX6SX	Richard Zhu	1	-0/+1
	commit 5f73cf1db829c21b7fd44a8d2587cd395b1b2d76 upstream. On i.MX6SX, the LTSSM registers become inaccessible after the PME_Turn_Off message is sent to the link. So there is no way to verify whether the link has entered L2/L3 Ready state or not. Hence, set IMX_PCIE_FLAG_SKIP_L23_READY flag for i.MX6SX SoC to skip the L2/L3 Ready state polling and let the DWC core wait for 10ms after sending the PME_Turn_Off message as per the PCIe spec r6.0, sec 5.3.3.2.1. Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method") Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> [mani: commit log] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Frank Li <Frank.Li@nxp.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260228080925.1558395-1-hongxing.zhu@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>