| Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit f45a49a2380a47332817b7248c61a0ebbc6f0d00 ]
When setting new_id of a PCI device driver using sysfs a lockdep splat
occurs. This is because new_id_store() builds a temporary pci_dev for
pci_match_device(), which calls device_match_driver_override(). That
depends on the driver_override.lock added by cb3d1049f4ea ("driver core:
generalize driver_override in struct device").
The new driver_override.lock was not initialized in the temporary pci_dev,
resulting in this lockdep splat.
Initialize the temporary pci_dev to fix this.
Repro:
Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID:
# echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x5d/0x80
register_lock_class+0x77e/0x790
lock_acquire+0xbf/0x2e0
pci_match_device+0x24/0x180
new_id_store+0x189/0x1d0
kernfs_fop_write_iter+0x14f/0x210
vfs_write+0x263/0x5e0
ksys_write+0x79/0xf0
do_syscall_64+0x117/0xf80
Fixes: 10a4206a2401 ("PCI: use generic driver_override infrastructure")
Fixes: 8895d3bcb8ba ("PCI: Fail new_id for vendor/device values already built into driver")
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
[bhelgaas: add commit log details and repro, trim backtrace]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 988ef706cdd8a72e61dd90c0d0554eec4df7594a ]
Since commit f3ac2ff14834 ("PCI/ASPM: Enable all ClockPM and ASPM states
for devicetree platforms") force enables ASPM on all device tree platforms,
the SG2042 Root Ports are breaking as they advertise L0s and L1
capabilities without supporting them.
Set ASPM quirks to disable the L0s and L1 capabilities for the Root Ports
so that these broken link states won't be enabled.
Fixes: 4e27aca4881a ("riscv: sophgo: dts: add PCIe controllers for SG2042")
Co-developed-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Yao Zi <me@ziyao.cc>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Han Gao <gaohan@iscas.ac.cn>
Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://patch.msgid.link/20260405154154.46829-3-me@ziyao.cc
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5ccc76a87f1ec2422811e61be44165bfc9e7cf54 ]
Add flags for disabling the ASPM L0s/L1 capability for broken Root Ports
by clearing the corresponding bits in Link Capabilities Register through
the local management bus. This allows ASPM to be disabled on platforms
which don't support it.
Signed-off-by: Yao Zi <me@ziyao.cc>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Han Gao <gaohan@iscas.ac.cn>
Tested-by: Chen Wang <unicorn_wang@outlook.com> # Pioneerbox
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://patch.msgid.link/20260405154154.46829-2-me@ziyao.cc
Stable-dep-of: 988ef706cdd8 ("PCI: sg2042: Avoid L0s and L1 on Sophgo 2042 PCIe Root Ports")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 34b3eef48d980cd37b876e128bbf314f69fb5d70 ]
When PERST# is deasserted twice (assert -> deassert -> assert -> deassert),
a CBB (Control Backbone) timeout occurs at DBI register offset 0x8bc
(PCIE_MISC_CONTROL_1_OFF). This happens because pci_epc_deinit_notify()
and dw_pcie_ep_cleanup() are called before reset_control_deassert() powers
on the controller core.
The call chain that causes the timeout:
pex_ep_event_pex_rst_deassert()
pci_epc_deinit_notify()
pci_epf_test_epc_deinit()
pci_epf_test_clear_bar()
pci_epc_clear_bar()
dw_pcie_ep_clear_bar()
__dw_pcie_ep_reset_bar()
dw_pcie_dbi_ro_wr_en() <- Accesses 0x8bc DBI register
reset_control_deassert(pcie->core_rst) <- Core powered on HERE
The DBI registers, including PCIE_MISC_CONTROL_1_OFF (0x8bc), are only
accessible after the controller core is powered on via
reset_control_deassert(pcie->core_rst). Accessing them before this point
results in a CBB timeout because the hardware is not yet operational.
Fix this by moving pci_epc_deinit_notify() and dw_pcie_ep_cleanup() to
after reset_control_deassert(pcie->core_rst), ensuring the controller is
fully powered on before any DBI register accesses occur.
Fixes: 40e2125381dc ("PCI: tegra194: Move controller cleanups to pex_ep_event_pex_rst_deassert()")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-15-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f59df1d9e6bdb6bd7ef65fb5d200900ac40c20ba ]
When Tegra234 is operating in the Endpoint mode with L1.2 enabled, PCIe
link goes down during L1.2 exit. This is because Tegra234 powers up UPHY
PLL immediately without making sure that the REFCLK is stable.
This causes UPHY PLL to fail to lock to the correct frequency and leads to
link going down. There is no hardware fix for this, hence do not advertise
the L1.2 capability in the Endpoint mode.
Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-14-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40805f32dceadebb7381d911003100bec7b8cd51 ]
The ECRC (TLP digest) workaround was originally added for DesignWare
version 4.90a. Tegra234 SoC has 5.00a DWC HW version, which has the same
ATU TD override behaviour, so apply the workaround for 5.00a too.
Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-13-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ea60ca067f0f098043610c96a915d162113c1aac ]
Tegra194 PCIe driver used custom version numbers to detect Tegra194 and
Tegra234 IPs. With version detect logic added, version check results in
mismatch warnings:
tegra194-pcie 14100000.pcie: Versions don't match (0000562a != 3536322a)
Use HW version numbers which match to PORT_LOGIC.PCIE_VERSION_OFF in
Tegra194 driver to avoid these kernel warnings.
Fixes: a54e19073718 ("PCI: tegra194: Add Tegra234 PCIe support")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-12-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8870f02f7868209eb9bdc5dc53540a6262cf9227 ]
Free up the resources during remove() that were acquired by the DesignWare
driver for the Endpoint mode during probe().
Fixes: bb617cbd8151 ("PCI: tegra194: Clean up the exit path for Endpoint mode")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-11-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c76f8eae7d4695b1176c4ea5eb93c17e16a20272 ]
Host software initiates the L2 sequence. PCIe link is kept in L2 state
during suspend. If Endpoint mode is enabled and the link is up, the
software cannot proceed with suspend. However, when the PCIe Endpoint
driver is probed, but the PCIe link is not up, Tegra can go into suspend
state. So, allow system to suspend in this case.
Fixes: de2bbf2b71bb ("PCI: tegra194: Don't allow suspend when Tegra PCIe is in EP mode")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-10-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b256493bf8cacf0e524bf4c10b5c4901d0c6cefe ]
LTR message should be sent as soon as the Root Port enables LTR in the
Endpoint mode. So set snoop and no-snoop LTR timing and LTR message request
before the PCIe link comes up, so that the LTR message is sent upstream as
soon as LTR is enabled.
Without programming these values, the Endpoint would send latencies of 0 to
the host, which will be inaccurate.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-9-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 976f6763f57970388bcd7118931f33f447916927 ]
Pre-silicon simulation showed the controller operating in Endpoint mode
initiating link speed change after completing Secondary Bus Reset. Ideally,
the Root Port or the Switch Downstream Port should initiate the link speed
change post SBR, not the Endpoint.
So, as per the hardware team recommendation, disable direct speed change
for the Endpoint mode to prevent it from initiating speed change after the
physical layer link is up at Gen1, leaving speed change ownership with the
host.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-8-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f62bc7917de1374dce86a852ffba8baf9cb7a56a ]
The GPIO DT property "nvidia,refclk-select", to select the PCIe reference
clock is optional. Use devm_gpiod_get_optional() to get it.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-7-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 40658a31b6e134169c648041efc84944c4c71dcd ]
The PERST# GPIO interrupt is only registered when the controller is
operating in Endpoint mode. In Root Port mode, the PERST# GPIO is
configured as an output to control downstream devices, and no interrupt is
registered for it.
Currently, tegra_pcie_dw_stop_link() unconditionally calls disable_irq()
on pex_rst_irq, which causes issues in Root Port mode where this IRQ is
not registered.
Fix this by only disabling the PERST# IRQ when operating in Endpoint mode,
where the interrupt is actually registered and used to detect PERST#
assertion/deassertion from the host.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-6-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 71d9f67701e1affc82d18ca88ae798c5361beddf ]
As per PCIe CEM r6.0, sec 2.3, the PCIe Endpoint device should be in D3cold
to assert WAKE# pin. The previous workaround that forced downstream devices
to D0 before taking the link to L2 cited PCIe r4.0, sec 5.2, "Link State
Power Management"; however, that spec does not explicitly require putting
the device into D0 and only indicates that power removal may be initiated
without transitioning to D3hot.
Remove the D0 workaround so that Endpoint devices can use wake
functionality (WAKE# from D3). With some Endpoints the link may not enter
L2 when they remain in D3, but the Root Port continues with the usual flow
after PME timeout, so there is no functional issue.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-5-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9fa0c242f8d7acf1b124d4462d18f4023573ac1c ]
After the link reaches a Detect-related LTSSM state, disable LTSSM so it
does not keep toggling between Polling and Detect. Do this by polling for
the Detect state first, then clearing APPL_CTRL_LTSSM_EN in both
tegra_pcie_dw_pme_turnoff() and pex_ep_event_pex_rst_assert().
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-4-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 74dd8efe4d6cead433162147333af989a568aac7 ]
On surprise link down, LTSSM state transits from L0 -> Recovery.RcvrLock ->
Recovery.RcvrSpeed -> Gen1 Recovery.RcvrLock -> Detect. Recovery.RcvrLock
and Recovery.RcvrSpeed transit times are 24 ms and 48 ms respectively, so
the total time from L0 to Detect is ~96 ms. Increase the poll timeout to
120 ms to account for this.
While at it, add LTSSM state defines for Detect-related states and use them
in the poll condition. Use readl_poll_timeout() instead of
readl_poll_timeout_atomic() in tegra_pcie_dw_pme_turnoff() since that path
runs in non-atomic context.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-3-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit adaffed907f14f954096555665ad6af2ae724d83 ]
As per PCIe r7.0, sec 5.3.3.2.1, after sending PME_Turn_Off message, Root
Port should wait for 1-10 msec for PME_TO_Ack message. Currently, driver is
polling for 10 msec with 1 usec delay which is aggressive. Use existing
macro PCIE_PME_TO_L2_TIMEOUT_US to poll for 10 msec with 1 msec delay.
Since this function is used in non-atomic context only, use non-atomic poll
function.
Fixes: 56e15a238d92 ("PCI: tegra: Add Tegra194 PCIe support")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Link: https://patch.msgid.link/20260324190755.1094879-2-mmaddireddy@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 16d021c878dca22532c984668c9e8cf4722d6a49 ]
NPEM registers LED classdevs on PCI endpoint that may be behind
hotplug-capable ports. During hot-removal, led_classdev_unregister() calls
led_set_brightness(LED_OFF) which leads to a PCI config read to a
disconnected device, which fails and returns -ENODEV (topology details in
msgid.link below):
leds 0003:01:00.0:enclosure:ok: Setting an LED's brightness failed (-19)
The LED core already suppresses this for devices with LED_HW_PLUGGABLE set,
but NPEM never sets it. Add the flag since NPEM LEDs are on hot-pluggable
hardware by nature.
Fixes: 4e893545ef87 ("PCI/NPEM: Add Native PCIe Enclosure Management support")
Signed-off-by: Richard Cheng <icheng@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Kai-Heng Feng <kaihengf@nvidia.com>
Link: https://patch.msgid.link/20260402093850.23075-1-icheng@nvidia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5573c44cb3fd01a9f62d569ae9ac870ef5f0e0ba ]
In mtk_pcie_setup_irq(), the IRQ domains are allocated before the
controller's IRQ is fetched. If the latter fails, the function
directly returns an error, without cleaning up the allocated domains.
Hence, reverse the order so that the IRQ domains are allocated after the
controller's IRQ is found.
This was flagged by Sashiko during a review of "[PATCH v6 0/7] PCI:
mediatek-gen3: add power control support".
Fixes: 814cceebba9b ("PCI: mediatek-gen3: Add INTx support")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://sashiko.dev/#/patchset/20260324052002.4072430-1-wenst%40chromium.org
Link: https://patch.msgid.link/20260324093542.18523-1-wenst@chromium.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c54d5f5b33990f2649c20f35407f340bcadb8a53 ]
The aspeed_pcie_probe() function calls aspeed_pcie_init_irq_domain()
which allocates pcie->intx_domain and initializes MSI. However, if
platform_get_irq() fails afterwards, the cleanup action was not yet
registered via devm_add_action_or_reset(), causing the IRQ domain
resources to leak.
Fix this by registering the devm cleanup action immediately after
aspeed_pcie_init_irq_domain() succeeds, before calling
platform_get_irq(). This ensures proper cleanup on any subsequent
failure.
Fixes: 9aa0cb68fcc1 ("PCI: aspeed: Add ASPEED PCIe RC driver")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Jacky Chou <jacky_chou@aspeedtech.com>
Link: https://patch.msgid.link/20260324-aspeed-v1-1-354181624c00@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
hardware bug
[ Upstream commit 3a4e8302e72f83fd5cc8a916fc6f5c8fe5c8690e ]
On NXP i.MX7D, i.MX8MM, and i.MX8MQ chipsets, MSIs from the endpoints won't
be received by the iMSI-RX MSI controller if the Root Port MSI capability
is disabled.
Even though the Root Port MSIs won't be received by the iMSI-RX controller
due to design, these chipsets have some weird hardware bug that prevents
the endpoint MSIs from reaching when the Root Port MSI capability is
disabled.
Hence, introduce a new flag, 'dw_pcie_rp::keep_rp_msi_en', set it for the
above mentioned SoCs, and always keep the Root Port MSI capability when
this flag is set.
Note that by keeping Root Port MSI capability, Root Port MSIs such as AER,
PME and others won't be received by default. So users need to use
workarounds such as passing 'pcie_pme=nomsi' cmdline param.
Fixes: f5cd8a929c825 ("PCI: dwc: Remove MSI/MSIX capability for Root Port if iMSI-RX is used as MSI controller")
Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
[bhelgaas: fix typos]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260331085252.1243108-1-hongxing.zhu@nxp.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1ae8c4ce157037e266184064a182af9ef9af278b ]
When inspecting the config space of a Connect-X physical function in an
s390 system after it was initialized by the mlx5_core device driver, we
found the function to be enabled to request AtomicOps despite the Root Port
lacking support for completing them:
00:00.1 Ethernet controller: Mellanox Technologies MT2894 Family [ConnectX-6 Lx]
Subsystem: Mellanox Technologies Device 0002
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
AtomicOpsCtl: ReqEn+
On s390 and many virtualized guests, the Endpoint is visible but the Root
Port is not. In this case, pci_enable_atomic_ops_to_root() previously
enabled AtomicOps in the Endpoint even though it can't tell whether the
Root Port supports them as a completer.
Change pci_enable_atomic_ops_to_root() to fail if there's no Root Port or
the Root Port doesn't support AtomicOps.
Fixes: 430a23689dea ("PCI: Add pci_enable_atomic_ops_to_root()")
Reported-by: Alexander Schmidt <alexs@linux.ibm.com>
Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
[bhelgaas: commit log, check RP first to simplify flow]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260330-fix_pciatops-v7-2-f601818417e8@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 445588a3b18bb0702d746cb61f7a443639027651 ]
kstrtou32_from_user() returns int, but the return value was stored in
a u32 variable 'val', risking sign loss. Use a dedicated int variable
to correctly handle the return code.
Fixes: 4fbfa17f9a07 ("PCI: dwc: Add debugfs based Silicon Debug support for DWC")
Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260401023048.4182452-1-18255117159@163.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 97970e7c694356e3386a10e3b936d61eafd06bce ]
aer_print_error() skips printing if ratelimit_print[i] is not set. In the
native AER path, ratelimit_print is initialized by add_error_device()
during source device discovery, and is set to 1 for fatal errors to bypass
rate limiting since fatal errors should always be logged.
The DPC/EDR path uses the DPC-capable port as the error source and reads
its AER uncorrectable error status registers directly in
dpc_get_aer_uncorrect_severity(). Since it does not go through
add_error_device(), ratelimit_print[0] is left uninitialized and zero. As
a result, aer_print_error() silently drops all AER error messages for
DPC/EDR triggered events.
Set ratelimit_print[0] to 1 to bypass rate limiting and always print AER
logs for uncorrectable errors detected by the DPC port.
Fixes: a57f2bfb4a58 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
Co-developed-by: Goudar Manjunath Ramanagouda <manjunath.ramanagouda.goudar@intel.com>
Signed-off-by: Goudar Manjunath Ramanagouda <manjunath.ramanagouda.goudar@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260318170449.2733581-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8cb081667377709f4924ab6b3a88a0d7a761fe91 ]
The commit bc75c8e50711 ("PCI: Rewrite bridge window head alignment
function") did not use if (r_size <= align) check from pbus_size_mem() for
the new head alignment bookkeeping structure (aligns2[]). In some
configurations, this can result in producing a gap into the bridge window
which the resource larger than its alignment cannot fill.
The old alignment calculation algorithm was removed by the subsequent
commit 3958bf16e2fe ("PCI: Stop over-estimating bridge window size") which
renamed the aligns2[] array leaving only aligns[] array.
Add the if (r_size <= align) check back to avoid this problem.
Fixes: bc75c8e50711 ("PCI: Rewrite bridge window head alignment function")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/all/b05a6f14-979d-42c9-924c-d8408cb12ae7@roeck-us.net/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Xifer <xiferdev@gmail.com>
Link: https://patch.msgid.link/20260324165633.4583-11-ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1ee4716a5a28eaef81ae1f280d983258bee49623 ]
reassign_resources_sorted() checks for two things:
a) Resource assignment failures for mandatory resources by checking if the
resource remains unassigned, which are known to always repeat, and does
not attempt to assign them again.
b) That resource is not among the ones being processed/assigned at this
stage, leading to skip processing such resources in
reassign_resources_sorted() as well (resource assignment progresses
one PCI hierarchy level at a time).
The problem here is that a) is checked before b), but b) also implies the
resource is not being assigned yet, making also a) true. As a) only skips
resource assignment but still removes the resource from realloc_head, the
later stages that would need to process the information in realloc_head
cannot obtain the optional size information anymore. This leads to
considering only non-optional part for bridge windows deeper in the PCI
hierarchy.
This problem has been observed during rescan (add_size is not considered
while attempting assignment for 0000:e2:00.0 indicating the corresponding
entry was removed from realloc_head while processing resource assignments
for 0000:e1):
pci_bus 0000:e1: scanning bus
...
pci 0000:e3:01.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e4] add_size 60c000000 add_align 800000000
pci 0000:e3:01.0: bridge window [mem 0x00100000-0x000fffff] to [bus e4] add_size 200000 add_align 200000
pci 0000:e3:02.0: disabling bridge window [mem 0x00000000-0x000fffff 64bit pref] to [bus e5] (unused)
pci 0000:e2:00.0: bridge window [mem 0x800000000-0x1000ffffff 64bit pref] to [bus e3-e5] add_size 60c000000 add_align 800000000
pci 0000:e2:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus e3-e5] add_size 200000 add_align 200000
pcieport 0000:e1:02.0: bridge window [io size 0x2000]: can't assign; no space
pcieport 0000:e1:02.0: bridge window [io size 0x2000]: failed to assign
pcieport 0000:e1:02.0: bridge window [io 0x1000-0x2fff]: resource restored
pcieport 0000:e1:02.0: bridge window [io 0x1000-0x2fff]: resource restored
pcieport 0000:e1:02.0: bridge window [io size 0x2000]: can't assign; no space
pcieport 0000:e1:02.0: bridge window [io size 0x2000]: failed to assign
pci 0000:e2:00.0: bridge window [mem 0x28f000000000-0x28f800ffffff 64bit pref]: assigned
Fixes: 96336ec70264 ("PCI: Perform reset_resource() and build fail list in sync")
Reported-by: Peter Nisbet <peter.nisbet@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Peter Nisbet <peter.nisbet@intel.com>
Link: https://patch.msgid.link/20260313084551.1934-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dc4b4d04e1caa3552f000d84d832779ebe51b093 ]
Steve reported an eGPU (either Radeon Instinct MI50 32GB or NVIDIA 3080
10GB) connected via Thunderbolt was not assigned sufficient BAR space in
v6.11, so the amdgpu and nvidia drivers were unable to initialize the
device.
pci_bridge_distribute_available_resources() -> ... ->
adjust_bridge_window() is called between __pci_bus_size_bridges()
and assigning the resources. Since the commit 948675736a77 ("PCI: Allow
adjust_bridge_window() to shrink resource if necessary")
adjust_bridge_window() can also shrink the bridge window. The shrunken
size, however, conflicts with what __pci_bus_size_bridges() ->
pbus_size_mem() calculated as the required bridge window size. By shrinking
the size, adjust_bridge_window() prevents the rest of the resource fitting
algorithm from working as intended. Resource fitting logic is expecting
assignment failures when bridge windows need resizing, but there are cases
where failures are no longer happening after the commit 948675736a77 ("PCI:
Allow adjust_bridge_window() to shrink resource if necessary").
The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink
resource if necessary") justifies the change by the extra reservation
made due to hpmemsize parameter, however, the kernel code contradicts
that statement. (For simplicity, finer-grained hpmmiosize and hpmmiopref
parameters that can be used to the same effect as hpmemsize are ignored in
this description.)
pbus_size_mem() calls calculate_memsize() twice. First with add_size=0
to find out the minimal required resource size. The second call occurs
with add_size=hpmemsize (effectively) but the result does not directly
affect the resource size only resulting in an entry on the realloc_head
list (a.k.a. add_list). Yet, adjust_bridge_window() directly changes
the resource size which does not include what is reserved due to
hpmemsize. Also, if the required size for the bridge window exceeds
hpmemsize, the parameter does not have any effect even on the second
size calculation made by pbus_size_mem(); from calculate_memsize():
size = max(size, add_size) + children_add_size;
The commit ae4611f1d7e9 ("PCI: Set resource size directly in
adjust_bridge_window()") that precedes the commit 948675736a77 ("PCI:
Allow adjust_bridge_window() to shrink resource if necessary") is also
related to causing this problem. Its changelog explicitly states
adjust_bridge_window() wants to "guarantee" allocation success.
Guaranteed allocations, however, are incompatible with how the other
parts of the resource fitting algorithm work. The given justification
fails to explain why guaranteed allocations at this stage are required
nor why forcing window to a smaller value than what was calculated by
pbus_size_mem() is correct. While the change might have worked by chance
in some test scenario, too small bridge window does not "guarantee"
success from the point of view of the endpoint device resource
assignments. No issue is mentioned within the changelog so it's unclear
if the change was made to fix some observed issue nor and what that
issue was.
The unwanted shrinking of a bridge window occurs, e.g., when a device with
large BARs such as eGPU is attached using Thunderbolt and the Root Port
holds less than enough resource space for the eGPU. The GPU resources are
in order of GBs and the default hotplug allocation is a mere 2MB
(DEFAULT_HOTPLUG_MMIO_PREF_SIZE). The problem is illustrated by this log
(filtered to the relevant content only):
pci 0000:00:07.0: PCI bridge to [bus 03-2c]
pci 0000:00:07.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref]
pci 0000:03:00.0: PCI bridge to [bus 00]
pci 0000:03:00.0: bridge window [mem 0x00000000-0x000fffff 64bit pref]
pci 0000:03:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
pci 0000:03:00.0: PCI bridge to [bus 04-2c]
pcieport 0000:00:07.0: Assigned bridge window [mem 0x6000000000-0x601bffffff 64bit pref] to [bus 03-2c] cannot fit 0xc00000000 required for 0000:03:00.0 bridging to [bus 04-2c]
pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] to [bus 04-2c] add_size 100000 add_align 100000
pcieport 0000:00:07.0: distributing available resources
pci 0000:03:00.0: bridge window [mem 0x800000000-0x10003fffff 64bit pref] shrunken by 0x00000007e4400000
pci 0000:03:00.0: bridge window [mem 0x6000000000-0x601bffffff 64bit pref]: assigned
The initial size of the Root Port's window is 448MB (0x601bffffff -
0x6000000000). __pci_bus_size_bridges() -> pbus_size_mem() calculates the
required size to be 32772 MB (0x10003fffff - 0x800000000) which would fit
the eGPU resources. adjust_bridge_window() then shrinks the bridge window
down to what is guaranteed to fit into the Root Port's bridge window. The
bridge window for 03:00.0 is also eliminated from the add_list (a.k.a.
realloc_head) list by adjust_bridge_window().
After adjustment, the resources are assigned and as the bridge window for
03:00.0 is assigned successfully, no failure is recorded. Without a
failure, no attempt to resize the window of the Root Port is required. The
end result is eGPU not having large enough resources to work.
The commit 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink
resource if necessary") also claims nested bridge windows are sized the
same, which is false. pbus_size_mem() calculates the size for the parent
bridge window by summing all the downstream resources so the resource
fitting calculates larger bridge window for the parent to accommodate the
childen. That is, hpmemsize does not result the same size for the case
where there are nested bridge windows.
In order to fix the most immediate problem, don't shrink the resource size
in adjust_bridge_window() as hpmemsize had nothing to do with it. When
considering add_size, only reduce it up to what is added due to hpmemsize
(if required size is larger than hpmemsize, the parameter has no impact,
see calculate_memsize()). Unfortunately, if the tail of the bridge window
was aligned in calculate_memsize() from below hpmemsize to above it, the
size check will falsely match but the check at least errs to the side of
caution. There's not enough information available in adjust_bridge_window()
to know the calculated size precisely.
This is not exactly a revert of the commits e4611f1d7e9 ("PCI: Set resource
size directly in adjust_bridge_window()") and 948675736a77 ("PCI: Allow
adjust_bridge_window() to shrink resource if necessary") as shrinking still
remains in place but is implemented differently, and the end result behaves
very differently.
It is possible that those two commits fixed some other issue that is not
described with enough detail in the changelog and undoing parts of them
results in another regression due to behavioral change. Nonetheless, as
described above, the solution by those two commits was flawed and the
issue, if one exists, should be solved in a way that is compatible with the
rest of the resource fitting algorithm instead of working against it.
Besides shrinking, the case where adjust_bridge_window() expands the bridge
window is likely somewhat wrong as well because it removes the entry from
add_list (a.k.a. realloc_head), but it is less damaging as that only
impacts optional resources and may have no impact if expanding by hpmemsize
is larger than what add_size was. Fixing it is left as further work.
Fixes: 948675736a77 ("PCI: Allow adjust_bridge_window() to shrink resource if necessary")
Fixes: ae4611f1d7e9 ("PCI: Set resource size directly in adjust_bridge_window()")
Reported-by: Steve Oswald <stevepeter.oswald@gmail.com>
Closes: https://lore.kernel.org/linux-pci/CAN95MYEaO8QYYL=5cN19nv_qDGuuP5QOD17pD_ed6a7UqFVZ-g@mail.gmail.com/
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260219153951.68869-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 33a76fc3c3e61386524479b99f35423bd3d9a895 ]
Qcom PCIe Root Ports advertise hotplug capability in hardware, but do not
support hotplug command completion. As a result, the hotplug commands
issued by the pciehp driver never gets completion notification, leading to
repeated timeout warnings and multi-second delays during boot and
suspend/resume.
Commit a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for
IPs v2.7.0 and v1.9.0") mistakenly assumed that the Root Ports doesn't
support Hotplug due to timeouts and disabled the Hotplug functionality
altogether. But the Root Ports does support reporting Hotplug events like
DL_Up/Down events.
So to fix the command completion timeout issues, just set the No Command
Completed Support (NCCS) bit and enable Hotplug in Slot Capability field
back.
Fixes: a54db86ddc153 ("PCI: qcom: Do not advertise hotplug capability for IPs v2.7.0 and v1.9.0")
Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
[mani: renamed function, commit log and added comment]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> # Hamoa CRD, tunneled link
Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com>
Link: https://patch.msgid.link/20260314-hotplug-v1-1-96ac87d93867@oss.qualcomm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 72e76b63d6ff6d1f96acccbfc6c118656f63e66a ]
When devm_kzalloc() for reg_off fails, the code returns -ENOMEM without
freeing pcie->cfg, which was allocated earlier by pci_ecam_create().
Add the missing pci_ecam_free() call to properly release the allocated ECAM
configuration window on this error path.
Fixes: a0d9f2c08f45 ("PCI: sky1: Add PCIe host support for CIX Sky1")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Hans Zhang <18255117159@163.com>
Link: https://patch.msgid.link/20260324-sky1-v1-1-6a00cb2776b6@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 13f55a7ca773c731a1e645934c1ae48577f48785 ]
R-Car S4 Series (R8A779F[4-7]*) EP controller uses a 4K minimum iATU region
size (CX_ATU_MIN_REGION_SIZE = 4K) as per R19UH0161EJ0130 Rev.1.30. Also,
the controller itself can only be configured in the range 4 KB to 64 KB, so
the current 1 MB alignment requirement is incorrect.
Hence, change the alignment to the min size 4K as per the documentation.
This also fixes needless unusability of BAR4 on this platform when the
target address is fixed, such as for doorbell targets.
Fixes: e311b3834dfa ("PCI: rcar-gen4: Add endpoint mode support")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260305151050.1834007-1-den@valinux.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 34735f63748daa2ea27544259c3042b4948376bf ]
Reorder the reset assertion sequence during suspend from
power_resets -> cfg_resets to cfg_resets -> power_resets.
This change ensures the suspend sequence follows the reverse order
of the probe/init sequence, where power_resets are deasserted first
followed by cfg_resets.
Additionally, this ordering is required for RZ/G3E support where
cfg resets are controlled through PCIe AXI registers (offset 0x310h).
According to the RZ/G3E hardware manual (Rev.1.15, section 6.6.6.1.1
"Changing the Initial Values of the Registers"), AXI register access
requires ARESETn to be de-asserted and the clock to be supplied.
Since ARESETn is part of power_resets, cfg_resets must be asserted
before power_resets, otherwise the AXI registers become inaccessible.
Fixes: 7ef502fb35b2 ("PCI: Add Renesas RZ/G3S host controller driver")
Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # RZ/V2N EVK
Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260306143423.19562-3-john.madieu.xa@bp.renesas.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d284389d4576e7c8040dc4cbb66876e539c6d064 ]
Fix incorrect reset_control_bulk_deassert() call in the probe error
path. When unwinding from a failed pci_host_probe(), the configuration
resets should be asserted to restore the hardware to its initial state,
not deasserted again.
Fixes: 7ef502fb35b2 ("PCI: Add Renesas RZ/G3S host controller driver")
Signed-off-by: John Madieu <john.madieu.xa@bp.renesas.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # RZ/V2N EVK
Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Link: https://patch.msgid.link/20260306143423.19562-2-john.madieu.xa@bp.renesas.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit edb5ca3262e2255cf938a5948709d3472d4871ad ]
If the dw_pcie_resume_noirq() API fails, it just returns the errno without
doing cleanup in the error path, leading to resource leak.
So perform cleanup in the error path.
Fixes: 4774faf854f5 ("PCI: dwc: Implement generic suspend/resume functionality")
Reported-by: Senchuan Zhang <zhangsenchuan@eswincomputing.com>
Closes: https://lore.kernel.org/linux-pci/78296255.3869.19c8eb694d6.Coremail.zhangsenchuan@eswincomputing.com
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260226133951.296743-1-mani@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 94cbea0f636b55602a9a10583670976680ecea67 ]
PCIe r7.0, section 7.5.3.6 states that for multi-function devices, the
Max Link Width and Max Link Speed fields in the Link Capabilities
Register must report the same values for all functions.
Currently, dw_pcie_setup() programs these fields only for Function 0
via dw_pcie_link_set_max_speed() and dw_pcie_link_set_max_link_width().
For multi-function endpoint configurations, Function 1 and beyond retain
their default values, violating the PCIe specification.
Fix this by reading the Max Link Width and Max Link Speed fields from
Link Capabilities Register of Function 0 after dw_pcie_setup() completes,
then mirroring these values to all other functions.
Fixes: 24ede430fa49 ("PCI: designware-ep: Add multiple PFs support for DWC")
Fixes: 89db0793c9f2 ("PCI: dwc: Add missing PCI_EXP_LNKCAP_MLW handling")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
[mani: renamed ref_lnkcap to func0_lnkcap]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20260224083817.916782-3-a-garg7@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 271d0b1f058ae9815e75233d04b23e3558c3e4f4 ]
In dw_pcie_ep_set_msix(), while updating the MSI-X Table Size value for
individual functions, Message Control register is read from the passed
function number register space using dw_pcie_ep_readw_dbi(), but always
written back to the Function 0's register space using dw_pcie_writew_dbi().
This causes incorrect MSI-X configuration for the rest of the functions,
other than Function 0.
Fix this by using dw_pcie_ep_writew_dbi() to write to the correct
function's register space, matching the read operation.
Fixes: 70fa02ca1446 ("PCI: dwc: Add dw_pcie_ep_{read,write}_dbi[2] helpers")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260224083817.916782-2-a-garg7@ti.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3b55079d6387805ede687e234d84669aeb0f7e98 ]
In imx_pcie_probe(), of_parse_phandle() returns the device node pointer
with increased refcount. The pointer reference must be dropped by the
caller when it's no longer needed. However, imx_pcie_probe() doesn't drop
the reference, causing reference leak.
Fix this by using the __free(device_node) cleanup handler to drop the
reference when the function goes out of scope.
Fixes: 1df82ec46600 ("PCI: imx: Add workaround for e10728, IMX7d PCIe PLL failure")
Signed-off-by: Felix Gu <ustc.gu@gmail.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
Link: https://patch.msgid.link/20260124-pci_imx6-v2-1-acb8d5187683@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1cba96c0a795124c3229293ed7b5b5765e66f259 ]
pci_epf_alloc_doorbell() stores the allocated doorbell message array in
epf->db_msg/epf->num_db before requesting MSI vectors. If MSI allocation
fails, the array is freed but the EPF state may still point to freed
memory.
Clear epf->db_msg and epf->num_db on the MSI allocation failure path so
that later cleanup cannot double-free the array and callers can retry
allocation.
Also return -EBUSY when doorbells have already been allocated to prevent
leaking or overwriting an existing allocation.
Fixes: 1c3b002c6bf6 ("PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260217063856.3759713-4-den@valinux.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e81fa70179aac6ac3a6636565d5d35968dca3900 ]
pci_epf_test_doorbell_cleanup() unconditionally calls free_irq() for the
doorbell virq, which can trigger "Trying to free already-free IRQ"
warnings when the IRQ was never requested or when request_threaded_irq()
failed.
Move free_irq() out of pci_epf_test_doorbell_cleanup() and invoke it
only after a successful request, so that free_irq() is not called for
an unrequested IRQ.
Fixes: eff0c286aa91 ("PCI: endpoint: pci-epf-test: Add doorbell test support")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260217063856.3759713-3-den@valinux.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cc04f2bfb9dae60b6e34d6bff75c26d4ec3237ce ]
epf_ntb_db_bar_init_msi_doorbell() requests ntb->db_count doorbell IRQs
and then performs additional MSI doorbell setup that may still fail.
The error path unwinds the requested IRQs, but it uses a loop variable
that is reused later in the function. When a later step fails, the
unwind can run with an unexpected index value and leave some IRQs
requested.
Track the number of successfully requested IRQs separately and use that
counter for the unwind so all previously requested IRQs are freed on
failure.
Fixes: dc693d606644 ("PCI: endpoint: pci-epf-vntb: Add MSI doorbell support")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://patch.msgid.link/20260217063856.3759713-2-den@valinux.co.jp
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d3e996a596967a62c8a13a279221513461f6ab97 ]
Previously, pcie_enable_tph() only enabled TLP Processing Hints (TPH) if
both the Endpoint and its Root Port advertised TPH support.
Root Complex Integrated Endpoints (RCiEPs) are directly integrated into a
Root Complex and do not have an associated Root Port, so pcie_enable_tph()
never enabled TPH for RCiEPs.
PCIe r7.0 doesn't seem to include a way to learn whether a Root Complex
supports TPH, but sec 2.2.7.1.1 says Functions that lack TPH support should
ignore TPH, and maybe the same is true for Root Complexes:
A Function that does not support the TPH Completer or Routing capability
and receives a transaction with the TH bit [which indicates the presence
of TPH in the TLP header] Set is required to ignore the TH bit and handle
the Request in the same way as Requests of the same transaction type
without the TH bit Set.
Allow drivers to enable TPH for any RCiEP with a TPH Requester Capability.
Fixes: f69767a1ada3 ("PCI: Add TLP Processing Hints (TPH) support")
Signed-off-by: George Abraham P <george.abraham.p@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260109052923.1170070-1-george.abraham.p@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 10a4206a24013be4d558d476010cbf2eb4c9fa64 ]
When a driver is probed through __driver_attach(), the bus' match()
callback is called without the device lock held, thus accessing the
driver_override field without a lock, which can cause a UAF.
Fix this by using the driver-core driver_override infrastructure taking
care of proper locking internally.
Note that calling match() from __driver_attach() without the device lock
held is intentional. [1]
Link: https://lore.kernel.org/driver-core/DGRGTIRHA62X.3RY09D9SOK77P@kernel.org/ [1]
Reported-by: Gui-Dong Han <hanguidong02@gmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220789
Fixes: 782a985d7af2 ("PCI: Introduce new device binding path using pci_dev.driver_override")
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Williamson <alex@shazbot.org>
Tested-by: Gui-Dong Han <hanguidong02@gmail.com>
Reviewed-by: Gui-Dong Han <hanguidong02@gmail.com>
Link: https://patch.msgid.link/20260324005919.2408620-6-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit cc33985d26c92a5c908c0185239c59ec35b8637c upstream.
When aspm_calc_l12_info() programs the L1 PM Substates Control 1 register
fields Common_Mode_Restore_Time, LTR_L1.2_THRESHOLD_Value and _Scale, it
invokes pci_clear_and_set_config_dword() in an incorrect way:
For the bits to clear it selects those corresponding to the field. So far
so good. But for the bits to set it passes a full register value.
pci_clear_and_set_config_dword() performs a boolean OR operation which
sets all bits of that value, not just the ones that were just cleared.
Thus, when setting the LTR_L1.2_THRESHOLD_Value and _Scale on the child of
an ASPM link, aspm_calc_l12_info() also sets the Common_Mode_Restore_Time.
That's a spec violation: PCIe r7.0 sec 7.8.3.3 says this field is RsvdP
for Upstream Ports. On Adrià's Pixelbook Eve, Common_Mode_Restore_Time
of the Intel 7265 "Stone Peak" wifi card is zero, yet aspm_calc_l12_info()
does not preserve the zero bits but instead programs the value calculated
for the Root Port into the wifi card.
Likewise, when setting the Common_Mode_Restore_Time on the Root Port,
aspm_calc_l12_info() also changes the LTR_L1.2_THRESHOLD_Value and _Scale
from the initial 163840 nsec to 237568 nsec (due to ORing those fields),
only to reduce it afterwards to 106496 nsec.
Amend all invocations of pci_clear_and_set_config_dword() to only set bits
which are cleared.
Finally, when setting the T_POWER_ON_Value and _Scale on the Root Port and
the wifi card, aspm_calc_l12_info() fails to preserve bits declared RsvdP
and instead overwrites them with zeroes. Replace pci_write_config_dword()
with pci_clear_and_set_config_dword() to avoid this.
Fixes: aeda9adebab8 ("PCI/ASPM: Configure L1 substate settings")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=220705#c22
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Adrià Vilanova Martínez <me@avm99963.com>
Cc: stable@vger.kernel.org # v4.11+
Link: https://patch.msgid.link/5c1752d7512eed0f4ea57b84b12d7ee08ca61fc5.1771226659.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1ab4a3c805084d752ec571efc78272295a9f2f74 upstream.
When searching for the error source, the AER driver rules out devices whose
enable_cnt is zero. This was introduced in 2009 by commit 28eb27cf0839
("PCI AER: support invalid error source IDs") without providing a
rationale.
Drivers typically call pci_enable_device() on probe, hence the enable_cnt
check essentially filters out unbound devices. At the time of the commit,
drivers had to opt in to AER by calling pci_enable_pcie_error_reporting()
and so any AER-enabled device could be assumed to be bound to a driver.
The check thus made sense because it allowed skipping config space accesses
to devices which were known not to be the error source.
But since 2022, AER is universally enabled on all devices when they are
enumerated, cf. commit f26e58bf6f54 ("PCI/AER: Enable error reporting when
AER is native").
Errors may very well be reported by unbound devices, e.g. due to link
instability. By ruling them out as error source, errors reported by them
are neither logged nor cleared. When they do get bound and another error
occurs, the earlier error is reported together with the new error, which
may confuse users. Stop doing so.
Fixes: f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <stefan.roese@mailbox.org>
Cc: stable@vger.kernel.org # v6.0+
Link: https://patch.msgid.link/734338c2e8b669db5a5a3b45d34131b55ffebfca.1774605029.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a8aeea1bf3c80cc87983689e0118770e019bd4f3 upstream.
Currently, pcie_clear_device_status() clears the entire PCIe Device Status
register (PCI_EXP_DEVSTA) by writing back the value read from the register,
which affects not only the error status bits but also other writable bits.
According to PCIe r7.0, sec 7.5.3.5, this register contains:
- RW1C error status bits (CED, NFED, FED, URD at bits 0-3): These are the
four error status bits that need to be cleared.
- Read-only bits (AUXPD at bit 4, TRPND at bit 5): Writing to these has
no effect.
- Emergency Power Reduction Detected (bit 6): A RW1C non-error bit
introduced in PCIe r5.0 (2019). This is currently the only writable
non-error bit in the Device Status register. Unconditionally clearing
this bit can interfere with other software components that rely on this
power management indication.
- Reserved bits (RsvdZ): These bits are required to be written as zero.
Writing 1s to them (as the current implementation may do) violates the
specification.
To prevent unintended side effects, modify pcie_clear_device_status() to
only write 1s to the four error status bits (CED, NFED, FED, URD), leaving
the Emergency Power Reduction Detected bit and reserved bits unaffected.
Fixes: ec752f5d54d7 ("PCI/AER: Clear device status bits during ERR_FATAL and ERR_NONFATAL")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260211124624.49656-1-xueshuai@linux.alibaba.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 909f7bf9b080c10df3c3b38533906dbf09ff1d8b upstream.
Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB
adapter plugged into an ASRock X570S PG Riptide board with BIOS version
P5.41 (09/07/2023):
ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter
ddbridge 0000:05:00.0: cannot read registers
ddbridge 0000:05:00.0: fail
BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the
upstream bridge window. The kernel corrects the BAR assignment:
pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window
pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned
Correction of the BAR assignment happens in an x86-specific fs_initcall,
pcibios_assign_resources(), after device enumeration in a subsys_initcall.
This order was introduced at the behest of Linus in 2004:
https://git.kernel.org/tglx/history/c/a06a30144bbc
No other architecture performs such a late BAR correction.
Bernd bisected the issue to commit a2f1e22390ac ("PCI/ERR: Ensure error
recoverability at all times"), but it only occurs in the absence of commit
4d4c10f763d7 ("PCI: Explicitly put devices into D0 when initializing").
This combination exists in stable kernel v6.12.70, but not in mainline,
hence Bernd cannot reproduce the issue with mainline.
Since a2f1e22390ac, config space is saved on enumeration, prior to BAR
correction. Upon passthrough, the corrected BAR is overwritten with the
incorrect saved value by:
vfio_pci_core_register_device()
vfio_pci_set_power_state()
pci_restore_state()
But only if the device's current_state is PCI_UNKNOWN, as it was prior to
commit 4d4c10f763d7. Since the commit, it is PCI_D0, which changes the
behavior of vfio_pci_set_power_state() to no longer restore the state
without saving it first.
Alexandre is reporting the same issue as Bernd, but in his case, mainline
is affected as well. The difference is that on Alexandre's system, the
host kernel binds a driver to the device which is unbound prior to
passthrough, whereas on Bernd's system no driver gets bound by the host
kernel.
Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when
vfio-pci is subsequently bound to the device, pci_restore_state() is once
again called without invoking pci_save_state() first.
To robustly fix the issue, always update saved_config_space upon resource
assignment.
Reported-by: Bernd Schumacher <bernd@bschu.de>
Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/
Reported-by: Alexandre N. <an.tech@mailo.com>
Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/
Fixes: a2f1e22390ac ("PCI/ERR: Ensure error recoverability at all times")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Bernd Schumacher <bernd@bschu.de>
Tested-by: Alexandre N. <an.tech@mailo.com>
Cc: stable@vger.kernel.org # v6.12+
Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 88cc4cbe08bba27bb58888d25d336774aa0ccab1 upstream.
In the PCIe PHY init for the i.MX95, the reference clock source selection
uses a conditional instead of always passing the mask. This currently
breaks functionality if the internal refclk is used.
To fix this issue, always pass IMX95_PCIE_REF_USE_PAD as the mask and clear
bit if external refclk is not used. This essentially swaps the parameters.
Fixes: d8574ce57d76 ("PCI: imx6: Add external reference clock input mode support")
Signed-off-by: Franz Schnyder <franz.schnyder@toradex.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260325093118.684142-1-fra.schnyder@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d9cf7154deed71a4f23e81101571c79cdc77be00 upstream.
The commit 18ac51ae9df9 ("PCI: cadence: Implement capability search
using PCI core APIs") assumed all the platforms using Cadence PCIe
controller support byte and word register accesses. This is not true
for all platforms (e.g., TI J721E SoC, which only supports dword
register accesses).
This causes capability searches via cdns_pcie_find_capability() to fail
on such platforms.
Fix this by using cdns_pcie_read_sz() for config read functions, which
properly handles size-aligned accesses. Remove the now-unused byte and
word read wrapper functions (cdns_pcie_readw and cdns_pcie_readb).
Fixes: 18ac51ae9df9 ("PCI: cadence: Implement capability search using PCI core APIs")
Signed-off-by: Aksh Garg <a-garg7@ti.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260402085545.284457-1-a-garg7@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 36bfc3642b19a98f1302aed4437c331df9b481f0 upstream.
pci_epf_mhi_edma_read() and pci_epf_mhi_edma_write() start DMA
operations and wait for completion with a timeout.
On successful completion, they previously returned the remaining
timeout, which callers may treat as an error. In particular,
mhi_ep_ring_add_element(), which calls pci_epf_mhi_edma_write() via
mhi_cntrl->write_sync(), interprets any non-zero return value as
failure.
Return 0 on success instead of the remaining timeout to prevent
mhi_ep_ring_add_element() from treating successful completion as an
error.
Fixes: 7b99aaaddabb ("PCI: epf-mhi: Add eDMA support")
Signed-off-by: Daniel Hodges <git@danielhodges.dev>
[mani: changed commit log as per https://lore.kernel.org/linux-pci/20260227191510.GA3904799@bhelgaas]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260206200529.10784-1-git@danielhodges.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3446beddba450c8d6f9aca2f028712ac527fead3 upstream.
epf_ntb_epc_destroy() duplicates the teardown that the caller is
supposed to do later. This leads to an oops when .allow_link fails or
when .drop_link is performed. Remove the helper.
Also drop pci_epc_put(). EPC device refcounting is tied to configfs EPC
group lifetime, and pci_epc_put() in the .drop_link path is sufficient.
Fixes: 8b821cf76150 ("PCI: endpoint: Add EP function driver to provide NTB functionality")
Signed-off-by: Koichiro Den <den@valinux.co.jp>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260226084142.2226875-3-den@valinux.co.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5f73cf1db829c21b7fd44a8d2587cd395b1b2d76 upstream.
On i.MX6SX, the LTSSM registers become inaccessible after the
PME_Turn_Off message is sent to the link. So there is no way to verify
whether the link has entered L2/L3 Ready state or not.
Hence, set IMX_PCIE_FLAG_SKIP_L23_READY flag for i.MX6SX SoC to skip the
L2/L3 Ready state polling and let the DWC core wait for 10ms after sending
the PME_Turn_Off message as per the PCIe spec r6.0, sec 5.3.3.2.1.
Fixes: a528d1a72597 ("PCI: imx6: Use DWC common suspend resume method")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
[mani: commit log]
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260228080925.1558395-1-hongxing.zhu@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|