diff options
author | Maciej W. Rozycki <macro@orcam.me.uk> | 2023-06-11 20:20:10 +0300 |
---|---|---|
committer | Bjorn Helgaas <bhelgaas@google.com> | 2023-06-20 18:58:53 +0300 |
commit | a89c82249c3763780522f763dd2e615e2ea114de (patch) | |
tree | 90bfe06c7c0ea718b492ef16e00f3fcb1841c071 /drivers/pci/pci.c | |
parent | 7604bc294c19fe70fb7d9091731a950b16249c51 (diff) | |
download | linux-a89c82249c3763780522f763dd2e615e2ea114de.tar.xz |
PCI: Work around PCIe link training failures
Attempt to handle cases such as with a downstream port of the ASMedia
ASM2824 PCIe switch where link training never completes and the link
continues switching between speeds indefinitely with the data link layer
never reaching the active state.
It has been observed with a downstream port of the ASMedia ASM2824 Gen 3
switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch,
using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433,
wired to a SiFive HiFive Unmatched board. In this setup the switches
should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if
necessary.
Instead the link continues oscillating between the two speeds, at the rate
of 34-35 times per second, with link training reported repeatedly active
~84% of the time. Limiting the target link speed to 2.5GT/s with the
upstream ASM2824 device makes the two switches communicate correctly.
Removing the speed restriction afterwards makes the two devices switch to
5.0GT/s then.
Make use of these observations and detect the inability to train the link
by checking for the Data Link Layer Link Active status bit being off while
the Link Bandwidth Management Status indicating that hardware has changed
the link speed or width in an attempt to correct unreliable link operation.
Restrict the speed to 2.5GT/s then with the Target Link Speed field,
request a retrain and wait 200ms for the data link to go up. If this is
successful, lift the restriction, letting the devices negotiate a higher
speed.
Also check for a 2.5GT/s speed restriction the firmware may have already
arranged and lift it too with ports of devices known to continue working
afterwards (currently only ASM2824), that already report their data link
being up.
[bhelgaas: reorder and squash stubs from
https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk
to avoid adding stubs that do nothing]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/
Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Diffstat (limited to 'drivers/pci/pci.c')
-rw-r--r-- | drivers/pci/pci.c | 2 |
1 files changed, 2 insertions, 0 deletions
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 62c3a8bc83b3..f599d321c881 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -4951,6 +4951,8 @@ static bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active, if (active) msleep(20); ret = pcie_wait_for_link_status(pdev, false, active); + if (active && !ret) + ret = pcie_failed_link_retrain(pdev); if (active && ret) msleep(delay); |