kernel/linux.git/drivers/pci/pcie/err.c, branch v7.1

PCI/ERR: Update device error_state already after reset

2025-08-13T18:20:56+00:00

After a Fatal Error has been reported by a device and has been recovered through a Secondary Bus Reset, AER updates the device's error_state to pci_channel_io_normal before invoking its driver's ->resume() callback.

By contrast, EEH updates the error_state earlier, namely after resetting the device and before invoking its driver's ->slot_reset() callback. Commit c58dc575f3c8 ("powerpc/pseries: Set error_state to pci_channel_io_normal in eeh_report_reset()") explains in great detail that the earlier invocation is necessitated by various drivers checking accessibility of the device with pci_channel_offline() and avoiding accesses if it returns true. It returns true for any other error_state than pci_channel_io_normal.

The device should be accessible already after reset, hence the reasoning is that it's safe to update the error_state immediately afterwards.

This deviation between AER and EEH seems problematic because drivers behave differently depending on which error recovery mechanism the platform uses. Three drivers have gone so far as to update the error_state themselves, presumably to work around AER's behavior.

For consistency, amend AER to update the error_state at the same recovery steps as EEH. Drop the now unnecessary workaround from the three drivers.

Keep updating the error_state before ->resume() in case ->error_detected() or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes ->slot_reset() to be skipped. There are drivers doing this even for Fatal Errors, e.g. mhi_pci_error_detected().

Signed-off-by: Lukas Wunner
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de
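
The effect of the ordering can be illustrated with a small user-space model. This is only a sketch, not the code in err.c: every name prefixed with model_ is hypothetical, and the driver callback is assumed to skip all device access while the channel is offline, as the drivers mentioned above do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; not kernel code. All names are hypothetical. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

struct model_dev {
	enum model_channel_state error_state;
	bool reinitialized;	/* set once the driver touched the hardware */
};

/* Offline for every state other than "normal", as described above. */
static bool model_channel_offline(const struct model_dev *dev)
{
	return dev->error_state != MODEL_IO_NORMAL;
}

/* Hypothetical ->slot_reset() that avoids device access while offline. */
static void model_slot_reset(struct model_dev *dev)
{
	if (model_channel_offline(dev))
		return;
	dev->reinitialized = true;
}

/*
 * Recover from a Fatal Error. With update_after_reset set, error_state
 * becomes "normal" right after the reset (EEH ordering); otherwise it is
 * only updated before ->resume() (previous AER ordering).
 */
static bool model_recover(struct model_dev *dev, bool update_after_reset)
{
	assert(dev != NULL);

	dev->error_state = MODEL_IO_FROZEN;	/* error detected */
	dev->reinitialized = false;

	/* ... the Secondary Bus Reset would happen here ... */
	if (update_after_reset)
		dev->error_state = MODEL_IO_NORMAL;

	model_slot_reset(dev);

	dev->error_state = MODEL_IO_NORMAL;	/* kept before ->resume() */
	return dev->reinitialized;
}
```

In this model the driver only reinitializes the device when model_recover() is called with update_after_reset set, which is the behavioral difference between the two orderings.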

PCI/ERR: Notify drivers on failure to recover

2025-08-13T18:20:46+00:00

According to Documentation/PCI/pci-error-recovery.rst, the following shall occur on failure to recover from a PCIe Uncorrectable Error:

  STEP 6: Permanent Failure
  -------------------------
  A "permanent failure" has occurred, and the platform cannot recover the device. The platform will call error_detected() with a pci_channel_state_t value of pci_channel_io_perm_failure.

  The device driver should, at this point, assume the worst. It should cancel all pending I/O, refuse all new I/O, returning -EIO to higher layers. The device driver should then clean up all of its memory and remove itself from kernel operations, much as it would during system shutdown.

Sathya notes that AER does not call error_detected() on failure and thus deviates from the document (as well as EEH, for which the document was originally added).

Most drivers do nothing on permanent failure, but the SCSI drivers and a number of Ethernet drivers do take advantage of the notification to flush queues and give up resources.

Amend AER to notify such drivers and align with the documentation and EEH.

Link: https://lore.kernel.org/r/f496fc0f-64d7-46a4-8562-dba74e31a956@linux.intel.com/
Suggested-by: Sathyanarayanan Kuppuswamy
Signed-off-by: Lukas Wunner
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/ec212d4d4f5c65d29349df33acdc9768ff8279d1.1755008151.git.lukas@wunner.de
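
What a driver might do with the permanent-failure notification can be sketched in user space as follows. This is an illustrative model under stated assumptions, not code from any real driver: the model_ types and functions are hypothetical, and "cancel all pending I/O" is reduced to clearing a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; all names are hypothetical. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

enum model_ers_result {
	MODEL_CAN_RECOVER,
	MODEL_NEED_RESET,
	MODEL_DISCONNECT,
};

struct model_drv {
	int pending_io;	/* requests currently in flight */
	bool dead;	/* set after a permanent failure */
};

/* Refuse new I/O with -EIO once the device is permanently gone. */
static int model_submit_io(struct model_drv *drv)
{
	assert(drv != NULL);
	if (drv->dead)
		return -EIO;
	drv->pending_io++;
	return 0;
}

/* Hypothetical ->error_detected() following STEP 6 quoted above. */
static enum model_ers_result
model_error_detected(struct model_drv *drv, enum model_channel_state state)
{
	assert(drv != NULL);
	switch (state) {
	case MODEL_IO_PERM_FAILURE:
		drv->pending_io = 0;	/* cancel all pending I/O */
		drv->dead = true;	/* refuse all new I/O */
		return MODEL_DISCONNECT;
	case MODEL_IO_FROZEN:
		return MODEL_NEED_RESET;
	default:
		return MODEL_CAN_RECOVER;
	}
}
```

Without the notification, a driver modeled this way would keep pending_io populated and continue to accept requests for a device that will never answer.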

PCI/ERR: Fix uevent on failure to recover

2025-08-13T18:20:35+00:00

Upon failure to recover from a PCIe error through AER, DPC or EDR, a uevent is sent to inform user space about disconnection of the bridge whose subordinate devices failed to recover.

However the bridge itself is not disconnected. Instead, a uevent should be sent for each of the subordinate devices.

Only if the "bridge" happens to be a Root Complex Event Collector or Integrated Endpoint does it make sense to send a uevent for it (because there are no subordinate devices).

Right now if there is a mix of subordinate devices with and without pci_error_handlers, a BEGIN_RECOVERY event is sent for those with pci_error_handlers but no FAILED_RECOVERY event is ever sent for them afterwards. Fix it.

Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Lukas Wunner
Signed-off-by: Bjorn Helgaas
Cc: stable@vger.kernel.org # v4.16+
Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de

PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors

2025-08-13T18:20:26+00:00

When Advanced Error Reporting was introduced in September 2006 by commit 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver"), it sought to adhere to the recovery flow and callbacks specified in Documentation/PCI/pci-error-recovery.rst. That document had been added in January 2006, when Enhanced Error Handling (EEH) was introduced for PowerPC with commit 065c6359071c ("[PATCH] PCI Error Recovery: documentation").

However the AER driver deviates from the document in that it never performs a Secondary Bus Reset on Non-Fatal Errors, but always on Fatal Errors. By contrast, EEH allows drivers to opt in or out of a Bus Reset regardless of error severity, by returning PCI_ERS_RESULT_NEED_RESET or PCI_ERS_RESULT_CAN_RECOVER from their ->error_detected() callback. If all drivers agree that they can recover without a Bus Reset, EEH skips it. Should one of them request a Bus Reset, it overrides all other drivers.

This inconsistency between EEH and AER seems problematic because drivers need to be aware of and cope with it.

The file Documentation/PCI/pcieaer-howto.rst hints at a rationale for always performing a Bus Reset on Fatal Errors:

  "Fatal errors [...] cause the link to be unreliable. [...] This [reset_link] callback is used to reset the PCIe physical link when a fatal error happens. If an error message indicates a fatal error, [...] performing link reset at upstream is necessary."

There's no such rationale provided for never performing a Bus Reset on Non-Fatal Errors.

The "xe" driver has a need to attempt a reset of local units on graphics cards upon a Non-Fatal Error. If that is insufficient for recovery, the driver wants to opt in to a Bus Reset.

Accommodate such use cases and align AER more closely with EEH by performing a Bus Reset in pcie_do_recovery() if drivers request it and the faulting device's channel_state is pci_channel_io_normal. The AER driver sets this channel_state for Non-Fatal Errors. For Fatal Errors, it uses pci_channel_io_frozen. This limits the deviation from Documentation/PCI/pci-error-recovery.rst and EEH to the unconditional Bus Reset on Fatal Errors.

pcie_do_recovery() is also invoked by the Downstream Port Containment and Error Disconnect Recover drivers. They both set the channel_state to pci_channel_io_frozen, hence pcie_do_recovery() continues to always invoke the ->reset_subordinates() callback in their case. That is necessary because the callback brings the link back up at the containing Downstream Port.

There are two behavioral changes resulting from this commit:

First, if channel_state is pci_channel_io_normal and one of the affected drivers returns PCI_ERS_RESULT_NEED_RESET from its ->error_detected() callback, a Bus Reset will now be performed. There are drivers doing this and although it would be possible to avoid a behavioral change by letting them return PCI_ERS_RESULT_CAN_RECOVER instead, the impression I got from examination of all drivers is that they actually expect or want a Bus Reset (cxl_error_detected() is a case in point). In any case, if they can cope with a Bus Reset on Fatal Errors, they shouldn't have issues with a Bus Reset on Non-Fatal Errors.

Second, if channel_state is pci_channel_io_frozen and all affected drivers return PCI_ERS_RESULT_CAN_RECOVER from ->error_detected(), their ->mmio_enabled() callback is now invoked prior to performing a Bus Reset, instead of afterwards. This actually makes sense: For example, drivers/scsi/sym53c8xx_2/sym_glue.c dumps debug registers in its ->mmio_enabled() callback. Doing so after reset right now captures the post-reset state instead of the faulting state, which is useless. There is only one other driver which implements ->mmio_enabled() and returns PCI_ERS_RESULT_CAN_RECOVER from ->error_detected() for channel_state pci_channel_io_frozen, drivers/scsi/ipr.c (IBM Power RAID). It appears to only be used on EEH platforms. So the second behavioral change is limited to these two drivers.

Signed-off-by: Lukas Wunner
Signed-off-by: Bjorn Helgaas
Link: https://patch.msgid.link/28fd805043bb57af390168d05abb30898cf4fc58.1755008151.git.lukas@wunner.de
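
The reset decision described above can be condensed into a user-space sketch. It is a deliberately reduced model, not the logic of pcie_do_recovery() itself: only two driver votes are modeled, the model_ names are hypothetical, and the ordering of ->mmio_enabled() relative to the reset is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; all names are hypothetical. */
enum model_channel_state {
	MODEL_IO_NORMAL,	/* Non-Fatal Error reported by AER */
	MODEL_IO_FROZEN,	/* Fatal Error, DPC or EDR */
};

enum model_ers_result {
	MODEL_CAN_RECOVER,
	MODEL_NEED_RESET,
};

/* A single driver requesting a reset overrides all other drivers. */
static enum model_ers_result
model_merge(enum model_ers_result a, enum model_ers_result b)
{
	if (a == MODEL_NEED_RESET || b == MODEL_NEED_RESET)
		return MODEL_NEED_RESET;
	return MODEL_CAN_RECOVER;
}

/*
 * Decide whether a Bus Reset is performed: always for a frozen channel,
 * and for a normal channel only if some driver opted in.
 */
static bool model_needs_bus_reset(enum model_channel_state state,
				  const enum model_ers_result *votes,
				  size_t nr_votes)
{
	enum model_ers_result merged = MODEL_CAN_RECOVER;
	size_t i;

	assert(votes != NULL || nr_votes == 0);

	for (i = 0; i < nr_votes; i++)
		merged = model_merge(merged, votes[i]);

	return state == MODEL_IO_FROZEN || merged == MODEL_NEED_RESET;
}
```

The first behavioral change corresponds to the case where the state is MODEL_IO_NORMAL and at least one vote is MODEL_NEED_RESET: the model now returns true where the previous AER behavior would have skipped the reset.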

PCI/ERR: Remove misleading TODO regarding kernel panic

2025-05-30T17:21:19+00:00

A PCI device is just another peripheral in a system, so failure to recover it must not result in a kernel panic. Remove the TODO, which is quite misleading.

Signed-off-by: Manivannan Sadhasivam
Signed-off-by: Krzysztof Wilczyński
Signed-off-by: Bjorn Helgaas
Reviewed-by: Wilfred Mallawa
Link: https://patch.msgid.link/20250508-pcie-reset-slot-v4-1-7050093e2b50@linaro.org

PCI/ERR: Cleanup misleading indentation inside if conditions

2024-04-29T15:33:29+00:00

A few if conditions align misleadingly with the following code block. The checks are really cascading NULL checks that fit into 80 chars so remove newlines in between and realign to the if condition indent.

Link: https://lore.kernel.org/r/20240429094707.2529-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen
Signed-off-by: Bjorn Helgaas

PCI/AER: Block runtime suspend when handling errors

2024-03-07T23:58:07+00:00

PM runtime can be done simultaneously with AER error handling. Avoid that by using pm_runtime_get_sync() before and pm_runtime_put() after reset in pcie_do_recovery() for all recovering devices.

pm_runtime_get_sync() will increase dev->power.usage_count counter to prevent any possible future request to runtime suspend a device. It will also resume a device, if it was previously in D3hot state.

I tested with igc device by doing simultaneous aer_inject and rpm suspend/resume via /sys/bus/pci/devices/PCI_ID/power/control and can reproduce:

  igc 0000:02:00.0: not ready 65535ms after bus reset; giving up
  pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25)
  pcieport 0000:00:1c.2: AER: subordinate device reset failed
  pcieport 0000:00:1c.2: AER: device recovery failed
  igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible

The problem disappears when this patch is applied.

Link: https://lore.kernel.org/r/20240212120135.146068-1-stanislaw.gruszka@linux.intel.com
Signed-off-by: Stanislaw Gruszka
Signed-off-by: Bjorn Helgaas
Reviewed-by: Kuppuswamy Sathyanarayanan
Acked-by: Rafael J. Wysocki
Cc:
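
The usage-count mechanism relied upon here can be modeled in a few lines of user-space C. This is a sketch of the counting idea only, not the runtime PM core: the model_ functions are hypothetical stand-ins for pm_runtime_get_sync(), pm_runtime_put() and a runtime suspend request, and no asynchronous behavior is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; all names are hypothetical. */
struct model_pm {
	int usage_count;	/* models dev->power.usage_count */
	bool suspended;		/* true while the device is in D3hot */
};

/* Take a reference and resume the device if it was suspended. */
static void model_get_sync(struct model_pm *pm)
{
	assert(pm != NULL);
	pm->usage_count++;
	pm->suspended = false;
}

/* Drop the reference taken by model_get_sync(). */
static void model_put(struct model_pm *pm)
{
	assert(pm != NULL && pm->usage_count > 0);
	pm->usage_count--;
}

/* A runtime suspend request is refused while references are held. */
static bool model_request_suspend(struct model_pm *pm)
{
	assert(pm != NULL);
	if (pm->usage_count > 0)
		return false;
	pm->suspended = true;
	return true;
}
```

Bracketing the reset with model_get_sync() and model_put() keeps a concurrent model_request_suspend() from putting the device into D3hot while it is being recovered, which is the race shown in the log above.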

PCI/ERR: Recognize disconnected devices in report_error_detected()

2022-06-08T20:08:40+00:00

When a device is already unplugged by pciehp by the time the AER handler is invoked, the PCIe device will already be in the pci_channel_io_perm_failure state. In that case simply return PCI_ERS_RESULT_DISCONNECT instead of trying to do a state transition that will fail.

Also untangle the state transition failure from the lack of methods to improve the debugging output in case it happens again.

Link: https://lore.kernel.org/r/20220601074024.3481035-1-hch@lst.de
Signed-off-by: Christoph Hellwig
Signed-off-by: Bjorn Helgaas
Reviewed-by: Kuppuswamy Sathyanarayanan
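
The untangled checks can be sketched as a user-space model. It is an assumption-laden simplification rather than the actual report_error_detected(): the model_ names are hypothetical, the state machine treats "permanent failure" as terminal without any atomics, and bridges are not distinguished from endpoints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; all names are hypothetical. */
enum model_channel_state {
	MODEL_IO_NORMAL,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

enum model_ers_result {
	MODEL_NONE,
	MODEL_CAN_RECOVER,
	MODEL_NEED_RESET,
	MODEL_DISCONNECT,
	MODEL_NO_AER_DRIVER,
};

typedef enum model_ers_result (*model_handler_t)(enum model_channel_state);

struct model_dev {
	enum model_channel_state error_state;
	model_handler_t error_detected;	/* NULL: driver has no handler */
};

/* Simplified state machine: "permanent failure" cannot be left. */
static bool model_set_io_state(struct model_dev *dev,
			       enum model_channel_state new_state)
{
	if (dev->error_state == MODEL_IO_PERM_FAILURE &&
	    new_state != MODEL_IO_PERM_FAILURE)
		return false;
	dev->error_state = new_state;
	return true;
}

/* Three separate outcomes instead of one combined failure path. */
static enum model_ers_result
model_report_error_detected(struct model_dev *dev,
			    enum model_channel_state state)
{
	assert(dev != NULL);

	if (dev->error_state == MODEL_IO_PERM_FAILURE)
		return MODEL_DISCONNECT;	/* already unplugged */
	if (!model_set_io_state(dev, state))
		return MODEL_NONE;		/* invalid transition */
	if (!dev->error_detected)
		return MODEL_NO_AER_DRIVER;	/* no callback to invoke */
	return dev->error_detected(state);
}

/* Example handler that always asks for a reset. */
static enum model_ers_result model_always_reset(enum model_channel_state state)
{
	(void)state;
	return MODEL_NEED_RESET;
}
```

In this single-threaded model the invalid-transition branch cannot trigger after the disconnect check; it is kept only to show that the two conditions are reported separately.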

Revert "PCI: Use to_pci_driver() instead of pci_dev->driver"

2021-11-11T19:36:22+00:00

This reverts commit 2a4d9408c9e8b6f6fc150c66f3fef755c9e20d4a.

Robert reported a NULL pointer dereference caused by the PCI core (local_pci_probe()) calling the i2c_designware_pci driver's .runtime_resume() method before the .probe() method. i2c_dw_pci_resume() depends on initialization done by i2c_dw_pci_probe().

Prior to 2a4d9408c9e8 ("PCI: Use to_pci_driver() instead of pci_dev->driver"), pci_pm_runtime_resume() avoided calling the .runtime_resume() method because pci_dev->driver had not been set yet.

2a4d9408c9e8 and b5f9c644eb1b ("PCI: Remove struct pci_dev->driver"), removed pci_dev->driver, replacing it by device->driver, which *has* been set by this time, so pci_pm_runtime_resume() called the .runtime_resume() method when it previously had not.

Fixes: 2a4d9408c9e8 ("PCI: Use to_pci_driver() instead of pci_dev->driver")
Link: https://lore.kernel.org/linux-i2c/CAP145pgdrdiMAT7=-iB1DMgA7t_bMqTcJL4N0=6u8kNY3EU0dw@mail.gmail.com/
Reported-by: Robert Święcki
Tested-by: Robert Święcki
Signed-off-by: Bjorn Helgaas

PCI: Use to_pci_driver() instead of pci_dev->driver

2021-10-18T14:20:15+00:00

Struct pci_driver contains a struct device_driver, so for PCI devices, it's easy to convert a device_driver * to a pci_driver * with to_pci_driver(). The device_driver * is in struct device, so we don't need to also keep track of the pci_driver * in struct pci_dev.

Replace pci_dev->driver with to_pci_driver(). This is a step toward removing pci_dev->driver.

[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/20211004125935.2300113-11-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König
Signed-off-by: Bjorn Helgaas
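
The conversion rests on the usual container_of() pattern, which can be reproduced in user space. The following is a minimal sketch with hypothetical model_ structures that only mimic the embedding described above; it is not the kernel's definition of to_pci_driver(), and the NULL check is an assumption made for the unbound case.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; all names are hypothetical. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_device_driver {
	const char *name;
};

/* The PCI driver embeds the generic driver, as struct pci_driver does. */
struct model_pci_driver {
	int id;
	struct model_device_driver driver;
};

/* The generic device only tracks the generic driver. */
struct model_device {
	struct model_device_driver *driver;
};

/* Recover the enclosing PCI driver from the embedded generic driver. */
static struct model_pci_driver *
model_to_pci_driver(struct model_device_driver *drv)
{
	return drv ? model_container_of(drv, struct model_pci_driver, driver)
		   : NULL;
}

/* Look up the bound driver's id, or -1 if the device is unbound. */
static int model_bound_driver_id(const struct model_device *dev)
{
	struct model_pci_driver *pdrv;

	assert(dev != NULL);
	pdrv = model_to_pci_driver(dev->driver);
	return pdrv ? pdrv->id : -1;
}
```

Because the enclosing structure can always be derived from the embedded member, a second pointer to it in the device structure is redundant, which is the argument made above for dropping pci_dev->driver.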