<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/pci/pcie, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-04T12:20:33+00:00</updated>
<entry>
<title>PCI/bwctrl: Disable BW controller on Intel P45 using a quirk</title>
<updated>2026-03-04T12:20:33+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2026-01-16T13:15:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=41c9a511a96b42c0c6fdb9e3467ef1bcaa87590a'/>
<id>urn:sha1:41c9a511a96b42c0c6fdb9e3467ef1bcaa87590a</id>
<content type='text'>
[ Upstream commit 46a9f70e93ef73860d1dbbec75ef840031f8f30a ]

The commit 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as
PCIe BW controller") was found to lead to a boot hang on a Intel P45
system. Testing without setting Link Bandwidth Management Interrupt Enable
(LBMIE) and Link Autonomous Bandwidth Interrupt Enable (LABIE) (PCIe r7.0,
sec 7.5.3.7) in bwctrl allowed system to come up.

P45 is a very old chipset and supports only up to gen2 PCIe, so not having
bwctrl does not seem a huge deficiency.

Add no_bw_notif in struct pci_dev and quirk Intel P45 Root Port with it.

Reported-by: Adam Stylinski &lt;kungfujesus06@gmail.com&gt;
Link: https://lore.kernel.org/linux-pci/aUCt1tHhm_-XIVvi@eggsbenedict/
Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Tested-by: Adam Stylinski &lt;kungfujesus06@gmail.com&gt;
Link: https://patch.msgid.link/20260116131513.2359-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI/AER: Clear stale errors on reporting agents upon probe</title>
<updated>2026-03-04T12:20:32+00:00</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2026-01-25T09:25:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=90170a0da492c0d8095e163a4f3f650268d6d457'/>
<id>urn:sha1:90170a0da492c0d8095e163a4f3f650268d6d457</id>
<content type='text'>
[ Upstream commit e242d09b58e869f86071b7889acace4cff215935 ]

Correctable and Uncorrectable Error Status Registers on reporting agents
are cleared upon PCI device enumeration in pci_aer_init() to flush past
events.  They're cleared again when an error is handled by the AER driver.

If an agent reports a new error after pci_aer_init() and before the AER
driver has probed on the corresponding Root Port or Root Complex Event
Collector, that error is not handled by the AER driver:  It clears the
Root Error Status Register on probe, but neglects to re-clear the
Correctable and Uncorrectable Error Status Registers on reporting agents.

The error will eventually be reported when another error occurs.  Which
is irritating because to an end user it appears as if the earlier error
has just happened.

Amend the AER driver to clear stale errors on reporting agents upon probe.

Skip reporting agents which have not invoked pci_aer_init() yet to avoid
using an uninitialized pdev-&gt;aer_cap.  They're recognizable by the error
bits in the Device Control register still being clear.

Reporting agents may execute pci_aer_init() after the AER driver has
probed, particularly when devices are hotplugged or removed/rescanned via
sysfs.  For this reason, it continues to be necessary that pci_aer_init()
clears Correctable and Uncorrectable Error Status Registers.

Reported-by: Lucas Van &lt;lucas.van@intel.com&gt; # off-list
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Tested-by: Lucas Van &lt;lucas.van@intel.com&gt;
Reviewed-by: Kuppuswamy Sathyanarayanan &lt;sathyanarayanan.kuppuswamy@linux.intel.com&gt;
Link: https://patch.msgid.link/3011c2ed30c11f858e35e29939add754adea7478.1769332702.git.lukas@wunner.de
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI/ACPI: Restrict program_hpx_type2() to AER bits</title>
<updated>2026-02-26T23:01:09+00:00</updated>
<author>
<name>Håkon Bugge</name>
<email>haakon.bugge@oracle.com</email>
</author>
<published>2026-01-29T17:52:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=751206294527e48d431da66f4b9a5011eeaedfaf'/>
<id>urn:sha1:751206294527e48d431da66f4b9a5011eeaedfaf</id>
<content type='text'>
[ Upstream commit 9abf79c8d7b40db0e5a34aa8c744ea60ff9a3fcf ]

Previously program_hpx_type2() applied PCIe settings unconditionally,
which could incorrectly change bits like Extended Tag Field Enable and
Enable Relaxed Ordering.

When _HPX was added to ACPI r3.0, the intent of the PCIe Setting
Record (Type 2) in sec 6.2.7.3 was to configure AER registers when the
OS does not own the AER Capability:

  The PCI Express setting record contains ... [the AER] Uncorrectable
  Error Mask, Uncorrectable Error Severity, Correctable Error Mask
  ... to be used when configuring registers in the Advanced Error
  Reporting Extended Capability Structure ...

  OSPM [1] will only evaluate _HPX with Setting Record – Type 2 if
  OSPM is not controlling the PCI Express Advanced Error Reporting
  capability.

ACPI r3.0b, sec 6.2.7.3, added more AER registers, including registers
in the PCIe Capability with AER-related bits, and the restriction that
the OS use this only when it owns PCIe native hotplug:

  ... when configuring PCI Express registers in the Advanced Error
  Reporting Extended Capability Structure *or PCI Express Capability
  Structure* ...

  An OS that has assumed ownership of native hot plug but does not
  ... have ownership of the AER register set must use ... the Type 2
  record to program the AER registers ...

  However, since the Type 2 record also includes register bits that
  have functions other than AER, the OS must ignore values ... that
  are not applicable.

Restrict program_hpx_type2() to only the intended purpose:

  - Apply settings only when OS owns PCIe native hotplug but not AER,

  - Only touch the AER-related bits (Error Reporting Enables) in Device
    Control

  - Don't touch Link Control at all, since nothing there seems AER-related,
    but log _HPX settings for debugging purposes

Note that Read Completion Boundary is now configured elsewhere, since it is
unrelated to _HPX.

[1] Operating System-directed configuration and Power Management

Fixes: 40abb96c51bb ("[PATCH] pciehp: Fix programming hotplug parameters")
Signed-off-by: Håkon Bugge &lt;haakon.bugge@oracle.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Link: https://patch.msgid.link/20260129175237.727059-3-haakon.bugge@oracle.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI/PTM: Fix pcie_ptm_create_debugfs() memory leak</title>
<updated>2026-02-26T23:01:05+00:00</updated>
<author>
<name>Aadityarangan Shridhar Iyengar</name>
<email>adiyenga@cisco.com</email>
</author>
<published>2026-01-11T16:36:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3fc58ff1e3314f916fc6f04376ed396299eef318'/>
<id>urn:sha1:3fc58ff1e3314f916fc6f04376ed396299eef318</id>
<content type='text'>
[ Upstream commit 62171369cf17794ddd88f602c2c84d008ecafcff ]

In pcie_ptm_create_debugfs(), if devm_kasprintf() fails after successfully
allocating ptm_debugfs with kzalloc(), the function returns without freeing
the allocated memory, resulting in a memory leak.

Free ptm_debugfs before returning in the devm_kasprintf() error path and in
pcie_ptm_destroy_debugfs().

Fixes: 132833405e61 ("PCI: Add debugfs support for exposing PTM context")
Signed-off-by: Aadityarangan Shridhar Iyengar &lt;adiyenga@cisco.com&gt;
[bhelgaas: squash additional fix from Mani:
https://lore.kernel.org/r/pdp4xc4d5ee3e547mmdro5riui3mclduqdl7j6iclfbozo2a4c@7m3qdm6yrhuv]
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Manivannan Sadhasivam &lt;mani@kernel.org&gt;
Link: https://patch.msgid.link/20260111163650.33168-1-adiyenga@cisco.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>PCI/portdrv: Fix potential resource leak</title>
<updated>2026-02-26T23:01:05+00:00</updated>
<author>
<name>Uwe Kleine-König</name>
<email>u.kleine-koenig@baylibre.com</email>
</author>
<published>2025-12-02T15:13:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5be5792fbd46b0b512b59b1677a29c5c0074d6aa'/>
<id>urn:sha1:5be5792fbd46b0b512b59b1677a29c5c0074d6aa</id>
<content type='text'>
[ Upstream commit 01464a3fdf91c041a381d93a1b6fefbdb819a46f ]

pcie_port_probe_service() unconditionally calls get_device() (unless it
fails). So drop that reference also unconditionally as it's fine for a
PCIe driver to not have a remove callback.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Uwe Kleine-König &lt;u.kleine-koenig@baylibre.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
Reviewed-by: Jonathan Cameron &lt;jonathan.cameron@huawei.com&gt;
Link: https://patch.msgid.link/e1c68c3b3f1af8427e98ca5e2c79f8bf0ebe2ce4.1764688034.git.u.kleine-koenig@baylibre.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2025-12-06T22:01:20+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-06T22:01:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=509d3f45847627f4c5cdce004c3ec79262b5239c'/>
<id>urn:sha1:509d3f45847627f4c5cdce004c3ec79262b5239c</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:

 - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko)
   fixes a build issue and does some cleanup in ib/sys_info.c

 - "Implement mul_u64_u64_div_u64_roundup()" (David Laight)
   enhances the 64-bit math code on behalf of a PWM driver and beefs up
   the test module for these library functions

 - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich)
   makes BPF symbol names, sizes, and line numbers available to the GDB
   debugger

 - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang)
   adds a sysctl which can be used to cause additional info dumping when
   the hung-task and lockup detectors fire

 - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu)
   adds a general base64 encoder/decoder to lib/ and migrates several
   users away from their private implementations

 - "rbree: inline rb_first() and rb_last()" (Eric Dumazet)
   makes TCP a little faster

 - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin)
   reworks the KEXEC Handover interfaces in preparation for Live Update
   Orchestrator (LUO), and possibly for other future clients

 - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin)
   increases the flexibility of KEXEC Handover. Also preparation for LUO

 - "Live Update Orchestrator" (Pasha Tatashin)
   is a major new feature targeted at cloud environments. Quoting the
   cover letter:

      This series introduces the Live Update Orchestrator, a kernel
      subsystem designed to facilitate live kernel updates using a
      kexec-based reboot. This capability is critical for cloud
      environments, allowing hypervisors to be updated with minimal
      downtime for running virtual machines. LUO achieves this by
      preserving the state of selected resources, such as memory,
      devices and their dependencies, across the kernel transition.

      As a key feature, this series includes support for preserving
      memfd file descriptors, which allows critical in-memory data, such
      as guest RAM or any other large memory region, to be maintained in
      RAM across the kexec reboot.

   Mike Rappaport merits a mention here, for his extensive review and
   testing work.

 - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain)
   moves the kexec and kdump sysfs entries from /sys/kernel/ to
   /sys/kernel/kexec/ and adds back-compatibility symlinks which can
   hopefully be removed one day

 - "kho: fixes for vmalloc restoration" (Mike Rapoport)
   fixes a BUG which was being hit during KHO restoration of vmalloc()
   regions

* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
  calibrate: update header inclusion
  Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
  vmcoreinfo: track and log recoverable hardware errors
  kho: fix restoring of contiguous ranges of order-0 pages
  kho: kho_restore_vmalloc: fix initialization of pages array
  MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
  init: replace simple_strtoul with kstrtoul to improve lpj_setup
  KHO: fix boot failure due to kmemleak access to non-PRESENT pages
  Documentation/ABI: new kexec and kdump sysfs interface
  Documentation/ABI: mark old kexec sysfs deprecated
  kexec: move sysfs entries to /sys/kernel/kexec
  test_kho: always print restore status
  kho: free chunks using free_page() instead of kfree()
  selftests/liveupdate: add kexec test for multiple and empty sessions
  selftests/liveupdate: add simple kexec-based selftest for LUO
  selftests/liveupdate: add userspace API selftests
  docs: add documentation for memfd preservation via LUO
  mm: memfd_luo: allow preserving memfd
  liveupdate: luo_file: add private argument to store runtime state
  mm: shmem: export some functions to internal.h
  ...
</content>
</entry>
<entry>
<title>Merge tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci</title>
<updated>2025-12-05T01:29:41+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-05T01:29:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=43dfc13ca972988e620a6edb72956981b75ab6b0'/>
<id>urn:sha1:43dfc13ca972988e620a6edb72956981b75ab6b0</id>
<content type='text'>
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan
     Williams)

   - Switch vmd from custom domain number allocator to the common
     allocator to prevent a potential race with new non-VMD buses (Dan
     Williams)

   - Enable Precision Time Measurement (PTM) only if device advertises
     support for a relevant role, to prevent invalid PTM Requests that
     cause ACS violations that are reported as AER Uncorrectable
     Non-Fatal errors (Mika Westerberg)

  Resource management:

   - Prevent resource tree corruption when BAR resize fails (Ilpo
     Järvinen)

   - Restore BARs to the original size if a BAR resize fails (Ilpo
     Järvinen)

   - Remove BAR release from BAR resize attempts by the xe, i915, and
     amdgpu drivers so the PCI core can restore BARs if the resize fails
     (Ilpo Järvinen)

   - Move Resizable BAR code to rebar.c (Ilpo Järvinen)

   - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo
     Järvinen)

   - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo
     Järvinen)

  Power management and error handling:

   - For drivers using PCI legacy suspend, save config state at suspend
     so that state (not any earlier state from enumeration, probe, or
     error recovery) will be restored when resuming (Lukas Wunner)

   - For devices with no driver or a driver that lacks power management,
     save config state at hibernate so that state (not any earlier state
     from enumeration, probe, or error recovery) will be restored when
     resuming (Lukas Wunner)

   - Save device config space on device addition, before driver binding,
     so error recovery works more reliably (Lukas Wunner)

   - Drop pci_save_state() from several drivers that no longer need it
     since the PCI core always does it and pci_restore_state() no longer
     invalidates the saved state (Lukas Wunner)

   - Document use of pci_save_state() by drivers to capture the state
     they want restored during error recovery (Lukas Wunner)

  Power control:

   - Add a struct pci_ops.assert_perst() function pointer to
     assert/deassert PCIe PERST# and implement it for the qcom driver
     (Krishna Chaitanya Chundru)

   - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe
     switch, which must be held in reset after poweron so the pwrctrl
     driver can configure the switch via I2C before bringing up the
     links (Krishna Chaitanya Chundru)

  Endpoint framework:

   - Convert the endpoint doorbell test to use a threaded IRQ to fix a
     'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri)

   - Add endpoint VNTB MSI doorbell support to reduce latency between
     host and endpoint (Frank Li)

  New native PCIe controller drivers:

   - Add CIX Sky1 host controller DT binding and driver (Hans Zhang)

   - Add NXP S32G host controller DT binding and driver (Vincent
     Guittot)

   - Add Renesas RZ/G3S host controller DT binding and driver (Claudiu
     Beznea)

   - Add SpacemiT K1 host controller DT binding and driver (Alex Elder)

  Amlogic Meson PCIe controller driver:

   - Update DT binding to name DBI region 'dbi', not 'elbi', and update
     driver to support both (Manivannan Sadhasivam)

  Apple PCIe controller driver:

   - Move struct pci_host_bridge allocation from pci_host_common_init()
     to callers, which significantly simplifies pcie-apple (Marc
     Zyngier)

  Broadcom STB PCIe controller driver:

   - Disable advertising ASPM L0s support correctly (Jim Quinlan)

   - Add a panic/die handler to print diagnostic info in case PCIe
     caused an unrecoverable abort (Jim Quinlan)

  Cadence PCIe controller driver:

   - Add module support for Cadence platform host and endpoint
     controller driver (Manikandan K Pillai)

   - Split headers into 'legacy' (LGA) and 'high perf' (HPA) to prepare
     for new CIX Sky1 driver (Manikandan K Pillai)

  MediaTek PCIe controller driver:

   - Convert DT binding to YAML schema (Christian Marangi)

   - Add Airoha AN7583 DT compatible and driver support (Christian
     Marangi)

  Qualcomm PCIe controller driver:

   - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu)

   - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280,
     sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT
     schemas (Krzysztof Kozlowski)

   - Look up OPP using both frequency and data rate (not just frequency)
     so RPMh votes can account for both (Krishna Chaitanya Chundru)

  Rockchip DesignWare PCIe controller driver:

   - Add Rockchip RK3528 compatible strings in DT binding (Yao Zi)

  STMicroelectronics STM32MP25 PCIe controller driver:

   - Fix a race between link training and endpoint register
     initialization (Christian Bruel)

   - Align endpoint allocations to match the ATU requirements (Christian
     Bruel)

  Synopsys DesignWare PCIe controller driver:

   - Clear L1 PM Substate Capability 'Supported' bits unless glue driver
     says it's supported, which prevents users from enabling non-working
     L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas)

   - Remove now-superfluous L1SS disable code from tegra194 (Bjorn
     Helgaas)

   - Configure L1SS support in dw-rockchip when DT says
     'supports-clkreq' (Shawn Lin)

  TI Keystone PCIe controller driver:

   - Fail the probe instead of silently succeeding if ks_pcie_of_data
     didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli)

   - Make keystone buildable as a loadable module, except on ARM32 where
     hook_fault_code() is __init (Siddharth Vadapalli)"

* tag 'pci-v6.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (100 commits)
  MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer
  MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer
  PCI: sky1: Add PCIe host support for CIX Sky1
  dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings
  PCI: cadence: Add support for High Perf Architecture (HPA) controller
  MAINTAINERS: Add NXP S32G PCIe controller driver maintainer
  PCI: s32g: Add NXP S32G PCIe controller driver (RC)
  PCI: dwc: Add register and bitfield definitions
  dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
  PCI: Add Renesas RZ/G3S host controller driver
  PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
  dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
  PCI: Validate pci_rebar_size_supported() input
  Documentation: PCI: Amend error recovery doc with pci_save_state() rules
  treewide: Drop pci_save_state() after pci_restore_state()
  PCI/ERR: Ensure error recoverability at all times
  PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw
  PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
  PCI: dw-rockchip: Configure L1SS support
  PCI: tegra194: Remove unnecessary L1SS disable code
  ...
</content>
</entry>
<entry>
<title>Merge branch 'pci/ptm'</title>
<updated>2025-12-03T20:18:31+00:00</updated>
<author>
<name>Bjorn Helgaas</name>
<email>bhelgaas@google.com</email>
</author>
<published>2025-12-03T20:18:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c1e900de220905173c6cef42adf97e8020f64027'/>
<id>urn:sha1:c1e900de220905173c6cef42adf97e8020f64027</id>
<content type='text'>
- Enable PTM only if device advertises support for a relevant role, to
  prevent invalid PTM Requests that cause ACS violations that are reported
  as AER Uncorrectable Non-Fatal errors (Mika Westerberg)

* pci/ptm:
  PCI/PTM: Enable only if device advertises relevant role
</content>
</entry>
<entry>
<title>vmcoreinfo: track and log recoverable hardware errors</title>
<updated>2025-11-27T22:24:44+00:00</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2025-10-10T10:36:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3fa805c37dd4d3e72ae5c58800f3f46ab3ca1f70'/>
<id>urn:sha1:3fa805c37dd4d3e72ae5c58800f3f46ab3ca1f70</id>
<content type='text'>
Introduce a generic infrastructure for tracking recoverable hardware
errors (HW errors that are visible to the OS but does not cause a panic)
and record them for vmcore consumption.  This aids post-mortem crash
analysis tools by preserving a count and timestamp for the last occurrence
of such errors.  On the other side, correctable errors, which the OS
typically remains unaware of because the underlying hardware handles them
transparently, are less relevant for crash dump and therefore are NOT
tracked in this infrastructure.

Add centralized logging for sources of recoverable hardware errors based
on the subsystem it has been notified.

hwerror_data is write-only at kernel runtime, and it is meant to be read
from vmcore using tools like crash/drgn.  For example, this is how it
looks like when opening the crashdump from drgn.

	&gt;&gt;&gt; prog['hwerror_data']
	(struct hwerror_info[1]){
		{
			.count = (int)844,
			.timestamp = (time64_t)1752852018,
		},
		...

This helps fleet operators quickly triage whether a crash may be
influenced by hardware recoverable errors (which executes a uncommon code
path in the kernel), especially when recoverable errors occurred shortly
before a panic, such as the bug fixed by commit ee62ce7a1d90 ("page_pool:
Track DMA-mapped pages and unmap them when destroying the pool")

This is not intended to replace full hardware diagnostics but provides a
fast way to correlate hardware events with kernel panics quickly.

Rare machine check exceptions—like those indicated by mce_flags.p5 or
mce_flags.winchip—are not accounted for in this method, as they fall
outside the intended usage scope for this feature's user base.

[leitao@debian.org: add hw-recoverable-errors to toctree]
  Link: https://lkml.kernel.org/r/20251127-vmcoreinfo_fix-v1-1-26f5b1c43da9@debian.org
Link: https://lkml.kernel.org/r/20251010-vmcore_hw_error-v5-1-636ede3efe44@debian.org
Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Suggested-by: Tony Luck &lt;tony.luck@intel.com&gt;
Suggested-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Hanjun Guo &lt;guohanjun@huawei.com&gt;	[APEI]
Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Bob Moore &lt;robert.moore@intel.com&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Konrad Rzessutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Mahesh Salgaonkar &lt;mahesh@linux.ibm.com&gt;
Cc: Mauro Carvalho Chehab &lt;mchehab@kernel.org&gt;
Cc: "Oliver O'Halloran" &lt;oohall@gmail.com&gt;
Cc: Omar Sandoval &lt;osandov@osandov.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>treewide: Drop pci_save_state() after pci_restore_state()</title>
<updated>2025-11-24T22:58:59+00:00</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2025-10-12T13:25:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=383d89699c5028de510a6667f674ed38585f77fc'/>
<id>urn:sha1:383d89699c5028de510a6667f674ed38585f77fc</id>
<content type='text'>
In 2009, commit c82f63e411f1 ("PCI: check saved state before restore")
changed the behavior of pci_restore_state() such that it became necessary
to call pci_save_state() afterwards, lest recovery from subsequent PCI
errors fails.

The commit has just been reverted and so all the pci_save_state() after
pci_restore_state() calls that have accumulated in the tree are now
superfluous.  Drop them.

Two drivers chose a different approach to achieve the same result:
drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the
pci_dev's "state_saved" flag to true before calling pci_restore_state().
Drop this as well.

Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Acked-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Acked-by: Giovanni Cabiddu &lt;giovanni.cabiddu@intel.com&gt;  # qat
Link: https://patch.msgid.link/c2b28cc4defa1b743cf1dedee23c455be98b397a.1760274044.git.lukas@wunner.de
</content>
</entry>
</feed>
