summaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)AuthorFilesLines
2023-11-28PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e cardLukas Wunner1-0/+12
[ Upstream commit c9260693aa0c1e029ed23693cfd4d7814eee6624 ] Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume") shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100 msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required for Conventional PCI. But it turns out that there are PCIe devices which require a longer delay than prescribed before first config space access after reset recovery or resume from D3cold: Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator "raises a PCI system error (PERR), as reported by the IPMI event log, and the hardware itself would suffer a catastrophic event, cycling the server" unless the longer delay is observed. The card is specified to conform to PCIe r1.0 and indeed only supports Gen1 speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same 100 msec delay as PCIe r6.1 sec 6.6.1: To allow components to perform internal initialization, system software must wait for at least 100 ms from the end of a reset (cold/warm/hot) before it is permitted to issue Configuration Requests The behavior of the Torrent QN16e card thus appears to be a quirk. Treat it as such and lengthen the reset delay for this specific device. Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume") Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.1695304512.git.lukas@wunner.de Reported-by: Chad Schroeder <CSchroeder@sonifi.com> Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM6PR16MB2844.namprd16.prod.outlook.com/ Tested-by: Chad Schroeder <CSchroeder@sonifi.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: qcom-ep: Add dedicated callback for writing to DBI2 registersManivannan Sadhasivam1-0/+17
[ Upstream commit a07d2497ed657eb2efeb967af47e22f573dcd1d6 ] The DWC core driver exposes the write_dbi2() callback for writing to the DBI2 registers in a vendor-specific way. On the Qcom EP platforms, the DBI_CS2 bit in the ELBI region needs to be asserted before writing to any DBI2 registers and deasserted once done. So, let's implement the callback for the Qcom PCIe EP driver so that the DBI2 writes are correctly handled in the hardware. Without this callback, the DBI2 register writes like BAR size won't go through and as a result, the default BAR size is set for all BARs. [kwilczynski: commit log, renamed function to match the DWC convention] Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver") Suggested-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/linux-pci/20231025130029.74693-2-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Cc: stable@vger.kernel.org # 5.16+ Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: exynos: Don't discard .remove() callbackUwe Kleine-König1-2/+2
commit 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 upstream. With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. This fixes the following warning by modpost: WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text) (with ARCH=x86_64 W=1 allmodconfig). Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos") Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI: kirin: Don't discard .remove() callbackUwe Kleine-König1-2/+2
commit 3064ef2e88c1629c1e67a77d7bc20020b35846f2 upstream. With CONFIG_PCIE_KIRIN=y and kirin_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. This fixes the following warning by modpost: drivers/pci/controller/dwc/pcie-kirin: section mismatch in reference: kirin_pcie_driver+0x8 (section: .data) -> kirin_pcie_remove (section: .exit.text) (with ARCH=x86_64 W=1 allmodconfig). Fixes: 000f60db784b ("PCI: kirin: Add support for a PHY layer") Link: https://lore.kernel.org/r/20231001170254.2506508-3-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common()Heiner Kallweit1-0/+2
commit 8e37372ad0bea4c9b4712d9943f6ae96cff9491f upstream. aspm_attr_store_common(), which handles sysfs control of ASPM, has the same problem as fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1"): disabling L1 adds only ASPM_L1 (but not any of the L1.x substates) to the "aspm_disable" mask. Enabling one substate, e.g., L1.1, via sysfs removes ASPM_L1 from the disable mask. Since disabling L1 via sysfs doesn't add any of the substates to the disable mask, enabling L1.1 actually enables *all* the substates. In this scenario: - Write 0 to "l1_aspm" to disable L1 - Write 1 to "l1_1_aspm" to enable L1.1 the intention is to disable L1 and all L1.x substates, then enable just L1.1, but in fact, *all* L1.x substates are enabled. Fix this by explicitly disabling all the L1.x substates when disabling L1. Fixes: 72ea91afbfb0 ("PCI/ASPM: Add sysfs attributes for controlling ASPM link states") Link: https://lore.kernel.org/r/6ba7dd79-9cfe-4ed0-a002-d99cb842f361@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI: keystone: Don't discard .probe() callbackUwe Kleine-König1-2/+2
commit 7994db905c0fd692cf04c527585f08a91b560144 upstream. The __init annotation makes the ks_pcie_probe() function disappear after booting completes. However a device can also be bound later. In that case, we try to call ks_pcie_probe(), but the backing memory is likely already overwritten. The right thing to do is do always have the probe callback available. Note that the (wrong) __refdata annotation prevented this issue to be noticed by modpost. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-5-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI: keystone: Don't discard .remove() callbackUwe Kleine-König1-2/+2
commit 200bddbb3f5202bbce96444fdc416305de14f547 upstream. With CONFIG_PCIE_KEYSTONE=y and ks_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. Note that this driver cannot be compiled as a module, so ks_pcie_remove() was always discarded before this change and modpost couldn't warn about this issue. Furthermore the __ref annotation also prevents a warning. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI/sysfs: Protect driver's D3cold preference from user spaceLukas Wunner2-5/+2
commit 70b70a4307cccebe91388337b1c85735ce4de6ff upstream. struct pci_dev contains two flags which govern whether the device may suspend to D3cold: * no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold) * d3cold_allowed provides an opt-out for user space (default is true, user space may set to false) Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working. That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device. That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out. Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirkBjorn Helgaas1-4/+4
[ Upstream commit 04e82fa5951ca66495d7b05665eff673aa3852b4 ] Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended. Separate because this isn't as trivial as the other FIELD_GET() changes. See 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse") Link: https://lore.kernel.org/r/20231010204436.1000644-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: dwc: Add missing PCI_EXP_LNKCAP_MLW handlingYoshihiro Shimoda1-1/+8
[ Upstream commit 89db0793c9f2da265ecb6c1681f899d9af157f37 ] Update dw_pcie_link_set_max_link_width() to set PCI_EXP_LNKCAP_MLW. In accordance with the DW PCIe RC/EP HW manuals [1,2,3,...] aside with the PORT_LINK_CTRL_OFF.LINK_CAPABLE and GEN2_CTRL_OFF.NUM_OF_LANES[8:0] field there is another one which needs to be updated. It's LINK_CAPABILITIES_REG.PCIE_CAP_MAX_LINK_WIDTH. If it isn't done at the very least the maximum link-width capability CSR won't expose the actual maximum capability. [1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.60a, March 2015, p.1032 [2] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.70a, March 2016, p.1065 [3] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 4.90a, March 2016, p.1057 ... [X] DesignWare Cores PCI Express Controller Databook - DWC PCIe Endpoint, Version 5.40a, March 2019, p.1396 [X+1] DesignWare Cores PCI Express Controller Databook - DWC PCIe Root Port, Version 5.40a, March 2019, p.1266 Suggested-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-4-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: dwc: Add dw_pcie_link_set_max_link_width()Yoshihiro Shimoda1-45/+41
[ Upstream commit a9a1bcba90254975d4adbcca53f720318cf81c0c ] This is a preparation before adding the Max-Link-width capability setup which would in its turn complete the max-link-width setup procedure defined by Synopsys in the HW-manual. Seeing there is a max-link-speed setup method defined in the DW PCIe core driver it would be good to have a similar function for the link width setup. That's why we need to define a dedicated function first from already implemented but incomplete link-width setting up code. Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-3-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: Disable ATS for specific Intel IPU E2000 devicesBartosz Pawlowski1-0/+19
[ Upstream commit a18615b1cfc04f00548c60eb9a77e0ce56e848fd ] Due to a hardware issue in A and B steppings of Intel IPU E2000, it expects wrong endianness in ATS invalidation message body. This problem can lead to outdated translations being returned as valid and finally cause system instability. To prevent such issues, add quirk_intel_e2000_no_ats() to disable ATS for vulnerable IPU E2000 devices. Link: https://lore.kernel.org/r/20230908143606.685930-3-bartosz.pawlowski@intel.com Signed-off-by: Bartosz Pawlowski <bartosz.pawlowski@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: Extract ATS disabling to a helper functionBartosz Pawlowski1-7/+9
[ Upstream commit f18b1137d38c091cc8c16365219f0a1d4a30b3d1 ] Introduce quirk_no_ats() helper function to provide a standard way to disable ATS capability in PCI quirks. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230908143606.685930-2-bartosz.pawlowski@intel.com Signed-off-by: Bartosz Pawlowski <bartosz.pawlowski@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: Use FIELD_GET() to extract Link WidthIlpo Järvinen2-6/+4
[ Upstream commit d1f9b39da4a5347150246871325190018cda8cb3 ] Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting. Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: Do error check on own line to split long "if" conditionsIlpo Järvinen3-9/+12
[ Upstream commit d15f18053e5cc5576af9e7eef0b2a91169b6326d ] Placing PCI error code check inside "if" condition usually results in need to split lines. Combined with additional conditions the "if" condition becomes messy. Convert to the usual error handling pattern with an additional variable to improve code readability. In addition, reverse the logic in pci_find_vsec_capability() to get rid of &&. No functional changes intended. Link: https://lore.kernel.org/r/20230911125354.25501-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: PCI_POSSIBLE_ERROR()] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: mvebu: Use FIELD_PREP() with Link WidthIlpo Järvinen1-1/+1
[ Upstream commit 408599ec561ad5862cda4f107626009f6fa97a74 ] mvebu_pcie_setup_hw() setups the Maximum Link Width field in the Link Capabilities registers using an open-coded variant of FIELD_PREP() with a literal in shift. Improve readability by using FIELD_PREP(PCI_EXP_LNKCAP_MLW, ...). Link: https://lore.kernel.org/r/20230919125648.1920-6-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fieldsIlpo Järvinen1-5/+4
[ Upstream commit 759574abd78e3b47ec45bbd31a64e8832cf73f97 ] Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting. Similarly, change custom code that misleadingly used PCI_EXP_LNKSTA_NLW_SHIFT to prepare value for PCI_EXP_LNKCAP write to use FIELD_PREP() with correct field define (PCI_EXP_LNKCAP_MLW). Link: https://lore.kernel.org/r/20230919125648.1920-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28ACPI: APEI: Fix AER info corruption when error status data has multiple sectionsShiju Jose1-0/+10
[ Upstream commit e2abc47a5a1a9f641e7cacdca643fdd40729bf6e ] ghes_handle_aer() passes AER data to the PCI core for logging and recovery by calling aer_recover_queue() with a pointer to struct aer_capability_regs. The problem was that aer_recover_queue() queues the pointer directly without copying the aer_capability_regs data. The pointer was to the ghes->estatus buffer, which could be reused before aer_recover_work_func() reads the data. To avoid this problem, allocate a new aer_capability_regs structure from the ghes_estatus_pool, copy the AER data from the ghes->estatus buffer into it, pass a pointer to the new struct to aer_recover_queue(), and free it after aer_recover_work_func() has processed it. Reported-by: Bjorn Helgaas <helgaas@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20Revert "PCI/ASPM: Disable only ASPM_STATE_L1 when driver, disables L1"Heiner Kallweit1-1/+2
commit 3cb4f534bac010258b2688395c2f13459a932be9 upstream. This reverts commit fb097dcd5a28c0a2325632405c76a66777a6bed9. After fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1"), disabling L1 via pci_disable_link_state(PCIE_LINK_STATE_L1), then enabling one substate, e.g., L1.1, via sysfs actually enables *all* the substates. For example, r8169 disables L1 because of hardware issues on a number of systems, which implicitly disables the L1.1 and L1.2 substates. On some systems, L1 and L1.1 work fine, but L1.2 causes missed rx packets. Enabling L1.1 via the sysfs "aspm_l1_1" attribute unexpectedly enables L1.2 as well as L1.1. After fb097dcd5a28, pci_disable_link_state(PCIE_LINK_STATE_L1) adds only ASPM_L1 (but not any of the L1.x substates) to the "aspm_disable" mask: --- Before fb097dcd5a28 +++ After fb097dcd5a28 # r8169 disables L1: pci_disable_link_state(PCIE_LINK_STATE_L1) - disable |= ASPM_L1 | ASPM_L1_1 | ASPM_L1_2 | ... # disable L1, L1.x + disable |= ASPM_L1 # disable L1 only # write "1" to sysfs "aspm_l1_1" attribute: l1_1_aspm aspm_attr_store_common(state = ASPM_L1_1) disable &= ~ASPM_L1_1 # enable L1.1 if (state & (ASPM_L1_1 | ...)) # if enabling any substate disable &= ~ASPM_L1 # enable L1 # final state: - disable = ASPM_L1_2 | ... # L1, L1.1 enabled; L1.2 disabled + disable = 0 # L1, L1.1, L1.2 all enabled Enabling an L1.x substate removes the substate and L1 from the "aspm_disable" mask. After fb097dcd5a28, the substates were not added to the mask when disabling L1, so enabling one substate implicitly enables all of them. Revert fb097dcd5a28 so enabling one substate doesn't enable the others. Link: https://lore.kernel.org/r/c75931ac-7208-4200-9ca1-821629cf5e28@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> [bhelgaas: work through example in commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-20PCI: endpoint: Fix double free in __pci_epc_create()Dan Carpenter1-1/+0
[ Upstream commit c9501d268944d6c0475ecb3e740a084a7da9cbfe ] The pci_epc_release() function frees "epc" so the kfree() on the next line is a double free. Drop the redundant free. Fixes: 7711cbb4862a ("PCI: endpoint: Fix WARN() when an endpoint driver is removed") Link: https://lore.kernel.org/r/2ce68694-87a7-4c06-b8a4-9870c891b580@moroto.mountain Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20PCI: vmd: Correct PCI Header Type Register's multi-function checkIlpo Järvinen1-2/+1
[ Upstream commit 5827e17d0555b566c32044b0632b46f9f95054fa ] vmd_domain_reset() attempts to find whether the device may contain multiple functions by checking 0x80 (Multi-Function Device), however, the hdr_type variable has already been masked with PCI_HEADER_TYPE_MASK so the check can never true. To fix the issue, don't mask the read with PCI_HEADER_TYPE_MASK. Fixes: 6aab5622296b ("PCI: vmd: Clean up domain before enumeration") Link: https://lore.kernel.org/r/20231003125300.5541-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Nirmal Patel <nirmal.patel@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-08PCI: Prevent xHCI driver from claiming AMD VanGogh USB3 DRD deviceVicki Pfau1-3/+5
commit 7e6f3b6d2c352b5fde37ce3fed83bdf6172eebd4 upstream. The AMD VanGogh SoC contains a DesignWare USB3 Dual-Role Device that can be operated as either a USB Host or a USB Device, similar to on the AMD Nolan platform. be6646bfbaec ("PCI: Prevent xHCI driver from claiming AMD Nolan USB3 DRD device") added a quirk to let the dwc3 driver claim the Nolan device since it provides more specific support. Extend that quirk to include the VanGogh SoC USB3 device. Link: https://lore.kernel.org/r/20230927202212.2388216-1-vi@endrift.com Signed-off-by: Vicki Pfau <vi@endrift.com> [bhelgaas: include be6646bfbaec reference, add stable tag] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-10PCI: qcom: Fix IPQ8074 enumerationSricharan Ramabadhran1-3/+1
commit 6a878a54d0053ef21f3b829dc267487c2302b012 upstream. PARF_SLV_ADDR_SPACE_SIZE_2_3_3 is used by qcom_pcie_post_init_2_3_3(). This PCIe slave address space size register offset is 0x358 but was incorrectly changed to 0x16c by 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions"). This prevented access to slave address space registers like iATU, etc., so the IPQ8074 PCIe controller was not enumerated. Revert back to the correct 0x358 offset and remove the unused PARF_SLV_ADDR_SPACE_SIZE_2_3_3. Fixes: 39171b33f652 ("PCI: qcom: Remove PCIE20_ prefix from register definitions") Link: https://lore.kernel.org/r/20230919102948.1844909-1-quic_srichara@quicinc.com Tested-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-10-10PCI/PM: Mark devices disconnected if upstream PCIe link is down on resumeMika Westerberg1-1/+13
commit c82458101d5490230d735caecce14c9c27b1010c upstream. Mark Blakeney reported that when suspending system with a Thunderbolt dock connected and then unplugging the dock before resume (which is pretty normal flow with laptops), resuming takes long time. What happens is that the PCIe link from the root port to the PCIe switch inside the Thunderbolt device does not train (as expected, the link is unplugged): pcieport 0000:00:07.2: restoring config space at offset 0x24 (was 0x3bf12001, writing 0x3bf12001) pcieport 0000:00:07.0: waiting 100 ms for downstream link pcieport 0000:01:00.0: not ready 1023ms after resume; giving up However, at this point we still try to resume the devices below that unplugged link: pcieport 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible ... pcieport 0000:01:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0) ... pcieport 0000:02:02.0: waiting 100 ms for downstream link, after activation And this is the link from PCIe switch downstream port to the xHCI on the dock: xhci_hcd 0000:03:00.0: not ready 65535ms after resume; giving up xhci_hcd 0000:03:00.0: Unable to change power state from D3cold to D0, device inaccessible xhci_hcd 0000:03:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff) This ends up slowing down the resume time considerably. For this reason mark these devices as disconnected if the link above them did not train properly. Fixes: e8b908146d44 ("PCI/PM: Increase wait time after resume") Link: https://lore.kernel.org/r/20230918053041.1018876-1-mika.westerberg@linux.intel.com Reported-by: Mark Blakeney <mark.blakeney@bullet-systems.net> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217915 Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Cc: stable@vger.kernel.org # v6.4+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-23PCI: fu740: Set the number of MSI vectorsYong-Xuan Wang1-0/+1
[ Upstream commit 551a60e1225e71fff8efd9390204c505b0870e0f ] The iMSI-RX module of the DW PCIe controller provides multiple sets of MSI_CTRL_INT_i_* registers, and each set is capable of handling 32 MSI interrupts. However, the fu740 PCIe controller driver only enabled one set of MSI_CTRL_INT_i_* registers, as the total number of supported interrupts was not specified. Set the supported number of MSI vectors to enable all the MSI_CTRL_INT_i_* registers on the fu740 PCIe core, allowing the system to fully utilize the available MSI interrupts. Link: https://lore.kernel.org/r/20230807055621.2431-1-yongxuan.wang@sifive.com Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23PCI: vmd: Disable bridge window for domain resetNirmal Patel1-2/+17
[ Upstream commit f73eedc90bf73d48e8368e6b0b4ad76a7fffaef7 ] During domain reset process vmd_domain_reset() clears PCI configuration space of VMD root ports. But certain platform has observed following errors and failed to boot. ... DMAR: VT-d detected Invalidation Queue Error: Reason f DMAR: VT-d detected Invalidation Time-out Error: SID ffff DMAR: VT-d detected Invalidation Completion Error: SID ffff DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0 DMAR: Invalidation Time-out Error (ITE) cleared The root cause is that memset_io() clears prefetchable memory base/limit registers and prefetchable base/limit 32 bits registers sequentially. This seems to be enabling prefetchable memory if the device disabled prefetchable memory originally. Here is an example (before memset_io()): PCI configuration space for 10000:00:00.0: 86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20 00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00 ... So, prefetchable memory is ffffffff00000000-575000fffff, which is disabled. When memset_io() clears prefetchable base 32 bits register, the prefetchable memory becomes 0000000000000000-575000fffff, which is enabled and incorrect. Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec: The Prefetchable Memory Limit register must be programmed to a smaller value than the Prefetchable Memory Base register if there is no prefetchable memory on the secondary side of the bridge. This is believed to be the reason for the failure and in addition the sequence of operation in vmd_domain_reset() is not following the PCIe specs. Disable the bridge window by executing a sequence of operations borrowed from pci_disable_bridge_window() and pci_setup_bridge_io(), that comply with the PCI specifications. Link: https://lore.kernel.org/r/20230810215029.1177379-1-nirmal.patel@linux.intel.com Signed-off-by: Nirmal Patel <nirmal.patel@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23PCI: dwc: Provide deinit callback for i.MXMark Brown1-0/+1
[ Upstream commit fc8b24c28bec19fc0621d108b9ee81ddfdedb25a ] The i.MX integration for the DesignWare PCI controller has a _host_exit() operation which undoes everything that the _host_init() operation does but does not wire this up as the host_deinit callback for the core, or call it in any path other than suspend. This means that if we ever unwind the initial probe of the device, for example because it fails, the regulator core complains that the regulators for the device were left enabled: imx6q-pcie 33800000.pcie: iATU: unroll T, 4 ob, 4 ib, align 64K, limit 16G imx6q-pcie 33800000.pcie: Phy link never came up imx6q-pcie 33800000.pcie: Phy link never came up imx6q-pcie: probe of 33800000.pcie failed with error -110 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 46 at drivers/regulator/core.c:2396 _regulator_put+0x110/0x128 Wire up the callback so that the core can clean up after itself. Link: https://lore.kernel.org/r/20230731-pci-imx-regulator-cleanup-v2-1-fc8fa5c9893d@kernel.org Tested-by: Fabio Estevam <festevam@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Revert "PCI: Mark NVIDIA T4 GPUs to avoid bus reset"Bjorn Helgaas1-1/+1
commit 5260bd6d36c83c5b269c33baaaf8c78e520908b0 upstream. This reverts commit d5af729dc2071273f14cbb94abbc60608142fd83. d5af729dc207 ("PCI: Mark NVIDIA T4 GPUs to avoid bus reset") avoided Secondary Bus Reset on the T4 because the reset seemed to not work when the T4 was directly attached to a Root Port. But NVIDIA thinks the issue is probably related to some issue with the Root Port, not with the T4. The T4 provides neither PM nor FLR reset, so masking bus reset compromises this device for assignment scenarios. Revert d5af729dc207 as requested by Wu Zongyong. This will leave SBR broken in the specific configuration Wu tested, as it was in v6.5, so Wu will debug that further. Link: https://lore.kernel.org/r/ZPqMCDWvITlOLHgJ@wuzongyong-alibaba Link: https://lore.kernel.org/r/20230908201104.GA305023@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI/PM: Only read PCI_PM_CTRL register when availableFeiyang Chen1-4/+9
commit 5694ba13b004eea683c6d4faeb6d6e7a9636bda0 upstream. For a device with no Power Management Capability, pci_power_up() previously returned 0 (success) if the platform was able to put the device in D0, which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even though it doesn't exist. Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted dev->current_state. This led to messages like this in some cases: pci 0000:01:00.0: Refused to change power state from D3hot to D0 To prevent this, make pci_power_up() always return a negative failure code if the device lacks a Power Management Capability, even if non-PCI platform power management has been able to put the device in D0. The failure will prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL. Fixes: e200904b275c ("PCI/PM: Split pci_power_up()") Link: https://lore.kernel.org/r/20230824013738.1894965-1-chenfeiyang@loongson.cn Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Cc: stable@vger.kernel.org # v5.19+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: hv: Fix a crash in hv_pci_restore_msi_msg() during hibernationDexuan Cui1-0/+3
commit 04bbe863241a9be7d57fb4cf217ee4a72f480e70 upstream. When a Linux VM with an assigned PCI device runs on Hyper-V, if the PCI device driver is not loaded yet (i.e. MSI-X/MSI is not enabled on the device yet), doing a VM hibernation triggers a panic in hv_pci_restore_msi_msg() -> msi_lock_descs(&pdev->dev), because pdev->dev.msi.data is still NULL. Avoid the panic by checking if MSI-X/MSI is enabled. Link: https://lore.kernel.org/r/20230816175939.21566-1-decui@microsoft.com Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling") Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: sathyanarayanan.kuppuswamy@linux.intel.com Reviewed-by: Michael Kelley <mikelley@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: Free released resource after coalescingRoss Lagerwall1-0/+1
commit 8ec9c1d5d0a5a4744516adb483b97a238892f9d5 upstream. release_resource() doesn't actually free the resource or resource list entry so free the resource list entry to avoid a leak. Closes: https://lore.kernel.org/r/878r9sga1t.fsf@kernel.org/ Fixes: e54223275ba1 ("PCI: Release resource invalidated by coalescing") Link: https://lore.kernel.org/r/20230906110846.225369-1-ross.lagerwall@citrix.com Reported-by: Kalle Valo <kvalo@kernel.org> Tested-by: Kalle Valo <kvalo@kernel.org> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: rockchip: Use 64-bit mask on MSI 64-bit PCI addressRick Wertenbroek1-3/+3
commit cdb50033dd6dfcf02ae3d4ee56bc1a9555be6d36 upstream. A 32-bit mask was used on the 64-bit PCI address used for mapping MSIs. This would result in the upper 32 bits being unintentionally zeroed and MSIs getting mapped to incorrect PCI addresses if the address had any of the upper bits set. Replace 32-bit mask by appropriate 64-bit mask. [kwilczynski: use GENMASK_ULL() over GENMASK() for 32-bit compatibility] Fixes: dc73ed0f1b8b ("PCI: rockchip: Fix window mapping and address translation for endpoint") Closes: https://lore.kernel.org/linux-pci/8d19e5b7-8fa0-44a4-90e2-9bb06f5eb694@moroto.mountain Link: https://lore.kernel.org/linux-pci/20230703085845.2052008-1-rick.wertenbroek@gmail.com Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-13PCI: layerscape: Add workaround for lost link capabilities during resetXiaowei Bao1-0/+19
[ Upstream commit 17cf8661ee0f065c08152e611a568dd1fb0285f1 ] The endpoint controller loses the Maximum Link Width and Supported Link Speed value from the Link Capabilities Register - initially configured by the Reset Configuration Word (RCW) - during a link-down or hot reset event. Address this issue in the endpoint event handler. Link: https://lore.kernel.org/r/20230720135834.1977616-2-Frank.Li@nxp.com Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support") Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI/ASPM: Use RMW accessors for changing LNKCTLIlpo Järvinen1-17/+13
[ Upstream commit e09060b3b6b4661278ff8e1b7b81a37d5ea86eae ] Don't assume that the device is fully under the control of ASPM and use RMW capability accessors which do proper locking to avoid losing concurrent updates to the register values. If configuration fails in pcie_aspm_configure_common_clock(), the function attempts to restore the old PCI_EXP_LNKCTL_CCC settings. Store only the old PCI_EXP_LNKCTL_CCC bit for the relevant devices rather than the content of the whole LNKCTL registers. It aligns better with how pcie_lnkctl_clear_and_set() expects its parameter and makes the code more obvious to understand. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 2a42d9dba784 ("PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: pciehp: Use RMW accessors for changing LNKCTLIlpo Järvinen1-9/+3
[ Upstream commit 5f75f96c61039151c193775d776fde42477eace1 ] As hotplug is not the only driver touching LNKCTL, use the RMW capability accessor which handles concurrent changes correctly. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 7f822999e12a ("PCI: pciehp: Add Disable/enable link functions") Link: https://lore.kernel.org/r/20230717120503.15276-4-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: Make link retraining use RMW accessors for changing LNKCTLIlpo Järvinen1-6/+2
[ Upstream commit fb0171a4c01b4825e36a5584eaa84291179c64ce ] Don't assume that the device is fully under the control of PCI core. Use RMW capability accessors in link retraining which do proper locking to avoid losing concurrent updates to the register values. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: 4ec73791a64b ("PCI: Work around Pericom PCIe-to-PCI bridge Retrain Link erratum") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/20230717120503.15276-3-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: Add locking to RMW PCI Express Capability Register accessorsIlpo Järvinen2-3/+18
[ Upstream commit 5e70d0acf0825f439079736080350371f8d6699a ] Many places in the kernel write the Link Control and Root Control PCI Express Capability Registers without proper concurrency control and this could result in losing the changes one of the writers intended to make. Add pcie_cap_lock spinlock into the struct pci_dev and use it to protect bit changes made in the RMW capability accessors. Protect only a selected set of registers by differentiating the RMW accessor internally to locked/unlocked variants using a wrapper which has the same signature as pcie_capability_clear_and_set_word(). As the Capability Register (pos) given to the wrapper is always a constant, the compiler should be able to simplify all the dead-code away. So far only the Link Control Register (ASPM, hotplug, link retraining, various drivers) and the Root Control Register (AER & PME) seem to require RMW locking. Suggested-by: Lukas Wunner <lukas@wunner.de> Fixes: c7f486567c1d ("PCI PM: PCIe PME root port service driver") Fixes: f12eb72a268b ("PCI/ASPM: Use PCI Express Capability accessors") Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Fixes: affa48de8417 ("staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM") Fixes: 849a9366cba9 ("misc: rtsx: Add support new chip rts5228 mmc: rtsx: Add support MMC_CAP2_NO_MMC") Fixes: 3d1e7aa80d1c ("misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL") Fixes: c0e5f4e73a71 ("misc: rtsx: Add support for RTS5261") Fixes: 3df4fce739e2 ("misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG") Fixes: 121e9c6b5c4c ("misc: rtsx: modify and fix init_hw function") Fixes: 19f3bd548f27 ("mfd: rtsx: Remove LCTLR defination") Fixes: 773ccdfd9cc6 ("mfd: rtsx: Read vendor setting from config space") Fixes: 8275b77a1513 ("mfd: rts5249: Add support for RTS5250S power saving") Fixes: 5da4e04ae480 ("misc: rtsx: Add support for RTS5260") Fixes: 0f49bfbd0f2e ("tg3: Use PCI Express Capability accessors") Fixes: 5e7dfd0fb94a ("tg3: Prevent corruption at 10 / 100Mbps w CLKREQ") Fixes: b726e493e8dc ("r8169: sync existing 8168 device hardware start sequences with vendor driver") Fixes: e6de30d63eb1 ("r8169: more 8168dp support.") Fixes: 8a06127602de ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards") Fixes: 6f461f6c7c96 ("e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata") Fixes: 1eae4eb2a1c7 ("e1000e: Disable L1 ASPM power savings for 82573 mobile variants") Fixes: 8060e169e02f ("ath9k: Enable extended synch for AR9485 to fix L0s recovery issue") Fixes: 69ce674bfa69 ("ath9k: do btcoex ASPM disabling at initialization time") Fixes: f37f05503575 ("mt76: mt76x2e: disable pcie_aspm by default") Link: https://lore.kernel.org/r/20230717120503.15276-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: Mark NVIDIA T4 GPUs to avoid bus resetWu Zongyong1-1/+1
[ Upstream commit d5af729dc2071273f14cbb94abbc60608142fd83 ] NVIDIA T4 GPUs do not work with SBR. This problem is found when the T4 card is direct attached to a Root Port only. Avoid bus reset by marking T4 GPUs PCI_DEV_FLAGS_NO_BUS_RESET. Fixes: 4c207e7121fa ("PCI: Mark some NVIDIA GPUs to avoid bus reset") Link: https://lore.kernel.org/r/2dcebea53a6eb9bd212ec6d8974af2e5e0333ef6.1681129861.git.wuzongyong@linux.alibaba.com Signed-off-by: Wu Zongyong <wuzongyong@linux.alibaba.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: microchip: Correct the DED and SEC interrupt bit offsetsDaire McNamara1-4/+4
[ Upstream commit 6d473a5a26136edf55c435a1c433e52910e03926 ] The SEC and DED interrupt bits are laid out the wrong way round so the SEC interrupt handler attempts to mask, unmask, and clear the DED interrupt and vice versa. Correct the bit offsets so that each interrupt handler operates properly. Link: https://lore.kernel.org/r/20230728131401.1615724-2-daire.mcnamara@microchip.com Fixes: 6f15a9c9f941 ("PCI: microchip: Add Microchip PolarFire PCIe controller driver") Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI/DOE: Fix destroy_work_on_stack() raceIra Weiny1-1/+1
[ Upstream commit e3a3a097eaebaf234a482b4d2f9f18fe989208c1 ] The following debug object splat was observed in testing: ODEBUG: free active (active state 0) object: 0000000097d23782 object type: work_struct hint: doe_statemachine_work+0x0/0x510 WARNING: CPU: 1 PID: 71 at lib/debugobjects.c:514 debug_print_object+0x7d/0xb0 ... Workqueue: pci 0000:36:00.0 DOE [1 doe_statemachine_work RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: ? debug_print_object+0x7d/0xb0 ? __pfx_doe_statemachine_work+0x10/0x10 debug_object_free.part.0+0x11b/0x150 doe_statemachine_work+0x45e/0x510 process_one_work+0x1d4/0x3c0 This occurs because destroy_work_on_stack() was called after signaling the completion in the calling thread. This creates a race between destroy_work_on_stack() and the task->work struct going out of scope in pci_doe(). Signal the work complete after destroying the work struct. This is safe because signal_task_complete() is the final thing the work item does and the workqueue code is careful not to access the work struct after. Fixes: abf04be0e707 ("PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y") Link: https://lore.kernel.org/r/20230726-doe-fix-v1-1-af07e614d4dd@intel.com Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: qcom-ep: Switch MHI bus master clock off during L1SSManivannan Sadhasivam1-1/+1
[ Upstream commit b9cbc06049cb6b7a322d708c2098195fb9fdcc4c ] Currently, as part of the qcom_pcie_perst_deassert() function, instead of writing the updated value to clear PARF_MSTR_AXI_CLK_EN, the variable "val" is re-read. This must be fixed to ensure that the master clock supplied to the MHI bus is correctly gated during L1.1/L1.2 to save power. Thus, replace the line that re-reads "val" with a line that writes the updated value to the register to clear PARF_MSTR_AXI_CLK_EN. [kwilczynski: commit log] Fixes: c457ac029e44 ("PCI: qcom-ep: Gate Master AXI clock to MHI bus during L1SS") Link: https://lore.kernel.org/linux-pci/20230627141036.11600-1-manivannan.sadhasivam@linaro.org Reported-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13PCI: apple: Initialize pcie->nvecs before useSven Peter1-1/+5
[ Upstream commit d8650c0c2aa2e413594e4cb0faafa9958c1d7782 ] The apple_pcie_setup_port() function computes ilog2(pcie->nvecs) to set up the number of MSIs available for each port. However, it's called before apple_msi_init(), which initializes pcie->nvecs. Luckily, pcie->nvecs is part of kzalloc()-ed structure and, as such, initialized as zero. ilog2(0) happens to be 0xffffffff which then simply configures more MSIs in hardware than we have. This doesn't break anything because we never hand out those vectors. Thus, swap the order of the two calls so that the correctly initialized value is then used. [kwilczynski: commit log] Link: https://lore.kernel.org/linux-pci/20230311133453.63246-1-sven@svenpeter.dev Fixes: 476c41ed4597 ("PCI: apple: Implement MSI support") Signed-off-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Eric Curtin <ecurtin@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-13Revert "PCI: tegra194: Enable support for 256 Byte payload"Vidya Sagar1-10/+0
commit ebfde1584d9f037b6309fc682c96e22dac7bcb7a upstream. After commit 4fb8e46c1bc4 ("PCI: tegra194: Enable support for 256 Byte payload"), we initialize MPS=256 for tegra194 Root Ports before enumerating the hierarchy. Consider an Endpoint that supports only MPS=128. In the default situation (CONFIG_PCIE_BUS_DEFAULT set and no "pci=pcie_bus_*" parameter), Linux tries to configure the MPS of every device to match the upstream bridge. If the Endpoint is directly below the Root Port, Linux can reduce the Root Port MPS to 128 to match the Endpoint. But if there's a switch in the middle, Linux doesn't reduce the Root Port MPS because other devices below the switch may already be configured with MPS larger than 128. This scenario results in uncorrectable Malformed TLP errors if the Root Port sends TLPs with payloads larger than 128 bytes. These errors can be avoided by using the "pci=pcie_bus_safe" parameter, but it doesn't seem to be a good idea to always have this parameter even for basic functionality to work. Revert commit 4fb8e46c1bc4 ("PCI: tegra194: Enable support for 256 Byte payload") so the Root Ports default to MPS=128, which all devices support. If peer-to-peer DMA is not required, one can use "pci=pcie_bus_perf" to get the benefit of larger MPS settings. [bhelgaas: commit log; kwilczynski: retain "u16 val_16" declaration at the top, add missing acked by tag] Fixes: 4fb8e46c1bc4 ("PCI: tegra194: Enable support for 256 Byte payload") Link: https://lore.kernel.org/linux-pci/20230619102604.3735001-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Jon Hunter <jonathanh@nvidia.com> Cc: stable@vger.kernel.org # v6.0-rc1+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11Merge tag 'pci-v6.5-fixes-1' of ↵Linus Torvalds5-25/+18
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Add Manivannan Sadhasivam as DesignWare PCIe driver co-maintainer (Krzysztof Wilczyński) - Revert "PCI: dwc: Wait for link up only if link is started" to fix a regression on Qualcomm platforms that don't reach interconnect sync state if the slot is empty (Johan Hovold) - Revert "PCI: mvebu: Mark driver as BROKEN" so people can use pci-mvebu even though some others report problems (Bjorn Helgaas) - Avoid a NULL pointer dereference when using acpiphp for root bus hotplug to fix a regression added in v6.5-rc1 (Igor Mammedov) * tag 'pci-v6.5-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus Revert "PCI: mvebu: Mark driver as BROKEN" Revert "PCI: dwc: Wait for link up only if link is started" MAINTAINERS: Add Manivannan Sadhasivam as DesignWare PCIe driver maintainer
2023-08-09PCI: move OF status = "disabled" detection to dev->match_driverVladimir Oltean2-6/+3
The blamed commit has broken probing on arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi when &enetc_port0 (PCI function 0) has status = "disabled". Background: pci_scan_slot() has logic to say that if the function 0 of a device is absent, the entire device is absent and we can skip the other functions entirely. Traditionally, this has meant that pci_bus_read_dev_vendor_id() returns an error code for that function. However, since the blamed commit, there is an extra confounding condition: function 0 of the device exists and has a valid vendor id, but it is disabled in the device tree. In that case, pci_scan_slot() would incorrectly skip the entire device instead of just that function. In the case of NXP LS1028A, status = "disabled" does not mean that the PCI function's config space is not available for reading. It is, but the Ethernet port is just not functionally useful with a particular SerDes protocol configuration (0x9999) due to pinmuxing constraints of the Soc. So, pci_scan_slot() skips all other functions on the ENETC ECAM (enetc_port1, enetc_port2, enetc_mdio_pf3 etc) when just enetc_port0 had to not be probed. There is an additional regression introduced by the change, caused by its fundamental premise. The enetc driver needs to run code for all PCI functions, regardless of whether they're enabled or not in the device tree. That is no longer possible if the driver's probe function is no longer called. But Rob recommends that we move the of_device_is_available() detection to dev->match_driver, and this makes the PCI fixups still run on all functions, while just probing drivers for those functions that are enabled. So, a separate change in the enetc driver will have to move the workarounds to a PCI fixup. Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/netdev/CAL_JsqLsVYiPLx2kcHkDQ4t=hQVCR7NHziDwi9cCFUFhx48Qow@mail.gmail.com/ Suggested-by: Rob Herring <robh@kernel.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-08PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root busIgor Mammedov1-1/+7
40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") changed acpiphp hotplug to use pci_assign_unassigned_bridge_resources() which depends on bridge being available, however enable_slot() can be called without bridge associated: 1. Legitimate case of hotplug on root bus (widely used in virt world) 2. A (misbehaving) firmware, that sends ACPI Bus Check notifications to non existing root ports (Dell Inspiron 7352/0W6WV0), which end up at enable_slot(..., bridge = 0) where bus has no bridge assigned to it. acpihp doesn't know that it's a bridge, and bus specific 'PCI subsystem' can't augment ACPI context with bridge information since the PCI device to get this data from is/was not available. Issue is easy to reproduce with QEMU's 'pc' machine, which supports PCI hotplug on hostbridge slots. To reproduce, boot kernel at commit 40613da52b13 in VM started with following CLI (assuming guest root fs is installed on sda1 partition): # qemu-system-x86_64 -M pc -m 1G -enable-kvm -cpu host \ -monitor stdio -serial file:serial.log \ -kernel arch/x86/boot/bzImage \ -append "root=/dev/sda1 console=ttyS0" \ guest_disk.img Once guest OS is fully booted at qemu prompt: (qemu) device_add e1000 (check serial.log) it will cause NULL pointer dereference at: void pci_assign_unassigned_bridge_resources(struct pci_dev *bridge) { struct pci_bus *parent = bridge->subordinate; BUG: kernel NULL pointer dereference, address: 0000000000000018 ? pci_assign_unassigned_bridge_resources+0x1f/0x260 enable_slot+0x21f/0x3e0 acpiphp_hotplug_notify+0x13d/0x260 acpi_device_hotplug+0xbc/0x540 acpi_hotplug_work_fn+0x15/0x20 process_one_work+0x1f7/0x370 worker_thread+0x45/0x3b0 The issue was discovered on Dell Inspiron 7352/0W6WV0 laptop with following sequence: 1. Suspend to RAM 2. Wake up with the same backtrace being observed: 3. 2nd suspend to RAM attempt makes laptop freeze Fix it by using __pci_bus_assign_resources() instead of pci_assign_unassigned_bridge_resources() as we used to do, but only in case when bus doesn't have a bridge associated (to cover for the case of ACPI event on hostbridge or non existing root port). That lets us keep hotplug on root bus working like it used to and at the same time keeps resource reassignment usable on root ports (and other 1st level bridges) that was fixed by 40613da52b13. Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") Link: https://lore.kernel.org/r/20230726123518.2361181-2-imammedo@redhat.com Reported-by: Woody Suwalski <terraluna977@gmail.com> Tested-by: Woody Suwalski <terraluna977@gmail.com> Tested-by: Michal Koutný <mkoutny@suse.com> Link: https://lore.kernel.org/r/11fc981c-af49-ce64-6b43-3e282728bd1a@gmail.com Signed-off-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Michael S. Tsirkin <mst@redhat.com>
2023-08-05Revert "PCI: mvebu: Mark driver as BROKEN"Bjorn Helgaas1-1/+0
b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to enable the pci-mvebu driver. The driver does have known problems, but as Russell and Uwe reported, it does work in some configurations, so removing it broke some working setups. Revert b3574f579ece so pci-mvebu is available. Reported-by: Russell King (Oracle) <linux@armlinux.org.uk> Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-07-26Revert "PCI: dwc: Wait for link up only if link is started"Johan Hovold3-23/+11
This reverts commit da56a1bfbab55189595e588f1d984bdfb5cf5924. Bjorn Andersson, Fabio Estevam, Xiaolei Wang, and Jon Hunter reported that da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started") broke controller probing by returning an error in case the link does not come up during host initialisation, for example when the slot is empty. As explained in commit 886a9c134755 ("PCI: dwc: Move link handling into common code") and as indicated by the comment "Ignore errors, the link may come up later" in the code, waiting for link up and ignoring errors is the intended behaviour: Let's standardize this to succeed as there are usecases where devices (and the link) appear later even without hotplug. For example, a reconfigured FPGA device. Reverting the offending commit specifically fixes a regression on Qualcomm platforms like the Lenovo ThinkPad X13s which no longer reach the interconnect sync state if a slot does not have a device populated (e.g. an optional modem). Note that enabling asynchronous probing by default as was done for Qualcomm platforms by commit c0e1eb441b1d ("PCI: qcom: Enable async probe by default"), should take care of any related boot time concerns. Finally, note that the intel-gw driver is the only driver currently not providing a .start_link() callback and instead starts the link in its .host_init() callback, which may avoid an additional one-second timeout during probe by making the link-up wait conditional. If anyone cares, that can be done in a follow-up patch with a proper motivation. [bhelgaas: add Fabio Estevam, Xiaolei Wang, Jon Hunter reports] Fixes: da56a1bfbab5 ("PCI: dwc: Wait for link up only if link is started") Link: https://lore.kernel.org/r/20230706082610.26584-1-johan+linaro@kernel.org Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Reported-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20230704122635.1362156-1-festevam@gmail.com/ Reported-by: Xiaolei Wang <xiaolei.wang@windriver.com> Link: https://lore.kernel.org/r/20230705010624.3912934-1-xiaolei.wang@windriver.com/ Reported-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/6ca287a1-6c7c-7b90-9022-9e73fb82b564@nvidia.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Sajid Dalvi <sdalvi@google.com> Cc: Ajay Agarwal <ajayagarwal@google.com>
2023-07-09Merge tag 'ntb-6.5' of https://github.com/jonmason/ntbLinus Torvalds1-0/+1
Pull NTB updates from Jon Mason: "Fixes for pci_clean_master, error handling in driver inits, and various other issues/bugs" * tag 'ntb-6.5' of https://github.com/jonmason/ntb: ntb: hw: amd: Fix debugfs_create_dir error checking ntb.rst: Fix copy and paste error ntb_netdev: Fix module_init problem ntb: intel: Remove redundant pci_clear_master ntb: epf: Remove redundant pci_clear_master ntb_hw_amd: Remove redundant pci_clear_master ntb: idt: drop redundant pci_enable_pcie_error_reporting() MAINTAINERS: git://github -> https://github.com for jonmason NTB: EPF: fix possible memory leak in pci_vntb_probe() NTB: ntb_tool: Add check for devm_kcalloc NTB: ntb_transport: fix possible memory leak while device_register() fails ntb: intel: Fix error handling in intel_ntb_pci_driver_init() NTB: amd: Fix error handling in amd_ntb_pci_driver_init() ntb: idt: Fix error handling in idt_pci_driver_init()
2023-07-08NTB: EPF: fix possible memory leak in pci_vntb_probe()ruanjinjie1-0/+1
As ntb_register_device() don't handle error of device_register(), if ntb_register_device() returns error in pci_vntb_probe(), name of kobject which is allocated in dev_set_name() called in device_add() is leaked. As comment of device_add() says, it should call put_device() to drop the reference count that was set in device_initialize() when it fails, so the name can be freed in kobject_cleanup(). Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>