summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)AuthorFilesLines
2026-03-04iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in scalable modeJinhui Guo1-1/+1
[ Upstream commit 10e60d87813989e20eac1f3eda30b3bae461e7f9 ] Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") relies on pci_dev_is_disconnected() to skip ATS invalidation for safely-removed devices, but it does not cover link-down caused by faults, which can still hard-lock the system. For example, if a VM fails to connect to the PCIe device, "virsh destroy" is executed to release resources and isolate the fault, but a hard-lockup occurs while releasing the group fd. Call Trace: qi_submit_sync qi_flush_dev_iotlb intel_pasid_tear_down_entry device_block_translation blocking_domain_attach_dev __iommu_attach_device __iommu_device_set_domain __iommu_group_set_domain_internal iommu_detach_group vfio_iommu_type1_detach_group vfio_group_detach_container vfio_group_fops_release __fput Although pci_device_is_present() is slower than pci_dev_is_disconnected(), it still takes only ~70 µs on a ConnectX-5 (8 GT/s, x2) and becomes even faster as PCIe speed and width increase. Besides, devtlb_invalidation_with_pasid() is called only in the paths below, which are far less frequent than memory map/unmap. 1. mm-struct release 2. {attach,release}_dev 3. set/remove PASID 4. dirty-tracking setup The gain in system stability far outweighs the negligible cost of using pci_device_is_present() instead of pci_dev_is_disconnected() to decide when to skip ATS invalidation, especially under GDR high-load conditions. Fixes: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") Cc: stable@vger.kernel.org Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com> Link: https://lore.kernel.org/r/20251211035946.2071-3-guojinhui.liam@bytedance.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiencyAlexander Grest1-10/+21
[ Upstream commit df180b1a4cc51011c5f8c52c7ec02ad2e42962de ] The SMMU CMDQ lock is highly contentious when there are multiple CPUs issuing commands and the queue is nearly full. The lock has the following states: - 0: Unlocked - >0: Shared lock held with count - INT_MIN+N: Exclusive lock held, where N is the # of shared waiters - INT_MIN: Exclusive lock held, no shared waiters When multiple CPUs are polling for space in the queue, they attempt to grab the exclusive lock to update the cons pointer from the hardware. If they fail to get the lock, they will spin until either the cons pointer is updated by another CPU. The current code allows the possibility of shared lock starvation if there is a constant stream of CPUs trying to grab the exclusive lock. This leads to severe latency issues and soft lockups. Consider the following scenario where CPU1's attempt to acquire the shared lock is starved by CPU2 and CPU0 contending for the exclusive lock. CPU0 (exclusive) | CPU1 (shared) | CPU2 (exclusive) | `cmdq->lock` -------------------------------------------------------------------------- trylock() //takes | | | 0 | shared_lock() | | INT_MIN | fetch_inc() | | INT_MIN | no return | | INT_MIN + 1 | spins // VAL >= 0 | | INT_MIN + 1 unlock() | spins... | | INT_MIN + 1 set_release(0) | spins... | | 0 see[NOTE] (done) | (sees 0) | trylock() // takes | 0 | *exits loop* | cmpxchg(0, INT_MIN) | 0 | | *cuts in* | INT_MIN | cmpxchg(0, 1) | | INT_MIN | fails // != 0 | | INT_MIN | spins // VAL >= 0 | | INT_MIN | *starved* | | INT_MIN [NOTE] The current code resets the exclusive lock to 0 regardless of the state of the lock. This causes two problems: 1. It opens the possibility of back-to-back exclusive locks and the downstream effect of starving shared lock. 2. The count of shared lock waiters are lost. To mitigate this, we release the exclusive lock by only clearing the sign bit while retaining the shared lock waiter count as a way to avoid starving the shared lock waiters. Also deleted cmpxchg loop while trying to acquire the shared lock as it is not needed. The waiters can see the positive lock count and proceed immediately after the exclusive lock is released. Exclusive lock is not starved in that submitters will try exclusive lock first when new spaces become available. Reviewed-by: Mostafa Saleh <smostafa@google.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Alexander Grest <Alexander.Grest@microsoft.com> Signed-off-by: Jacob Pan <jacob.pan@linux.microsoft.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-04iommu/vt-d: Flush cache for PASID table before using itDmytro Maluka1-3/+4
[ Upstream commit 22d169bdd2849fe6bd18c2643742e1c02be6451c ] When writing the address of a freshly allocated zero-initialized PASID table to a PASID directory entry, do that after the CPU cache flush for this PASID table, not before it, to avoid the time window when this PASID table may be already used by non-coherent IOMMU hardware while its contents in RAM is still some random old data, not zero-initialized. Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency") Signed-off-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20251221123508.37495-1-dmaluka@chromium.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-01-19Revert "iommu/amd: Skip enabling command/event buffers for kdump"Greg Kroah-Hartman1-19/+9
This reverts commit 30e91eeb0bc9b3daf402b26176d1d52c29ab53e4 which is commit 9be15fbfc6c5c89c22cf6e209f66ea43ee0e58bb upstream. This causes problems in older kernel trees as SNP host kdump is not supported in them, so drop it from the stable branches. Reported-by: Ashish Kalra <ashish.kalra@amd.com> Link: https://lore.kernel.org/r/dacdff7f-0606-4ed5-b056-2de564404d51@amd.com Cc: Vasant Hegde <vasant.hegde@amd.com> Cc: Sairaj Kodilkar <sarunkod@amd.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/qcom: fix device leak on of_xlate()Johan Hovold1-6/+4
[ Upstream commit 6a3908ce56e6879920b44ef136252b2f0c954194 ] Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate(). Note that commit e2eae09939a8 ("iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()") fixed the leak in a couple of error paths, but the reference is still leaking on success and late failures. Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu") Cc: stable@vger.kernel.org # 4.14: e2eae09939a8 Cc: Rob Clark <robin.clark@oss.qualcomm.com> Cc: Yu Kuai <yukuai3@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> [ adapted validation logic from max_asid to num_ctxs ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/sun50i: fix device leak on of_xlate()Johan Hovold1-0/+2
commit f916109bf53864605d10bf6f4215afa023a80406 upstream. Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate(). Fixes: 4100b8c229b3 ("iommu: Add Allwinner H6 IOMMU driver") Cc: stable@vger.kernel.org # 5.8 Cc: Maxime Ripard <mripard@kernel.org> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/omap: fix device leaks on probe_device()Johan Hovold2-3/+1
commit b5870691065e6bbe6ba0650c0412636c6a239c5a upstream. Make sure to drop the references taken to the iommu platform devices when looking up their driver data during probe_device(). Note that the arch data device pointer added by commit 604629bcb505 ("iommu/omap: add support for late attachment of iommu devices") has never been used. Remove it to underline that the references are not needed. Fixes: 9d5018deec86 ("iommu/omap: Add support to program multiple iommus") Fixes: 7d6827748d54 ("iommu/omap: Fix iommu archdata name for DT-based devices") Cc: stable@vger.kernel.org # 3.18 Cc: Suman Anna <s-anna@ti.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/mediatek: fix device leak on of_xlate()Johan Hovold1-0/+2
commit b3f1ee18280363ef17f82b564fc379ceba9ec86f upstream. Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate(). Fixes: 0df4fabe208d ("iommu/mediatek: Add mt8173 IOMMU driver") Cc: stable@vger.kernel.org # 4.6 Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/mediatek-v1: fix device leak on probe_device()Johan Hovold1-0/+2
commit c77ad28bfee0df9cbc719eb5adc9864462cfb65b upstream. Make sure to drop the reference taken to the iommu platform device when looking up its driver data during probe_device(). Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW") Cc: stable@vger.kernel.org # 4.8 Cc: Honghui Zhang <honghui.zhang@mediatek.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/ipmmu-vmsa: fix device leak on of_xlate()Johan Hovold1-0/+2
commit 80aa518452c4aceb9459f9a8e3184db657d1b441 upstream. Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate(). Fixes: 7b2d59611fef ("iommu/ipmmu-vmsa: Replace local utlb code with fwspec ids") Cc: stable@vger.kernel.org # 4.14 Cc: Magnus Damm <damm+renesas@opensource.se> Acked-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/exynos: fix device leak on of_xlate()Johan Hovold1-6/+3
commit 05913cc43cb122f9afecdbe775115c058b906e1b upstream. Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate(). Note that commit 1a26044954a6 ("iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()") fixed the leak in a couple of error paths, but the reference is still leaking on success. Fixes: aa759fd376fb ("iommu/exynos: Add callback for initializing devices from device tree") Cc: stable@vger.kernel.org # 4.2: 1a26044954a6 Cc: Yu Kuai <yukuai3@huawei.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-01-19iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metalStephan Gerhold1-10/+17
[ Upstream commit 5583a55e074b33ccd88ac0542fd7cd656a7e2c8c ] Some platforms (e.g. SC8280XP and X1E) support more than 128 stream matching groups. This is more than what is defined as maximum by the ARM SMMU architecture specification. Commit 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128") disabled use of the additional groups because they don't exhibit the same behavior as the architecture supported ones. It seems like this is just another quirk of the hypervisor: When running bare-metal without the hypervisor, the additional groups appear to behave just like all others. The boot firmware uses some of the additional groups, so ignoring them in this situation leads to stream match conflicts whenever we allocate a new SMR group for the same SID. The workaround exists primarily because the bypass quirk detection fails when using a S2CR register from the additional matching groups, so let's perform the test with the last reliable S2CR (127) and then limit the number of SMR groups only if we detect that we are running below the hypervisor (because of the bypass quirk). Fixes: 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-12-07iommu/amd: Skip enabling command/event buffers for kdumpAshish Kalra1-9/+19
[ Upstream commit 9be15fbfc6c5c89c22cf6e209f66ea43ee0e58bb ] After a panic if SNP is enabled in the previous kernel then the kdump kernel boots with IOMMU SNP enforcement still enabled. IOMMU command buffers and event buffer registers remain locked and exclusive to the previous kernel. Attempts to enable command and event buffers in the kdump kernel will fail, as hardware ignores writes to the locked MMIO registers as per AMD IOMMU spec Section 2.12.2.1. Skip enabling command buffers and event buffers for kdump boot as they are already enabled in the previous kernel. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Sairaj Kodilkar <sarunkod@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Link: https://lore.kernel.org/r/576445eb4f168b467b0fc789079b650ca7c5b037.1756157913.git.ashish.kalra@amd.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28iommu/amd: Avoid stack buffer overflow from kernel cmdlineKees Cook1-2/+2
[ Upstream commit 8503d0fcb1086a7cfe26df67ca4bd9bd9e99bdec ] While the kernel command line is considered trusted in most environments, avoid writing 1 byte past the end of "acpiid" if the "str" argument is maximum length. Reported-by: Simcha Kosman <simcha.kosman@cyberark.com> Closes: https://lore.kernel.org/all/AS8P193MB2271C4B24BCEDA31830F37AE84A52@AS8P193MB2271.EURP193.PROD.OUTLOOK.COM Fixes: b6b26d86c61c ("iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameter") Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Ankit Soni <Ankit.Soni@amd.com> Link: https://lore.kernel.org/r/20250804154023.work.970-kees@kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27iommu/amd: Ensure GA log notifier callbacks finish running before module unloadSean Christopherson1-0/+8
[ Upstream commit 94c721ea03c7078163f41dbaa101ac721ddac329 ] Synchronize RCU when unregistering KVM's GA log notifier to ensure all in-flight interrupt handlers complete before KVM-the module is unloaded. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250315031048.2374109-1-seanjc@google.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-04iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57)Mingcong Bai1-1/+3
commit 2c8a7c66c90832432496616a9a3c07293f1364f3 upstream. On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the kernel boots with errors related to DMAR, the graphical interface appeared quite choppy, and the system resets erratically within a minute after it booted: DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000 [fault reason 0x05] PTE Write access is not set Upon comparing boot logs with VT-d on/off, I found that the Intel Calpella quirk (`quirk_calpella_no_shadow_gtt()') correctly applied the igfx IOMMU disable/quirk correctly: pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU for graphics Whereas with VT-d on, it went into the "else" branch, which then triggered the DMAR handling fault above: ... else if (!disable_igfx_iommu) { /* we have to ensure the gfx device is idle before we flush */ pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n"); iommu_set_dma_strict(); } Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx seems to have fixed the aforementioned issue. Running a few `git blame' runs on the function, I have found that the quirk was originally introduced as a fix specific to ThinkPad X201: commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space") Which was later revised twice to the "else" branch we saw above: - 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") - 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping") I'm uncertain whether further testings on this particular laptops were done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do some distro-specific testing if that's what would be required to verify this patch. P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same `quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these chipsets are (if they share the same issue with VT-d or even, indeed, if this issue is specific to a bug in the Lenovo BIOS). With regards to 0x0062, it seems to be a Centrino wireless card, but not a chipset? I have also listed a couple (distro and kernel) bug reports below as references (some of them are from 7-8 years ago!), as they seem to be similar issue found on different Westmere/Ironlake, Haswell, and Broadwell hardware setups. Cc: stable@vger.kernel.org Fixes: 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") Fixes: ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping") Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1 Link: https://bugs.archlinux.org/task/65362 Link: https://bbs.archlinux.org/viewtopic.php?id=230323 Reported-by: Wenhao Sun <weiguangtwk@outlook.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=197029 Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Link: https://lore.kernel.org/r/20250415133330.12528-1-jeffbai@aosc.io Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihidPavel Paklov1-0/+8
commit 8dee308e4c01dea48fc104d37f92d5b58c50b96c upstream. There is a string parsing logic error which can lead to an overflow of hid or uid buffers. Comparing ACPIID_LEN against a total string length doesn't take into account the lengths of individual hid and uid buffers so the check is insufficient in some cases. For example if the length of hid string is 4 and the length of the uid string is 260, the length of str will be equal to ACPIID_LEN + 1 but uid string will overflow uid buffer which size is 256. The same applies to the hid string with length 13 and uid string with length 250. Check the length of hid and uid strings separately to prevent buffer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Pavel Paklov <Pavel.Paklov@cyberprotect.ru> Link: https://lore.kernel.org/r/20250325092259.392844-1-Pavel.Paklov@cyberprotect.ru Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-02iommu/amd: Return an error if vCPU affinity is set for non-vCPU IRTESean Christopherson1-1/+1
[ Upstream commit 07172206a26dcf3f0bf7c3ecaadd4242b008ea54 ] Return -EINVAL instead of success if amd_ir_set_vcpu_affinity() is invoked without use_vapic; lying to KVM about whether or not the IRTE was configured to post IRQs is all kinds of bad. Fixes: d98de49a53e4 ("iommu/amd: Enable vAPIC interrupt remapping mode by default") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-ID: <20250404193923.1413163-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14iommu/arm-smmu: Defer probe of clients after smmu device boundPratyush Brahma1-0/+11
commit 229e6ee43d2a160a1592b83aad620d6027084aad upstream. Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called. Following is how the race occurs: T1:Smmu device probe T2: Client device probe really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail() When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes. Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver. Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta <quic_guptap@quicinc.com> Signed-off-by: Prakash Gupta <quic_guptap@quicinc.com> Signed-off-by: Pratyush Brahma <quic_pbrahma@quicinc.com> Link: https://lore.kernel.org/r/20241004090428.2035-1-quic_pbrahma@quicinc.com [will: Add comment] Signed-off-by: Will Deacon <will@kernel.org> [rm: backport for context conflict prior to 6.8] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-10-17iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 countSanjay K Kumar1-5/+11
[ Upstream commit 3cf74230c139f208b7fb313ae0054386eee31a81 ] If qi_submit_sync() is invoked with 0 invalidation descriptors (for instance, for DMA draining purposes), we can run into a bug where a submitting thread fails to detect the completion of invalidation_wait. Subsequently, this led to a soft lockup. Currently, there is no impact by this bug on the existing users because no callers are submitting invalidations with 0 descriptors. This fix will enable future users (such as DMA drain) calling qi_submit_sync() with 0 count. Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both threads then enter a while loop, waiting for their respective descriptors to complete. T1 detects its completion (i.e., T1's invalidation_wait status changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to reclaim all descriptors, potentially including adjacent ones of other threads that are also marked as QI_DONE. During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU hardware may complete the invalidation for T2, setting its status to QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's invalidation_wait descriptor and changes its status to QI_FREE, T2 will not observe the QI_DONE status for its invalidation_wait and will indefinitely remain stuck. This soft lockup does not occur when only non-zero descriptors are submitted.In such cases, invalidation descriptors are interspersed among wait descriptors with the status QI_IN_USE, acting as barriers. These barriers prevent the reclaim code from mistakenly freeing descriptors belonging to other submitters. Considered the following example timeline: T1 T2 ======================================== ID1 WD1 while(WD1!=QI_DONE) unlock lock WD1=QI_DONE* WD2 while(WD2!=QI_DONE) unlock lock WD1==QI_DONE? ID1=QI_DONE WD2=DONE* reclaim() ID1=FREE WD1=FREE WD2=FREE unlock soft lockup! T2 never sees QI_DONE in WD2 Where: ID = invalidation descriptor WD = wait descriptor * Written by hardware The root of the problem is that the descriptor status QI_DONE flag is used for two conflicting purposes: 1. signal a descriptor is ready for reclaim (to be freed) 2. signal by the hardware that a wait descriptor is complete The solution (in this patch) is state separation by using QI_FREE flag for #1. Once a thread's invalidation descriptors are complete, their status would be set to QI_FREE. The reclaim_free_desc() function would then only free descriptors marked as QI_FREE instead of those marked as QI_DONE. This change ensures that T2 (from the previous example) will correctly observe the completion of its invalidation_wait (marked as QI_DONE). Signed-off-by: Sanjay K Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240728210059.1964602-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17iommu/vt-d: Always reserve a domain ID for identity setupLu Baolu1-3/+3
[ Upstream commit 2c13012e09190174614fd6901857a1b8c199e17d ] We will use a global static identity domain. Reserve a static domain ID for it. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20240809055431.36513-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-10-17iommu/arm-smmu-qcom: hide last LPASS SMMU context bank from linuxMarc Gonzalez1-0/+7
[ Upstream commit 3a8990b8a778219327c5f8ecf10b5d81377b925a ] On qcom msm8998, writing to the last context bank of lpass_q6_smmu (base address 0x05100000) produces a system freeze & reboot. The hardware/hypervisor reports 13 context banks for the LPASS SMMU on msm8998, but only the first 12 are accessible... Override the number of context banks [ 2.546101] arm-smmu 5100000.iommu: probing hardware configuration... [ 2.552439] arm-smmu 5100000.iommu: SMMUv2 with: [ 2.558945] arm-smmu 5100000.iommu: stage 1 translation [ 2.563627] arm-smmu 5100000.iommu: address translation ops [ 2.568923] arm-smmu 5100000.iommu: non-coherent table walk [ 2.574566] arm-smmu 5100000.iommu: (IDR0.CTTW overridden by FW configuration) [ 2.580220] arm-smmu 5100000.iommu: stream matching with 12 register groups [ 2.587263] arm-smmu 5100000.iommu: 13 context banks (0 stage-2 only) [ 2.614447] arm-smmu 5100000.iommu: Supported page sizes: 0x63315000 [ 2.621358] arm-smmu 5100000.iommu: Stage-1: 36-bit VA -> 36-bit IPA [ 2.627772] arm-smmu 5100000.iommu: preserved 0 boot mappings Specifically, the crashes occur here: qsmmu->bypass_cbndx = smmu->num_context_banks - 1; arm_smmu_cb_write(smmu, qsmmu->bypass_cbndx, ARM_SMMU_CB_SCTLR, 0); and here: arm_smmu_write_context_bank(smmu, i); arm_smmu_cb_write(smmu, i, ARM_SMMU_CB_FSR, ARM_SMMU_CB_FSR_FAULT); It is likely that FW reserves the last context bank for its own use, thus a simple work-around is: DON'T USE IT in Linux. If we decrease the number of context banks, last one will be "hidden". Signed-off-by: Marc Gonzalez <mgonzalez@freebox.fr> Reviewed-by: Caleb Connolly <caleb.connolly@linaro.org> Reviewed-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20240820-smmu-v3-1-2f71483b00ec@freebox.fr Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12iommu/vt-d: Handle volatile descriptor status readJacob Pan1-1/+1
[ Upstream commit b5e86a95541cea737394a1da967df4cd4d8f7182 ] Queued invalidation wait descriptor status is volatile in that IOMMU hardware writes the data upon completion. Use READ_ONCE() to prevent compiler optimizations which ensures memory reads every time. As a side effect, READ_ONCE() also enforces strict types and may add an extra instruction. But it should not have negative performance impact since we use cpu_relax anyway and the extra time(by adding an instruction) may allow IOMMU HW request cacheline ownership easier. e.g. gcc 12.3 BEFORE: 81 38 ad de 00 00 cmpl $0x2,(%rax) AFTER (with READ_ONCE()) 772f: 8b 00 mov (%rax),%eax 7731: 3d ad de 00 00 cmp $0x2,%eax //status data is 32 bit Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20240607173817.3914600-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20240702130839.108139-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12iommu: sun50i: clear bypass registerJernej Skrabec1-0/+1
[ Upstream commit 927c70c93d929f4c2dcaf72f51b31bb7d118a51a ] The Allwinner H6 IOMMU has a bypass register, which allows to circumvent the page tables for each possible master. The reset value for this register is 0, which disables the bypass. The Allwinner H616 IOMMU resets this register to 0x7f, which activates the bypass for all masters, which is not what we want. Always clear this register to 0, to enforce the usage of page tables, and make this driver compatible with the H616 in this respect. Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Link: https://lore.kernel.org/r/20240616224056.29159-2-andre.przywara@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05iommu/arm-smmu-v3: Free MSIs in case of ENOMEMAleksandr Aprelkov1-1/+1
[ Upstream commit 80fea979dd9d48d67c5b48d2f690c5da3e543ebd ] If devm_add_action() returns -ENOMEM, then MSIs are allocated but not not freed on teardown. Use devm_add_action_or_reset() instead to keep the static analyser happy. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Aleksandr Aprelkov <aaprelkov@usergate.com> Link: https://lore.kernel.org/r/20240403053759.643164-1-aaprelkov@usergate.com [will: Tweak commit message, remove warning message] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05iommu/amd: Fix sysfs leak in iommu initKun(llfl)1-0/+9
[ Upstream commit a295ec52c8624883885396fde7b4df1a179627c3 ] During the iommu initialization, iommu_init_pci() adds sysfs nodes. However, these nodes aren't remove in free_iommu_resources() subsequently. Fixes: 39ab9555c241 ("iommu: Add sysfs bindings for struct iommu_device") Signed-off-by: Kun(llfl) <llfl@linux.alibaba.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/c8e0d11c6ab1ee48299c288009cf9c5dae07b42d.1715215003.git.llfl@linux.alibaba.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05iommu/amd: Introduce pci segment structureVasant Hegde2-2/+68
[ Upstream commit 404ec4e4c169fb64da6b2a38b471c13ac0897c76 ] Newer AMD systems can support multiple PCI segments, where each segment contains one or more IOMMU instances. However, an IOMMU instance can only support a single PCI segment. Current code assumes that system contains only one pci segment (segment 0) and creates global data structures such as device table, rlookup table, etc. Introducing per PCI segment data structure, which contains segment specific data structures. This will eventually replace the global data structures. Also update `amd_iommu->pci_seg` variable to point to PCI segment structure instead of PCI segment ID. Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20220706113825.25582-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Stable-dep-of: a295ec52c862 ("iommu/amd: Fix sysfs leak in iommu init") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02iommu/vt-d: Allocate local memory for page request queueJacob Pan1-1/+1
[ Upstream commit a34f3e20ddff02c4f12df2c0635367394e64c63d ] The page request queue is per IOMMU, its allocation should be made NUMA-aware for performance reasons. Fixes: a222a7f0bb6c ("iommu/vt-d: Implement page request handling") Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240403214007.985600-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-27iommu/vt-d: Don't issue ATS Invalidation request when device is disconnectedEthan Zhao1-0/+3
[ Upstream commit 4fc82cd907ac075648789cc3a00877778aa1838b ] For those endpoint devices connect to system via hotplug capable ports, users could request a hot reset to the device by flapping device's link through setting the slot's link control register, as pciehp_ist() DLLSC interrupt sequence response, pciehp will unload the device driver and then power it off. thus cause an IOMMU device-TLB invalidation (Intel VT-d spec, or ATS Invalidation in PCIe spec r6.1) request for non-existence target device to be sent and deadly loop to retry that request after ITE fault triggered in interrupt context. That would cause following continuous hard lockup warning and system hang [ 4211.433662] pcieport 0000:17:01.0: pciehp: Slot(108): Link Down [ 4211.433664] pcieport 0000:17:01.0: pciehp: Slot(108): Card not present [ 4223.822591] NMI watchdog: Watchdog detected hard LOCKUP on cpu 144 [ 4223.822622] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S OE kernel version xxxx [ 4223.822623] Hardware name: vendorname xxxx 666-106, BIOS 01.01.02.03.01 05/15/2023 [ 4223.822623] RIP: 0010:qi_submit_sync+0x2c0/0x490 [ 4223.822624] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 1 0 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39 [ 4223.822624] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093 [ 4223.822625] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005 [ 4223.822625] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340 [ 4223.822625] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000 [ 4223.822626] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200 [ 4223.822626] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004 [ 4223.822626] FS: 0000000000000000(0000) GS:ffffa237ae400000(0000) knlGS:0000000000000000 [ 4223.822627] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4223.822627] CR2: 00007ffe86515d80 CR3: 000002fd3000a001 CR4: 0000000000770ee0 [ 4223.822627] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4223.822628] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 4223.822628] PKRU: 55555554 [ 4223.822628] Call Trace: [ 4223.822628] qi_flush_dev_iotlb+0xb1/0xd0 [ 4223.822628] __dmar_remove_one_dev_info+0x224/0x250 [ 4223.822629] dmar_remove_one_dev_info+0x3e/0x50 [ 4223.822629] intel_iommu_release_device+0x1f/0x30 [ 4223.822629] iommu_release_device+0x33/0x60 [ 4223.822629] iommu_bus_notifier+0x7f/0x90 [ 4223.822630] blocking_notifier_call_chain+0x60/0x90 [ 4223.822630] device_del+0x2e5/0x420 [ 4223.822630] pci_remove_bus_device+0x70/0x110 [ 4223.822630] pciehp_unconfigure_device+0x7c/0x130 [ 4223.822631] pciehp_disable_slot+0x6b/0x100 [ 4223.822631] pciehp_handle_presence_or_link_change+0xd8/0x320 [ 4223.822631] pciehp_ist+0x176/0x180 [ 4223.822631] ? irq_finalize_oneshot.part.50+0x110/0x110 [ 4223.822632] irq_thread_fn+0x19/0x50 [ 4223.822632] irq_thread+0x104/0x190 [ 4223.822632] ? irq_forced_thread_fn+0x90/0x90 [ 4223.822632] ? irq_thread_check_affinity+0xe0/0xe0 [ 4223.822633] kthread+0x114/0x130 [ 4223.822633] ? __kthread_cancel_work+0x40/0x40 [ 4223.822633] ret_from_fork+0x1f/0x30 [ 4223.822633] Kernel panic - not syncing: Hard LOCKUP [ 4223.822634] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S OE kernel version xxxx [ 4223.822634] Hardware name: vendorname xxxx 666-106, BIOS 01.01.02.03.01 05/15/2023 [ 4223.822634] Call Trace: [ 4223.822634] <NMI> [ 4223.822635] dump_stack+0x6d/0x88 [ 4223.822635] panic+0x101/0x2d0 [ 4223.822635] ? ret_from_fork+0x11/0x30 [ 4223.822635] nmi_panic.cold.14+0xc/0xc [ 4223.822636] watchdog_overflow_callback.cold.8+0x6d/0x81 [ 4223.822636] __perf_event_overflow+0x4f/0xf0 [ 4223.822636] handle_pmi_common+0x1ef/0x290 [ 4223.822636] ? __set_pte_vaddr+0x28/0x40 [ 4223.822637] ? flush_tlb_one_kernel+0xa/0x20 [ 4223.822637] ? __native_set_fixmap+0x24/0x30 [ 4223.822637] ? ghes_copy_tofrom_phys+0x70/0x100 [ 4223.822637] ? __ghes_peek_estatus.isra.16+0x49/0xa0 [ 4223.822637] intel_pmu_handle_irq+0xba/0x2b0 [ 4223.822638] perf_event_nmi_handler+0x24/0x40 [ 4223.822638] nmi_handle+0x4d/0xf0 [ 4223.822638] default_do_nmi+0x49/0x100 [ 4223.822638] exc_nmi+0x134/0x180 [ 4223.822639] end_repeat_nmi+0x16/0x67 [ 4223.822639] RIP: 0010:qi_submit_sync+0x2c0/0x490 [ 4223.822639] Code: 48 be 00 00 00 00 00 08 00 00 49 85 74 24 20 0f 95 c1 48 8b 57 10 83 c1 04 83 3c 1a 03 0f 84 a2 01 00 00 49 8b 04 24 8b 70 34 <40> f6 c6 10 74 17 49 8b 04 24 8b 80 80 00 00 00 89 c2 d3 fa 41 39 [ 4223.822640] RSP: 0018:ffffc4f074f0bbb8 EFLAGS: 00000093 [ 4223.822640] RAX: ffffc4f040059000 RBX: 0000000000000014 RCX: 0000000000000005 [ 4223.822640] RDX: ffff9f3841315800 RSI: 0000000000000000 RDI: ffff9f38401a8340 [ 4223.822641] RBP: ffff9f38401a8340 R08: ffffc4f074f0bc00 R09: 0000000000000000 [ 4223.822641] R10: 0000000000000010 R11: 0000000000000018 R12: ffff9f384005e200 [ 4223.822641] R13: 0000000000000004 R14: 0000000000000046 R15: 0000000000000004 [ 4223.822641] ? qi_submit_sync+0x2c0/0x490 [ 4223.822642] ? qi_submit_sync+0x2c0/0x490 [ 4223.822642] </NMI> [ 4223.822642] qi_flush_dev_iotlb+0xb1/0xd0 [ 4223.822642] __dmar_remove_one_dev_info+0x224/0x250 [ 4223.822643] dmar_remove_one_dev_info+0x3e/0x50 [ 4223.822643] intel_iommu_release_device+0x1f/0x30 [ 4223.822643] iommu_release_device+0x33/0x60 [ 4223.822643] iommu_bus_notifier+0x7f/0x90 [ 4223.822644] blocking_notifier_call_chain+0x60/0x90 [ 4223.822644] device_del+0x2e5/0x420 [ 4223.822644] pci_remove_bus_device+0x70/0x110 [ 4223.822644] pciehp_unconfigure_device+0x7c/0x130 [ 4223.822644] pciehp_disable_slot+0x6b/0x100 [ 4223.822645] pciehp_handle_presence_or_link_change+0xd8/0x320 [ 4223.822645] pciehp_ist+0x176/0x180 [ 4223.822645] ? irq_finalize_oneshot.part.50+0x110/0x110 [ 4223.822645] irq_thread_fn+0x19/0x50 [ 4223.822646] irq_thread+0x104/0x190 [ 4223.822646] ? irq_forced_thread_fn+0x90/0x90 [ 4223.822646] ? irq_thread_check_affinity+0xe0/0xe0 [ 4223.822646] kthread+0x114/0x130 [ 4223.822647] ? __kthread_cancel_work+0x40/0x40 [ 4223.822647] ret_from_fork+0x1f/0x30 [ 4223.822647] Kernel Offset: 0x6400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Such issue could be triggered by all kinds of regular surprise removal hotplug operation. like: 1. pull EP(endpoint device) out directly. 2. turn off EP's power. 3. bring the link down. etc. this patch aims to work for regular safe removal and surprise removal unplug. these hot unplug handling process could be optimized for fix the ATS Invalidation hang issue by calling pci_dev_is_disconnected() in function devtlb_invalidation_with_pasid() to check target device state to avoid sending meaningless ATS Invalidation request to iommu when device is gone. (see IMPLEMENTATION NOTE in PCIe spec r6.1 section 10.3.1) For safe removal, device wouldn't be removed until the whole software handling process is done, it wouldn't trigger the hard lock up issue caused by too long ATS Invalidation timeout wait. In safe removal path, device state isn't set to pci_channel_io_perm_failure in pciehp_unconfigure_device() by checking 'presence' parameter, calling pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will return false there, wouldn't break the function. For surprise removal, device state is set to pci_channel_io_perm_failure in pciehp_unconfigure_device(), means device is already gone (disconnected) call pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will return true to break the function not to send ATS Invalidation request to the disconnected device blindly, thus avoid to trigger further ITE fault, and ITE fault will block all invalidation request to be handled. furthermore retry the timeout request could trigger hard lockup. safe removal (present) & surprise removal (not present) pciehp_ist() pciehp_handle_presence_or_link_change() pciehp_disable_slot() remove_board() pciehp_unconfigure_device(presence) { if (!presence) pci_walk_bus(parent, pci_dev_set_disconnected, NULL); } this patch works for regular safe removal and surprise removal of ATS capable endpoint on PCIe switch downstream ports. Fixes: 6f7db75e1c46 ("iommu/vt-d: Add second level page table interface") Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Tested-by: Haorong Ye <yehaorong@bytedance.com> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com> Link: https://lore.kernel.org/r/20240301080727.3529832-3-haifeng.zhao@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-27iommu/amd: Mark interrupt as managedMario Limonciello1-0/+3
[ Upstream commit 0feda94c868d396fac3b3cb14089d2d989a07c72 ] On many systems that have an AMD IOMMU the following sequence of warnings is observed during bootup. ``` pci 0000:00:00.2 can't derive routing for PCI INT A pci 0000:00:00.2: PCI INT A: not connected ``` This series of events happens because of the IOMMU initialization sequence order and the lack of _PRT entries for the IOMMU. During initialization the IOMMU driver first enables the PCI device using pci_enable_device(). This will call acpi_pci_irq_enable() which will check if the interrupt is declared in a PCI routing table (_PRT) entry. According to the PCI spec [1] these routing entries are only required under PCI root bridges: The _PRT object is required under all PCI root bridges The IOMMU is directly connected to the root complex, so there is no parent bridge to look for a _PRT entry. The first warning is emitted since no entry could be found in the hierarchy. The second warning is then emitted because the interrupt hasn't yet been configured to any value. The pin was configured in pci_read_irq() but the byte in PCI_INTERRUPT_LINE return 0xff which means "Unknown". After that sequence of events pci_enable_msi() is called and this will allocate an interrupt. That is both of these warnings are totally harmless because the IOMMU uses MSI for interrupts. To avoid even trying to probe for a _PRT entry mark the IOMMU as IRQ managed. This avoids both warnings. Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html?highlight=_prt#prt-pci-routing-table [1] Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Fixes: cffe0a2b5a34 ("x86, irq: Keep balance of IOAPIC pin reference count") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240122233400.1802-1-mario.limonciello@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-26iommu/arm-smmu-qcom: Add missing GMU entry to match tableRob Clark1-0/+1
commit afc95681c3068956fed1241a1ff1612c066c75ac upstream. In some cases the firmware expects cbndx 1 to be assigned to the GMU, so we also want the default domain for the GMU to be an identy domain. This way it does not get a context bank assigned. Without this, both of_dma_configure() and drm/msm's iommu_domain_attach() will trigger allocating and configuring a context bank. So GMU ends up attached to both cbndx 1 and later cbndx 2. This arrangement seemingly confounds and surprises the firmware if the GPU later triggers a translation fault, resulting (on sc8280xp / lenovo x13s, at least) in the SMMU getting wedged and the GPU stuck without memory access. Cc: stable@vger.kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20231210180655.75542-1-robdclark@gmail.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08iommu/vt-d: Add MTL to quirk list to skip TE disablingAbdul Halim, Mohd Syazwan1-1/+1
commit 85b80fdffa867d75dfb9084a839e7949e29064e8 upstream. The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before switching address translation on or off and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. Add MTL to the quirk list for those devices and skips TE disabling if the qurik hits. Fixes: b1012ca8dc4f ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu") Cc: stable@vger.kernel.org Signed-off-by: Abdul Halim, Mohd Syazwan <mohd.syazwan.abdul.halim@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20231116022324.30120-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19iommu/vt-d: Fix to flush cache of PASID directory tableYanfei Xu1-1/+1
[ Upstream commit 8a3b8e63f8371c1247b7aa24ff9c5312f1a6948b ] Even the PCI devices don't support pasid capability, PASID table is mandatory for a PCI device in scalable mode. However flushing cache of pasid directory table for these devices are not taken after pasid table is allocated as the "size" of table is zero. Fix it by calculating the size by page order. Found this when reading the code, no real problem encountered for now. Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency") Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Yanfei Xu <yanfei.xu@intel.com> Link: https://lore.kernel.org/r/20230616081045.721873-1-yanfei.xu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19iommu/qcom: Disable and reset context bank before programmingAngeloGioacchino Del Regno1-0/+7
[ Upstream commit 9f3fef23d9b5a858a6e6d5f478bb1b6b76265e76 ] Writing the new TTBRs, TCRs and MAIRs on a previously enabled context bank may trigger a context fault, resulting in firmware driven AP resets: change the domain initialization programming sequence to disable the context bank(s) and to also clear the related fault address (CB_FAR) and fault status (CB_FSR) registers before writing new values to TTBR0/1, TCR/TCR2, MAIR0/1. Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu") Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Link: https://lore.kernel.org/r/20230622092742.74819-4-angelogioacchino.delregno@collabora.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09iommu/amd: Don't block updates to GATag if guest mode is onJoao Martins1-2/+1
[ Upstream commit ed8a2f4ddef2eaaf864ab1efbbca9788187036ab ] On KVM GSI routing table updates, specially those where they have vIOMMUs with interrupt remapping enabled (to boot >255vcpus setups without relying on KVM_FEATURE_MSI_EXT_DEST_ID), a VMM may update the backing VF MSIs with a new VCPU affinity. On AMD with AVIC enabled, the new vcpu affinity info is updated via: avic_pi_update_irte() irq_set_vcpu_affinity() amd_ir_set_vcpu_affinity() amd_iommu_{de}activate_guest_mode() Where the IRTE[GATag] is updated with the new vcpu affinity. The GATag contains VM ID and VCPU ID, and is used by IOMMU hardware to signal KVM (via GALog) when interrupt cannot be delivered due to vCPU is in blocking state. The issue is that amd_iommu_activate_guest_mode() will essentially only change IRTE fields on transitions from non-guest-mode to guest-mode and otherwise returns *with no changes to IRTE* on already configured guest-mode interrupts. To the guest this means that the VF interrupts remain affined to the first vCPU they were first configured, and guest will be unable to issue VF interrupts and receive messages like this from spurious interrupts (e.g. from waking the wrong vCPU in GALog): [ 167.759472] __common_interrupt: 3.34 No irq handler for vector [ 230.680927] mlx5_core 0000:00:02.0: mlx5_cmd_eq_recover:247:(pid 3122): Recovered 1 EQEs on cmd_eq [ 230.681799] mlx5_core 0000:00:02.0: wait_func_handle_exec_timeout:1113:(pid 3122): cmd[0]: CREATE_CQ(0x400) recovered after timeout [ 230.683266] __common_interrupt: 3.34 No irq handler for vector Given the fact that amd_ir_set_vcpu_affinity() uses amd_iommu_activate_guest_mode() underneath it essentially means that VCPU affinity changes of IRTEs are nops. Fix it by dropping the check for guest-mode at amd_iommu_activate_guest_mode(). Same thing is applicable to amd_iommu_deactivate_guest_mode() although, even if the IRTE doesn't change underlying DestID on the host, the VFIO IRQ handler will still be able to poke at the right guest-vCPU. Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code") Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20230419201154.83880-2-joao.m.martins@oracle.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09iommu/rockchip: Fix unwind goto issueChao Wang1-6/+8
[ Upstream commit ec014683c564fb74fc68e8f5e84691d3b3839d24 ] Smatch complains that drivers/iommu/rockchip-iommu.c:1306 rk_iommu_probe() warn: missing unwind goto? The rk_iommu_probe function, after obtaining the irq value through platform_get_irq, directly returns an error if the returned value is negative, without releasing any resources. Fix this by adding a new error handling label "err_pm_disable" and use a goto statement to redirect to the error handling process. In order to preserve the original semantics, set err to the value of irq. Fixes: 1aa55ca9b14a ("iommu/rockchip: Move irq request past pm_runtime_enable") Signed-off-by: Chao Wang <D202280639@hust.edu.cn> Reviewed-by: Dongliang Mu <dzm91@hust.edu.cn> Reviewed-by: Heiko Stuebner <heiko@sntech.de> Link: https://lore.kernel.org/r/20230417030421.2777-1-D202280639@hust.edu.cn Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-30iommu/arm-smmu-v3: Acknowledge pri/event queue overflow if anyTomas Krcka1-5/+14
[ Upstream commit 67ea0b7ce41844eae7c10bb04dfe66a23318c224 ] When an overflow occurs in the PRI queue, the SMMU toggles the overflow flag in the PROD register. To exit the overflow condition, the PRI thread is supposed to acknowledge it by toggling this flag in the CONS register. Unacknowledged overflow causes the queue to stop adding anything new. Currently, the priq thread always writes the CONS register back to the SMMU after clearing the queue. The writeback is not necessary if the OVFLG in the PROD register has not been changed, no overflow has occured. This commit checks the difference of the overflow flag between CONS and PROD register. If it's different, toggles the OVACKFLG flag in the CONS register and write it to the SMMU. The situation is similar for the event queue. The acknowledge register is also toggled after clearing the event queue but never propagated to the hardware. This would only be done the next time when executing evtq thread. Unacknowledged event queue overflow doesn't affect the event queue, because the SMMU still adds elements to that queue when the overflow condition is active. But it feel nicer to keep SMMU in sync when possible, so use the same way here as well. Signed-off-by: Tomas Krcka <krckatom@amazon.de> Link: https://lore.kernel.org/r/20230329123420.34641-1-tomas.krcka@gmail.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-30iommu/arm-smmu-qcom: Limit the SMR groups to 128Manivannan Sadhasivam1-1/+15
[ Upstream commit 12261134732689b7e30c59db9978f81230965181 ] Some platforms support more than 128 stream matching groups than what is defined by the ARM SMMU architecture specification. But due to some unknown reasons, those additional groups don't exhibit the same behavior as the architecture supported ones. For instance, the additional groups will not detect the quirky behavior of some firmware versions intercepting writes to S2CR register, thus skipping the quirk implemented in the driver and causing boot crash. So let's limit the groups to 128 for now until the issue with those groups are fixed and issue a notice to users in that case. Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20230327080029.11584-1-manivannan.sadhasivam@linaro.org [will: Reworded the comment slightly] Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17iommu/amd: Fix "Guest Virtual APIC Table Root Pointer" configuration in IRTEKishon Vijay Abraham I1-2/+2
commit ccc62b827775915a9b82db42a29813d04f92df7a upstream. commit b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code") while refactoring guest virtual APIC activation/de-activation code, stored information for activate/de-activate in "struct amd_ir_data". It used 32-bit integer data type for storing the "Guest Virtual APIC Table Root Pointer" (ga_root_ptr), though the "ga_root_ptr" is actually a 40-bit field in IRTE (Interrupt Remapping Table Entry). This causes interrupts from PCIe devices to not reach the guest in the case of PCIe passthrough with SME (Secure Memory Encryption) enabled as _SME_ bit in the "ga_root_ptr" is lost before writing it to the IRTE. Fix it by using 64-bit data type for storing the "ga_root_ptr". While at that also change the data type of "ga_tag" to u32 in order to match the IOMMU spec. Fixes: b9c6ff94e43a ("iommu/amd: Re-factor guest virtual APIC (de-)activation code") Cc: stable@vger.kernel.org # v5.4+ Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Link: https://lore.kernel.org/r/20230405130317.9351-1-kvijayab@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17iommu/amd: Add a length limitation for the ivrs_acpihid command-line parameterGavrilov Ilia1-1/+15
[ Upstream commit b6b26d86c61c441144c72f842f7469bb686e1211 ] The 'acpiid' buffer in the parse_ivrs_acpihid function may overflow, because the string specifier in the format string sscanf() has no width limitation. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Ilia.Gavrilov <Ilia.Gavrilov@infotecs.ru> Reviewed-by: Kim Phillips <kim.phillips@amd.com> Link: https://lore.kernel.org/r/20230202082719.1513849-1-Ilia.Gavrilov@infotecs.ru Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17iommu/vt-d: Fix PASID directory pointer coherencyJacob Pan1-0/+7
[ Upstream commit 194b3348bdbb7db65375c72f3f774aee4cc6614e ] On platforms that do not support IOMMU Extended capability bit 0 Page-walk Coherency, CPU caches are not snooped when IOMMU is accessing any translation structures. IOMMU access goes only directly to memory. Intel IOMMU code was missing a flush for the PASID table directory that resulted in the unrecoverable fault as shown below. This patch adds clflush calls whenever allocating and updating a PASID table directory to ensure cache coherency. On the reverse direction, there's no need to clflush the PASID directory pointer when we deactivate a context entry in that IOMMU hardware will not see the old PASID directory pointer after we clear the context entry. PASID directory entries are also never freed once allocated. DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Read NO_PASID] Request device [00:0d.2] fault addr 0x1026a4000 [fault reason 0x51] SM: Present bit in Directory Entry is clear DMAR: Dump dmar1 table entries for IOVA 0x1026a4000 DMAR: scalable mode root entry: hi 0x0000000102448001, low 0x0000000101b3e001 DMAR: context entry: hi 0x0000000000000000, low 0x0000000101b4d401 DMAR: pasid dir entry: 0x0000000101b4e001 DMAR: pasid table entry[0]: 0x0000000000000109 DMAR: pasid table entry[1]: 0x0000000000000001 DMAR: pasid table entry[2]: 0x0000000000000000 DMAR: pasid table entry[3]: 0x0000000000000000 DMAR: pasid table entry[4]: 0x0000000000000000 DMAR: pasid table entry[5]: 0x0000000000000000 DMAR: pasid table entry[6]: 0x0000000000000000 DMAR: pasid table entry[7]: 0x0000000000000000 DMAR: PTE not present at level 4 Cc: <stable@vger.kernel.org> Fixes: 0bbeb01a4faf ("iommu/vt-d: Manage scalalble mode PASID tables") Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reported-by: Sukumar Ghorai <sukumar.ghorai@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Link: https://lore.kernel.org/r/20230209212843.1788125-1-jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17iommu/vt-d: Fix lockdep splat in intel_pasid_get_entry()Lu Baolu1-8/+13
[ Upstream commit 803766cbf85fb8edbf896729bbefc2d38dcf1e0a ] The pasid_lock is used to synchronize different threads from modifying a same pasid directory entry at the same time. It causes below lockdep splat. [ 83.296538] ======================================================== [ 83.296538] WARNING: possible irq lock inversion dependency detected [ 83.296539] 5.12.0-rc3+ #25 Tainted: G W [ 83.296539] -------------------------------------------------------- [ 83.296540] bash/780 just changed the state of lock: [ 83.296540] ffffffff82b29c98 (device_domain_lock){..-.}-{2:2}, at: iommu_flush_dev_iotlb.part.0+0x32/0x110 [ 83.296547] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 83.296547] (pasid_lock){+.+.}-{2:2} [ 83.296548] and interrupts could create inverse lock ordering between them. [ 83.296549] other info that might help us debug this: [ 83.296549] Chain exists of: device_domain_lock --> &iommu->lock --> pasid_lock [ 83.296551] Possible interrupt unsafe locking scenario: [ 83.296551] CPU0 CPU1 [ 83.296552] ---- ---- [ 83.296552] lock(pasid_lock); [ 83.296553] local_irq_disable(); [ 83.296553] lock(device_domain_lock); [ 83.296554] lock(&iommu->lock); [ 83.296554] <Interrupt> [ 83.296554] lock(device_domain_lock); [ 83.296555] *** DEADLOCK *** Fix it by replacing the pasid_lock with an atomic exchange operation. Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210320020916.640115-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Stable-dep-of: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()Christophe JAILLET1-1/+3
[ Upstream commit 142e821f68cf5da79ce722cb9c1323afae30e185 ] A clk, prepared and enabled in mtk_iommu_v1_hw_init(), is not released in the error handling path of mtk_iommu_v1_probe(). Add the corresponding clk_disable_unprepare(), as already done in the remove function. Fixes: b17336c55d89 ("iommu/mediatek: add support for mtk iommu generation one HW") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Yong Wu <yong.wu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://lore.kernel.org/r/593e7b7d97c6e064b29716b091a9d4fd122241fb.1671473163.git.christophe.jaillet@wanadoo.fr Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18iommu/mediatek-v1: Add error handle for mtk_iommu_probeYong Wu1-4/+18
[ Upstream commit ac304c070c54413efabf29f9e73c54576d329774 ] In the original code, we lack the error handle. This patch adds them. Signed-off-by: Yong Wu <yong.wu@mediatek.com> Link: https://lore.kernel.org/r/20210412064843.11614-2-yong.wu@mediatek.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Stable-dep-of: 142e821f68cf ("iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid optionsKim Phillips1-25/+54
[ Upstream commit 1198d2316dc4265a97d0e8445a22c7a6d17580a4 ] Currently, these options cause the following libkmod error: libkmod: ERROR ../libkmod/libkmod-config.c:489 kcmdline_parse_result: \ Ignoring bad option on kernel command line while parsing module \ name: 'ivrs_xxxx[XX:XX' Fix by introducing a new parameter format for these options and throw a warning for the deprecated format. Users are still allowed to omit the PCI Segment if zero. Adding a Link: to the reason why we're modding the syntax parsing in the driver and not in libkmod. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-modules/20200310082308.14318-2-lucas.demarchi@intel.com/ Reported-by: Kim Phillips <kim.phillips@amd.com> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Kim Phillips <kim.phillips@amd.com> Link: https://lore.kernel.org/r/20220919155638.391481-2-kim.phillips@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commandsSuravee Suthikulpanit1-17/+27
[ Upstream commit bbe3a106580c21bc883fb0c9fa3da01534392fe8 ] By default, PCI segment is zero and can be omitted. To support system with non-zero PCI segment ID, modify the parsing functions to allow PCI segment ID. Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Link: https://lore.kernel.org/r/20220706113825.25582-33-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Stable-dep-of: 1198d2316dc4 ("iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid options") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-14iommu/amd: Fix ivrs_acpihid cmdline parsing codeKim Phillips1-0/+7
commit 5f18e9f8868c6d4eae71678e7ebd4977b7d8c8cf upstream. The second (UID) strcmp in acpi_dev_hid_uid_match considers "0" and "00" different, which can prevent device registration. Have the AMD IOMMU driver's ivrs_acpihid parsing code remove any leading zeroes to make the UID strcmp succeed. Now users can safely specify "AMDxxxxx:00" or "AMDxxxxx:0" and expect the same behaviour. Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Cc: stable@vger.kernel.org Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220919155638.391481-1-kim.phillips@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-14iommu/sun50i: Remove IOMMU_DOMAIN_IDENTITYJason Gunthorpe1-1/+0
[ Upstream commit ef5bb8e7a7127218f826b9ccdf7508e7a339f4c2 ] This driver treats IOMMU_DOMAIN_IDENTITY the same as UNMANAGED, which cannot possibly be correct. UNMANAGED domains are required to start out blocking all DMAs. This seems to be what this driver does as it allocates a first level 'dt' for the IO page table that is 0 filled. Thus UNMANAGED looks like a working IO page table, and so IDENTITY must be a mistake. Remove it. Fixes: 4100b8c229b3 ("iommu: Add Allwinner H6 IOMMU driver") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/0-v1-97f0adf27b5e+1f0-s50_identity_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-14iommu/fsl_pamu: Fix resource leak in fsl_pamu_probe()Yuan Can1-1/+1
[ Upstream commit 73f5fc5f884ad0c5f7d57f66303af64f9f002526 ] The fsl_pamu_probe() returns directly when create_csd() failed, leaving irq and memories unreleased. Fix by jumping to error if create_csd() returns error. Fixes: 695093e38c3e ("iommu/fsl: Freescale PAMU driver and iommu implementation.") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221121082022.19091-1-yuancan@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-14iommu/amd: Fix pci device refcount leak in ppr_notifier()Yang Yingliang1-0/+1
[ Upstream commit 6cf0981c2233f97d56938d9d61845383d6eb227c ] As comment of pci_get_domain_bus_and_slot() says, it returns a pci device with refcount increment, when finish using it, the caller must decrement the reference count by calling pci_dev_put(). So call it before returning from ppr_notifier() to avoid refcount leak. Fixes: daae2d25a477 ("iommu/amd: Don't copy GCR3 table root pointer") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Link: https://lore.kernel.org/r/20221118093604.216371-1-yangyingliang@huawei.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>