kernel/linux.git/drivers/iommu/arm, branch v6.1.174

iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency

2026-03-04T12:20:26+00:00

[ Upstream commit df180b1a4cc51011c5f8c52c7ec02ad2e42962de ]

The SMMU CMDQ lock is highly contentious when there are multiple CPUs issuing commands and the queue is nearly full.

The lock has the following states:
 - 0: Unlocked
 - >0: Shared lock held with count
 - INT_MIN+N: Exclusive lock held, where N is the # of shared waiters
 - INT_MIN: Exclusive lock held, no shared waiters

When multiple CPUs are polling for space in the queue, they attempt to grab the exclusive lock to update the cons pointer from the hardware. If they fail to get the lock, they will spin until the cons pointer is updated by another CPU.

The current code allows the possibility of shared lock starvation if there is a constant stream of CPUs trying to grab the exclusive lock. This leads to severe latency issues and soft lockups.

Consider the following scenario where CPU1's attempt to acquire the shared lock is starved by CPU2 and CPU0 contending for the exclusive lock.

CPU0 (exclusive)  | CPU1 (shared)     | CPU2 (exclusive)    | `cmdq->lock`
--------------------------------------------------------------------------
trylock() //takes |                   |                     | 0
                  | shared_lock()     |                     | INT_MIN
                  | fetch_inc()       |                     | INT_MIN
                  | no return         |                     | INT_MIN + 1
                  | spins // VAL >= 0 |                     | INT_MIN + 1
unlock()          | spins...          |                     | INT_MIN + 1
set_release(0)    | spins...          |                     | 0 see[NOTE]
(done)            | (sees 0)          | trylock() // takes  | 0
                  | *exits loop*      | cmpxchg(0, INT_MIN) | 0
                  |                   | *cuts in*           | INT_MIN
                  | cmpxchg(0, 1)     |                     | INT_MIN
                  | fails // != 0     |                     | INT_MIN
                  | spins // VAL >= 0 |                     | INT_MIN
                  | *starved*         |                     | INT_MIN

[NOTE] The current code resets the exclusive lock to 0 regardless of the state of the lock. This causes two problems:
 1. It opens the possibility of back-to-back exclusive locks and the downstream effect of starving the shared lock.
 2. The count of shared lock waiters is lost.

To mitigate this, we release the exclusive lock by only clearing the sign bit while retaining the shared lock waiter count as a way to avoid starving the shared lock waiters.

Also deleted the cmpxchg loop while trying to acquire the shared lock as it is not needed. The waiters can see the positive lock count and proceed immediately after the exclusive lock is released.

The exclusive lock is not starved, in that submitters will try the exclusive lock first when new space becomes available.

Reviewed-by: Mostafa Saleh
Reviewed-by: Nicolin Chen
Signed-off-by: Alexander Grest
Signed-off-by: Jacob Pan
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
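
The lock-word transitions described above can be sketched in userspace C. This is only an illustration under stated assumptions: it uses C11 atomics in place of the kernel's atomic helpers, the type and function names (cmdq_lock_t, cmdq_exclusive_unlock_old() and so on) are hypothetical and not taken from the driver, and the shared-lock spin is split into a "begin" and a "poll" step so the states can be inspected one at a time.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = unlocked, >0 = shared holders,
 * INT_MIN + N = exclusive holder with N shared waiters. */
typedef struct {
	atomic_int lock;
} cmdq_lock_t;

/* The exclusive lock can only be taken when the word is exactly 0. */
static bool cmdq_exclusive_trylock(cmdq_lock_t *l)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->lock, &expected,
						       INT_MIN,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Old release: store 0, which discards the waiter count and lets
 * another exclusive locker cut in ahead of the shared waiters. */
static void cmdq_exclusive_unlock_old(cmdq_lock_t *l)
{
	atomic_store_explicit(&l->lock, 0, memory_order_release);
}

/* New release: clear only the sign bit, so INT_MIN + N becomes N. */
static void cmdq_exclusive_unlock(cmdq_lock_t *l)
{
	int old = atomic_fetch_and_explicit(&l->lock, INT_MAX,
					    memory_order_release);

	assert(old < 0); /* must have been held exclusively */
	(void)old;
}

/* Register as a shared user; true means the lock was obtained at once. */
static bool cmdq_shared_lock_begin(cmdq_lock_t *l)
{
	return atomic_fetch_add_explicit(&l->lock, 1,
					 memory_order_relaxed) >= 0;
}

/* A registered waiter may proceed once the word is positive. */
static bool cmdq_shared_lock_poll(cmdq_lock_t *l)
{
	return atomic_load_explicit(&l->lock, memory_order_acquire) > 0;
}
```

With the old release, a waiter registered at INT_MIN + 1 sees the word drop to 0 and a second cmdq_exclusive_trylock() can succeed before the waiter reacts. With the sign-bit release the word goes straight to 1, so the waiter's poll succeeds and a competing exclusive trylock fails.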

iommu/qcom: fix device leak on of_xlate()

2026-01-11T14:19:24+00:00

[ Upstream commit 6a3908ce56e6879920b44ef136252b2f0c954194 ]

Make sure to drop the reference taken to the iommu platform device when looking up its driver data during of_xlate().

Note that commit e2eae09939a8 ("iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()") fixed the leak in a couple of error paths, but the reference is still leaking on success and late failures.

Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu")
Cc: stable@vger.kernel.org # 4.14: e2eae09939a8
Cc: Rob Clark
Cc: Yu Kuai
Acked-by: Robin Murphy
Signed-off-by: Johan Hovold
Signed-off-by: Joerg Roedel
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
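
The leak pattern can be illustrated with a toy reference-counting model in plain C. Everything here is hypothetical (struct toy_device, toy_find_device(), toy_put_device(), toy_of_xlate()); it is not the driver code, only a sketch of one way to keep the count balanced: drop the reference as soon as the driver data has been read, so that the success path and every later failure path are covered by a single put.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted platform device. */
struct toy_device {
	int refcount;
	void *drvdata;
};

/* Lookup takes a reference that the caller must drop. */
static struct toy_device *toy_find_device(struct toy_device *registered)
{
	registered->refcount++;
	return registered;
}

static void toy_put_device(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/* Returns 0 on success and -1 on failure; the reference count is
 * unchanged on return in both cases. */
static int toy_of_xlate(struct toy_device *registered,
			unsigned int asid, unsigned int max_asid)
{
	struct toy_device *dev = toy_find_device(registered);
	void *data = dev->drvdata;

	/* Only the driver data is needed from here on, so the device
	 * reference can be dropped before any further checks. */
	toy_put_device(dev);

	if (!data)
		return -1;	/* early failure */
	if (asid > max_asid)
		return -1;	/* late failure */

	return 0;		/* success */
}
```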

iommu/qcom: Index contexts by asid number to allow asid 0

2026-01-11T14:19:24+00:00

[ Upstream commit ec5601661bfcdc206e6ceba1b97837e763dab1ba ]

This driver was indexing the contexts by asid-1, which is probably done under the assumption that the first ASID is always 1. Unfortunately this is not always true: at least for MSM8956 and MSM8976's GPU IOMMU, the gpu_user context's ASID number is zero. To allow using a zero asid number, index the contexts by `asid` instead of by `asid - 1`.

While at it, also enhance human readability by renaming the `num_ctxs` member of struct qcom_iommu_dev to `max_asid`.

Signed-off-by: AngeloGioacchino Del Regno
Reviewed-by: Konrad Dybcio
Link: https://lore.kernel.org/r/20230622092742.74819-5-angelogioacchino.delregno@collabora.com
Signed-off-by: Will Deacon
Stable-dep-of: 6a3908ce56e6 ("iommu/qcom: fix device leak on of_xlate()")
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
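
A small standalone sketch of the indexing change, using hypothetical toy types rather than the driver's structures: with `asid - 1` as the index, ASID 0 has no valid slot, whereas indexing by `asid` needs one extra slot (0..max_asid) but accepts ASID 0 directly.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ASID 4u	/* hypothetical upper bound for the sketch */

struct toy_ctx {
	unsigned int asid;
};

struct toy_iommu_dev {
	unsigned int max_asid;
	/* Indexed by asid, so max_asid + 1 slots are required. */
	struct toy_ctx *ctxs[TOY_MAX_ASID + 1];
};

/* Store a context at its own ASID; returns -1 if out of range. */
static int toy_register_ctx(struct toy_iommu_dev *d, struct toy_ctx *ctx)
{
	assert(d->max_asid <= TOY_MAX_ASID);
	if (ctx->asid > d->max_asid)
		return -1;
	d->ctxs[ctx->asid] = ctx;
	return 0;
}

/* Look a context up by ASID; ASID 0 is a valid index. */
static struct toy_ctx *toy_lookup_ctx(struct toy_iommu_dev *d,
				      unsigned int asid)
{
	if (asid > d->max_asid)
		return NULL;
	return d->ctxs[asid];
}
```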

iommu/qcom: Use the asid read from device-tree if specified

2026-01-11T14:19:24+00:00

[ Upstream commit fcf226f1f7083cba76af47bf8dd764b68b149cd2 ]

As specified in this driver, the context banks are 0x1000 apart but on some SoCs the context number does not necessarily match this logic, hence we end up using the wrong ASID: keeping in mind that this IOMMU implementation relies heavily on SCM (TZ) calls, it is mandatory that we communicate the right context number.

Since this is all about how context banks are mapped in firmware, which may be board dependent (as a different firmware version may eventually change the expected context bank numbers), introduce a new property "qcom,ctx-asid": when found, the ASID will be forced as read from the devicetree.

When "qcom,ctx-asid" is not found, this driver retains the previous behavior so as to avoid breaking older devicetrees or systems that do not require forcing ASID numbers.

Signed-off-by: Marijn Suijten
[Marijn: Rebased over next-20221111]
Signed-off-by: AngeloGioacchino Del Regno
Reviewed-by: Konrad Dybcio
Link: https://lore.kernel.org/r/20230622092742.74819-3-angelogioacchino.delregno@collabora.com
Signed-off-by: Will Deacon
Stable-dep-of: 6a3908ce56e6 ("iommu/qcom: fix device leak on of_xlate()")
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman
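
The selection rule described above can be sketched as a pure function. The struct below is a hypothetical stand-in for a parsed devicetree node (it is not a kernel type), and the fallback assumes, as the message states, that context banks are 0x1000 apart so the bank offset divided by 0x1000 gives the ASID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-parsed view of a context bank node. */
struct toy_ctx_node {
	uint32_t reg;		/* first "reg" cell: context bank offset */
	bool has_ctx_asid;	/* is "qcom,ctx-asid" present? */
	uint32_t ctx_asid;	/* its value, valid only if present */
};

/* Prefer the explicit ASID; otherwise derive it from the bank offset. */
static uint32_t toy_get_asid(const struct toy_ctx_node *np)
{
	assert(np != NULL);

	if (np->has_ctx_asid)
		return np->ctx_asid;

	return np->reg / 0x1000;
}
```

For a bank at offset 0x2000, the sketch yields ASID 2 when the property is absent and the property value when it is present, which matches the backward-compatible behavior the message describes.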

iommu/arm-smmu: Convert to platform remove callback returning void

2026-01-11T14:19:24+00:00

[ Upstream commit 62565a77c2323d32f2be737455729ac7d3efe6ad ]

The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void.

Trivially convert this driver from always returning zero in the remove callback to the void returning variant.

Signed-off-by: Uwe Kleine-König
Link: https://lore.kernel.org/r/20230321084125.337021-5-u.kleine-koenig@pengutronix.de
Signed-off-by: Joerg Roedel
Stable-dep-of: 6a3908ce56e6 ("iommu/qcom: fix device leak on of_xlate()")
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

iommu/arm-smmu: Drop if with an always false condition

2026-01-11T14:19:24+00:00

[ Upstream commit a2972cb89935160bfe515b15d28a77694723ac06 ]

The remove and shutdown callbacks are only called after probe completed successfully. In this case platform_set_drvdata() was called with a non-NULL argument and so smmu is never NULL. Other functions in this driver also don't check for smmu being non-NULL before using it.

Also note that returning an error code from a remove callback doesn't result in the device staying bound. It's still removed and devm allocated resources are freed (among others *smmu and the register mapping). So after an early exit the iommu device stayed around, and using it would probably oops.

Signed-off-by: Uwe Kleine-König
Reviewed-by: Robin Murphy
Link: https://lore.kernel.org/r/20230321084125.337021-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Joerg Roedel
Stable-dep-of: 6a3908ce56e6 ("iommu/qcom: fix device leak on of_xlate()")
Signed-off-by: Sasha Levin
Signed-off-by: Greg Kroah-Hartman

iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal

2026-01-11T14:18:30+00:00

[ Upstream commit 5583a55e074b33ccd88ac0542fd7cd656a7e2c8c ]

Some platforms (e.g. SC8280XP and X1E) support more than 128 stream matching groups. This is more than what is defined as maximum by the ARM SMMU architecture specification. Commit 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128") disabled use of the additional groups because they don't exhibit the same behavior as the architecture supported ones.

It seems like this is just another quirk of the hypervisor: When running bare-metal without the hypervisor, the additional groups appear to behave just like all others. The boot firmware uses some of the additional groups, so ignoring them in this situation leads to stream match conflicts whenever we allocate a new SMR group for the same SID.

The workaround exists primarily because the bypass quirk detection fails when using a S2CR register from the additional matching groups, so let's perform the test with the last reliable S2CR (127) and then limit the number of SMR groups only if we detect that we are running below the hypervisor (because of the bypass quirk).

Fixes: 122611347326 ("iommu/arm-smmu-qcom: Limit the SMR groups to 128")
Signed-off-by: Stephan Gerhold
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
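
The two decisions in the last paragraph can be written as small pure functions. This is a sketch with hypothetical names (toy_quirk_test_index(), toy_usable_groups()), not the driver code: the first picks which S2CR index to probe for the bypass quirk, the second applies the 128-group limit only when the quirk (that is, the hypervisor) was detected.

```c
#include <assert.h>
#include <stdbool.h>

/* Architected maximum number of stream matching groups. */
#define TOY_ARCH_MAX_SMR_GROUPS 128u

/* Index of the S2CR to use for the bypass quirk test: the last group,
 * capped at the last reliable (architected) one, i.e. 127. */
static unsigned int toy_quirk_test_index(unsigned int num_groups)
{
	assert(num_groups > 0);

	if (num_groups > TOY_ARCH_MAX_SMR_GROUPS)
		return TOY_ARCH_MAX_SMR_GROUPS - 1;
	return num_groups - 1;
}

/* Number of groups the driver may use: limited to 128 only when the
 * bypass quirk was detected; bare-metal keeps all groups. */
static unsigned int toy_usable_groups(unsigned int num_groups,
				      bool bypass_quirk)
{
	if (bypass_quirk && num_groups > TOY_ARCH_MAX_SMR_GROUPS)
		return TOY_ARCH_MAX_SMR_GROUPS;
	return num_groups;
}
```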

iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids

2025-05-09T07:41:45+00:00

[ Upstream commit b00d24997a11c10d3e420614f0873b83ce358a34 ]

ASPEED VGA card has two built-in devices:
 0008:06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06)
 0008:07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)

Its topology looks like this:
 +-[0008:00]---00.0-[01-09]--+-00.0-[02-09]--+-00.0-[03]----00.0  Sandisk Corp Device 5017
                             |               +-01.0-[04]--
                             |               +-02.0-[05]----00.0  NVIDIA Corporation Device
                             |               +-03.0-[06-07]----00.0-[07]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
                             |               +-04.0-[08]----00.0  Renesas Technology Corp. uPD720201 USB 3.0 Host Controller
                             |               \-05.0-[09]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
                             \-00.1  PMC-Sierra Inc. Device 4028

The IORT logic populates two identical IDs into the fwspec->ids array via DMA aliasing in iort_pci_iommu_init() called by pci_for_each_dma_alias().

Though the SMMU driver had been able to handle this situation since commit 563b5cbe334e ("iommu/arm-smmu-v3: Cope with duplicated Stream IDs"), that got broken by the later commit cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure"), which ended up with allocating separate streams with the same stuffing.

On a kernel prior to v6.15-rc1, there has been an overlooked warning:
 pci 0008:07:00.0: vgaarb: setting as boot VGA device
 pci 0008:07:00.0: vgaarb: bridge control possible
 pci 0008:07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
 pcieport 0008:06:00.0: Adding to iommu group 14
 ast 0008:07:00.0: stream 67328 already in tree   <===== WARNING
 ast 0008:07:00.0: enabling device (0002 -> 0003)
 ast 0008:07:00.0: Using default configuration
 ast 0008:07:00.0: AST 2600 detected
 ast 0008:07:00.0: [drm] Using analog VGA
 ast 0008:07:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16
 [drm] Initialized ast 0.1.0 for 0008:07:00.0 on minor 0
 ast 0008:07:00.0: [drm] fb0: astdrmfb frame buffer device

With v6.15-rc, since the commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path"), the error returned with the warning is moved to the SMMU device probe flow:
 arm_smmu_probe_device+0x15c/0x4c0
 __iommu_probe_device+0x150/0x4f8
 probe_iommu_group+0x44/0x80
 bus_for_each_dev+0x7c/0x100
 bus_iommu_probe+0x48/0x1a8
 iommu_device_register+0xb8/0x178
 arm_smmu_device_probe+0x1350/0x1db0
which then fails the entire SMMU driver probe:
 pci 0008:06:00.0: Adding to iommu group 21
 pci 0008:07:00.0: stream 67328 already in tree
 arm-smmu-v3 arm-smmu-v3.9.auto: Failed to register iommu
 arm-smmu-v3 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22

Since the SMMU driver had already been expecting a potential duplicated Stream ID in arm_smmu_install_ste_for_dev(), change the arm_smmu_insert_master() routine to ignore a duplicated ID from the fwspec->ids array as well.

Note: this has been failing the iommu_device_probe() since 2021, although a recent iommu commit in v6.15-rc1 that moves iommu_device_probe() started to fail the SMMU driver probe. Since nobody has cared about DMA Alias support, leave that as it was but fix the fundamental iommu_device_probe() breakage.

Fixes: cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe
Reviewed-by: Jason Gunthorpe
Signed-off-by: Nicolin Chen
Link: https://lore.kernel.org/r/20250415185620.504299-1-nicolinc@nvidia.com
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin
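
The "ignore a duplicated ID" behavior can be sketched with a toy registry. This is an assumption-laden illustration, not the driver: a small linear array stands in for the rb tree in smmu->streams, masters are identified by a plain integer, all toy_* names are hypothetical, and no rollback of partially inserted IDs is attempted on failure. The point shown is the distinction between a Stream ID repeated within one master (skipped) and a Stream ID already owned by a different master (still an error).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_STREAMS 16

struct toy_stream {
	uint32_t sid;
	int owner;	/* hypothetical master identifier */
};

struct toy_registry {
	struct toy_stream streams[TOY_MAX_STREAMS];
	size_t count;
};

/* Linear search standing in for the tree lookup. */
static const struct toy_stream *toy_find(const struct toy_registry *r,
					 uint32_t sid)
{
	for (size_t i = 0; i < r->count; i++)
		if (r->streams[i].sid == sid)
			return &r->streams[i];
	return NULL;
}

/* Returns 0 on success, -1 if an ID belongs to another master or the
 * registry is full. Duplicates from the same master are ignored. */
static int toy_insert_master(struct toy_registry *r, int owner,
			     const uint32_t *sids, size_t n)
{
	assert(r != NULL);

	for (size_t i = 0; i < n; i++) {
		const struct toy_stream *existing = toy_find(r, sids[i]);

		if (existing) {
			if (existing->owner == owner)
				continue;	/* aliased ID, same master */
			return -1;		/* genuine conflict */
		}
		if (r->count == TOY_MAX_STREAMS)
			return -1;
		r->streams[r->count].sid = sids[i];
		r->streams[r->count].owner = owner;
		r->count++;
	}
	return 0;
}
```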

iommu/arm-smmu-v3: Use the new rb tree helpers

2025-05-09T07:41:45+00:00

[ Upstream commit a2bb820e862d61f9ca1499e500915f9f505a2655 ]

Since v5.12 the rbtree has gained some simplifying helpers aimed at making rb tree users write less convoluted boilerplate code. Instead the caller provides a single comparison function and the helpers generate the prior open-coded stuff.

Update smmu->streams to use rb_find_add() and rb_find().

Tested-by: Nicolin Chen
Reviewed-by: Mostafa Saleh
Signed-off-by: Jason Gunthorpe
Link: https://lore.kernel.org/r/1-v3-9fef8cdc2ff6+150d1-smmuv3_tidy_jgg@nvidia.com
Signed-off-by: Will Deacon
Stable-dep-of: b00d24997a11 ("iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids")
Signed-off-by: Sasha Levin

iommu/arm-smmu-v3: Clean up more on probe failure

2025-02-21T12:49:33+00:00

[ Upstream commit fcbd621567420b3a2f21f49bbc056de8b273c625 ]

kmemleak noticed that the iopf queue allocated deep down within arm_smmu_init_structures() can be leaked by a subsequent error return from arm_smmu_device_probe(). Furthermore, after arm_smmu_device_reset() we will also leave the SMMU enabled with an empty Stream Table, silently blocking all DMA. This proves rather annoying for debugging said probe failure, so let's handle it a bit better by putting the SMMU back into (more or less) the same state as if it hadn't probed at all.

Signed-off-by: Robin Murphy
Link: https://lore.kernel.org/r/5137901958471cf67f2fad5c2229f8a8f1ae901a.1733406914.git.robin.murphy@arm.com
Signed-off-by: Will Deacon
Signed-off-by: Sasha Levin