<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/iommu, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-25T10:10:45+00:00</updated>
<entry>
<title>iommu/sva: Fix crash in iommu_sva_unbind_device()</title>
<updated>2026-03-25T10:10:45+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-03-05T06:18:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=58abeb7b9562f25bdfa2f5ae5ce803eb02e74433'/>
<id>urn:sha1:58abeb7b9562f25bdfa2f5ae5ce803eb02e74433</id>
<content type='text'>
[ Upstream commit 06e14c36e20b48171df13d51b89fe67c594ed07a ]

domain-&gt;mm-&gt;iommu_mm can be freed by iommu_domain_free():
  iommu_domain_free()
    mmdrop()
      __mmdrop()
        mm_pasid_drop()
After iommu_domain_free() returns, accessing domain-&gt;mm-&gt;iommu_mm may
dereference a freed mm structure, leading to a crash.

Fix this by moving the code that accesses domain-&gt;mm-&gt;iommu_mm to before
the call to iommu_domain_free().

Fixes: e37d5a2d60a3 ("iommu/sva: invalidate stale IOTLB entries for kernel address space")
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Yi Liu &lt;yi.l.liu@intel.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Reviewed-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Only handle IOPF for SVA when PRI is supported</title>
<updated>2026-03-25T10:10:34+00:00</updated>
<author>
<name>Lu Baolu</name>
<email>baolu.lu@linux.intel.com</email>
</author>
<published>2026-03-16T07:16:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ee312bb1052e45cbccaf6abac1012db9ca43150a'/>
<id>urn:sha1:ee312bb1052e45cbccaf6abac1012db9ca43150a</id>
<content type='text'>
commit 39c20c4e83b9f78988541d829aa34668904e54a0 upstream.

In intel_svm_set_dev_pasid(), the driver unconditionally manages the IOPF
handling during a domain transition. However, commit a86fb7717320
("iommu/vt-d: Allow SVA with device-specific IOPF") introduced support for
SVA on devices that handle page faults internally without utilizing the
PCI PRI. On such devices, the IOMMU-side IOPF infrastructure is not
required. Calling iopf_for_domain_replace() on these devices is incorrect
and can lead to unexpected failures during PASID attachment or unwinding.

Add a check for info-&gt;pri_supported to ensure that the IOPF queue logic
is only invoked for devices that actually rely on the IOMMU's PRI-based
fault handling.

Fixes: 17fce9d2336d ("iommu/vt-d: Put iopf enablement in domain attach path")
Cc: stable@vger.kernel.org
Suggested-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Link: https://lore.kernel.org/r/20260310075520.295104-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Fix intel iommu iotlb sync hardlockup and retry</title>
<updated>2026-03-25T10:10:34+00:00</updated>
<author>
<name>Guanghui Feng</name>
<email>guanghuifeng@linux.alibaba.com</email>
</author>
<published>2026-03-16T07:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=300e7cfdc92bbfb6fec00efe322555763ba88600'/>
<id>urn:sha1:300e7cfdc92bbfb6fec00efe322555763ba88600</id>
<content type='text'>
commit fe89277c9ceb0d6af0aa665bcf24a41d8b1b79cd upstream.

During the qi_check_fault process after an IOMMU ITE event, requests at
odd-numbered positions in the queue are set to QI_ABORT, only satisfying
single-request submissions. However, qi_submit_sync now supports multiple
simultaneous submissions, and can't guarantee that the wait_desc will be
at an odd-numbered position. Therefore, if an item times out, IOMMU can't
re-initiate the request, resulting in an infinite polling wait.

This modifies the process by setting the status of all requests already
fetched by IOMMU and recorded as QI_IN_USE status (including wait_desc
requests) to QI_ABORT, thus enabling multiple requests to be resubmitted.

Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per qi_submit_sync()")
Cc: stable@vger.kernel.org
Signed-off-by: Guanghui Feng &lt;guanghuifeng@linux.alibaba.com&gt;
Tested-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Link: https://lore.kernel.org/r/20260306101516.3885775-1-guanghuifeng@linux.alibaba.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per  qi_submit_sync()")
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without scalable mode</title>
<updated>2026-03-12T11:09:29+00:00</updated>
<author>
<name>Jinhui Guo</name>
<email>guojinhui.liam@bytedance.com</email>
</author>
<published>2026-01-22T01:48:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e70d5feb10c5ba2bbf7ca400b8f39a2f82d653e8'/>
<id>urn:sha1:e70d5feb10c5ba2bbf7ca400b8f39a2f82d653e8</id>
<content type='text'>
[ Upstream commit 42662d19839f34735b718129ea200e3734b07e50 ]

PCIe endpoints with ATS enabled and passed through to userspace
(e.g., QEMU, DPDK) can hard-lock the host when their link drops,
either by surprise removal or by a link fault.

Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation
request when device is disconnected") adds pci_dev_is_disconnected()
to devtlb_invalidation_with_pasid() so ATS invalidation is skipped
only when the device is being safely removed, but it applies only
when Intel IOMMU scalable mode is enabled.

With scalable mode disabled or unsupported, a system hard-lock
occurs when a PCIe endpoint's link drops because the Intel IOMMU
waits indefinitely for an ATS invalidation that cannot complete.

Call Trace:
 qi_submit_sync
 qi_flush_dev_iotlb
 __context_flush_dev_iotlb.part.0
 domain_context_clear_one_cb
 pci_for_each_dma_alias
 device_block_translation
 blocking_domain_attach_dev
 iommu_deinit_device
 __iommu_group_remove_device
 iommu_release_device
 iommu_bus_notifier
 blocking_notifier_call_chain
 bus_notify
 device_del
 pci_remove_bus_device
 pci_stop_and_remove_bus_device
 pciehp_unconfigure_device
 pciehp_disable_slot
 pciehp_handle_presence_or_link_change
 pciehp_ist

Commit 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release")
adds intel_pasid_teardown_sm_context() to intel_iommu_release_device(),
which calls qi_flush_dev_iotlb() and can also hard-lock the system
when a PCIe endpoint's link drops.

Call Trace:
 qi_submit_sync
 qi_flush_dev_iotlb
 __context_flush_dev_iotlb.part.0
 intel_context_flush_no_pasid
 device_pasid_table_teardown
 pci_pasid_table_teardown
 pci_for_each_dma_alias
 intel_pasid_teardown_sm_context
 intel_iommu_release_device
 iommu_deinit_device
 __iommu_group_remove_device
 iommu_release_device
 iommu_bus_notifier
 blocking_notifier_call_chain
 bus_notify
 device_del
 pci_remove_bus_device
 pci_stop_and_remove_bus_device
 pciehp_unconfigure_device
 pciehp_disable_slot
 pciehp_handle_presence_or_link_change
 pciehp_ist

Sometimes the endpoint loses connection without a link-down event
(e.g., due to a link fault); killing the process (virsh destroy)
then hard-locks the host.

Call Trace:
 qi_submit_sync
 qi_flush_dev_iotlb
 __context_flush_dev_iotlb.part.0
 domain_context_clear_one_cb
 pci_for_each_dma_alias
 device_block_translation
 blocking_domain_attach_dev
 __iommu_attach_device
 __iommu_device_set_domain
 __iommu_group_set_domain_internal
 iommu_detach_group
 vfio_iommu_type1_detach_group
 vfio_group_detach_container
 vfio_group_fops_release
 __fput

pci_dev_is_disconnected() only covers safe-removal paths;
pci_device_is_present() tests accessibility by reading
vendor/device IDs and internally calls pci_dev_is_disconnected().
On a ConnectX-5 (8 GT/s, x2) this costs ~70 µs.

Since __context_flush_dev_iotlb() is only called on
{attach,release}_dev paths (not hot), add pci_device_is_present()
there to skip inaccessible devices and avoid the hard-lock.

Fixes: 37764b952e1b ("iommu/vt-d: Global devTLB flush when present context entry changed")
Fixes: 81e921fd3216 ("iommu/vt-d: Fix NULL domain on device release")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo &lt;guojinhui.liam@bytedance.com&gt;
Link: https://lore.kernel.org/r/20251211035946.2071-2-guojinhui.liam@bytedance.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/arm-smmu-v3: Do not set disable_ats unless vSTE is Translate</title>
<updated>2026-03-04T12:21:14+00:00</updated>
<author>
<name>Nicolin Chen</name>
<email>nicolinc@nvidia.com</email>
</author>
<published>2026-01-15T01:12:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a50006e7ba0e31b49a00a3d4fef89161298f866e'/>
<id>urn:sha1:a50006e7ba0e31b49a00a3d4fef89161298f866e</id>
<content type='text'>
[ Upstream commit a45dd34663025c75652b27e384e91c9c05ba1d80 ]

A vSTE may have three configuration types: Abort, Bypass, and Translate.

An Abort vSTE wouldn't enable ATS, but the other two might.

It makes sense for a Transalte vSTE to rely on the guest vSTE.EATS field.

For a Bypass vSTE, it would end up with an S2-only physical STE, similar
to an attachment to a regular S2 domain. However, the nested case always
disables ATS following the Bypass vSTE, while the regular S2 case always
enables ATS so long as arm_smmu_ats_supported(master) == true.

Note that ATS is needed for certain VM centric workloads and historically
non-vSMMU cases have relied on this automatic enablement. So, having the
nested case behave differently causes problems.

To fix that, add a condition to disable_ats, so that it might enable ATS
for a Bypass vSTE, aligning with the regular S2 case.

Fixes: f27298a82ba0 ("iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Suggested-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Reviewed-by: Pranjal Shrivastava &lt;praan@google.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/arm-smmu-v3: Mark EATS_TRANS safe when computing the update sequence</title>
<updated>2026-03-04T12:21:14+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-01-15T18:23:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=779e0c250a6b8a0a2c1e2f39bedfb159100d113d'/>
<id>urn:sha1:779e0c250a6b8a0a2c1e2f39bedfb159100d113d</id>
<content type='text'>
[ Upstream commit 7cad800485956a263318930613f8f4a084af8c70 ]

If VM wants to toggle EATS_TRANS off at the same time as changing the CFG,
hypervisor will see EATS change to 0 and insert a V=0 breaking update into
the STE even though the VM did not ask for that.

In bare metal, EATS_TRANS is ignored by CFG=ABORT/BYPASS, which is why this
does not cause a problem until we have the nested case where CFG is always
a variation of S2 trans that does use EATS_TRANS.

Relax the rules for EATS_TRANS sequencing, we don't need it to be exact as
the enclosing code will always disable ATS at the PCI device when changing
EATS_TRANS. This ensures there are no ATS transactions that can race with
an EATS_TRANS change so we don't need to carefully sequence these bits.

Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Signed-off-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/arm-smmu-v3: Mark STE MEV safe when computing the update sequence</title>
<updated>2026-03-04T12:21:14+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-01-15T18:23:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3f73325f4548beee0a1f1d378ffe74ca1748570d'/>
<id>urn:sha1:3f73325f4548beee0a1f1d378ffe74ca1748570d</id>
<content type='text'>
[ Upstream commit f3c1d372dbb8e5a86923f20db66deabef42bfc9d ]

Nested CD tables set the MEV bit to try to reduce multi-fault spamming on
the hypervisor. Since MEV is in STE word 1 this causes a breaking update
sequence that is not required and impacts real workloads.

For the purposes of STE updates the value of MEV doesn't matter, if it is
set/cleared early or late it just results in a change to the fault reports
that must be supported by the kernel anyhow. The spec says:

 Note: Software must expect, and be able to deal with, coalesced fault
 records even when MEV == 0.

So mark STE MEV safe when computing the update sequence, to avoid creating
a breaking update.

Fixes: da0c56520e88 ("iommu/arm-smmu-v3: Set MEV bit in nested STE for DoS mitigations")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Mostafa Saleh &lt;smostafa@google.com&gt;
Reviewed-by: Pranjal Shrivastava &lt;praan@google.com&gt;
Signed-off-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/arm-smmu-v3: Add update_safe bits to fix STE update sequence</title>
<updated>2026-03-04T12:21:14+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2026-01-15T18:23:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=efd59553e477fe139f9981ab36a899869dae9294'/>
<id>urn:sha1:efd59553e477fe139f9981ab36a899869dae9294</id>
<content type='text'>
[ Upstream commit 2781f2a930abb5d27f80b8afbabfa19684833b65 ]

C_BAD_STE was observed when updating nested STE from an S1-bypass mode to
an S1DSS-bypass mode. As both modes enabled S2, the used bit is slightly
different than the normal S1-bypass and S1DSS-bypass modes. As a result,
fields like MEV and EATS in S2's used list marked the word1 as a critical
word that requested a STE.V=0. This breaks a hitless update.

However, both MEV and EATS aren't critical in terms of STE update. One
controls the merge of the events and the other controls the ATS that is
managed by the driver at the same time via pci_enable_ats().

Add an arm_smmu_get_ste_update_safe() to allow STE update algorithm to
relax those fields, avoiding the STE update breakages.

After this change, entry_set has no caller checking its return value, so
change it to void.

Note that this change is required by both MEV and EATS fields, which were
introduced in different kernel versions. So add get_update_safe() first.
MEV and EATS will be added to arm_smmu_get_ste_update_safe() separately.

Fixes: 1e8be08d1c91 ("iommu/arm-smmu-v3: Support IOMMU_DOMAIN_NESTED")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Mostafa Saleh &lt;smostafa@google.com&gt;
Reviewed-by: Pranjal Shrivastava &lt;praan@google.com&gt;
Signed-off-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Flush piotlb for SVM and Nested domain</title>
<updated>2026-03-04T12:21:12+00:00</updated>
<author>
<name>Yi Liu</name>
<email>yi.l.liu@intel.com</email>
</author>
<published>2026-01-22T01:48:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8722a8fb7ed9242029531b880851f43163482fcc'/>
<id>urn:sha1:8722a8fb7ed9242029531b880851f43163482fcc</id>
<content type='text'>
[ Upstream commit 04b1b069f151e793767755f58b51670bff00cbc1 ]

Besides the paging domains that use FS, SVM and Nested domains need to
use piotlb invalidation descriptor as well.

Fixes: b33125296b50 ("iommu/vt-d: Create unique domain ops for each stage")
Cc: stable@vger.kernel.org
Signed-off-by: Yi Liu &lt;yi.l.liu@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20251223065824.6164-1-yi.l.liu@intel.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in scalable mode</title>
<updated>2026-03-04T12:21:12+00:00</updated>
<author>
<name>Jinhui Guo</name>
<email>guojinhui.liam@bytedance.com</email>
</author>
<published>2026-01-22T01:48:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9deaacc8dcaddb6ddc5b52e1e63b457450ec0f94'/>
<id>urn:sha1:9deaacc8dcaddb6ddc5b52e1e63b457450ec0f94</id>
<content type='text'>
[ Upstream commit 10e60d87813989e20eac1f3eda30b3bae461e7f9 ]

Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation
request when device is disconnected") relies on
pci_dev_is_disconnected() to skip ATS invalidation for
safely-removed devices, but it does not cover link-down caused
by faults, which can still hard-lock the system.

For example, if a VM fails to connect to the PCIe device,
"virsh destroy" is executed to release resources and isolate
the fault, but a hard-lockup occurs while releasing the group fd.

Call Trace:
 qi_submit_sync
 qi_flush_dev_iotlb
 intel_pasid_tear_down_entry
 device_block_translation
 blocking_domain_attach_dev
 __iommu_attach_device
 __iommu_device_set_domain
 __iommu_group_set_domain_internal
 iommu_detach_group
 vfio_iommu_type1_detach_group
 vfio_group_detach_container
 vfio_group_fops_release
 __fput

Although pci_device_is_present() is slower than
pci_dev_is_disconnected(), it still takes only ~70 µs on a
ConnectX-5 (8 GT/s, x2) and becomes even faster as PCIe speed
and width increase.

Besides, devtlb_invalidation_with_pasid() is called only in the
paths below, which are far less frequent than memory map/unmap.

1. mm-struct release
2. {attach,release}_dev
3. set/remove PASID
4. dirty-tracking setup

The gain in system stability far outweighs the negligible cost
of using pci_device_is_present() instead of pci_dev_is_disconnected()
to decide when to skip ATS invalidation, especially under GDR
high-load conditions.

Fixes: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo &lt;guojinhui.liam@bytedance.com&gt;
Link: https://lore.kernel.org/r/20251211035946.2071-3-guojinhui.liam@bytedance.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
