<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/iommu, branch v6.6.142</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.142</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.6.142'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-05-23T11:03:34+00:00</updated>
<entry>
<title>iommu/vt-d: Disable DMAR for Intel Q35 IGFX</title>
<updated>2026-05-23T11:03:34+00:00</updated>
<author>
<name>Naval Alcalá</name>
<email>ari@naval.cat</email>
</author>
<published>2026-05-09T02:43:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2776f9016f1b8b9cc4849fd7b2c65335d8d85544'/>
<id>urn:sha1:2776f9016f1b8b9cc4849fd7b2c65335d8d85544</id>
<content type='text'>
commit 2cda2e10dc8343ae01eae9e999a876b7e7d37861 upstream.

Intel Q35 integrated graphics (8086:29b2) exhibits broken DMAR
behaviour similar to other G4x/GM45 devices for which DMAR is
already disabled via quirks.

When DMAR is enabled, the system may hard lock up during boot or
early device initialization, requiring a reset.

Add the missing PCI ID to the existing quirk list to disable
DMAR for this device.

Fixes: 1f76249cc3be ("iommu/vt-d: Declare Broadwell igfx dmar support snafu")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=201185
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216064
Signed-off-by: Naval Alcalá &lt;ari@naval.cat&gt;
Link: https://lore.kernel.org/r/20260410161622.13549-1-ari@naval.cat
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommufd: vfio compatibility extension check for noiommu mode</title>
<updated>2026-05-23T11:03:16+00:00</updated>
<author>
<name>Jacob Pan</name>
<email>jacob.pan@linux.microsoft.com</email>
</author>
<published>2026-02-13T18:36:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fe1d1423c524bd206152c6057bb8cfa92857f8e0'/>
<id>urn:sha1:fe1d1423c524bd206152c6057bb8cfa92857f8e0</id>
<content type='text'>
[ Upstream commit 7147ec874ea08c322d779d8eba28946e294ed1f3 ]

VFIO_CHECK_EXTENSION should return false for TYPE1_IOMMU variants when
in NO-IOMMU mode and IOMMUFD compat container is set. This change makes
the behavior match VFIO_CONTAINER in noiommu mode. It also prevents
userspace from incorrectly attempting to use TYPE1 IOMMU operations
in a no-iommu context.

Fixes: d624d6652a65 ("iommufd: vfio container FD ioctl compatibility")
Link: https://patch.msgid.link/r/20260213183636.3340-1-jacob.pan@linux.microsoft.com
Signed-off-by: Jacob Pan &lt;jacob.pan@linux.microsoft.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: serialize sequence allocation under concurrent TLB invalidations</title>
<updated>2026-05-17T15:13:35+00:00</updated>
<author>
<name>Ankit Soni</name>
<email>Ankit.Soni@amd.com</email>
</author>
<published>2026-01-22T15:30:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=d51bf43193b1e95dc4e34e540dc76e19def2ae5a'/>
<id>urn:sha1:d51bf43193b1e95dc4e34e540dc76e19def2ae5a</id>
<content type='text'>
commit 9e249c48412828e807afddc21527eb734dc9bd3d upstream.

With concurrent TLB invalidations, completion wait randomly gets timed out
because cmd_sem_val was incremented outside the IOMMU spinlock, allowing
CMD_COMPL_WAIT commands to be queued out of sequence and breaking the
ordering assumption in wait_on_sem().
Move the cmd_sem_val increment under iommu-&gt;lock so completion sequence
allocation is serialized with command queuing.
And remove the unnecessary return.

Fixes: d2a0cac10597 ("iommu/amd: move wait_on_sem() out of spinlock")

Tested-by: Srikanth Aithal &lt;sraithal@amd.com&gt;
Reported-by: Srikanth Aithal &lt;sraithal@amd.com&gt;
Signed-off-by: Ankit Soni &lt;Ankit.Soni@amd.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
[Salvatore Bonaccorso: Backport to v6.12.y where f32fe7cb0198
("iommu/amd: Add support to remap/unmap IOMMU buffers for kdump") is not
present]
Signed-off-by: Salvatore Bonaccorso &lt;carnil@debian.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Use atomic64_inc_return() in iommu.c</title>
<updated>2026-05-17T15:13:35+00:00</updated>
<author>
<name>Uros Bizjak</name>
<email>ubizjak@gmail.com</email>
</author>
<published>2024-10-07T08:43:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c28c87d9a389c2561a719de3b2946346d723b0b4'/>
<id>urn:sha1:c28c87d9a389c2561a719de3b2946346d723b0b4</id>
<content type='text'>
commit 5ce73c524f5fb5abd7b1bfed0115474b4fb437b4 upstream.

Use atomic64_inc_return(&amp;ref) instead of atomic64_add_return(1, &amp;ref)
to use optimized implementation and ease register pressure around
the primitive for targets that implement optimized variant.

Signed-off-by: Uros Bizjak &lt;ubizjak@gmail.com&gt;
Cc: Joerg Roedel &lt;joro@8bytes.org&gt;
Cc: Suravee Suthikulpanit &lt;suravee.suthikulpanit@amd.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Robin Murphy &lt;robin.murphy@arm.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Link: https://lore.kernel.org/r/20241007084356.47799-1-ubizjak@gmail.com
Signed-off-by: Joerg Roedel &lt;jroedel@suse.de&gt;
Signed-off-by: Salvatore Bonaccorso &lt;carnil@debian.org&gt;
Stable-dep-of: 9e249c48412828e807afddc21527eb734dc9bd3d ("iommu/amd: serialize sequence allocation under concurrent TLB invalidations")
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommufd: Fix a race with concurrent allocation and unmap</title>
<updated>2026-05-17T15:13:34+00:00</updated>
<author>
<name>Sina Hassani</name>
<email>sina@openai.com</email>
</author>
<published>2026-04-10T18:32:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cf3eb7c8e7057eb32c411a3bd11970f004f590d0'/>
<id>urn:sha1:cf3eb7c8e7057eb32c411a3bd11970f004f590d0</id>
<content type='text'>
commit 8602018b1f17fbdaa5e5d79f4c8603ad20640c12 upstream.

iopt_unmap_iova_range() releases the lock on iova_rwsem inside the loop
body when getting to the more expensive unmap operations. This is fine on
its own, except the loop condition is based on the first area that matches
the unmap address range. If a concurrent call to map picks an area that
was unmapped in previous iterations, the loop mistakenly tries to unmap
it.

This is reproducible by having one userspace thread map buffers and pass
them to another thread that unmaps them. The problem manifests as EBUSY
errors with single page mappings.

Fix this by advancing the start pointer after unmapping an area. This
ensures each iteration only examines the IOVA range that remains mapped,
which is guaranteed not to have overlaps.

Cc: stable@vger.kernel.org
Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping")
Link: https://patch.msgid.link/r/CAAJpGJSR4r_ds1JOjmkqHtsBPyxu8GntoeW08Sk5RNQPmgi+tg@mail.gmail.com
Signed-off-by: Sina Hassani &lt;sina@openai.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Fix intel iommu iotlb sync hardlockup and retry</title>
<updated>2026-03-25T10:06:04+00:00</updated>
<author>
<name>Guanghui Feng</name>
<email>guanghuifeng@linux.alibaba.com</email>
</author>
<published>2026-03-16T07:16:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=85654456e3945ef1f6c58fb532bcf87893d6be7b'/>
<id>urn:sha1:85654456e3945ef1f6c58fb532bcf87893d6be7b</id>
<content type='text'>
commit fe89277c9ceb0d6af0aa665bcf24a41d8b1b79cd upstream.

During the qi_check_fault process after an IOMMU ITE event, requests at
odd-numbered positions in the queue are set to QI_ABORT, only satisfying
single-request submissions. However, qi_submit_sync now supports multiple
simultaneous submissions, and can't guarantee that the wait_desc will be
at an odd-numbered position. Therefore, if an item times out, IOMMU can't
re-initiate the request, resulting in an infinite polling wait.

This modifies the process by setting the status of all requests already
fetched by IOMMU and recorded as QI_IN_USE status (including wait_desc
requests) to QI_ABORT, thus enabling multiple requests to be resubmitted.

Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per qi_submit_sync()")
Cc: stable@vger.kernel.org
Signed-off-by: Guanghui Feng &lt;guanghuifeng@linux.alibaba.com&gt;
Tested-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Reviewed-by: Samiullah Khawaja &lt;skhawaja@google.com&gt;
Link: https://lore.kernel.org/r/20260306101516.3885775-1-guanghuifeng@linux.alibaba.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Fixes: 8a1d82462540 ("iommu/vt-d: Multiple descriptors per  qi_submit_sync()")
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in scalable mode</title>
<updated>2026-03-04T12:21:11+00:00</updated>
<author>
<name>Jinhui Guo</name>
<email>guojinhui.liam@bytedance.com</email>
</author>
<published>2026-01-22T01:48:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=01aed2f1d7cb8fdf4c60c5bb4727608cb82b401d'/>
<id>urn:sha1:01aed2f1d7cb8fdf4c60c5bb4727608cb82b401d</id>
<content type='text'>
[ Upstream commit 10e60d87813989e20eac1f3eda30b3bae461e7f9 ]

Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation
request when device is disconnected") relies on
pci_dev_is_disconnected() to skip ATS invalidation for
safely-removed devices, but it does not cover link-down caused
by faults, which can still hard-lock the system.

For example, if a VM fails to connect to the PCIe device,
"virsh destroy" is executed to release resources and isolate
the fault, but a hard-lockup occurs while releasing the group fd.

Call Trace:
 qi_submit_sync
 qi_flush_dev_iotlb
 intel_pasid_tear_down_entry
 device_block_translation
 blocking_domain_attach_dev
 __iommu_attach_device
 __iommu_device_set_domain
 __iommu_group_set_domain_internal
 iommu_detach_group
 vfio_iommu_type1_detach_group
 vfio_group_detach_container
 vfio_group_fops_release
 __fput

Although pci_device_is_present() is slower than
pci_dev_is_disconnected(), it still takes only ~70 µs on a
ConnectX-5 (8 GT/s, x2) and becomes even faster as PCIe speed
and width increase.

Besides, devtlb_invalidation_with_pasid() is called only in the
paths below, which are far less frequent than memory map/unmap.

1. mm-struct release
2. {attach,release}_dev
3. set/remove PASID
4. dirty-tracking setup

The gain in system stability far outweighs the negligible cost
of using pci_device_is_present() instead of pci_dev_is_disconnected()
to decide when to skip ATS invalidation, especially under GDR
high-load conditions.

Fixes: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo &lt;guojinhui.liam@bytedance.com&gt;
Link: https://lore.kernel.org/r/20251211035946.2071-3-guojinhui.liam@bytedance.com
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: move wait_on_sem() out of spinlock</title>
<updated>2026-03-04T12:20:45+00:00</updated>
<author>
<name>Ankit Soni</name>
<email>Ankit.Soni@amd.com</email>
</author>
<published>2025-12-01T14:39:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f2f65b28d802a667119147444ec2ae33eebf9a58'/>
<id>urn:sha1:f2f65b28d802a667119147444ec2ae33eebf9a58</id>
<content type='text'>
[ Upstream commit d2a0cac10597068567d336e85fa3cbdbe8ca62bf ]

With iommu.strict=1, the existing completion wait path can cause soft
lockups under stressed environment, as wait_on_sem() busy-waits under the
spinlock with interrupts disabled.

Move the completion wait in iommu_completion_wait() out of the spinlock.
wait_on_sem() only polls the hardware-updated cmd_sem and does not require
iommu-&gt;lock, so holding the lock during the busy wait unnecessarily
increases contention and extends the time with interrupts disabled.

Signed-off-by: Ankit Soni &lt;Ankit.Soni@amd.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/arm-smmu-v3: Improve CMDQ lock fairness and efficiency</title>
<updated>2026-03-04T12:20:44+00:00</updated>
<author>
<name>Alexander Grest</name>
<email>Alexander.Grest@microsoft.com</email>
</author>
<published>2025-12-08T21:28:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9ff4843e6ea3a33ad3b5f26115f817d29fcb4565'/>
<id>urn:sha1:9ff4843e6ea3a33ad3b5f26115f817d29fcb4565</id>
<content type='text'>
[ Upstream commit df180b1a4cc51011c5f8c52c7ec02ad2e42962de ]

The SMMU CMDQ lock is highly contentious when there are multiple CPUs
issuing commands and the queue is nearly full.

The lock has the following states:
 - 0:		Unlocked
 - &gt;0:		Shared lock held with count
 - INT_MIN+N:	Exclusive lock held, where N is the # of shared waiters
 - INT_MIN:	Exclusive lock held, no shared waiters

When multiple CPUs are polling for space in the queue, they attempt to
grab the exclusive lock to update the cons pointer from the hardware. If
they fail to get the lock, they will spin until either the cons pointer
is updated by another CPU.

The current code allows the possibility of shared lock starvation
if there is a constant stream of CPUs trying to grab the exclusive lock.
This leads to severe latency issues and soft lockups.

Consider the following scenario where CPU1's attempt to acquire the
shared lock is starved by CPU2 and CPU0 contending for the exclusive
lock.

CPU0 (exclusive)  | CPU1 (shared)     | CPU2 (exclusive)    | `cmdq-&gt;lock`
--------------------------------------------------------------------------
trylock() //takes |                   |                     | 0
                  | shared_lock()     |                     | INT_MIN
                  | fetch_inc()       |                     | INT_MIN
                  | no return         |                     | INT_MIN + 1
                  | spins // VAL &gt;= 0 |                     | INT_MIN + 1
unlock()          | spins...          |                     | INT_MIN + 1
set_release(0)    | spins...          |                     | 0 see[NOTE]
(done)            | (sees 0)          | trylock() // takes  | 0
                  | *exits loop*      | cmpxchg(0, INT_MIN) | 0
                  |                   | *cuts in*           | INT_MIN
                  | cmpxchg(0, 1)     |                     | INT_MIN
                  | fails // != 0     |                     | INT_MIN
                  | spins // VAL &gt;= 0 |                     | INT_MIN
                  | *starved*         |                     | INT_MIN

[NOTE] The current code resets the exclusive lock to 0 regardless of the
state of the lock. This causes two problems:
1. It opens the possibility of back-to-back exclusive locks and the
   downstream effect of starving shared lock.
2. The count of shared lock waiters are lost.

To mitigate this, we release the exclusive lock by only clearing the sign
bit while retaining the shared lock waiter count as a way to avoid
starving the shared lock waiters.

Also deleted cmpxchg loop while trying to acquire the shared lock as it
is not needed. The waiters can see the positive lock count and proceed
immediately after the exclusive lock is released.

Exclusive lock is not starved in that submitters will try exclusive lock
first when new spaces become available.

Reviewed-by: Mostafa Saleh &lt;smostafa@google.com&gt;
Reviewed-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Signed-off-by: Alexander Grest &lt;Alexander.Grest@microsoft.com&gt;
Signed-off-by: Jacob Pan &lt;jacob.pan@linux.microsoft.com&gt;
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/vt-d: Flush cache for PASID table before using it</title>
<updated>2026-03-04T12:19:46+00:00</updated>
<author>
<name>Dmytro Maluka</name>
<email>dmaluka@chromium.org</email>
</author>
<published>2026-01-22T01:48:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c93f23375d8c410954b0df825e814b632fd62b9d'/>
<id>urn:sha1:c93f23375d8c410954b0df825e814b632fd62b9d</id>
<content type='text'>
[ Upstream commit 22d169bdd2849fe6bd18c2643742e1c02be6451c ]

When writing the address of a freshly allocated zero-initialized PASID
table to a PASID directory entry, do that after the CPU cache flush for
this PASID table, not before it, to avoid the time window when this
PASID table may be already used by non-coherent IOMMU hardware while
its contents in RAM is still some random old data, not zero-initialized.

Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency")
Signed-off-by: Dmytro Maluka &lt;dmaluka@chromium.org&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Link: https://lore.kernel.org/r/20251221123508.37495-1-dmaluka@chromium.org
Signed-off-by: Lu Baolu &lt;baolu.lu@linux.intel.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
