<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/iommu/amd, branch v6.18.21</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.18.21'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-04T12:20:43+00:00</updated>
<entry>
<title>iommu/amd: serialize sequence allocation under concurrent TLB invalidations</title>
<updated>2026-03-04T12:20:43+00:00</updated>
<author>
<name>Ankit Soni</name>
<email>Ankit.Soni@amd.com</email>
</author>
<published>2026-01-22T15:30:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5000ce7fcb31067566a1a1a2e5b5bbff93625242'/>
<id>urn:sha1:5000ce7fcb31067566a1a1a2e5b5bbff93625242</id>
<content type='text'>
[ Upstream commit 9e249c48412828e807afddc21527eb734dc9bd3d ]

With concurrent TLB invalidations, completion waits randomly time out
because cmd_sem_val was incremented outside the IOMMU spinlock, allowing
CMD_COMPL_WAIT commands to be queued out of sequence and breaking the
ordering assumption in wait_on_sem().

Move the cmd_sem_val increment under iommu-&gt;lock so completion sequence
allocation is serialized with command queuing, and remove the
unnecessary return.
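
The fixed pattern can be sketched as a minimal single-threaded model
(cmd_sem_val, queue and the function names mirror the driver but are
standalone stand-ins invented for this sketch; the spinlock itself is
elided and marked by comments):

```c
#define QLEN 8

static unsigned long cmd_sem_val;   /* completion sequence counter */
static unsigned long queue[QLEN];   /* stand-in for the command buffer */
static int head;

/* Allocate the sequence number and queue the command in one critical
 * section, so queue order always matches sequence order. */
static unsigned long build_completion_wait(void)
{
	/* iommu->lock would be taken here */
	unsigned long seq = ++cmd_sem_val;  /* increment under the lock... */
	queue[head] = seq;                  /* ...queue in the same section */
	head++;
	/* iommu->lock released here */
	return seq;
}

/* The ordering assumption wait_on_sem() relies on: entries are
 * strictly increasing in queue order. */
static int queue_is_ordered(void)
{
	if (head == 0)
		return 1;
	for (int i = 1; i != head; i++)
		if (queue[i] != queue[i - 1] + 1)
			return 0;
	return 1;
}
```

With the increment outside the critical section, two CPUs could allocate
sequence numbers 1 and 2 but queue 2 first, violating queue_is_ordered().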

Fixes: d2a0cac10597 ("iommu/amd: move wait_on_sem() out of spinlock")

Tested-by: Srikanth Aithal &lt;sraithal@amd.com&gt;
Reported-by: Srikanth Aithal &lt;sraithal@amd.com&gt;
Signed-off-by: Ankit Soni &lt;Ankit.Soni@amd.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: move wait_on_sem() out of spinlock</title>
<updated>2026-03-04T12:20:13+00:00</updated>
<author>
<name>Ankit Soni</name>
<email>Ankit.Soni@amd.com</email>
</author>
<published>2025-12-01T14:39:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e15768e68820142077bbca402d8e902f64ade1b0'/>
<id>urn:sha1:e15768e68820142077bbca402d8e902f64ade1b0</id>
<content type='text'>
[ Upstream commit d2a0cac10597068567d336e85fa3cbdbe8ca62bf ]

With iommu.strict=1, the existing completion wait path can cause soft
lockups in stressed environments, as wait_on_sem() busy-waits under the
spinlock with interrupts disabled.

Move the completion wait in iommu_completion_wait() out of the spinlock.
wait_on_sem() only polls the hardware-updated cmd_sem and does not require
iommu-&gt;lock, so holding the lock during the busy wait unnecessarily
increases contention and extends the time with interrupts disabled.
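
A rough user-space model of the narrowed lock scope (lock_held, cmd_sem
and hw_completes() are stand-ins invented for this sketch; in reality the
lock is iommu-&gt;lock and the IOMMU updates the semaphore asynchronously):

```c
#include "assert.h"

static int lock_held;                  /* stand-in for iommu->lock */
static volatile unsigned long cmd_sem; /* memory the hardware writes */

static void lock(void)   { assert(lock_held == 0); lock_held = 1; }
static void unlock(void) { lock_held = 0; }

/* The hardware completes the wait command by storing the data value. */
static void hw_completes(unsigned long data) { cmd_sem = data; }

static void iommu_completion_wait_model(unsigned long data)
{
	lock();
	/* build and queue the CMD_COMPL_WAIT command here, under the lock */
	unlock();

	hw_completes(data);  /* in reality: done asynchronously by the IOMMU */

	/* wait_on_sem(): poll outside the lock; it only reads cmd_sem,
	 * so no lock contention and no interrupts-off busy wait */
	while (cmd_sem != data)
		;
	assert(lock_held == 0);
}
```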

Signed-off-by: Ankit Soni &lt;Ankit.Soni@amd.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Use core's primary handler and set IRQF_ONESHOT</title>
<updated>2026-02-26T22:59:05+00:00</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2026-01-28T09:55:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=680b652b2d08779965aa5d27b575bee112b3a3a8'/>
<id>urn:sha1:680b652b2d08779965aa5d27b575bee112b3a3a8</id>
<content type='text'>
[ Upstream commit 5bfcdccb4d18d3909b7f87942be67fd6bdc00c1d ]

request_threaded_irq() is invoked with a primary and a secondary handler
and no flags are passed. The primary handler is the same as
irq_default_primary_handler() so there is no need to have an identical
copy.

The lack of IRQF_ONESHOT is dangerous because the interrupt source is
not masked while the threaded handler is active. This means that,
especially on level-triggered interrupt lines, the interrupt can fire
again before the threaded handler has had a chance to run.

Use the default primary interrupt handler by specifying NULL and set
IRQF_ONESHOT so the interrupt source is masked until the secondary
handler is done.
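
The effect of the missing mask can be sketched with a small model of a
level-triggered line (fires_before_thread_runs() and the 10-iteration
cap are inventions of this sketch, not kernel code):

```c
/* A level-triggered device keeps the line asserted until the threaded
 * handler services it. Count how often the primary handler fires before
 * the thread gets to run (capped at 10 in this model). */
static int fires_before_thread_runs(int oneshot)
{
	int line_asserted = 1;
	int masked = 0;
	int fires = 0;

	while (line_asserted) {
		if (masked)
			break;
		if (fires == 10)
			break;
		fires++;            /* default primary handler: wakes the thread */
		if (oneshot)
			masked = 1; /* IRQF_ONESHOT: core masks the source */
	}
	/* the threaded handler now runs, services the device, the line
	 * drops, and the core unmasks the source */
	return fires;
}
```

With IRQF_ONESHOT the line fires exactly once per servicing; without it,
the still-asserted line keeps re-firing until the thread finally runs.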

Fixes: 72fe00f01f9a3 ("x86/amd-iommu: Use threaded interupt handler")
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Link: https://patch.msgid.link/20260128095540.863589-4-bigeasy@linutronix.de
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Fix error path in amd_iommu_probe_device()</title>
<updated>2026-01-30T09:32:17+00:00</updated>
<author>
<name>Vasant Hegde</name>
<email>vasant.hegde@amd.com</email>
</author>
<published>2026-01-16T05:53:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a807e4f4f63a5b5b7f6b45cafcd030b4d21b21b0'/>
<id>urn:sha1:a807e4f4f63a5b5b7f6b45cafcd030b4d21b21b0</id>
<content type='text'>
[ Upstream commit 3222b6de5145272c43a90cb8667377d676635ea0 ]

Currently, the error path of amd_iommu_probe_device() unconditionally
references dev_data, which may not be initialized if an early failure
occurs (e.g. if iommu_init_device() fails).

Move the out_err label to ensure the function exits immediately on
failure without accessing potentially uninitialized dev_data.
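
The corrected shape, as a self-contained sketch (probe_device(), the
reduced dev_data struct and the error numbers are modeled here, not the
actual driver code, and calloc stands in for iommu_init_device()):

```c
#include "stdlib.h"

struct dev_data { int pending; };

/* An early failure returns immediately and never touches dev_data,
 * which is only assigned after the init step has succeeded. */
static int probe_device(int early_failure, struct dev_data **out)
{
	struct dev_data *dev_data;

	if (early_failure)
		return -19;  /* -ENODEV: bail out before dev_data exists */

	dev_data = calloc(1, sizeof(*dev_data));
	if (dev_data == NULL)
		return -12;  /* -ENOMEM */

	*out = dev_data;
	return 0;
}
```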

Fixes: 19e5cc156cb ("iommu/amd: Enable support for up to 2K interrupts per function")
Cc: Rakuram Eswaran &lt;rakuram.e96@gmail.com&gt;
Cc: Jörg Rödel &lt;joro@8bytes.org&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reported-by: Dan Carpenter &lt;dan.carpenter@linaro.org&gt;
Closes: https://lore.kernel.org/r/202512191724.meqJENXe-lkp@intel.com/
Signed-off-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Propagate the error code returned by __modify_irte_ga() in modify_irte_ga()</title>
<updated>2026-01-08T09:16:55+00:00</updated>
<author>
<name>Jinhui Guo</name>
<email>guojinhui.liam@bytedance.com</email>
</author>
<published>2025-11-20T15:47:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=67c5f84f9b1cd76664528cb39697fa57b8945724'/>
<id>urn:sha1:67c5f84f9b1cd76664528cb39697fa57b8945724</id>
<content type='text'>
commit 2381a1b40be4b286062fb3cf67dd7f005692aa2a upstream.

The return type of __modify_irte_ga() is int, but modify_irte_ga()
treats it as a bool. Casting the int to bool discards the error code.

To fix the issue, change the type of ret to int in modify_irte_ga().
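
The truncation is plain C semantics and can be shown standalone (the
helper and the -22/-EINVAL value are modeled for this sketch):

```c
#include "stdbool.h"

static int __modify_irte_ga_model(int fail)
{
	return fail ? -22 : 0;  /* -EINVAL on failure in this model */
}

/* Buggy shape: a bool collapses every nonzero error to true (1),
 * so the specific error code is lost. */
static int modify_buggy(int fail)
{
	bool ret = __modify_irte_ga_model(fail);
	return ret;  /* -22 has already become 1 */
}

/* Fixed shape: keep ret as int so the error code propagates. */
static int modify_fixed(int fail)
{
	int ret = __modify_irte_ga_model(fail);
	return ret;
}
```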

Fixes: 57cdb720eaa5 ("iommu/amd: Do not flush IRTE when only updating isRun and destination fields")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo &lt;guojinhui.liam@bytedance.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Fix pci_segment memleak in alloc_pci_segment()</title>
<updated>2026-01-08T09:16:55+00:00</updated>
<author>
<name>Jinhui Guo</name>
<email>guojinhui.liam@bytedance.com</email>
</author>
<published>2025-10-27T16:50:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=47bed96dc7d5151dadd66f60509a75eb33f7d6ef'/>
<id>urn:sha1:47bed96dc7d5151dadd66f60509a75eb33f7d6ef</id>
<content type='text'>
commit 75ba146c2674ba49ed8a222c67f9abfb4a4f2a4f upstream.

Fix a memory leak of struct amd_iommu_pci_segment in alloc_pci_segment()
when system memory (or contiguous memory) is insufficient.
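
The fixed allocation path looks roughly like this standalone sketch
(calloc/free stand in for the kernel allocators, and the struct is
reduced to a single table for illustration):

```c
#include "stdlib.h"

struct pci_segment {
	unsigned long *dev_table;
};

static struct pci_segment *alloc_pci_segment_model(size_t entries)
{
	struct pci_segment *seg = calloc(1, sizeof(*seg));

	if (seg == NULL)
		return NULL;

	seg->dev_table = calloc(entries, sizeof(unsigned long));
	if (seg->dev_table == NULL) {
		free(seg);  /* the fix: do not leak seg on this error path */
		return NULL;
	}
	return seg;
}
```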

Fixes: 04230c119930 ("iommu/amd: Introduce per PCI segment device table")
Fixes: eda797a27795 ("iommu/amd: Introduce per PCI segment rlookup table")
Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table")
Cc: stable@vger.kernel.org
Signed-off-by: Jinhui Guo &lt;guojinhui.liam@bytedance.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>amd/iommu: Preserve domain ids inside the kdump kernel</title>
<updated>2026-01-02T11:56:54+00:00</updated>
<author>
<name>Sairaj Kodilkar</name>
<email>sarunkod@amd.com</email>
</author>
<published>2025-11-21T09:11:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=66dce411c7e901631915edd336bf37b03c7a7c4b'/>
<id>urn:sha1:66dce411c7e901631915edd336bf37b03c7a7c4b</id>
<content type='text'>
[ Upstream commit c2e8dc1222c2136e714d5d972dce7e64924e4ed8 ]

Currently the AMD IOMMU driver does not reserve the domain ids programmed
in the DTE while reusing the device table inside the kdump kernel. This
can cause these domain ids to be reallocated to new domains created by
the kdump kernel, which can lead to IO_PAGE_FAULTs.

Hence reserve these ids inside pdom_ids.
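
The reservation logic can be modeled with a tiny 64-entry allocator (the
array-based pdom_ids and the helper names are inventions of this sketch;
the driver's pdom_ids is a kernel ID allocator, not an array):

```c
static char pdom_ids[64];  /* one slot per domain id; nonzero = in use */

/* kdump: walk the inherited device table and reserve every domain id
 * that is already programmed in a DTE */
static void reserve_domain_id(int id)
{
	pdom_ids[id] = 1;
}

/* later allocations skip the reserved ids, so new domains no longer
 * collide with live DMA set up by the crashed kernel */
static int alloc_domain_id(void)
{
	for (int id = 1; id != 64; id++) {
		if (pdom_ids[id] == 0) {
			pdom_ids[id] = 1;
			return id;
		}
	}
	return -1;  /* exhausted */
}
```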

Fixes: 38e5f33ee359 ("iommu/amd: Reuse device table for kdump")
Signed-off-by: Sairaj Kodilkar &lt;sarunkod@amd.com&gt;
Reported-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Reviewed-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>iommu/amd: Fix potential out-of-bounds read in iommu_mmio_show</title>
<updated>2025-12-18T13:02:49+00:00</updated>
<author>
<name>Songtang Liu</name>
<email>liusongtang@bytedance.com</email>
</author>
<published>2025-10-31T05:55:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0ec4aaf5f3f559716a6559f3d6d9616e9470bed6'/>
<id>urn:sha1:0ec4aaf5f3f559716a6559f3d6d9616e9470bed6</id>
<content type='text'>
[ Upstream commit a0c7005333f9a968abb058b1d77bbcd7fb7fd1e7 ]

In iommu_mmio_write(), the user-provided offset is validated with the
check `iommu-&gt;dbg_mmio_offset &gt; iommu-&gt;mmio_phys_end - 4`, which
assumes a 4-byte access. However, the corresponding show handler,
iommu_mmio_show(), uses readq() to perform an 8-byte (64-bit) read.

If a user provides an offset equal to `mmio_phys_end - 4`, the check
passes and the readq() reads 4 bytes out of bounds.

Fix this by adjusting the boundary check to use sizeof(u64), which
corresponds to the size of the readq() operation.
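
The boundary change, reduced to a standalone check (the function names
and the accept/reject convention are modeled for this sketch; the driver
writes the comparison as a rejection condition):

```c
#include "stdint.h"
#include "stddef.h"

/* Old check: bounds the offset for a 4-byte access... */
static int offset_ok_old(size_t off, size_t mmio_phys_end)
{
	return off > mmio_phys_end - 4 ? 0 : 1;
}

/* ...but the show handler does an 8-byte readq(), so bound against
 * sizeof(u64) instead. */
static int offset_ok_new(size_t off, size_t mmio_phys_end)
{
	return off > mmio_phys_end - sizeof(uint64_t) ? 0 : 1;
}
```

An offset of mmio_phys_end - 4 passes the old check, yet readq() would
then read 4 bytes past mmio_phys_end.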

Fixes: 7a4ee419e8c1 ("iommu/amd: Add debugfs support to dump IOMMU MMIO registers")
Signed-off-by: Songtang Liu &lt;liusongtang@bytedance.com&gt;
Reviewed-by: Dheeraj Kumar Srivastava &lt;dheerajkumar.srivastava@amd.com&gt;
Tested-by: Dheeraj Kumar Srivastava &lt;dheerajkumar.srivastava@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branches 'apple/dart', 'ti/omap', 'riscv', 'intel/vt-d' and 'amd/amd-vi' into next</title>
<updated>2025-09-26T08:03:33+00:00</updated>
<author>
<name>Joerg Roedel</name>
<email>joerg.roedel@amd.com</email>
</author>
<published>2025-09-26T08:03:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5f4b8c03f41782f37d278946296d9443a7194153'/>
<id>urn:sha1:5f4b8c03f41782f37d278946296d9443a7194153</id>
<content type='text'>
</content>
</entry>
<entry>
<title>iommu/amd/pgtbl: Fix possible race while increase page table level</title>
<updated>2025-09-19T07:39:40+00:00</updated>
<author>
<name>Vasant Hegde</name>
<email>vasant.hegde@amd.com</email>
</author>
<published>2025-09-13T06:26:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1e56310b40fd2e7e0b9493da9ff488af145bdd0c'/>
<id>urn:sha1:1e56310b40fd2e7e0b9493da9ff488af145bdd0c</id>
<content type='text'>
The AMD IOMMU host page table implementation supports dynamic page table levels
(up to 6 levels), starting with a 3-level configuration that expands based on
IOVA address. The kernel maintains a root pointer and current page table level
to enable proper page table walks in alloc_pte()/fetch_pte() operations.

The IOMMU IOVA allocator initially starts with 32-bit addresses and, once
they are exhausted, switches to 64-bit addresses (the maximum address is
determined by the IOMMU and device DMA capability). To support larger
IOVAs, the AMD IOMMU driver increases the page table level.

But in the unmap path (iommu_v1_unmap_pages()), fetch_pte() reads
pgtable-&gt;[root/mode] without the lock. So in an extreme corner case,
while increase_address_space() is updating pgtable-&gt;[root/mode],
fetch_pte() can read the wrong page table level (pgtable-&gt;mode). It then
compares that value with the level encoded in the page table and returns
NULL. This causes the iommu_unmap op to fail, and the upper layer may
retry or log a WARN_ON.

CPU 0                                        CPU 1
-----                                        -----
map pages                                    unmap pages
alloc_pte() -&gt; increase_address_space()      iommu_v1_unmap_pages() -&gt; fetch_pte()
  pgtable-&gt;root = pte (new root value)
                                             READ pgtable-&gt;[mode/root]
                                               Reads new root, old mode
  Updates mode (pgtable-&gt;mode += 1)

Since page table level updates are infrequent and already synchronized
with a spinlock, implement a seqcount to enable lock-free reads on the
read path.
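
The seqcount pattern, stripped to its core (the memory barriers and
READ_ONCE/WRITE_ONCE that the real kernel seqcount provides are omitted
from this single-threaded sketch, and the variable names are stand-ins):

```c
static unsigned int seq;   /* even: stable; odd: write in progress */
static unsigned long root; /* models pgtable->root */
static int mode;           /* models pgtable->mode */

/* Writer, already serialized by the existing spinlock. */
static void set_root_and_mode(unsigned long new_root, int new_mode)
{
	seq++;              /* odd: readers will retry */
	root = new_root;
	mode = new_mode;
	seq++;              /* even again: update complete */
}

static unsigned int read_begin(void)
{
	unsigned int s;
	do {
		s = seq;
	} while (s % 2 != 0);   /* spin while a write is in progress */
	return s;
}

static int read_retry(unsigned int s)
{
	return s != seq;        /* a writer ran in between: retry */
}

/* Lock-free reader: always sees a consistent root/mode pair. */
static void read_root_and_mode(unsigned long *r, int *m)
{
	unsigned int s;
	do {
		s = read_begin();
		*r = root;
		*m = mode;
	} while (read_retry(s));
}
```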

Fixes: 754265bcab7 ("iommu/amd: Fix race in increase_address_space()")
Reported-by: Alejandro Jimenez &lt;alejandro.j.jimenez@oracle.com&gt;
Cc: stable@vger.kernel.org
Cc: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: Suravee Suthikulpanit &lt;suravee.suthikulpanit@amd.com&gt;
Signed-off-by: Vasant Hegde &lt;vasant.hegde@amd.com&gt;
Signed-off-by: Joerg Roedel &lt;joerg.roedel@amd.com&gt;
</content>
</entry>
</feed>
