kernel/linux.git/drivers/iommu/amd, branch v6.18.35

iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs

2026-06-01T15:50:37+00:00

[ Upstream commit 8dfd3d8d74435344ee8dc9237596959c8b2a6cbe ] In iommu_mmio_write() and iommu_capability_write(), the variables dbg_mmio_offset and dbg_cap_offset are declared as int. However, they are populated using kstrtou32_from_user(). If a user provides a sufficiently large value, it can become a negative integer. Prior to this patch, the AMD IOMMU debugfs implementation was already protected by different mechanisms. 1. #define OFS_IN_SZ 8 ensures the user string <= 8 bytes, so e.g. 0xffffffff isn't a valid input. if (cnt > OFS_IN_SZ) return -EINVAL; 2. Implicit type promotion in iommu_mmio_write(), dbg_mmio_offset is int and iommu->mmio_phys_end is u64 if (dbg_mmio_offset > iommu->mmio_phys_end - sizeof(u64)) return -EINVAL; 3. The show handlers would currently catch the negative number and refuse to perform the read. Replace kstrtou32_from_user() with kstrtos32_from_user() to parse the input, and check for negative values to explicitly prevent out-of-bounds memory accesses directly in iommu_mmio_write() and iommu_capability_write(). Signed-off-by: Eder Zulian Fixes: 7a4ee419e8c1 ("iommu/amd: Add debugfs support to dump IOMMU MMIO registers") Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin

iommu/amd: Fix illegal cap/mmio access in IOMMU debugfs

2026-06-01T15:50:36+00:00

[ Upstream commit 0e59645683b7b6fa20eceb21a6f420e4f7412943 ] In the current AMD IOMMU debugfs, when multiple processes simultaneously access the IOMMU mmio/cap registers using the IOMMU debugfs, illegal access issues can occur in the following execution flow: 1. CPU1: Sets a valid access address using iommu_mmio/capability_write, and verifies the access address's validity in iommu_mmio/capability_show 2. CPU2: Sets an invalid address using iommu_mmio/capability_write 3. CPU1: accesses the IOMMU mmio/cap registers based on the invalid address, resulting in an illegal access. This patch modifies the execution process to first verify the address's validity and then access it based on the same address, ensuring correctness and robustness. Signed-off-by: Guanghui Feng Signed-off-by: Joerg Roedel Stable-dep-of: 8dfd3d8d7443 ("iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs") Signed-off-by: Sasha Levin

iommu/amd: Bounds-check devid in __rlookup_amd_iommu()

2026-05-23T11:07:18+00:00

commit 07d0f496fe7ec5abe3bee7e38be709521567bb33 upstream. iommu_device_register() walks every device on the PCI bus via bus_for_each_dev() and calls amd_iommu_probe_device() for each. The inlined check_device() path computes the device's sbdf, calls rlookup_amd_iommu() to find the owning IOMMU, and only afterwards verifies devid <= pci_seg->last_bdf. __rlookup_amd_iommu() indexes rlookup_table[devid] with no bounds check of its own, so for a PCI device whose BDF is not described by the IVRS, the lookup reads past the end of the allocation before the caller's bounds check can run. This was harmless before commit e874c666b15b ("iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc()"): the table was a zeroed page-order allocation, so the over-read returned NULL and the caller's NULL check skipped the device. After that commit the table is a tight kvcalloc() and the over-read returns adjacent slab contents, which check_device() then dereferences as a struct amd_iommu *, causing a boot-time GPF. Seen on Google Compute Engine ct6e VMs, where the virtualized IVRS describes only the four TPU endpoints 00:04.0-07.0; the gVNIC at 00:08.0 (devid 0x40) indexes 56 bytes past the 456-byte allocation, into the adjacent kmalloc-512 slab object: pci 0000:00:04.0: Adding to iommu group 0 pci 0000:00:05.0: Adding to iommu group 1 pci 0000:00:06.0: Adding to iommu group 2 pci 0000:00:07.0: Adding to iommu group 3 Oops: general protection fault, probably for non-canonical address 0x3a64695f78746382: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.22 #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/06/2025 RIP: 0010:amd_iommu_probe_device+0x54/0x3a0 Call Trace: __iommu_probe_device+0x107/0x520 probe_iommu_group+0x29/0x50 bus_for_each_dev+0x7e/0xe0 iommu_device_register+0xc9/0x240 iommu_go_to_state+0x9c0/0x1c60 amd_iommu_init+0x14/0x40 pci_iommu_init+0x16/0x60 do_one_initcall+0x47/0x2f0 Guard the array access in __rlookup_amd_iommu(). With the fix applied on 6.18.22, the gVNIC at 00:08.0 is skipped cleanly and the VM boots. Fixes: e874c666b15b ("iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc()") Cc: stable@vger.kernel.org Reported-by: Ziyuan Chen Tested-by: Ziyuan Chen Reviewed-by: Josef Bacik Assisted-by: Claude:unspecified Signed-off-by: Jose Fernandez (Anthropic) Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman

iommu/amd: Fix clone_alias() to use the original device's devid

2026-05-23T11:06:40+00:00

[ Upstream commit faad224fe0f0857a04ff2eb3c90f0de57f47d0f3 ] Currently clone_alias() assumes first argument (pdev) is always the original device pointer. This function is called by pci_for_each_dma_alias() which based on topology decides to send original or alias device details in first argument. This meant that the source devid used to look up and copy the DTE may be incorrect, leading to wrong or stale DTE entries being propagated to alias device. Fix this by passing the original pdev as the opaque data argument to both the direct clone_alias() call and pci_for_each_dma_alias(). Inside clone_alias(), retrieve the original device from data and compute devid from it. Fixes: 3332364e4ebc ("iommu/amd: Support multiple PCI DMA aliases in device table") Signed-off-by: Vasant Hegde Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin

iommu/amd: serialize sequence allocation under concurrent TLB invalidations

2026-03-04T12:20:43+00:00

[ Upstream commit 9e249c48412828e807afddc21527eb734dc9bd3d ] With concurrent TLB invalidations, completion wait randomly gets timed out because cmd_sem_val was incremented outside the IOMMU spinlock, allowing CMD_COMPL_WAIT commands to be queued out of sequence and breaking the ordering assumption in wait_on_sem(). Move the cmd_sem_val increment under iommu->lock so completion sequence allocation is serialized with command queuing. And remove the unnecessary return. Fixes: d2a0cac10597 ("iommu/amd: move wait_on_sem() out of spinlock") Tested-by: Srikanth Aithal Reported-by: Srikanth Aithal Signed-off-by: Ankit Soni Reviewed-by: Vasant Hegde Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin

iommu/amd: move wait_on_sem() out of spinlock

2026-03-04T12:20:13+00:00

[ Upstream commit d2a0cac10597068567d336e85fa3cbdbe8ca62bf ] With iommu.strict=1, the existing completion wait path can cause soft lockups under stressed environment, as wait_on_sem() busy-waits under the spinlock with interrupts disabled. Move the completion wait in iommu_completion_wait() out of the spinlock. wait_on_sem() only polls the hardware-updated cmd_sem and does not require iommu->lock, so holding the lock during the busy wait unnecessarily increases contention and extends the time with interrupts disabled. Signed-off-by: Ankit Soni Reviewed-by: Vasant Hegde Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin

iommu/amd: Use core's primary handler and set IRQF_ONESHOT

2026-02-26T22:59:05+00:00

[ Upstream commit 5bfcdccb4d18d3909b7f87942be67fd6bdc00c1d ] request_threaded_irq() is invoked with a primary and a secondary handler and no flags are passed. The primary handler is the same as irq_default_primary_handler() so there is no need to have an identical copy. The lack of the IRQF_ONESHOT can be dangerous because the interrupt source is not masked while the threaded handler is active. This means, especially on LEVEL typed interrupt lines, the interrupt can fire again before the threaded handler had a chance to run. Use the default primary interrupt handler by specifying NULL and set IRQF_ONESHOT so the interrupt source is masked until the secondary handler is done. Fixes: 72fe00f01f9a3 ("x86/amd-iommu: Use threaded interupt handler") Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Thomas Gleixner Link: https://patch.msgid.link/20260128095540.863589-4-bigeasy@linutronix.de Signed-off-by: Sasha Levin

iommu/amd: Fix error path in amd_iommu_probe_device()

2026-01-30T09:32:17+00:00

[ Upstream commit 3222b6de5145272c43a90cb8667377d676635ea0 ] Currently, the error path of amd_iommu_probe_device() unconditionally references dev_data, which may not be initialized if an early failure occurs (like iommu_init_device() fails). Move the out_err label to ensure the function exits immediately on failure without accessing potentially uninitialized dev_data. Fixes: 19e5cc156cb ("iommu/amd: Enable support for up to 2K interrupts per function") Cc: Rakuram Eswaran Cc: Jörg Rödel Reported-by: kernel test robot Reported-by: Dan Carpenter Closes: https://lore.kernel.org/r/202512191724.meqJENXe-lkp@intel.com/ Signed-off-by: Vasant Hegde Signed-off-by: Joerg Roedel Signed-off-by: Sasha Levin

iommu/amd: Propagate the error code returned by __modify_irte_ga() in modify_irte_ga()

2026-01-08T09:16:55+00:00

commit 2381a1b40be4b286062fb3cf67dd7f005692aa2a upstream. The return type of __modify_irte_ga() is int, but modify_irte_ga() treats it as a bool. Casting the int to bool discards the error code. To fix the issue, change the type of ret to int in modify_irte_ga(). Fixes: 57cdb720eaa5 ("iommu/amd: Do not flush IRTE when only updating isRun and destination fields") Cc: stable@vger.kernel.org Signed-off-by: Jinhui Guo Reviewed-by: Vasant Hegde Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman

iommu/amd: Fix pci_segment memleak in alloc_pci_segment()

2026-01-08T09:16:55+00:00

commit 75ba146c2674ba49ed8a222c67f9abfb4a4f2a4f upstream. Fix a memory leak of struct amd_iommu_pci_segment in alloc_pci_segment() when system memory (or contiguous memory) is insufficient. Fixes: 04230c119930 ("iommu/amd: Introduce per PCI segment device table") Fixes: eda797a27795 ("iommu/amd: Introduce per PCI segment rlookup table") Fixes: 99fc4ac3d297 ("iommu/amd: Introduce per PCI segment alias_table") Cc: stable@vger.kernel.org Signed-off-by: Jinhui Guo Signed-off-by: Joerg Roedel Signed-off-by: Greg Kroah-Hartman