<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/vfio_pci_core.h, branch v6.19.11</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.11'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-12-23T21:07:08+00:00</updated>
<entry>
<title>vfio/pci: Disable qword access to the PCI ROM bar</title>
<updated>2025-12-23T21:07:08+00:00</updated>
<author>
<name>Kevin Tian</name>
<email>kevin.tian@intel.com</email>
</author>
<published>2025-12-18T08:16:49+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=dc85a46928c41423ad89869baf05a589e2975575'/>
<id>urn:sha1:dc85a46928c41423ad89869baf05a589e2975575</id>
<content type='text'>
Commit 2b938e3db335 ("vfio/pci: Enable iowrite64 and ioread64 for vfio
pci") enables qword access to the PCI BAR resources. However, certain
devices (e.g. the Intel X710) are observed to misbehave upon qword
accesses to the ROM BAR, e.g. triggering PCI AER errors.

This is triggered by QEMU, which caches the ROM content by simply
doing a pread() of the remaining size until it gets the full contents.
The other BARs only see operations at the same access width as their
guest drivers use.

Instead of trying to identify all broken devices, universally disable
qword access to the ROM BAR, i.e. go back to the old way, which worked
reliably for years.
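
Roughly, the fix can be pictured as the sketch below; the helper name
and the call site in the comment are illustrative, not the exact
upstream diff:

  /* Hypothetical helper: cap the maximum MMIO access width per BAR.
   * The ROM BAR is limited to dword accesses; other BARs keep qword.
   */
  static size_t vfio_pci_max_rw_width(int bar)
  {
          return bar == PCI_ROM_RESOURCE ? 4 : 8;
  }

  /* The BAR read/write loop would then clamp each chunk, e.g.:
   *     fillable = min(fillable, vfio_pci_max_rw_width(bar));
   */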

Reported-by: Farrah Chen &lt;farrah.chen@intel.com&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220740
Fixes: 2b938e3db335 ("vfio/pci: Enable iowrite64 and ioread64 for vfio pci")
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Tested-by: Farrah Chen &lt;farrah.chen@intel.com&gt;
Link: https://lore.kernel.org/r/20251218081650.555015-2-kevin.tian@intel.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd</title>
<updated>2025-12-05T02:50:11+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-05T02:50:11+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=056daec2925dc200b22c30419bc7b9e01f7843c4'/>
<id>urn:sha1:056daec2925dc200b22c30419bc7b9e01f7843c4</id>
<content type='text'>
Pull iommufd updates from Jason Gunthorpe:
 "This is a pretty consequential cycle for iommufd, though this pull is
  not too big. It is based on a shared branch with VFIO that introduces
  VFIO_DEVICE_FEATURE_DMA_BUF, a DMABUF exporter for a VFIO device's
  MMIO PCI BARs. This was a large, multi-series journey over the last
  year and a half.

  Based on that work, IOMMUFD gains support for VFIO DMABUFs in its
  existing IOMMU_IOAS_MAP_FILE, which closes the last major gap to
  supporting PCI peer-to-peer transfers within VMs.

  In Joerg's iommu tree we have the "generic page table" work which aims
  to consolidate all the duplicated page table code in every iommu
  driver into a single algorithm. This will be used by iommufd to
  implement unique page table operations to start adding new features
  and improve performance.

  In here:

   - Expand IOMMU_IOAS_MAP_FILE to accept a DMABUF exported from VFIO.
     This is the first step to broader DMABUF support in iommufd; right
     now it only works with VFIO. This closes the last functional gap
     with classic VFIO type 1 to safely support PCI peer-to-peer DMA by
     mapping the VFIO device's MMIO into the IOMMU.

   - Relax SMMUv3 restrictions on nesting domains to better support
     qemu's sequence to have an identity mapping before the vSID is
     established"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommu/arm-smmu-v3-iommufd: Allow attaching nested domain for GBPA cases
  iommufd/selftest: Add some tests for the dmabuf flow
  iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
  iommufd: Have iopt_map_file_pages convert the fd to a file
  iommufd: Have pfn_reader process DMABUF iopt_pages
  iommufd: Allow MMIO pages in a batch
  iommufd: Allow a DMABUF to be revoked
  iommufd: Do not map/unmap revoked DMABUFs
  iommufd: Add DMABUF to iopt_pages
  vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
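
The IOMMU_IOAS_MAP_FILE expansion above can be exercised from
userspace roughly as in the sketch below. It assumes the iommufd, the
IOAS and a dmabuf fd exported by VFIO were obtained beforehand, and it
relies on the existing IOMMU_IOAS_MAP_FILE uAPI layout:

  #include &lt;sys/ioctl.h&gt;
  #include &lt;linux/types.h&gt;
  #include &lt;linux/iommufd.h&gt;

  static int map_vfio_dmabuf(int iommufd, __u32 ioas_id, int dmabuf_fd,
                             __u64 length, __u64 iova)
  {
          struct iommu_ioas_map_file map = {
                  .size = sizeof(map),
                  .flags = IOMMU_IOAS_MAP_FIXED_IOVA |
                           IOMMU_IOAS_MAP_READABLE |
                           IOMMU_IOAS_MAP_WRITEABLE,
                  .ioas_id = ioas_id,
                  .fd = dmabuf_fd,
                  .start = 0,          /* offset into the dmabuf */
                  .length = length,
                  .iova = iova,
          };

          return ioctl(iommufd, IOMMU_IOAS_MAP_FILE, &amp;map);
  }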
</content>
</entry>
<entry>
<title>vfio/nvgrace-gpu: wait for the GPU mem to be ready</title>
<updated>2025-11-28T17:09:26+00:00</updated>
<author>
<name>Ankit Agrawal</name>
<email>ankita@nvidia.com</email>
</author>
<published>2025-11-27T17:06:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a23b10608d420346e5af7eda6c46726a61572469'/>
<id>urn:sha1:a23b10608d420346e5af7eda6c46726a61572469</id>
<content type='text'>
Speculative prefetches from the CPU to GPU memory before the GPU is
ready after a reset can cause harmless corrected RAS events to be
logged on Grace systems. It is thus preferred that the mapping not
be re-established until the GPU is ready post reset.

GPU readiness can be checked through BAR0 registers, similar to the
check done at device probe time.

It can take several seconds for the GPU to be ready, so it is
desirable that this time overlap as much of the VM startup as
possible to reduce the impact on VM boot time. The GPU readiness
state is thus checked on the first fault/huge_fault request or
read/write access, which amortizes the GPU readiness time.

The first fault and read/write access check the GPU state when the
reset_done flag - which denotes that the GPU has just been
reset - is set. The memory_lock is taken across the map/access to
avoid races with GPU reset.

Also check that memory is enabled before waiting for the GPU to be
ready; otherwise the readiness check would block for 30s.

Lastly, add PM handling around read/write access.
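
Roughly, the resulting pattern looks like the sketch below. The helper
names, the readiness register access and the flag placement are
placeholders for illustration, not the actual nvgrace-gpu code:

  /* Called on the first huge_fault or read/write access after reset. */
  static int nvgrace_gpu_wait_device_ready(
                  struct nvgrace_gpu_pci_core_device *nvdev)
  {
          struct vfio_pci_core_device *vdev = &amp;nvdev-&gt;core_device;
          int ret = 0;

          down_read(&amp;vdev-&gt;memory_lock);

          /* Without memory enabled the readiness poll would only time
           * out (~30s), so bail out early in that case.
           */
          if (!nvgrace_gpu_memory_enabled(nvdev)) {    /* placeholder */
                  ret = -EIO;
                  goto unlock;
          }

          if (nvdev-&gt;reset_done) {
                  /* Poll the BAR0 readiness register as done at probe
                   * time, with a multi-second timeout (placeholder).
                   */
                  ret = nvgrace_gpu_poll_bar0_ready(nvdev);
                  if (!ret)
                          nvdev-&gt;reset_done = false;
          }
  unlock:
          up_read(&amp;vdev-&gt;memory_lock);
          return ret;
  }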

Cc: Shameer Kolothum &lt;skolothumtho@nvidia.com&gt;
Cc: Alex Williamson &lt;alex@shazbot.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Vikram Sethi &lt;vsethi@nvidia.com&gt;
Reviewed-by: Shameer Kolothum &lt;skolothumtho@nvidia.com&gt;
Suggested-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Ankit Agrawal &lt;ankita@nvidia.com&gt;
Link: https://lore.kernel.org/r/20251127170632.3477-7-ankita@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio: refactor vfio_pci_mmap_huge_fault function</title>
<updated>2025-11-28T17:07:25+00:00</updated>
<author>
<name>Ankit Agrawal</name>
<email>ankita@nvidia.com</email>
</author>
<published>2025-11-27T17:06:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9b92bc7554b543dc00a0a0b62904a9ef2ad5c4b0'/>
<id>urn:sha1:9b92bc7554b543dc00a0a0b62904a9ef2ad5c4b0</id>
<content type='text'>
Refactor vfio_pci_mmap_huge_fault to pull the code that maps the
VMA to the PTE/PMD/PUD out into a separate function.

Export the new function for use by the nvgrace-gpu module.

Move the alignment check code, which verifies that the pfn and the
VMA VA are aligned to the page order, into the header file and make
it inline.

No functional change is intended.
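
Such an inline helper could look roughly like the sketch below,
mirroring the existing alignment check in the huge_fault path (the
name and exact signature are illustrative):

  /* Check that the faulting VA and the pfn are aligned to the mapping
   * order and that the mapping does not run past the end of the VMA.
   */
  static inline bool vfio_pci_fault_aligned(struct vm_fault *vmf,
                                            unsigned long pfn,
                                            unsigned int order)
  {
          if (!order)
                  return true;

          return !((vmf-&gt;address &amp; ((PAGE_SIZE &lt;&lt; order) - 1)) ||
                   (pfn &amp; ((1UL &lt;&lt; order) - 1)) ||
                   (vmf-&gt;address + (PAGE_SIZE &lt;&lt; order) &gt;
                    vmf-&gt;vma-&gt;vm_end));
  }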

Cc: Shameer Kolothum &lt;skolothumtho@nvidia.com&gt;
Cc: Alex Williamson &lt;alex@shazbot.org&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Reviewed-by: Shameer Kolothum &lt;skolothumtho@nvidia.com&gt;
Signed-off-by: Ankit Agrawal &lt;ankita@nvidia.com&gt;
Link: https://lore.kernel.org/r/20251127170632.3477-2-ankita@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio/pci: Use RCU for error/request triggers to avoid circular locking</title>
<updated>2025-11-28T17:04:27+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex.williamson@nvidia.com</email>
</author>
<published>2025-11-24T22:36:22+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=98693e0897f754e3f51ce6626ed5f785f625ba2b'/>
<id>urn:sha1:98693e0897f754e3f51ce6626ed5f785f625ba2b</id>
<content type='text'>
Thanks to a device generating an ACS violation during bus reset,
lockdep reported the following circular locking issue:

CPU0: SET_IRQS (MSI/X): holds igate, acquires memory_lock
CPU1: HOT_RESET: holds memory_lock, acquires pci_bus_sem
CPU2: AER: holds pci_bus_sem, acquires igate

This results in a potential 3-way deadlock.

Remove the pci_bus_sem-&gt;igate leg of the triangle by using RCU
to peek at the eventfd rather than locking it with igate.
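
The peek side then reduces to roughly the following; the field naming
and the __rcu annotation are assumptions for this sketch:

  /* Signal the error eventfd, if any, without taking igate. */
  static void vfio_pci_signal_err_trigger(struct vfio_pci_core_device *vdev)
  {
          struct eventfd_ctx *ctx;

          rcu_read_lock();
          ctx = rcu_dereference(vdev-&gt;err_trigger);
          if (ctx)
                  eventfd_signal(ctx);
          rcu_read_unlock();
  }

  /* The SET_IRQS path still serializes on igate and publishes a new
   * ctx with rcu_assign_pointer(), waiting for readers before the old
   * ctx is put.
   */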

Fixes: 3be3a074cf5b ("vfio-pci: Don't use device_lock around AER interrupt setup")
Signed-off-by: Alex Williamson &lt;alex.williamson@nvidia.com&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Link: https://lore.kernel.org/r/20251124223623.2770706-1-alex@shazbot.org
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio/pci: Add vfio_pci_dma_buf_iommufd_map()</title>
<updated>2025-11-25T15:30:15+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2025-11-21T15:50:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=96ce2aeb15bd8672ab47abe547e2a1f8ba3886ff'/>
<id>urn:sha1:96ce2aeb15bd8672ab47abe547e2a1f8ba3886ff</id>
<content type='text'>
This function is used to establish the "private interconnect" between the
VFIO DMABUF exporter and the iommufd DMABUF importer. This is intended to
be a temporary API until the core DMABUF interface is improved to natively
support a private interconnect and revocable negotiation.

This function should only be called by iommufd when trying to map a
DMABUF. For now iommufd will only support VFIO DMABUFs.

The following improvements are needed in the DMABUF API to generically
support more exporters with iommufd/kvm type importers that cannot use the
DMA API:

 1) Revoke semantics. VFIO needs to be able to prevent access to the MMIO
    during FLR, and so it will use dma_buf_move_notify() to prevent
    access. iommufd does not support fault handling, so it cannot
    implement the full move_notify. Instead, if revoke is negotiated, the
    exporter promises not to use move_notify() unless the importer can
    experience failures. iommufd will unmap the dmabuf from the iommu page
    tables while it is revoked.

 2) Private interconnect negotiation. iommufd will only be able to map
    a "private interconnect" that provides a phys_addr_t and a
    struct p2pdma_provider * to describe the memory. It cannot use a DMA
    mapped scatterlist since it is directly calling iommu_map().

 3) NULL device during dma_buf_dynamic_attach(). Since iommufd doesn't use
    the DMA API it doesn't have a DMAable struct device to pass here.
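
As an illustration of point 3), an iommufd-style importer attach might
eventually look like the sketch below; the importer struct and the
unmap helper are hypothetical:

  /* With revoke semantics, move_notify only needs to tear the mapping
   * out of the IOMMU page tables; no restartable fault handling.
   */
  static void iommufd_dmabuf_move_notify(struct dma_buf_attachment *attach)
  {
          struct my_importer *imp = attach-&gt;importer_priv;

          my_importer_unmap_from_iommu(imp);    /* placeholder */
  }

  static const struct dma_buf_attach_ops iommufd_dmabuf_attach_ops = {
          .allow_peer2peer = true,
          .move_notify = iommufd_dmabuf_move_notify,
  };

  static struct dma_buf_attachment *
  iommufd_attach_vfio_dmabuf(struct dma_buf *dmabuf, struct my_importer *imp)
  {
          /* Point 3): iommufd has no DMAable struct device to offer. */
          return dma_buf_dynamic_attach(dmabuf, NULL,
                                        &amp;iommufd_dmabuf_attach_ops, imp);
  }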

Link: https://patch.msgid.link/r/1-v2-b2c110338e3f+5c2-iommufd_dmabuf_jgg@nvidia.com
Reviewed-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Tested-by: Nicolin Chen &lt;nicolinc@nvidia.com&gt;
Tested-by: Shuai Xue &lt;xueshuai@linux.alibaba.com&gt;
Acked-by: Alex Williamson &lt;alex@shazbot.org&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'vfio-v6.19-dma-buf-v9+' into v6.19/vfio/next</title>
<updated>2025-11-21T04:20:00+00:00</updated>
<author>
<name>Alex Williamson</name>
<email>alex@shazbot.org</email>
</author>
<published>2025-11-21T04:20:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fa804aa4ac1b091ef2ec2981f08a1c28aaeba8e7'/>
<id>urn:sha1:fa804aa4ac1b091ef2ec2981f08a1c28aaeba8e7</id>
<content type='text'>
[v9] vfio/pci: Allow MMIO regions to be exported through dma-buf

https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.com

Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio/pci: Add dma-buf export support for MMIO regions</title>
<updated>2025-11-21T04:12:19+00:00</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2025-11-20T09:28:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5d74781ebc86c5fa9e9d6934024c505412de9b52'/>
<id>urn:sha1:5d74781ebc86c5fa9e9d6934024c505412de9b52</id>
<content type='text'>
Add support for exporting PCI device MMIO regions through dma-buf,
enabling safe sharing of non-struct page memory with controlled
lifetime management. This allows RDMA and other subsystems to import
dma-buf FDs and build them into memory regions for PCI P2P operations.

The implementation provides a revocable attachment mechanism using
dma-buf move operations. MMIO regions are normally pinned as BARs
don't change physical addresses, but access is revoked when the VFIO
device is closed or a PCI reset is issued. This ensures kernel
self-defense against potentially hostile userspace.

Currently VFIO can take MMIO regions from the device's BAR and map
them into a PFNMAP VMA with special PTEs. This mapping type ensures
the memory cannot be used with things like pin_user_pages(), hmm, and
so on. In practice only the user process CPU and KVM can safely make
use of these VMAs. When VFIO shuts down, these VMAs are cleaned up by
unmap_mapping_range() to prevent any UAF of the MMIO beyond driver
unbind.

However, VFIO type 1 has an insecure behavior where it uses
follow_pfnmap_*() to fish an MMIO PFN out of a VMA and program it back
into the IOMMU. This has a long history of enabling P2P DMA inside
VMs, but has serious lifetime problems by allowing a UAF of the MMIO
after the VFIO driver has been unbound.

Introduce DMABUF as a new safe way to export a FD based handle for the
MMIO regions. This can be consumed by existing DMABUF importers like
RDMA or DRM without opening a UAF. A following series will add an
importer to iommufd to obsolete the type 1 code and allow safe
UAF-free MMIO P2P in VM cases.

DMABUF has a built in synchronous invalidation mechanism called
move_notify. VFIO keeps track of all drivers importing its MMIO and
can invoke a synchronous invalidation callback to tell the importing
drivers to DMA unmap and forget about the MMIO pfns. This process is
being called revoke. This synchronous invalidation fully prevents any
lifecycle problems. VFIO will do this before unbinding its driver
ensuring there is no UAF of the MMIO beyond the driver lifecycle.
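
The revoke step can be pictured roughly as below; the per-export
bookkeeping (struct name, list and flag) is illustrative rather than
the exact upstream data structures:

  /* Revoke every dmabuf exported from this device, e.g. before driver
   * unbind or around a reset; move_notify() tells importers to unmap.
   */
  static void vfio_pci_dma_buf_revoke_all(struct vfio_pci_core_device *vdev)
  {
          struct vfio_pci_dma_buf *priv;    /* illustrative struct */

          list_for_each_entry(priv, &amp;vdev-&gt;dmabufs, dmabufs_elm) {
                  dma_resv_lock(priv-&gt;dmabuf-&gt;resv, NULL);
                  priv-&gt;revoked = true;
                  dma_buf_move_notify(priv-&gt;dmabuf);
                  dma_resv_unlock(priv-&gt;dmabuf-&gt;resv);
          }
  }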

Further, VFIO has additional behavior to block access to the MMIO
during things like Function Level Reset. This is because some poor
platforms may experience an MCE-type crash when touching the MMIO of a PCI
device that is undergoing a reset. Today this is done by using
unmap_mapping_range() on the VMAs. Extend that into the DMABUF world
and temporarily revoke the MMIO from the DMABUF importers during FLR
as well. This will more robustly prevent an errant P2P from possibly
upsetting the platform.

A DMABUF FD is a preferred handle for MMIO compared to using something
like a pgmap because:
 - VFIO is supported, including its P2P feature, on archs that don't
   support pgmap
 - PCI devices have all sorts of BAR sizes, including ones smaller
   than a section so a pgmap cannot always be created
 - It is undesirable to waste a lot of memory for struct pages,
   especially for a case like a GPU with ~100GB of BAR size
 - We want a synchronous revoke semantic to support FLR with light
   hardware requirements

Use the P2P subsystem to help generate the DMA mapping. This is a
significant upgrade over the abuse of dma_map_resource() that has
historically been used by DMABUF exporters. Experience with an OOT
version of this patch shows that real systems do need this. This
approach deals with all the P2P scenarios:
 - Non-zero PCI bus_offset
 - ACS flags routing traffic to the IOMMU
 - ACS flags that bypass the IOMMU - though vfio noiommu is required
   to hit this.

There will be further work to formalize the revoke semantic in
DMABUF. For now this acts like a move_notify dynamic exporter where
importer fault handling will get a failure when they attempt to map.
This means that only fully restartable fault capable importers can
import the VFIO DMABUFs. A future revoke semantic should open this up
to more HW as the HW only needs to invalidate, not handle restartable
faults.

Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Signed-off-by: Vivek Kasireddy &lt;vivek.kasireddy@intel.com&gt;
Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Acked-by: Ankit Agrawal &lt;ankita@nvidia.com&gt;
Link: https://lore.kernel.org/r/20251120-dmabuf-vfio-v9-10-d7f71607f371@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio/pci: Convert all PCI drivers to get_region_info_caps</title>
<updated>2025-11-12T22:06:40+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2025-11-07T17:41:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=1b0ecb5baf4af3baa8627144bbcf9848806aa5f1'/>
<id>urn:sha1:1b0ecb5baf4af3baa8627144bbcf9848806aa5f1</id>
<content type='text'>
Since the core function signature changes, the change has to flow up
to all the drivers.

Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Pranjal Shrivastava &lt;praan@google.com&gt;
Reviewed-by: Brett Creeley &lt;brett.creeley@amd.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Link: https://lore.kernel.org/r/19-v2-2a9e24d62f1b+e10a-vfio_get_region_info_op_jgg@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
<entry>
<title>vfio: Provide a get_region_info op</title>
<updated>2025-11-12T21:58:23+00:00</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@nvidia.com</email>
</author>
<published>2025-11-07T17:41:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=113557b0406818a8a5df3479b0a89125d2b2a04c'/>
<id>urn:sha1:113557b0406818a8a5df3479b0a89125d2b2a04c</id>
<content type='text'>
Instead of hooking the general ioctl op, have the core code directly
decode VFIO_DEVICE_GET_REGION_INFO and call an op just for it.

This is intended to allow mechanical changes to the drivers to pull
their VFIO_DEVICE_GET_REGION_INFO handling into a function. Later
patches will improve the function signature to consolidate more code.
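
As an illustration, the core-side dispatch could be shaped like the
sketch below; the op signature and the capability-chain handling are
assumptions here, the real prototype is defined by this series:

  static int vfio_ioctl_device_get_region_info(struct vfio_device *device,
                                               void __user *arg)
  {
          struct vfio_region_info info;
          unsigned long minsz = offsetofend(struct vfio_region_info, offset);
          int ret;

          if (copy_from_user(&amp;info, arg, minsz))
                  return -EFAULT;
          if (info.argsz &lt; minsz)
                  return -EINVAL;

          if (!device-&gt;ops-&gt;get_region_info)
                  return -ENOTTY;

          ret = device-&gt;ops-&gt;get_region_info(device, &amp;info);
          if (ret)
                  return ret;

          return copy_to_user(arg, &amp;info, minsz) ? -EFAULT : 0;
  }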

Reviewed-by: Kevin Tian &lt;kevin.tian@intel.com&gt;
Reviewed-by: Pranjal Shrivastava &lt;praan@google.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Link: https://lore.kernel.org/r/1-v2-2a9e24d62f1b+e10a-vfio_get_region_info_op_jgg@nvidia.com
Signed-off-by: Alex Williamson &lt;alex@shazbot.org&gt;
</content>
</entry>
</feed>
