summaryrefslogtreecommitdiff
path: root/Documentation/vfio.txt
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2013-07-04 21:29:23 +0400
committerLinus Torvalds <torvalds@linux-foundation.org>2013-07-04 21:29:23 +0400
commit65b97fb7303050fc826e518cf67fc283da23314f (patch)
tree595e7f04d65d95a39d65bd2dcf2385b3b6ea7969 /Documentation/vfio.txt
parentddcf6600b133697adbafd96e080818bdc0dfd028 (diff)
parent1d8b368ab4aacfc3f864655baad4d31a3028ec1a (diff)
downloadlinux-65b97fb7303050fc826e518cf67fc283da23314f.tar.xz
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Ben Herrenschmidt: "This is the powerpc changes for the 3.11 merge window. In addition to the usual bug fixes and small updates, the main highlights are: - Support for transparent huge pages by Aneesh Kumar for 64-bit server processors. This allows the use of 16M pages as transparent huge pages on kernels compiled with a 64K base page size. - Base VFIO support for KVM on power by Alexey Kardashevskiy - Wiring up of our nvram to the pstore infrastructure, including putting compressed oopses in there by Aruna Balakrishnaiah - Move, rework and improve our "EEH" (basically PCI error handling and recovery) infrastructure. It is no longer specific to pseries but is now usable by the new "powernv" platform as well (no hypervisor) by Gavin Shan. - I fixed some bugs in our math-emu instruction decoding and made it usable to emulate some optional FP instructions on processors with hard FP that lack them (such as fsqrt on Freescale embedded processors). - Support for Power8 "Event Based Branch" facility by Michael Ellerman. This facility allows what is basically "userspace interrupts" for performance monitor events. - A bunch of Transactional Memory vs. Signals bug fixes and HW breakpoint/watchpoint fixes by Michael Neuling. And more ... I appologize in advance if I've failed to highlight something that somebody deemed worth it." * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits) pstore: Add hsize argument in write_buf call of pstore_ftrace_call powerpc/fsl: add MPIC timer wakeup support powerpc/mpic: create mpic subsystem object powerpc/mpic: add global timer support powerpc/mpic: add irq_set_wake support powerpc/85xx: enable coreint for all the 64bit boards powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig powerpc/mpic: Add get_version API both for internal and external use powerpc: Handle both new style and old style reserve maps powerpc/hw_brk: Fix off by one error when validating DAWR region end powerpc/pseries: Support compression of oops text via pstore powerpc/pseries: Re-organise the oops compression code pstore: Pass header size in the pstore write callback powerpc/powernv: Fix iommu initialization again powerpc/pseries: Inform the hypervisor we are using EBB regs powerpc/perf: Add power8 EBB support powerpc/perf: Core EBB support for 64-bit book3s powerpc/perf: Drop MMCRA from thread_struct powerpc/perf: Don't enable if we have zero events ...
Diffstat (limited to 'Documentation/vfio.txt')
-rw-r--r--Documentation/vfio.txt63
1 files changed, 63 insertions, 0 deletions
diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index 8eda3635a17d..c55533c0adb3 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -283,6 +283,69 @@ a direct pass through for VFIO_DEVICE_* ioctls. The read/write/mmap
interfaces implement the device region access defined by the device's
own VFIO_DEVICE_GET_REGION_INFO ioctl.
+
+PPC64 sPAPR implementation note
+-------------------------------------------------------------------------------
+
+This implementation has some specifics:
+
+1) Only one IOMMU group per container is supported as an IOMMU group
+represents the minimal entity which isolation can be guaranteed for and
+groups are allocated statically, one per a Partitionable Endpoint (PE)
+(PE is often a PCI domain but not always).
+
+2) The hardware supports so called DMA windows - the PCI address range
+within which DMA transfer is allowed, any attempt to access address space
+out of the window leads to the whole PE isolation.
+
+3) PPC64 guests are paravirtualized but not fully emulated. There is an API
+to map/unmap pages for DMA, and it normally maps 1..32 pages per call and
+currently there is no way to reduce the number of calls. In order to make things
+faster, the map/unmap handling has been implemented in real mode which provides
+an excellent performance which has limitations such as inability to do
+locked pages accounting in real time.
+
+So 3 additional ioctls have been added:
+
+ VFIO_IOMMU_SPAPR_TCE_GET_INFO - returns the size and the start
+ of the DMA window on the PCI bus.
+
+ VFIO_IOMMU_ENABLE - enables the container. The locked pages accounting
+ is done at this point. This lets user first to know what
+ the DMA window is and adjust rlimit before doing any real job.
+
+ VFIO_IOMMU_DISABLE - disables the container.
+
+
+The code flow from the example above should be slightly changed:
+
+ .....
+ /* Add the group to the container */
+ ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
+
+ /* Enable the IOMMU model we want */
+ ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU)
+
+ /* Get addition sPAPR IOMMU info */
+ vfio_iommu_spapr_tce_info spapr_iommu_info;
+ ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &spapr_iommu_info);
+
+ if (ioctl(container, VFIO_IOMMU_ENABLE))
+ /* Cannot enable container, may be low rlimit */
+
+ /* Allocate some space and setup a DMA mapping */
+ dma_map.vaddr = mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE,
+ MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
+
+ dma_map.size = 1024 * 1024;
+ dma_map.iova = 0; /* 1MB starting at 0x0 from device view */
+ dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
+
+ /* Check here is .iova/.size are within DMA window from spapr_iommu_info */
+
+ ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map);
+ .....
+
-------------------------------------------------------------------------------
[1] VFIO was originally an acronym for "Virtual Function I/O" in its