diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2013-07-04 21:29:23 +0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2013-07-04 21:29:23 +0400 |
commit | 65b97fb7303050fc826e518cf67fc283da23314f (patch) | |
tree | 595e7f04d65d95a39d65bd2dcf2385b3b6ea7969 /Documentation | |
parent | ddcf6600b133697adbafd96e080818bdc0dfd028 (diff) | |
parent | 1d8b368ab4aacfc3f864655baad4d31a3028ec1a (diff) | |
download | linux-65b97fb7303050fc826e518cf67fc283da23314f.tar.xz |
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc updates from Ben Herrenschmidt:
"This is the powerpc changes for the 3.11 merge window. In addition to
the usual bug fixes and small updates, the main highlights are:
- Support for transparent huge pages by Aneesh Kumar for 64-bit
server processors. This allows the use of 16M pages as transparent
huge pages on kernels compiled with a 64K base page size.
- Base VFIO support for KVM on power by Alexey Kardashevskiy
- Wiring up of our nvram to the pstore infrastructure, including
putting compressed oopses in there by Aruna Balakrishnaiah
- Move, rework and improve our "EEH" (basically PCI error handling
and recovery) infrastructure. It is no longer specific to pseries
but is now usable by the new "powernv" platform as well (no
hypervisor) by Gavin Shan.
- I fixed some bugs in our math-emu instruction decoding and made it
usable to emulate some optional FP instructions on processors with
hard FP that lack them (such as fsqrt on Freescale embedded
processors).
- Support for Power8 "Event Based Branch" facility by Michael
Ellerman. This facility allows what is basically "userspace
interrupts" for performance monitor events.
- A bunch of Transactional Memory vs. Signals bug fixes and HW
breakpoint/watchpoint fixes by Michael Neuling.
And more ... I appologize in advance if I've failed to highlight
something that somebody deemed worth it."
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
pstore: Add hsize argument in write_buf call of pstore_ftrace_call
powerpc/fsl: add MPIC timer wakeup support
powerpc/mpic: create mpic subsystem object
powerpc/mpic: add global timer support
powerpc/mpic: add irq_set_wake support
powerpc/85xx: enable coreint for all the 64bit boards
powerpc/8xx: Erroneous double irq_eoi() on CPM IRQ in MPC8xx
powerpc/fsl: Enable CONFIG_E1000E in mpc85xx_smp_defconfig
powerpc/mpic: Add get_version API both for internal and external use
powerpc: Handle both new style and old style reserve maps
powerpc/hw_brk: Fix off by one error when validating DAWR region end
powerpc/pseries: Support compression of oops text via pstore
powerpc/pseries: Re-organise the oops compression code
pstore: Pass header size in the pstore write callback
powerpc/powernv: Fix iommu initialization again
powerpc/pseries: Inform the hypervisor we are using EBB regs
powerpc/perf: Add power8 EBB support
powerpc/perf: Core EBB support for 64-bit book3s
powerpc/perf: Drop MMCRA from thread_struct
powerpc/perf: Don't enable if we have zero events
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/devicetree/bindings/powerpc/fsl/interlaken-lac.txt | 309 | ||||
-rw-r--r-- | Documentation/powerpc/00-INDEX | 2 | ||||
-rw-r--r-- | Documentation/powerpc/pmu-ebb.txt | 137 | ||||
-rw-r--r-- | Documentation/vfio.txt | 63 |
4 files changed, 511 insertions, 0 deletions
diff --git a/Documentation/devicetree/bindings/powerpc/fsl/interlaken-lac.txt b/Documentation/devicetree/bindings/powerpc/fsl/interlaken-lac.txt new file mode 100644 index 000000000000..641bc13983e1 --- /dev/null +++ b/Documentation/devicetree/bindings/powerpc/fsl/interlaken-lac.txt @@ -0,0 +1,309 @@ +=============================================================================== +Freescale Interlaken Look-Aside Controller Device Bindings +Copyright 2012 Freescale Semiconductor Inc. + +CONTENTS + - Interlaken Look-Aside Controller (LAC) Node + - Example LAC Node + - Interlaken Look-Aside Controller (LAC) Software Portal Node + - Interlaken Look-Aside Controller (LAC) Software Portal Child Nodes + - Example LAC SWP Node with Child Nodes + +============================================================================== +Interlaken Look-Aside Controller (LAC) Node + +DESCRIPTION + +The Interlaken is a narrow, high speed channelized chip-to-chip interface. To +facilitate interoperability between a data path device and a look-aside +co-processor, the Interlaken Look-Aside protocol is defined for short +transaction-related transfers. Although based on the Interlaken protocol, +Interlaken Look-Aside is not directly compatible with Interlaken and can be +considered a different operation mode. + +The Interlaken LA controller connects internal platform to Interlaken serial +interface. It accepts LA command through software portals, which are system +memory mapped 4KB spaces. The LA commands are then translated into the +Interlaken control words and data words, which are sent on TX side to TCAM +through SerDes lanes. + +There are two 4KiB spaces defined within the LAC global register memory map. +There is a full register set at 0x0000-0x0FFF (also known as the "hypervisor" +version), and a subset at 0x1000-0x1FFF. The former is a superset of the +latter, and includes certain registers that should not be accessible to +partitioned software. Separate nodes are used for each region, with a phandle +linking the hypervisor node to the normal operating node. + +PROPERTIES + + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,interlaken-lac". This represents only + those LAC CCSR registers not protected in partitioned + software. The version of the device is determined by the LAC + IP Block Revision Register (IPBRR0) at offset 0x0BF8. + + Table of correspondences between IPBRR0 values and example + chips: + Value Device + ----------- ------- + 0x02000100 T4240 + + The Hypervisor node has a different compatible. It must include + "fsl,interlaken-lac-hv". This node represents the protected + LAC register space and is required except inside a partition + where access to the hypervisor node is to be denied. + + - fsl,non-hv-node + Usage: required in "fsl,interlaken-lac-hv" + Value type: <phandle> + Definition: Points to the non-protected LAC CCSR mapped register space + node. + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. The first resource represents the + Interlaken LAC configuration registers. + + - interrupts: + Usage: required in non-hv node only + Value type: <prop-encoded-array> + Definition: Interrupt mapping for Interlaken LAC error IRQ. + +EXAMPLE + lac: lac@229000 { + compatible = "fsl,interlaken-lac" + reg = <0x229000 0x1000>; + interrupts = <16 2 1 18>; + }; + + lac-hv@228000 { + compatible = "fsl,interlaken-lac-hv" + reg = <0x228000 0x1000>; + fsl,non-hv-node = <&lac>; + }; + +=============================================================================== +Interlaken Look-Aside Controller (LAC) Software Portal Container Node + +DESCRIPTION +The Interlaken Look-Aside Controller (LAC) utilizes Software Portals to accept +Interlaken Look-Aside (ILA) commands. The Interlaken LAC software portal +memory map occupies 128KB of memory space. The software portal memory space is +intended to be cache-enabled. WIMG for each software space is required to be +0010 if stashing is enabled; otherwise, WIMG can be 0000 or 0010. + +PROPERTIES + + - #address-cells + Usage: required + Value type: <u32> + Definition: A standard property. Must have a value of 1. + + - #size-cells + Usage: required + Value type: <u32> + Definition: A standard property. Must have a value of 1. + + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,interlaken-lac-portals" + + - ranges + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. Specifies the address and length + of the LAC portal memory space. + +=============================================================================== +Interlaken Look-Aside Controller (LAC) Software Portals Child Nodes + +DESCRIPTION +There are up to 24 available software portals with each software portal +requiring 4KB of consecutive memory within the software portal memory mapped +space. + +PROPERTIES + + - compatible + Usage: required + Value type: <string> + Definition: Must include "fsl,interlaken-lac-portal-vX.Y" where X is + the Major version (IP_MJ) found in the LAC IP Block Revision + Register (IPBRR0), at offset 0x0BF8, and Y is the Minor version + (IP_MN). + + Table of correspondences between version values and example chips: + Value Device + ------ ------- + 1.0 T4240 + + - reg + Usage: required + Value type: <prop-encoded-array> + Definition: A standard property. The first resource represents the + Interlaken LAC software portal registers. + + - fsl,liodn + Value type: <u32> + Definition: The logical I/O device number (LIODN) for this device. The + LIODN is a number expressed by this device and used to perform + look-ups in the IOMMU (PAMU) address table when performing + DMAs. This property is automatically added by u-boot. + +=============================================================================== +EXAMPLE + +lac-portals { + #address-cells = <0x1>; + #size-cells = <0x1>; + compatible = "fsl,interlaken-lac-portals"; + ranges = <0x0 0xf 0xf4400000 0x20000>; + + lportal0: lac-portal@0 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x204>; + reg = <0x0 0x1000>; + }; + + lportal1: lac-portal@1000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x205>; + reg = <0x1000 0x1000>; + }; + + lportal2: lac-portal@2000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x206>; + reg = <0x2000 0x1000>; + }; + + lportal3: lac-portal@3000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x207>; + reg = <0x3000 0x1000>; + }; + + lportal4: lac-portal@4000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x208>; + reg = <0x4000 0x1000>; + }; + + lportal5: lac-portal@5000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x209>; + reg = <0x5000 0x1000>; + }; + + lportal6: lac-portal@6000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20A>; + reg = <0x6000 0x1000>; + }; + + lportal7: lac-portal@7000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20B>; + reg = <0x7000 0x1000>; + }; + + lportal8: lac-portal@8000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20C>; + reg = <0x8000 0x1000>; + }; + + lportal9: lac-portal@9000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20D>; + reg = <0x9000 0x1000>; + }; + + lportal10: lac-portal@A000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20E>; + reg = <0xA000 0x1000>; + }; + + lportal11: lac-portal@B000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x20F>; + reg = <0xB000 0x1000>; + }; + + lportal12: lac-portal@C000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x210>; + reg = <0xC000 0x1000>; + }; + + lportal13: lac-portal@D000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x211>; + reg = <0xD000 0x1000>; + }; + + lportal14: lac-portal@E000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x212>; + reg = <0xE000 0x1000>; + }; + + lportal15: lac-portal@F000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x213>; + reg = <0xF000 0x1000>; + }; + + lportal16: lac-portal@10000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x214>; + reg = <0x10000 0x1000>; + }; + + lportal17: lac-portal@11000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x215>; + reg = <0x11000 0x1000>; + }; + + lportal8: lac-portal@1200 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x216>; + reg = <0x12000 0x1000>; + }; + + lportal19: lac-portal@13000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x217>; + reg = <0x13000 0x1000>; + }; + + lportal20: lac-portal@14000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x218>; + reg = <0x14000 0x1000>; + }; + + lportal21: lac-portal@15000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x219>; + reg = <0x15000 0x1000>; + }; + + lportal22: lac-portal@16000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x21A>; + reg = <0x16000 0x1000>; + }; + + lportal23: lac-portal@17000 { + compatible = "fsl,interlaken-lac-portal-v1.0"; + fsl,liodn = <0x21B>; + reg = <0x17000 0x1000>; + }; +}; diff --git a/Documentation/powerpc/00-INDEX b/Documentation/powerpc/00-INDEX index dd9e92802ec0..05026ce1875e 100644 --- a/Documentation/powerpc/00-INDEX +++ b/Documentation/powerpc/00-INDEX @@ -14,6 +14,8 @@ hvcs.txt - IBM "Hypervisor Virtual Console Server" Installation Guide mpc52xx.txt - Linux 2.6.x on MPC52xx family +pmu-ebb.txt + - Description of the API for using the PMU with Event Based Branches. qe_firmware.txt - describes the layout of firmware binaries for the Freescale QUICC Engine and the code that parses and uploads the microcode therein. diff --git a/Documentation/powerpc/pmu-ebb.txt b/Documentation/powerpc/pmu-ebb.txt new file mode 100644 index 000000000000..73cd163dbfb8 --- /dev/null +++ b/Documentation/powerpc/pmu-ebb.txt @@ -0,0 +1,137 @@ +PMU Event Based Branches +======================== + +Event Based Branches (EBBs) are a feature which allows the hardware to +branch directly to a specified user space address when certain events occur. + +The full specification is available in Power ISA v2.07: + + https://www.power.org/documentation/power-isa-version-2-07/ + +One type of event for which EBBs can be configured is PMU exceptions. This +document describes the API for configuring the Power PMU to generate EBBs, +using the Linux perf_events API. + + +Terminology +----------- + +Throughout this document we will refer to an "EBB event" or "EBB events". This +just refers to a struct perf_event which has set the "EBB" flag in its +attr.config. All events which can be configured on the hardware PMU are +possible "EBB events". + + +Background +---------- + +When a PMU EBB occurs it is delivered to the currently running process. As such +EBBs can only sensibly be used by programs for self-monitoring. + +It is a feature of the perf_events API that events can be created on other +processes, subject to standard permission checks. This is also true of EBB +events, however unless the target process enables EBBs (via mtspr(BESCR)) no +EBBs will ever be delivered. + +This makes it possible for a process to enable EBBs for itself, but not +actually configure any events. At a later time another process can come along +and attach an EBB event to the process, which will then cause EBBs to be +delivered to the first process. It's not clear if this is actually useful. + + +When the PMU is configured for EBBs, all PMU interrupts are delivered to the +user process. This means once an EBB event is scheduled on the PMU, no non-EBB +events can be configured. This means that EBB events can not be run +concurrently with regular 'perf' commands, or any other perf events. + +It is however safe to run 'perf' commands on a process which is using EBBs. The +kernel will in general schedule the EBB event, and perf will be notified that +its events could not run. + +The exclusion between EBB events and regular events is implemented using the +existing "pinned" and "exclusive" attributes of perf_events. This means EBB +events will be given priority over other events, unless they are also pinned. +If an EBB event and a regular event are both pinned, then whichever is enabled +first will be scheduled and the other will be put in error state. See the +section below titled "Enabling an EBB event" for more information. + + +Creating an EBB event +--------------------- + +To request that an event is counted using EBB, the event code should have bit +63 set. + +EBB events must be created with a particular, and restrictive, set of +attributes - this is so that they interoperate correctly with the rest of the +perf_events subsystem. + +An EBB event must be created with the "pinned" and "exclusive" attributes set. +Note that if you are creating a group of EBB events, only the leader can have +these attributes set. + +An EBB event must NOT set any of the "inherit", "sample_period", "freq" or +"enable_on_exec" attributes. + +An EBB event must be attached to a task. This is specified to perf_event_open() +by passing a pid value, typically 0 indicating the current task. + +All events in a group must agree on whether they want EBB. That is all events +must request EBB, or none may request EBB. + +EBB events must specify the PMC they are to be counted on. This ensures +userspace is able to reliably determine which PMC the event is scheduled on. + + +Enabling an EBB event +--------------------- + +Once an EBB event has been successfully opened, it must be enabled with the +perf_events API. This can be achieved either via the ioctl() interface, or the +prctl() interface. + +However, due to the design of the perf_events API, enabling an event does not +guarantee that it has been scheduled on the PMU. To ensure that the EBB event +has been scheduled on the PMU, you must perform a read() on the event. If the +read() returns EOF, then the event has not been scheduled and EBBs are not +enabled. + +This behaviour occurs because the EBB event is pinned and exclusive. When the +EBB event is enabled it will force all other non-pinned events off the PMU. In +this case the enable will be successful. However if there is already an event +pinned on the PMU then the enable will not be successful. + + +Reading an EBB event +-------------------- + +It is possible to read() from an EBB event. However the results are +meaningless. Because interrupts are being delivered to the user process the +kernel is not able to count the event, and so will return a junk value. + + +Closing an EBB event +-------------------- + +When an EBB event is finished with, you can close it using close() as for any +regular event. If this is the last EBB event the PMU will be deconfigured and +no further PMU EBBs will be delivered. + + +EBB Handler +----------- + +The EBB handler is just regular userspace code, however it must be written in +the style of an interrupt handler. When the handler is entered all registers +are live (possibly) and so must be saved somehow before the handler can invoke +other code. + +It's up to the program how to handle this. For C programs a relatively simple +option is to create an interrupt frame on the stack and save registers there. + +Fork +---- + +EBB events are not inherited across fork. If the child process wishes to use +EBBs it should open a new event for itself. Similarly the EBB state in +BESCR/EBBHR/EBBRR is cleared across fork(). diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt index 8eda3635a17d..c55533c0adb3 100644 --- a/Documentation/vfio.txt +++ b/Documentation/vfio.txt @@ -283,6 +283,69 @@ a direct pass through for VFIO_DEVICE_* ioctls. The read/write/mmap interfaces implement the device region access defined by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl. + +PPC64 sPAPR implementation note +------------------------------------------------------------------------------- + +This implementation has some specifics: + +1) Only one IOMMU group per container is supported as an IOMMU group +represents the minimal entity which isolation can be guaranteed for and +groups are allocated statically, one per a Partitionable Endpoint (PE) +(PE is often a PCI domain but not always). + +2) The hardware supports so called DMA windows - the PCI address range +within which DMA transfer is allowed, any attempt to access address space +out of the window leads to the whole PE isolation. + +3) PPC64 guests are paravirtualized but not fully emulated. There is an API +to map/unmap pages for DMA, and it normally maps 1..32 pages per call and +currently there is no way to reduce the number of calls. In order to make things +faster, the map/unmap handling has been implemented in real mode which provides +an excellent performance which has limitations such as inability to do +locked pages accounting in real time. + +So 3 additional ioctls have been added: + + VFIO_IOMMU_SPAPR_TCE_GET_INFO - returns the size and the start + of the DMA window on the PCI bus. + + VFIO_IOMMU_ENABLE - enables the container. The locked pages accounting + is done at this point. This lets user first to know what + the DMA window is and adjust rlimit before doing any real job. + + VFIO_IOMMU_DISABLE - disables the container. + + +The code flow from the example above should be slightly changed: + + ..... + /* Add the group to the container */ + ioctl(group, VFIO_GROUP_SET_CONTAINER, &container); + + /* Enable the IOMMU model we want */ + ioctl(container, VFIO_SET_IOMMU, VFIO_SPAPR_TCE_IOMMU) + + /* Get addition sPAPR IOMMU info */ + vfio_iommu_spapr_tce_info spapr_iommu_info; + ioctl(container, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &spapr_iommu_info); + + if (ioctl(container, VFIO_IOMMU_ENABLE)) + /* Cannot enable container, may be low rlimit */ + + /* Allocate some space and setup a DMA mapping */ + dma_map.vaddr = mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); + + dma_map.size = 1024 * 1024; + dma_map.iova = 0; /* 1MB starting at 0x0 from device view */ + dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE; + + /* Check here is .iova/.size are within DMA window from spapr_iommu_info */ + + ioctl(container, VFIO_IOMMU_MAP_DMA, &dma_map); + ..... + ------------------------------------------------------------------------------- [1] VFIO was originally an acronym for "Virtual Function I/O" in its |