diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-bus-pci | 24 | ||||
-rw-r--r-- | Documentation/PCI/endpoint/pci-test-howto.txt | 19 | ||||
-rw-r--r-- | Documentation/PCI/pci-error-recovery.txt | 35 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt | 1 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/pci/pci-keystone.txt | 3 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/pci/pci-rcar-gen2.txt | 1 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/pci/rcar-pci.txt | 2 | ||||
-rw-r--r-- | Documentation/devicetree/bindings/pci/ti-pci.txt | 5 | ||||
-rw-r--r-- | Documentation/driver-api/index.rst | 2 | ||||
-rw-r--r-- | Documentation/driver-api/pci/index.rst | 22 | ||||
-rw-r--r-- | Documentation/driver-api/pci/p2pdma.rst | 145 | ||||
-rw-r--r-- | Documentation/driver-api/pci/pci.rst (renamed from Documentation/driver-api/pci.rst) | 0 | ||||
-rw-r--r-- | Documentation/switchtec.txt | 30 |
13 files changed, 245 insertions, 44 deletions
diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci index 44d4b2be92fd..8bfee557e50e 100644 --- a/Documentation/ABI/testing/sysfs-bus-pci +++ b/Documentation/ABI/testing/sysfs-bus-pci @@ -323,3 +323,27 @@ Description: This is similar to /sys/bus/pci/drivers_autoprobe, but affects only the VFs associated with a specific PF. + +What: /sys/bus/pci/devices/.../p2pmem/size +Date: November 2017 +Contact: Logan Gunthorpe <logang@deltatee.com> +Description: + If the device has any Peer-to-Peer memory registered, this + file contains the total amount of memory that the device + provides (in decimal). + +What: /sys/bus/pci/devices/.../p2pmem/available +Date: November 2017 +Contact: Logan Gunthorpe <logang@deltatee.com> +Description: + If the device has any Peer-to-Peer memory registered, this + file contains the amount of memory that has not been + allocated (in decimal). + +What: /sys/bus/pci/devices/.../p2pmem/published +Date: November 2017 +Contact: Logan Gunthorpe <logang@deltatee.com> +Description: + If the device has any Peer-to-Peer memory registered, this + file contains a '1' if the memory has been published for + use outside the driver that owns the device. diff --git a/Documentation/PCI/endpoint/pci-test-howto.txt b/Documentation/PCI/endpoint/pci-test-howto.txt index e40cf0fb58d7..040479f437a5 100644 --- a/Documentation/PCI/endpoint/pci-test-howto.txt +++ b/Documentation/PCI/endpoint/pci-test-howto.txt @@ -99,17 +99,20 @@ Note that the devices listed here correspond to the value populated in 1.4 above 2.2 Using Endpoint Test function Device pcitest.sh added in tools/pci/ can be used to run all the default PCI endpoint -tests. Before pcitest.sh can be used pcitest.c should be compiled using the -following commands. +tests. To compile this tool the following commands should be used: - cd <kernel-dir> - make headers_install ARCH=arm - arm-linux-gnueabihf-gcc -Iusr/include tools/pci/pcitest.c -o pcitest - cp pcitest <rootfs>/usr/sbin/ - cp tools/pci/pcitest.sh <rootfs> + # cd <kernel-dir> + # make -C tools/pci + +or if you desire to compile and install in your system: + + # cd <kernel-dir> + # make -C tools/pci install + +The tool and script will be located in <rootfs>/usr/bin/ 2.2.1 pcitest.sh Output - # ./pcitest.sh + # pcitest.sh BAR tests BAR0: OKAY diff --git a/Documentation/PCI/pci-error-recovery.txt b/Documentation/PCI/pci-error-recovery.txt index 688b69121e82..0b6bb3ef449e 100644 --- a/Documentation/PCI/pci-error-recovery.txt +++ b/Documentation/PCI/pci-error-recovery.txt @@ -110,7 +110,7 @@ The actual steps taken by a platform to recover from a PCI error event will be platform-dependent, but will follow the general sequence described below. -STEP 0: Error Event: ERR_NONFATAL +STEP 0: Error Event ------------------- A PCI bus error is detected by the PCI hardware. On powerpc, the slot is isolated, in that all I/O is blocked: all reads return 0xffffffff, @@ -228,7 +228,13 @@ proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations). If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform proceeds to STEP 4 (Slot Reset) -STEP 3: Slot Reset +STEP 3: Link Reset +------------------ +The platform resets the link. This is a PCI-Express specific step +and is done whenever a fatal error has been detected that can be +"solved" by resetting the link. + +STEP 4: Slot Reset ------------------ In response to a return value of PCI_ERS_RESULT_NEED_RESET, the @@ -314,7 +320,7 @@ Failure). >>> However, it probably should. -STEP 4: Resume Operations +STEP 5: Resume Operations ------------------------- The platform will call the resume() callback on all affected device drivers if all drivers on the segment have returned @@ -326,7 +332,7 @@ a result code. At this point, if a new error happens, the platform will restart a new error recovery sequence. -STEP 5: Permanent Failure +STEP 6: Permanent Failure ------------------------- A "permanent failure" has occurred, and the platform cannot recover the device. The platform will call error_detected() with a @@ -349,27 +355,6 @@ errors. See the discussion in powerpc/eeh-pci-error-recovery.txt for additional detail on real-life experience of the causes of software errors. -STEP 0: Error Event: ERR_FATAL -------------------- -PCI bus error is detected by the PCI hardware. On powerpc, the slot is -isolated, in that all I/O is blocked: all reads return 0xffffffff, all -writes are ignored. - -STEP 1: Remove devices --------------------- -Platform removes the devices depending on the error agent, it could be -this port for all subordinates or upstream component (likely downstream -port) - -STEP 2: Reset link --------------------- -The platform resets the link. This is a PCI-Express specific step and is -done whenever a fatal error has been detected that can be "solved" by -resetting the link. - -STEP 3: Re-enumerate the devices --------------------- -Initiates the re-enumeration. Conclusion; General Remarks --------------------------- diff --git a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt index cb33421184a0..f37494d5a7be 100644 --- a/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt +++ b/Documentation/devicetree/bindings/pci/fsl,imx6q-pcie.txt @@ -50,6 +50,7 @@ Additional required properties for imx7d-pcie: - reset-names: Must contain the following entires: - "pciephy" - "apps" + - "turnoff" Example: diff --git a/Documentation/devicetree/bindings/pci/pci-keystone.txt b/Documentation/devicetree/bindings/pci/pci-keystone.txt index 4dd17de549a7..2030ee0dc4f9 100644 --- a/Documentation/devicetree/bindings/pci/pci-keystone.txt +++ b/Documentation/devicetree/bindings/pci/pci-keystone.txt @@ -19,6 +19,9 @@ pcie_msi_intc : Interrupt controller device node for MSI IRQ chip interrupt-cells: should be set to 1 interrupts: GIC interrupt lines connected to PCI MSI interrupt lines +ti,syscon-pcie-id : phandle to the device control module required to set device + id and vendor id. + Example: pcie_msi_intc: msi-interrupt-controller { interrupt-controller; diff --git a/Documentation/devicetree/bindings/pci/pci-rcar-gen2.txt b/Documentation/devicetree/bindings/pci/pci-rcar-gen2.txt index 9fe7e12a7bf3..b94078f58d8e 100644 --- a/Documentation/devicetree/bindings/pci/pci-rcar-gen2.txt +++ b/Documentation/devicetree/bindings/pci/pci-rcar-gen2.txt @@ -7,6 +7,7 @@ OHCI and EHCI controllers. Required properties: - compatible: "renesas,pci-r8a7743" for the R8A7743 SoC; + "renesas,pci-r8a7744" for the R8A7744 SoC; "renesas,pci-r8a7745" for the R8A7745 SoC; "renesas,pci-r8a7790" for the R8A7790 SoC; "renesas,pci-r8a7791" for the R8A7791 SoC; diff --git a/Documentation/devicetree/bindings/pci/rcar-pci.txt b/Documentation/devicetree/bindings/pci/rcar-pci.txt index a5f7fc62d10e..976ef7bfff93 100644 --- a/Documentation/devicetree/bindings/pci/rcar-pci.txt +++ b/Documentation/devicetree/bindings/pci/rcar-pci.txt @@ -2,6 +2,7 @@ Required properties: compatible: "renesas,pcie-r8a7743" for the R8A7743 SoC; + "renesas,pcie-r8a7744" for the R8A7744 SoC; "renesas,pcie-r8a7779" for the R8A7779 SoC; "renesas,pcie-r8a7790" for the R8A7790 SoC; "renesas,pcie-r8a7791" for the R8A7791 SoC; @@ -9,6 +10,7 @@ compatible: "renesas,pcie-r8a7743" for the R8A7743 SoC; "renesas,pcie-r8a7795" for the R8A7795 SoC; "renesas,pcie-r8a7796" for the R8A7796 SoC; "renesas,pcie-r8a77980" for the R8A77980 SoC; + "renesas,pcie-r8a77990" for the R8A77990 SoC; "renesas,pcie-rcar-gen2" for a generic R-Car Gen2 or RZ/G1 compatible device. "renesas,pcie-rcar-gen3" for a generic R-Car Gen3 compatible device. diff --git a/Documentation/devicetree/bindings/pci/ti-pci.txt b/Documentation/devicetree/bindings/pci/ti-pci.txt index 7f7af3044016..452fe48c4fdd 100644 --- a/Documentation/devicetree/bindings/pci/ti-pci.txt +++ b/Documentation/devicetree/bindings/pci/ti-pci.txt @@ -26,6 +26,11 @@ HOST MODE ranges, interrupt-map-mask, interrupt-map : as specified in ../designware-pcie.txt + - ti,syscon-unaligned-access: phandle to the syscon DT node. The 1st argument + should contain the register offset within syscon + and the 2nd argument should contain the bit field + for setting the bit to enable unaligned + access. DEVICE MODE =========== diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst index 1f0cdba2b2bf..909f991b4c0d 100644 --- a/Documentation/driver-api/index.rst +++ b/Documentation/driver-api/index.rst @@ -30,7 +30,7 @@ available subsections can be seen below. input usb/index firewire - pci + pci/index spi i2c hsi diff --git a/Documentation/driver-api/pci/index.rst b/Documentation/driver-api/pci/index.rst new file mode 100644 index 000000000000..c6cf1fef61ce --- /dev/null +++ b/Documentation/driver-api/pci/index.rst @@ -0,0 +1,22 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================ +The Linux PCI driver implementer's API guide +============================================ + +.. class:: toc-title + + Table of contents + +.. toctree:: + :maxdepth: 2 + + pci + p2pdma + +.. only:: subproject and html + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/driver-api/pci/p2pdma.rst b/Documentation/driver-api/pci/p2pdma.rst new file mode 100644 index 000000000000..4c577fa7bef9 --- /dev/null +++ b/Documentation/driver-api/pci/p2pdma.rst @@ -0,0 +1,145 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================ +PCI Peer-to-Peer DMA Support +============================ + +The PCI bus has pretty decent support for performing DMA transfers +between two devices on the bus. This type of transaction is henceforth +called Peer-to-Peer (or P2P). However, there are a number of issues that +make P2P transactions tricky to do in a perfectly safe way. + +One of the biggest issues is that PCI doesn't require forwarding +transactions between hierarchy domains, and in PCIe, each Root Port +defines a separate hierarchy domain. To make things worse, there is no +simple way to determine if a given Root Complex supports this or not. +(See PCIe r4.0, sec 1.3.1). Therefore, as of this writing, the kernel +only supports doing P2P when the endpoints involved are all behind the +same PCI bridge, as such devices are all in the same PCI hierarchy +domain, and the spec guarantees that all transactions within the +hierarchy will be routable, but it does not require routing +between hierarchies. + +The second issue is that to make use of existing interfaces in Linux, +memory that is used for P2P transactions needs to be backed by struct +pages. However, PCI BARs are not typically cache coherent so there are +a few corner case gotchas with these pages so developers need to +be careful about what they do with them. + + +Driver Writer's Guide +===================== + +In a given P2P implementation there may be three or more different +types of kernel drivers in play: + +* Provider - A driver which provides or publishes P2P resources like + memory or doorbell registers to other drivers. +* Client - A driver which makes use of a resource by setting up a + DMA transaction to or from it. +* Orchestrator - A driver which orchestrates the flow of data between + clients and providers. + +In many cases there could be overlap between these three types (i.e., +it may be typical for a driver to be both a provider and a client). + +For example, in the NVMe Target Copy Offload implementation: + +* The NVMe PCI driver is both a client, provider and orchestrator + in that it exposes any CMB (Controller Memory Buffer) as a P2P memory + resource (provider), it accepts P2P memory pages as buffers in requests + to be used directly (client) and it can also make use of the CMB as + submission queue entries (orchastrator). +* The RDMA driver is a client in this arrangement so that an RNIC + can DMA directly to the memory exposed by the NVMe device. +* The NVMe Target driver (nvmet) can orchestrate the data from the RNIC + to the P2P memory (CMB) and then to the NVMe device (and vice versa). + +This is currently the only arrangement supported by the kernel but +one could imagine slight tweaks to this that would allow for the same +functionality. For example, if a specific RNIC added a BAR with some +memory behind it, its driver could add support as a P2P provider and +then the NVMe Target could use the RNIC's memory instead of the CMB +in cases where the NVMe cards in use do not have CMB support. + + +Provider Drivers +---------------- + +A provider simply needs to register a BAR (or a portion of a BAR) +as a P2P DMA resource using :c:func:`pci_p2pdma_add_resource()`. +This will register struct pages for all the specified memory. + +After that it may optionally publish all of its resources as +P2P memory using :c:func:`pci_p2pmem_publish()`. This will allow +any orchestrator drivers to find and use the memory. When marked in +this way, the resource must be regular memory with no side effects. + +For the time being this is fairly rudimentary in that all resources +are typically going to be P2P memory. Future work will likely expand +this to include other types of resources like doorbells. + + +Client Drivers +-------------- + +A client driver typically only has to conditionally change its DMA map +routine to use the mapping function :c:func:`pci_p2pdma_map_sg()` instead +of the usual :c:func:`dma_map_sg()` function. Memory mapped in this +way does not need to be unmapped. + +The client may also, optionally, make use of +:c:func:`is_pci_p2pdma_page()` to determine when to use the P2P mapping +functions and when to use the regular mapping functions. In some +situations, it may be more appropriate to use a flag to indicate a +given request is P2P memory and map appropriately. It is important to +ensure that struct pages that back P2P memory stay out of code that +does not have support for them as other code may treat the pages as +regular memory which may not be appropriate. + + +Orchestrator Drivers +-------------------- + +The first task an orchestrator driver must do is compile a list of +all client devices that will be involved in a given transaction. For +example, the NVMe Target driver creates a list including the namespace +block device and the RNIC in use. If the orchestrator has access to +a specific P2P provider to use it may check compatibility using +:c:func:`pci_p2pdma_distance()` otherwise it may find a memory provider +that's compatible with all clients using :c:func:`pci_p2pmem_find()`. +If more than one provider is supported, the one nearest to all the clients will +be chosen first. If more than one provider is an equal distance away, the +one returned will be chosen at random (it is not an arbitrary but +truely random). This function returns the PCI device to use for the provider +with a reference taken and therefore when it's no longer needed it should be +returned with pci_dev_put(). + +Once a provider is selected, the orchestrator can then use +:c:func:`pci_alloc_p2pmem()` and :c:func:`pci_free_p2pmem()` to +allocate P2P memory from the provider. :c:func:`pci_p2pmem_alloc_sgl()` +and :c:func:`pci_p2pmem_free_sgl()` are convenience functions for +allocating scatter-gather lists with P2P memory. + +Struct Page Caveats +------------------- + +Driver writers should be very careful about not passing these special +struct pages to code that isn't prepared for it. At this time, the kernel +interfaces do not have any checks for ensuring this. This obviously +precludes passing these pages to userspace. + +P2P memory is also technically IO memory but should never have any side +effects behind it. Thus, the order of loads and stores should not be important +and ioreadX(), iowriteX() and friends should not be necessary. +However, as the memory is not cache coherent, if access ever needs to +be protected by a spinlock then :c:func:`mmiowb()` must be used before +unlocking the lock. (See ACQUIRES VS I/O ACCESSES in +Documentation/memory-barriers.txt) + + +P2P DMA Support Library +======================= + +.. kernel-doc:: drivers/pci/p2pdma.c + :export: diff --git a/Documentation/driver-api/pci.rst b/Documentation/driver-api/pci/pci.rst index ca85e5e78b2c..ca85e5e78b2c 100644 --- a/Documentation/driver-api/pci.rst +++ b/Documentation/driver-api/pci/pci.rst diff --git a/Documentation/switchtec.txt b/Documentation/switchtec.txt index f788264921ff..30d6a64e53f7 100644 --- a/Documentation/switchtec.txt +++ b/Documentation/switchtec.txt @@ -23,7 +23,7 @@ The primary means of communicating with the Switchtec management firmware is through the Memory-mapped Remote Procedure Call (MRPC) interface. Commands are submitted to the interface with a 4-byte command identifier and up to 1KB of command specific data. The firmware will -respond with a 4 bytes return code and up to 1KB of command specific +respond with a 4-byte return code and up to 1KB of command-specific data. The interface only processes a single command at a time. @@ -36,8 +36,8 @@ device: /dev/switchtec#, one for each management endpoint in the system. The char device has the following semantics: * A write must consist of at least 4 bytes and no more than 1028 bytes. - The first four bytes will be interpreted as the command to run and - the remainder will be used as the input data. A write will send the + The first 4 bytes will be interpreted as the Command ID and the + remainder will be used as the input data. A write will send the command to the firmware to begin processing. * Each write must be followed by exactly one read. Any double write will @@ -45,9 +45,9 @@ The char device has the following semantics: produce an error. * A read will block until the firmware completes the command and return - the four bytes of status plus up to 1024 bytes of output data. (The - length will be specified by the size parameter of the read call -- - reading less than 4 bytes will produce an error. + the 4-byte Command Return Value plus up to 1024 bytes of output + data. (The length will be specified by the size parameter of the read + call -- reading less than 4 bytes will produce an error.) * The poll call will also be supported for userspace applications that need to do other things while waiting for the command to complete. @@ -83,10 +83,20 @@ The following IOCTLs are also supported by the device: Non-Transparent Bridge (NTB) Driver =================================== -An NTB driver is provided for the switchtec hardware in switchtec_ntb. -Currently, it only supports switches configured with exactly 2 -partitions. It also requires the following configuration settings: +An NTB hardware driver is provided for the Switchtec hardware in +ntb_hw_switchtec. Currently, it only supports switches configured with +exactly 2 NT partitions and zero or more non-NT partitions. It also requires +the following configuration settings: -* Both partitions must be able to access each other's GAS spaces. +* Both NT partitions must be able to access each other's GAS spaces. Thus, the bits in the GAS Access Vector under Management Settings must be set to support this. +* Kernel configuration MUST include support for NTB (CONFIG_NTB needs + to be set) + +NT EP BAR 2 will be dynamically configured as a Direct Window, and +the configuration file does not need to configure it explicitly. + +Please refer to Documentation/ntb.txt in Linux source tree for an overall +understanding of the Linux NTB stack. ntb_hw_switchtec works as an NTB +Hardware Driver in this stack. |