diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-12 22:21:29 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-12 22:21:29 +0300 |
commit | 9d33edb20f7e6943250d6bb96ceaf2368f674d51 (patch) | |
tree | 4f31a59b262b5cca1905b3f74a43cfab09bed416 /Documentation | |
parent | f10bc40168032962ebee26894bdbdc972cde35bf (diff) | |
parent | 6132a490f9c81d621fdb4e8c12f617dc062130a2 (diff) | |
download | linux-9d33edb20f7e6943250d6bb96ceaf2368f674d51.tar.xz |
Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
"Updates for the interrupt core and driver subsystem:
The bulk is the rework of the MSI subsystem to support per device MSI
interrupt domains. This solves conceptual problems of the current
PCI/MSI design which are in the way of providing support for
PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.
IMS (Interrupt Message Store] is a new specification which allows
device manufactures to provide implementation defined storage for MSI
messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified
message store which is uniform accross all devices). The PCI/MSI[-X]
uniformity allowed us to get away with "global" PCI/MSI domains.
IMS not only allows to overcome the size limitations of the MSI-X
table, but also gives the device manufacturer the freedom to store the
message in arbitrary places, even in host memory which is shared with
the device.
There have been several attempts to glue this into the current MSI
code, but after lengthy discussions it turned out that there is a
fundamental design problem in the current PCI/MSI-X implementation.
This needs some historical background.
When PCI/MSI[-X] support was added around 2003, interrupt management
was completely different from what we have today in the actively
developed architectures. Interrupt management was completely
architecture specific and while there were attempts to create common
infrastructure the commonalities were rudimentary and just providing
shared data structures and interfaces so that drivers could be written
in an architecture agnostic way.
The initial PCI/MSI[-X] support obviously plugged into this model
which resulted in some basic shared infrastructure in the PCI core
code for setting up MSI descriptors, which are a pure software
construct for holding data relevant for a particular MSI interrupt,
but the actual association to Linux interrupts was completely
architecture specific. This model is still supported today to keep
museum architectures and notorious stragglers alive.
In 2013 Intel tried to add support for hot-pluggable IO/APICs to the
kernel, which was creating yet another architecture specific mechanism
and resulted in an unholy mess on top of the existing horrors of x86
interrupt handling. The x86 interrupt management code was already an
incomprehensible maze of indirections between the CPU vector
management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X]
implementation.
At roughly the same time ARM struggled with the ever growing SoC
specific extensions which were glued on top of the architected GIC
interrupt controller.
This resulted in a fundamental redesign of interrupt management and
provided the today prevailing concept of hierarchical interrupt
domains. This allowed to disentangle the interactions between x86
vector domain and interrupt remapping and also allowed ARM to handle
the zoo of SoC specific interrupt components in a sane way.
The concept of hierarchical interrupt domains aims to encapsulate the
functionality of particular IP blocks which are involved in interrupt
delivery so that they become extensible and pluggable. The X86
encapsulation looks like this:
|--- device 1
[Vector]---[Remapping]---[PCI/MSI]--|...
|--- device N
where the remapping domain is an optional component and in case that
it is not available the PCI/MSI[-X] domains have the vector domain as
their parent. This reduced the required interaction between the
domains pretty much to the initialization phase where it is obviously
required to establish the proper parent relation ship in the
components of the hierarchy.
While in most cases the model is strictly representing the chain of IP
blocks and abstracting them so they can be plugged together to form a
hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the
hardware it's clear that the actual PCI/MSI[-X] interrupt controller
is not a global entity, but strict a per PCI device entity.
Here we took a short cut on the hierarchical model and went for the
easy solution of providing "global" PCI/MSI domains which was possible
because the PCI/MSI[-X] handling is uniform across the devices. This
also allowed to keep the existing PCI/MSI[-X] infrastructure mostly
unchanged which in turn made it simple to keep the existing
architecture specific management alive.
A similar problem was created in the ARM world with support for IP
block specific message storage. Instead of going all the way to stack
a IP block specific domain on top of the generic MSI domain this ended
in a construct which provides a "global" platform MSI domain which
allows overriding the irq_write_msi_msg() callback per allocation.
In course of the lengthy discussions we identified other abuse of the
MSI infrastructure in wireless drivers, NTB etc. where support for
implementation specific message storage was just mindlessly glued into
the existing infrastructure. Some of this just works by chance on
particular platforms but will fail in hard to diagnose ways when the
driver is used on platforms where the underlying MSI interrupt
management code does not expect the creative abuse.
Another shortcoming of today's PCI/MSI-X support is the inability to
allocate or free individual vectors after the initial enablement of
MSI-X. This results in an works by chance implementation of VFIO (PCI
pass-through) where interrupts on the host side are not set up upfront
to avoid resource exhaustion. They are expanded at run-time when the
guest actually tries to use them. The way how this is implemented is
that the host disables MSI-X and then re-enables it with a larger
number of vectors again. That works by chance because most device
drivers set up all interrupts before the device actually will utilize
them. But that's not universally true because some drivers allocate a
large enough number of vectors but do not utilize them until it's
actually required, e.g. for acceleration support. But at that point
other interrupts of the device might be in active use and the MSI-X
disable/enable dance can just result in losing interrupts and
therefore hard to diagnose subtle problems.
Last but not least the "global" PCI/MSI-X domain approach prevents to
utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact
that IMS is not longer providing a uniform storage and configuration
model.
The solution to this is to implement the missing step and switch from
global PCI/MSI domains to per device PCI/MSI domains. The resulting
hierarchy then looks like this:
|--- [PCI/MSI] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
which in turn allows to provide support for multiple domains per
device:
|--- [PCI/MSI] device 1
|--- [PCI/IMS] device 1
[Vector]---[Remapping]---|...
|--- [PCI/MSI] device N
|--- [PCI/IMS] device N
This work converts the MSI and PCI/MSI core and the x86 interrupt
domains to the new model, provides new interfaces for post-enable
allocation/free of MSI-X interrupts and the base framework for
PCI/IMS. PCI/IMS has been verified with the work in progress IDXD
driver.
There is work in progress to convert ARM over which will replace the
platform MSI train-wreck. The cleanup of VFIO, NTB and other creative
"solutions" are in the works as well.
Drivers:
- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"
* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
irqchip/ti-sci-inta: Fix kernel doc
irqchip/gic-v2m: Mark a few functions __init
irqchip/gic-v2m: Include arm-gic-common.h
irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
iommu/amd: Enable PCI/IMS
iommu/vt-d: Enable PCI/IMS
x86/apic/msi: Enable PCI/IMS
PCI/MSI: Provide pci_ims_alloc/free_irq()
PCI/MSI: Provide IMS (Interrupt Message Store) support
genirq/msi: Provide constants for PCI/IMS support
x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
PCI/MSI: Provide prepare_desc() MSI domain op
PCI/MSI: Split MSI-X descriptor setup
genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
genirq/msi: Provide msi_domain_alloc_irq_at()
genirq/msi: Provide msi_domain_ops:: Prepare_desc()
genirq/msi: Provide msi_desc:: Msi_data
genirq/msi: Provide struct msi_map
x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
...
Diffstat (limited to 'Documentation')
4 files changed, 112 insertions, 33 deletions
diff --git a/Documentation/PCI/msi-howto.rst b/Documentation/PCI/msi-howto.rst index aa2046af69f7..8ae461e97c54 100644 --- a/Documentation/PCI/msi-howto.rst +++ b/Documentation/PCI/msi-howto.rst @@ -285,3 +285,13 @@ to bridges between the PCI root and the device, MSIs are disabled. It is also worth checking the device driver to see whether it supports MSIs. For example, it may contain calls to pci_alloc_irq_vectors() with the PCI_IRQ_MSI or PCI_IRQ_MSIX flags. + + +List of device drivers MSI(-X) APIs +=================================== + +The PCI/MSI subystem has a dedicated C file for its exported device driver +APIs — `drivers/pci/msi/api.c`. The following functions are exported: + +.. kernel-doc:: drivers/pci/msi/api.c + :export: diff --git a/Documentation/devicetree/bindings/interrupt-controller/loongarch,cpu-interrupt-controller.yaml b/Documentation/devicetree/bindings/interrupt-controller/loongarch,cpu-interrupt-controller.yaml new file mode 100644 index 000000000000..2a1cf885c99d --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/loongarch,cpu-interrupt-controller.yaml @@ -0,0 +1,34 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/interrupt-controller/loongarch,cpu-interrupt-controller.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: LoongArch CPU Interrupt Controller + +maintainers: + - Liu Peibao <liupeibao@loongson.cn> + +properties: + compatible: + const: loongarch,cpu-interrupt-controller + + '#interrupt-cells': + const: 1 + + interrupt-controller: true + +additionalProperties: false + +required: + - compatible + - '#interrupt-cells' + - interrupt-controller + +examples: + - | + interrupt-controller { + compatible = "loongarch,cpu-interrupt-controller"; + #interrupt-cells = <1>; + interrupt-controller; + }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/mediatek,cirq.txt b/Documentation/devicetree/bindings/interrupt-controller/mediatek,cirq.txt deleted file mode 100644 index 5865f4f2c69d..000000000000 --- a/Documentation/devicetree/bindings/interrupt-controller/mediatek,cirq.txt +++ /dev/null @@ -1,33 +0,0 @@ -* Mediatek 27xx cirq - -In Mediatek SOCs, the CIRQ is a low power interrupt controller designed to -work outside MCUSYS which comprises with Cortex-Ax cores,CCI and GIC. -The external interrupts (outside MCUSYS) will feed through CIRQ and connect -to GIC in MCUSYS. When CIRQ is enabled, it will record the edge-sensitive -interrupts and generate a pulse signal to parent interrupt controller when -flush command is executed. With CIRQ, MCUSYS can be completely turned off -to improve the system power consumption without losing interrupts. - -Required properties: -- compatible: should be one of - - "mediatek,mt2701-cirq" for mt2701 CIRQ - - "mediatek,mt8135-cirq" for mt8135 CIRQ - - "mediatek,mt8173-cirq" for mt8173 CIRQ - and "mediatek,cirq" as a fallback. -- interrupt-controller : Identifies the node as an interrupt controller. -- #interrupt-cells : Use the same format as specified by GIC in arm,gic.txt. -- reg: Physical base address of the cirq registers and length of memory - mapped region. -- mediatek,ext-irq-range: Identifies external irq number range in different - SOCs. - -Example: - cirq: interrupt-controller@10204000 { - compatible = "mediatek,mt2701-cirq", - "mediatek,mtk-cirq"; - interrupt-controller; - #interrupt-cells = <3>; - interrupt-parent = <&sysirq>; - reg = <0 0x10204000 0 0x400>; - mediatek,ext-irq-start = <32 200>; - }; diff --git a/Documentation/devicetree/bindings/interrupt-controller/mediatek,mtk-cirq.yaml b/Documentation/devicetree/bindings/interrupt-controller/mediatek,mtk-cirq.yaml new file mode 100644 index 000000000000..fdcb4d8db818 --- /dev/null +++ b/Documentation/devicetree/bindings/interrupt-controller/mediatek,mtk-cirq.yaml @@ -0,0 +1,68 @@ +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/interrupt-controller/mediatek,mtk-cirq.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: MediaTek System Interrupt Controller + +maintainers: + - Youlin Pei <youlin.pei@mediatek.com> + +description: + In MediaTek SoCs, the CIRQ is a low power interrupt controller designed to + work outside of MCUSYS which comprises with Cortex-Ax cores, CCI and GIC. + The external interrupts (outside MCUSYS) will feed through CIRQ and connect + to GIC in MCUSYS. When CIRQ is enabled, it will record the edge-sensitive + interrupts and generate a pulse signal to parent interrupt controller when + flush command is executed. With CIRQ, MCUSYS can be completely turned off + to improve the system power consumption without losing interrupts. + + +properties: + compatible: + items: + - enum: + - mediatek,mt2701-cirq + - mediatek,mt8135-cirq + - mediatek,mt8173-cirq + - mediatek,mt8192-cirq + - const: mediatek,mtk-cirq + + reg: + maxItems: 1 + + '#interrupt-cells': + const: 3 + + interrupt-controller: true + + mediatek,ext-irq-range: + $ref: /schemas/types.yaml#/definitions/uint32-array + items: + - description: First CIRQ interrupt + - description: Last CIRQ interrupt + description: + Identifies the range of external interrupts in different SoCs + +required: + - compatible + - reg + - '#interrupt-cells' + - interrupt-controller + - mediatek,ext-irq-range + +additionalProperties: false + +examples: + - | + #include <dt-bindings/interrupt-controller/irq.h> + + cirq: interrupt-controller@10204000 { + compatible = "mediatek,mt2701-cirq", "mediatek,mtk-cirq"; + reg = <0x10204000 0x400>; + #interrupt-cells = <3>; + interrupt-controller; + interrupt-parent = <&sysirq>; + mediatek,ext-irq-range = <32 200>; + }; |