git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
"This time with bigger changes than usual:
- A new IOMMU driver for the ARM SMMUv3.
This IOMMU is pretty different from SMMUv1 and v2 in that it is
configured through in-memory structures and not through the MMIO
register region. The ARM SMMUv3 also supports IO demand paging for
PCI devices with PRI/PASID capabilities, but this is not
implemented in the driver yet.
- Lots of cleanups and device-tree support for the Exynos IOMMU
driver. This is part of the effort to bring Exynos DRM support
upstream.
- Introduction of default domains into the IOMMU core code.
The rationale behind this is to move functionality out of the IOMMU
drivers to common code to get to a unified behavior between
different drivers. The patches here introduce a default domain for
iommu-groups (isolation groups).
A device will now always be attached to a domain, either the
default domain or another domain handled by the device driver. The
IOMMU drivers have to be modified to make use of that feature. So
far the AMD IOMMU driver is converted, with others to follow.
- Patches for the Intel VT-d driver to fix DMAR faults that happen
when a kdump kernel boots.
When the kdump kernel boots it re-initializes the IOMMU hardware,
which destroys all mappings from the crashed kernel. As this
happens before the endpoint devices are re-initialized, any
in-flight DMA causes a DMAR fault. These faults cause PCI master
aborts, which some devices can't handle properly and go into an
undefined state, so that the device driver in the kdump kernel
fails to initialize them and the dump fails.
This is now fixed by copying over the mapping structures (only
context tables and interrupt remapping tables) from the old kernel
and keeping the old mappings in place until the device driver of the
new kernel takes over. This emulates the behavior without an
IOMMU to the best degree possible.
- A couple of other small fixes and cleanups"
* tag 'iommu-updates-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (69 commits)
iommu/amd: Handle large pages correctly in free_pagetable
iommu/vt-d: Don't disable IR when it was previously enabled
iommu/vt-d: Make sure copied over IR entries are not reused
iommu/vt-d: Copy IR table from old kernel when in kdump mode
iommu/vt-d: Set IRTA in intel_setup_irq_remapping
iommu/vt-d: Disable IRQ remapping in intel_prepare_irq_remapping
iommu/vt-d: Move QI initializationt to intel_setup_irq_remapping
iommu/vt-d: Move EIM detection to intel_prepare_irq_remapping
iommu/vt-d: Enable Translation only if it was previously disabled
iommu/vt-d: Don't disable translation prior to OS handover
iommu/vt-d: Don't copy translation tables if RTT bit needs to be changed
iommu/vt-d: Don't do early domain assignment if kdump kernel
iommu/vt-d: Allocate si_domain in init_dmars()
iommu/vt-d: Mark copied context entries
iommu/vt-d: Do not re-use domain-ids from the old kernel
iommu/vt-d: Copy translation tables from old kernel
iommu/vt-d: Detect pre enabled translation
iommu/vt-d: Make root entry visible for hardware right after allocation
iommu/vt-d: Init QI before root entry is allocated
iommu/vt-d: Cleanup log messages
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.
The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.
The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).
The main changes in this cycle were:
* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)
- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:
[IOAPIC domain]   -----
                       |
[MSI domain]      --------[Remapping domain] ----- [ Vector domain ]
                       |   (optional)          |
[HPET MSI domain] -----                        |
                                               |
[DMAR domain]     -----------------------------
                                               |
[Legacy domain]   -----------------------------
This now reflects the actual hardware and allowed us to disentangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.
- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)
* x86/asm changes:
- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)
- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)
- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)
- NOP handling refinements. (Borislav Petkov)
* x86/mm changes:
- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)
- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)
* x86/ras changes:
- Add support for deferred errors on AMD (Aravind Gopalakrishnan)
This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolong system lifetime as
far as possible.
- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-wide
broadcast MCE interrupts (Ashok Raj)
- Misc cleanups (Borislav Petkov)
* x86/platform changes:
- Intel Atom SoC updates
... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"
* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...
|
|
When we are booting into a kdump kernel and find IR enabled,
copy over the contents of the previous IR table so that
spurious interrupts will not be target aborted.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add code to detect whether translation is already enabled in
the IOMMU. Save this state in a flags field added to
struct intel_iommu.
Tested-by: ZhenHua Li <zhen-hual@hp.com>
Tested-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
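The detection step described above can be sketched in userspace C. This is a simplified illustration, not the driver's code: it assumes Translation Enable Status is bit 31 of the Global Status register (as in the VT-d specification), and the struct, the flag name and both helpers are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Translation Enable Status is bit 31 of the Global Status register. */
#define GSTS_TES (UINT32_C(1) << 31)

/* Hypothetical flag kept in the per-IOMMU state. */
#define FLAG_TRANS_PRE_ENABLED (1u << 0)

/* Simplified stand-in for struct intel_iommu. */
struct iommu_state {
    uint32_t gsts;       /* value read from the Global Status register */
    unsigned int flags;  /* software state, including the new flag */
};

/* Record whether translation was already on when this kernel took over. */
static void init_translation_status(struct iommu_state *iommu)
{
    if (iommu->gsts & GSTS_TES)
        iommu->flags |= FLAG_TRANS_PRE_ENABLED;
}

/* Later code can query the saved state instead of re-reading hardware. */
static bool translation_pre_enabled(const struct iommu_state *iommu)
{
    return (iommu->flags & FLAG_TRANS_PRE_ENABLED) != 0;
}
```

Saving the result in a flag means the kdump path can still tell what the previous kernel left behind after the hardware state has been changed.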
|
|
Add helper function to detect VT-d Posted-Interrupts capability.
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
Acked-by: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1433827237-3382-8-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
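A helper of this kind is usually a one-line bit test on the Capability register. The sketch below assumes Posted Interrupt Support is reported in bit 59 of that register; the bit position is not stated in the message above and should be checked against the VT-d specification.

```c
#include <stdint.h>

/*
 * Assumption: Posted Interrupt Support is bit 59 of the VT-d
 * Capability register. Returns 1 if supported, 0 otherwise.
 */
static inline int cap_pi_support(uint64_t cap)
{
    return (int)((cap >> 59) & 1);
}
```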
|
|
The existing hardware implementations with PASID support advertised in
bit 28? Forget them. They do not exist. Bit 28 means nothing. When we
have something that works, it'll use bit 40. Do not attempt to infer
anything meaningful from bit 28.
This will be reflected in an updated VT-d spec in the extremely near
future.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
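In code, the rule above comes down to testing bit 40 of the Extended Capability register and ignoring bit 28. A minimal sketch, with a helper name chosen for illustration:

```c
#include <stdint.h>

/*
 * PASID support is signalled by bit 40 of the Extended Capability
 * register. Bit 28 is deliberately ignored, per the message above.
 */
static inline int ecap_pasid(uint64_t ecap)
{
    return (int)((ecap >> 40) & 1);
}
```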
|
|
Conflicts:
arch/x86/kernel/apic/io_apic.c
arch/x86/kernel/apic/vector.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Enhance Intel interrupt remapping driver to support hierarchical
irqdomains. Implement intel_ir_chip to support stacked irq_chip.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: David Cohen <david.a.cohen@linux.intel.com>
Cc: Sander Eikelenboom <linux@eikelenboom.it>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: iommu@lists.linux-foundation.org
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Link: http://lkml.kernel.org/r/1428905519-23704-11-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
As far back as I can see (which right now is a draft of the v1.2 spec
dating from September 2008), bits 24-31 of the Extended Capability Register
have already been reserved. I have no idea why anyone ever thought there
would be multiple sets of IOTLB registers, but we've never supported them
and all we do is make sure we map enough MMIO space for them.
Kill it dead. Those bits do actually have a different meaning now.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Register our DRHD IOMMUs, cross link devices, and provide a base set
of attributes for the IOMMU. Note that IRQ remapping support parses
the DMAR table very early in boot, well before the iommu_class can
reasonably be set up, so our registration is split between
intel_iommu_init(), which occurs later, and alloc_iommu(), which
typically occurs much earlier, but may happen at any time later
with IOMMU hot-add support.
On a typical desktop system, this provides the following (pruned):
$ find /sys | grep dmar
/sys/devices/virtual/iommu/dmar0
/sys/devices/virtual/iommu/dmar0/devices
/sys/devices/virtual/iommu/dmar0/devices/0000:00:02.0
/sys/devices/virtual/iommu/dmar0/intel-iommu
/sys/devices/virtual/iommu/dmar0/intel-iommu/cap
/sys/devices/virtual/iommu/dmar0/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar0/intel-iommu/address
/sys/devices/virtual/iommu/dmar0/intel-iommu/version
/sys/devices/virtual/iommu/dmar1
/sys/devices/virtual/iommu/dmar1/devices
/sys/devices/virtual/iommu/dmar1/devices/0000:00:00.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:01.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:16.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1a.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1b.0
/sys/devices/virtual/iommu/dmar1/devices/0000:00:1c.0
...
/sys/devices/virtual/iommu/dmar1/intel-iommu
/sys/devices/virtual/iommu/dmar1/intel-iommu/cap
/sys/devices/virtual/iommu/dmar1/intel-iommu/ecap
/sys/devices/virtual/iommu/dmar1/intel-iommu/address
/sys/devices/virtual/iommu/dmar1/intel-iommu/version
/sys/class/iommu/dmar0
/sys/class/iommu/dmar1
(devices also link back to the dmar units)
This makes address, version, capabilities, and extended capabilities
available, just like printed on boot. I've tried not to duplicate
data that can be found in the DMAR table, with the exception of the
address, which provides an easy way to associate the sysfs device with
a DRHD entry in the DMAR. It's tempting to add scopes and RMRR data
here, but the full DMAR table is already exposed under /sys/firmware/
and therefore already provides a way for userspace to learn such
details.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
The drhd->iommu data structure is shared between the DMA remapping driver
and the interrupt remapping driver, so the DMA remapping driver shouldn't
release drhd->iommu when it fails to initialize IOMMU devices. Otherwise
the interrupt remapping driver may access invalid memory.
Sample stack dump:
[ 13.315090] BUG: unable to handle kernel paging request at ffffc9000605a088
[ 13.323221] IP: [<ffffffff81461bac>] qi_submit_sync+0x15c/0x400
[ 13.330107] PGD 82f81e067 PUD c2f81e067 PMD 82e846067 PTE 0
[ 13.336818] Oops: 0002 [#1] SMP
[ 13.340757] Modules linked in:
[ 13.344422] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.13.0-rc1-gerry+ #7
[ 13.352474] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[ 13.365659] Workqueue: events work_for_cpu_fn
[ 13.370774] task: ffff88042ddf00d0 ti: ffff88042ddee000 task.ti: ffff88042ddee000
[ 13.379389] RIP: 0010:[<ffffffff81461bac>] [<ffffffff81461bac>] qi_submit_sync+0x15c/0x400
[ 13.389055] RSP: 0000:ffff88042ddef940 EFLAGS: 00010002
[ 13.395151] RAX: 00000000000005e0 RBX: 0000000000000082 RCX: 0000000200000025
[ 13.403308] RDX: ffffc9000605a000 RSI: 0000000000000010 RDI: ffff88042ddb8610
[ 13.411446] RBP: ffff88042ddef9a0 R08: 00000000000005d0 R09: 0000000000000001
[ 13.419599] R10: 0000000000000000 R11: 000000000000005d R12: 000000000000005c
[ 13.427742] R13: ffff88102d84d300 R14: 0000000000000174 R15: ffff88042ddb4800
[ 13.435877] FS: 0000000000000000(0000) GS:ffff88043de00000(0000) knlGS:0000000000000000
[ 13.445168] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 13.451749] CR2: ffffc9000605a088 CR3: 0000000001a0b000 CR4: 00000000000407f0
[ 13.459895] Stack:
[ 13.462297] ffff88042ddb85d0 000000000000005d ffff88042ddef9b0 00000000000005d0
[ 13.471147] 00000000000005c0 ffff88042ddb8000 000000000000005c 0000000000000015
[ 13.480001] ffff88042ddb4800 0000000000000282 ffff88042ddefa40 ffff88042ddefac0
[ 13.488855] Call Trace:
[ 13.491771] [<ffffffff8146848d>] modify_irte+0x9d/0xd0
[ 13.497778] [<ffffffff8146886d>] intel_setup_ioapic_entry+0x10d/0x290
[ 13.505250] [<ffffffff810a92a6>] ? trace_hardirqs_on_caller+0x16/0x1e0
[ 13.512824] [<ffffffff810346b0>] ? default_init_apic_ldr+0x60/0x60
[ 13.519998] [<ffffffff81468be0>] setup_ioapic_remapped_entry+0x20/0x30
[ 13.527566] [<ffffffff8103683a>] io_apic_setup_irq_pin+0x12a/0x2c0
[ 13.534742] [<ffffffff8136673b>] ? acpi_pci_irq_find_prt_entry+0x2b9/0x2d8
[ 13.544102] [<ffffffff81037fd5>] io_apic_setup_irq_pin_once+0x85/0xa0
[ 13.551568] [<ffffffff8103816f>] ? mp_find_ioapic_pin+0x8f/0xf0
[ 13.558434] [<ffffffff81038044>] io_apic_set_pci_routing+0x34/0x70
[ 13.565621] [<ffffffff8102f4cf>] mp_register_gsi+0xaf/0x1c0
[ 13.572111] [<ffffffff8102f5ee>] acpi_register_gsi_ioapic+0xe/0x10
[ 13.579286] [<ffffffff8102f33f>] acpi_register_gsi+0xf/0x20
[ 13.585779] [<ffffffff81366b86>] acpi_pci_irq_enable+0x171/0x1e3
[ 13.592764] [<ffffffff8146d771>] pcibios_enable_device+0x31/0x40
[ 13.599744] [<ffffffff81320e9b>] do_pci_enable_device+0x3b/0x60
[ 13.606633] [<ffffffff81322248>] pci_enable_device_flags+0xc8/0x120
[ 13.613887] [<ffffffff813222f3>] pci_enable_device+0x13/0x20
[ 13.620484] [<ffffffff8132fa7e>] pcie_port_device_register+0x1e/0x510
[ 13.627947] [<ffffffff810a92a6>] ? trace_hardirqs_on_caller+0x16/0x1e0
[ 13.635510] [<ffffffff810a947d>] ? trace_hardirqs_on+0xd/0x10
[ 13.642189] [<ffffffff813302b8>] pcie_portdrv_probe+0x58/0xc0
[ 13.648877] [<ffffffff81323ba5>] local_pci_probe+0x45/0xa0
[ 13.655266] [<ffffffff8106bc44>] work_for_cpu_fn+0x14/0x20
[ 13.661656] [<ffffffff8106fa79>] process_one_work+0x369/0x710
[ 13.668334] [<ffffffff8106fa02>] ? process_one_work+0x2f2/0x710
[ 13.675215] [<ffffffff81071d56>] ? worker_thread+0x46/0x690
[ 13.681714] [<ffffffff81072194>] worker_thread+0x484/0x690
[ 13.688109] [<ffffffff81071d10>] ? cancel_delayed_work_sync+0x20/0x20
[ 13.695576] [<ffffffff81079c60>] kthread+0xf0/0x110
[ 13.701300] [<ffffffff8108e7bf>] ? local_clock+0x3f/0x50
[ 13.707492] [<ffffffff81079b70>] ? kthread_create_on_node+0x250/0x250
[ 13.714959] [<ffffffff81574d2c>] ret_from_fork+0x7c/0xb0
[ 13.721152] [<ffffffff81079b70>] ? kthread_create_on_node+0x250/0x250
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Functions alloc_iommu() and parse_ioapics_under_ir()
are only used internally, so mark them as static.
[Joerg: Made detect_intel_iommu() non-static again for IA64]
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Currently the Intel interrupt remapping driver uses the "present" flag bit
in a remapping entry to track whether the entry is allocated or not.
It works as follows:
1) allocate a remapping entry and set its "present" flag bit to 1
2) compose other fields for the entry
3) update the remapping entry with the composed value
The remapping hardware may access the entry between step 1 and step 3,
and then observes an entry with the "present" flag set but random
values in all other fields.
This patch introduces a dedicated bitmap to track remapping entry
allocation status instead of sharing the "present" flag with hardware,
thus eliminating the race window. It also simplifies the implementation.
Tested-and-reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
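The bitmap approach can be sketched as follows. This is a simplified userspace model, not the driver's code: the table size, the struct layout and the function names are hypothetical, and the locking and memory barriers that real hardware access needs are omitted.

```c
#include <stdint.h>
#include <string.h>

#define IR_TABLE_ENTRIES 64  /* hypothetical small table for illustration */

/* Simplified remapping entry; bit 0 of 'low' models the "present" flag. */
struct irte {
    uint64_t low;
    uint64_t high;
};

struct ir_table {
    struct irte entries[IR_TABLE_ENTRIES]; /* what the hardware would read */
    uint64_t allocated;                    /* software-only bitmap, one bit per entry */
};

/* Reserve a free slot in the bitmap; the hardware-visible entry is not touched. */
static int alloc_irte(struct ir_table *t)
{
    for (int i = 0; i < IR_TABLE_ENTRIES; i++) {
        if (!(t->allocated & (UINT64_C(1) << i))) {
            t->allocated |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1; /* table full */
}

/* Publish a fully composed entry; "present" appears together with valid fields. */
static void program_irte(struct ir_table *t, int idx, struct irte composed)
{
    composed.low |= 1;
    t->entries[idx].high = composed.high;
    t->entries[idx].low = composed.low; /* half holding "present" is written last */
}

/* Clear the entry, then return the slot to the allocator. */
static void free_irte(struct ir_table *t, int idx)
{
    memset(&t->entries[idx], 0, sizeof(t->entries[idx]));
    t->allocated &= ~(UINT64_C(1) << idx);
}
```

Because allocation only flips a bit that hardware never reads, there is no window in which the hardware can see a half-initialized entry marked present.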
|
|
According to the Intel VT-d spec, the offset of the Invalidation Completion
Status register should be 0x9C, not 0x98.
See Intel's VT-d spec, Revision 1.3, Chapter 10.4, Page 98.
Signed-off-by: Li, Zhen-Hua <zhen-hual@hp.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Intel-iommu initialization doesn't currently reserve the memory
used for the IOMMU registers. This can allow the PCI resource
allocator to assign a device BAR to the same address as the
IOMMU registers, which can cause some not so nice side effects
when the driver ioremaps that region.
Introduce two helper functions to map and unmap the IOMMU
registers, and simplify the init and exit paths.
Signed-off-by: Donald Dutile <ddutile@redhat.com>
Acked-by: Chris Wright <chrisw@redhat.com>
Cc: iommu@lists.linux-foundation.org
Cc: suresh.b.siddha@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/1338845342-12464-3-git-send-email-ddutile@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock()
lockdep: Comment all warnings
lib: atomic64: Change the type of local lock to raw_spinlock_t
locking, lib/atomic64: Annotate atomic64_lock::lock as raw
locking, x86, iommu: Annotate qi->q_lock as raw
locking, x86, iommu: Annotate irq_2_ir_lock as raw
locking, x86, iommu: Annotate iommu->register_lock as raw
locking, dma, ipu: Annotate bank_lock as raw
locking, ARM: Annotate low level hw locks as raw
locking, drivers/dca: Annotate dca_lock as raw
locking, powerpc: Annotate uic->lock as raw
locking, x86: mce: Annotate cmci_discover_lock as raw
locking, ACPI: Annotate c3_lock as raw
locking, oprofile: Annotate oprofilefs lock as raw
locking, video: Annotate vga console lock as raw
locking, latencytop: Annotate latency_lock as raw
locking, timer_stats: Annotate table_lock as raw
locking, rwsem: Annotate inner lock as raw
locking, semaphores: Annotate inner lock as raw
locking, sched: Annotate thread_group_cputimer as raw
...
Fix up conflicts in kernel/posix-cpu-timers.c manually: making
cputimer->cputime a raw lock conflicted with the ABBA fix in commit
bcd5cff7216f ("cputimer: Cure lock inversion").
|
|
Change the CONFIG_DMAR to CONFIG_INTEL_IOMMU to be consistent
with the other IOMMU options.
Rename the CONFIG_INTR_REMAP to CONFIG_IRQ_REMAP to match the
irq subsystem name.
And define the CONFIG_DMAR_TABLE for the common ACPI DMAR
routines shared by both CONFIG_INTEL_IOMMU and CONFIG_IRQ_REMAP.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: youquan.song@intel.com
Cc: joerg.roedel@amd.com
Cc: tony.luck@intel.com
Cc: dwmw2@infradead.org
Link: http://lkml.kernel.org/r/20110824001456.558630224@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
The qi->q_lock lock can be taken in atomic context and therefore
cannot be preempted on -rt - annotate it.
In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
The iommu->register_lock can be taken in atomic context and therefore
must not be preempted on -rt - annotate it.
In mainline this change documents the low level nature of
the lock - otherwise there's no functional difference. Lockdep
and Sparse checking will work as usual.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Add support for parsing the Remapping Hardware Static Affinity (RHSA) structure.
This enables identifying the association between remapping hardware units and
the corresponding proximity domain, which makes it possible to allocate
translation structures closer to the remapping hardware unit.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
The BIOS clears the INTR_REMAP flag in the DMAR table to disable interrupt
remapping. The current kernel only checks the interrupt remapping (IR) flag in
the DRHD's extended capability register to decide whether interrupt remapping
is supported. But the IR flag does not change when the BIOS disables or
enables interrupt remapping.
Interrupt remapping may be disabled by the user in the BIOS, and an immature
BIOS often disables it by default. Even though the BIOS has disabled
interrupt remapping, the intr_remapping_supported function will always report
to the OS that interrupt remapping is supported if a VT-d2 chipset is
populated. In these cases, the kernel goes on to enable interrupt remapping,
which results in a kernel panic.
This bug exists on almost all platforms with interrupt remapping support.
This patch adds a check of the DMAR table INTR_REMAP flag before enabling
interrupt remapping.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
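The added check can be sketched as a two-stage test: first the firmware's DMAR table flag, then the hardware capability. The bit positions below are assumptions (INTR_REMAP as bit 0 of the DMAR flags byte, IR support as bit 3 of the extended capability register), and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: INTR_REMAP is bit 0 of the flags byte in the ACPI DMAR table. */
#define DMAR_FLAG_INTR_REMAP 0x1u

/* Assumption: interrupt remapping support is bit 3 of the extended capability register. */
#define ECAP_IR_SUPPORT(ecap) (((ecap) >> 3) & 1)

/*
 * Interrupt remapping is usable only if the firmware left it enabled
 * in the DMAR table AND the hardware advertises the capability.
 */
static bool intr_remapping_usable(uint8_t dmar_flags, uint64_t ecap)
{
    if (!(dmar_flags & DMAR_FLAG_INTR_REMAP))
        return false; /* BIOS disabled it; the capability bit alone is not enough */
    return ECAP_IR_SUPPORT(ecap) != 0;
}
```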
|
|
Enable the device IOTLB (i.e. ATS) for both the bare metal and KVM
environments.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Support device IOTLB invalidation to flush the translation cached
in the Endpoint.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Parse the Root Port ATS Capability Reporting Structure in the DMA
Remapping Reporting Structure ACPI table.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
As we just did for context cache flushing, clean up the logic around
whether we need to flush the iotlb or just the write-buffer, depending
on caching mode.
Fix the same bug in qi_flush_iotlb() that qi_flush_context() had -- it
isn't supposed to be returning an error; it's supposed to be returning a
flag which triggers a write-buffer flush.
Remove some superfluous conditional write-buffer flushes which could
never have happened because they weren't for non-present-to-present
mapping changes anyway.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
It really doesn't make a lot of sense to have some of the logic to
handle caching vs. non-caching mode duplicated in qi_flush_context() and
__iommu_flush_context(), while the return value indicates whether the
caller should take other action which depends on the same thing.
Especially since qi_flush_context() thought it was returning something
entirely different anyway.
This patch makes qi_flush_context() and __iommu_flush_context() both
return void, removes the 'non_present_entry_flush' argument and makes
the only call site which _set_ that argument to 1 do the right thing.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
This patch adds the kernel parameter intel_iommu=pt to set up pass through
mode in context mapping entries. This disables DMAR in the Linux kernel, but
KVM still runs on VT-d and interrupt remapping still works.
In this mode, the kernel uses swiotlb for DMA API functions but other VT-d
functionality is enabled for KVM. KVM always uses a multi-level
translation page table in VT-d. By default, pass through mode is disabled
in the kernel.
This is useful when people don't want to enable VT-d DMAR in the kernel but
still want to use KVM and interrupt remapping, for reasons such as DMAR
performance concerns or debugging.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Weidong Han <weidong@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
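Option parsing for a parameter like this can be sketched as a walk over a comma-separated string. The code below is a userspace illustration only: the struct and function names are hypothetical, and only the "pt" option comes from the message above ("off" is included as a second example).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the parsed options. */
struct iommu_opts {
    bool disabled;      /* "off" */
    bool pass_through;  /* "pt"  */
};

/* Parse the value of intel_iommu=..., e.g. "pt" or "off,pt". */
static void parse_intel_iommu(const char *str, struct iommu_opts *opts)
{
    while (*str) {
        size_t len = strcspn(str, ","); /* length of the current token */

        if (len == 3 && strncmp(str, "off", 3) == 0)
            opts->disabled = true;
        else if (len == 2 && strncmp(str, "pt", 2) == 0)
            opts->pass_through = true;
        /* unknown tokens are ignored in this sketch */

        str += len;
        if (*str == ',')
            str++;
    }
}
```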
|
|
When extended interrupt mode (x2apic mode) is not supported in a
system, the kernel must enable compatibility format interrupts so that
they bypass interrupt remapping; otherwise compatibility format
interrupts will be blocked.
This will be used when interrupt remapping is enabled while x2apic
is not supported.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
This patch implements the suspend and resume feature for Intel IOMMU
DMAR. It hooks to kernel suspend and resume interface. When suspend happens, it
saves necessary hardware registers. When resume happens, it restores the
registers and restarts IOMMU by enabling translation, setting up root entry, and
re-enabling queued invalidation.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
* git://git.infradead.org/iommu-2.6:
intel-iommu: Fix address wrap on 32-bit kernel.
intel-iommu: Enable DMAR on 32-bit kernel.
intel-iommu: fix PCI device detach from virtual machine
intel-iommu: VT-d page table to support snooping control bit
iommu: Add domain_has_cap iommu_ops
intel-iommu: Snooping control support
Fixed trivial conflicts in arch/x86/Kconfig and drivers/pci/intel-iommu.c
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
dma-debug: make memory range checks more consistent
dma-debug: warn of unmapping an invalid dma address
dma-debug: fix dma_debug_add_bus() definition for !CONFIG_DMA_API_DEBUG
dma-debug/x86: register pci bus for dma-debug leak detection
dma-debug: add a check dma memory leaks
dma-debug: add checks for kernel text and rodata
dma-debug: print stacktrace of mapping path on unmap error
dma-debug: Documentation update
dma-debug: x86 architecture bindings
dma-debug: add function to dump dma mappings
dma-debug: add checks for sync_single_sg_*
dma-debug: add checks for sync_single_range_*
dma-debug: add checks for sync_single_*
dma-debug: add checking for [alloc|free]_coherent
dma-debug: add add checking for map/unmap_sg
dma-debug: add checking for map/unmap_page/single
dma-debug: add core checking functions
dma-debug: add debugfs interface
dma-debug: add kernel command line parameters
dma-debug: add initialization code
...
Fix trivial conflicts due to whitespace changes in arch/x86/kernel/pci-nommu.c
|
|
Snooping control enables the IOMMU to guarantee DMA cache coherency and thus
reduces the software (VMM) effort of maintaining the effective memory type.
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Impact: new interfaces (not yet used)
Routines for disabling queued invalidation and interrupt remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
Impact: interface augmentation (not yet used)
Enable fault handling flow for intr-remapping as well. Fault handling
code is now shared by both dma-remapping and intr-remapping.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
|
|
When hardware detects any error with a descriptor from the invalidation
queue, it stops fetching new descriptors from the queue until software
clears the Invalidation Queue Error bit in the Fault Status register.
The following fix handles the IQE so the kernel won't be trapped in an
infinite loop.
Signed-off-by: Yu Zhao <yu.zhao@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
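The handling can be sketched against a simulated register. The code assumes the Invalidation Queue Error bit is bit 4 of the Fault Status register and that it is cleared by writing 1 to it; the register model and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Invalidation Queue Error is bit 4 of the Fault Status register. */
#define FSTS_IQE (UINT32_C(1) << 4)

/* Simulated register file; a real driver would use MMIO accessors. */
struct fake_regs {
    uint32_t fsts;
};

static uint32_t read_fsts(const struct fake_regs *regs)
{
    return regs->fsts;
}

/* Models write-1-to-clear semantics: bits set in 'val' are cleared. */
static void write_fsts(struct fake_regs *regs, uint32_t val)
{
    regs->fsts &= ~val;
}

/*
 * Called from the loop that waits for an invalidation to complete.
 * Returns true if an error was found and acknowledged, in which case
 * the caller should stop waiting instead of spinning forever.
 */
static bool check_and_clear_iqe(struct fake_regs *regs)
{
    if (!(read_fsts(regs) & FSTS_IQE))
        return false;
    write_fsts(regs, FSTS_IQE);
    return true;
}
```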
|
|
The dma ops unification enables X86 and IA64 to share intel_dma_ops so
we can make dma mapping functions static. This also removes the unused
intel_map_single().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This converts X86 and IA64 to use include/linux/dma-mapping.h.
It's a bit large but pretty boring. The major change for X86 is
converting 'int dir' to 'enum dma_data_direction dir' in DMA mapping
operations. The major change for IA64 is using map_page and
unmap_page instead of map_single and unmap_single.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
These APIs are used by KVM to use VT-d
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|
|
The "SAGAW" capability may differ across IOMMUs. Use a default agaw, but if the default agaw is not supported by some IOMMUs, choose a smaller supported agaw.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
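The fallback can be sketched as a downward scan over the SAGAW bitmask, starting at the default. The default index used here is an assumption for illustration, and the function name is hypothetical.

```c
#include <stdint.h>

/* Hypothetical default: agaw index 2. */
#define DEFAULT_AGAW 2

/*
 * 'sagaw' is the bitmask of supported adjusted guest address widths;
 * bit N set means agaw N is supported. Start at the default and walk
 * downwards until a supported value is found. Returns -1 if none is.
 */
static int choose_agaw(uint32_t sagaw)
{
    for (int agaw = DEFAULT_AGAW; agaw >= 0; agaw--) {
        if (sagaw & (UINT32_C(1) << agaw))
            return agaw;
    }
    return -1;
}
```

Scanning downwards means an IOMMU that lacks the default still gets the largest width it supports that does not exceed the default.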
|
|
The seg, saved_msg and sysdev fields appear to be unused since
before the code was first merged.
linux/msi.h is not needed in linux/intel-iommu.h anymore since
there is no longer a reference to struct msi_msg. The MSI code
in drivers/pci/intel-iommu.c still has linux/msi.h included
via linux/dmar.h.
linux/sysdev.h isn't needed because there is no reference to
struct sys_device.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
The current Intel IOMMU code assumes that both host page size and Intel
IOMMU page size are 4KiB. The first patch supports variable page size.
This provides support for IA64 which has multiple page sizes.
This patch also adds some other code hooks for IA64 platform including
DMAR_OPERATION_TIMEOUT definition.
[dwmw2: some cleanup]
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
If the queued invalidation interface is available and enabled, it
will be used instead of the register based interface.
According to the VT-d2 specification, when queued invalidation is enabled,
invalidation command submission works only through the invalidation queue and
not through the command register interface.
Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
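One way to express this choice is to pick the flush routine once, at initialization, through a function pointer. The sketch below is a userspace model with hypothetical names; the two flush routines only record which path ran.

```c
#include <stdbool.h>
#include <stddef.h>

enum flush_path { PATH_NONE, PATH_REGISTER, PATH_QUEUED };

struct iommu;
typedef void (*flush_fn)(struct iommu *iommu);

/* Simplified stand-in for the per-IOMMU state. */
struct iommu {
    bool qi_enabled;            /* queued invalidation available and enabled */
    flush_fn flush_iotlb;       /* selected once at init time */
    enum flush_path last_path;  /* records which path ran, for illustration */
};

/* Register based invalidation (placeholder body). */
static void register_flush_iotlb(struct iommu *iommu)
{
    iommu->last_path = PATH_REGISTER;
}

/* Queued invalidation (placeholder body). */
static void queued_flush_iotlb(struct iommu *iommu)
{
    iommu->last_path = PATH_QUEUED;
}

/* With QI enabled, every flush must go through the queue. */
static void select_flush_ops(struct iommu *iommu)
{
    iommu->flush_iotlb = iommu->qi_enabled ? queued_flush_iotlb
                                           : register_flush_iotlb;
}
```

Callers then invoke `iommu->flush_iotlb(iommu)` without needing to know which interface is active.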
|