Age | Commit message (Collapse) | Author | Files | Lines |
|
Move all files related to the Intel IOMMU driver into its own
subdirectory.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200609130303.26974-3-joro@8bytes.org
|
|
'hyper-v', 'core' and 'x86/amd' into next
|
|
By removing the real DMA indirection in find_domain(), we can allow
sub-devices of a real DMA device to have their own valid
device_domain_info. The dmar lookup and context entry removal paths have
been fixed to account for sub-devices.
Fixes: 2b0140c69637 ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200527165617.297470-4-jonathan.derrick@intel.com
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207575
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Sub-devices of a real DMA device might exist on a separate segment than
the real DMA device and its IOMMU. These devices should still have a
valid device_domain_info, but the current dma alias model won't
allocate info for the subdevice.
This patch adds a segment member to struct device_domain_info and uses
the sub-device's BDF so that these sub-devices won't alias to other
devices.
Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable@vger.kernel.org # v5.6+
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200527165617.297470-3-jonathan.derrick@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Domain context mapping can encounter issues with sub-devices of a real
DMA device. A sub-device cannot have a valid context entry due to it
potentially aliasing another device's 16-bit ID. It's expected that
sub-devices of the real DMA device uses the real DMA device's requester
when context mapping.
This is an issue when a sub-device is removed where the context entry is
cleared for all aliases. Other sub-devices are still valid, resulting in
those sub-devices being stranded without valid context entries.
The correct approach is to use the real DMA device when programming the
context entries. The insertion path is correct because device_to_iommu()
will return the bus and devfn of the real DMA device. The removal path
needs to only operate on the real DMA device, otherwise the entire
context entry would be cleared for all sub-devices of the real DMA
device.
This patch also adds a helper to determine if a struct device is a
sub-device of a real DMA device.
Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: stable@vger.kernel.org # v5.6+
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200527165617.297470-2-jonathan.derrick@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The pci_ats_supported() helper checks if a device supports ATS and is
allowed to use it. By checking the ATS capability it also integrates the
pci_ats_disabled() check from pci_ats_init(). Simplify the vt-d checks.
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200520152201.3309416-5-jean-philippe@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The commit 6ee1b77ba3ac ("iommu/vt-d: Add svm/sva invalidate function")
introduced a GCC warning,
drivers/iommu/intel-iommu.c:5330:1: warning: 'static' is not at beginning of
declaration [-Wold-style-declaration]
const static int
^~~~~
Fixes: 6ee1b77ba3ac0 ("iommu/vt-d: Add svm/sva invalidate function")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200521215030.16938-1-cai@lca.pw
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
There's no need for the non-dma_ops path to keep track of IOVAs. The
whole point of the non-dma_ops path is that it allows the IOVAs to be
handled separately. The IOVA handling code removed in this patch is
pointless.
Signed-off-by: Tom Murphy <murphyt7@tcd.ie>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-19-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When a PASID is used for SVA by the device, it's possible that the PASID
entry is cleared before the device flushes all ongoing DMA requests. The
IOMMU should tolerate and ignore the non-recoverable faults caused by the
untranslated requests from this device.
For example, when an exception happens, the process terminates before the
device driver stops DMA and call IOMMU driver to unbind PASID. The flow
of process exist is as follows:
do_exit() {
exit_mm() {
mm_put();
exit_mmap() {
intel_invalidate_range() //mmu notifier
tlb_finish_mmu()
mmu_notifier_release(mm) {
intel_iommu_release() {
[2] intel_iommu_teardown_pasid();
intel_iommu_flush_tlbs();
}
}
unmap_vmas();
free_pgtables();
};
}
exit_files(tsk) {
close_files() {
dsa_close();
[1] dsa_stop_dma();
intel_svm_unbind_pasid();
}
}
}
Care must be taken on VT-d to avoid unrecoverable faults between the time
window of [1] and [2]. [Process exist flow was contributed by Jacob Pan.]
Intel VT-d provides such function through the FPD bit of the PASID entry.
This sets FPD bit when PASID entry is changing from present to nonpresent
in the mm notifier and will clear it when the pasid is unbound.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-15-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This patch is an initial step to replace Intel SVM code with the
following IOMMU SVA ops:
intel_svm_bind_mm() => iommu_sva_bind_device()
intel_svm_unbind_mm() => iommu_sva_unbind_device()
intel_svm_is_pasid_valid() => iommu_sva_get_pasid()
The features below will continue to work but are not included in this patch
in that they are handled mostly within the IOMMU subsystem.
- IO page fault
- mmu notifier
Consolidation of the above will come after merging generic IOMMU sva
code[1]. There should not be any changes needed for SVA users such as
accelerator device drivers during this time.
[1] http://jpbrucker.net/sva/
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-12-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Query Shared Virtual Address/Memory capability is a generic feature.
SVA feature check is the required first step before calling
iommu_sva_bind_device().
VT-d checks SVA feature enabling at per IOMMU level during this step,
SVA bind device will check and enable PCI ATS, PRS, and PASID capabilities
at device level.
This patch reports Intel SVM as SVA feature such that generic code
(e.g. Uacce [1]) can use it.
[1] https://lkml.org/lkml/2020/1/15/604
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-11-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add a get_domain_info() helper to retrieve the valid per-device
iommu private data.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-10-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When VT-d driver runs in the guest, PASID allocation must be
performed via virtual command interface. This patch registers a
custom IOASID allocator which takes precedence over the default
XArray based allocator. The resulting IOASID allocation will always
come from the host. This ensures that PASID namespace is system-
wide.
Virtual command registers are used in the guest only, to prevent
vmexit cost, we cache the capability and store it during initialization.
Signed-off-by: Liu, Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200516062101.29541-9-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When Shared Virtual Address (SVA) is enabled for a guest OS via
vIOMMU, we need to provide invalidation support at IOMMU API and driver
level. This patch adds Intel VT-d specific function to implement
iommu passdown invalidate API for shared virtual address.
The use case is for supporting caching structure invalidation
of assigned SVM capable devices. Emulated IOMMU exposes queue
invalidation capability and passes down all descriptors from the guest
to the physical IOMMU.
The assumption is that guest to host device ID mapping should be
resolved prior to calling IOMMU driver. Based on the device handle,
host IOMMU driver can replace certain fields before submit to the
invalidation queue.
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200516062101.29541-7-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
When supporting guest SVA with emulated IOMMU, the guest PASID
table is shadowed in VMM. Updates to guest vIOMMU PASID table
will result in PASID cache flush which will be passed down to
the host as bind guest PASID calls.
For the SL page tables, it will be harvested from device's
default domain (request w/o PASID), or aux domain in case of
mediated device.
.-------------. .---------------------------.
| vIOMMU | | Guest process CR3, FL only|
| | '---------------------------'
.----------------/
| PASID Entry |--- PASID cache flush -
'-------------' |
| | V
| | CR3 in GPA
'-------------'
Guest
------| Shadow |--------------------------|--------
v v v
Host
.-------------. .----------------------.
| pIOMMU | | Bind FL for GVA-GPA |
| | '----------------------'
.----------------/ |
| PASID Entry | V (Nested xlate)
'----------------\.------------------------------.
| | |SL for GPA-HPA, default domain|
| | '------------------------------'
'-------------'
Where:
- FL = First level/stage one page tables
- SL = Second level/stage two page tables
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-5-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Nested translation mode is supported in VT-d 3.0 Spec.CH 3.8.
With PASID granular translation type set to 0x11b, translation
result from the first level(FL) also subject to a second level(SL)
page table translation. This mode is used for SVA virtualization,
where FL performs guest virtual to guest physical translation and
SL performs guest physical to host physical translation.
This patch adds a helper function for setting up nested translation
where second level comes from a domain and first level comes from
a guest PGD.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Liu Yi L <yi.l.liu@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20200516062101.29541-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Move domain helper to header to be used by SVA code.
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20200516062101.29541-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Unify format of the printed messages, i.e. replace printk(LEVEL ... )
with pr_level(...).
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200507161804.13275-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Current Intel IOMMU driver sets the system level dma_ops. This causes
each dma API to go through the IOMMU driver even the devices are using
identity mapped domains. This sets per-device dma_ops only if a device
is using a DMA domain. Otherwise, use the default system level dma_ops
for direct dma.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20200506015947.28662-4-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Before commit fa954e6831789 ("iommu/vt-d: Delegate the dma domain
to upper layer"), Intel IOMMU started off with all devices in the
identity domain, and took them out later if it found they couldn't
access all of memory. This required devices behind a PCI bridge to
use a DMA domain at the beginning because all PCI devices behind
the bridge use the same source-id in their transactions and the
domain couldn't be changed at run-time.
Intel IOMMU driver is now aligned with the default domain framework,
there's no need to keep this requirement anymore.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20200506015947.28662-3-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently, if a 32bit device initially uses an identity domain, Intel
IOMMU driver will convert it forcibly to a DMA one if its address
capability is not enough for the whole system memory. The motivation was
to overcome the overhead caused by possible bounced buffer.
Unfortunately, this improvement has led to many problems. For example,
some 32bit devices are required to use an identity domain, forcing them
to use DMA domain will cause the device not to work anymore. On the
other hand, the VMD sub-devices share a domain but each sub-device might
have different address capability. Forcing a VMD sub-device to use DMA
domain blindly will impact the operation of other sub-devices without
any notification. Further more, PCI aliased devices (PCI bridge and all
devices beneath it, VMD devices and various devices quirked with
pci_add_dma_alias()) must use the same domain. Forcing one device to
switch to DMA domain during runtime will cause in-fligh DMAs for other
devices to abort or target to other memory which might cause undefind
system behavior.
With the last private domain usage in iommu_need_mapping() removed, all
private domain helpers are also cleaned in this patch. Otherwise, the
compiler will complain that some functions are defined but not used.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Daniel Drake <drake@endlessm.com>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Cc: Daniel Drake <drake@endlessm.com>
Cc: Derrick Jonathan <jonathan.derrick@intel.com>
Cc: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20200506015947.28662-2-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Linux 5.7-rc4
|
|
Convert the Intel IOMMU driver to use the probe_device() and
release_device() call-backs of iommu_ops, so that the iommu core code
does the group and sysfs setup.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200429133712.31431-17-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The Intel VT-d driver already has a matching function to determine the
default domain type for a device. Wire it up in intel_iommu_ops.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200429133712.31431-5-joro@8bytes.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The CONFIG_ prefix should be added in the code.
Fixes: 046182525db61 ("iommu/vt-d: Add Kconfig option to enable/disable scalable mode")
Reported-and-tested-by: Kumar, Sanjay K <sanjay.k.kumar@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Link: https://lore.kernel.org/r/20200501072427.14265-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
dmar_find_atsr() calls list_for_each_entry_rcu() outside of an RCU read
side critical section but with dmar_global_lock held. Silence this
false positive.
drivers/iommu/intel-iommu.c:4504 RCU-list traversed in non-reader section!!
1 lock held by swapper/0/1:
#0: ffffffff9755bee8 (dmar_global_lock){+.+.}, at: intel_iommu_init+0x1a6/0xe19
Call Trace:
dump_stack+0xa4/0xfe
lockdep_rcu_suspicious+0xeb/0xf5
dmar_find_atsr+0x1ab/0x1c0
dmar_parse_one_atsr+0x64/0x220
dmar_walk_remapping_entries+0x130/0x380
dmar_table_init+0x166/0x243
intel_iommu_init+0x1ab/0xe19
pci_iommu_init+0x1a/0x44
do_one_initcall+0xae/0x4d0
kernel_init_freeable+0x412/0x4c5
kernel_init+0x19/0x193
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Currently, the intel iommu debugfs directory(/sys/kernel/debug/iommu/intel)
gets populated only when DMA remapping is enabled (dmar_disabled = 0)
irrespective of whether interrupt remapping is enabled or not.
Instead, populate the intel iommu debugfs directory if any IOMMUs are
detected.
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: ee2636b8670b1 ("iommu/vt-d: Enable base Intel IOMMU debugfs support")
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
add_taint
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.
Fixes: 556ab45f9a77 ("ioat2: catch and recover from broken vtd configurations v6")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200309182510.373875-1-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=701847
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Quoting from the comment describing the WARN functions in
include/asm-generic/bug.h:
* WARN(), WARN_ON(), WARN_ON_ONCE, and so on can be used to report
* significant kernel issues that need prompt attention if they should ever
* appear at runtime.
*
* Do not use these macros when checking for invalid external inputs
The (buggy) firmware tables which the dmar code was calling WARN_TAINT
for really are invalid external inputs. They are not under the kernel's
control and the issues in them cannot be fixed by a kernel update.
So logging a backtrace, which invites bug reports to be filed about this,
is not helpful.
Some distros, e.g. Fedora, have tools watching for the kernel backtraces
logged by the WARN macros and offer the user an option to file a bug for
this when these are encountered. The WARN_TAINT in dmar_parse_one_rmrr
+ another iommu WARN_TAINT, addressed in another patch, have lead to over
a 100 bugs being filed this way.
This commit replaces the WARN_TAINT("...") call, with a
pr_warn(FW_BUG "...") + add_taint(TAINT_FIRMWARE_WORKAROUND, ...) call
avoiding the backtrace and thus also avoiding bug-reports being filed
about this against the kernel.
Fixes: f5a68bb0752e ("iommu/vt-d: Mark firmware tainted if RMRR fails sanity check")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Barret Rhoden <brho@google.com>
Link: https://lore.kernel.org/r/20200309140138.3753-3-hdegoede@redhat.com
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1808874
|
|
There are several places traverse RCU-list without holding any lock in
intel_iommu_init(). Fix them by acquiring dmar_global_lock.
WARNING: suspicious RCU usage
-----------------------------
drivers/iommu/intel-iommu.c:5216 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by swapper/0/1.
Call Trace:
dump_stack+0xa0/0xea
lockdep_rcu_suspicious+0x102/0x10b
intel_iommu_init+0x947/0xb13
pci_iommu_init+0x26/0x62
do_one_initcall+0xfe/0x500
kernel_init_freeable+0x45a/0x4f8
kernel_init+0x11/0x139
ret_from_fork+0x3a/0x50
DMAR: Intel(R) Virtualization Technology for Directed I/O
Fixes: d8190dc63886 ("iommu/vt-d: Enable DMA remapping after rmrr mapped")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
intel_iommu_iova_to_phys() has a bug when it translates an IOVA for a huge
page onto its corresponding physical address. This commit fixes the bug by
accomodating the level of page entry for the IOVA and adds IOVA's lower
address to the physical address.
Cc: <stable@vger.kernel.org>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Yonghyun Hwang <yonghyun@google.com>
Fixes: 3871794642579 ("VT-d: Changes to support KVM")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The function only has one call-site and there it is never called with
dummy or deferred devices. Simplify the check in the function to
account for that.
Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The function is now only a wrapper around find_domain(). Remove the
function and call find_domain() directly at the call-sites.
Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The attachment of deferred devices needs to happen before the check
whether the device is identity mapped or not. Otherwise the check will
return wrong results, cause warnings boot failures in kdump kernels, like
WARNING: CPU: 0 PID: 318 at ../drivers/iommu/intel-iommu.c:592 domain_get_iommu+0x61/0x70
[...]
Call Trace:
__intel_map_single+0x55/0x190
intel_alloc_coherent+0xac/0x110
dmam_alloc_attrs+0x50/0xa0
ahci_port_start+0xfb/0x1f0 [libahci]
ata_host_start.part.39+0x104/0x1e0 [libata]
With the earlier check the kdump boot succeeds and a crashdump is written.
Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Move the code that does the deferred device attachment into a separate
helper function.
Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Implement a helper function to check whether a device's attach process
is deferred.
Fixes: 1ee0186b9a12 ("iommu/vt-d: Refactor find_domain() helper")
Cc: stable@vger.kernel.org # v5.5
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Allow compiling the ARM-SMMU drivers as modules.
- Fixes and cleanups for the ARM-SMMU drivers and io-pgtable code
collected by Will Deacon. The merge-commit (6855d1ba7537) has all the
details.
- Cleanup of the iommu_put_resv_regions() call-backs in various
drivers.
- AMD IOMMU driver cleanups.
- Update for the x2APIC support in the AMD IOMMU driver.
- Preparation patches for Intel VT-d nested mode support.
- RMRR and identity domain handling fixes for the Intel VT-d driver.
- More small fixes and cleanups.
* tag 'iommu-updates-v5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (87 commits)
iommu/amd: Remove the unnecessary assignment
iommu/vt-d: Remove unnecessary WARN_ON_ONCE()
iommu/vt-d: Unnecessary to handle default identity domain
iommu/vt-d: Allow devices with RMRRs to use identity domain
iommu/vt-d: Add RMRR base and end addresses sanity check
iommu/vt-d: Mark firmware tainted if RMRR fails sanity check
iommu/amd: Remove unused struct member
iommu/amd: Replace two consecutive readl calls with one readq
iommu/vt-d: Don't reject Host Bridge due to scope mismatch
PCI/ATS: Add PASID stubs
iommu/arm-smmu-v3: Return -EBUSY when trying to re-add a device
iommu/arm-smmu-v3: Improve add_device() error handling
iommu/arm-smmu-v3: Use WRITE_ONCE() when changing validity of an STE
iommu/arm-smmu-v3: Add second level of context descriptor table
iommu/arm-smmu-v3: Prepare for handling arm_smmu_write_ctx_desc() failure
iommu/arm-smmu-v3: Propagate ssid_bits
iommu/arm-smmu-v3: Add support for Substream IDs
iommu/arm-smmu-v3: Add context descriptor tables allocators
iommu/arm-smmu-v3: Prepare arm_smmu_s1_cfg for SSID support
ACPI/IORT: Parse SSID property of named component node
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
"Resource management:
- Improve resource assignment for hot-added nested bridges, e.g.,
Thunderbolt (Nicholas Johnson)
Power management:
- Optionally print config space of devices before suspend (Chen Yu)
- Increase D3 delay for AMD Ryzen5/7 XHCI controllers (Daniel Drake)
Virtualization:
- Generalize DMA alias quirks (James Sewart)
- Add DMA alias quirk for PLX PEX NTB (James Sewart)
- Fix IOV memory leak (Navid Emamdoost)
AER:
- Log which device prevents error recovery (Yicong Yang)
Peer-to-peer DMA:
- Whitelist Intel SkyLake-E (Armen Baloyan)
Broadcom iProc host bridge driver:
- Apply PAXC quirk whether driver is built-in or module (Wei Liu)
Broadcom STB host bridge driver:
- Add Broadcom STB PCIe host controller driver (Jim Quinlan)
Intel Gateway SoC host bridge driver:
- Add driver for Intel Gateway SoC (Dilip Kota)
Intel VMD host bridge driver:
- Add support for DMA aliases on other buses (Jon Derrick)
- Remove dma_map_ops overrides (Jon Derrick)
- Remove now-unused X86_DEV_DMA_OPS (Christoph Hellwig)
NVIDIA Tegra host bridge driver:
- Fix Tegra30 afi_pex2_ctrl register offset (Marcel Ziswiler)
Panasonic UniPhier host bridge driver:
- Remove module code since driver can't be built as a module
(Masahiro Yamada)
Qualcomm host bridge driver:
- Add support for SDM845 PCIe controller (Bjorn Andersson)
TI Keystone host bridge driver:
- Fix "num-viewport" DT property error handling (Kishon Vijay Abraham I)
- Fix link training retries initiation (Yurii Monakov)
- Fix outbound region mapping (Yurii Monakov)
Misc:
- Add Switchtec Gen4 support (Kelvin Cao)
- Add Switchtec Intercomm Notify and Upstream Error Containment
support (Logan Gunthorpe)
- Use dma_set_mask_and_coherent() since Switchtec supports 64-bit
addressing (Wesley Sheng)"
* tag 'pci-v5.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (60 commits)
PCI: Allow adjust_bridge_window() to shrink resource if necessary
PCI: Set resource size directly in adjust_bridge_window()
PCI: Rename extend_bridge_window() to adjust_bridge_window()
PCI: Rename extend_bridge_window() parameter
PCI: Consider alignment of hot-added bridges when assigning resources
PCI: Remove local variable usage in pci_bus_distribute_available_resources()
PCI: Pass size + alignment to pci_bus_distribute_available_resources()
PCI: Rename variables
PCI: vmd: Add two VMD Device IDs
PCI: Remove unnecessary braces
PCI: brcmstb: Add MSI support
PCI: brcmstb: Add Broadcom STB PCIe host controller driver
x86/PCI: Remove X86_DEV_DMA_OPS
PCI: vmd: Remove dma_map_ops overrides
iommu/vt-d: Remove VMD child device sanity check
iommu/vt-d: Use pci_real_dma_dev() for mapping
PCI: Introduce pci_real_dma_dev()
x86/PCI: Expose VMD's pci_dev in struct pci_sysdata
x86/PCI: Add to_pci_sysdata() helper
PCI/AER: Initialize aer_fifo
...
|
|
Remove the sanity check required for VMD child devices. The new
pci_real_dma_dev() DMA alias mechanism places them in the same IOMMU group
as the VMD endpoint. Assignment of the group would require assigning the
VMD endpoint, where unbinding the VMD endpoint removes the child device
domain from the hierarchy.
Link: https://lore.kernel.org/r/1579613871-301529-6-git-send-email-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
|
|
The PCI device may have a DMA requester on another bus, such as VMD
subdevices needing to use the VMD endpoint. This case requires the real
DMA device for the IOMMU mapping, so use pci_real_dma_dev() to find that
device.
Link: https://lore.kernel.org/r/1579613871-301529-5-git-send-email-jonathan.derrick@intel.com
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
|
|
into next
|
|
The iommu default domain framework has been designed to take
care of setting identity default domain type. It's unnecessary
to handle this again in the VT-d driver. Hence, remove it.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Since commit ea2447f700cab ("intel-iommu: Prevent devices with
RMRRs from being placed into SI Domain"), the Intel IOMMU driver
doesn't allow any devices with RMRR locked to use the identity
domain. This was added to to fix the issue where the RMRR info
for devices being placed in and out of the identity domain gets
lost. This identity maps all RMRRs when setting up the identity
domain, so that devices with RMRRs could also use it.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
The VT-d spec specifies requirements for the RMRR entries base and
end (called 'Limit' in the docs) addresses.
This commit will cause the DMAR processing to mark the firmware as
tainted if any RMRR entries that do not meet these requirements.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
RMRR entries describe memory regions that are DMA targets for devices
outside the kernel's control.
RMRR entries that fail the sanity check are pointing to regions of
memory that the firmware did not tell the kernel are reserved or
otherwise should not be used.
Instead of aborting DMAR processing, this commit marks the firmware
as tainted. These RMRRs will still be identity mapped, otherwise,
some devices, e.x. graphic devices, will not work during boot.
Signed-off-by: Barret Rhoden <brho@google.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Fixes: f036c7fa0ab60 ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
It is possible for archdata.iommu to be set to
DEFER_DEVICE_DOMAIN_INFO or DUMMY_DEVICE_DOMAIN_INFO so check for
those values before calling __dmar_remove_one_dev_info. Without a
check it can result in a null pointer dereference. This has been seen
while booting a kdump kernel on an HP dl380 gen9.
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org # 5.3+
Cc: linux-kernel@vger.kernel.org
Fixes: ae23bfb68f28 ("iommu/vt-d: Detach domain before using a private one")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
If the device fails to be added to the group, make sure to unlink the
reference before returning.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Fixes: 39ab9555c2411 ("iommu: Add sysfs bindings for struct iommu_device")
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Starting with commit fa212a97f3a3 ("iommu/vt-d: Probe DMA-capable ACPI
name space devices"), we now probe DMA-capable ACPI name
space devices. On Dell XPS 13 9343, which has an Intel LPSS platform
device INTL9C60 enumerated via ACPI, this change leads to the following
warning:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 1 at pci_device_group+0x11a/0x130
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G T 5.5.0-rc3+ #22
Hardware name: Dell Inc. XPS 13 9343/0310JH, BIOS A20 06/06/2019
RIP: 0010:pci_device_group+0x11a/0x130
Code: f0 ff ff 48 85 c0 49 89 c4 75 c4 48 8d 74 24 10 48 89 ef e8 48 ef ff ff 48 85 c0 49 89 c4 75 af e8 db f7 ff ff 49 89 c4 eb a5 <0f> 0b 49 c7 c4 ea ff ff ff eb 9a e8 96 1e c7 ff 66 0f 1f 44 00 00
RSP: 0000:ffffc0d6c0043cb0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffffa3d1d43dd810 RCX: 0000000000000000
RDX: ffffa3d1d4fecf80 RSI: ffffa3d12943dcc0 RDI: ffffa3d1d43dd810
RBP: ffffa3d1d43dd810 R08: 0000000000000000 R09: ffffa3d1d4c04a80
R10: ffffa3d1d4c00880 R11: ffffa3d1d44ba000 R12: 0000000000000000
R13: ffffa3d1d4383b80 R14: ffffa3d1d4c090d0 R15: ffffa3d1d4324530
FS: 0000000000000000(0000) GS:ffffa3d1d6700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000460a001 CR4: 00000000003606e0
Call Trace:
? iommu_group_get_for_dev+0x81/0x1f0
? intel_iommu_add_device+0x61/0x170
? iommu_probe_device+0x43/0xd0
? intel_iommu_init+0x1fa2/0x2235
? pci_iommu_init+0x52/0xe7
? e820__memblock_setup+0x15c/0x15c
? do_one_initcall+0xcc/0x27e
? kernel_init_freeable+0x169/0x259
? rest_init+0x95/0x95
? kernel_init+0x5/0xeb
? ret_from_fork+0x35/0x40
---[ end trace 28473e7abc25b92c ]---
DMAR: ACPI name space devices didn't probe correctly
The bug results from the fact that while we now enumerate ACPI devices,
we aren't able to handle any non-PCI device when generating the device
group. Fix the issue by implementing an Intel-specific callback that
returns `pci_device_group` only if the device is a PCI device.
Otherwise, it will return a generic device group.
Fixes: fa212a97f3a3 ("iommu/vt-d: Probe DMA-capable ACPI name space devices")
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Cc: stable@vger.kernel.org # v5.3+
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Export page table internals of the domain attached to each device.
Example of such dump on a Skylake machine:
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
[ ... ]
Device 0000:00:14.0 with pasid 0 @0x15f3d9000
IOVA_PFN PML5E PML4E
0x000000008ced0 | 0x0000000000000000 0x000000015f3da003
0x000000008ced1 | 0x0000000000000000 0x000000015f3da003
0x000000008ced2 | 0x0000000000000000 0x000000015f3da003
0x000000008ced3 | 0x0000000000000000 0x000000015f3da003
0x000000008ced4 | 0x0000000000000000 0x000000015f3da003
0x000000008ced5 | 0x0000000000000000 0x000000015f3da003
0x000000008ced6 | 0x0000000000000000 0x000000015f3da003
0x000000008ced7 | 0x0000000000000000 0x000000015f3da003
0x000000008ced8 | 0x0000000000000000 0x000000015f3da003
0x000000008ced9 | 0x0000000000000000 0x000000015f3da003
PDPE PDE PTE
0x000000015f3db003 0x000000015f3dc003 0x000000008ced0003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced1003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced2003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced3003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced4003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced5003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced6003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced7003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced8003
0x000000015f3db003 0x000000015f3dc003 0x000000008ced9003
[ ... ]
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
After we make all map/unmap paths support first level page table.
Let's turn it on if hardware supports scalable mode.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|