Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 5016bdb796b3726eec043ca0ce3be981f712c756 ]
Normally, calling alloc_iova() using an iova_domain with insufficient
pfns remaining between start_pfn and dma_limit will fail and return a
NULL pointer. Unexpectedly, if such a "full" iova_domain contains an
iova with pfn_lo == 0, the alloc_iova() call will instead succeed and
return an iova containing invalid pfns.
This is caused by an underflow bug in __alloc_and_insert_iova_range()
that occurs after walking the "full" iova tree when the search ends
at the iova with pfn_lo == 0 and limit_pfn is then adjusted to be just
below that (-1). This (now huge) limit_pfn gives the impression that a
vast amount of space is available between it and start_pfn and thus
a new iova is allocated with the invalid pfn_hi value, 0xFFF.... .
To rememdy this, a check is introduced to ensure that adjustments to
limit_pfn will not underflow.
This issue has been observed in the wild, and is easily reproduced with
the following sample code.
struct iova_domain *iovad = kzalloc(sizeof(*iovad), GFP_KERNEL);
struct iova *rsvd_iova, *good_iova, *bad_iova;
unsigned long limit_pfn = 3;
unsigned long start_pfn = 1;
unsigned long va_size = 2;
init_iova_domain(iovad, SZ_4K, start_pfn, limit_pfn);
rsvd_iova = reserve_iova(iovad, 0, 0);
good_iova = alloc_iova(iovad, va_size, limit_pfn, true);
bad_iova = alloc_iova(iovad, va_size, limit_pfn, true);
Prior to the patch, this yielded:
*rsvd_iova == {0, 0} /* Expected */
*good_iova == {2, 3} /* Expected */
*bad_iova == {-2, -1} /* Oh no... */
After the patch, bad_iova is NULL as expected since inadequate
space remains between limit_pfn and start_pfn after allocating
good_iova.
Signed-off-by: Nate Watterson <nwatters@codeaurora.org>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Between acquiring the this_cpu_ptr() and using it, ideally we don't want
to be preempted and work on another CPU's private data. this_cpu_ptr()
checks whether or not preemption is disable, and get_cpu_ptr() provides
a convenient wrapper for operating on the cpu ptr inside a preemption
disabled critical section (which currently is provided by the
spinlock).
[ 167.997877] BUG: using smp_processor_id() in preemptible [00000000] code: usb-storage/216
[ 167.997940] caller is debug_smp_processor_id+0x17/0x20
[ 167.997945] CPU: 7 PID: 216 Comm: usb-storage Tainted: G U 4.7.0-rc1-gfxbench-RO_Patchwork_1057+ #1
[ 167.997948] Hardware name: Hewlett-Packard HP Pro 3500 Series/2ABF, BIOS 8.11 10/24/2012
[ 167.997951] 0000000000000000 ffff880118b7f9c8 ffffffff8140dca5 0000000000000007
[ 167.997958] ffffffff81a3a7e9 ffff880118b7f9f8 ffffffff8142a927 0000000000000000
[ 167.997965] ffff8800d499ed58 0000000000000001 00000000000fffff ffff880118b7fa08
[ 167.997971] Call Trace:
[ 167.997977] [<ffffffff8140dca5>] dump_stack+0x67/0x92
[ 167.997981] [<ffffffff8142a927>] check_preemption_disabled+0xd7/0xe0
[ 167.997985] [<ffffffff8142a947>] debug_smp_processor_id+0x17/0x20
[ 167.997990] [<ffffffff81507e17>] alloc_iova_fast+0xb7/0x210
[ 167.997994] [<ffffffff8150c55f>] intel_alloc_iova+0x7f/0xd0
[ 167.997998] [<ffffffff8151021d>] intel_map_sg+0xbd/0x240
[ 167.998002] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998009] [<ffffffff81596059>] usb_hcd_map_urb_for_dma+0x4b9/0x5a0
[ 167.998013] [<ffffffff81596d19>] usb_hcd_submit_urb+0xe9/0xaa0
[ 167.998017] [<ffffffff810cff2f>] ? mark_held_locks+0x6f/0xa0
[ 167.998022] [<ffffffff810d525c>] ? __raw_spin_lock_init+0x1c/0x50
[ 167.998025] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998028] [<ffffffff815988f3>] usb_submit_urb+0x3f3/0x5a0
[ 167.998032] [<ffffffff810d0082>] ? trace_hardirqs_on_caller+0x122/0x1b0
[ 167.998035] [<ffffffff81599ae7>] usb_sg_wait+0x67/0x150
[ 167.998039] [<ffffffff815dc202>] usb_stor_bulk_transfer_sglist.part.3+0x82/0xd0
[ 167.998042] [<ffffffff815dc29c>] usb_stor_bulk_srb+0x4c/0x60
[ 167.998045] [<ffffffff815dc42e>] usb_stor_Bulk_transport+0x17e/0x420
[ 167.998049] [<ffffffff815dcf32>] usb_stor_invoke_transport+0x242/0x540
[ 167.998052] [<ffffffff810e5efd>] ? debug_lockdep_rcu_enabled+0x1d/0x20
[ 167.998058] [<ffffffff815dba19>] usb_stor_transparent_scsi_command+0x9/0x10
[ 167.998061] [<ffffffff815de518>] usb_stor_control_thread+0x158/0x260
[ 167.998064] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998067] [<ffffffff815de3c0>] ? fill_inquiry_response+0x20/0x20
[ 167.998071] [<ffffffff8109ddfa>] kthread+0xea/0x100
[ 167.998078] [<ffffffff817ac6af>] ret_from_fork+0x1f/0x40
[ 167.998081] [<ffffffff8109dd10>] ? kthread_create_on_node+0x1f0/0x1f0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96293
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
Fixes: 9257b4a206fc ('iommu/iova: introduce per-cpu caching to iova allocation')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
IOVA allocation has two problems that impede high-throughput I/O.
First, it can do a linear search over the allocated IOVA ranges.
Second, the rbtree spinlock that serializes IOVA allocations becomes
contended.
Address these problems by creating an API for caching allocated IOVA
ranges, so that the IOVA allocator isn't accessed frequently. This
patch adds a per-CPU cache, from which CPUs can alloc/free IOVAs
without taking the rbtree spinlock. The per-CPU caches are backed by
a global cache, to avoid invoking the (linear-time) IOVA allocator
without needing to make the per-CPU cache size excessive. This design
is based on magazines, as described in "Magazines and Vmem: Extending
the Slab Allocator to Many CPUs and Arbitrary Resources" (currently
available at https://www.usenix.org/legacy/event/usenix01/bonwick.html)
Adding caching on top of the existing rbtree allocator maintains the
property that IOVAs are densely packed in the IO virtual address space,
which is important for keeping IOMMU page table usage low.
To keep the cache size reasonable, we bound the IOVA space a CPU can
cache by 32 MiB (we cache a bounded number of IOVA ranges, and only
ranges of size <= 128 KiB). The shared global cache is bounded at
4 MiB of IOVA space.
Signed-off-by: Omer Peleg <omer@cs.technion.ac.il>
[mad@cs.technion.ac.il: rebased, cleaned up and reworded the commit message]
Signed-off-by: Adam Morrison <mad@cs.technion.ac.il>
Reviewed-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ben Serebrin <serebrin@google.com>
[dwmw2: split out VT-d part into a separate patch]
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
The iova library has use outside the intel-iommu driver, thus make it a
module.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Use EXPORT_SYMBOL_GPL() to export the iova library symbols. The symbols
include:
init_iova_domain();
iova_cache_get();
iova_cache_put();
iova_cache_init();
alloc_iova();
find_iova();
__free_iova();
free_iova();
put_iova_domain();
reserve_iova();
copy_reserved_iova();
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
This is necessary to separate intel-iommu from the iova library.
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Currently, allocating a size-aligned IOVA region quietly adjusts the
actual allocation size in the process, returning a rounded-up
power-of-two-sized allocation. This results in mismatched behaviour in
the IOMMU driver if the original size was not a power of two, where the
original size is mapped, but the rounded-up IOVA size is unmapped.
Whilst some IOMMUs will happily unmap already-unmapped pages, others
consider this an error, so fix it by computing the necessary alignment
padding without altering the actual allocation size. Also clean up by
making pad_size unsigned, since its callers always pass unsigned values
and negative padding makes little sense here anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
|
|
Fixed checkpatch warnings for missing blank line after
declaration of struct.
Signed-off-by: Robert Callicotte <rcallicotte@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Systems may contain heterogeneous IOMMUs supporting differing minimum
page sizes, which may also not be common with the CPU page size.
Thus it is practical to have an explicit notion of IOVA granularity
to simplify handling of mapping and allocation constraints.
As an initial step, move the IOVA page granularity from an implicit
compile-time constant to a per-domain property so we can make use
of it in IOVA domain context at runtime. To keep the abstraction tidy,
extend the little API of inline iova_* helpers to parallel some of the
equivalent PAGE_* macros.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
To share the IOVA allocator with other architectures, it needs to
accommodate more general aperture restrictions; move the lower limit
from a compile-time constant to a runtime domain property to allow
IOVA domains with different requirements to co-exist.
Also reword the slightly unclear description of alloc_iova since we're
touching it anyway.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
In order to share the IOVA allocator with other architectures, break
the unnecssary dependency on the Intel IOMMU driver and move the
remaining IOVA internals to iova.c
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
If static identity domain is created, IOMMU driver needs to update
si_domain page table when memory hotplug event happens. Otherwise
PCI device DMA operations can't access the hot-added memory regions.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
|
|
Correct spelling typo in debug messages and comments
in drivers/iommu.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.
Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.
As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.
Compile-tested on x86_64.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
|