Age | Commit message (Collapse) | Author | Files | Lines |
|
Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.
v6:
* also remove assignment in selftests/ in a later patch (Chris)
v5:
* remove assignment in later patch (Chris)
v3:
* rebased
v2:
* move gt/ and gvt/ changes into separate patches
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-2-tzimmermann@suse.de
|
|
If any of the perf tests run into 0 time, not only are we liable to
divide by zero, but the result would be highly questionable.
Nevertheless, let's not have a div-by-zero error.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-2-chris@chris-wilson.co.uk
|
|
In the near future, upstream will introduce a SZ_8G macro that is
slightly different to our own. Employ a temporary ifndef to avoid
compilation failure until we have backmerged.
References: 8b0fac44bd1f ("sizes.h: add SZ_8G/SZ_16G/SZ_32G macros")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104171511.32684-1-chris@chris-wilson.co.uk
|
|
Rather than having special case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock for submitting the request, simply call the tasklet directly.
This allows us to retain the direct submission path, including the CS
draining to allow fast/immediate submissions, without requiring any
duplicated code paths, and most importantly greatly simplifying the
control flow by removing reentrancy. This will enable us to close a few
races in the virtual engines in the next few patches.
The trickiest part here is to ensure that paired operations (such as
schedule_in/schedule_out) remain under consistent locking domains,
e.g. when pulled outside of the engine->active.lock
v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET
softirq synchronous").
v3: Update engine-reset to be tasklet aware
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
|
|
Pull the GT clock information [used to derive CS timestamps and PM
interval] under the GT so that is it local to the users. In doing so, we
consolidate the two references for the same information, of which the
runtime-info took note of a potential clock source override and scaling
factors.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-2-chris@chris-wilson.co.uk
|
|
We just need the context image from the logical state to force eviction
of many contexts, so simplify by avoiding the GEM context container.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223154509.14155-1-chris@chris-wilson.co.uk
|
|
Reduce the pollution of intel_engine.h by moving gen8_emit_pipe_control
and friends to gen8_engine_cs.h
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201216135452.6063-1-chris@chris-wilson.co.uk
|
|
The free_list and worker was introduced in commit 5f09a9c8ab6b ("drm/i915:
Allow contexts to be unreferenced locklessly"), but subsequently made
redundant by the removal of the last sleeping lock in commit 2935ed5339c4
("drm/i915: Remove logical HW ID"). As we can now free the GEM context
immediately from any context, remove the deferral of the free_list
v2: Lift removing the context from the global list into close().
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201215152138.8158-1-chris@chris-wilson.co.uk
|
|
When we fail to find a single block large enough to require splitting,
report the largest block we did find.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201207130346.11849-1-chris@chris-wilson.co.uk
|
|
There is a copy and paste bug in this code. It's supposed to check
"obj2" instead of checking "obj" a second time.
Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/8ilneOcJAjwqU4t@mwand
|
|
Mixing I915_ALLOC_CONTIGUOUS and I915_ALLOC_MAX_SEGMENT_SIZE fared
badly. The two directives conflict, with the contiguous request setting
the min_order to the full size of the object, and the max-segment-size
setting the max_order to the limit of the DMA mapper. This results in a
situation where max_order < min_order, causing our sanity checks to
fail.
Instead of limiting the buddy block size, in the previous patch we split
the oversized buddy into multiple scatterlist elements.
Fixes: d2cf0125d4a1 ("drm/i915/lmem: Limit block size to 4G")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-2-chris@chris-wilson.co.uk
|
|
Adhere to the i915_sg_max_segment() limit on the lengths of individual
scatterlist elements, and in doing so split up very large chunks of lmem
into manageable pieces for the dma-mapping backend.
Reported-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201202173444.14903-1-chris@chris-wilson.co.uk
|
|
Block sizes are only limited by the largest power-of-two that will fit
in the region size, but to construct an object we also require feeding
it into an sg list, where the upper limit of the sg entry is at most
UINT_MAX. Therefore to prevent issues with allocating blocks that are
too large, add the flag I915_ALLOC_MAX_SEGMENT_SIZE which should limit
block sizes to the i915_sg_segment_size().
v2: (matt)
- query the max segment.
- prefer flag to limit block size to 4G, since it's best not to assume
the user will feed the blocks into an sg list.
- simple selftest so we don't have to guess.
Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201130134721.54457-1-matthew.auld@intel.com
|
|
If intel context create failed, the perf_request_latency() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: 25c26f18ea79 ("drm/i915/selftests: Measure dispatch latency")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116143540.3648870-1-zhangxiaoxu5@huawei.com
|
|
If intel context create failed, the perf_series_engines() will return 0
rather than error, because we doesn't initialize the return value.
Fixes: cbfd3a0c5a55 ("drm/i915/selftests: Add request throughput measurement to perf")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201116144112.3673011-1-zhangxiaoxu5@huawei.com
|
|
Only the following drivers aren't converted:
- amdgpu, because of the driver_feature mangling due to virt support.
Subsequent patch will address this.
- nouveau, because DRIVER_ATOMIC uapi is still not the default on the
platforms where it's supported (i.e. again driver_feature mangling)
- vc4, again because of driver_feature mangling
- qxl, because the ioctl table is somewhere else and moving that is
maybe a bit too much, hence the num_ioctls assignment prevents a
const driver structure.
- arcpgu, because that is stuck behind a pending tiny-fication series
from me.
- legacy drivers, because legacy requires non-const drm_driver.
Note that for armada I also went ahead and made the ioctl array const.
Only cc'ing the driver people who've not been converted (everyone else
is way too much).
v2: Fix one misplaced const static, should be static const (0day)
v3:
- Improve commit message (Sam)
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: kernel test robot <lkp@intel.com>
Acked-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Eric Anholt <eric@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: nouveau@lists.freedesktop.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201104100425.1922351-5-daniel.vetter@ffwll.ch
|
|
Daniel needs -rc2 in drm-misc-next to merge some patches
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
|
|
We are incorrectly limiting the max allocation size as per the mm
max_order, which is effectively the largest power-of-two that we can fit
in the region size. However, it's normal to setup the region or
allocator with a non-power-of-two size(for example 3G), which we should
already handle correctly, except it seems for the early too-big-check.
v2: make sure we also exercise the I915_BO_ALLOC_CONTIGUOUS path, which
is quite different, since for that we are actually limited by the
largest power-of-two that we can fit within the region size. (Chris)
Fixes: b908be543e44 ("drm/i915: support creating LMEM objects")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: CQ Tang <cq.tang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021103606.241395-1-matthew.auld@intel.com
(cherry picked from commit 83ebef47f8ebe320d5c5673db82f9903a4f40a69)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
GEM object functions deprecate several similar callback interfaces in
struct drm_driver. This patch replaces the per-driver callbacks with
per-instance callbacks in i915.
v2:
* move object-function instance to i915_gem_object.c (Jani)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923102159.24084-7-tzimmermann@suse.de
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.10:
UAPI Changes:
Cross-subsystem Changes:
- virtio: Merged a PR for patches that will affect drm/virtio
Core Changes:
- dev: More devm_drm convertions and removal of drm_dev_init
- atomic: Split out drm_atomic_helper_calc_timestamping_constants of
drm_atomic_helper_update_legacy_modeset_state
- ttm: More rework
Driver Changes:
- i915: selftests improvements
- panfrost: support for Amlogic SoC
- vc4: one fix
- tree-wide: conversions to devm_drm_dev_alloc,
- ast: simplifications of the atomic modesetting code
- panfrost: multiple fixes
- vc4: multiple fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20200921152956.2gxnsdgxmwhvjyut@gilmour.lan
|
|
To avoid having to create all the device and driver scaffolding we
just manually create and destroy a devres_group.
v2: Rebased
v3: use devres_open/release_group so we can use devm without real
hacks in the driver core or having to create an entire fake bus for
testing drivers. Might want to extract this into helpers eventually,
maybe as a mock_drm_dev_alloc or test_drm_dev_alloc.
v4:
- Fix IS_ERR handling (Matt)
- Delete surplus put_device() in mock_device_release (intel-gfx-ci)
v5:
- do not switch to device_add - it breaks runtime pm in the tests and
with the devres_group_add/release no longer needed for automatic
cleanup (CI). Update commit message to match.
- print correct error in pr_err (Matt)
v6: Remove now unused err variable (CI).
v7: More warning fixes ...
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (v3)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200919134032.2488403-1-daniel.vetter@ffwll.ch
|
|
Just some prep work before we rework the lifetime handling, which
requires replacing all the drm_dev_put in selftests by something else.
v2: Don't go with a static inline, upsets the header tests and
separation.
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200918132505.2316382-2-daniel.vetter@ffwll.ch
|
|
Since we store a pointer to the fake iommu device that is allocated on
the stack, as soon as we leave the function it goes out of scope and any
future dereference is undefined behaviour. Just in case we may need to
look at the fake iommu device after initialiation, move the allocation
from the stack into the data.
Fixes: 01b9d4e21148 ("iommu/vt-d: Use dev_iommu_priv_get/set()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916105022.28316-2-chris@chris-wilson.co.uk
|
|
Make sure vma_lock is not used as inner lock when kernel context is used,
and add ww handling where appropriate.
Ensure that execbuf selftests keep passing by using ww handling.
Changes since v2:
- Fix i915_gem_context finally.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-22-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory
eviction. We don't use it yet, but lets start adding the definition
first.
To use it, we have to pass a non-NULL ww to gem_object_lock, and don't
unlock directly. It is done in i915_gem_ww_ctx_fini.
Changes since v1:
- Change ww_ctx and obj order in locking functions (Jonas Lahtinen)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-6-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
When igt_random_offset() is a given a range of [0, PAGE_SIZE], it is
allowed to return 0. However, attempting to use a size of 0 for the
igt_lmem_write_cpu() byte poking, leads to call igt_random_offset() with
a range of [offset, offset + 0] and ask it to find a length of 4 within
it. This triggers the bug on that the requested length should fit within
the range!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200806145728.16495-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
The GEM object is grossly overweight for the practicality of tracking
large numbers of individual pages, yet it is currently our only
abstraction for tracking DMA allocations. Since those allocations need
to be reserved upfront before an operation, and that we need to break
away from simple system memory, we need to ditch using plain struct page
wrappers.
In the process, we drop the WC mapping as we ended up clflushing
everything anyway due to various issues across a wider range of
platforms. Though in a future step, we need to drop the kmap_atomic
approach which suggests we need to pre-map all the pages and keep them
mapped.
v2: Verify our large scratch page is suitably DMA aligned; and manually
clear the scratch since we are allocating plain struct pages full of
prior content.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-2-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
We need to make the DMA allocations used for page directories to be
performed up front so that we can include those allocations in our
memory reservation pass. The downside is that we have to assume the
worst case, even before we know the final layout, and always allocate
enough page directories for this object, even when there will be overlap.
This unfortunately can be quite expensive, especially as we have to
clear/reset the page directories and DMA pages, but it should only be
required during early phases of a workload when new objects are being
discovered, or after memory/eviction pressure when we need to rebind.
Once we reach steady state, the objects should not be moved and we no
longer need to preallocating the pages tables.
It should be noted that the lifetime for the page directories DMA is
more or less decoupled from individual fences as they will be shared
across objects across timelines.
v2: Only allocate enough PD space for the PTE we may use, we do not need
to allocate PD that will be left as scratch.
v3: Store the shift unto the first PD level to encapsulate the different
PTE counts for gen6/gen8.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200729164219.5737-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Introduce a mechanism to extend execbuf2 (Lionel)
- Add syncobj timeline support (Lionel)
Driver Changes:
- Limit stolen mem usage on the compressed frame buffer (Ville)
- Some clean-up around display's cdclk (Ville)
- Some DDI changes for better DP link training according
to spec (Imre)
- Provide the perf pmu.module (Chris)
- Remove dobious Valleyview PCI IDs (Alexei)
- Add new display power saving feature for gen12+ called
HOBL (Jose)
- Move SKL's clock gating w/a to skl_init_clock_gating() (Ville)
- Rocket Lake display additions (Matt)
- Selftest: temporarily downgrade on severity of frequency
scaling tests (Chris)
- Introduce a new display workaround for fixing FLR related
issues on new PCH. (Jose)
- Temporarily disable FBC on TGL. It was the culprit of random
underruns. (Uma).
- Copy default modparams to mock i915_device (Chris)
- Add compiler paranoia for checking HWSP values (Chris)
- Remove useless gen check before calling intel_rps_boost (Chris)
- Fix a null pointer dereference (Chris)
- Add a couple of missing i915_active_fini() (Chris)
- Update TGL display power's bw_buddy table according to
update spec (Matt)
- Fix couple wrong return values (Tianjia)
- Selftest: Avoid passing random 0 into ilog2 (George)
- Many Tiger Lake display fixes and improvements for Type-C and
DP compliance (Imre, Jose)
- Start the addition of PSR2 selective fetch (Jose)
- Update a few DMC and HuC firmware versions (Jose)
- Add gen11+ w/a to fix underuns (Matt)
- Fix cmd parser desc matching with mask (Mika)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826232733.GA129053@intel.com
|
|
igt_mm_config() calls ilog2() on the (pseudo)random 21-bit number
s>>12. Once in 2 million seeds, this is zero and ilog2 summons
the nasal demons.
There was an attempt to handle this case with a max(), but that's
too late; ms could already be something bizarre.
Given that the low 12 bits of s and ms are always zero, it's a lot
simpler just to divide them by 4096, then everything fits into 32
bits, and we can easily generate a random number 1 <= s <= 0x1fffff.
Fixes: 14d1b9a6247c ("drm/i915: buddy allocator")
Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325192429.GA8865@SDF.ORG
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 21118e8e56479ef33460fbd63a5ad0535843b666)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Since we use the module parameters stored inside the drm_i915_device
itself, we need to ensure the mock i915_device also sets up the right
defaults.
Fixes: 8a25c4be583d ("drm/i915/params: switch to device specific parameters")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728150600.4509-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 98ef067453709444a264939940f7b3a5dfdfa09e)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
igt_mm_config() calls ilog2() on the (pseudo)random 21-bit number
s>>12. Once in 2 million seeds, this is zero and ilog2 summons
the nasal demons.
There was an attempt to handle this case with a max(), but that's
too late; ms could already be something bizarre.
Given that the low 12 bits of s and ms are always zero, it's a lot
simpler just to divide them by 4096, then everything fits into 32
bits, and we can easily generate a random number 1 <= s <= 0x1fffff.
Fixes: 14d1b9a6247c ("drm/i915: buddy allocator")
Signed-off-by: George Spelvin <lkml@sdf.org>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325192429.GA8865@SDF.ORG
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
In function i915_active_acquire_preallocate_barrier(), not all
paths have the return value set correctly, and in case of memory
allocation failure, a negative error code should be returned.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200802115655.25568-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Since we use the module parameters stored inside the drm_i915_device
itself, we need to ensure the mock i915_device also sets up the right
defaults.
Fixes: 8a25c4be583d ("drm/i915/params: switch to device specific parameters")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200728150600.4509-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
- Remove of the dev->archdata.iommu (or similar) pointers from most
architectures. Only Sparc is left, but this is private to Sparc as
their drivers don't use the IOMMU-API.
- ARM-SMMU updates from Will Deacon:
- Support for SMMU-500 implementation in Marvell Armada-AP806 SoC
- Support for SMMU-500 implementation in NVIDIA Tegra194 SoC
- DT compatible string updates
- Remove unused IOMMU_SYS_CACHE_ONLY flag
- Move ARM-SMMU drivers into their own subdirectory
- Intel VT-d updates from Lu Baolu:
- Misc tweaks and fixes for vSVA
- Report/response page request events
- Cleanups
- Move the Kconfig and Makefile bits for the AMD and Intel drivers into
their respective subdirectory.
- MT6779 IOMMU Support
- Support for new chipsets in the Renesas IOMMU driver
- Other misc cleanups and fixes (e.g. to improve compile test coverage)
* tag 'iommu-updates-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (77 commits)
iommu/amd: Move Kconfig and Makefile bits down into amd directory
iommu/vt-d: Move Kconfig and Makefile bits down into intel directory
iommu/arm-smmu: Move Arm SMMU drivers into their own subdirectory
iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu
iommu: Add gfp parameter to io_pgtable_ops->map()
iommu: Mark __iommu_map_sg() as static
iommu/vt-d: Rename intel-pasid.h to pasid.h
iommu/vt-d: Add page response ops support
iommu/vt-d: Report page request faults for guest SVA
iommu/vt-d: Add a helper to get svm and sdev for pasid
iommu/vt-d: Refactor device_to_iommu() helper
iommu/vt-d: Disable multiple GPASID-dev bind
iommu/vt-d: Warn on out-of-range invalidation address
iommu/vt-d: Fix devTLB flush for vSVA
iommu/vt-d: Handle non-page aligned address
iommu/vt-d: Fix PASID devTLB invalidation
iommu/vt-d: Remove global page support in devTLB flush
iommu/vt-d: Enforce PASID devTLB field mask
iommu: Make some functions static
iommu/amd: Remove double zero check
...
|
|
The error code needs to be set on this path. It currently returns
success.
Fixes: ed2690a9ca89 ("drm/i915/selftest: Check that GPR are restored across noa_wait")
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200714143652.GA337376@mwanda
|
|
There is an error condition where err is not being set and an uninitialized
garbage value in err is being returned. Fix this by assigning err to an
appropriate error return value before taking the error exit path.
Addresses-Coverity: ("Uninitialized scalar value")
Fixes: ed2690a9ca89 ("drm/i915/selftest: Check that GPR are restored across noa_wait")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713142551.423649-1-colin.king@canonical.com
|
|
Perf implements a GPU delay (noa_wait) by looping until the CS timestamp
has passed a certain point. This use MI_MATH and the general purpose
registers of the user's context, and since it is clobbering the user
state it must carefully save and restore the user's data around the
noa_wait. We can verify this by loading some values in the GPR that we
know will be clobbered by the noa_wait, and then inspecting the GPR after
the noa_wait completes and confirming that they have been restored.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200709224504.11345-2-chris@chris-wilson.co.uk
|
|
Since the engines belong to the GT, move the runtime-updated list of
available engines to the intel_gt struct. The original mask has been
renamed to indicate it contains the maximum engine list that can be
found on a matching device.
In preparation for other info being moved to the gt in follow up patches
(sseu), introduce an intel_gt_info structure to group all gt-related
runtime info.
v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
|
|
Reuse the ppgtt_bind_vma() for aliasing_ppgtt_bind_vma() so we can
reduce some code near-duplication. The catch is that we need to then
pass along the i915_address_space and not rely on vma->vm, as they
differ with the aliasing-ppgtt.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200703102519.26539-1-chris@chris-wilson.co.uk
|
|
Remove the use of dev->archdata.iommu and use the private per-device
pointer provided by IOMMU core code instead.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200625130836.1916-3-joro@8bytes.org
|
|
Catch up with upstream, in particular to get c1e8d7c6a7a6 ("mmap locking
API: convert mmap_sem comments").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Return the monotonic timestamp (ktime_get()) at the time of sampling the
busy-time. This is used in preference to taking ktime_get() separately
before or after the read seqlock as there can be some large variance in
reported timestamps. For selftests trying to ascertain that we are
reporting accurate to within a few microseconds, even a small delay
leads to the test failing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
|
|
A couple of very simple tests to ensure that the basic properties of
per-engine busyness accounting [0% and 100% busy] are faithful.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-1-chris@chris-wilson.co.uk
|
|
In commit 5ba32c7be81e ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.
However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.
Fixes: 5ba32c7be81e ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
(cherry picked from commit e36ba817fa966f81fb1c8d16f3721b5a644b2fa9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Reduce the smoke depth by trimming the number of contexts, repetitions
and wait times. This is in preparation for a less greedy scheduler that
tries to be fair across contexts, resulting in a great many more context
switches. A thousand context switches may be 50-100ms, causing us to
timeout as the HW is not fast enough to complete the deep smoketests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-5-chris@chris-wilson.co.uk
|
|
In commit 5ba32c7be81e ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.
However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.
Fixes: 5ba32c7be81e ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
|
|
This has no code changes, but the typo is clearly getting copy/pasted,
so better to avoid this now and fix the typo. IS_ENABLED() takes full
names, and must have the "CONFIG_" prefix.
Reported-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/lkml/b08611018fdb6d88757c6008a5c02fa0e07b32fb.camel@perches.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/202006050718.9D4FCFC2E@keescook
|
|
We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs an 3 bytes and pointer
chasing in using rq->engine->i915?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
|
|
Name the object classes and their offspring for easier lockdep
debugging.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-2-chris@chris-wilson.co.uk
|