summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt
AgeCommit message (Collapse)AuthorFilesLines
2021-07-22drm/i915: Ditch i915 globals shrink infrastructureDaniel Vetter2-10/+0
This essentially reverts commit 84a1074920523430f9dc30ff907f4801b4820072 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Jan 24 11:36:08 2018 +0000 drm/i915: Shrink the GEM kmem_caches upon idling mm/vmscan.c:do_shrink_slab() is a thing, if there's an issue with it then we need to fix that there, not hand-roll our own slab shrinking code in i915. Also when this was added there was only one other caller of kmem_cache_shrink (added 2005 to the acpi code). Now there's a 2nd one outside of i915 code in a kunit test, which seems legit since that wants to very carefully control what's in the kmem_cache. This out of a total of over 500 calls to kmem_cache_create. This alone should have been warning sign enough that we're doing something silly. Noticed while reviewing a patch set from Jason to fix up some issues in our i915_init() and i915_exit() module load/cleanup code. Now that i915_globals.c isn't any different than normal init/exit functions, we should convert them over to one unified table and remove i915_globals.[hc] entirely. v2: Improve commit message (Jason) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: David Airlie <airlied@linux.ie> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721183229.4136488-1-daniel.vetter@ffwll.ch
2021-07-21drm/i915: Make GT workaround upper bounds exclusiveMatt Roper3-16/+16
Workarounds are documented in the bspec with an exclusive upper bound (i.e., a "fixed" stepping that no longer needs the workaround). This makes our driver's use of an inclusive upper bound for stepping ranges confusing; the differing notation between code and bspec makes it very easy for mistakes to creep in. Let's switch the upper bound of our IS_{GT,DISP}_STEP macros over to use an exclusive upper bound like the bspec does. This also has the benefit of helping make sure workarounds are properly handled for new minor steppings that show up (e.g., an A1 between the A0 and B0 we already knew about) --- if the new intermediate stepping pulls in hardware fixes early, there will be an update to the workaround definition which lets us know we need to change our code. If the new stepping does not pull a hardware fix earlier, then the new stepping will already be captured properly by the "[begin, fix)" range in the code. We'll probably need to be extra vigilant in code review of new workarounds for the near future to make sure developers notice the new semantics of workaround bounds. But we just migrated a bunch of our platforms from the IS_REVID bounds over to IS_{GT,DISP}_STEP, so people are already adjusting to the new macros and now is a good time to make this change too. [mattrope: Split out GT changes to apply through gt-next tree] Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717051426.4120328-8-matthew.d.roper@intel.com
2021-07-21drm/i915: Program DFR enable/disable as a GT workaroundMatt Roper1-0/+9
DFR programming (which we enable as an optimization on gen11, but must ensure is disabled on gen12) should be handled as a GT workaround rather than clock gating initialization. This will ensure that the programming of these registers is verified with our typical workaround checks. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717051426.4120328-4-matthew.d.roper@intel.com
2021-07-21drm/i915/icl: Drop a couple unnecessary workaroundsMatt Roper1-13/+1
While doing a quick sanity check of the ICL workarounds in the driver I noticed a few things that should be updated: * There's no mention in the bspec that WaPipelineFlushCoherentLines is needed on gen11 (both the current WA database and the old, deprecated page 20196 were checked); it appears this might have just been copied from the gen9 list? Even if this were needed, it doesn't seem like this was the correct implementation anyway since the gen9 workaround is supposed to be implemented in the indirect context bb (as we do in gen8_emit_flush_coherentl3_wa() on gen8/gen9). * WaForwardProgressSoftReset does not appear in the current workaround database. The old deprecated workaround list has a note indicating the workaround was dropped in 2017, so we should be safe to drop it from the code too. While we're at it, add the formal workaround ID number to WaDisableBankHangMode (our hardware team made a transition from text-based workaround names to ID numbers partway through the development of ICL, which is why some workarounds only have names, some only have numbers, and some have both). Bspec: 33450 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717051426.4120328-3-matthew.d.roper@intel.com
2021-07-21drm/i915: Fix application of WaInPlaceDecompressionHangMatt Roper1-18/+2
On SKL we've been applying this workaround on H0+ steppings, which is actually backwards; H0 is supposed to be the first stepping where the workaround is no longer needed. Flip the bounds so that the workaround applies to all steppings _before_ H0. On BXT we've been applying this workaround to all steppings, but the bspec tells us it's only needed until C0. Pre-C0 GT steppings only appeared in pre-production hardware, which we no longer support in the driver, so we can drop the workaround completely for this platform. On ICL we've been applying this workaround to all steppings, but there doesn't seem to be any indication that this workaround was ever needed for this platform (even now-deprecated page 20196 of the bspec doesn't mention it). We can go ahead and drop it. I also don't see any mention of this workaround being needed for KBL, although this may be an oversight since the workaround is needed for all steppings of CFL. I'll leave the workaround in place for KBL to be safe. Bspec: 14091, 33450 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717051426.4120328-2-matthew.d.roper@intel.com
2021-07-16drm/i915: Remove allow_alloc from i915_gem_object_get_sg*Jason Ekstrand1-1/+1
This reverts the rest of 0edbb9ba1bfe ("drm/i915: Move cmd parser pinning to execbuffer"). Now that the only user of i915_gem_object_get_sg without allow_alloc has been removed, we can drop the parameter. This portion of the revert was broken into its own patch to aid review. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-4-jason@jlekstrand.net
2021-07-15Merge branch 'topic/revid_steppings' into drm-intel-gt-nextMatt Roper2-102/+8
The switch from old old IS_FOO_REVID() macros to the new table-based IS_FOO_{GT,DISP}_STEP() macros is needed on both drm-intel-next (for display-based DMC matching) and drm-intel-gt-next (for workaround guards). To avoid conflicts, we'll apply the patches to a topic branch and merge it to both intel branches to ensure the transition to the new macros is clean. Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2021-07-15drm/i915/icl: Drop workarounds that only apply to pre-production steppingsMatt Roper1-39/+0
We're past the point at which we usually drop workarounds that were never needed on production hardware. The driver will already print an error and apply taint if loaded on pre-production hardware. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-13-matthew.d.roper@intel.com
2021-07-15drm/i915/cnl: Drop all workaroundsMatt Roper1-57/+0
All of the Cannon Lake hardware that came out had graphics fused off, and our userspace drivers have already dropped their support for the platform; CNL-specific code in i915 that isn't inherited by subsequent platforms is effectively dead code. Let's remove all of the CNL-specific workarounds as a quick and easy first step. References: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6899 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-12-matthew.d.roper@intel.com
2021-07-15drm/i915/dg1: Use revid->stepping tablesMatt Roper2-6/+7
Switch DG1 to use a revid->stepping table as we're trying to do on all platforms going forward. This removes the last use of IS_REVID() and REVID_FOREVER, so remove those now-unused macros as well to prevent their accidental use on future platforms. v2: - Use COMMON_STEP() macro in table. (Anusha) Bspec: 44463 Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-11-matthew.d.roper@intel.com
2021-07-15drm/i915/jsl_ehl: Use revid->stepping tablesMatt Roper1-1/+1
Switch JSL/EHL to use a revid->stepping table as we're trying to do on all platforms going forward. v2: - Use COMMON_STEP(). (Anusha) Bspec: 29153 Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-9-matthew.d.roper@intel.com
2021-07-15drm/i915/icl: Use revid->stepping tablesMatt Roper1-6/+6
Switch ICL to use a revid->stepping table as we're trying to do on all platforms going forward. While we're at it, let's include some additional steppings that have popped up, even if we don't yet have any workarounds tied to those steppings (we probably need to audit our workaround list soon to see if any of the bounds have moved or if new workarounds have appeared). Note that the current bspec table is missing information about how to map PCI revision ID to GT/display steppings; it only provides an SoC stepping. The mapping to GT/display steppings (which aren't always the same as the SoC stepping) used to be in the bspec, but was apparently dropped during an update in Nov 2019; I've made my changes here based on an older bspec snapshot that still had the necessary information. We've requested that the missing information be restored. I'm only including the production revids in the table here since we're past the point at which we usually stop trying to support pre-production hardware. An appropriate check is added to intel_detect_preproduction_hw() to print an error and taint the kernel just in case someone still tries to load the driver on old pre-production hardware. v2: - Drop pre-production steppings and add error/taint at startup when loading on pre-production hardware. Bspec: 21141 # pre-Nov 2019 snapshot Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-8-matthew.d.roper@intel.com
2021-07-15drm/i915/skl: Use revid->stepping tablesMatt Roper1-1/+1
Switch SKL to use a revid->stepping table as we're trying to do on all platforms going forward. Also drop the preproduction revisions and add the newer steppings we hadn't already handled. Note that SKL has a case where a newer revision ID corresponds to an older GT/disp stepping (0x9 -> STEP_J0, 0xA -> STEP_I1). Also, the lack of a revision ID 0x8 in the table is intentional and not an oversight. We'll re-write the KBL-specific comment to make it clear that these kind of quirks are expected. v2: - Since GT and display steppings are always identical on SKL use a macro to set both values at once in a more readable manner. (Anusha) - Drop preproduction steppings. Bspec: 13626 Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713193635.3390052-4-matthew.d.roper@intel.com
2021-07-14drm/i915/gtt: drop the page table optimisationMatthew Auld1-4/+1
We skip filling out the pt with scratch entries if the va range covers the entire pt, since we later have to fill it with the PTEs for the object pages anyway. However this might leave open a small window where the PTEs don't point to anything valid for the HW to consume. When for example using 2M GTT pages this fill_px() showed up as being quite significant in perf measurements, and ends up being completely wasted since we ignore the pt and just use the pde directly. Anyway, currently we have our PTE construction split between alloc and insert, which is probably slightly iffy nowadays, since the alloc doesn't actually allocate anything anymore, instead it just sets up the page directories and points the PTEs at the scratch page. Later when we do the insert step we re-program the PTEs again. Better might be to squash the alloc and insert into a single step, then bringing back this optimisation(along with some others) should be possible. Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> # v4.15+ Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com (cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-07-14drm/i915/gtt: drop the page table optimisationMatthew Auld1-4/+1
We skip filling out the pt with scratch entries if the va range covers the entire pt, since we later have to fill it with the PTEs for the object pages anyway. However this might leave open a small window where the PTEs don't point to anything valid for the HW to consume. When for example using 2M GTT pages this fill_px() showed up as being quite significant in perf measurements, and ends up being completely wasted since we ignore the pt and just use the pde directly. Anyway, currently we have our PTE construction split between alloc and insert, which is probably slightly iffy nowadays, since the alloc doesn't actually allocate anything anymore, instead it just sets up the page directories and points the PTEs at the scratch page. Later when we do the insert step we re-program the PTEs again. Better might be to squash the alloc and insert into a single step, then bringing back this optimisation(along with some others) should be possible. Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> # v4.15+ Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210713130431.2392740-1-matthew.auld@intel.com
2021-07-13drm/i915/guc: Module load failure test for CT buffer creationJohn Harrison1-0/+8
Add several module failure load inject points in the CT buffer creation code path. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-8-matthew.brost@intel.com
2021-07-13drm/i915/guc: Optimize CTB writes and readsMatthew Brost2-32/+67
CTB writes are now in the path of command submission and should be optimized for performance. Rather than reading CTB descriptor values (e.g. head, tail) which could result in accesses across the PCIe bus, store shadow local copies and only read/write the descriptor values when absolutely necessary. Also store the current space in the each channel locally. v2: (Michal) - Add additional sanity checks for head / tail pointers - Use GUC_CTB_HDR_LEN rather than magic 1 v3: (Michal / John H) - Drop redundant check of head value v4: (John H) - Drop redundant checks of tail / head values v5: (Michal) - Address more nits v6: (Michal) - Add GEM_BUG_ON sanity check on ctb->space Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-7-matthew.brost@intel.com
2021-07-13drm/i915/guc: Add stall timer to non blocking CTB send functionMatthew Brost2-7/+59
Implement a stall timer which fails H2G CTBs once a period of time with no forward progress is reached to prevent deadlock. v2: (Michal) - Improve error message in ct_deadlock() - Set broken when ct_deadlock() returns true - Return -EPIPE on ct_deadlock() v3: (Michal) - Add ms to stall timer comment (Matthew) - Move broken check to intel_guc_ct_send() Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-6-matthew.brost@intel.com
2021-07-13drm/i915/guc: Add non blocking CTB send functionMatthew Brost4-15/+91
Add non blocking CTB send function, intel_guc_send_nb. GuC submission will send CTBs in the critical path and does not need to wait for these CTBs to complete before moving on, hence the need for this new function. The non-blocking CTB now must have a flow control mechanism to ensure the buffer isn't overrun. A lazy spin wait is used as we believe the flow control condition should be rare with a properly sized buffer. The function, intel_guc_send_nb, is exported in this patch but unused. Several patches later in the series make use of this function. v2: (Michal) - Use define for H2G room calculations - Move INTEL_GUC_SEND_NB define (Daniel Vetter) - Use msleep_interruptible rather than cond_resched v3: (Michal) - Move includes to following patch - s/INTEL_GUC_SEND_NB/INTEL_GUC_CT_SEND_NB/g v4: (John H) - Update comment, add type local variable Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-5-matthew.brost@intel.com
2021-07-13drm/i915/guc: Increase size of CTB buffersMatthew Brost1-3/+8
With the introduction of non-blocking CTBs more than one CTB can be in flight at a time. Increasing the size of the CTBs should reduce how often software hits the case where no space is available in the CTB buffer. Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-4-matthew.brost@intel.com
2021-07-13drm/i915/guc: Improve error message for unsolicited CT responseMatthew Brost1-3/+7
Improve the error message when a unsolicited CT response is received by printing fence that couldn't be found, the last fence, and all requests with a response outstanding. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-3-matthew.brost@intel.com
2021-07-13drm/i915/guc: Relax CTB response timeoutMatthew Brost1-3/+7
In upcoming patch we will allow more CTB requests to be sent in parallel to the GuC for processing, so we shouldn't assume any more that GuC will always reply without 10ms. Use bigger value hardcoded value of 1s instead. v2: Add CONFIG_DRM_I915_GUC_CTB_TIMEOUT config option v3: (Daniel Vetter) - Use hardcoded value of 1s rather than config option v4: (Michal) - Use defines for timeout values Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708162055.129996-2-matthew.brost@intel.com
2021-07-13drm/i915/gt: Fix -EDEADLK handling regressionVille Syrjälä1-1/+1
The conversion to ww mutexes failed to address the fence code which already returns -EDEADLK when we run out of fences. Ww mutexes on the other hand treat -EDEADLK as an internal errno value indicating a need to restart the operation due to a deadlock. So now when the fence code returns -EDEADLK the higher level code erroneously restarts everything instead of returning the error to userspace as is expected. To remedy this let's switch the fence code to use a different errno value for this. -ENOBUFS seems like a semi-reasonable unique choice. Apart from igt the only user of this I could find is sna, and even there all we do is dump the current fence registers from debugfs into the X server log. So no user visible functionality is affected. If we really cared about preserving this we could of course convert back to -EDEADLK higher up, but doesn't seem like that's worth the hassle here. Not quite sure which commit specifically broke this, but I'll just attribute it to the general gem ww mutex work. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Testcase: igt/gem_pread/exhaustion Testcase: igt/gem_pwrite/basic-exhaustion Testcase: igt/gem_fenced_exec_thrash/too-many-fences Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630164413.25481-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit 78d2ad7eb4e1f0e9cd5d79788446b6092c21d3e0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-07-13drm/i915/adl_s: Extend Wa_1406941453José Roberto de Souza1-2/+3
BSpec: 54370 Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713003854.143197-4-jose.souza@intel.com
2021-07-13drm/i915: Implement Wa_1508744258José Roberto de Souza1-0/+7
Same bit was required for Wa_14012131227 in DG1 now it is also required as Wa_1508744258 to TGL, RKL, DG1, ADL-S and ADL-P. Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713003854.143197-3-jose.souza@intel.com
2021-07-13drm/i915: Settle on "adl-x" in WA commentsJosé Roberto de Souza1-1/+1
Most of the places are using this format so lets consolidate it. v2: - split patch in two: display and non-display because of conflicts between drm-intel-gt-next x drm-intel-next Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210713003854.143197-2-jose.souza@intel.com
2021-07-11drm/i915/gt: Fix -EDEADLK handling regressionVille Syrjälä1-1/+1
The conversion to ww mutexes failed to address the fence code which already returns -EDEADLK when we run out of fences. Ww mutexes on the other hand treat -EDEADLK as an internal errno value indicating a need to restart the operation due to a deadlock. So now when the fence code returns -EDEADLK the higher level code erroneously restarts everything instead of returning the error to userspace as is expected. To remedy this let's switch the fence code to use a different errno value for this. -ENOBUFS seems like a semi-reasonable unique choice. Apart from igt the only user of this I could find is sna, and even there all we do is dump the current fence registers from debugfs into the X server log. So no user visible functionality is affected. If we really cared about preserving this we could of course convert back to -EDEADLK higher up, but doesn't seem like that's worth the hassle here. Not quite sure which commit specifically broke this, but I'll just attribute it to the general gem ww mutex work. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Testcase: igt/gem_pread/exhaustion Testcase: igt/gem_pwrite/basic-exhaustion Testcase: igt/gem_fenced_exec_thrash/too-many-fences Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630164413.25481-1-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-07-08drm/i915/selftests: Take a VM in kernel_context()Jason Ekstrand2-11/+11
This better models where we want to go with contexts in general where things like the VM and engine set are create parameters instead of being set after the fact. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-28-jason@jlekstrand.net
2021-07-08drm/i915/gt: Drop i915_address_space::file (v2)Jason Ekstrand1-11/+0
There's a big comment saying how useful it is but no one is using this for anything anymore. It was added in 2bfa996e031b ("drm/i915: Store owning file on the i915_address_space") and used for debugfs at the time as well as telling the difference between the global GTT and a PPGTT. In f6e8aa387171 ("drm/i915: Report the number of closed vma held by each context in debugfs") we removed one use of it by switching to a context walk and comparing with the VM in the context. Finally, VM stats for debugfs were entirely nuked in db80a1294c23 ("drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects") v2 (Daniel Vetter): - Delete a struct drm_i915_file_private pre-declaration - Add a comment to the commit message about history Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-24-jason@jlekstrand.net
2021-07-08drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2)Jason Ekstrand2-24/+0
Even though FENCE_SUBMIT is only documented to wait until the request in the in-fence starts instead of waiting until it completes, it has a bit more magic than that. If FENCE_SUBMIT is used to submit something to a balanced engine, we would wait to assign engines until the primary request was ready to start and then attempt to assign it to a different engine than the primary. There is an IGT test (the bonded-slice subtest of gem_exec_balancer) which exercises this by submitting a primary batch to a specific VCS and then using FENCE_SUBMIT to submit a secondary which can run on any VCS and have i915 figure out which VCS to run it on such that they can run in parallel. However, this functionality has never been used in the real world. The media driver (the only user of FENCE_SUBMIT) always picks exactly two physical engines to bond and never asks us to pick which to use. v2 (Daniel Vetter): - Mention the exact IGT test this breaks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
2021-07-08drm/i915/gem: Disallow bonding of virtual engines (v3)Jason Ekstrand3-301/+2
This adds a bunch of complexity which the media driver has never actually used. The media driver does technically bond a balanced engine to another engine but the balanced engine only has one engine in the sibling set. This doesn't actually result in a virtual engine. This functionality was originally added to handle cases where we may have more than two video engines and media might want to load-balance their bonded submits by, for instance, submitting to a balanced vcs0-1 as the primary and then vcs2-3 as the secondary. However, no such hardware has shipped thus far and, if we ever want to enable such use-cases in the future, we'll use the up-and-coming parallel submit API which targets GuC submission. This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op. We leave the validation code in place in case we ever decide we want to do something interesting with the bonding information. v2 (Jason Ekstrand): - Don't delete quite as much code. v3 (Tvrtko Ursulin): - Add some history to the commit message Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-10-jason@jlekstrand.net
2021-07-08drm/i915: Drop the CONTEXT_CLONE API (v2)Jason Ekstrand2-31/+0
This API allows one context to grab bits out of another context upon creation. It can be used as a short-cut for setparam(getparam()) for things like I915_CONTEXT_PARAM_VM. However, it's never been used by any real userspace. It's used by a few IGT tests and that's it. Since it doesn't add any real value (most of the stuff you can CLONE you can copy in other ways), drop it. There is one thing that this API allows you to clone which you cannot clone via getparam/setparam: timelines. However, timelines are an implementation detail of i915 and not really something that needs to be exposed to userspace. Also, sharing timelines between contexts isn't obviously useful and supporting it has the potential to complicate i915 internally. It also doesn't add any functionality that the client can't get in other ways. If a client really wants a shared timeline, they can use a syncobj and set it as an in and out fence on every submit. v2 (Jason Ekstrand): - More detailed commit message Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-7-jason@jlekstrand.net
2021-07-08drm/i915/gem: Set the watchdog timeout directly in intel_context_set_gem (v2)Jason Ekstrand1-2/+1
Instead of handling it like a context param, unconditionally set it when intel_contexts are created. For years we've had the idea of a watchdog uAPI floating about. The aim was for media, so that they could set very tight deadlines for their transcodes jobs, so that if you have a corrupt bitstream (especially for decoding) you don't hang your desktop too hard. But it's been stuck in limbo since forever, and this simplifies things a bit in preparation for the proto-context work. If we decide to actually make said uAPI a reality, we can do it through the proto- context easily enough. This does mean that we move from reading the request_timeout_ms param once per engine when engines are created instead of once at context creation. If someone changes request_timeout_ms between creating a context and setting engines, it will mean that they get the new timeout. If someone races setting request_timeout_ms and context creation, they can theoretically end up with different timeouts. However, since both of these are fairly harmless and require changing kernel params, we don't care. v2 (Tvrtko Ursulin): - Add a comment about races with request_timeout_ms Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-5-jason@jlekstrand.net
2021-07-08drm/i915: Stop storing the ring size in the ring pointer (v3)Jason Ekstrand9-12/+11
Previously, we were storing the ring size in the ring pointer before it was actually allocated. We would then guard setting the ring size on checking for CONTEXT_ALLOC_BIT. This is error-prone at best and really only saves us a few bytes on something that already burns at least 4K. Instead, this patch adds a new ring_size field and makes everything use that. v2 (Daniel Vetter): - Replace 512 * SZ_4K with SZ_2M v2 (Jason Ekstrand): - Rebase on top of page migration code Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-3-jason@jlekstrand.net
2021-07-08drm/i915: Drop I915_CONTEXT_PARAM_RINGSIZEJason Ekstrand2-66/+0
This reverts commit 88be76cdafc7 ("drm/i915: Allow userspace to specify ringsize on construction"). This API was originally added for OpenCL but the compute-runtime PR has sat open for a year without action so we can still pull it out if we want. I argue we should drop it for three reasons: 1. If the compute-runtime PR has sat open for a year, this clearly isn't that important. 2. It's a very leaky API. Ring size is an implementation detail of the current execlist scheduler and really only makes sense there. It can't apply to the older ring-buffer scheduler on pre-execlist hardware because that's shared across all contexts and it won't apply to the GuC scheduler that's in the pipeline. 3. Having userspace set a ring size in bytes is a bad solution to the problem of having too small a ring. There is no way that userspace has the information to know how to properly set the ring size so it's just going to detect the feature and always set it to the maximum of 512K. This is what the compute-runtime PR does. The scheduler in i915, on the other hand, does have the information to make an informed choice. It could detect if the ring size is a problem and grow it itself. Or, if that's too hard, we could just increase the default size from 16K to 32K or even 64K instead of relying on userspace to do it. Let's drop this API for now and, if someone decides they really care about solving this problem, they can do it properly. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-2-jason@jlekstrand.net
2021-07-08drm/i915/adlp: Add ADL-P GuC/HuC firmware filesJohn Harrison1-0/+1
Add ADL-P to the list of supported GuC and HuC firmware versions. For HuC, it reuses the existing TGL firmware file. For GuC, there is a dedicated firmware release. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210626004522.1699509-3-John.C.Harrison@Intel.com
2021-07-08drm/i915/huc: Update TGL and friends to HuC 7.9.3John Harrison1-3/+3
A new HuC is available for TGL and compatible platforms, so switch to using it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210626004522.1699509-2-John.C.Harrison@Intel.com
2021-07-08drm/i915/gt: finish INTEL_GEN and friends conversionLucas De Marchi1-10/+10
Commit c816723b6b8a ("drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER") converted INTEL_GEN and friends to the new version check macros. Meanwhile, some changes sneaked in to use INTEL_GEN. Remove the last users so we can remove the macros. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210707181325.2130821-2-lucas.demarchi@intel.com
2021-07-06drm/i915: Use the correct IRQ during resumeThomas Zimmermann2-3/+6
The code in xcs_resume() probably didn't work as intended. It uses struct drm_device.irq, which is allocated to 0, but never initialized by i915 to the device's interrupt number. Change all calls to synchronize_hardirq() to intel_synchronize_irq(), which uses the correct interrupt. _hardirq() functions are not needed in this context. v5: * go back to _hardirq() after PCI probe reported wrong context; add rsp comment v4: * switch everything to intel_synchronize_irq() (Daniel) v3: * also use intel_synchronize_hardirq() at another callsite v2: * wrap irq code in intel_synchronize_hardirq() (Ville) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210701173618.10718-2-tzimmermann@suse.de (cherry picked from commit 27e4b467d94e216b365da388358c9407af818662) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2021-07-02drm/i915: Use the correct IRQ during resumeThomas Zimmermann2-3/+6
The code in xcs_resume() probably didn't work as intended. It uses struct drm_device.irq, which is allocated to 0, but never initialized by i915 to the device's interrupt number. Change all calls to synchronize_hardirq() to intel_synchronize_irq(), which uses the correct interrupt. _hardirq() functions are not needed in this context. v5: * go back to _hardirq() after PCI probe reported wrong context; add rsp comment v4: * switch everything to intel_synchronize_irq() (Daniel) v3: * also use intel_synchronize_hardirq() at another callsite v2: * wrap irq code in intel_synchronize_hardirq() (Ville) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210701173618.10718-2-tzimmermann@suse.de
2021-06-30drm/i915/gtt: ignore min_page_size for paging structuresMatthew Auld1-1/+13
The min_page_size is only needed for pages inserted into the GTT, and for our paging structures we only need at most 4K bytes, so simply ignore the min_page_size restrictions here, otherwise we might see some severe overallocation on some devices. v2(Thomas): add some commentary Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210625103824.558481-2-matthew.auld@intel.com
2021-06-25drm/i915/selftest: Extend ctx_timestamp ICL workaround to GEN11Tejas Upadhyay1-2/+2
EHL and JSL are also observing requirement for 80ns interval for CTX_TIMESTAMP thus extending it to GEN11. Changes since V1: - IS_GEN replaced by GRAPHICS_VER - Tvrtko Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210624112250.895410-1-tejaskumarx.surendrakumar.upadhyay@intel.com
2021-06-19drm/i915/guc: Update firmware to v62.0.0Michal Wajdeczko11-421/+527
Most of the changes to the 62.0.0 firmware revolved around CTB communication channel. Conform to the new (stable) CTB protocol. v2: (Michal) Add values back to kernel DOC for actions (Docs) Add 'CT buffer' back in to fix warning Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> [mattrope: Tweaked kerneldoc while pushing as suggested by Daniele/Michal] Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-3-matthew.brost@intel.com
2021-06-19drm/i915/guc: Introduce unified HXG messagesMichal Wajdeczko1-0/+213
New GuC firmware will unify format of MMIO and CTB H2G messages. Introduce their definitions now to allow gradual transition of our code to match new changes. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210616001302.84233-2-matthew.brost@intel.com
2021-06-19drm/i915: Move submission tasklet to i915_sched_engineMatthew Brost10-90/+75
The submission tasklet operates on i915_sched_engine, thus it is the correct place for it. v3: (Jason Ekstrand) Change sched_engine->engine to a void* private data pointer Add kernel doc v4: (Daniele) Update private_data comment Set queue_priority_hint in kick_execlists v5: (CI) Rebase and fix build error Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com
2021-06-19drm/i915: Update i915_scheduler to operate on i915_sched_engineMatthew Brost2-5/+8
Rather passing around an intel_engine_cs in the scheduling code, pass around a i915_sched_engine. v3: (Jason Ekstrand) Add READ_ONCE around rq->engine in lock_sched_engine Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-8-matthew.brost@intel.com
2021-06-19drm/i915: Add kick_backend function to i915_sched_engineMatthew Brost1-0/+52
Not all back-ends require a kick after a scheduling update, so make the kick a call-back function that the back-end can opt-in to. Also move the current kick function from the scheduler to the execlists file as it is specific to that back-end. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-7-matthew.brost@intel.com
2021-06-19drm/i915: Move engine->schedule to i915_sched_engineMatthew Brost8-27/+16
The schedule function should be in the schedule object. v3: (Jason Ekstrand) Add kernel doc Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
2021-06-19drm/i915: Move active tracking to i915_sched_engineMatthew Brost8-110/+82
Move active request tracking and its lock to i915_sched_engine. This lock is also the submission lock so having it in the i915_sched_engine is the correct place. v3: (Jason Ekstrand) Add kernel doc v6: Rebase Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.comk> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
2021-06-19drm/i915: Reset sched_engine.no_priolist immediately after dequeueMatthew Brost3-2/+3
Rather than touching schedule state in the generic PM code, reset the priolist allocation when empty in the submission code. Add a wrapper function to do this and update the backends to call it in the correct place. v3: (Jason Ekstrand) Update patch commit message with a better description Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-4-matthew.brost@intel.com