summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-08-04drm/i915/guc/slpc: Add methods to set min/max frequencyVinay Belgaumkar2-0/+91
Add param set h2g helpers to set the min and max frequencies for use by SLPC. v2: Address review comments (Michal W) v3: Check for positive error code (Michal W) v4: Print generic error in set_param (Michal W) Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-8-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Remove BUG_ON in guc_submission_disableVinay Belgaumkar1-4/+0
The assumption when it was added was that GT would not be holding any gt_pm references. However, uc_init is called from gt_init_hw, which holds a forcewake ref. If SLPC enable fails, we will still be holding this ref, which will result in the BUG_ON. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-7-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Enable SLPC and add related H2G eventsVinay Belgaumkar3-0/+215
Add methods for interacting with GuC for enabling SLPC. Enable SLPC after GuC submission has been established. GuC load will fail if SLPC cannot be successfully initialized. Add various helper methods to set/unset the parameters for SLPC. They can be set using H2G calls or directly setting bits in the shared data structure. v2: Address several review comments, add new helpers for decoding the SLPC min/max frequencies. Use masks instead of hardcoded constants. (Michal W) v3: Split global_state_to_string function, and check for positive non-zero return value from intel_guc_send() (Michal W) v4: Optimize the stringify function and other comments (Michal W) v5: Enable slpc as well before declaring GuC submission status (Michal W) v6: Checkpatch() Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-6-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Allocate, initialize and release SLPCVinay Belgaumkar3-1/+43
Allocate data structures for SLPC and functions for initializing on host side. v2: Address review comments (Michal W) v3: Remove unnecessary header includes (Michal W) v4: Rebase v5: Move allocation of shared data into slpc_init() (Michal W) Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-5-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Adding SLPC communication interfacesVinay Belgaumkar4-1/+245
Add constants and params that are needed to configure SLPC. v2: Add a new abi header for SLPC. Replace bitfields with genmasks. Address other comments from Michal W. v3: Add slpc H2G format in abi, other review commments (Michal W) v4: Update status bits according to latest spec v5: checkpatch() Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-4-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Gate Host RPS when SLPC is enabledVinay Belgaumkar2-1/+24
Also ensure uc_init is called before we initialize RPS so that we can check for SLPC support. We do not need to enable up/down interrupts when SLPC is enabled. However, we still need the ARAT interrupt, which will be enabled separately later. v2: Explicitly return from intel_rps_enable with slpc check (Matthew B) Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-3-vinay.belgaumkar@intel.com
2021-08-04drm/i915/guc/slpc: Initial definitions for SLPCVinay Belgaumkar8-2/+105
Add macros to check for SLPC support. This feature is currently supported for Gen12+ and enabled whenever GuC submission is enabled/selected. Include templates for SLPC init/fini and enable. v2: Move SLPC helper functions to intel_guc_slpc.c/.h. Define basic template for SLPC structure in intel_guc_slpc_types.h. Fix copyright (Michal W) v3: Review comments (Michal W) v4: Include supported/selected inside slpc struct (Michal W) Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Sundaresan Sujaritha <sujaritha.sundaresan@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-2-vinay.belgaumkar@intel.com
2021-07-31drm/i915/xehp: Fix missing sentinel on mcr_ranges_xehpLucas De Marchi1-0/+1
There's a missing sentinel since we are not using ARRAY_SIZE(), but rather checking that the .start is 0 to stop the iteration in mcr_range(). BUG: KASAN: global-out-of-bounds in mcr_range.isra.0+0x69/0xa0 [i915] Read of size 4 at addr ffffffffa0889928 by task modprobe/3881 Fixes: d8905ba705ab ("drm/i915/xehp: Define multicast register ranges") Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210730191115.2514581-1-lucas.demarchi@intel.com
2021-07-30drm/i915/selftests: prefer the create_user helperMatthew Auld1-30/+4
No need to hand roll the set_placements stuff, now that we have a helper for this. v2: add back the -ENODEV checking since it's possible for stolen to be probed, and yet still be non-functional Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210729094731.1953091-1-matthew.auld@intel.com
2021-07-29drm/i915/gt: remove GRAPHICS_VER == 10Lucas De Marchi9-65/+22
Replace all remaining handling of GRAPHICS_VER {==,>=} 10 with {==,>=} 11. With the removal of CNL, there is no platform with graphics version equals 10. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-5-lucas.demarchi@intel.com
2021-07-29drm/i915/gt: rename CNL references in intel_engine.hLucas De Marchi2-3/+3
With the removal of CNL, let's consider ICL as the first platform using that index. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-4-lucas.demarchi@intel.com
2021-07-29drm/i915/gt: remove explicit CNL handling from intel_sseu.cLucas De Marchi2-80/+1
CNL is the only platform with GRAPHICS_VER == 10. With its removal we don't need to handle that version anymore. Also we can now reduce the max number of slices: the call to intel_sseu_set_info() with the highest number of slices comes from SKL and BDW with 3 slices. Recent platforms actually increase the number of subslices so the number of slices remain 1. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-3-lucas.demarchi@intel.com
2021-07-29drm/i915/gt: remove explicit CNL handling from intel_mocs.cLucas De Marchi1-1/+1
Only one reference to CNL that is not needed, but code is the same for GEN9_BC, so leave the code around and just remove the special case for CNL. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-2-lucas.demarchi@intel.com
2021-07-28drm/i915: Extract i915_module.cDaniel Vetter4-114/+125
The module init code is somewhat misplaced in i915_pci.c, since it needs to pull in init/exit functions from every part of the driver and pollutes the include list a lot. Extract an i915_module.c file which pulls all the bits together, and allows us to massively trim the include list of i915_pci.c. The downside is that have to drop the error path check Jason added to catch when we set up the pci driver too early. I think that risk is acceptable for this pretty nice include. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-11-daniel.vetter@ffwll.ch
2021-07-28drm/i915: Remove i915_globalsDaniel Vetter5-82/+0
No longer used. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-10-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move vma slab to direct module init/exitDaniel Vetter5-22/+14
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_vmas to just a slab_vmas. We have to keep i915_drv.h include in i915_globals otherwise there's nothing anymore that pulls in GEM_BUG_ON. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-9-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move scheduler slabs to direct module init/exitDaniel Vetter5-28/+20
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_dependencies|priorities to just a slab_dependencies|priorities. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-8-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move request slabs to direct module init/exitDaniel Vetter4-30/+24
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_requests|execute_cbs to just a slab_requests|execute_cbs. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-7-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move gem_objects slab to direct module init/exitDaniel Vetter5-20/+12
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_objects to just a slab_objects. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-6-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move gem_context slab to direct module init/exitDaniel Vetter5-20/+13
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_luts to just a slab_luts. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-5-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move intel_context slab to direct module init/exitDaniel Vetter5-20/+13
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_ce to just a slab_ce. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-4-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move i915_buddy slab to direct module init/exitDaniel Vetter4-20/+12
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_blocks to just a slab_blocks. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-3-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move i915_active slab to direct module init/exitDaniel Vetter5-23/+16
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_cache to just a slab_cache. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-2-daniel.vetter@ffwll.ch
2021-07-28drm/i915: Check for nomodeset in i915_init() firstDaniel Vetter1-1/+1
When modesetting (aka the full pci driver, which has nothing to do with disable_display option, which just gives you the full pci driver without the display driver) is disabled, we load nothing and do nothing. So move that check first, for a bit of orderliness. With Jason's module init/exit table this now becomes trivial. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-1-daniel.vetter@ffwll.ch
2021-07-28drm/i915/xehpsdv: Correct parameters for IS_XEHPSDV_GT_STEP()Matt Roper1-2/+2
During a rebase the parameters were partially renamed, but not completely. Since the subsequent patches that start using this macro haven't landed on an upstream tree yet this didn't cause a build failure. Fixes: 086df54e20be ("drm/i915/xehpsdv: add initial XeHP SDV definitions") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-2-matthew.d.roper@intel.com
2021-07-28drm/i915/guc: Unblock GuC submission on Gen11+Daniele Ceraolo Spurio4-7/+18
Unblock GuC submission on Gen11+ platforms. v2: (Martin Peres / John H) - Delete debug message when GuC is disabled by default on certain platforms Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-34-matthew.brost@intel.com
2021-07-28drm/i915/guc: Implement GuC priority managementMatthew Brost10-5/+282
Implement a simple static mapping algorithm of the i915 priority levels (int, -1k to 1k exposed to user) to the 4 GuC levels. Mapping is as follows: i915 level < 0 -> GuC low level (3) i915 level == 0 -> GuC normal level (2) i915 level < INT_MAX -> GuC high level (1) i915 level == INT_MAX -> GuC highest level (0) We believe this mapping should cover the UMD use cases (3 distinct user levels + 1 kernel level). In addition to static mapping, a simple counter system is attached to each context tracking the number of requests inflight on the context at each level. This is needed as the GuC levels are per context while in the i915 levels are per request. v2: (Daniele) - Add BUILD_BUG_ON to enforce ordering of priority levels - Add missing lockdep to guc_prio_fini - Check for return before setting context registered flag - Map DISPLAY priority or higher to highest guc prio - Update comment for guc_prio Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-33-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Bump selftest timeouts for hangcheckJohn Harrison3-3/+3
Some testing environments and some heavier tests are slower than previous limits allowed for. For example, it can take multiple seconds for the 'context has been reset' notification handler to reach the 'kill the requests' code in the 'active' version of the 'reset engines' test. During which time the selftest gets bored, gives up waiting and fails the test. There is also an async thread that the selftest uses to pump work through the hardware in parallel to the context that is marked for reset. That also could get bored waiting for completions and kill the test off. Lastly, the flush at the of various test sections can also see timeouts due to the large amount of work backed up. This is also true of the live_hwsp_read test. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-32-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix hangcheck self test for GuC submissionJohn Harrison7-71/+236
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Conversely, one of the tests specifically sends hanging batches to the engines but wants them to sit around until a manual reset of the full GT (including GuC itself). That means disabling GuC based engine resets to prevent those from killing the hanging batch too soon. So, add support to the scheduling policy helper for disabling resets as well as making them quicker! In GuC submission mode, the 'is engine idle' test basically turns into 'is engine PM wakelock held'. Independently, there is a heartbeat disable helper function that the tests use. For unexplained reasons, this acquires the engine wakelock before disabling the heartbeat and only releases it when re-enabling the heartbeat. As one of the tests tries to do a wait for idle in the middle of a heartbeat disabled section, it is therefore guaranteed to always fail. Added a 'no_pm' variant of the heartbeat helper that allows the engine to be asleep while also having heartbeats disabled. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-31-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Increase some timeouts in live_requestsMatthew Brost1-2/+2
Requests may take slightly longer with GuC submission, let's increase the timeouts in live_requests. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-30-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix MOCS selftest for GuC submissionRahul Kumar Singh1-14/+35
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-29-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix workarounds selftest for GuC submissionRahul Kumar Singh6-34/+203
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-28-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Better error reporting from hangcheck selftestJohn Harrison1-18/+75
There are many ways in which the hangcheck selftest can fail. Very few of them actually printed an error message to say what happened. So, fill in the missing messages. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-27-matthew.brost@intel.com
2021-07-28drm/i915/guc: Support request cancellationMatthew Brost7-14/+251
This adds GuC backend support for i915_request_cancel(), which in turn makes CONFIG_DRM_I915_REQUEST_TIMEOUT work. This implementation makes use of fence while there are likely simplier options. A fence was chosen because of another feature coming soon which requires a user to block on a context until scheduling is disabled. In that case we return the fence to the user and the user can wait on that fence. v2: (Daniele) - A comment about locking the blocked incr / decr - A comments about the use of the fence - Update commit message explaining why fence - Delete redundant check blocked count in unblock function - Ring buffer implementation - Comment about blocked in submission path - Shorter rpm path v3: (Checkpatch) - Fix typos in commit message (Daniel) - Rework to simplier locking structure in guc_context_block / unblock Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
2021-07-28drm/i915/guc: Implement banned contexts for GuC submissionMatthew Brost8-37/+195
When using GuC submission, if a context gets banned disable scheduling and mark all inflight requests as complete. Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-25-matthew.brost@intel.com
2021-07-28drm/i915/guc: Add golden context to GuC ADSJohn Harrison7-30/+202
The media watchdog mechanism involves GuC doing a silent reset and continue of the hung context. This requires the i915 driver provide a golden context to GuC in the ADS. v2: (Matthew Brost): - Fix memory corruption in shmem_read (John H) - Use locals rather than defines for LR_* + SKIP_SIZE Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-24-matthew.brost@intel.com
2021-07-28drm/i915/guc: Include scheduling policies in the debugfs state dumpJohn Harrison3-0/+19
Added the scheduling policy parameters to the 'guc_info' debugfs state dump. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-23-matthew.brost@intel.com
2021-07-28drm/i915/guc: Connect reset modparam updates to GuC policy flagsJohn Harrison2-1/+33
Changing the reset module parameter has no effect on a running GuC. The corresponding entry in the ADS must be updated and then the GuC informed via a Host2GuC message. The new debugfs interface to module parameters allows this to happen. However, connecting the parameter data address back to anything useful is messy. One option would be to pass a new private data structure address through instead of just the parameter pointer. However, that means having a new (and different) data structure for each parameter and a new (and different) write function for each parameter. This method keeps everything generic by instead using a string lookup on the directory entry name. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-22-matthew.brost@intel.com
2021-07-28drm/i915/guc: Hook GuC scheduling policies upJohn Harrison3-5/+49
Use the official driver default scheduling policies for configuring the GuC scheduler rather than a bunch of hardcoded values. v2: (Matthew Brost) - Move I915_ENGINE_WANT_FORCED_PREEMPTION to later patch Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Jose Souza <jose.souza@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-21-matthew.brost@intel.com
2021-07-28drm/i915/guc: Fix for error capture after full GPU reset with GuCJohn Harrison9-47/+229
In the case of a full GPU reset (e.g. because GuC has died or because GuC's hang detection has been disabled), the driver can't rely on GuC reporting the guilty context. Instead, the driver needs to scan all active contexts and find one that is currently executing, as per the execlist mode behaviour. In GuC mode, this scan is different to execlist mode as the active request list is handled very differently. Similarly, the request state dump in debugfs needs to be handled differently when in GuC submission mode. Also refactured some of the request scanning code to avoid duplication across the multiple code paths that are now replicating it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com
2021-07-28drm/i915/guc: Capture error state on context resetMatthew Brost7-26/+91
We receive notification of an engine reset from GuC at its completion. Meaning GuC has potentially cleared any HW state we may have been interested in capturing. GuC resumes scheduling on the engine post-reset, as the resets are meant to be transparent, further muddling our error state. There is ongoing work to define an API for a GuC debug state dump. The suggestion for now is to manually disable FW initiated resets in cases where debug state is needed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-19-matthew.brost@intel.com
2021-07-28drm/i915/guc: Enable GuC engine resetJohn Harrison1-2/+1
Clear the 'disable resets' flag to allow GuC to reset hung contexts (detected via pre-emption timeout). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-18-matthew.brost@intel.com
2021-07-28drm/i915/guc: Don't complain about reset racesJohn Harrison3-1/+10
It is impossible to seal all race conditions of resets occurring concurrent to other operations. At least, not without introducing excesive mutex locking. Instead, don't complain if it occurs. In particular, don't complain if trying to send a H2G during a reset. Whatever the H2G was about should get redone once the reset is over. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-17-matthew.brost@intel.com
2021-07-28drm/i915/guc: Provide mmio list to be saved/restored on engine resetJohn Harrison5-26/+222
The driver must provide GuC with a list of mmio registers that should be saved/restored during a GuC-based engine reset. Unfortunately, the list must be dynamically allocated as its size is variable. That means the driver must generate the list twice - once to work out the size and a second time to actually save it. v2: (Alan / CI) - GEN7_GT_MODE -> GEN6_GT_MODE to fix WA selftest failure Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-16-matthew.brost@intel.com
2021-07-28drm/i915/guc: Enable the timer expired interrupt for GuCMatthew Brost1-0/+4
The GuC can implement execution qunatums, detect hung contexts and other such things but it requires the timer expired interrupt to do so. Signed-off-by: Matthew Brost <matthew.brost@intel.com> CC: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-15-matthew.brost@intel.com
2021-07-28drm/i915/guc: Handle engine reset failure notificationMatthew Brost3-0/+48
GuC will notify the driver, via G2H, if it fails to reset an engine. We recover by resorting to a full GPU reset. v2: (John Harrison): - s/drm_dbg/drm_err Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-14-matthew.brost@intel.com
2021-07-28drm/i915/guc: Handle context reset notificationMatthew Brost4-0/+51
GuC will issue a reset on detecting an engine hang and will notify the driver via a G2H message. The driver will service the notification by resetting the guilty context to a simple state or banning it completely. v2: (John Harrison) - Move msg[0] lookup after length check v3: (John Harrison) - s/drm_dbg/drm_err Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-13-matthew.brost@intel.com
2021-07-28drm/i915/guc: Suspend/resume implementation for new interfaceMatthew Brost5-51/+54
The new GuC interface introduces an MMIO H2G command, INTEL_GUC_ACTION_RESET_CLIENT, which is used to implement suspend. This MMIO tears down any active contexts generating a context reset G2H CTB for each. Once that step completes the GuC tears down the CTB channels. It is safe to suspend once this MMIO H2G command completes and all G2H CTBs have been processed. In practice the i915 will likely never receive a G2H as suspend should only be called after the GPU is idle. Resume is implemented in the same manner as before - simply reload the GuC firmware and reinitialize everything (e.g. CTB channels, contexts, etc..). v2: (Michel / John H) - INTEL_GUC_ACTION_RESET_CLIENT 0x5B01 -> 0x5507 Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-12-matthew.brost@intel.com
2021-07-28drm/i915/guc: Add disable interrupts to guc sanitizeMatthew Brost2-18/+19
Add disable GuC interrupts to intel_guc_sanitize(). Part of this requires moving the guc_*_interrupt wrapper function into header file intel_guc.h. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-11-matthew.brost@intel.com
2021-07-28drm/i915: Reset GPU immediately if submission is disabledMatthew Brost6-13/+79
If submission is disabled by the backend for any reason, reset the GPU immediately in the heartbeat code as the backend can't be reenabled until the GPU is reset. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-10-matthew.brost@intel.com