summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915
AgeCommit message (Collapse)AuthorFilesLines
2021-07-28drm/i915: move request slabs to direct module init/exitDaniel Vetter4-30/+24
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_requests|execute_cbs to just a slab_requests|execute_cbs. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-7-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move gem_objects slab to direct module init/exitDaniel Vetter5-20/+12
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_objects to just a slab_objects. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-6-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move gem_context slab to direct module init/exitDaniel Vetter5-20/+13
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_luts to just a slab_luts. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-5-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move intel_context slab to direct module init/exitDaniel Vetter5-20/+13
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_ce to just a slab_ce. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-4-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move i915_buddy slab to direct module init/exitDaniel Vetter4-20/+12
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_blocks to just a slab_blocks. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-3-daniel.vetter@ffwll.ch
2021-07-28drm/i915: move i915_active slab to direct module init/exitDaniel Vetter5-23/+16
With the global kmem_cache shrink infrastructure gone there's nothing special and we can convert them over. I'm doing this split up into each patch because there's quite a bit of noise with removing the static global.slab_cache to just a slab_cache. v2: Make slab static (Jason, 0day) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-2-daniel.vetter@ffwll.ch
2021-07-28drm/i915: Check for nomodeset in i915_init() firstDaniel Vetter1-1/+1
When modesetting (aka the full pci driver, which has nothing to do with disable_display option, which just gives you the full pci driver without the display driver) is disabled, we load nothing and do nothing. So move that check first, for a bit of orderliness. With Jason's module init/exit table this now becomes trivial. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727121037.2041102-1-daniel.vetter@ffwll.ch
2021-07-28drm/i915/xehpsdv: Correct parameters for IS_XEHPSDV_GT_STEP()Matt Roper1-2/+2
During a rebase the parameters were partially renamed, but not completely. Since the subsequent patches that start using this macro haven't landed on an upstream tree yet this didn't cause a build failure. Fixes: 086df54e20be ("drm/i915/xehpsdv: add initial XeHP SDV definitions") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Caz Yokoyama <caz.yokoyama@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-2-matthew.d.roper@intel.com
2021-07-28drm/i915/adlp: Add workaround to disable CMTG clock gatingImre Deak2-0/+21
The driver doesn't depend atm on the common mode timing generator functionality (it would be used for some power saving feature and panel timing synchronization), however DMC will corrupt the CMTG registers across DC5 entry/exit sequences unless the CMTG clock gating is disabled. This in turn can lead to at least the DPLL0/1 configuration getting stuck at their last state, which means we can't reprogram them to a new config. Add the corresponding Bspec workaround to prevent the above. v2: Fix checkpatch errors. (CI, Jose) Cc: Uma Shankar <uma.shankar@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727134400.101290-1-imre.deak@intel.com
2021-07-28drm/i915/adl_p: Allow underrun recovery when possibleMatt Roper2-17/+36
ADL_P requires that we disable underrun recovery when downscaling (or using the scaler for YUV420 pipe output), using DSC, or using PSR2. Otherwise we should be able to enable the underrun recovery. On DG2 we need to keep underrun recovery disabled at all times, but the chicken bit in PIPE_CHICKEN has an inverted meaning (it's an enable bit instead of disable). v2: - Reverse the condition (clear the disable bit when supported, set disable bit when not supported). Bspec: 50351 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727145056.2049720-1-matthew.d.roper@intel.com
2021-07-28drm/i915/guc: Unblock GuC submission on Gen11+Daniele Ceraolo Spurio4-7/+18
Unblock GuC submission on Gen11+ platforms. v2: (Martin Peres / John H) - Delete debug message when GuC is disabled by default on certain platforms Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-34-matthew.brost@intel.com
2021-07-28drm/i915/guc: Implement GuC priority managementMatthew Brost9-5/+273
Implement a simple static mapping algorithm of the i915 priority levels (int, -1k to 1k exposed to user) to the 4 GuC levels. Mapping is as follows: i915 level < 0 -> GuC low level (3) i915 level == 0 -> GuC normal level (2) i915 level < INT_MAX -> GuC high level (1) i915 level == INT_MAX -> GuC highest level (0) We believe this mapping should cover the UMD use cases (3 distinct user levels + 1 kernel level). In addition to static mapping, a simple counter system is attached to each context tracking the number of requests inflight on the context at each level. This is needed as the GuC levels are per context while in the i915 levels are per request. v2: (Daniele) - Add BUILD_BUG_ON to enforce ordering of priority levels - Add missing lockdep to guc_prio_fini - Check for return before setting context registered flag - Map DISPLAY priority or higher to highest guc prio - Update comment for guc_prio Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-33-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Bump selftest timeouts for hangcheckJohn Harrison3-3/+3
Some testing environments and some heavier tests are slower than previous limits allowed for. For example, it can take multiple seconds for the 'context has been reset' notification handler to reach the 'kill the requests' code in the 'active' version of the 'reset engines' test. During which time the selftest gets bored, gives up waiting and fails the test. There is also an async thread that the selftest uses to pump work through the hardware in parallel to the context that is marked for reset. That also could get bored waiting for completions and kill the test off. Lastly, the flush at the of various test sections can also see timeouts due to the large amount of work backed up. This is also true of the live_hwsp_read test. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-32-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix hangcheck self test for GuC submissionJohn Harrison7-71/+236
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Conversely, one of the tests specifically sends hanging batches to the engines but wants them to sit around until a manual reset of the full GT (including GuC itself). That means disabling GuC based engine resets to prevent those from killing the hanging batch too soon. So, add support to the scheduling policy helper for disabling resets as well as making them quicker! In GuC submission mode, the 'is engine idle' test basically turns into 'is engine PM wakelock held'. Independently, there is a heartbeat disable helper function that the tests use. For unexplained reasons, this acquires the engine wakelock before disabling the heartbeat and only releases it when re-enabling the heartbeat. As one of the tests tries to do a wait for idle in the middle of a heartbeat disabled section, it is therefore guaranteed to always fail. Added a 'no_pm' variant of the heartbeat helper that allows the engine to be asleep while also having heartbeats disabled. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-31-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Increase some timeouts in live_requestsMatthew Brost1-2/+2
Requests may take slightly longer with GuC submission, let's increase the timeouts in live_requests. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-30-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix MOCS selftest for GuC submissionRahul Kumar Singh1-14/+35
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-29-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Fix workarounds selftest for GuC submissionRahul Kumar Singh6-34/+203
When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-28-matthew.brost@intel.com
2021-07-28drm/i915/selftest: Better error reporting from hangcheck selftestJohn Harrison1-18/+75
There are many ways in which the hangcheck selftest can fail. Very few of them actually printed an error message to say what happened. So, fill in the missing messages. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-27-matthew.brost@intel.com
2021-07-28drm/i915/guc: Support request cancellationMatthew Brost7-14/+251
This adds GuC backend support for i915_request_cancel(), which in turn makes CONFIG_DRM_I915_REQUEST_TIMEOUT work. This implementation makes use of fence while there are likely simplier options. A fence was chosen because of another feature coming soon which requires a user to block on a context until scheduling is disabled. In that case we return the fence to the user and the user can wait on that fence. v2: (Daniele) - A comment about locking the blocked incr / decr - A comments about the use of the fence - Update commit message explaining why fence - Delete redundant check blocked count in unblock function - Ring buffer implementation - Comment about blocked in submission path - Shorter rpm path v3: (Checkpatch) - Fix typos in commit message (Daniel) - Rework to simplier locking structure in guc_context_block / unblock Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
2021-07-28drm/i915/guc: Implement banned contexts for GuC submissionMatthew Brost8-37/+195
When using GuC submission, if a context gets banned disable scheduling and mark all inflight requests as complete. Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-25-matthew.brost@intel.com
2021-07-28drm/i915/guc: Add golden context to GuC ADSJohn Harrison7-30/+202
The media watchdog mechanism involves GuC doing a silent reset and continue of the hung context. This requires the i915 driver provide a golden context to GuC in the ADS. v2: (Matthew Brost): - Fix memory corruption in shmem_read (John H) - Use locals rather than defines for LR_* + SKIP_SIZE Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-24-matthew.brost@intel.com
2021-07-28drm/i915/guc: Include scheduling policies in the debugfs state dumpJohn Harrison3-0/+19
Added the scheduling policy parameters to the 'guc_info' debugfs state dump. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-23-matthew.brost@intel.com
2021-07-28drm/i915/guc: Connect reset modparam updates to GuC policy flagsJohn Harrison2-1/+33
Changing the reset module parameter has no effect on a running GuC. The corresponding entry in the ADS must be updated and then the GuC informed via a Host2GuC message. The new debugfs interface to module parameters allows this to happen. However, connecting the parameter data address back to anything useful is messy. One option would be to pass a new private data structure address through instead of just the parameter pointer. However, that means having a new (and different) data structure for each parameter and a new (and different) write function for each parameter. This method keeps everything generic by instead using a string lookup on the directory entry name. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-22-matthew.brost@intel.com
2021-07-28drm/i915/guc: Hook GuC scheduling policies upJohn Harrison3-5/+49
Use the official driver default scheduling policies for configuring the GuC scheduler rather than a bunch of hardcoded values. v2: (Matthew Brost) - Move I915_ENGINE_WANT_FORCED_PREEMPTION to later patch Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Jose Souza <jose.souza@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-21-matthew.brost@intel.com
2021-07-28drm/i915/guc: Fix for error capture after full GPU reset with GuCJohn Harrison9-47/+229
In the case of a full GPU reset (e.g. because GuC has died or because GuC's hang detection has been disabled), the driver can't rely on GuC reporting the guilty context. Instead, the driver needs to scan all active contexts and find one that is currently executing, as per the execlist mode behaviour. In GuC mode, this scan is different to execlist mode as the active request list is handled very differently. Similarly, the request state dump in debugfs needs to be handled differently when in GuC submission mode. Also refactured some of the request scanning code to avoid duplication across the multiple code paths that are now replicating it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com
2021-07-28drm/i915/guc: Capture error state on context resetMatthew Brost7-26/+91
We receive notification of an engine reset from GuC at its completion. Meaning GuC has potentially cleared any HW state we may have been interested in capturing. GuC resumes scheduling on the engine post-reset, as the resets are meant to be transparent, further muddling our error state. There is ongoing work to define an API for a GuC debug state dump. The suggestion for now is to manually disable FW initiated resets in cases where debug state is needed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-19-matthew.brost@intel.com
2021-07-28drm/i915/guc: Enable GuC engine resetJohn Harrison1-2/+1
Clear the 'disable resets' flag to allow GuC to reset hung contexts (detected via pre-emption timeout). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-18-matthew.brost@intel.com
2021-07-28drm/i915/guc: Don't complain about reset racesJohn Harrison3-1/+10
It is impossible to seal all race conditions of resets occurring concurrent to other operations. At least, not without introducing excesive mutex locking. Instead, don't complain if it occurs. In particular, don't complain if trying to send a H2G during a reset. Whatever the H2G was about should get redone once the reset is over. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-17-matthew.brost@intel.com
2021-07-28drm/i915/guc: Provide mmio list to be saved/restored on engine resetJohn Harrison5-26/+222
The driver must provide GuC with a list of mmio registers that should be saved/restored during a GuC-based engine reset. Unfortunately, the list must be dynamically allocated as its size is variable. That means the driver must generate the list twice - once to work out the size and a second time to actually save it. v2: (Alan / CI) - GEN7_GT_MODE -> GEN6_GT_MODE to fix WA selftest failure Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-16-matthew.brost@intel.com
2021-07-28drm/i915/guc: Enable the timer expired interrupt for GuCMatthew Brost1-0/+4
The GuC can implement execution qunatums, detect hung contexts and other such things but it requires the timer expired interrupt to do so. Signed-off-by: Matthew Brost <matthew.brost@intel.com> CC: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-15-matthew.brost@intel.com
2021-07-28drm/i915/guc: Handle engine reset failure notificationMatthew Brost3-0/+48
GuC will notify the driver, via G2H, if it fails to reset an engine. We recover by resorting to a full GPU reset. v2: (John Harrison): - s/drm_dbg/drm_err Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-14-matthew.brost@intel.com
2021-07-28drm/i915/guc: Handle context reset notificationMatthew Brost4-0/+51
GuC will issue a reset on detecting an engine hang and will notify the driver via a G2H message. The driver will service the notification by resetting the guilty context to a simple state or banning it completely. v2: (John Harrison) - Move msg[0] lookup after length check v3: (John Harrison) - s/drm_dbg/drm_err Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-13-matthew.brost@intel.com
2021-07-28drm/i915/guc: Suspend/resume implementation for new interfaceMatthew Brost5-51/+54
The new GuC interface introduces an MMIO H2G command, INTEL_GUC_ACTION_RESET_CLIENT, which is used to implement suspend. This MMIO tears down any active contexts generating a context reset G2H CTB for each. Once that step completes the GuC tears down the CTB channels. It is safe to suspend once this MMIO H2G command completes and all G2H CTBs have been processed. In practice the i915 will likely never receive a G2H as suspend should only be called after the GPU is idle. Resume is implemented in the same manner as before - simply reload the GuC firmware and reinitialize everything (e.g. CTB channels, contexts, etc..). v2: (Michel / John H) - INTEL_GUC_ACTION_RESET_CLIENT 0x5B01 -> 0x5507 Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-12-matthew.brost@intel.com
2021-07-28drm/i915/guc: Add disable interrupts to guc sanitizeMatthew Brost2-18/+19
Add disable GuC interrupts to intel_guc_sanitize(). Part of this requires moving the guc_*_interrupt wrapper function into header file intel_guc.h. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-11-matthew.brost@intel.com
2021-07-28drm/i915: Reset GPU immediately if submission is disabledMatthew Brost6-13/+79
If submission is disabled by the backend for any reason, reset the GPU immediately in the heartbeat code as the backend can't be reenabled until the GPU is reset. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-10-matthew.brost@intel.com
2021-07-28drm/i915/guc: Reset implementation for new GuC interfaceMatthew Brost7-132/+516
Reset implementation for new GuC interface. This is the legacy reset implementation which is called when the i915 owns the engine hang check. Future patches will offload the engine hang check to GuC but we will continue to maintain this legacy path as a fallback and this code path is also required if the GuC dies. With the new GuC interface it is not possible to reset individual engines - it is only possible to reset the GPU entirely. This patch forces an entire chip reset if any engine hangs. v2: (Michal) - Check for -EPIPE rather than -EIO (CT deadlock/corrupt check) v3: (John H) - Split into a series of smaller patches v4: (John H) - Fix typo - Add braces around if statements in reset code v5: (Checkpatch) - Fix warnings Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <john.c.harrison@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-9-matthew.brost@intel.com
2021-07-28drm/i915: Move active request tracking to a vfuncMatthew Brost9-37/+147
Move active request tracking to a backend vfunc rather than assuming all backends want to do this in the manner. In the of case execlists / ring submission the tracking is on the physical engine while with GuC submission it is on the context. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-8-matthew.brost@intel.com
2021-07-28drm/i915: Add i915_sched_engine destroy vfuncMatthew Brost3-4/+8
This is required to allow backend specific cleanup v2: (John H) - Rework commit message Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-7-matthew.brost@intel.com
2021-07-28drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbsMatthew Brost9-37/+133
With GuC virtual engines the physical engine which a request executes and completes on isn't known to the i915. Therefore we can't attach a request to a physical engines breadcrumbs. To work around this we create a single breadcrumbs per engine class when using GuC submission and direct all physical engine interrupts to this breadcrumbs. v2: (John H) - Rework header file structure so intel_engine_mask_t can be in intel_engine_types.h Signed-off-by: Matthew Brost <matthew.brost@intel.com> CC: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-6-matthew.brost@intel.com
2021-07-28drm/i915/guc: Disable bonding extension with GuC submissionMatthew Brost1-0/+5
Update the bonding extension to return -ENODEV when using GuC submission as this extension fundamentally will not work with the GuC submission interface. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-5-matthew.brost@intel.com
2021-07-28drm/i915: Hold reference to intel_context over life of i915_requestMatthew Brost1-32/+23
Hold a reference to the intel_context over life of an i915_request. Without this an i915_request can exist after the context has been destroyed (e.g. request retired, context closed, but user space holds a reference to the request from an out fence). In the case of GuC submission + virtual engine, the engine that the request references is also destroyed which can trigger bad pointer dref in fence ops (e.g. i915_fence_get_driver_name). We could likely change i915_fence_get_driver_name to avoid touching the engine but let's just be safe and hold the intel_context reference. v2: (John Harrison) - Update comment explaining how GuC mode and execlists mode deal with virtual engines differently Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-4-matthew.brost@intel.com
2021-07-28drm/i915/guc: Make hangcheck work with GuC virtual enginesJohn Harrison3-1/+17
The serial number tracking of engines happens at the backend of request submission and was expecting to only be given physical engines. However, in GuC submission mode, the decomposition of virtual to physical engines does not happen in i915. Instead, requests are submitted to their virtual engine mask all the way through to the hardware (i.e. to GuC). This would mean that the heart beat code thinks the physical engines are idle due to the serial number not incrementing. Which in turns means hangcheck does not work for GuC virtual engines. This patch updates the tracking to decompose virtual engines into their physical constituents and tracks the request against each. This is not entirely accurate as the GuC will only be issuing the request to one physical engine. However, it is the best that i915 can do given that it has no knowledge of the GuC's scheduling decisions. Downside of this is that all physical engines constituting a GuC virtual engine will be periodically unparked (even during just a single context executing) in order to be pinged with a heartbeat request. However the power and performance cost of this is not expected to be measurable (due low frequency of heartbeat pulses) and it is considered an easier option than trying to make changes to GuC firmware. v2: (Tvrtko) - Update commit message - Have default behavior if no vfunc present Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-3-matthew.brost@intel.com
2021-07-28drm/i915/guc: GuC virtual enginesMatthew Brost9-36/+311
Implement GuC virtual engines. Rather simple implementation, basically just allocate an engine, setup context enter / exit function to virtual engine specific functions, set all other variables / functions to guc versions, and set the engine mask to that of all the siblings. v2: Update to work with proto-ctx v3: (Daniele) - Drop include, add comment to intel_virtual_engine_has_heartbeat Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-2-matthew.brost@intel.com
2021-07-27drm/i915/display: Disable audio, DRRS and PSR before planesJosé Roberto de Souza4-13/+59
HDMI and DisplayPort sequences states that audio and PSR should be disabled before planes are disabled. Not following it did not caused any problems up to Alderlake-P but for this platform it causes underruns during the PSR2 disable sequence. Specification don't mention that DRRS should be disabled before planes but it looks safer to switch back to the default refresh rate before following with the rest of the pipe disable sequence. BSpec: 49191 BSpec: 49190 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210726181559.80855-1-jose.souza@intel.com
2021-07-27drm/i915: Implement PSF GV point supportStanislav Lisovskiy4-5/+122
PSF GV points are an additional factor that can limit the bandwidth available to display, separate from the traditional QGV points. Whereas traditional QGV points represent possible memory clock frequencies, PSF GV points reflect possible frequencies of the memory fabric. Switching between PSF GV points has the advantage of incurring almost no memory access block time and thus does not need to be accounted for in watermark calculations. This patch adds support for those on top of regular QGV points. Those are supposed to be used simultaneously, i.e we are always at some QGV and some PSF GV point, based on the current video mode requirements. Bspec: 64631, 53998 v2: Seems that initial assumption made during ml conversation was wrong, PCode rejects any masks containing points beyond the ones returned, so even though BSpec says we have around 8 points theoretically, we can mask/unmask only those which are returned, trying to manipulate those beyond causes a failure from PCode. So switched back to generating mask from 1 - num_qgv_points, where num_qgv_points is the actual amount of points, advertised by PCode. v3: - Extended restricted qgv point mask to 0xf, as we have now 3:2 bits for PSF GV points(Matt Roper) - Replaced val2 with NULL from PCode request, since its not being used(Matt Roper) - Replaced %d to 0x%x for better readability(thanks for spotting) Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-2-stanislav.lisovskiy@intel.com
2021-07-27drm/i915: Extend QGV point restrict mask to 0x3Stanislav Lisovskiy1-1/+1
According to BSpec there is now also a code 0x02, which corresponds to QGV point being rejected, this code so lets extend mask to check this. Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210531064845.4389-1-stanislav.lisovskiy@intel.com
2021-07-27drm/i915/ehl: unconditionally flush the pages on acquireMatthew Auld2-0/+24
EHL and JSL add the 'Bypass LLC' MOCS entry, which should make it possible for userspace to bypass the GTT caching bits set by the kernel, as per the given object cache_level. This is troublesome since the heavy flush we apply when first acquiring the pages is skipped if the kernel thinks the object is coherent with the GPU. As a result it might be possible to bypass the cache and read the contents of the page directly, which could be stale data. If it's just a case of userspace shooting themselves in the foot then so be it, but since i915 takes the stance of always zeroing memory before handing it to userspace, we need to prevent this. v2: this time actually set cache_dirty in put_pages() v3: move to get_pages() which looks simpler BSpec: 34007 References: 046091758b50 ("Revert "drm/i915/ehl: Update MOCS table for EHL"") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Cc: Francisco Jerez <francisco.jerez.plata@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210723105045.400841-2-matthew.auld@intel.com
2021-07-27drm/i915: document caching related bitsMatthew Auld2-13/+183
Try to document the object caching related bits, like cache_coherent and cache_dirty. v2(Ville): - As pointed out by Ville, fix the completely incorrect assumptions about the "partial" coherency on shared LLC platforms. v3(Daniel): - Fix nonsense about "dirtying" the cache with reads. v4(Daniel): - Various improvements, including adding some more details for WT. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210723105045.400841-1-matthew.auld@intel.com
2021-07-27drm/i915/display/psr2: Fix cursor updates using legacy apisJosé Roberto de Souza1-2/+6
The fast path only updates cursor register what will not cause any updates in the screen when using PSR2 selective fetch. The only option that we have is to go through the slow patch that will do full atomic commit, that will trigger the PSR2 selective fetch compute and programing calls. Without this patch is possible to see a mouse movement lag in Gnome when PSR2 selective fetch is enabled. Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717011227.204494-3-jose.souza@intel.com
2021-07-27drm/i915/display/psr2: Mark as updated all planes that intersect with pipe_clipJosé Roberto de Souza1-0/+1
Without this planes that were added by intel_psr2_sel_fetch_update() that intersect with pipe damaged area will not have skl_program_plane() and intel_psr2_program_plane_sel_fetch() called, causing panel to not be properly updated. Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210717011227.204494-2-jose.souza@intel.com