kernel/linux.git/drivers/gpu/drm/i915/intel_ringbuffer.h, branch v4.11.5

drm/i915/execlists: Wrap tail pointer after reset tweaking

2017-03-29T12:45:48+00:00

If the request->wa_tail is 0 (because it landed exactly on the end of the ringbuffer), when we reconstruct request->tail following a reset we fill in an illegal value (-8 or 0x001ffff8). As a result, RING_HEAD is never able to catch up with RING_TAIL and the GPU spins endlessly. If the ring contains a couple of breadcrumbs, even our hangcheck is unable to catch the busy-looping as the ACTHD and seqno continually advance.

v2: Move the wrap into a common intel_ring_wrap().

Fixes: a3aabe86a340 ("drm/i915/execlists: Reinitialise context image after GPU hang")
Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Cc: # v4.10+
Link: http://patchwork.freedesktop.org/patch/msgid/20170327130009.4678-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala
(cherry picked from commit 450362d3fe866b14304f309b5fffba0c33fbfbc3)
Signed-off-by: Jani Nikula
Link: http://patchwork.freedesktop.org/patch/msgid/20170329121315.1290-1-chris@chris-wilson.co.uk
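
The wrap described in this commit can be illustrated with a minimal sketch. It assumes the ring size is a power of two and that the tail is rebuilt by stepping back 8 bytes from wa_tail (inferred from the -8 in the message). The struct and function names are hypothetical and are not the driver's own definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's ring; size must be a power of two. */
struct ring_sketch {
	uint32_t size;
};

/* Fold any byte offset back into [0, size) with a power-of-two mask. */
static inline uint32_t ring_wrap_sketch(const struct ring_sketch *ring,
					uint32_t pos)
{
	return pos & (ring->size - 1);
}

/*
 * Rebuild the tail after a reset: step back 8 bytes from wa_tail, then
 * wrap. Without the wrap, wa_tail == 0 would yield an offset outside
 * the ring.
 */
static inline uint32_t rebuild_tail_sketch(const struct ring_sketch *ring,
					   uint32_t wa_tail)
{
	return ring_wrap_sketch(ring, wa_tail - 8);
}
```

With a 16 KiB ring, a wa_tail of 0 is folded back to the last 8 bytes of the ring instead of underflowing.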

drm/i915: make context status notifier head be per engine

2017-03-21T14:51:47+00:00

GVTg introduced the context status notifier to schedule GVTg workloads. At that time the notifier was bound to GVTg contexts only, so GVTg was not aware of host workloads.

Now we are going to improve GVTg's guest workload scheduler policy and add GuC emulation support for new Gen graphics. Both features require acknowledgment for all contexts running on hardware (but will not alter host workloads). So here we make some changes.

The change is simple:

1. Move the context status notifier head from i915_gem_context to intel_engine_cs, which means there is a notifier head per engine instead of per context. The execlists driver still calls the notifier for each context sched-in/out event on the current engine.

2. On the GVTg side, bind a notifier_block for each physical engine during GVTg initialization. GVTg can then hear all context status events.

In this patch, GVTg does nothing for host context events, but a later patch will add a function there. In any case, the notifier callback is a noop if there is no active vGPU.

Since intel_gvt_init() is called at an early initialization stage and requires the status notifier head to have been initialized, I initialize it in intel_engine_setup().

v2: remove a redundant newline. (chris)

Fixes: 3c7ba6359d70 ("drm/i915: Introduce execlist context status change notification")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232
Signed-off-by: Changbin Du
Cc: Joonas Lahtinen
Cc: Tvrtko Ursulin
Cc: Zhi Wang
Reviewed-by: Chris Wilson
Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.com
Acked-by: Zhenyu Wang
Signed-off-by: Chris Wilson
(cherry picked from commit 3fc03069bc6e6c316f19bb526e3c8ce784677477)
Signed-off-by: Jani Nikula
Link: http://patchwork.freedesktop.org/patch/msgid/20170321144720.17020-1-chris@chris-wilson.co.uk
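
The per-engine notifier head can be pictured with a small userspace sketch. It does not use the kernel's notifier chain API; every type and function name here is hypothetical, and the listener simply counts events to show that one block registered on an engine sees all context sched-in/out events for that engine.

```c
#include <stddef.h>

enum ctx_status_sketch { CTX_SCHEDULE_IN, CTX_SCHEDULE_OUT };

/* Hypothetical listener node, loosely modelled on a notifier block. */
struct notifier_sketch {
	void (*call)(struct notifier_sketch *nb,
		     enum ctx_status_sketch status, void *ctx);
	struct notifier_sketch *next;
};

/* The list head lives on the engine, not on each context. */
struct engine_sketch {
	struct notifier_sketch *context_status_notifier;
};

static inline void engine_register_sketch(struct engine_sketch *engine,
					  struct notifier_sketch *nb)
{
	nb->next = engine->context_status_notifier;
	engine->context_status_notifier = nb;
}

/* Called for every context scheduled in or out on this engine. */
static inline void engine_notify_sketch(struct engine_sketch *engine,
					enum ctx_status_sketch status,
					void *ctx)
{
	struct notifier_sketch *nb;

	for (nb = engine->context_status_notifier; nb; nb = nb->next)
		nb->call(nb, status, ctx);
}

/* Example listener: nb is the first member, so the cast below is valid. */
struct counting_listener_sketch {
	struct notifier_sketch nb;
	int events;
};

static inline void count_events_sketch(struct notifier_sketch *nb,
				       enum ctx_status_sketch status,
				       void *ctx)
{
	(void)status;
	(void)ctx;
	((struct counting_listener_sketch *)nb)->events++;
}
```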

drm/i915: request ring to be pinned above GUC_WOPCM_TOP

2016-12-24T10:06:59+00:00

GuC will validate the ring offset and fail if it is in the [0, GUC_WOPCM_TOP) range. The bias is conditionally applied only if GuC loading is enabled (we can't check for guc submission enabled as in other cases because HuC loading requires this fix).

Note that the default context is processed before enable_guc_loading is sanitized, so we might still apply the bias to its ring even if it is not needed.

v2: compute the value during ctx init and pass it to intel_ring_pin (Chris), updated commit message

Signed-off-by: Daniele Ceraolo Spurio
Cc: Chris Wilson
Cc: Michal Wajdeczko
Cc: Arkadiusz Hiler
Cc: Anusha Srivatsa
Cc: Michał Winiarski
Link: http://patchwork.freedesktop.org/patch/msgid/1482537382-28584-1-git-send-email-daniele.ceraolospurio@intel.com
Reviewed-by: Chris Wilson
Signed-off-by: Chris Wilson
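
A rough sketch of the conditional bias follows. The constant value and all names are placeholders for illustration; the real GUC_WOPCM_TOP and the pinning interface are defined by the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder value for illustration only. */
#define SKETCH_GUC_WOPCM_TOP (512u * 1024u)

/* Computed once at context init, then passed to the ring pin call. */
static inline uint32_t ring_offset_bias_sketch(bool guc_loading_enabled)
{
	return guc_loading_enabled ? SKETCH_GUC_WOPCM_TOP : 0;
}

/* An offset is acceptable only if it lies at or above the bias. */
static inline bool ring_offset_ok_sketch(uint32_t offset, uint32_t bias)
{
	return offset >= bias;
}
```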

drm/i915: Swap if(enable_execlists) in i915_gem_request_alloc for a vfunc

2016-12-18T16:18:56+00:00

A fairly trivial move of a matching pair of routines (for preparing a request for construction) onto an engine vfunc. The ulterior motive is to be able to create a mock request implementation.

Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-7-chris@chris-wilson.co.uk
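
The general pattern of replacing an if (enable_execlists) branch with a per-engine function pointer might look like the sketch below. All names are hypothetical, and the bodies only record which backend ran; a mock backend can then be installed by assigning a different function.

```c
/* Hypothetical request; records which backend prepared it. */
struct request_sketch {
	int prepared_by;
};

struct engine_sketch {
	/* Per-engine hook, chosen once at engine setup. */
	int (*request_alloc)(struct request_sketch *rq);
};

static inline int legacy_request_alloc_sketch(struct request_sketch *rq)
{
	rq->prepared_by = 1;
	return 0;
}

static inline int execlists_request_alloc_sketch(struct request_sketch *rq)
{
	rq->prepared_by = 2;
	return 0;
}

/* A test double slots in without touching the caller. */
static inline int mock_request_alloc_sketch(struct request_sketch *rq)
{
	rq->prepared_by = 3;
	return 0;
}

/* The mode check happens here, once, instead of on every allocation. */
static inline void engine_setup_sketch(struct engine_sketch *engine,
				       int enable_execlists)
{
	engine->request_alloc = enable_execlists ?
		execlists_request_alloc_sketch : legacy_request_alloc_sketch;
}

/* The caller no longer branches on the submission mode. */
static inline int request_alloc_sketch(struct engine_sketch *engine,
				       struct request_sketch *rq)
{
	return engine->request_alloc(rq);
}
```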

drm/i915: Unify active context tracking between legacy/execlists/guc

2016-12-18T16:18:50+00:00

The requests conversion introduced a nasty bug where we could generate a new request in the middle of constructing a request if we needed to idle the system in order to evict space for a context. The request to idle would be executed (and waited upon) before the current one, creating a minor havoc in the seqno accounting, as we will consider the current request to already be completed (prior to deferred seqno assignment) but ring->last_retired_head would have been updated and still could allow us to overwrite the current request before execution.

We also employed two different mechanisms to track the active context until it was switched out. The legacy method allowed for waiting upon an active context (it could forcibly evict any vma, including context's), but the execlists method took a step backwards by pinning the vma for the entire active lifespan of the context (the only way to evict was to idle the entire GPU, not individual contexts). However, to circumvent the tricky issue of locking (i.e. we cannot take struct_mutex at the time of i915_gem_request_submit(), where we would want to move the previous context onto the active tracker and unpin it), we take the execlists approach and keep the contexts pinned until retirement. The benefit of the execlists approach, more important for execlists than legacy, was the reduction in work in pinning the context for each request - as the context was kept pinned until idle, it could short circuit the pinning for all active contexts.

We introduce new engine vfuncs to pin and unpin the context respectively. The context is pinned at the start of the request, and only unpinned when the following request is retired (this ensures that the context is idle and coherent in main memory before we unpin it). We move the engine->last_context tracking into the retirement itself (rather than during request submission) in order to allow the submission to be reordered or unwound without undue difficulty.

And finally an ulterior motive for unifying context handling was to prepare for mock requests.

v2: Rename to last_retired_context, split out legacy_context tracking for MI_SET_CONTEXT.

Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
Link: http://patchwork.freedesktop.org/patch/msgid/20161218153724.8439-3-chris@chris-wilson.co.uk
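
The pin/unpin lifetime described above (pin at the start of a request, unpin only when the following request retires) can be sketched as below. The types and helper names are hypothetical; only last_retired_context echoes the name given in the v2 note.

```c
#include <stddef.h>

/* Hypothetical context with a simple pin count. */
struct ctx_sketch {
	int pin_count;
};

struct engine_sketch {
	struct ctx_sketch *last_retired_context;
};

/* Start of a request: pin the context it will run in. */
static inline void request_start_sketch(struct ctx_sketch *ctx)
{
	ctx->pin_count++;
}

/*
 * Retirement of a request: only now is the previously retired context
 * known to be switched out, so release its pin and remember the
 * context of the request being retired.
 */
static inline void request_retire_sketch(struct engine_sketch *engine,
					 struct ctx_sketch *ctx)
{
	if (engine->last_retired_context)
		engine->last_retired_context->pin_count--;
	engine->last_retired_context = ctx;
}
```

In this sketch a context keeps one pin per outstanding request, plus the pin held on behalf of the most recently retired request until the next retirement releases it.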

drm/i915: Decouple hang detection from hangcheck period

2016-11-21T12:36:40+00:00

Hangcheck state accumulation has gained more steps over the years, like head movement and, more recently, the subunit inactivity check. As the subunit sampling is only done if the previous state check showed inactivity, we have added more stages (and time) to reach a hang verdict.

Asymmetric engine states led to different actual weights of 'one hangcheck unit', and it was demonstrated in some hangs that, due to the difference in stages, simpler engines were falsely accused of a hang as their score accumulated above the hang threshold much more quickly.

To completely decouple the hangcheck guilty score from the hangcheck period, convert the hangcheck score to a rough measurement of the period of inactivity. As these are tracked as jiffies, they are also meaningful across reset boundaries. This makes finding a guilty engine more accurate in multi-engine activity scenarios, especially across asymmetric engines.

We lose the ability to detect cross-batch malicious attempts to hinder progress. The plan is to move this functionality into context banning, which is a more natural fit, later in the series.

v2: use time_before macros (Chris)
    reinstate the pardoning of moving engine after hc (Chris)
v3: avoid global state for per engine stall detection (Chris)
v4: take timeline last retirement into account (Chris)
v5: do debug print on pardoning, split out retirement timestamp (Chris)

Cc: Chris Wilson
Reviewed-by: Chris Wilson
Signed-off-by: Mika Kuoppala
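
The idea of replacing an accumulated score with a measured period of inactivity can be sketched as follows. This is a userspace illustration, not the driver's code: the struct, the function names, and the timeout parameter are hypothetical, and the comparison helper only mirrors the wraparound-safe idea behind the kernel's time_after()/time_before() macros.

```c
#include <stdbool.h>

/* Wraparound-safe "a is later than b" for free-running tick counters. */
static inline bool time_after_sketch(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical per-engine hangcheck state. */
struct hangcheck_sketch {
	unsigned long seqno;            /* last seqno seen */
	unsigned long action_timestamp; /* tick at which progress was last seen */
};

/*
 * Sample the engine at tick 'now'. Progress resets the timestamp; the
 * engine is only declared stalled once it has been inactive for longer
 * than 'timeout' ticks, however often it is sampled in between.
 */
static inline bool hangcheck_sample_sketch(struct hangcheck_sketch *hc,
					   unsigned long seqno,
					   unsigned long now,
					   unsigned long timeout)
{
	if (seqno != hc->seqno) {
		hc->seqno = seqno;
		hc->action_timestamp = now;
		return false;
	}

	return time_after_sketch(now, hc->action_timestamp + timeout);
}
```

Because the verdict depends on elapsed ticks rather than on the number of samples, engines that need more sampling stages are not judged on a different scale.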

drm/i915: Split up hangcheck phases

2016-11-21T12:36:40+00:00

In order to simplify hangcheck state keeping, split the hangcheck per-engine loop into three phases: state load, action, state save.

Add a few more hangcheck actions to separate between seqno, head and subunit movements. This helps to gather all the hangcheck actions under a single switch umbrella.

Cc: Chris Wilson
Signed-off-by: Mika Kuoppala
Reviewed-by: Chris Wilson
Signed-off-by: Mika Kuoppala
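
A minimal sketch of the three-phase structure (state load, action, state save) is given below. The enum values, structs, and the stalled-sample counter are hypothetical and only illustrate how separate seqno, head, and subunit actions can be handled under one switch.

```c
/* Hypothetical snapshot of what the hardware reports. */
struct hw_sample_sketch {
	unsigned int seqno, head, subunits;
};

/* Hypothetical state kept between hangcheck runs. */
struct hc_state_sketch {
	unsigned int seqno, head, subunits;
	unsigned int stalled_samples;
};

enum hc_action_sketch {
	HC_ACTIVE_SEQNO,
	HC_ACTIVE_HEAD,
	HC_ACTIVE_SUBUNITS,
	HC_STUCK,
};

static inline enum hc_action_sketch
hangcheck_step_sketch(struct hc_state_sketch *saved,
		      const struct hw_sample_sketch *hw)
{
	/* Phase 1, state load: take one snapshot of the engine. */
	struct hw_sample_sketch now = *hw;
	enum hc_action_sketch action;

	/* Phase 2, action: classify what moved since the last sample. */
	if (now.seqno != saved->seqno)
		action = HC_ACTIVE_SEQNO;
	else if (now.head != saved->head)
		action = HC_ACTIVE_HEAD;
	else if (now.subunits != saved->subunits)
		action = HC_ACTIVE_SUBUNITS;
	else
		action = HC_STUCK;

	/* All consequences of the action sit under a single switch. */
	switch (action) {
	case HC_ACTIVE_SEQNO:
	case HC_ACTIVE_HEAD:
	case HC_ACTIVE_SUBUNITS:
		saved->stalled_samples = 0;
		break;
	case HC_STUCK:
		saved->stalled_samples++;
		break;
	}

	/* Phase 3, state save: remember the snapshot for the next run. */
	saved->seqno = now.seqno;
	saved->head = now.head;
	saved->subunits = now.subunits;

	return action;
}
```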

drm/i915/scheduler: Execute requests in order of priorities

2016-11-14T21:01:21+00:00

Track the priority of each request and use it to determine the order in which we submit requests to the hardware via execlists.

The priority of the request is determined by the user (eventually via the context) but may be overridden at any time by the driver. When we set the priority of the request, we bump the priority of all of its dependencies to match - so that a high priority drawing operation is not stuck behind a background task.

When the request is ready to execute (i.e. we have signaled the submit fence following completion of all its dependencies, including third party fences), we put the request into a priority sorted rbtree to be submitted to the hardware. If the request is higher priority than all pending requests, it will be submitted on the next context-switch interrupt as soon as the hardware has completed the current request. We do not currently preempt any current execution to immediately run a very high priority request, at least not yet.

One more limitation is that this first implementation is for execlists only, so it is currently limited to gen8/gen9.

v2: Replace recursive priority inheritance bumping with an iterative depth-first search list.
v3: list_next_entry() for walking lists
v4: Explain how the dfs solves the recursion problem with PI.

Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-8-chris@chris-wilson.co.uk
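
The iterative priority bump mentioned in the v2 note can be illustrated with an explicit worklist in place of recursion. This sketch is not the driver's implementation (which walks kernel lists and feeds an rbtree); the types, the fixed array bounds, and the function name are hypothetical. It assumes at most SKETCH_MAX_REQUESTS requests are reachable from the one being bumped and that a dependency never has a lower priority than a request depending on it.

```c
#include <stddef.h>

#define SKETCH_MAX_DEPS 4
#define SKETCH_MAX_REQUESTS 16

/* Hypothetical request with direct pointers to what it depends on. */
struct rq_sketch {
	int priority;
	size_t ndeps;
	struct rq_sketch *deps[SKETCH_MAX_DEPS];
};

/*
 * Raise rq and everything it depends on to at least 'prio'. A request
 * is pushed at most once, because its priority is raised before the
 * push and anything already at 'prio' or above is skipped.
 */
static inline void bump_priority_sketch(struct rq_sketch *rq, int prio)
{
	struct rq_sketch *stack[SKETCH_MAX_REQUESTS];
	size_t top = 0;

	if (rq->priority >= prio)
		return;

	rq->priority = prio;
	stack[top++] = rq;

	while (top) {
		struct rq_sketch *cur = stack[--top];
		size_t i;

		for (i = 0; i < cur->ndeps; i++) {
			struct rq_sketch *dep = cur->deps[i];

			/* Already high enough; this also prevents revisits. */
			if (dep->priority >= prio)
				continue;

			dep->priority = prio;
			stack[top++] = dep;
		}
	}
}
```

Using a worklist keeps the stack depth constant no matter how long the dependency chain is, which is the concern a recursive walk raises in kernel context.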

drm/i915/scheduler: Signal the arrival of a new request

2016-11-14T21:00:26+00:00

The start of the scheduler: add a hook into request submission for the scheduler to see the arrival of new requests and prepare its runqueues.

Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-6-chris@chris-wilson.co.uk

drm/i915: Remove engine->execlist_lock

2016-11-14T21:00:25+00:00

The execlist_lock is now completely subsumed by the engine->timeline->lock, and so we can remove the redundant layer of locking.

Signed-off-by: Chris Wilson
Reviewed-by: Tvrtko Ursulin
Link: http://patchwork.freedesktop.org/patch/msgid/20161114204105.29171-5-chris@chris-wilson.co.uk