summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gt/intel_rps.c
AgeCommit message (Collapse)AuthorFilesLines
2020-12-31drm/i915: Drop i915_request.lock requirement for intel_rps_boost()Chris Wilson1-12/+13
Since we use a flag within i915_request.flags to indicate when we have boosted the request (so that we only apply the boost) once, this can be used as the serialisation with i915_request_retire() to avoid having to explicitly take the i915_request.lock which is more heavily contended. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201231093149.19086-1-chris@chris-wilson.co.uk
2020-11-30drm/i915/gt: Limit frequency drop to RPe on parkingChris Wilson1-0/+4
We treat idling the GT (intel_rps_park) as a downclock event, and reduce the frequency we intend to restart the GT with. Since the two workloads are likely related (e.g. a compositor rendering every 16ms), we want to carry the frequency and load information from across the idling. However, we do also need to update the frequencies so that workloads that run for less than 1ms are autotuned by RPS (otherwise we leave compositors running at max clocks, draining excess power). Conversely, if we try to run too slowly, the next workload has to run longer. Since there is a hysteresis in the power graph, below a certain frequency running a short workload for longer consumes more energy than running it slightly higher for less time. The exact balance point is unknown beforehand, but measurements with 30fps media playback indicate that RPe is a better choice. Reported-by: Edward Baker <edward.baker@intel.com> Tested-by: Edward Baker <edward.baker@intel.com> Fixes: 043cd2d14ede ("drm/i915/gt: Leave rps->cur_freq on unpark") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Edward Baker <edward.baker@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Lyude Paul <lyude@redhat.com> Cc: <stable@vger.kernel.org> # v5.8+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201124183521.28623-1-chris@chris-wilson.co.uk
2020-11-21drm/i915/gt: Plug IPS into intel_rps_setChris Wilson1-12/+22
The old IPS interface did not match the RPS interface that we tried to plug it into (bool vs int return). Once repaired, our minimal selftesting is finally happy! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201121190352.15996-1-chris@chris-wilson.co.uk
2020-11-13Merge tag 'drm-intel-gt-next-2020-11-12-1' of ↵Dave Airlie1-1/+1
git://anongit.freedesktop.org/drm/drm-intel into drm-next Cross-subsystem Changes: - DMA mapped scatterlist fixes in i915 to unblock merging of https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom) Driver Changes: - Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"): Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris) - Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce) - Delay execlist processing for Tigerlake to avoid hang (Chris) - Fix for Tigerlake RCS engine health check through heartbeat (Chris) - Fix for Tigerlake reserved MOCS entries (Ayaz, Chris) - Fix Media power gate sequence on Tigerlake (Rodrigo) - Enable eLLC caching of display buffers for SKL+ (Ville) - Support parsing of oversize batches on Gen9 (Matt, Chris) - Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris) - Flush engines before Tigerlake breadcrumbs (Chris) - Use the local HWSP offset during submission (Chris) - Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew) - Use the active reference on the vma while capturing to avoid use-after-free (Chris) - Fix MOCS PTE setting for gen9+ (Ville) - Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris) - Avoid NULL dereference from PT/PD stash allocation error (Matt) - Hold request reference for canceling an active context (Chris) - Avoid infinite loop on x86-32 when mapping a lot of objects (Chris) - Disallow WC mappings when processor doesn't support them (Chris) - Return correct error in i915_gem_object_copy_blt() error path (Dan) - Return correct error in intel_context_create_request() error path (Maarten) - Tune down GuC communication enabled/disabled messages to debug (Jani) - Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris) - Cancel outstanding work after disabling heartbeats on an engine (Chris) - Signal cancelled requests (Chris) - Retire cancelled requests on unload (Chris) - Scrub HW state on driver remove (Chris) - Undo forced context restores after trivial preemptions (Chris) - Handle PCI unbind in PMU code (Tvrtko) - Fix CPU hotplug with multiple GPUs in PMU code (Trtkko) - Correctly set SFC capability for video engines (Venkata) - Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal) - Improve GuC warnings on loading failure (John) - Avoid ownership race in buffer pool by clearing age (Chris) - Use MMIO to read CSB in case of failure (Chris, Mika) - Show engine properties in engine state dump to indicate changes (Chris, Joonas) - Break up error capture compression loops with cond_resched() (Chris) - Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris) - Serialise debugfs i915_gem_objects with ctx->mutex (Chris) - Always test execution status on closing the context and close if not persistent (Chris) - Avoid mixing integer types during batch copies (Chris, Jared) - Skip over MI_NOOP when parsing to avoid overhead (Chris) - Hold onto an explicit ref to i915_vma_work.pinned (Chris) - Perform all asynchronous waits prior to marking payload start (Chris) - Pull phys pread/pwrite implementations to the backend (Matt) - Improve record of hung engines in error state (Tvrtko) - Allow backends to override pread implementation (Matt) - Reinforce LRC poisoning checks to confirm context survives execution (Chris) - Fix memory region max size calculation (Matt) - Fix order when adding blocks to memory region (Matt) - Eliminate unused intel_virtual_engine_get_sibling func (Chris) - Cleanup kasan warning for on-stack (unsigned long) casting (Chris) - Onion unwind for scratch page allocation failure (Chris) - Poison stolen pages before use (Chris) - Selftest improvements (Chris) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
2020-10-21drm/i915: Clean up the irq enable/disable for ilk rpsVille Syrjälä1-5/+10
Let's unmask the PCU event irq _after_ we've set up the hardware and software to deal with the fallout. We can also drop the PCU event bit from DEIER except when we need it for rps. And on the disable side we replace the hand rolled (and unlocked) DEIER/IIR/IMR frobbing with ilk_disable_display_irq(). Ocd does require me to reorder it to be symmetric with the enable path however. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-5-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-10-21drm/i915: Fix potential overflows in ilk ips calculationsVille Syrjälä1-5/+5
A bunch of the ips calculations require 64bit math. In particular 'corr' and 'corr2' look like they can overflow on 32bit systems. Switch to explicit u64 for those. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-3-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-10-21drm/i915: Read actual GPU frequency from MEMSTAT_ILK on ILKVille Syrjälä1-6/+25
There is no GEN6_RPSTAT1 on ILK. Instead of reading that let's try to get the same information from MEMSTAT_ILK. At least it seems to track MEMSWCTL frequency request perfectly on my ILK. It needs the same invert trick as the request value. We don't want to put the invert thing into intel_gpu_freq() and intel_freq_opcode() because that would incorrectly invert the min/max/etc frequencies also. One day someone might want to reverse engineer the formula for converting these numbers to Hz, but for now we'll just report them raw. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-2-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
2020-09-15drm/i915/gt: Check for a registered driver with IPSChris Wilson1-1/+1
If the ips module calls into the driver during an unbind/bind cycle, we may see the driver while it has unregistered itself from ips and try and dereference a NULL ips_mchdev pointer. <1> [211.928844] BUG: kernel NULL pointer dereference, address: 0000000000000014 <1> [211.928861] #PF: supervisor read access in kernel mode <1> [211.928871] #PF: error_code(0x0000) - not-present page <6> [211.928881] PGD 0 P4D 0 <4> [211.928890] Oops: 0000 [#1] PREEMPT SMP PTI <4> [211.928900] CPU: 3 PID: 327 Comm: ips-monitor Not tainted 5.9.0-rc5-CI-CI_DRM_9008+ #1 <4> [211.928914] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013 <4> [211.929056] RIP: 0010:mchdev_get+0x5a/0x180 [i915] <4> [211.929067] Code: c0 5a 74 0d 80 3d f1 53 29 00 00 0f 84 ab 00 00 00 48 8b 1d c8 a8 29 00 e8 d3 18 89 e1 85 c0 74 09 80 3d d1 53 29 00 00 74 65 <8b> 4b 14 48 8d 7b 14 85 c9 0f 84 09 01 00 00 8d 51 01 89 c8 f0 0f <4> [211.929095] RSP: 0018:ffffc900002efe60 EFLAGS: 00010202 <4> [211.929105] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffff8881297acf40 <4> [211.929118] RDX: 0000000000000000 RSI: ffffffff8264e2c0 RDI: ffff8881297ad820 <4> [211.929130] RBP: ffffc900002efe68 R08: ffff8881297ad820 R09: 00000000fffffffe <4> [211.929143] R10: ffff8881297acf40 R11: 00000000fff74c96 R12: ffff8881294dfa18 <4> [211.929155] R13: 0000000000000067 R14: ffff888126eff640 R15: ffff888126efe840 <4> [211.929168] FS: 0000000000000000(0000) GS:ffff888133d80000(0000) knlGS:0000000000000000 <4> [211.929182] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [211.929194] CR2: 0000000000000014 CR3: 0000000002610000 CR4: 00000000000006e0 <4> [211.929206] Call Trace: <4> [211.929294] i915_read_mch_val+0x15/0x380 [i915] <4> [211.929309] ? ips_monitor+0x3fb/0x630 [intel_ips] <4> [211.929321] ips_monitor+0x53c/0x630 [intel_ips] <4> [211.929334] ? ips_gpu_lower+0x30/0x30 [intel_ips] <4> [211.929348] kthread+0x14d/0x170 <4> [211.929358] ? kthread_park+0x80/0x80 <4> [211.929369] ret_from_fork+0x22/0x30 <4> [211.929382] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic ledtrig_audio i915 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm mei_me mei intel_ips lpc_ich ptp prime_numbers pps_core <4> [211.929437] CR2: 0000000000000014 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915105113.26564-1-chris@chris-wilson.co.uk
2020-09-07drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbsChris Wilson1-0/+1
On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2020-07-08drm/i915/sseu: Move sseu_info under gt_infoVenkata Sandeep Dhanalakota1-1/+2
SSEUs are a GT capability, so track them under gt_info. Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-8-daniele.ceraolospurio@intel.com
2020-06-19drm/i915/gt: Initialise rps timestampChris Wilson1-2/+2
Smatch warns that we may iterate over an empty array of gt->engines[]. One hopes that this is impossible, but nevertheless we can simply appease smatch by initialising the timestamp to zero before we starting probing the busy-time from the engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200619151938.21740-1-chris@chris-wilson.co.uk
2020-06-18drm/i915/gt: Always report the sample time for busy-statsChris Wilson1-5/+4
Return the monotonic timestamp (ktime_get()) at the time of sampling the busy-time. This is used in preference to taking ktime_get() separately before or after the read seqlock as there can be some large variance in reported timestamps. For selftests trying to ascertain that we are reporting accurate to within a few microseconds, even a small delay leads to the test failing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
2020-05-03drm/i915/gt: Sanitize RPS interrupts upon resumeChris Wilson1-1/+4
Currently we clear and disable the RPS pm interrupts on module load, and presume that they remain disabled forevermore. However, the mask is cleared on suspend and so after resume they may start showing up again unexepectedly. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1811 Fixes: 8e99299a04bc ("drm/i915/gt: Track use of RPS interrupts in flags") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200502173512.32353-1-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Restore aggressive post-boost downclockingChris Wilson1-16/+4
We reduced the clocks slowly after a boost event based on the observation that the smoothness of animations suffered. However, since reducing the evalution intervals, we should be able to respond to the rapidly fluctuating workload of a simple desktop animation and so restore the more aggressive downclocking. References: 2a8862d2f3da ("drm/i915: Reduce the RPS shock") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-6-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Apply the aggressive downclocking to parkingChris Wilson1-4/+9
We treat parking as a manual RPS timeout event, and downclock the GPU for the next unpark and batch execution. However, having restored the aggressive downclocking and observed that we have very light workloads whose only interaction is through the manual parking events, carry over the aggressive downclocking to the fake RPS events. References: 21abf0bf168d ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-5-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Switch to manual evaluation of RPSChris Wilson1-16/+122
As with the realisation for soft-rc6, we respond to idling the engines within microseconds, far faster than the response times for HW RC6 and RPS. Furthermore, our fast parking upon idle, prevents HW RPS from running for many desktop workloads, as the RPS evaluation intervals are on the order of tens of milliseconds, but the typical workload is just a couple of milliseconds, but yet we still need to determine the best frequency for user latency versus power. Recognising that the HW evaluation intervals are a poor fit, and that they were deprecated [in bspec at least] from gen10, start to wean ourselves off them and replace the EI with a timer and our accurate busy-stats. The principle benefit of manually evaluating RPS intervals is that we can be more responsive for better performance and powersaving for both spiky workloads and steady-state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698 Fixes: 98479ada421a ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Track use of RPS interrupts in flagsChris Wilson1-5/+12
Use the new intel_rps.flags field to store whether or not interrupts are being used with RPS. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-3-chris@chris-wilson.co.uk
2020-04-30drm/i915/gt: Move rps.enabled/active to flagsChris Wilson1-17/+26
Pull the boolean intel_rps.enabled and intel_rps.active into a single flags field, in preparation for more. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-2-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Use the RPM config register to determine clk frequenciesChris Wilson1-14/+22
For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-2-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Trace RPS eventsChris Wilson1-4/+44
Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200424162805.25920-1-chris@chris-wilson.co.uk
2020-04-24drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUTChris Wilson1-23/+18
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help convserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffie], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200422001703.1697-1-chris@chris-wilson.co.uk
2020-04-16drm/i915/gt: Update PMINTRMSK holding fwChris Wilson1-3/+6
If we use a non-forcewaked write to PMINTRMSK, it does not take effect until much later, if at all, causing a loss of RPS interrupts and no GPU reclocking, leaving the GPU running at the wrong frequency for long periods of time. Reported-by: Francisco Jerez <currojerez@riseup.net> Suggested-by: Francisco Jerez <currojerez@riseup.net> Fixes: 35cc7f32c298 ("drm/i915/gt: Use non-forcewake writes for RPS") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Francisco Jerez <currojerez@riseup.net> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Cc: <stable@vger.kernel.org> # v5.6+ Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-2-chris@chris-wilson.co.uk
2020-04-16drm/i915/selftests: Exercise basic RPS interrupt generationChris Wilson1-0/+4
Since we depend upon RPS generating interrupts after evaluation intervals to determine when to up/down clock the GPU, it is imperative that we successfully enable interrupt generation! Verify that we do see an interrupt if we keep the GPU busy for an entire EI. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200415170318.16771-1-chris@chris-wilson.co.uk
2020-03-23drm/i915/gt: Leave rps->cur_freq on unparkChris Wilson1-9/+8
Don't override our previous frequency we used after parking, and avoid continually spiking back to the efficient frequency for mostly idle workloads. Trust our ability to autotune across a workload switch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-2-chris@chris-wilson.co.uk
2020-03-23drm/i915/gt: Treat idling as a RPS downclock eventChris Wilson1-0/+13
If we park/unpark faster than we can respond to RPS events, we never will process a downclock event after expiring a waitboost, and thus we will forever restart the GPU at max clocks even if the workload switches and doesn't justify full power. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1500 Fixes: 3e7abf814193 ("drm/i915: Extract GT render power state management") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Lyude Paul <lyude@redhat.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200322163225.28791-1-chris@chris-wilson.co.uk Cc: <stable@vger.kernel.org> # v5.5+
2020-03-19drm/i915/rps: use struct drm_device based logging macros.Wambui Karuga1-39/+36
Replace the use of the printk based drm logging macros with the struct drm_device based logging macros in i915/gt/intel_rps.c. This also involves extracting the drm_i915_private device pointer from various intel types. This converts the instances of DRM_DEBUG_DRIVER to drm_dbg() while not converting DRM_DEBUG() instances due to the lack of an analogous drm_device based macro. References: https://lists.freedesktop.org/archives/dri-devel/2020-January/253381.html Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200314183344.17603-7-wambui.karugax@gmail.com
2020-03-11drm/i915/gt: Pull checking rps->pm_events under the irq_lockChris Wilson1-12/+17
Avoid angering kcsan by serialising the read of the pm_events with the write in rps_disable_interrupts. [ 6268.713419] BUG: KCSAN: data-race in intel_rps_park [i915] / rps_work [i915] [ 6268.713437] [ 6268.713449] write to 0xffff8881eda8efac of 4 bytes by task 1127 on cpu 3: [ 6268.713680] intel_rps_park+0x136/0x260 [i915] [ 6268.713905] __gt_park+0x61/0xa0 [i915] [ 6268.714128] ____intel_wakeref_put_last+0x42/0x90 [i915] [ 6268.714352] __intel_wakeref_put_work+0xd3/0xf0 [i915] [ 6268.714369] process_one_work+0x3b1/0x690 [ 6268.714384] worker_thread+0x80/0x670 [ 6268.714398] kthread+0x19a/0x1e0 [ 6268.714412] ret_from_fork+0x1f/0x30 [ 6268.714423] [ 6268.714435] read to 0xffff8881eda8efac of 4 bytes by task 950 on cpu 2: [ 6268.714664] rps_work+0xc2/0x680 [i915] [ 6268.714679] process_one_work+0x3b1/0x690 [ 6268.714693] worker_thread+0x80/0x670 [ 6268.714707] kthread+0x19a/0x1e0 [ 6268.714720] ret_from_fork+0x1f/0x30 v2: Mark all reads and writes of rpm->pm_events. The flow of enabling/disabling rps is stronly ordered, so the writes and interrupt generation are also strongly ordered -- just this may not be visible to the compiler, so provide annotations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200311092624.10012-1-chris@chris-wilson.co.uk
2020-03-11drm/i915: Mark up racy read of active rq->engineChris Wilson1-1/+1
As a virtual engine may change the rq->engine to point to the active request in flight, we need to warn the compiler that an active request's engine is volatile. [ 95.017686] write (marked) to 0xffff8881e8386b10 of 8 bytes by interrupt on cpu 2: [ 95.018123] execlists_dequeue+0x762/0x2150 [i915] [ 95.018539] __execlists_submission_tasklet+0x48/0x60 [i915] [ 95.018955] execlists_submission_tasklet+0xd3/0x170 [i915] [ 95.018986] tasklet_action_common.isra.0+0x42/0xa0 [ 95.019016] __do_softirq+0xd7/0x2cd [ 95.019043] irq_exit+0xbe/0xe0 [ 95.019068] irq_work_interrupt+0xf/0x20 [ 95.019491] i915_request_retire+0x2c5/0x670 [i915] [ 95.019937] retire_requests+0xa1/0xf0 [i915] [ 95.020348] engine_retire+0xa1/0xe0 [i915] [ 95.020376] process_one_work+0x3b1/0x690 [ 95.020403] worker_thread+0x80/0x670 [ 95.020429] kthread+0x19a/0x1e0 [ 95.020454] ret_from_fork+0x1f/0x30 [ 95.020476] [ 95.020498] read to 0xffff8881e8386b10 of 8 bytes by task 8909 on cpu 3: [ 95.020918] __i915_request_commit+0x177/0x220 [i915] [ 95.021329] i915_gem_do_execbuffer+0x38c4/0x4e50 [i915] [ 95.021750] i915_gem_execbuffer2_ioctl+0x2c3/0x580 [i915] [ 95.021784] drm_ioctl_kernel+0xe4/0x120 [ 95.021809] drm_ioctl+0x297/0x4c7 [ 95.021832] ksys_ioctl+0x89/0xb0 [ 95.021865] __x64_sys_ioctl+0x42/0x60 [ 95.021901] do_syscall_64+0x6e/0x2c0 [ 95.021927] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200310142403.5953-1-chris@chris-wilson.co.uk
2020-03-09drm/i915/gt: Mark up intel_rps.active for racy readsChris Wilson1-0/+1
We read the current state of intel_rps.active outside of the lock, so mark up the racy access. [ 525.037073] BUG: KCSAN: data-race in intel_rps_boost [i915] / intel_rps_park [i915] [ 525.037091] [ 525.037103] write to 0xffff8881f145efa1 of 1 bytes by task 192 on cpu 2: [ 525.037331] intel_rps_park+0x72/0x230 [i915] [ 525.037552] __gt_park+0x61/0xa0 [i915] [ 525.037771] ____intel_wakeref_put_last+0x42/0x90 [i915] [ 525.037991] __intel_wakeref_put_work+0xd3/0xf0 [i915] [ 525.038008] process_one_work+0x3b1/0x690 [ 525.038022] worker_thread+0x80/0x670 [ 525.038037] kthread+0x19a/0x1e0 [ 525.038051] ret_from_fork+0x1f/0x30 [ 525.038062] [ 525.038074] read to 0xffff8881f145efa1 of 1 bytes by task 733 on cpu 3: [ 525.038304] intel_rps_boost+0x67/0x1f0 [i915] [ 525.038535] i915_request_wait+0x562/0x5d0 [i915] [ 525.038764] i915_gem_object_wait_fence+0x81/0xa0 [i915] [ 525.038994] i915_gem_object_wait_reservation+0x489/0x520 [i915] [ 525.039224] i915_gem_wait_ioctl+0x167/0x2b0 [i915] [ 525.039241] drm_ioctl_kernel+0xe4/0x120 [ 525.039255] drm_ioctl+0x297/0x4c7 [ 525.039269] ksys_ioctl+0x89/0xb0 [ 525.039282] __x64_sys_ioctl+0x42/0x60 [ 525.039296] do_syscall_64+0x6e/0x2c0 [ 525.039311] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200309113623.24208-1-chris@chris-wilson.co.uk
2020-03-09drm/i915/gt: Mark up intel_rps.active for racy readsChris Wilson1-4/+7
We read the current state of intel_rps.active outside of the lock, so mark up the racy access. [ 525.037073] BUG: KCSAN: data-race in intel_rps_boost [i915] / intel_rps_park [i915] [ 525.037091] [ 525.037103] write to 0xffff8881f145efa1 of 1 bytes by task 192 on cpu 2: [ 525.037331] intel_rps_park+0x72/0x230 [i915] [ 525.037552] __gt_park+0x61/0xa0 [i915] [ 525.037771] ____intel_wakeref_put_last+0x42/0x90 [i915] [ 525.037991] __intel_wakeref_put_work+0xd3/0xf0 [i915] [ 525.038008] process_one_work+0x3b1/0x690 [ 525.038022] worker_thread+0x80/0x670 [ 525.038037] kthread+0x19a/0x1e0 [ 525.038051] ret_from_fork+0x1f/0x30 [ 525.038062] [ 525.038074] read to 0xffff8881f145efa1 of 1 bytes by task 733 on cpu 3: [ 525.038304] intel_rps_boost+0x67/0x1f0 [i915] [ 525.038535] i915_request_wait+0x562/0x5d0 [i915] [ 525.038764] i915_gem_object_wait_fence+0x81/0xa0 [i915] [ 525.038994] i915_gem_object_wait_reservation+0x489/0x520 [i915] [ 525.039224] i915_gem_wait_ioctl+0x167/0x2b0 [i915] [ 525.039241] drm_ioctl_kernel+0xe4/0x120 [ 525.039255] drm_ioctl+0x297/0x4c7 [ 525.039269] ksys_ioctl+0x89/0xb0 [ 525.039282] __x64_sys_ioctl+0x42/0x60 [ 525.039296] do_syscall_64+0x6e/0x2c0 [ 525.039311] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200309113623.24208-1-chris@chris-wilson.co.uk
2020-02-27drm/i915: significantly reduce the use of <drm/i915_drm.h>Jani Nikula1-0/+2
The #include has been splattered all over the place, but there are precious few places, all .c files, that actually need it. v2: remove leftover double newlines Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200225133131.3301-1-jani.nikula@intel.com
2020-01-22drm/i915/gt: Make WARN* drm specific where drm_priv ptr is availablePankaj Bharadiya1-9/+11
drm specific WARN* calls include device information in the backtrace, so we know what device the warnings originate from. Covert all the calls of WARN* with device specific drm_WARN* variants in functions where drm_i915_private struct pointer is readily available. The conversion was done automatically with below coccinelle semantic patch. checkpatch errors/warnings are fixed manually. @rule1@ identifier func, T; @@ func(...) { ... struct drm_i915_private *T = ...; <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } @rule2@ identifier func, T; @@ func(struct drm_i915_private *T,...) { <+... ( -WARN( +drm_WARN(&T->drm, ...) | -WARN_ON( +drm_WARN_ON(&T->drm, ...) | -WARN_ONCE( +drm_WARN_ONCE(&T->drm, ...) | -WARN_ON_ONCE( +drm_WARN_ON_ONCE(&T->drm, ...) ) ...+> } command: spatch --sp-file <script> --dir drivers/gpu/drm/i915/gt \ --linux-spacing --in-place Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-7-pankaj.laxminarayan.bharadiya@intel.com
2020-01-06drm/i915: Merge i915_request.flags with i915_request.fence.flagsChris Wilson1-1/+1
As we already have a flags field buried within i915_request, reuse it! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-3-chris@chris-wilson.co.uk
2019-12-19drm/i915/gt: Suppress threshold updates on RPS parkingChris Wilson1-5/+6
When we park RPS, we set the GPU to run at minimum 'idle' frequency. However, as the GPU is idle, we also disable the worker and RPS interrupts - changing the RPS thresholds has no effect, it just incurs extra changes to restore them when we unpark. So on parking, leave the thresholds set to the current power level and so we expect them to be valid for our restart. References: https://gitlab.freedesktop.org/drm/intel/issues/848 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191218210545.3975426-2-chris@chris-wilson.co.uk
2019-12-19drm/i915/gt: Use non-forcewake writes for RPSChris Wilson1-34/+32
Use non-forcewaked writes to queue RPS register changes that will take effect when the write buffer is flushed, rather than wake the mmio device for immediate effect. This is so that we can avoid a slow forcewake dance upon unparking, and at our irregular updates. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191218210545.3975426-1-chris@chris-wilson.co.uk
2019-12-18drm/i915/gt: Remove direct invocation of breadcrumb signalingChris Wilson1-1/+1
Only signal the breadcrumbs from inside the irq_work, simplifying our interface and calling conventions. The micro-optimisation here is that by always using the irq_work interface, we know we are always inside an irq-off critical section for the breadcrumb signaling and can ellide save/restore of the irq flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191217095642.3124521-7-chris@chris-wilson.co.uk
2019-12-14drm/i915/rps: Add frequency translation helpersAndi Shyti1-3/+33
Add two helpers that for reading the actual GT's frequency. The two helpers are: - intel_rps_read_cagf: reads the frequency and returns it not normalized - intel_rps_read_actual_frequency: provides the frequency in Hz. Use the above helpers in sysfs and debugfs. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213183736.31992-2-andi@etezian.org
2019-12-12drm/i915/gt: Mark up ips_mchdev pointer accessChris Wilson1-1/+1
drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: error: incompatible types in comparison expression (different address spaces): drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: struct drm_i915_private [noderef] <asn:4> * drivers/gpu/drm/i915/gt/intel_rps.c:1726:24: struct drm_i915_private * Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191212140459.1307617-7-chris@chris-wilson.co.uk
2019-12-11drm/i915/gt: Check we are the Ironlake IPS provider before deregisteringChris Wilson1-1/+3
Check that we own the global pointer before deregistering. Reported-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191210153620.3929372-1-chris@chris-wilson.co.uk
2019-11-15drm/i915/guc: Properly capture & release GuC interrupts on Gen11+Daniele Ceraolo Spurio1-1/+1
With the new interrupt re-partitioning in Gen11, GuC controls by itself the interrupts it receives, so steering bits and registers have been defeatured. Being this the case, when the GuC is in control of submissions we won't know what to do with the ctx switch interrupt in the driver, so disable it. v2 (Daniele): replace the gen9 paths instead of keeping gen9 and gen11 functions since we won't support guc submission on any pre-gen11 platform. Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191105225321.26642-1-daniele.ceraolospurio@intel.com
2019-10-30drm/i915/gt: Always track callers to intel_rps_mark_interactive()Chris Wilson1-6/+6
During startup, we may find ourselves in an interesting position where we haven't fully enabled RPS before the display starts trying to use it. This may lead to an imbalance in our "interactive" counter: <3>[ 4.813326] intel_rps_mark_interactive:652 GEM_BUG_ON(!rps->power.interactive) <4>[ 4.813396] ------------[ cut here ]------------ <2>[ 4.813398] kernel BUG at drivers/gpu/drm/i915/gt/intel_rps.c:652! <4>[ 4.813430] invalid opcode: 0000 [#1] PREEMPT SMP PTI <4>[ 4.813438] CPU: 1 PID: 18 Comm: kworker/1:0H Not tainted 5.4.0-rc5-CI-CI_DRM_7209+ #1 <4>[ 4.813447] Hardware name: /NUC7i5BNB, BIOS BNKBL357.86A.0054.2017.1025.1822 10/25/2017 <4>[ 4.813525] Workqueue: events_highpri intel_atomic_cleanup_work [i915] <4>[ 4.813589] RIP: 0010:intel_rps_mark_interactive+0xb3/0xc0 [i915] <4>[ 4.813597] Code: bc 3f de e0 48 8b 35 84 2e 24 00 49 c7 c0 f3 d4 4e a0 b9 8c 02 00 00 48 c7 c2 80 9c 48 a0 48 c7 c7 3e 73 34 a0 e8 8d 3b e5 e0 <0f> 0b 90 66 2e 0f 1f 84 00 00 00 00 00 80 bf c0 00 00 00 00 74 32 <4>[ 4.813616] RSP: 0018:ffffc900000efe00 EFLAGS: 00010286 <4>[ 4.813623] RAX: 000000000000000e RBX: ffff8882583cc7f0 RCX: 0000000000000000 <4>[ 4.813631] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888275969c00 <4>[ 4.813639] RBP: 0000000000000000 R08: 0000000000000008 R09: ffff888275ace000 <4>[ 4.813646] R10: ffffc900000efe00 R11: ffff888275969c00 R12: ffff8882583cc8d8 <4>[ 4.813654] R13: ffff888276abce00 R14: 0000000000000000 R15: ffff88825e878860 <4>[ 4.813662] FS: 0000000000000000(0000) GS:ffff888276a80000(0000) knlGS:0000000000000000 <4>[ 4.813672] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 4.813678] CR2: 00007f051d5ca0a8 CR3: 0000000262f48001 CR4: 00000000003606e0 <4>[ 4.813686] Call Trace: <4>[ 4.813755] intel_cleanup_plane_fb+0x4e/0x60 [i915] <4>[ 4.813764] drm_atomic_helper_cleanup_planes+0x4d/0x70 <4>[ 4.813833] intel_atomic_cleanup_work+0x15/0x80 [i915] <4>[ 4.813842] process_one_work+0x26a/0x620 <4>[ 4.813850] worker_thread+0x37/0x380 <4>[ 4.813857] ? process_one_work+0x620/0x620 <4>[ 4.813864] kthread+0x119/0x130 <4>[ 4.813870] ? kthread_park+0x80/0x80 <4>[ 4.813878] ret_from_fork+0x3a/0x50 <4>[ 4.813887] Modules linked in: i915(+) mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul btusb btrtl btbcm btintel snd_hda_intel snd_intel_nhlt snd_hda_codec bluetooth snd_hwdep snd_hda_core ghash_clmulni_intel snd_pcm e1000e ecdh_generic ecc ptp pps_core mei_me mei prime_numbers <4>[ 4.813934] ---[ end trace c13289af88174ffc ]--- The solution employed is to not worry about RPS state and keep the tally of the interactive counter separate. When we do enable RPS, we will then take the display activity into account. Fixes: 3e7abf814193 ("drm/i915: Extract GT render power state management") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191030103827.2413-1-chris@chris-wilson.co.uk
2019-10-28drm/i915/gt: Tidy up rps irq handler to use intel_gtChris Wilson1-5/+3
Since the rps is tied to its intel_gt, use that backpointer to find the right engine rather than delving into i915. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191027175505.25470-1-chris@chris-wilson.co.uk
2019-10-27drm/i915/rps: Flip interpretation of ips fmin/fmax to max rpsChris Wilson1-3/+5
ips uses clock delays as opposed to rps frequency bins. To fit the delays into the same rps calculations, we need to invert the ips delays. Fixes: 3e7abf814193 ("drm/i915: Extract GT render power state management") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191026200917.1780-1-chris@chris-wilson.co.uk
2019-10-26drm/i915: Extract GT render power state managementAndi Shyti1-0/+1872
i915_irq.c is large. One reason for this is that has a large chunk of the GT render power management stashed away in it. Extract that logic out of i915_irq.c and intel_pm.c and put it under one roof. Based on a patch by Chris Wilson. Signed-off-by: Andi Shyti <andi.shyti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-1-chris@chris-wilson.co.uk