Age | Commit message (Collapse) | Author | Files | Lines |
|
The intent of the version check in the mmap ioctl was to maintain
support for existing platforms (i.e., ADL/RPL and earlier), but drop
support on all future igpu platforms. As we've seen on the dgpu side,
the hardware teams are using a more fine-grained numbering system for IP
version numbers these days, so it's possible the version number
associated with our next igpu could be some form of "12.xx" rather than
13 or higher. Comparing against the full ver.release number will ensure
the intent of the check is maintained no matter what numbering the
hardware teams settle on.
Fixes: d3f3baa3562a ("drm/i915: Reinstate the mmap ioctl for some platforms")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220407161839.1073443-1-matthew.d.roper@intel.com
|
|
Freq caps (i.e. RP0, RP1 and RPn frequencies) are read from HW. However the
formats (bit positions, widths, registers and units) of these vary for
different generations with even more variations arriving in the future. In
order not to have to do identical computation for these caps in multiple
places, here we centralize the computation of these caps. This makes the
code cleaner and also more extensible for the future.
v2: Clarify that caps are in "hw units" in comments (Lucas De Marchi)
v3: Minor checkpatch fix
v4: s/intel_rps_get_freq_caps/gen6_rps_get_freq_caps/ (Badal Nilawar)
v5: Changes comments to kernel doc (Anshuman Gupta)
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Acked-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406191848.20895-1-ashutosh.dixit@intel.com
|
|
Ensure we account for potential rounding up of lmem objects.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5485
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406173023.1039107-1-matthew.auld@intel.com
|
|
The intel-gtt module is not used on other, non-x86 platforms, so we
will restrict it to x86 platforms only.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220330234809.1218210-3-casey.g.bowman@intel.com
|
|
Some functions defined in the intel-gtt module are used in several
areas, but is only supported on x86 platforms.
By separating these calls and their static underlying functions to
another area, we are able to compile out these functions for
non-x86 builds and provide stubs for the non-x86 implementations.
In addition to the problematic calls, we are moving the gmch-related
functions to the new area.
Signed-off-by: Casey Bowman <casey.g.bowman@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220330234809.1218210-2-casey.g.bowman@intel.com
|
|
Mixup in rebasing and patchwork re-runs made me push the wrong version of
the patch. Or I even forgot to send out the fixed version. Fix it up.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 49bd54b390c2 ("drm/i915: Track all user contexts per client")
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220405155345.3292769-1-tvrtko.ursulin@linux.intel.com
|
|
Similar to AMD commit
874442541133 ("drm/amdgpu: Add show_fdinfo() interface"), using the
infrastructure added in previous patches, we add basic client info
and GPU engine utilisation for i915.
Example of the output:
pos: 0
flags: 0100002
mnt_id: 21
drm-driver: i915
drm-pdev: 0000:00:02.0
drm-client-id: 7
drm-engine-render: 9288864723 ns
drm-engine-copy: 2035071108 ns
drm-engine-video: 0 ns
drm-engine-video-enhance: 0 ns
v2:
* Update for removal of name and pid.
v3:
* Use drm_driver.name.
v4:
* Added drm-engine-capacity- tag.
* Fix typo. (Umesh)
v5:
* Don't output engine data before Gen8.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David M Nieto <David.Nieto@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Chris Healy <cphealy@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-9-tvrtko.ursulin@linux.intel.com
|
|
This will be useful to have at hand in a following patch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-8-tvrtko.ursulin@linux.intel.com
|
|
Proposal to standardise the fdinfo text format as optionally output by DRM
drivers.
Idea is that a simple but, well defined, spec will enable generic
userspace tools to be written while at the same time avoiding a more heavy
handed approach of adding a mid-layer to DRM.
i915 implements a subset of the spec, everything apart from the memory
stats currently, and a matching intel_gpu_top tool exists.
Open is to see if AMD can migrate to using the proposed GPU utilisation
key-value pairs, or if they are not workable to see whether to go
vendor specific, or if a standardised alternative can be found which is
workable for both drivers.
Same for the memory utilisation key-value pairs proposal.
v2:
* Update for removal of name and pid.
v3:
* 'Drm-driver' tag will be obtained from struct drm_driver.name. (Daniel)
v4:
* Added drm-engine-capacity- tag.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: David M Nieto <David.Nieto@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Daniel Stone <daniel@fooishbar.org>
Cc: Chris Healy <cphealy@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-7-tvrtko.ursulin@linux.intel.com
|
|
Track context active (on hardware) status together with the start
timestamp.
This will be used to provide better granularity of context
runtime reporting in conjunction with already tracked pphwsp accumulated
runtime.
The latter is only updated on context save so does not give us visibility
to any currently executing work.
As part of the patch the existing runtime tracking data is moved under the
new ce->stats member and updated under the seqlock. This provides the
ability to atomically read out accumulated plus active runtime.
v2:
* Rename and make __intel_context_get_active_time unlocked.
v3:
* Use GRAPHICS_VER.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-6-tvrtko.ursulin@linux.intel.com
|
|
We soon want to start answering questions like how much GPU time is the
context belonging to a client which exited still using.
To enable this we start tracking all context belonging to a client on a
separate list.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-5-tvrtko.ursulin@linux.intel.com
|
|
As contexts are abandoned we want to remember how much GPU time they used
(per class) so later we can used it for smarter purposes.
As GEM contexts are closed we want to have the DRM client remember how
much GPU time they used (per class) so later we can used it for smarter
purposes.
v2:
* Size past runtimes array by uabi class, not internal.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-4-tvrtko.ursulin@linux.intel.com
|
|
Make GEM contexts keep a reference to i915_drm_client for the whole of
of their lifetime which will come handy in following patches.
v2: Don't bother supporting selftests contexts from debugfs. (Chris)
v3 (Lucas): Finish constructing ctx before adding it to the list
v4 (Ram): Rebase.
v5: Trivial rebase for proto ctx changes.
v6: Rebase after clients no longer track name and pid.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v5
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v5
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-3-tvrtko.ursulin@linux.intel.com
|
|
Tracking DRM clients more explicitly will allow later patches to
accumulate past and current GPU usage in a centralised place and also
consolidate access to owning task pid/name.
Unique client id is also assigned for the purpose of distinguishing/
consolidating between multiple file descriptors owned by the same process.
v2:
Chris Wilson:
* Enclose new members into dedicated structs.
* Protect against failed sysfs registration.
v3:
* sysfs_attr_init.
v4:
* Fix for internal clients.
v5:
* Use cyclic ida for client id. (Chris)
* Do not leak pid reference. (Chris)
* Tidy code with some locals.
v6:
* Use xa_alloc_cyclic to simplify locking. (Chris)
* No need to unregister individial sysfs files. (Chris)
* Rebase on top of fpriv kref.
* Track client closed status and reflect in sysfs.
v7:
* Make drm_client more standalone concept.
v8:
* Simplify sysfs show. (Chris)
* Always track name and pid.
v9:
* Fix cyclic id assignment.
v10:
* No need for a mutex around xa_alloc_cyclic.
* Refactor sysfs into own function.
* Unregister sysfs before freeing pid and name.
* Move clients setup into own function.
v11:
* Call clients init directly from driver init. (Chris)
v12:
* Do not fail client add on id wrap. (Maciej)
v13 (Lucas): Rebase.
v14:
* Dropped sysfs bits.
v15:
* Dropped tracking of pid/ and name.
* Dropped RCU freeing of the client object.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v11
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v11
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-2-tvrtko.ursulin@linux.intel.com
|
|
New DG2 workaround added to specification.
BSpec: 54077
BSpec: 66622
BSpec: 54833
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220325142249.81443-1-jose.souza@intel.com
|
|
Move the sanity check that both src and dst are never both system
memory, which should never happen on discrete, and likely means we have
a bug. The only exception is on integrated where we trigger this path in
the selftests.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220324172143.377104-2-matthew.auld@intel.com
|
|
We only need this when allocating device local-memory, where this
influences the drm_buddy. Currently there is some funny behaviour where
an "in limbo" system memory object is lacking the relevant placement
flags etc. before we first allocate the ttm_tt, leading to ttm
performing a move when not needed, since the current placement is seen
as not compatible.
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 2ed38cec5606 ("drm/i915: opportunistically apply ALLOC_CONTIGIOUS")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220324172143.377104-1-matthew.auld@intel.com
|
|
GPU hangs have been observed when multiple engines write to the
same aux_inv register at the same time. To avoid this each engine
should only invalidate its own auxiliary table. The function
gen12_emit_flush_xcs() currently invalidate the auxiliary table for
all engines because the rq->engine is not necessarily the engine
eventually carrying out the request, and potentially the engine
could even be a virtual one (with engine->instance being -1).
With the MMIO remap feature, we can actually set bit 17 of MI_LRI
instruction and let the hardware to figure out the local aux_inv
register at runtime to avoid invalidating auxiliary table for all
engines.
Bspec: 45728
v2: Invalidate AUX table for indirect context as well.
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Signed-off-by: Fei Yang <fei.yang@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220328171650.1900674-1-fei.yang@intel.com
|
|
lmem_size is used to limit the amount of lmem for testing purposes.
Default is to use hardware available lmem size.
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220324143123.348590-2-matthew.auld@intel.com
|
|
On error the "new" allocation is not freed, so add the required kfree.
Fixes: 247f8071d5893 ("drm/i915/guc: Pre-allocate output nodes for extraction")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220324000439.2370440-1-daniele.ceraolospurio@intel.com
|
|
UAPI with absolutely no documentation should not have been added -
clarify blob format and content will be described externally.
Fixes: 78e1fb3112c0 ("drm/i915/uapi: Add query for hwconfig blob")
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Co-developed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: John Harrison <john.c.harrison@intel.com>
Cc: Jon Ewins <jon.ewins@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20220325094916.2186367-1-tvrtko.ursulin@linux.intel.com
[tursulin: Fixed spelling s/meading/meaning/.]
|
|
Print the GuC captured error state register list (string names
and values) when gpu_coredump_state printout is invoked via
the i915 debugfs for flushing the gpu error-state that was
captured prior.
Since GuC could have reported multiple engine register dumps
in a single notification event, parse the captured data
(appearing as a stream of structures) to identify each dump as
a different 'engine-capture-group-output'.
Finally, for each 'engine-capture-group-output' that is found,
verify if the engine register dump corresponds to the
engine_coredump content that was previously populated by the
i915_gpu_coredump function. That function would have copied
the context's vma's including the bacth buffer during the
G2H-context-reset notification that occurred earlier. Perform
this verification check by comparing guc_id, lrca and engine-
instance obtained from the 'engine-capture-group-output' vs a
copy of that same info taken during i915_gpu_coredump. If
they match, then print those vma's as well (such as the batch
buffers).
NOTE: the output format was verified using the gem_exec_capture
IGT test.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-14-alan.previn.teres.alexis@intel.com
|
|
Add a flags parameter through all of the coredump creation
functions. Add a bitmask flag to indicate if the top
level gpu_coredump event is triggered in response to
a GuC context reset notification.
Using that flag, ensure all coredump functions that
read or print mmio-register values related to work submission
or command-streamer engines are skipped and replaced with
a calls guc-capture module equivalent functions to retrieve
or print the register dump.
While here, split out display related register reading
and printing into its own function that is called agnostic
to whether GuC had triggered the reset.
For now, introduce an empty printing function that can
filled in on a subsequent patch just to handle formatting.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-13-alan.previn.teres.alexis@intel.com
|
|
In the rare but possible scenario where we are in the midst of
multiple GuC error-capture (and engine reset) events and the
user also triggers a forced full GT reset or the internal watchdog
triggers the same, intel_guc_submission_reset_prepare's call
to flush_work(&guc->ct.requests.worker) can cause the G2H message
handler to trigger intel_guc_capture_store_snapshot upon
receiving new G2H error-capture notifications. This can happen
despite the prior call to disable_submission(guc);. However,
there's no race-free way for intel_guc_capture_store_snapshot to
know that we are in the midst of a reset. That said, we can never
dynamically allocate the output nodes in this handler. Thus, we
shall pre-allocate a fixed number of empty nodes up front (at the
time of ADS registration) that we can consume from or return to
an internal cached list of nodes.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-12-alan.previn.teres.alexis@intel.com
|
|
- Upon the G2H Notify-Err-Capture event, parse through the
GuC Log Buffer (error-capture-subregion) and generate one or
more capture-nodes. A single node represents a single "engine-
instance-capture-dump" and contains at least 3 register lists:
global, engine-class and engine-instance. An internal link
list is maintained to store one or more nodes.
- Because the link-list node generation happen before the call
to i915_gpu_codedump, duplicate global and engine-class register
lists for each engine-instance register dump if we find
dependent-engine resets in a engine-capture-group.
- When i915_gpu_coredump calls into capture_engine, (in a
subsequent patch) we detach the matching node (guc-id,
LRCA, etc) from the link list above and attach it to
i915_gpu_coredump's intel_engine_coredump structure when have
matching LRCA/guc-id/engine-instance.
Additional notes to be aware of:
- GuC generates the error capture dump into the GuC log buffer but
this buffer is one big log buffer with 3 independent subregions
within it. Each subregion is populated with different content
and used in different ways and timings but all regions operate
behave as independent ring buffers. Each guc-log subregion
(general-logs, crash-dump and error- capture) has it's own
guc_log_buffer_state that contain independent read and write
pointers.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-11-alan.previn.teres.alexis@intel.com
|
|
Add intel_guc_capture_output_min_size_est function to
provide a reasonable minimum size for error-capture
region before allocating the shared buffer.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-10-alan.previn.teres.alexis@intel.com
|
|
GuC log buffer regions for debug-log-events, crash-dumps and
error-state-capture are all part of a single bo allocation that
also includes the guc_log_buffer_state structures. Now that we
support it, increase the size allocation for error-capture.
Since the error-capture region is accessed at non-deterministic
times (as part of GuC triggered context reset) while debug-log-
events region is accessed as part of relay logging or during
debugfs triggered dumps, move the mapping and unmapping of the
shared buffer into intel_guc_log_create and intel_guc_log_destroy
so that it's always mapped throughout life of GuC operation.
Additionally, while here, update the guc log region layout
diagram to follow the order according to the enum definition
as per the GuC interface.
NOTE: A future effort to visit (part of baseline code) is that
buf_addr should be updated to be a io_sys_map and use the
io_sys_map wrapper functions to access the various GuC log
buffer regions.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-9-alan.previn.teres.alexis@intel.com
|
|
For the sake of better code readibility, change previous
relay logging function names with "capture_logs" to
"copy_debug_logs" to differentiate from error capture
functions that will use a different region of the same buffer.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-8-alan.previn.teres.alexis@intel.com
|
|
Add GuC's error capture output structures and definitions as how
they would appear in GuC log buffer's error capture subregion after
an error state capture G2H event notification.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-7-alan.previn.teres.alexis@intel.com
|
|
Abstract out a Gen9 register list as the default for all other
platforms we don't yet formally support GuC submission on.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-6-alan.previn.teres.alexis@intel.com
|
|
Add additional DG2 registers for GuC error state capture.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-5-alan.previn.teres.alexis@intel.com
|
|
Add the ability for runtime allocation and freeing of
steered register list extentions that depend on the
detected HW config fuses.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-4-alan.previn.teres.alexis@intel.com
|
|
Add device specific tables and register lists to cover different engines
class types for GuC error state capture for XE_LP products.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-3-alan.previn.teres.alexis@intel.com
|
|
Update GuC ADS size allocation to include space for
the lists of error state capture register descriptors.
Then, populate GuC ADS with the lists of registers we want
GuC to report back to host on engine reset events. This list
should include global, engine-class and engine-instance
registers for every engine-class type on the current hardware.
Ensure we allocate a persistent store for the register lists
that are populated into ADS so that we don't need to allocate
memory during GT resets when GuC is reloaded and ADS population
happens again.
NOTE: Start with a sample static table of register lists to
layout the framework before adding real registers in subsequent
patch. This static register tables are a different format from
the ADS populated list.
Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-2-alan.previn.teres.alexis@intel.com
|
|
Replace all occurrence of cache_clflush_range with drm_clflush_virt_range.
This will prevent compile errors on non-x86 platforms.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-6-michael.cheng@intel.com
|
|
Use drm_clflush_virt_range instead of clflushopt and remove the memory
barrier, since drm_clflush_virt_range takes care of that.
v2(Michael Cheng): Use sizeof(*addr) instead of sizeof(addr) to get the
actual size of the page. Thanks to Matt Roper for
pointing this out.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-5-michael.cheng@intel.com
|
|
Use drm_clflush_virt_range instead of directly invoking clflush. This
will prevent compiler errors when building for non-x86 architectures.
v2(Michael Cheng): Remove extra clflush
v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range
takes care of it.
v4(Michael Cheng): Get the size of value and not the size of the pointer
when passing in execlists->csb_write. Thanks to Matt
Roper for pointing this out.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-4-michael.cheng@intel.com
|
|
Drop invalidate_csb_entries and directly call drm_clflush_virt_range.
This allows for one less function call, and prevent complier errors when
building for non-x86 architectures.
v2(Michael Cheng): Drop invalidate_csb_entries function and directly
invoke drm_clflush_virt_range. Thanks to Tvrtko for the
sugguestion.
v3(Michael Cheng): Use correct parameters for drm_clflush_virt_range.
Thanks to Tvrtko for pointing this out.
v4(Michael Cheng): Simplify &execlists->csb_status[0] to
execlists->csb_status. Thanks to Matt Roper for the
suggestion.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-3-michael.cheng@intel.com
|
|
Re-work intel_write_status_page to use drm_clflush_virt_range. This
will prevent compiler errors when building for non-x86 architectures.
Signed-off-by: Michael Cheng <michael.cheng@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-2-michael.cheng@intel.com
|
|
The initialization is there only to silence the compiler, but use the
correct initializer for i915_reg_t.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321135955.922791-1-jani.nikula@intel.com
|
|
Change functions that always return '0' to be void type.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321122759.227091-1-andi.shyti@linux.intel.com
|
|
On platforms capable of allowing 8K (7680 x 4320) modes, pinning 2 or
more framebuffers/scanout buffers results in only one that is mappable/
fenceable. Therefore, pageflipping between these 2 FBs where only one
is mappable/fenceable creates latencies large enough to miss alternate
vblanks thereby producing less optimal framerate.
This mainly happens because when i915_gem_object_pin_to_display_plane()
is called to pin one of the FB objs, the associated vma is identified
as misplaced -- because there is no space for it in the aperture --
and therefore i915_vma_unbind() is called which unbinds and evicts it.
This misplaced vma gets subseqently pinned only when
i915_gem_object_ggtt_pin_ww() is called without PIN_MAPPABLE. This whole
thing results in a latency of ~10ms and happens every other repaint cycle.
Therefore, to fix this issue, we just ensure that the misplaced VMA
does not get evicted when we try to pin it with PIN_MAPPABLE -- by
returning early if the mappable/fenceable flag is not set.
Testcase:
Running Weston and weston-simple-egl on an Alderlake_S (ADLS) platform
with a 8K@60 mode results in only ~40 FPS (compared to ~59 FPS with
this patch). Since upstream Weston submits a frame ~7ms before the
next vblank, the latencies seen between atomic commit and flip event
are 7, 24 (7 + 16.66), 7, 24..... suggesting that it misses the
vblank every other frame.
Here is the ftrace snippet that shows the source of the ~10ms latency:
i915_gem_object_pin_to_display_plane() {
0.102 us | i915_gem_object_set_cache_level();
i915_gem_object_ggtt_pin_ww() {
0.390 us | i915_vma_instance();
0.178 us | i915_vma_misplaced();
i915_vma_unbind() {
__i915_active_wait() {
0.082 us | i915_active_acquire_if_busy();
0.475 us | }
intel_runtime_pm_get() {
0.087 us | intel_runtime_pm_acquire();
0.259 us | }
__i915_active_wait() {
0.085 us | i915_active_acquire_if_busy();
0.240 us | }
__i915_vma_evict() {
ggtt_unbind_vma() {
gen8_ggtt_clear_range() {
10507.255 us | }
10507.689 us | }
10508.516 us | }
v2:
- Expand the code comments to describe the ping-pong issue.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321005431.1113890-1-vivek.kasireddy@intel.com
|
|
Throttling here refers to the GT frequency being clipped. Each of
the throttle reason attributes will have a 0 or 1 value depending
upon whether there is throttling and also the specific reason for
it.
The following is a brief description of the sysfs throttle
frequency attributes added:
- throttle_reason_status: when set indicates that there is GT
frequency clipping.
- throttle_reason_pl1: when set indicates that PBM PL1 (platform
or package PL1) has caused GT frequency clipping.
- throttle_reason_pl2: when set indicates that PBM PL2 or PL3
(platform or package PL2 or PL3) has caused GT frequency
clipping.
- throttle_reason_pl4: when set indicates that PL4 or IccMax has
caused GT frequency clipping.
- throttle_reason_thermal: when set indicates that Thermal event
has caused GT frequency clipping.
- throttle_reason_prochot: when set indicates that PROCHOT# has
caused GT frequency clipping.
- throttle_reason_ratl: when set indicates that Running Average
Thermal Limit has caused GT frequency clipping.
- throttle_reason_vr_thermalert: when set indicates that Hot VR
(any processor VR) has caused GT frequency clipping.
- throttle_reason_vr_tdc: when set indicates that VR TDC
(Thermal Design Current) has caused GT frequency clipping.
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Dale B Stimson <dale.b.stimson@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-8-andi.shyti@linux.intel.com
|
|
Now tiles have their own sysfs interfaces under the gt/
directory. Because RPS is a property that can be configured on a
tile basis, then each tile should have its own interface
The new sysfs structure will have a similar layout for the 4 tile
case:
/sys/.../card0
├── gt
│ ├── gt0
│ │ ├── id
│ │ ├── rc6_enable
│ │ ├── rc6_residency_ms
│ │ ├── rps_act_freq_mhz
│ │ ├── rps_boost_freq_mhz
│ │ ├── rps_cur_freq_mhz
│ │ ├── rps_max_freq_mhz
│ │ ├── rps_min_freq_mhz
│ │ ├── rps_RP0_freq_mhz
│ │ ├── rps_RP1_freq_mhz
│ │ └── rps_RPn_freq_mhz
. .
. .
. .
│ └── gtN
│ ├── id
│ ├── rc6_enable
│ ├── rc6_residency_ms
│ ├── rps_act_freq_mhz
│ ├── rps_boost_freq_mhz
│ ├── rps_cur_freq_mhz
│ ├── rps_max_freq_mhz
│ ├── rps_min_freq_mhz
│ ├── rps_RP0_freq_mhz
│ ├── rps_RP1_freq_mhz
│ └── rps_RPn_freq_mhz
├── gt_act_freq_mhz -+
├── gt_boost_freq_mhz |
├── gt_cur_freq_mhz | Original interface
├── gt_max_freq_mhz +─-> kept as existing ABI;
├── gt_min_freq_mhz | it points to gt0/
├── gt_RP0_freq_mhz |
├── gt_RP1_freq_mhz |
└── gt_RPn_freq_mhz -+
The existing interfaces have been kept in their original location
to preserve the existing ABI. They act on all the GTs: when
writing they loop through all the GTs and write the information
on each interface. When reading they provide the average value
from all the GTs.
This patch is not really adding exposing new interfaces (new
ABI) other than adapting the existing one to more tiles. In any
case this new set of interfaces will be a basic tool for system
managers and administrators when using i915.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-7-andi.shyti@linux.intel.com
|
|
Now tiles have their own sysfs interfaces under the gt/
directory. Because RC6 is a property that can be configured on a
tile basis, then each tile should have its own interface
The new sysfs structure will have a similar layout for the 4 tile
case:
/sys/.../card0
├── gt
│ ├── gt0
│ │ ├── id
│ │ ├── rc6_enable
│ │ ├── rc6_residency_ms
. . .
. . .
. .
│ └── gtN
│ ├── id
│ ├── rc6_enable
│ ├── rc6_residency_ms
│ .
│ .
│
└── power/ -+
├── rc6_enable | Original interface
├── rc6_residency_ms +-> kept as existing ABI;
. | it multiplexes over
. | the GTs
-+
The existing interfaces have been kept in their original location
to preserve the existing ABI. They act on all the GTs: when
reading they provide the average value from all the GTs.
This patch is not really adding exposing new interfaces (new
ABI) other than adapting the existing one to more tiles. In any
case this new set of interfaces will be a basic tool for system
managers and administrators when using i915.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-6-andi.shyti@linux.intel.com
|
|
Now that we have tiles we want each of them to have its own
interface. A directory "gt/" is created under "cardN/" that will
contain as many diroctories as the tiles.
In the coming patches tile related interfaces will be added. For
now the sysfs gt structure simply has an id interface related
to the current tile count.
The directory structure will follow this scheme:
/sys/.../card0
└── gt
├── gt0
│ └── id
:
:
└─- gtN
└── id
This new set of interfaces will be a basic tool for system
managers and administrators when using i915.
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-5-andi.shyti@linux.intel.com
|
|
On a multi-tile platform, each tile has its own registers + GGTT
space, and BAR 0 is extended to cover all of them.
Up to four GTs are supported in i915->gt[], with slot zero
shadowing the existing i915->gt0 to enable source compatibility
with legacy driver paths. A for_each_gt macro is added to iterate
over the GTs and will be used by upcoming patches that convert
various parts of the driver to be multi-gt aware.
Only the primary/root tile is initialized for now; the other
tiles will be detected and plugged in by future patches once the
necessary infrastructure is in place to handle them.
Signed-off-by: Abdiel Janulgue <abdiel.janulgue@gmail.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-4-andi.shyti@linux.intel.com
|
|
The "gt_is_root(struct intel_gt *gt)" helper return true if the
gt is the root gt, which means that its id is 0. Return false
otherwise.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-3-andi.shyti@linux.intel.com
|
|
With the upcoming multitile support each tile will have its own
local memory. Mark the current LMEM with the suffix '0' to
emphasise that it belongs to the root tile.
Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220318233938.149744-2-andi.shyti@linux.intel.com
|
|
Add logical mapping for VDBOXs. This mapping is required for
split-frame workloads, which otherwise fail with
00000000-F8C53528: [GUC] 0441-INVALID_ENGINE_SUBMIT_MASK
... if the application is using the logical id to reorder the engines and
then using it for the batch buffer submission. It's not a big problem on
media version 11 and 12 as they have only 2 instances of VCS and the
logical to physical mapping is monotonically increasing - if the
application is not using the logical id.
Changing it for the previous platforms allows the media driver
implementation for the next ones (12.50 and above) to be the same,
checking the logical id. It should also not introduce any bug for the
old versions of userspace not checking the id.
The mapping added here is the complete map needed by XEHPSDV. Previous
platforms with only 2 instances will just use a partial map and should
still work.
v2: Remove static from map variable (José)
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
[ Extend the mapping to media versions 11 and 12 and give proper
justification in the commit message why ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220316234538.434357-2-lucas.demarchi@intel.com
|