kernel/linux.git/drivers/gpu/drm/xe/instructions, branch v6.12.80

drm/xe: Save CTX_TIMESTAMP mmio value instead of LRC value

2025-05-22T12:29:42+00:00

[ Upstream commit 66c8f7b435bddb7d8577ac8a57e175a6cb147227 ] For determining actual job execution time, save the current value of the CTX_TIMESTAMP register rather than the value saved in LRC since the current register value is the closest to the start time of the job. v2: Define MI_STORE_REGISTER_MEM to fix compile error v3: Place MI_STORE_REGISTER_MEM sorted by MI_INSTR (Lucas) Fixes: 65921374c48f ("drm/xe: Emit ctx timestamp copy in ring ops") Signed-off-by: Umesh Nerlige Ramappa Reviewed-by: Matthew Brost Reviewed-by: Lucas De Marchi Link: https://lore.kernel.org/r/20250509161159.2173069-6-umesh.nerlige.ramappa@intel.com (cherry picked from commit 38b14233e5deff51db8faec287b4acd227152246) Signed-off-by: Lucas De Marchi Signed-off-by: Sasha Levin

drm/xe/oa: Add OAR support

2024-06-18T19:40:38+00:00

Add OAR support to allow userspace to execute MI_REPORT_PERF_COUNT on render engines. Configuration batches are used to program the OAR unit, as well as modifying the render engine context image of a specified exec queue (to have correct register values when that context switches in). v2: Rename/refactor xe_oa_modify_self (Umesh) v3: Move IS_MI_LRI_CMD() into xe_oa.c (Michal) Acked-by: Rodrigo Vivi Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Ashutosh Dixit Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-11-ashutosh.dixit@intel.com

drm/xe: Add MI_COPY_MEM_MEM GPU instruction definitions

2024-06-13T02:10:19+00:00

MI_COPY_MEM_MEM GPU instructions are used to copy ctx timestamp from a LRC registers to another location at the beginning of every jobs execution. Add MI_COPY_MEM_MEM GPU instruction definitions. v2: - Include MI_COPY_MEM_MEM based on instruction order (Michal) - Fix tabs/spaces issue (Michal) - Use macro for DW definition (Michal) Signed-off-by: Matthew Brost Reviewed-by: Jonathan Cavitt Link: https://patchwork.freedesktop.org/patch/msgid/20240611144053.2805091-3-matthew.brost@intel.com

drm/xe: Move xe_gpu_commands.h file to instructions/

2024-05-09T19:17:57+00:00

All other files with commands definitions are in instructions/ folder. Move xe_gpu_commands.h also there. Signed-off-by: Michal Wajdeczko Reviewed-by: Himal Prasad Ghimiray Link: https://patchwork.freedesktop.org/patch/msgid/20240508174856.1908-1-michal.wajdeczko@intel.com

drm/xe: Add LRC parsing for more GPU instructions

2024-02-29T20:39:16+00:00

The LRCs on some of our newer platforms appear to contain a few GPU instructions that weren't handled in our LRC parser. Add the relevant instruction names and opcodes so that our debugfs LRC dumps will properly indicate what these are. Bspec: 55866, 64848, 46931 Signed-off-by: Matt Roper Reviewed-by: Ravi Kumar Vodapalli Link: https://patchwork.freedesktop.org/patch/msgid/20240222184009.6857-2-matthew.d.roper@intel.com

drm/xe: Add command MI_LOAD_REGISTER_MEM

2023-12-21T21:31:29+00:00

We will need this shortly during context state preparation. Cc: Matt Roper Reviewed-by: Matt Roper Link: https://lore.kernel.org/r/20231214185955.1791-2-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko

drm/xe/gsc: Query GSC compatibility version

2023-12-21T16:45:06+00:00

The version is obtained via a dedicated MKHI GSC HECI command. The compatibility version is what we want to match against for the GSC, so we need to call the FW version checker after obtaining the version. Since this is the first time we send a GSC HECI command via the GSCCS, this patch also introduces common infrastructure to send such commands to the GSC. Communication with the GSC FW is done via input/output buffers, whose addresses are provided via a GSCCS command. The buffers contain a generic header and a client-specific packet (e.g. PXP, HDCP); the clients don't care about the header format and/or the GSCCS command in the batch, they only care about their client-specific header. This patch therefore introduces helpers that allow the callers to automatically fill in the input header, submit the GSCCS job and decode the output header, to make it so that the caller only needs to worry about their client-specific input and output messages. v3: squash of 2 separate patches ahead of merge, so that the common functions and their first user are added at the same time Signed-off-by: Daniele Ceraolo Spurio Cc: Alan Previn Cc: Suraj Kandpal Cc: John Harrison Reviewed-by: John Harrison #v1 Reviewed-by: Suraj Kandpal Signed-off-by: Rodrigo Vivi

drm/xe/gsc: GSC FW load

2023-12-21T16:45:06+00:00

The GSC FW must be copied in a 4MB stolen memory allocation, whose GGTT address is then passed as a parameter to a dedicated load instruction submitted via the GSC engine. Since the GSC load is relatively slow (up to 250ms), we perform it asynchronously via a worker. This requires us to make sure that the worker has stopped before suspending/unloading. Note that we can't yet use xe_migrate_copy for the copy because it doesn't work with stolen memory right now, so we do a memcpy from the CPU side instead. v2: add comment about timeout value, fix GSC status checking before load (John) Bspec: 65306, 65346 Signed-off-by: Daniele Ceraolo Spurio Cc: Alan Previn Cc: John Harrison Reviewed-by: John Harrison Signed-off-by: Rodrigo Vivi

drm/xe/xe2: Update SVG state handling

2023-12-21T16:43:22+00:00

As with DG2/MTL, Xe2 also fails to emit instruction headers for SVG state instructions if no explicit state has been set. The SVG part of the LRC is nearly identical to DG2/MTL; the only change is that 3DSTATE_DRAWING_RECTANGLE has been replaced by 3DSTATE_DRAWING_RECTANGLE_FAST, so we can just re-use the same state table and handle that single instruction when we encounter it. Bspec: 65182 Reviewed-by: Balasubramani Vivekanandan Link: https://lore.kernel.org/r/20231025151732.3461842-8-matthew.d.roper@intel.com Signed-off-by: Matt Roper Signed-off-by: Rodrigo Vivi

drm/xe: Emit SVG state on RCS during driver load on DG2 and MTL

2023-12-21T16:43:22+00:00

When recording the default LRC, the expectation is that the hardware's original state settings (both register and instruction) will be written out to the LRC upon first context switch. For many 3DSTATE_* state instructions that don't truly have "default" values, this translates to a simple instruction header (opcodes + dword length) being written to the LRC, followed by an appropriate number of blank dwords as a place holder. When userspace creates a context (which starts as a copy of the default LRC), they'll generally emit real 3DSTATE_* as part of their initialization to select the settings they desire. If they don't emit one of the 3DSTATE instructions, then the zeroed dwords that remain in their LRC image generally translate to various state remaining disabled. This will either be what userspace wants or will lead to very reproducible and easily-debugged problems (rendering glitches, engine hangs). It turns out that a subset of the 3DSTATE instructions, specifically those belonging to the SVG (State Variable - Global) unit, are not only emitting 0's for the instruction's "body" dwords, but also for the instruction header dword if no specific state has been explicitly set before context switch. This means that when the hardware switches to a context that hasn't explicitly provided an appropriate state setting, the hardware will just see a sequence of NOOPs in the spot reserved for that 3DSTATE instruction while executing the LRC, and the actual hardware state setting will unintentionally inherit the configuration used by the previously running context. Now when userspace makes a mistake and forgets to emit an important state instruction they no longer get consistent, easily-reproducible corruption/hangs, but rather erratic behavior where the presence/absence of a problem depends on what other workloads are running on the system and what order the contexts are scheduled on the engine. A specific example of this that came up recently related to mesh shading The OpenGL driver was not specifically emitting a 3DSTATE_MESH_CONTROL to disable mesh shading at context init, so on context switch, mesh shading would either be on or off depending on what the previous context had been doing. Vulkan apps _were_ enabling mesh shading, so running a Vulkan app and then context switching to an OpenGL app resulted in mesh shading still unexpectedly being enabled during OpenGL operation, and since other Mesh-related state was not properly initialized for that context a GPU hang was seen. Due to the specific ordering requirements (Vulkan app runs first, followed by OpenGL app), it took additional debug effort to track down the cause of the problem. There are various workarounds related to this behavior, with current implementations handled in the userspace drivers. E.g., Wa_14019789679 and Wa_22018402687. However it's been suggested that the kernel driver can help simplify things here by emitting zeroed SVG state with proper instruction headers as part of our default context creation (i.e., at the same point we apply LRC workarounds). This will help ensure that any future cases where a userspace driver does not emit an important state setting will result in consistent behavior. Bspec: 46261 Reviewed-by: Balasubramani Vivekanandan Link: https://lore.kernel.org/r/20231025151732.3461842-7-matthew.d.roper@intel.com Signed-off-by: Matt Roper Signed-off-by: Rodrigo Vivi