summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe/xe_gt.c
AgeCommit message (Collapse)AuthorFilesLines
2024-04-18drm/xe/gt: Abort driver load for sysfs creation failureHimal Prasad Ghimiray1-6/+10
Instead of allowing the driver to load with incomplete sysfs entries in case of sysfs creation failure, we should terminate the driver loading. This change ensures that the status of all gt associated sysfs entries creation is relayed to xe_gt_init, leading to a driver load abort if any sysfs creation failures occur. -v2 use err_force_wake label instead of new. (Lucas) Avoid unnecessary warn/error messages. (Lucas) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-6-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-18drm/xe: Remove sysfs only once on action add failureHimal Prasad Ghimiray1-1/+3
The drmm_add_action_or_reset function automatically invokes the action (sysfs removal) in the event of a failure; therefore, there's no necessity to call it within the return check. Modify the return type of xe_gt_ccs_mode_sysfs_init to int, allowing the caller to pass errors up the call chain. Should sysfs creation or drmm_add_action_or_reset fail, error propagation will prompt a driver load abort. -v2 Edit commit message (Nikula/Lucas) use err_force_wake label instead of new. (Lucas) Avoid unnecessary warn/error messages. (Lucas) Fixes: f3bc5bb4d53d ("drm/xe: Allow userspace to configure CCS mode") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-3-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-18drm/xe: Simplify function return using drmm_add_action_or_reset()Himal Prasad Ghimiray1-5/+1
Instead of assigning the value of drmm_add_action_or_reset() to err and returning err in case of failure and 0 in case of success, simply return the result of drmm_add_action_or_reset(). -v2: cleanup in xe_display too. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-2-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-18drm/xe: Remove useless mem_access during probeRodrigo Vivi1-9/+0
xe_pm_init is the very last thing during the xe_pci_probe(), hence these protections are useless from the point of view of ensuring that the device is awake. Let's remove it so we continue towards the goal of killing xe_device_mem_access. v2: Adding more cases v3: Provide a separate fix for xe_tile_init_noalloc return (Matt) Adding a new case where display HDCP init calls which are also called at display probe time. Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-5-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-16drm/xe/pf: Add SR-IOV PF specific early GT initializationMichal Wajdeczko1-0/+7
The PF driver must maintain additional GT level data per each VF. This additional per-VF data will be added in upcoming patches and will include: provisioning configuration (like GGTT space or LMEM allocation sizes or scheduling parameters), monitoring thresholds and counters, and more. As number of supported VFs varies across platforms use flexible array where first entry will contain metadata for the PF itself (if such configuration parameter is applicable for the PF) and all remaining entries will contain data for potential VFs. Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240415173937.1287-6-michal.wajdeczko@intel.com
2024-03-19drm/xe: Convert gt suspend/resume messages to debugRodrigo Vivi1-2/+4
Let's be quieter on production configuration and let's also print the entry point of the gt suspend when debug messages are enabled. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-02-29drm/xe/guc: Fix missing topology initZhanjun Dong1-2/+1
init_steering_dss need topology dss mask to be init ahead. Fixed by moving xe_gt_topology_init ahead of xe_gt_mcr_init Fixes: bf8ec3c3e82c ("drm/xe: Initialize GuC earlier during probe") Cc: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240227164922.281346-2-zhanjun.dong@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-26drm/xe: Convert gt_reset from mem_access to xe_pm_runtimeRodrigo Vivi1-3/+4
We need to ensure that device is in D0 on any kind of GT reset. We are likely already protected by outer bounds like exec, but if exec/sched ref gets dropped on a hang, we might transition to D3 before we are able to perform the gt_reset and recover. Suggested-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-13-rodrigo.vivi@intel.com
2024-02-26drm/xe: Remove mem_access from suspend and resume functionsRodrigo Vivi1-8/+0
At these points, we are sure that device is awake in D0. Likely in the middle of the transition, but awake. So, these extra protections are useless. Let's remove it and continue with the killing of xe_device_mem_access. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-12-rodrigo.vivi@intel.com
2024-02-20drm/xe: Initialize GuC earlier during probeMichał Winiarski1-18/+39
SR-IOV VF has limited access to MMIO registers. Fortunately, it is able to access a curated subset that is needed to initialize the driver by communicating with SR-IOV PF using GuC CT. Initialize GuC earlier in order to keep the unified probe ordering between VF and PF modes. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-4-michal.winiarski@intel.com
2024-02-20drm/xe/guc: Move GuC power control init to "post-hwconfig"Michał Winiarski1-3/+0
SLPC is not used at "hwconfig" stage. Move the initialization of data structures used for SLPC to a later point in probe. Also - move the xe_guc_pc_init_early to happen just prior to initial "hwconfig" load. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240219130530.1406044-3-michal.winiarski@intel.com
2024-02-20Merge drm/drm-next into drm-xe-nextLucas De Marchi1-1/+1
Bring changes from drm-misc-next that got merged in drm-next back to drm-xe so they can be used for additional features. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-02-09drm/xe: switch from drm_debug_printer() to device specific drm_dbg_printer()Jani Nikula1-1/+1
Prefer the device specific debug printer. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/35929b030f7ba67cd32808d42e916aa9cfb5709d.1705410327.git.jani.nikula@intel.com
2024-02-03drm/xe: Map both mem.kernel_bb_pool and usm.bb_poolMatthew Brost1-1/+4
For integrated devices we need to map both mem.kernel_bb_pool and usm.bb_pool to be able to run batches from both pools. Fixes: a682b6a42d4d ("drm/xe: Support device page faults on integrated platforms") Tested-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Brian Welty <brian.welty@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240202033440.2351862-1-matthew.brost@intel.com
2024-01-26drm/xe: Move TLB invalidation reset before HW resetMatthew Brost1-2/+2
This is a software reset which can be done immediately after stopping the UC. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122210156.1517444-3-matthew.brost@intel.com
2024-01-24drm/xe: Stash GMD_ID value in xe_gtMatt Roper1-0/+6
Although we've stored the major and minor versions for graphics/media in xe_device, it will be simpler to implement the uapi version query if we also stash the raw register value in the GT itself. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240123204454.246788-4-jose.souza@intel.com Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
2024-01-18drm/xe/gsc: Initialize GSC proxyDaniele Ceraolo Spurio1-0/+13
The GSC uC needs to communicate with the CSME to perform certain operations. Since the GSC can't perform this communication directly on platforms where it is integrated in GT, the graphics driver needs to transfer the messages from GSC to CSME and back. The proxy flow must be manually started after the GSC is loaded to signal to GSC that we're ready to handle its messages and allow it to query its init data from CSME. Note that the component must be removed before the pci_remove call completes, so we can't use a drmm helper for it and we need to instead perform the cleanup as part of the removal flow. v2: add function documentation, more targeted memory clear, clearer logs and variable names (Alan) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117182621.2653049-2-daniele.ceraolospurio@intel.com
2024-01-11drm/xe: Finish refactoring of exec_queue_createBrian Welty1-2/+2
Setting of exec_queue user extensions is moved from the end of the ioctl function earlier, into __xe_exec_queue_alloc(). This fixes bug in that the USM attributes for access counters were being applied too late, and effectively were ignored. However, in order to apply user extensions this early, we can no longer call q->ops functions. Instead, make it more efficient. The user extension functions can simply update the q->sched_props values and they will be applied by the backend during q->ops->init(). v2: minor changes for readability (Matt) Signed-off-by: Brian Welty <brian.welty@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com>
2023-12-21drm/xe/uapi: Remove reset uevent for nowRodrigo Vivi1-18/+0
This kernel uevent is getting removed for now. It will come back later with a better future proof name. v2: Rebase (Francois Dugast) Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Cc: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com> Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2023-12-21drm/xe/pmu: Remove PMU from Xe till uapi is finalizedAshutosh Dixit1-2/+0
PMU uapi is likely to change in the future. Till the uapi is finalized, remove PMU from Xe. PMU can be re-added after uapi is finalized. v2: Include xe_drm.h in xe/tests/xe_dma_buf.c (Francois) Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Acked-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Mateusz Naklicki <mateusz.naklicki@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Create a xe_gt_freq component for raw management and sysfsRodrigo Vivi1-0/+3
Goals of this new xe_gt_freq component: 1. Detach sysfs controls and raw freq management from GuC SLPC. 2. Create a directory that could later be aligned with devfreq. 3. Encapsulate all the freq control in a single directory. Although we only have one freq domain per GT, already start with a numbered freq0 directory so it could be expanded in the future if multiple domains or PLL are needed. Note: Although in the goal #1, the raw freq management control is mentioned, this patch only starts by the sysfs control. The RP freq configuration and init freq selection are still under the guc_pc, but should be moved to this component in a follow-up patch. v2: - Add /tile# to the doc and remove unnecessary kobject_put (Riana) - s/ssize_t/int on some ret variables (Vinay) Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename info.supports_* to info.has_*Lucas De Marchi1-1/+1
Rename supports_mmio_ext and supports_usm to use a has_ prefix so the flags are grouped together. This settles on just one variant for positive info matching ("has_") and one for negative ("skip_"). Also make sure the has_* flags are grouped together in xe_pci.c. Reviewed-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://lore.kernel.org/r/20231205145235.2114761-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/pf: Introduce Local Memory Translation TableMichal Wajdeczko1-0/+10
The Local Memory Translation Table (LMTT) provides additional abstraction for Virtual Functions (VF) accessing device VRAM. This code is based on prior work of Michal Winiarski. In this patch we focus only on LMTT initialization. Remaining LMTT functions will be used once we add a VF provisioning to the PF. Bspec: 44117, 52404, 59314 Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://lore.kernel.org/r/20231128151507.1015-4-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Allow userspace to configure CCS modeNiranjana Vishwanathapura1-0/+3
Allow user to configure the CCS mode setting through a 'ccs_mode' sysfs interface. Also report the current CCS mode configuration and number of compute slices available through this interface. v2: Rebase, make it platform agnostic v3: Separte out num_cslices sysfs interface and make xe_gt_ccs_mode_sysfs_init() return void Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Enable Fixed CCS mode settingNiranjana Vishwanathapura1-0/+10
Disable dynamic HW load balancing of compute resource assignment to engines and instead enabled fixed mode of mapping compute resources to engines on all platforms with more than one compute engine. By default enable only one CCS engine with all compute slices assigned to it. This is the desired configuration for common workloads. PVC platform supports only the fixed CCS mode (workaround 16016805146). v2: Rebase, make it platform agnostic v3: Minor code refactoring Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uc: Extract xe_uc_sanitize_resetMichał Winiarski1-0/+4
Earlier GuC load will require more fine-grained control over reset. Extract it outside of xe_uc_init_hw. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Move force_wake init to earlier point in probeMichał Winiarski1-2/+0
GuC will need to be loaded earlier during probe. And in order to load GuC, being able to take the forcewake is going to be needed. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename xe_gt_idle_sysfs to xe_gt_idleVinay Belgaumkar1-1/+1
Prep this file to contain C6 toggling as well instead of just sysfs related stuff. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/gsc: Implement WA 14015076503Daniele Ceraolo Spurio1-0/+5
When the GSC FW is loaded, we need to inform it when a GSCCS reset is coming and then wait 200ms for it to get ready to process the reset. v2: move WA code to GSC file, use variable in Makefile (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Remove GEN[0-9]*_ prefixesLucas De Marchi1-1/+1
After noticing in logs there were still mentions to GEN6 registers, it was clear commit d9b79ad275e7 ("drm/xe: Drop gen afixes from registers") didn't take care of all the afixes. Some were added later, but there are also constants and strings still using that. Continue the cleanup removing the remaining ones. To keep it consistent with code nearby, a few other changes are made: - Remove prefix in INTEL_LEGACY_64B_CONTEXT - Remove GEN8_CTX_L3LLC_COHERENT since it's unused - Rename GEN9_FREQ_SCALER to GT_FREQUENCY_SCALER v2: Use XELP_ as prefix for NUM_MOCS_ENTRIES and remove changes to MOCS_ENTRIES as this is now done as part of a previous commit (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231117174049.527192-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uapi: Add missing DRM_ prefix in uAPI constantsFrancois Dugast1-1/+1
Most constants defined in xe_drm.h use DRM_XE_ as prefix which is helpful to identify the name space. Make this systematic and add this prefix where it was missing. v2: - fix vertical alignment of define values - remove double DRM_ in some variables (José Roberto de Souza) v3: Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Raise GT frequency before GuC/HuC loadVinay Belgaumkar1-0/+4
Starting GT freq is usually RPn. Raising freq to RP0 will help speed up GuC load times. As an example, this data was collected on DG2- GuC Load time @RPn ~ 41 ms GuC Load time @RP0 ~ 11 ms v2: Raise GT freq before hwconfig init. This will speed up both HuC and GuC loads. Address review comments (Rodrigo). Also add a small usleep after requesting frequency which gives pcode some time to react. v3: Address checkpatch issue Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: implement driver initiated function-resetAndrzej Hajda1-0/+2
Driver initiated function-reset (FLR) is the highest level of reset that we can trigger from within the driver. In contrast to PCI FLR it doesn't require re-enumeration of PCI BAR. It can be useful in case GT fails to reset. It is also the only way to trigger GSC reset from the driver and can be used in future addition of GSC support. v2: - use regs from xe_regs.h - move the flag to xe.mmio - call flr only on root gt - use BIOS protection check - copy/paste comments from i915 v3: - flr code moved to xe_device.c v4: - needs_flr_on_fini moved to xe_device Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Emit SVG state on RCS during driver load on DG2 and MTLMatt Roper1-0/+1
When recording the default LRC, the expectation is that the hardware's original state settings (both register and instruction) will be written out to the LRC upon first context switch. For many 3DSTATE_* state instructions that don't truly have "default" values, this translates to a simple instruction header (opcodes + dword length) being written to the LRC, followed by an appropriate number of blank dwords as a place holder. When userspace creates a context (which starts as a copy of the default LRC), they'll generally emit real 3DSTATE_* as part of their initialization to select the settings they desire. If they don't emit one of the 3DSTATE instructions, then the zeroed dwords that remain in their LRC image generally translate to various state remaining disabled. This will either be what userspace wants or will lead to very reproducible and easily-debugged problems (rendering glitches, engine hangs). It turns out that a subset of the 3DSTATE instructions, specifically those belonging to the SVG (State Variable - Global) unit, are not only emitting 0's for the instruction's "body" dwords, but also for the instruction header dword if no specific state has been explicitly set before context switch. This means that when the hardware switches to a context that hasn't explicitly provided an appropriate state setting, the hardware will just see a sequence of NOOPs in the spot reserved for that 3DSTATE instruction while executing the LRC, and the actual hardware state setting will unintentionally inherit the configuration used by the previously running context. Now when userspace makes a mistake and forgets to emit an important state instruction they no longer get consistent, easily-reproducible corruption/hangs, but rather erratic behavior where the presence/absence of a problem depends on what other workloads are running on the system and what order the contexts are scheduled on the engine. A specific example of this that came up recently related to mesh shading The OpenGL driver was not specifically emitting a 3DSTATE_MESH_CONTROL to disable mesh shading at context init, so on context switch, mesh shading would either be on or off depending on what the previous context had been doing. Vulkan apps _were_ enabling mesh shading, so running a Vulkan app and then context switching to an OpenGL app resulted in mesh shading still unexpectedly being enabled during OpenGL operation, and since other Mesh-related state was not properly initialized for that context a GPU hang was seen. Due to the specific ordering requirements (Vulkan app runs first, followed by OpenGL app), it took additional debug effort to track down the cause of the problem. There are various workarounds related to this behavior, with current implementations handled in the userspace drivers. E.g., Wa_14019789679 and Wa_22018402687. However it's been suggested that the kernel driver can help simplify things here by emitting zeroed SVG state with proper instruction headers as part of our default context creation (i.e., at the same point we apply LRC workarounds). This will help ensure that any future cases where a userspace driver does not emit an important state setting will result in consistent behavior. Bspec: 46261 Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20231025151732.3461842-7-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Prepare to emit non-register state while recording default LRCMatt Roper1-1/+9
On some platforms we need to emit some non-register state while recording an engine class' default LRC. Add the infrastructure to support this; actual per-platform tables will be added in future patches. v2: - Checkpatch whitespace fix - Add extra assertion to ensure num_dw != 0. (Bala) Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20231025151732.3461842-6-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Extract MI_* instructions to their own headerMatt Roper1-0/+1
Extracting the common MI_* instructions that can be used with any engine to their own header will make it easier as we add additional engine instructions in upcoming patches. Also, since the majority of GPU instructions (both MI and non-MI) have a "length" field in bits 7:0 of the instruction header, a common define is added for that. Instruction-specific length fields are still defined for special case instructions that have larger/smaller length fields. v2: - Use "instr" instead of "inst" as the short form of "instruction" everywhere. (Lucas) - Include xe_reg_defs.h instead of the i915 compat header. (Lucas) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-12-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Separate number of registers from MI_LRI opcodeMatt Roper1-1/+1
Keeping the number of registers to be loaded as a separate macro from the instruction opcode will simplify some upcoming LRC parsing code. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231016163449.1300701-10-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/gt: Dump PAT table when failing to initializeLucas De Marchi1-0/+12
When failing on early initialization, one cause may be that the PAT configuration is not correct. Dump it for ease of debugging. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20231006182325.3617685-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: do not register to PM if GuC is disabledOhad Sharabi1-4/+0
When working without GuC (i.e. working with execlists), the flow attempts to perform suspend operation which is failing due to a lack of support without GuC. If PM ops are not supported without GuC we may as well avoid PM registration rather than returning errors from various PM flows. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/uc: Rename guc_submission_enabled() to uc_enabled()Daniele Ceraolo Spurio1-2/+2
The guc_submission_enabled() function is being used as a boolean toggle for all firmwares and all related features, not just GuC submission. We could add additional flags/functions to distinguish and allow different use-cases (e.g. loading HuC but not using GuC submission), but given that not using GuC is a debug-only scenario having a global switch for all FWs is enough. However, we want to make it clear that this switch turns off everything, so rename it to uc_enabled(). v2: rebase on s/XE_WARN_ON/xe_assert Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe/pmu: Enable PMU interfaceAravind Iddamsetty1-0/+2
There are a set of engine group busyness counters provided by HW which are perfect fit to be exposed via PMU perf events. BSPEC: 46559, 46560, 46722, 46729, 52071, 71028 events can be listed using: perf list xe_0000_03_00.0/any-engine-group-busy-gt0/ [Kernel PMU event] xe_0000_03_00.0/copy-group-busy-gt0/ [Kernel PMU event] xe_0000_03_00.0/interrupts/ [Kernel PMU event] xe_0000_03_00.0/media-group-busy-gt0/ [Kernel PMU event] xe_0000_03_00.0/render-group-busy-gt0/ [Kernel PMU event] and can be read using: perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000 time counts unit events 1.001139062 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 2.003294678 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 3.005199582 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 4.007076497 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 5.008553068 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 6.010531563 43520 ns xe_0000_8c_00.0/render-group-busy-gt0/ 7.012468029 44800 ns xe_0000_8c_00.0/render-group-busy-gt0/ 8.013463515 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 9.015300183 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 10.017233010 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ 10.971934120 0 ns xe_0000_8c_00.0/render-group-busy-gt0/ The pmu base implementation is taken from i915. v2: Store last known value when device is awake return that while the GT is suspended and then update the driver copy when read during awake. v3: 1. drop init_samples, as storing counters before going to suspend should be sufficient. 2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and dropped helpers to store and read samples. 3. use xe_device_mem_access_get_if_ongoing to check if device is active before reading the OA registers. 4. dropped format attr as no longer needed 5. introduce xe_pmu_suspend to call engine_group_busyness_store 6. few other nits. v4: minor nits. v5: take forcewake when accessing the OAG registers v6: 1. drop engine_busyness_sample_type 2. update UAPI documentation v7: 1. update UAPI documentation 2. drop MEDIA_GT specific change for media busyness counter. Co-developed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Co-developed-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Use Xe assert macros instead of XE_WARN_ON macroFrancois Dugast1-0/+1
The XE_WARN_ON macro maps to WARN_ON which is not justified in many cases where only a simple debug check is needed. Replace the use of the XE_WARN_ON macro with the new xe_assert macros which relies on drm_*. This takes a struct drm_device argument, which is one of the main changes in this commit. The other main change is that the condition is reversed, as with XE_WARN_ON a message is displayed if the condition is true, whereas with xe_assert it is if the condition is false. v2: - Rebase - Keep WARN splats in xe_wopcm.c (Matt Roper) v3: - Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Fix LRC workaroundsLucas De Marchi1-7/+34
Fix 2 issues when writing LRC workarounds by copying the same handling done when processing other RTP entries: For masked registers, it was not correctly setting the upper 16bits. Differently than i915, the entry itself doesn't set the upper bits for masked registers: this is done when applying them. Testing on ADL-P: Before: [drm:xe_gt_record_default_lrcs [xe]] LRC WA rcs0 save-restore MMIOs [drm:xe_gt_record_default_lrcs [xe]] REG[0x2580] = 0x00000002 ... [drm:xe_gt_record_default_lrcs [xe]] REG[0x7018] = 0x00002000 [drm:xe_gt_record_default_lrcs [xe]] REG[0x7300] = 0x00000040 [drm:xe_gt_record_default_lrcs [xe]] REG[0x7304] = 0x00000200 After: [drm:xe_gt_record_default_lrcs [xe]] LRC WA rcs0 save-restore MMIOs [drm:xe_gt_record_default_lrcs [xe]] REG[0x2580] = 0x00060002 ... [drm:xe_gt_record_default_lrcs [xe]] REG[0x7018] = 0x20002000 [drm:xe_gt_record_default_lrcs [xe]] REG[0x7300] = 0x00400040 [drm:xe_gt_record_default_lrcs [xe]] REG[0x7304] = 0x02000200 All of these registers are masked registers, so writing to them without the relevant bits in the upper 16b doesn't have any effect. Also, this adds support to regular registers; previously it was assumed that LRC entries would only contain masked registers. However this is not true. 0x6604 is not a masked register, but used in workarounds for e.g. ADL-P. See commit 28cf243a341a ("drm/i915/gt: Fix context workarounds with non-masked regs"). In the same test with ADL-P as above: Before: [drm:xe_gt_record_default_lrcs [xe]] REG[0x6604] = 0xe0000000 After: [drm:xe_gt_record_default_lrcs [xe]] REG[0x6604] = 0xe0efef6f As can be seen, now it will read what was in the register rather than completely overwrite the other bits. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230906012053.1733755-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add dbg messages for LRC WAsLucas De Marchi1-0/+4
Just like the GT and engine workarounds, add debug message with the final value being written to the register for easy debugging. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20230906012053.1733755-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: standardize vm-less kernel submissionsDaniele Ceraolo Spurio1-16/+7
The current only submission in the driver that doesn't use a vm is the WA setup. We still pass a vm structure (the migration one), but we don't actually use it at submission time and we instead have an hack to use GGTT for this particular engine. Instead of special-casing the WA engine, we can skip providing a VM and use that as selector for whether to use GGTT or PPGTT. As part of this change, we can drop the special engine flag for the WA engine and switch the WA submission to use the standard job functions instead of dedicated ones. v2: rebased on s/engine/exec_queue Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20230822173334.1664332-4-daniele.ceraolospurio@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Add sysfs entries for engines under its GTTejas Upadhyay1-0/+7
Add engines sysfs directory under its GT and create sub directory for all engine class (note its not per instance) present on GT. For example, DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/ bcs/ ccs/ V9 : - Add missing drmm_add_action_or_reset V8 : - Rebase V7 : - Remove xe_gt.h from .h and include in .c - Matt V6 : - Add kernel doc and arrange file in make file by alphabet - Matt V5 : - replace xe_engine with xe_hw_engine - Matt V4 : - Rebase to resolve conflicts - CI V3 : - Move code in its own file - Rename API name V2 : - Correct class mask logic - Himal - Remove extra parenthesis Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename engine to exec_queueFrancois Dugast1-35/+35
Engine was inappropriately used to refer to execution queues and it also created some confusion with hardware engines. Where it applies the exec_queue variable name is changed to q and comments are also updated. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/162 Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Rename xe_engine.[ch] to xe_exec_queue.[ch]Francois Dugast1-1/+1
This is a preparation commit for a larger renaming of engine to exec queue. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Introduce fault injection for gt resetHimal Prasad Ghimiray1-1/+7
To trigger gt reset failure: echo 100 > /sys/kernel/debug/dri/<cardX>/fail_gt_reset/probability echo 2 > /sys/kernel/debug/dri/<cardX>/fail_gt_reset/times Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21drm/xe: Notify Userspace when gt reset failsHimal Prasad Ghimiray1-0/+19
Send uevent in case of gt reset failure. This intimation can be used by userspace monitoring tool to do the device level reset/reboot when GT reset fails. udevadm can be used to monitor the uevents. v2: - Support only gt failure notification (Rodrigo) v3 - Rectify the comments in header file. v4 - Use pci kobj instead of drm kobj for notification.(Rodrigo) - Cleanup (Badal) v5 - Add tile id and gt id as additional info provided by uevent. - Provide code documentation for the uevent. (Rodrigo) Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>