summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)AuthorFilesLines
2025-03-05drm/xe/vm: Fix a misplaced #endifThomas Hellström1-1/+1
Fix a (harmless) misplaced #endif leading to declarations appearing multiple times. Fixes: 0eb2a18a8fad ("drm/xe: Implement VM snapshot support for BO's and userptr") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-3-thomas.hellstrom@linux.intel.com
2025-03-05drm/xe/vm: Validate userptr during gpu vma prefetchingThomas Hellström1-1/+10
If a userptr vma subject to prefetching was already invalidated or invalidated during the prefetch operation, the operation would repeatedly return -EAGAIN which would typically cause an infinite loop. Validate the userptr to ensure this doesn't happen. v2: - Don't fallthrough from UNMAP to PREFETCH (Matthew Brost) Fixes: 5bd24e78829a ("drm/xe/vm: Subclass userptr vmas") Fixes: 617eebb9c480 ("drm/xe: Fix array of binds") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228073058.59510-2-thomas.hellstrom@linux.intel.com
2025-03-05drm/xe/uapi: Use hint for guc to set GT frequencyTejas Upadhyay6-3/+38
Allow user to provide a low latency hint. When set, KMD sends a hint to GuC which results in special handling for that process. SLPC will ramp the GT frequency aggressively every time it switches to this process. We need to enable the use of SLPC Compute strategy during init, but it will apply only to processes that set this bit during process creation. Improvement with this approach as below: Before, :~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency Platform: Intel(R) OpenCL Graphics Device: Intel(R) Graphics [0xe20b] Driver version : 24.52.0 (Linux x64) Compute units : 160 Clock frequency : 2850 MHz Kernel launch latency : 283.16 us After, :~$ NEOReadDebugKeys=1 EnableDirectSubmission=0 clpeak --kernel-latency Platform: Intel(R) OpenCL Graphics Device: Intel(R) Graphics [0xe20b] Driver version : 24.52.0 (Linux x64) Compute units : 160 Clock frequency : 2850 MHz Kernel launch latency : 63.38 us Compute PR: https://github.com/intel/compute-runtime/pull/794 Mesa PR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/33214 IGT PR: https://patchwork.freedesktop.org/patch/639989/ V10(Lucas): - Remove doc from drm-uapi.rst v9(Vinay): - remove extra line, align commit message v8(Vinay): - Add separate example for using low latency hint v7(Jose): - Update UMD PR - applicable to all gpus V6: - init flags, remove redundant flags check (MAuld) V5: - Move uapi doc to documentation and GuC ABI specific change (Rodrigo) - Modify logic to restrict exec queue flags (MAuld) V4: - To make it clear, dont use exec queue word (Vinay) - Correct typo in description of flag (Jose/Vinay) - rename set_strategy api and replace ctx with exec queue(Vinay) - Start with 0th bit to indentify user flags (Jose) V3: - Conver user flag to kernel internal flag and use (Oak) - Support query config for use to check kernel support (Jose) - Dont need to take runtime pm (Vinay) V2: - DRM_XE_EXEC_QUEUE_LOW_LATENCY_HINT 1 planned for other hint(Szymon) - Add motivation to description (Lucas) Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228070224.739295-2-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-03-04drm/xe: Fix GT "for each engine" workaroundsTvrtko Ursulin1-2/+2
Any rules using engine matching are currently broken due RTP processing happening too in early init, before the list of hardware engines has been initialised. Fix this by moving workaround processing to later in the driver probe sequence, to just before the processed list is used for the first time. Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649 should be present while we see, before: GT Workarounds 14011059788 14015795083 And with the patch: GT Workarounds 14011060649 14011059788 14015795083 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v6.11+ Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 25d434cef791e03cf40680f5441b576c639bfa84) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-04drm/xe/userptr: properly setup pfn_flags_maskMatthew Auld1-8/+10
Currently we just leave it uninitialised, which at first looks harmless, however we also don't zero out the pfn array, and with pfn_flags_mask the idea is to be able set individual flags for a given range of pfn or completely ignore them, outside of default_flags. So here we end up with pfn[i] & pfn_flags_mask, and if both are uninitialised we might get back an unexpected flags value, like asking for read only with default_flags, but getting back write on top, leading to potentially bogus behaviour. To fix this ensure we zero the pfn_flags_mask, such that hmm only considers the default_flags and not also the initial pfn[i] value. v2 (Thomas): - Prefer proper initializer. Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226174748.294285-2-matthew.auld@intel.com (cherry picked from commit dd8c01e42f4c5c1eaf02f003d7d588ba6706aa71) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-04drm/xe: Remove double pageflipMaarten Lankhorst1-10/+0
This is already handled below in the code by fixup_initial_plane_config. Fixes: a8153627520a ("drm/i915: Try to relocate the BIOS fb to the start of ggtt") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241210083111.230484-3-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> (cherry picked from commit 2218704997979fbf11765281ef752f07c5cf25bb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-03-03drm/i915: split out i915_gtt_view_types.h from i915_vma_types.hJani Nikula2-74/+7
In the interest of limiting the display dependencies on i915 core headers, split out i915_gtt_view_types.h from i915_vma_types.h, and only include the new header from intel_display_types.h. Reuse the new header from xe compat code too, failing build if partial view is used in display code. Side note: Why would we ever have set enum i915_gtt_view_type values to size of each type?! What an insane hack. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/bb31885c32dbddad76d634c6fdb98a73b546b42e.1740412806.git.jani.nikula@intel.com
2025-03-03drm/i915: relocate intel_plane_ggtt_offset() to intel_atomic_plane.cJani Nikula1-0/+1
With the primary goal of removing #include "i915_vma.h" from intel_display_types.h, move intel_plane_ggtt_offset() to a proper function in intel_atomic_plane.c. This reveals tons of implicit dependencies all over the place that we pulled in via i915_vma.h. Fix the fallout. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/70ac6d19518f355abf37ac8c4b0f1d18878be28c.1740412806.git.jani.nikula@intel.com
2025-03-03drm/i915/pxp & drm/xe/pxp: Figure out pxp instance from the gem objectJani Nikula3-9/+9
It's undesirable to have to figure out the pxp pointer in display code. For one thing, its type is different for i915 and xe. Since we can figure the pxp pointer out in the pxp code from the gem object, offload it there. v2: Rebase Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250228114527.3091620-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-03-01drm/xe: Add performance tunings to debugfsTvrtko Ursulin5-0/+87
Add a list of active tunings to debugfs, analogous to the existing list of workarounds. Rationale being that it seems to make sense to either have both or none. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-6-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-01drm/xe/xelp: L3 recommended hashing maskTvrtko Ursulin2-1/+8
According to the i915 codebase xe missed to set the recommended performance tuning for L3 hashing which is applicable to all legacy XeLP platforms. Lets add it. v2: * Rename prefixes to XELP_. * Tweak version end point. v3: * Add bspec tag. * Tweak version range. v4: * Move from LRC to engine tunings list. v5: * Drop L3 Cache Control comment. Bspec: 31870 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> References: c46c5fb725be ("drm/i915/gen12: Apply recommended L3 hashing mask") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-5-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-01drm/xe/xelp: Add Wa_1604555607Tvrtko Ursulin1-0/+7
According to the i915 code base and as confirmed in the workaround database, apart from setting the GS timer, all XeLP platforms should also set the TDS timer. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> References: 2b5298b0aa09 ("drm/i915/gen12: Add recommended hardware tuning value") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-4-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-01drm/xe/xelp: Move Wa_16011163337 from tunings to workaroundsTvrtko Ursulin2-8/+7
Workaround database specifies 16011163337 as a workaround so lets move it there. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-3-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-01drm/xe: Fix GT "for each engine" workaroundsTvrtko Ursulin1-2/+2
Any rules using engine matching are currently broken due RTP processing happening too in early init, before the list of hardware engines has been initialised. Fix this by moving workaround processing to later in the driver probe sequence, to just before the processed list is used for the first time. Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649 should be present while we see, before: GT Workarounds 14011059788 14015795083 And with the patch: GT Workarounds 14011060649 14011059788 14015795083 Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v6.11+ Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-28Merge drm/drm-next into drm-xe-nextLucas De Marchi13-32/+127
Sync to fix conlicts between drm-xe-next and drm-intel-next. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-28Merge drm/drm-next into drm-intel-nextJani Nikula124-747/+5121
Sync to fix conlicts between drm-xe-next and drm-intel-next. Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-02-28drm/xe/vf: Retry sending MMIO request to GUC on timeout errorSatyanarayana K V P1-1/+8
Add support to allow retrying the sending of MMIO requests from the VF to the GUC in the event of an error. During the suspend/resume process, VFs begin resuming only after the PF has resumed. Although the PF resumes, the GUC reset and provisioning occur later in a separate worker process. When there are a large number of VFs, some may attempt to resume before the PF has completed its provisioning. Therefore, if a MMIO request from a VF fails during this period, we will retry sending the request up to GUC_RESET_VF_STATE_RETRY_MAX times, which is set to a maximum of 10 attempts. Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piorkowski <piotr.piorkowski@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224102807.11065-3-satyanarayana.k.v.p@intel.com
2025-02-28drm/xe/pf: Create a link between PF and VF devicesSatyanarayana K V P1-0/+51
When both PF and VF devices are enabled on the host, they resume simultaneously during system resume. However, the PF must finish provisioning the VF before any VFs can successfully resume. Establish a parent-child device link between the PF and VF devices to ensure the correct order of resumption. V4 -> V5: - Added missing break in the error condition. V3 -> V4: - Made xe_pci_pf_get_vf_dev() as a static function and updated input parameter types. - Updated xe_sriov_warn() to xe_sriov_abort() when VF device cannot be found. V2 -> V3: - Added function documentation for xe_pci_pf_get_vf_dev(). - Added assertion if not called from PF. V1 -> V2: - Added a helper function to get VF pci_dev. - Updated xe_sriov_notice() to xe_sriov_warn() if vf pci_dev is not found. Signed-off-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Cc: Michał Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piorkowski <piotr.piorkowski@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224102807.11065-2-satyanarayana.k.v.p@intel.com
2025-02-28drm/xe/xe3lpg: Add Wa_13012615864Tejas Upadhyay2-0/+6
Wa_13012615864 applies to xe3lpg Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221112200.388612-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-02-27drm/xe: xe_gen_wa_oob: replace program_invocation_short_nameDaniel Gomez1-3/+3
program_invocation_short_name may not be available in other systems. Instead, replace it with the argv[0] to pass the executable name. Fixes build error when program_invocation_short_name is not available: drivers/gpu/drm/xe/xe_gen_wa_oob.c:34:3: error: use of undeclared identifier 'program_invocation_short_name' 34 | program_invocation_short_name); | ^ 1 error generated. Suggested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250224-macos-build-support-xe-v3-1-d2c9ed3a27cc@samsung.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-27drm/i915/rps: convert intel_display_rps.[ch] to struct intel_displayJani Nikula1-1/+1
Going forward, struct intel_display is the main display device data pointer. Convert as much as possible of intel_display_rps.[ch] to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c81156007bffbf0a1b1e6831afaf8fb05db546bc.1740502116.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-02-27drm/i915/tdf: convert intel_tdf.[ch] to struct intel_displayJani Nikula1-2/+4
Going forward, struct intel_display is the main display device data pointer. Convert the intel_tdf.[ch] glue to struct intel_display. Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/26d976f23295713f9a7cda20e32b7ef5aad3dd9e.1740502116.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2025-02-27drm/xe/userptr: properly setup pfn_flags_maskMatthew Auld1-8/+10
Currently we just leave it uninitialised, which at first looks harmless, however we also don't zero out the pfn array, and with pfn_flags_mask the idea is to be able set individual flags for a given range of pfn or completely ignore them, outside of default_flags. So here we end up with pfn[i] & pfn_flags_mask, and if both are uninitialised we might get back an unexpected flags value, like asking for read only with default_flags, but getting back write on top, leading to potentially bogus behaviour. To fix this ensure we zero the pfn_flags_mask, such that hmm only considers the default_flags and not also the initial pfn[i] value. v2 (Thomas): - Prefer proper initializer. Fixes: 81e058a3e7fd ("drm/xe: Introduce helper to populate userptr") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226174748.294285-2-matthew.auld@intel.com
2025-02-27Merge tag 'drm-xe-next-2025-02-24' of ↵Dave Airlie118-660/+4994
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Add mmap support for PCI memory barrier (Tejas, Matthew Auld) - Enable integration with perf pmu, exposing event counters: for now, just GT C6 residency (Vinay, Lucas) - Add "survivability mode" to allow putting the driver in a state capable of firmware upgrade on critical failures (Riana, Rodrigo) - Add PXP HWDRM support and enable for compatible platforms: Meteor Lake and Lunar Lake (Daniele, John Harrison) - Expose package and vram temperature over hwmon subsystem (Raag, Badal, Rodrigo) Cross-subsystem Changes: - Backmege drm-next to synchronize with i915 display and other internal APIs Display Changes (including i915): - Device probe re-order to help with flicker-free boot (Maarten) - Align watermark, hpd and dsm with i915 (Rodrigo) - Better abstraction for d3cold (Rodrigo) Driver Changes: - Make sure changes to ccs_mode is with helper for gt sync reset (Maciej) - Drop mmio_ext abstraction since it didn't prove useful in its current form (Matt Roper) - Reject BO eviction if BO is bound to current VM (Oak, Thomas Hellström) - Add GuC Power Conservation debugfs (Rodrigo) - L3 cache topology updates for Xe3 (Francois, Matt Atwood) - Better logging about missing GuC logs (John Harrison) - Better logging for hwconfig-related data availability (John Harrison) - Tracepoint updates for xe_bo_create, xe_vm and xe_vma (Oak) - Add missing SPDX licenses (Francois) - Xe suballocator imporovements (Michal Wajdeczko) - Improve logging for native vs SR-IOV driver mode (Satyanarayana) - Make sure VF bootstrap is not attempted in execlist mode (Maarten) - Add GuC Buffer Cache abstraction for some CTB H2G actions and use during VF provisioning (Michal Wajdeczko) - Better synchronization in gtidle for new users (Vinay) - New workarounds for Panther Lake (Nirmoy, Vinay) - PCI ID updates for Panther Lake (Matt Atwood) - Enable SR-IOV for Panther Lake (Michal Wajdeczko) - Update MAINTAINERS to stop directing xe changes to drm-misc (Lucas) - New PCI IDs for Battle Mage (Shekhar) - Better pagefault logging (Francois) - SR-IOV fixes and refactors for past and new platforms (Michal Wajdeczko) - Platform descriptor refactors and updates (Sai Teja) - Add gt stats debugfs (Francois) - Add guc_log debugfs to dump to dmesg (Lucas) - Abstract per-platform LMTT availability (Piotr Piórkowski) - Refactor VRAM manager location (Piotr Piórkowski) - Add missing xe_pm_runtime_put when forcing wedged mode (Shuicheng) - Fix possible lockup when forcing wedged mode (Xin Wang) - Probe refactors to use cleanup actions with better error handling (Lucas) - XE_IOCTL_DBG clarification for userspace (Maarten) - Better xe_mmio initialization and abstraction (Ilia) - Drop unnecessary GT lookup (Matt Roper) - Skip client engine usage from fdinfo for VFs (Marcin Bernatowicz) - Allow to test xe_sync_entry_parse with error injection (Priyanka) - OA fix for polled read (Umesh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/m3gbuh32wgiep43i4zxbyhxqbenvtgvtao5sczivlasj7tikwv@dmlba4bfg2ny
2025-02-27drm/xe: Eliminate usage of TIMESTAMP_OVERRIDEMatt Roper3-43/+23
Recent discussions with the hardware architects have revealed that the TIMESTAMP_OVERRIDE register is never expected to hold a valid/useful value on production hardware. That register would only get used by hardware workarounds (although there are none that use it today) or during early internal hardware testing. Due to lack of documentation it's not clear exactly what the driver should be doing if CTC_MODE[0] is set (or even whether that's a setting that would ever be encountered on real hardware), but it's definitely not what Xe and i915 have been doing. So drop the incorrect code trying to use TIMESTAMP_REGISTER. If the driver does encounter CTC_MODE[0] in the wild, we'll print a warning and just continue trying to use the crystal clock frequency since that's probably less incorrect than what we're doing today. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225224908.1671554-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
2025-02-27drm/xe/pxp: Don't kill queues while holding PXP locksDaniele Ceraolo Spurio1-34/+44
xe_exec_queue_kill can sleep, so we can't call it from under a spinlock. We can instead move the queues to a separate list and then kill them all after we release the spinlock. Furthermore, xe_exec_queue_kill can take the VM lock so we can't call it while holding the PXP mutex because the mutex is taken under VM lock at queue creation time. Note that while it is safe to call the kill without holding the mutex, we must call it after the PXP state has been updated, otherwise an app might be able to create a queue between the invalidation and the state update, which would break the state machine. Since being in the list is used to track whether RPM cleanup is needed, we can no longer defer that to queue_destroy, so we perform it immediately instead. v2: also avoid calling kill() under pxp->mutex. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/intel-xe/34aaced9-4a9d-4e8c-900a-b8f73452e35c@stanley.mountain/ Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225235328.2895877-1-daniele.ceraolospurio@intel.com
2025-02-27Merge tag 'drm-intel-next-2025-02-24' of ↵Dave Airlie9-21/+95
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 feature pull for v6.15: Features and functionality: - Enable DP 128b/132b SST DSC (Jani, Imre) - Allow DSB to perform commits when VRR is enabled (Ville) - Compute HDMI PLLs for SNPS/C10 PHYs for rates not in fixed tables (Ankit) - Allow DSB usage when PSR is enabled on LNL+ (Jouni) - Enable Panel Replay mode change without full modeset (Jouni) - Enable async flips with compressed buffers on ICL+ (Ville) - Support luminance based brightness control via DPCD for eDP (Suraj) - Enable VRR enable/disable without full modeset (Mitul, Ankit) - Add debugfs facility for force testing HDCP 1.4 (Suraj) - Add scaler tracepoints, improve plane tracepoints (Ville) - Improve DMC wakelock debugging facilities (Gustavo) - Allow GuC SLPC default strategies on MTL+ for performance (Rodrigo) - Provide more information on display faults (Ville) Refactoring and cleanups: - Continue conversions to struct intel_display (Ville, Jani, Suraj, Imre) - Joiner and Y plane reorganization (Ville) - Move HDCP debugfs to intel_hdcp.c (Jani) - Clean up and unify LSPCON interfaces (Jani) - Move code out of intel_display.c to reduce its size (Ville) - Clean up and simplify DDI port enabling/disabling (Imre) - Make LPT LP a dedicated PCH type, refactor (Jani) - Simplify DSC range BPG offset calculation (Ankit) - Scaler cleanups (Ville) - Remove unused code from GVT (David Alan Gilbert) - Improve plane debugging (Ville) - DSB and VRR refactoring (Ville) Fixes: - Check if vblank is sufficient for DSC prefill and scaler (Mitul) - Fix Mesa clear color alignment regression (Ville) - Add missing TC DP PHY lane stagger delay (Imre) - Fix DSB + VRR usage for PTL+ (Ville) - Improve robustness of display VT-d workarounds (Ville) - Fix platforms for dbuf tracker state service programming (Ravi) - Fix DMC wakelock support conditions (Gustavo) - Amend DMC wakelock register ranges (Gustavo) - Disable the Common Primary Timing Generator (CMTG) (Gustavo) - Enable C20 PHY SSC (Suraj) - Add workaround for DKL PHY DP mode write (Nemesa) - Fix build warnings on clamp() usage (Guenter Roeck, Ankit) - Fix error handling while adding a connector (Imre) - Avoid full modeset at probe on vblank delay mismatches (Ville) - Fix encoder HDMI check for HDCP line rekeying (Suraj) - Fix HDCP repeater authentication during topology change (Suraj) - Handle display PHY power state reset for power savings (Mika) - Fix typos all over the place (Nitin) - Update HDMI TMDS C20 parameters for various platforms (Dnyaneshwar) - Guarantee a minimum hblank time for 128b/132b and 8b/10b MST (Arun, Imre) - Do not hardcode LSPCON settle timeout (Giedrius Statkevičius) Xe driver changes: - Re-use display vmas when possible (Maarten) - Remove double pageflip (Maarten) - Enable DP tunneling (Imre) - Separate i915 and xe tracepoints (Ville) DRM core changes: - Increase DPCD eDP display control CAP size to 5 bytes (Suraj) - Add DPCD eDP version 1.5 definition (Suraj) - Add timeout parameter to drm_lspcon_set_mode() (Giedrius Statkevičius) Merges: - Backmerge drm-next (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/87h64j7b7n.fsf@intel.com
2025-02-26drm/xe/eustall: Add workaround 22016596838 which applies to PVC.Harish Chegondi2-0/+11
Add PVC workaround 22016596838 that disables EU DOP gating during EU stall sampling. Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/062a12ed9e110fea420cd47cb70fb10136ee9132.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/uapi: Add a device query to get EU stall sampling informationHarish Chegondi3-7/+93
User space can get the EU stall data record size, EU stall capabilities, EU stall sampling rates, and per XeCore buffer size with query IOCTL DRM_IOCTL_XE_DEVICE_QUERY with .query set to DRM_XE_DEVICE_QUERY_EU_STALL. A struct drm_xe_query_eu_stall will be returned to the user space along with an array of supported sampling rates sorted in the fastest sampling rate first order. sampling_rates in struct drm_xe_query_eu_stall will point to the array of sampling rates. Any capabilities in EU stall sampling as of this patch are considered as base capabilities. New capability bits will be added for any new functionality added later. v12: Rename has_eu_stall_sampling_support() to xe_eu_stall_supported_on_platform() and move it to header file. v11: Check if EU stall sampling is supported on the platform. v10: Change comments and variable names as per feedback v9: Move reserved fields above num_sampling_rates in struct drm_xe_query_eu_stall. v7: Change sampling_rates from a pointer to flexible array. v6: Include EU stall sampling rates information and per XeCore buffer size in the query information. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/67ba42796a5a99d648239c315694cd222812a49b.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/eustall: Add EU stall sampling support for Xe2Harish Chegondi1-3/+32
Add EU stall sampling support for Xe2 architecture GPUs - LNL and BMG. EU stall data format for LNL and BMG is different from that of PVC. v10: Update comments as per review feedback v9: Use GRAPHICS_VER() check instead of platform v8: Renamed struct drm_xe_eu_stall_data_xe2 to struct xe_eu_stall_data_xe2 since it is a local structure. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/d85093e9ab1204d14d2cc783f304a4bc8688951c.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/eustall: Add support to handle dropped EU stall dataHarish Chegondi1-1/+45
If the user space doesn't read the EU stall data fast enough, it is possible that the EU stall data buffer can get filled, and if the hardware wants to write more data, it simply drops data due to unavailable buffer space. In that case, hardware sets a bit in a register. If the driver detects data drop, the driver read() returns -EIO error to let the user space know that HW has dropped data. The -EIO error is returned even if there is EU stall data in the buffer. A subsequent read by the user space returns the remaining EU stall data. v12: Move 'goto exit_drop;' to the next 'if (read_data_size == 0)' statement. v11: Clear drop bit even for empty data buffer as the data was read from the buffer in the previous read. v10: Reverted the changes back to v8: Clear the drop bits only after reading the data. v9: Move all data drop handling code to this patch Clear all drop data bits before returning -EIO. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6fbfd7cfa42cb3ef5515b6412573d74c7cd3d27a.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/eustall: Add support to read() and poll() EU stall dataHarish Chegondi2-3/+294
Implement the EU stall sampling APIs to read() and poll() EU stall data. A work function periodically polls the EU stall data buffer write pointer registers to look for any new data and caches the write pointer. The read function compares the cached read and write pointers and copies any new data to the user space. v11: Used gt->eu_stall->stream_lock instead of stream->buf_lock. Removed read and write offsets from trace and added read size. Moved workqueue from struct xe_eu_stall_data_stream to struct xe_eu_stall_gt. v10: Used cancel_delayed_work_sync() instead of flush_delayed_work() Replaced per xecore lock with a lock for all the xecore buffers Code movement and optimizations as per review feedback v9: New patch split from the previous patch. Used *_delayed_work functions instead of hrtimer Addressed the review feedback in read and poll functions Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/369dee85a3b6bd2c08aeae89ca55e66a9a0242d2.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/eustall: Add support to init, enable and disable EU stall samplingHarish Chegondi5-7/+409
Implement EU stall sampling APIs introduced in the previous patch for Xe_HPC (PVC). Add register definitions and the code that accesses these registers to the APIs. Add initialization and clean up functions and their implementations, EU stall enable and disable functions. v11: Move stream->xecore_buf alloc to xe_eu_stall_data_buf_alloc(). Register xe_eu_stall_fini() with devm_add_action_or_reset() instead of calling it from xe_gt_fini(). Changed a couple of variables in struct xe_eu_stall_data_stream from unsigned int to int. v10: Fixed error rewinding code Moved code around as per review feedback v9: Moved structure definitions from xe_eu_stall.h to xe_eu_stall.c Moved read and poll implementations to the next patch Used xe_bo_create_pin_map_at_aligned instead of xe_bo_create_pin_map Changed lock names as per review feedback Moved drop data handling into a subsequent patch Moved code around as per review feedback v8: Updated copyright year in xe_eu_stall_regs.h to 2025. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc since it is a local structure. v6: Fix buffer wrap around over write bug (Matt Olson) Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b6aeca593d521828a0b4fbf6cfd2844716c4fc66.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/uapi: Introduce API for EU stall samplingHarish Chegondi4-0/+247
A new hardware feature first introduced in PVC gives capability to periodically sample EU stall state and record counts for different stall reasons, on a per IP basis, aggregate across all EUs in a subslice and record the samples in a buffer in each subslice. Eventually, the aggregated data is written out to a buffer in the memory. This feature is also supported in XE2 and later architecture GPUs. Use an existing IOCTL - DRM_IOCTL_XE_OBSERVATION as the interface into the driver from the user space to do initial setup and obtain a file descriptor for the EU stall data stream. Input parameter to the IOCTL is a struct drm_xe_observation_param in which observation_type should be set to DRM_XE_OBSERVATION_TYPE_EU_STALL, observation_op should be DRM_XE_OBSERVATION_OP_STREAM_OPEN and param should point to a chain of drm_xe_ext_set_property structures in which each structure has a pair of property and value. The EU stall sampling input properties are defined in drm_xe_eu_stall_property_id enum. With the file descriptor obtained from DRM_IOCTL_XE_OBSERVATION, user space can enable and disable EU stall sampling with the IOCTLs: DRM_XE_OBSERVATION_IOCTL_ENABLE and DRM_XE_OBSERVATION_IOCTL_DISABLE. User space can also call poll() to check for availability of data in the buffer. The data can be read with read(). Finally, the file descriptor can be closed with close(). v11: Changed a couple of variables in struct eu_stall_open_properties from unsigned int to int. v10: Use extension number while parsing chain of extensions. Remove function description for static functions. Move code around as per review feedback. v9: Changed some u32 to unsigned int. Moved some code around as per review feedback from v8. v8: Used div_u64 instead of / to fix 32-bit build issue. Changed copyright year in xe_eu_stall.c/h to 2025. v7: Renamed input property DRM_XE_EU_STALL_PROP_EVENT_REPORT_COUNT to DRM_XE_EU_STALL_PROP_WAIT_NUM_REPORTS to be consistent with OA. Renamed the corresponding internal variables. Fixed some commit messages based on review feedback. v6: Change the input sampling rate to GPU cycles instead of GPU cycles multiplier. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/bb707a27975c33e4a912b9839b023acb7a1f9c90.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe/topology: Add a function to find the index of the last enabled DSS in ↵Harish Chegondi1-0/+13
a mask Last enabled DSS in a DSS mask can help estimate the maximum DSSes enabled in the DSS mask, as the enabled DSSes can be discontiguous. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/79944bb27eb4f7ce5df01f964aebbf431b3a6c61.1740533885.git.harish.chegondi@intel.com
2025-02-26drm/xe: Fix uninitialized pointer defColin Ian King1-1/+1
In the case where a set of checks on xe->info.platform don't assign a value to pointer def the pointer remains uninitialized and hence can fail the following !def check. Fix this be ensuring pointer def is initialized to NULL. Fixes: 292b1a8a5054 ("drm/xe: Stop ignoring errors from xe_heci_gsc_init()") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226160524.566074-1-colin.i.king@gmail.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe/oa: Refactor WAs to use XE_WA() macroAradhya Bhatia2-21/+14
Refactor Wa_18013179988, Wa_14015568240, Wa_1508761755, and Wa_1509372804, to use the proper workaround-check implementation for out-of-band workarounds, XE_WA(), and drop the use of the platform based WA selection. Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250220094645.358647-3-aradhya.bhatia@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: Add Wa_16021333562 and Wa_14016712196Aradhya Bhatia3-1/+9
Wa_16021333562 and Wa_14016712196 are permanent workarounds that apply to multiple platforms. Wa_16021333562 applies to platforms ranging from TGL (12.00) to Xe_LPM (13.00), while Wa_14016712196 from DG2 (12.55) to Xe_LPG (12.74). Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250220094645.358647-2-aradhya.bhatia@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: cancel pending job timer before freeing schedulerTejas Upadhyay1-0/+2
The async call to __guc_exec_queue_fini_async frees the scheduler while a submission may time out and restart. To prevent this race condition, the pending job timer should be canceled before freeing the scheduler. V3(MattB): - Adjust position of cancel pending job - Remove gitlab issue# from commit message V2(MattB): - Cancel pending jobs before scheduler finish Fixes: a20c75dba192 ("drm/xe: Call __guc_exec_queue_fini_async direct for KERNEL exec_queues") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225045754.600905-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit 18fbd567e75f9b97b699b2ab4f1fa76b7cf268f6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-26drm/xe/regs: remove a duplicate definition for RING_CTL_SIZE(size)Mingcong Bai1-1/+0
Commit b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h") introduced an internal set of engine registers, however, as part of this change, it has also introduced two duplicate `define' lines for `RING_CTL_SIZE(size)'. This commit was introduced to the tree in v6.8-rc1. While this is harmless as the definitions did not change, so no compiler warning was observed. Drop this line anyway for the sake of correctness. Cc: stable@vger.kernel.org # v6.8-rc1+ Fixes: b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h") Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225073104.865230-1-jeffbai@aosc.io Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 6b68c4542ffecc36087a9e14db8fc990c88bb01b) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-26drm/xe/gt_pagefault: Change vma_pagefault unit to kilobyteFrancois Dugast3-3/+3
Increase the amount of bytes that can be counted before the counter overflows, while not losing information as the VMA is not expected to have sub-kilobyte size. Suggested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225195902.1247100-3-francois.dugast@intel.com Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2025-02-26drm/xe/gt_stats: Use atomic64_t for countersFrancois Dugast2-4/+4
The stats counters are now used for things like counting the VMA bytes during page faults. During workload execution, the counter value can grow fast and easily reach the atomic int limit, in which case it overflows. To make this less likely to happen, push the limit by switching to 64b atomic to store the counter value. Overhead is very small as there are only 3 stat entries per GT as of now, and stats are only enabled with CONFIG_DEBUG_FS. Suggested-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225195902.1247100-2-francois.dugast@intel.com Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2025-02-26drm/xe: cancel pending job timer before freeing schedulerTejas Upadhyay1-0/+2
The async call to __guc_exec_queue_fini_async frees the scheduler while a submission may time out and restart. To prevent this race condition, the pending job timer should be canceled before freeing the scheduler. V3(MattB): - Adjust position of cancel pending job - Remove gitlab issue# from commit message V2(MattB): - Cancel pending jobs before scheduler finish Fixes: a20c75dba192 ("drm/xe: Call __guc_exec_queue_fini_async direct for KERNEL exec_queues") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225045754.600905-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-02-26drm/xe/regs: remove a duplicate definition for RING_CTL_SIZE(size)Mingcong Bai1-1/+0
Commit b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h") introduced an internal set of engine registers, however, as part of this change, it has also introduced two duplicate `define' lines for `RING_CTL_SIZE(size)'. This commit was introduced to the tree in v6.8-rc1. While this is harmless as the definitions did not change, so no compiler warning was observed. Drop this line anyway for the sake of correctness. Cc: stable@vger.kernel.org # v6.8-rc1+ Fixes: b79e8fd954c4 ("drm/xe: Remove dependency on intel_engine_regs.h") Signed-off-by: Mingcong Bai <jeffbai@aosc.io> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250225073104.865230-1-jeffbai@aosc.io Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-26drm/xe: Stop ignoring errors from xe_ttm_sys_mgr_init()Lucas De Marchi1-1/+4
xe_ttm_sys_mgr_init() already cleans up after itself, just return error if that failed. Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-12-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: Rename update_device_info() after sriovLucas De Marchi1-2/+2
This is only changing info flags for SR-IOV reasons. Rename it accordingly, because there are several other places in probe where the flags are updated, which is not inside this function. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-11-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: Stop ignoring errors from xe_heci_gsc_init()Lucas De Marchi4-28/+21
Do not ignore errors from xe_heci_gsc_init(). For example, it shouldn't be fine to report successfully entering survivability mode when there's no communication with gsc working. The driver should also not be half-initialized in the normal case neither. Cc: Riana Tauro <riana.tauro@intel.com> Cc: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-10-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: Move survivability entirely to xe_pciLucas De Marchi5-55/+49
There's an odd split between xe_pci.c and xe_device.c wrt xe_survivability: it's initialized by xe_device, but then finalized by xe_pci. Move it entirely to the outer layer, xe_pci, so it controls the flow entirely. This also allows to stop ignoring some of the errors. E.g.: if there's an -ENOMEM, it shouldn't continue as if it survivability had been enabled. One change worth mentioning is that if "wait for lmem" fails, it will also check the pcode status to decide if it should enter or not in survivability mode, which it was not doing before. The bit from pcode for that decision should remain the same after lmem failed initialization, so it should be fine. Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-9-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe/display: Drop xe_display_driver_remove()Lucas De Marchi3-17/+3
Handle it as part of xe_display_fini(). The error handling was already calling it if a step after xe_display_init() failed. Just re-use the same xe_display_fini() for driver remove. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-8-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26drm/xe: Drop remove callback supportLucas De Marchi4-88/+1
Now that devres supports component driver cleanup during driver removal cleanup, the xe custom support for removal callbacks is not needed anymore. Drop it. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250222001051.3012936-7-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>