summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/xe
AgeCommit message (Collapse)AuthorFilesLines
2025-07-17drm/xe/migrate: fix copy direction in access_memoryMatthew Auld1-1/+1
After we do the modification on the host side, ensure we write the result back to VRAM and not the other way around, otherwise the modification will be lost if treated like a read. Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com (cherry picked from commit c12fe703cab93f9d8bfe0ff32b58e7b1fd52be1f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-17drm/xe: Dont skip TLB invalidations on VFTejas Upadhyay1-12/+10
Skipping TLB invalidations on VF causing unrecoverable faults. Probable reason for skipping TLB invalidations on SRIOV could be lack of support for instruction MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some additional handling. Helps in resolving, [ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] ASID: 0 VFID: 0 PDATA: 0x0d92 Faulted Address: 0x0000000002fa0000 FaultType: 0 AccessType: 1 FaultLevel: 0 EngineClass: 3 bcs EngineInstance: 8 [ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22 V2: - Use Xmas tree (MichalW) Suggested-by: Matthew Brost <matthew.brost@intel.com> Fixes: 97515d0b3ed92 ("drm/xe/vf: Don't emit access to Global HWSP if VF") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> (cherry picked from commit b528e896fa570844d654b5a4617a97fa770a1030) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-16drm/i915: Pass along the format info from .fb_create() to ↵Ville Syrjälä2-2/+6
drm_helper_mode_fill_fb_struct() Plumb the format info from .fb_create() all the way to drm_helper_mode_fill_fb_struct() to avoid the redundant lookup. For the fbdev case a manual drm_get_format_info() lookup is needed. Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250701090722.13645-14-ville.syrjala@linux.intel.com
2025-07-15drm/xe: Use DRM_GPU_SCHED_STAT_NO_HANG to skip the resetMaíra Canal1-9/+3
Xe can skip the reset if TDR has fired before the free job worker and can also re-arm the timeout timer in some scenarios. Instead of manipulating scheduler's internals, inform the scheduler that the job did not actually timeout and no reset was performed through the new status code DRM_GPU_SCHED_STAT_NO_HANG. Note that, in the first case, there is no need to restart submission if it hasn't been stopped. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-7-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESETMaíra Canal1-3/+3
Among the scheduler's statuses, the only one that indicates an error is DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV signifies that the operation succeeded and the GPU is in a nominal state. However, to provide more information about the GPU's status, it is needed to convey more information than just "OK". Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has hung, but it has been successfully reset and is now in a nominal state again. Reviewed-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15drm/xe/pf: Invalidate LMTT after completing changesMichal Wajdeczko1-1/+11
Once we finish populating all leaf pages in the VF's LMTT we should make sure that hardware will not access any stale data. Explicitly force LMTT invalidation (as it was already planned in the past). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-7-michal.wajdeczko@intel.com
2025-07-15drm/xe/pf: Invalidate LMTT during LMEM unprovisioningMichal Wajdeczko5-0/+94
Invalidate LMTT immediately after removing VF's LMTT page tables and clearing root PTE in the LMTT PD to avoid any invalid access by the hardware (and VF) due to stale data. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-6-michal.wajdeczko@intel.com
2025-07-15drm/xe/pf: Force GuC virtualization modeMichal Wajdeczko1-0/+26
By default the GuC starts in the 'native' mode and enables the VGT mode (aka 'virtualization' mode) only after it receives at least one set of VF configuration data. While this happens naturally while PF begins VFs provisioning, we might need this sooner as some actions, like TLB_INVALIDATION_ALL(0x7002), is supported by the GuC only in the VGT mode. And this becomes a real problem if we would want to use above action to invalidate the LMTT early during VFs auto-provisioning, before VFs are enabled, as such H2G would be rejected: [ ] xe 0000:4d:00.0: [drm] *ERROR* GT0: FAST_REQ H2G fence 0x804e failed! e=0x30, h=0 [ ] xe 0000:4d:00.0: [drm] *ERROR* GT0: Fence 0x804e was used by action 0x7002 sent at: h2g_write+0x33e/0x870 [xe] __guc_ct_send_locked+0x1e1/0x1110 [xe] guc_ct_send_locked+0x9f/0x740 [xe] xe_guc_ct_send_locked+0x19/0x60 [xe] send_tlb_invalidation+0xc2/0x470 [xe] xe_gt_tlb_invalidation_all_async+0x45/0xa0 [xe] xe_gt_tlb_invalidation_all+0x4b/0xa0 [xe] lmtt_invalidate_hw+0x64/0x1a0 [xe] xe_lmtt_invalidate_hw+0x5c/0x340 [xe] pf_update_vf_lmtt+0x398/0xae0 [xe] pf_provision_vf_lmem+0x350/0xa60 [xe] xe_gt_sriov_pf_config_bulk_set_lmem+0xe2/0x410 [xe] xe_gt_sriov_pf_config_set_fair_lmem+0x1c6/0x620 [xe] xe_gt_sriov_pf_config_set_fair+0xd5/0x3f0 [xe] xe_pci_sriov_configure+0x360/0x1200 [xe] sriov_numvfs_store+0xbc/0x1d0 dev_attr_store+0x17/0x40 sysfs_kf_write+0x4a/0x80 kernfs_fop_write_iter+0x166/0x220 vfs_write+0x2ba/0x580 ksys_write+0x77/0x100 __x64_sys_write+0x19/0x30 x64_sys_call+0x2bf/0x2660 do_syscall_64+0x93/0x7a0 entry_SYSCALL_64_after_hwframe+0x76/0x7e [ ] xe 0000:4d:00.0: [drm] *ERROR* GT0: CT dequeue failed: -71 [ ] xe 0000:4d:00.0: [drm] GT0: trying reset from receive_g2h [xe] This could be mitigated by pushing earlier a PF self-configuration with some hard-coded values that cover unlimited access to the GGTT, use of all GuC contexts and doorbells. This step is sufficient for the GuC to switch into the VGT mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-5-michal.wajdeczko@intel.com
2025-07-15drm/xe/pf: Move GGTT config KLVs encoding to helperMichal Wajdeczko1-11/+20
In upcoming patch we will want to encode GGTT config KLVs based on raw numbers, without relying on the allocated GGTT node. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-4-michal.wajdeczko@intel.com
2025-07-15drm/xe/pf: Resend PF provisioning after GT resetMichal Wajdeczko1-0/+27
If we reload the GuC due to suspend/resume or GT reset then we have to resend not only any VFs provisioning data, but also PF configuration, like scheduling parameters (EQ, PT), as otherwise GuC will continue to use default values. Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-3-michal.wajdeczko@intel.com
2025-07-15drm/xe/pf: Prepare to stop SR-IOV support prior GT resetMichal Wajdeczko3-0/+27
As part of the resume or GT reset, the PF driver schedules work which is then used to complete restarting of the SR-IOV support, including resending to the GuC configurations of provisioned VFs. However, in case of short delay between those two actions, which could be seen by triggering a GT reset on the suspened device: $ echo 1 > /sys/kernel/debug/dri/0000:00:02.0/gt0/force_reset this PF worker might be still busy, which lead to errors due to just stopped or disabled GuC CTB communication: [ ] xe 0000:00:02.0: [drm:xe_gt_resume [xe]] GT0: resumed [ ] xe 0000:00:02.0: [drm] GT0: trying reset from force_reset_show [xe] [ ] xe 0000:00:02.0: [drm] GT0: reset queued [ ] xe 0000:00:02.0: [drm] GT0: reset started [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel stopped [ ] xe 0000:00:02.0: [drm:guc_ct_send_recv [xe]] GT0: H2G request 0x5503 canceled! [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 12 config KLVs (-ECANCELED) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF1 configuration (-ECANCELED) [ ] xe 0000:00:02.0: [drm:guc_ct_change_state [xe]] GT0: GuC CT communication channel disabled [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 12 config KLVs (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push VF2 configuration (-ENODEV) [ ] xe 0000:00:02.0: [drm] GT0: PF: Failed to push 2 of 2 VFs configurations [ ] xe 0000:00:02.0: [drm:pf_worker_restart_func [xe]] GT0: PF: restart completed While this VFs reprovisioning will be successful during next spin of the worker, to avoid those errors, make sure to cancel restart worker if we are about to trigger next reset. Fixes: 411220808cee ("drm/xe/pf: Restart VFs provisioning after GT reset") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250711193316.1920-2-michal.wajdeczko@intel.com
2025-07-14drm/xe/lrc: Add table with LRC layoutLucas De Marchi1-0/+24
Add a table to document the LRC's BO layout to make it easier to visualize how each region stacks on top of each other. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-4-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Waste fewer instructions in emit_wa_job()Tvrtko Ursulin3-41/+49
I was debugging some unrelated issue and noticed the current code was very verbose. We can improve it easily by using the more common batch buffer building pattern. Before: bb->cs[bb->len++] = MI_LOAD_REGISTER_REG | MI_LRR_DST_CS_MMIO; c4d: 41 8b 56 10 mov 0x10(%r14),%edx c51: 49 8b 4e 08 mov 0x8(%r14),%rcx c55: 8d 72 01 lea 0x1(%rdx),%esi c58: 41 89 76 10 mov %esi,0x10(%r14) c5c: c7 04 91 01 00 08 15 movl $0x15080001,(%rcx,%rdx,4) bb->cs[bb->len++] = entry->reg.addr; c63: 8b 08 mov (%rax),%ecx c65: 41 8b 56 10 mov 0x10(%r14),%edx c69: 49 8b 76 08 mov 0x8(%r14),%rsi c6d: 81 e1 ff ff 3f 00 and $0x3fffff,%ecx c73: 8d 7a 01 lea 0x1(%rdx),%edi c76: 41 89 7e 10 mov %edi,0x10(%r14) c7a: 89 0c 96 mov %ecx,(%rsi,%rdx,4) ..etc.. After: *cs++ = MI_LOAD_REGISTER_REG | MI_LRR_DST_CS_MMIO; c52: 41 c7 04 24 01 00 08 movl $0x15080001,(%r12) c59: 15 *cs++ = entry->reg.addr; c5a: 8b 10 mov (%rax),%edx ..etc.. Resulting in the following binary change: add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-348 (-348) Function old new delta xe_gt_record_default_lrcs.cold 304 296 -8 xe_gt_record_default_lrcs 2200 1860 -340 Total: Before=13554, After=13206, chg -2.57% Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-7-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe/gt: Drop third submission for default contextLucas De Marchi1-8/+0
There's no need to submit the nop job again on the first queue. Any state needed is already saved when the first LRC is switched out. The comment is a little misleading regarding indirect W/A: first of all there's still no indirect W/A enabled and secondly, even after they are, there's no need to submit this job again for having their state propagated: the indirect W/A will actually run on every LRC switch. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-6-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe/lrc: Remove leftover TODO/FIXMELucas De Marchi1-6/+0
There isn't anything to set for CTX_TIMESTAMP handling in the empty LRC: that is set on every LRC init since it should always start from 0 rather than the value saved in the image after first submission. The FIXME about perma-pinning also doesn't make much sense as we will always going to pin the lrc and the GGTT mapping has nothing to do with VM bind. Nuke these leftover comments. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-5-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe/gt: Extract emit_job_sync()Lucas De Marchi1-32/+22
Both the nop and wa jobs are going through the same boiler plate calls to emit the job with a timeout and handling error for both bb and job. Extract emit_job_sync() so those functions create the bb, handling possible errors and delegate the part about really emitting the job and waiting for its completion. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-3-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Count dwords before allocatingLucas De Marchi2-15/+25
The bb allocation in emit_wa_job() is wrong in 2 ways: first it's allocating enough space for the 3DSTATE or hardcoding 4k depending on the engine. In the first case it doesn't account for the WAs and in the former it may not be sufficient. Secondly it's using the size instead of number of dwords, causing the buffer to be 4x bigger than needed: xe_bb_new() receives number of dwords as parameter and its declaration was also not following its implementation. Lastly, reword the debug message since it's not only about the LRC WAs anymore as it also include the 3DSTATE for render. While it's unlikely this is causing any real issue, let's calculate the needed space and allocate just enough. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-2-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe/lrc: Reduce scope of empty lrc dataLucas De Marchi1-11/+11
The only case in which new lrc data is created from scratch is when it's called prior to recording the default lrc. There's no need to check for NULL init_data since in that case the function already failed: just move the allocation where it's needed. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710-lrc-refactors-v2-1-a5e2ca03f6bd@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe/vf: Store negotiated VF/PF ABI version at device levelMichal Wajdeczko3-24/+30
There is no need to maintain PF ABI version on per-GT level. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-8-michal.wajdeczko@intel.com
2025-07-14drm/xe/pf: Stop requiring VF/PF version negotiation on every GTMichal Wajdeczko12-403/+548
While some VF/PF relay actions must be handled on the GT level, like query for runtime registers, it was clarified by the arch team that initial version negotiation can be done by the VF just once, by using any available GuC/GT. Move handling of the VF/PF ABI version negotiation on the PF side from the GT level functions to the device level functions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-7-michal.wajdeczko@intel.com
2025-07-14drm/xe/pf: Expose basic info about VFs in debugfsMichal Wajdeczko3-0/+53
We already have function to print summary about VFs, but we missed to add debugfs attribute to make it visible. Do it now. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-6-michal.wajdeczko@intel.com
2025-07-14drm/xe: Introduce xe_gt_is_main_type helperMichal Wajdeczko10-34/+39
Instead of checking for not being a media type GT provide a small helper to explicitly express our intentions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-5-michal.wajdeczko@intel.com
2025-07-14drm/xe: Introduce xe_tile_is_root helperMichal Wajdeczko3-2/+10
Instead of looking at the tile->id member provide a small helper to explicitly express our intentions. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-4-michal.wajdeczko@intel.com
2025-07-14drm/xe: Move PF and VF device types to separate headersMichal Wajdeczko4-36/+58
We plan to add more PF and VF types and mixing them in a single file is not desired. Move them out to new dedicated files. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-3-michal.wajdeczko@intel.com
2025-07-14drm/xe: Combine PF and VF device data into unionMichal Wajdeczko1-4/+6
There is no need to keep PF and VF data fields fully separate since we can be only in one mode at the time. Move them into a anonymous union to save few bytes. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250713103625.1964-2-michal.wajdeczko@intel.com
2025-07-14drm/xe: Update register definitions in LRC layout headerXin Wang2-4/+3
Update the register definitions in xe_lrc_layout.h to align with the official hardware specification (Bspec) terminology. Specifically: - rename PVC_CTX_ACC_CTR_THOLD to CTX_ACC_CTR_THOLD - rename PVC_CTX_ASID to CTX_ASID Signed-off-by: Xin Wang <x.wang@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711060924.7373-1-x.wang@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Add plumbing for indirect context workaroundsTvrtko Ursulin3-3/+89
Some upcoming workarounds need to be emitted from the indirect workaround context so lets add some plumbing where they will be able to easily slot in. No functional changes for now since everything is still deactivated. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Bspec: 45954 Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-7-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Allow specifying number of extra dwords at the end of wa bb emissionTvrtko Ursulin1-3/+5
Indirect context setup will need more than one. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-6-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Track number of written dwords from workaround batch buffer emissionTvrtko Ursulin1-1/+4
Indirect context setup will need to get to the number of written dwords. Lets add it as an output parameter so it can be accessed from the finish helper regardless of whether code is writing directly or via an shadow buffer. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-5-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Rename utilization workaround emission functionTvrtko Ursulin1-3/+5
Lucas suggested to consolidate to a slightly different naming scheme which will align with the upcoming additions better. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-4-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Pass wa bb setup arguments in a structTvrtko Ursulin1-40/+53
Group the function arguments in a struct for more readable code and easier extending. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-3-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Generalize wa bb emission codeTvrtko Ursulin1-22/+48
Generalize the wa bb emission by splitting it into three phases - setup, emit and finish, and extract setup and finish steps into helpers. This will enable using the same infrastructure for emitting the indirect context workarounds. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250711160153.49833-2-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Fix missing kernel-docLucas De Marchi1-0/+1
Fix warning: Warning: drivers/gpu/drm/xe/xe_device_types.h:658 struct member 'wa_active' not described in 'xe_device' Fixes: 661a6950e061 ("drm/xe: Add infrastructure for Device OOB workarounds") Cc: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Jonathan Cavitt <joanthan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250711214911.2009714-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-14drm/xe: Remove unused functionsDr. David Alan Gilbert6-51/+0
xe_bo_create_from_data() last use was removed in 2023 by commit 0e1a47fcabc8 ("drm/xe: Add a helper for DRM device-lifetime BO create") xe_rtp_match_first_gslice_fused_off() last use was removed in 2023 by commit 4e124151fcfc ("drm/xe/dg2: Drop pre-production workarounds") Remove them, and xe_dss_mask_empty whose last use was by xe_rtp_match_first_gslice_fused_off(). (Xe has a bunch ofother symbols that have been added but not used, given how new it is, I've left those, as opposed to these that had the code that used them removed). Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20250713152531.219326-1-linux@treblig.org Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe: Normalize default param valuesLucas De Marchi1-12/+23
Document xe module params with the default values following a similar strategy for all of them: 1) Define a DEFAULT_* macro with the default value. When the value can't be directly stringified, also define a *_STR variant 2) Use __stringify() or the _STR variant to make sure the default value shows up in the param description This allows us to show the correct default according to the configuration. max_vfs for example was wrongly documented for CONFIG_DRM_XE_DEBUG and svm_notifier_size didn't have its default documented. Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250626-guc-log-level-v3-1-c3ed8b452e91@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/migrate: Fix alignment checkLucas De Marchi1-2/+2
The check would fail if the address is unaligned, but not when accounting the offset. Instead of `buf | offset` it should have been `buf + offset`. To make it more readable and also drop the uintptr_t, just use the IS_ALIGNED() macro. Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250710-migrate-aligned-v1-1-44003ef3c078@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe: Remove references to CONFIG_DRM_XE_DEVMEM_MIRRORMatthew Brost1-2/+2
The prefetch code was referencing CONFIG_DRM_XE_DEVMEM_MIRROR, which has been replaced by CONFIG_DRM_XE_PAGEMAP. As a result, prefetches were limited to SRAM. Update the code to use CONFIG_DRM_XE_PAGEMAP instead of the deprecated option. Fixes: f86ad0ed620c ("drm/gpusvm, drm/pagemap: Move migration functionality to drm_pagemap") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250710205413.1105595-1-matthew.brost@intel.com
2025-07-11drm/xe: Move page fault init after topology initMatthew Brost1-3/+3
We need the topology to determine GT page fault queue size, move page fault init after topology init. Cc: stable@vger.kernel.org Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250710191208.1040215-1-matthew.brost@intel.com
2025-07-11drm/xe/migrate: fix copy direction in access_memoryMatthew Auld1-1/+1
After we do the modification on the host side, ensure we write the result back to VRAM and not the other way around, otherwise the modification will be lost if treated like a read. Fixes: 270172f64b11 ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250710134128.800756-2-matthew.auld@intel.com
2025-07-11drm/xe: Dont skip TLB invalidations on VFTejas Upadhyay1-12/+10
Skipping TLB invalidations on VF causing unrecoverable faults. Probable reason for skipping TLB invalidations on SRIOV could be lack of support for instruction MI_FLUSH_DW_STORE_INDEX. Add back TLB flush with some additional handling. Helps in resolving, [ 704.913454] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] ASID: 0 VFID: 0 PDATA: 0x0d92 Faulted Address: 0x0000000002fa0000 FaultType: 0 AccessType: 1 FaultLevel: 0 EngineClass: 3 bcs EngineInstance: 8 [ 704.913551] xe 0000:00:02.1: [drm:pf_queue_work_func [xe]] Fault response: Unsuccessful -22 V2: - Use Xmas tree (MichalW) Suggested-by: Matthew Brost <matthew.brost@intel.com> Fixes: 97515d0b3ed92 ("drm/xe/vf: Don't emit access to Global HWSP if VF") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250710045945.1023840-1-tejas.upadhyay@intel.com Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
2025-07-11Merge tag 'drm-xe-next-2025-07-10' of ↵Simona Vetter84-755/+2065
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Documentation fixes (Shuicheng) Cross-subsystem Changes: - MTD intel-dg driver for dgfx non-volatile memory device (Sasha) - i2c: designware changes to allow i2c integration with BMG (Heikki) Core Changes: - Restructure migration in preparation for multi-device (Brost, Thomas) - Expose fan control and voltage regulator version on sysfs (Raag) Driver Changes: - Add WildCat Lake support (Roper) - Add aux bus child device driver for NVM on DGFX (Sasha) - Some refactor and fixes to allow cleaner BMG w/a (Lucas, Maarten, Auld) - BMG w/a (Vinay) - Improve handling of aborted probe (Michal) - Do not wedge device on killed exec queues (Brost) - Init changes for flicker-free boot (Maarten) - Fix out-of-bounds field write in MI_STORE_DATA_IMM (Jia) - Enable the GuC Dynamic Inhibit Context Switch optimization (Daniele) - Drop bo->size (Brost) - Builds and KConfig fixes (Harry, Maarten) - Consolidate LRC offset calculations (Tvrtko) - Fix potential leak in hw_engine_group (Michal) - Future-proof for multi-tile + multi-GT cases (Roper) - Validate gt in pmu event (Riana) - SRIOV PF: Clear all LMTT pages on alloc (Michal) - Allocate PF queue size on pow2 boundary (Brost) - SRIOV VF: Make multi-GT migration less error prone (Tomasz) - Revert indirect ring state patch to fix random LRC context switches failures (Brost) - Fix compressed VRAM handling (Auld) - Add one additional BMG PCI ID (Ravi) - Recommend GuC v70.46.2 for BMG, LNL, DG2 (Julia) - Add GuC and HuC to PTL (Daniele) - Drop PTL force_probe requirement (Atwood) - Fix error flow in display suspend (Shuicheng) - Disable GuC communication on hardware initialization error (Zhanjun) - Devcoredump fixes and clean up (Shuicheng) - SRIOV PF: Downgrade some info to debug (Michal) - Don't allocate temporary GuC policies object (Michal) - Support for I2C attached MCUs (Heikki, Raag, Riana) - Add GPU memory bo trace points (Juston) - SRIOV VF: Skip some W/a (Michal) - Correct comment of xe_pm_set_vram_threshold (Shuicheng) - Cancel ongoing H2G requests when stopping CT (Michal) Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/aHA7184UnWlONORU@intel.com
2025-07-11drm/xe/guc: Default log level to non-verboseLucas De Marchi1-1/+1
Currently xe sets the guc log level to a verbose level since it's useful to debug hangs and general development. However the verbose level may already be too much and affect performance. Michal Mrozek did some tests with the L0 compute stack for submission latency with ULLS disabled. Below are the normalized numbers with log level 3 (the current default) as baseline for each test: Test \ Log Level 3 0 1 2 ----------------------------------------------------------- ------ ------ ------ ------ BestWalkerNthCommandListSubmission(CmdListCount=2) 1.00 0.63 0.63 0.96 BestWalkerNthSubmission(KernelCount=2) 1.00 0.62 0.63 0.96 BestWalkerNthSubmissionImmediate(KernelCount=2) 1.00 0.58 0.58 0.85 BestWalkerSubmission 1.00 0.62 0.62 0.96 BestWalkerSubmissionImmediate 1.00 0.63 0.62 0.96 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=2) 1.00 0.58 0.58 0.86 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=4) 1.00 0.70 0.70 0.83 BestWalkerSubmissionImmediateMultiCmdlists(cmdlistCount=8) 1.00 0.53 0.52 0.78 Log level 2 is the first "verbose level" for GuC, where the biggest difference happens. Keep log level 3 for CONFIG_DRM_XE_DEBUG, but switch to 1, i.e. GUC_LOG_LEVEL_NON_VERBOSE, for "normal" builds. Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250613-guc-log-level-v2-1-cb84a63e49fe@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit a37128ba613ad6a5f81f382fa3cfe5c4a6527310) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/bmg: Don't use WA 16023588340 and 22019338487 on VFMichal Wajdeczko1-2/+2
These workarounds are not applicable for use by the VFs. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Link: https://lore.kernel.org/r/20250710103040.375610-2-jakub1.kolakowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 1d2e2503e506ddc499cbb7afdc8b70bcf6fe241f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/guc: Recommend GuC v70.46.2 for BMG, LNL, DG2Julia Filipchuk1-3/+3
UAPI compatibility version 1.22.2 Resolves various bugs. Recommend newer version. Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250626182805.1701096-13-daniele.ceraolospurio@intel.com (cherry picked from commit 0b64addcae7f04745bc5f62d41e27268052f812e) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/pm: Correct comment of xe_pm_set_vram_threshold()Shuicheng Lin1-3/+5
The parameter threshold is with size in MiB, not in bits. Correct it to avoid any confusion. v2: s/mb/MiB, s/vram/VRAM, fix return section. (Michal) Fixes: 30c399529f4c ("drm/xe: Document Xe PM component") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250708021450.3602087-2-shuicheng.lin@intel.com Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 0efec0500117947f924e5ac83be40f96378af85a) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe: Release runtime pm for error path of xe_devcoredump_read()Shuicheng Lin1-10/+28
xe_pm_runtime_put() is missed to be called for the error path in xe_devcoredump_read(). Add function description comments for xe_devcoredump_read() to help understand it. v2: more detail function comments and refine goto logic (Matt) Fixes: c4a2e5f865b7 ("drm/xe: Add devcoredump chunking") Cc: stable@vger.kernel.org Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250707004911.3502904-6-shuicheng.lin@intel.com (cherry picked from commit 017ef1228d735965419ff118fe1b89089e772c42) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/pm: Restore display pm if there is error after display suspendShuicheng Lin1-2/+1
xe_bo_evict_all() is called after xe_display_pm_suspend(). So if there is error with xe_bo_evict_all(), display pm should be restored. Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support") Fixes: cb8f81c17531 ("drm/xe/display: Make display suspend/resume work on discrete") Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://lore.kernel.org/r/20250708035424.3608190-2-shuicheng.lin@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 83dcee17855c4e5af037ae3262809036de127903) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe/sriov: Mark BMG as SR-IOV capableMichal Wajdeczko1-0/+1
Enable SR-IOV support for BMG platforms. Note that as other flags from the platform descriptor, it only means it may have that capability: it still depends on runtime checks for the proper support in HW and firmware. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Tested-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Signed-off-by: Jakub Kolakowski <jakub1.kolakowski@intel.com> Reviewed-by: Satyanarayana K V P <satyanarayana.k.v.p@intel.com> Link: https://lore.kernel.org/r/20250710103040.375610-3-jakub1.kolakowski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe: extend Wa_15015404425 to apply to PTLMatt Atwood4-0/+13
Wa_15015404425 only needs to be applied on PTL platforms with an A step compute die. There is no way to map PCI revid to the compute die stepping. The easiest way to figure out compute die stepping our end is to map the media IP's stepping to the compute die. For PTL, compute die has an A stepping if and only if the media IP's stepping is also A-step (This relationship is determined on a per platform basis and just happens to be this way on PTL). In addition this workaround is a chicken-and-egg problem. Wa_15015404425 requires that all register reads be preceded by four dummy MMIO writes (including during early driver init and even pre-OS firmware). The driver needs to perform some MMIO reads during init which include the GMD_ID register that contains the Media IPs stepping. To handle this in the safest manner assume the workaround applies to all of PTL during driver probe and deactivate the workaround after. The overall solution becomes a set of two workarounds: * 15015404425 - a Device OOB workaround that's always active for PTL * 15015404425_disable - a GT OOB workaround that applies to PTL platfroms with a B0 or later stepping The first of these workarounds issues dummy MMIO writes we do when reading registers. The second guards logic that disables the first once we have the necessary information later in the probe process. v2: rename SoC to device, avoid null pointer dereference, update commit message. v3: rebase v5: move disable check into xe_device_probe to avoid linking in xe_wa into xe_pci, reword commit message v6: squash extension and b0 support into 1 patch Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://lore.kernel.org/r/20250709221605.172516-7-matthew.s.atwood@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-07-11drm/xe: Move Wa_15015404425 to use the new XE_DEVICE_WA macroMatt Atwood2-4/+5
Move Wa_15015404425 to use the new implemented OOB macro XE_DEVICE_WA() v2: rename from SoC to Device v5: move workaround call back into the flush call v6: remove redundant commenting Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://lore.kernel.org/r/20250709221605.172516-6-matthew.s.atwood@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>