Age | Commit message | Author | Files | Lines
2024-07-09drm/xe/gt: Remove double includeLucas De Marchi1-1/+0
The header generated/xe_wa_oob.h is included twice. Remove one. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09drm/xe/xe2lpg: Extend workaround 14021402888Bommu Krishnaiah1-0/+4
workaround 14021402888 also applies to Xe2_LPG. Replicate the existing entry to one specific for Xe2_LPG. Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703090754.1323647-1-krishnaiah.bommu@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09drm/xe: Drop trace_xe_hw_fence_freeMatthew Brost2-6/+0
fence->ctx may be stale memory when trace_xe_hw_fence_free is called, resulting in a UAF bug when deriving the device name. This tracepoint is not all that useful, so just drop it. Fixes: 501c4255c409 ("drm/xe/trace: Print device_id in xe_trace events") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com
2024-07-08drm/xe/xe2lpm: Extend Wa_16021639441Ngai-Mint Kwan1-0/+10
Wa_16021639441 applies to Xe2_LPM. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701184637.531794-1-ngai-mint.kwan@linux.intel.com
2024-07-06drm/xe: Use write-back caching mode for system memory on DGFXThomas Hellström3-21/+37
The caching mode for buffer objects with VRAM as a possible placement was forced to write-combined, regardless of placement. However, write-combined system memory is expensive to allocate and even though it is pooled, the pool is expensive to shrink, since it involves global CPU TLB flushes. Moreover write-combined system memory from TTM is only reliably available on x86 and DGFX doesn't have an x86 restriction. So regardless of the cpu caching mode selected for a bo, internally use write-back caching mode for system memory on DGFX. Coherency is maintained, but user-space clients may perceive a difference in cpu access speeds. v2: - Update RB- and Ack tags. - Rephrase wording in xe_drm.h (Matt Roper) v3: - Really rephrase wording. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Cc: Pallavi Mishra <pallavi.mishra@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: dri-devel@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Effie Yu <effie.yu@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jose Souza <jose.souza@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: Matthew Auld <matthew.auld@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Effie Yu <effie.yu@intel.com> #On chat Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
2024-07-05drm/i915: disable fbc due to Wa_16023588340Matthew Auld4-1/+33
On BMG-G21 we need to disable fbc due to complications around the WA. v2: - Try to handle with i915_drv.h and compat layer. (Rodrigo) v3: - For simplicity retreat back to the original design for now. - Drop the extra \ from the Makefile (Jani) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-gfx@lists.freedesktop.org Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-4-matthew.auld@intel.com
2024-07-05drm/xe/bmg: implement Wa_16023588340Matthew Auld9-1/+117
This involves enabling l2 caching of host side memory access to VRAM through the CPU BAR. The main fallout here is with display since VRAM writes from CPU can now be cached in GPU l2, and display is never coherent with caches, so needs various manual flushing. In the case of fbc we disable it due to complications in getting this to work correctly (in a later patch). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
2024-07-04drm/xe: Use VF_CAP_REG for device wmbMichal Wajdeczko1-1/+10
To force a write barrier on the device memory, we write to the SOFTWARE_FLAGS_SPR33 register, but this particular register was selected only because it was one of the writable and unused registers. Since a write barrier should also work if we use a read-only register, switch to the VF_CAP_REG register, which is also marked as accessible for VFs. While at it, add simple kernel-doc for the xe_device_wmb() function. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-4-michal.wajdeczko@intel.com
2024-07-04drm/xe: Kill regs/xe_sriov_regs.hMichal Wajdeczko6-26/+15
There is no real benefit to maintain a separate file. The register definitions related to SR-IOV can be placed in existing headers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-3-michal.wajdeczko@intel.com
2024-07-04drm/xe: Fix register definition order in xe_regs.hMichal Wajdeczko1-3/+3
Swap XEHP_CLOCK_GATE_DIS(0x101014) with GU_DEBUG(0x101018). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-2-michal.wajdeczko@intel.com
2024-07-04drm/xe: Add VM bind IOCTL error injectionMatthew Brost4-1/+61
Add VM bind IOCTL error injection which steals MSB of the bind flags field which if set injects errors at various points in the VM bind IOCTL. Intended to validate error paths. Enabled by CONFIG_DRM_XE_DEBUG. v4: - Change define layout (Jonathan) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-8-matthew.brost@intel.com
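The idea of stealing the MSB of a flags field to request error injection can be sketched in plain C. This is a minimal userspace illustration, not the driver's code: `BIND_FLAG_FORCE_ERROR`, the `inject_point` values, and `do_bind()` are hypothetical names, and the choice of `-ENOMEM` as the injected error is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: the top bit of a 32-bit flags word requests error injection. */
#define BIND_FLAG_FORCE_ERROR (UINT32_C(1) << 31)

/* Hypothetical injection points along a bind path (not the driver's names). */
enum inject_point { INJECT_LOCK, INJECT_PREPARE, INJECT_EXECUTE, INJECT_COUNT };

/* One step of the bind path; fails only when injection is requested here. */
static int bind_step(uint32_t flags, enum inject_point at,
		     enum inject_point fail_at)
{
	if ((flags & BIND_FLAG_FORCE_ERROR) && at == fail_at)
		return -ENOMEM; /* assumed error code for the sketch */
	return 0;
}

/* Walk all steps and propagate the first error to the caller. */
int do_bind(uint32_t flags, enum inject_point fail_at)
{
	for (int at = 0; at < INJECT_COUNT; at++) {
		int err = bind_step(flags, (enum inject_point)at, fail_at);

		if (err)
			return err;
	}
	return 0;
}
```

A test can then set the bit and check that each error path unwinds cleanly, while ordinary callers that leave the bit clear are unaffected.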
2024-07-04drm/xe: Update PT layer with better error handlingMatthew Brost1-65/+167
Update PT layer so if a memory allocation for a PTE fails the error can be propagated to the user without requiring the VM to be killed. v5: - change return value invalidation_fence_init to void (Matthew Auld) v7: - Invert i,j usage in two places (Matthew Auld) - s/0/NULL (Matthew Auld) - Don't ignore return value of xe_pt_new_shared (Matthew Auld) - Don't check for NULL in xe_pt_entry (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-7-matthew.brost@intel.com
2024-07-04drm/xe: Update VM trace eventsMatthew Brost2-7/+45
The trace events have changed with the move to a single job per VM bind IOCTL. Update the trace events to align with the old behavior as much as possible. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-6-matthew.brost@intel.com
2024-07-04drm/xe: Convert multiple bind ops into single jobMatthew Brost10-1025/+1063
This aligns with the uAPI, in which an array of binds, or a single bind that results in multiple GPUVA ops, is considered a single atomic operation. The design is roughly: - xe_vma_ops is a list of xe_vma_op (GPUVA op) - each xe_vma_op resolves to 0-3 PT ops - xe_vma_ops creates a single job - if at any point during binding a failure occurs, xe_vma_ops contains the information necessary to unwind the PT and VMA (GPUVA) state v2: - add missing dma-resv slot reservation (CI, testing) v4: - Fix TLB invalidation (Paulo) - Add missing xe_sched_job_last_fence_add/test_dep check (Inspection) v5: - Invert i, j usage (Matthew Auld) - Add helper to test and add job dep (Matthew Auld) - Return on anything but -ETIME for cpu bind (Matthew Auld) - Return -ENOBUFS if suballoc of BB fails due to size (Matthew Auld) - s/do/Do (Matthew Auld) - Add missing comma (Matthew Auld) - Do not assign return value to xe_range_fence_insert (Matthew Auld) v6: - s/0x1ff/MAX_PTE_PER_SDI (Matthew Auld, CI) - Check for too large an SA in Xe to avoid triggering WARN (Matthew Auld) - Fix checkpatch issues v7: - Rebase - Support more than 510 PTE updates in a bind job (Paulo, mesa testing) v8: - Rebase Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-5-matthew.brost@intel.com
2024-07-04drm/xe: Add xe_exec_queue_last_fence_test_depMatthew Brost2-0/+25
Helpful to determine if a bind can immediately use the CPU or needs to be deferred to a drm scheduler job. v7: - Better wording in kernel doc (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-4-matthew.brost@intel.com
2024-07-04drm/xe: Add xe_vm_pgtable_update_op to xe_vma_opsMatthew Brost3-2/+84
Each xe_vma_op resolves to 0-3 pt_ops. Add storage for the pt_ops to xe_vma_ops, dynamically allocated based on the number and types of xe_vma_op in the xe_vma_ops list. Only the allocation is implemented in this patch. This will help with converting xe_vma_ops (multiple xe_vma_op) into an atomic update unit. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-3-matthew.brost@intel.com
2024-07-04drm/xe: s/xe_tile_migrate_engine/xe_tile_migrate_exec_queueMatthew Brost2-6/+5
Engine is old nomenclature, replace with exec queue. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-2-matthew.brost@intel.com
2024-07-04drm/xe/uapi: Rename xe perf layer as xe observation layerAshutosh Dixit11-187/+190
In Xe, the perf layer allows capture of HW counter streams. These HW counters are generally performance related but don't have to be necessarily so. Also, the name "perf" is a carryover from i915 and is not preferred. Here we propose the name "observation" for this common layer which allows capture of different types of these counter streams. v2: Rename observability layer to observation layer (Lucas/Rodrigo) v3: Rename sysctl file to "observation_paranoid" (Jose) Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types") Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl") Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com
2024-07-04drm/xe: Add timeout to preempt fencesMatthew Brost5-13/+59
To adhere to dma fencing rules that fences must signal within a reasonable amount of time, add a 5 second timeout to preempt fences. If this timeout occurs, kill the associated VM as this is fatal to the VM. v2: - Add comment for smp_wmb (Checkpatch) - Fix kernel doc typo (Inspection) - Add comment for killed check (Niranjana) v3: - Drop smp_wmb (Matthew Auld) - Don't take vm->lock in preempt fence worker (Matthew Auld) - Drop RB given changes to patch v4: - Add WRITE/READ_ONCE (Niranjana) - Don't export xe_vm_kill (Niranjana) Cc: Matthew Auld <matthew.auld@intel.com> Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Tested-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626004137.4060806-1-matthew.brost@intel.com
2024-07-02drm/xe/guc: Demote GuC IDs usage message to debugMichal Wajdeczko1-2/+2
Printing message at INFO level about available GuC IDs is not that important, DEBUG level is enough. It will also match message about available doorbells: [ ] xe ... [drm:xe_guc_id_mgr_init [xe]] GT0: using 65535 GuC IDs [ ] xe ... [drm:xe_guc_db_mgr_init [xe]] GT0: using 256 doorbells While at it, use proper "GuC" name. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701193030.978-1-michal.wajdeczko@intel.com
2024-07-02drm/xe/bmg: Apply Wa_22019338487Vinay Belgaumkar4-8/+18
Extend this WA to BMG GT as well. In this case media GT is not affected. The cap frequencies and max allowed ggtt writes are different as well. On BMG, we need to do a flush after 1100 GGTT writes, and we need to limit the GT frequency request to 2133 Mhz during driver load and leave it at that value after driver unloads. v3: Fix checkpatch issue Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701231529.2582452-2-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
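The "flush after N GGTT writes" rule amounts to a small counter wrapped around the write path. The sketch below is a userspace illustration only: `struct ggtt_wa_state` and `ggtt_write_tracked()` are hypothetical names, the actual PTE write and flush are left as comments, and only the 1100-write threshold comes from the commit message.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for the workaround (not the driver's structure). */
struct ggtt_wa_state {
	uint32_t writes_since_flush; /* GGTT writes since the last flush */
	uint32_t flush_threshold;    /* 1100 on BMG per the commit message */
	uint32_t flushes;            /* number of flushes issued so far */
};

/*
 * Account for one GGTT write. Returns 1 when the write hit the threshold
 * and a flush was issued, 0 otherwise.
 */
int ggtt_write_tracked(struct ggtt_wa_state *s)
{
	/* The real driver would perform the PTE write here. */
	if (++s->writes_since_flush < s->flush_threshold)
		return 0;

	/* The real driver would issue the flush here. */
	s->writes_since_flush = 0;
	s->flushes++;
	return 1;
}
```

Keeping the threshold as data rather than a constant is one way to let different platforms use different limits, which matches the commit's note that the values differ on BMG.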
2024-07-02drm/xe/guc: Prevent use of uninitialized mutexVinay Belgaumkar1-0/+4
When skip_guc_pc is set and/or this is for a VF. Fixes: 3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487") Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701231529.2582452-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-01drm/xe/oa: Destroy the stream_lock mutexAshutosh Dixit1-0/+2
The mutex allocated in xe_oa_stream_init() was never previously destroyed. Do so now. Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd") Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628052125.1847989-1-ashutosh.dixit@intel.com
2024-07-01drm/xe/rtp: Fix out-of-bounds array accessLucas De Marchi1-1/+1
Increment the counter before checking for number of rules, otherwise when there's no XE_RTP_MATCH_OR an out-of-bounds access is done, as reported by kasan: BUG: KASAN: global-out-of-bounds in rule_matches+0xb6d/0x11c0 [xe] Read of size 1 at addr ffffffffa0a50b70 by task systemd-udevd/243 Fixes: dc72c52a42e0 ("drm/xe/rtp: Allow to OR rules") Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628161726.836734-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
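The ordering issue described here is a general C pattern: the index must be advanced and bounds-checked before the array element is read. The following is a minimal sketch of that pattern with hypothetical names (`enum rule_type`, `skip_to_next_or()`); it is not the code in the RTP implementation.

```c
#include <stddef.h>

/* Hypothetical rule kinds; RULE_OR separates alternative rule sets. */
enum rule_type { RULE_PLAIN, RULE_OR };

/*
 * Return the index of the next RULE_OR entry after position i, or n if
 * there is none. A loop of the form
 *     while (i < n && rules[++i] != RULE_OR)
 * would read rules[n] when no RULE_OR exists; incrementing first and then
 * checking the bound keeps every read inside the array.
 */
size_t skip_to_next_or(const enum rule_type *rules, size_t n, size_t i)
{
	while (++i < n && rules[i] != RULE_OR)
		;
	return i;
}
```

Because `&&` short-circuits, `rules[i]` is only evaluated once `i < n` has been established.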
2024-07-01drm/xe/pf: Restart VFs provisioning after GT resetMichal Wajdeczko5-0/+52
Any prior configurations pushed to the GuC are lost when the GT is reset. Push again all non-empty VF configurations to the GuC as part of the GuC reset procedure. This will also help restore early manual provisioning, when the PF was in the meantime suspended and then resumed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-3-michal.wajdeczko@intel.com
2024-07-01drm/xe/pf: Skip fair VFs provisioning if already provisionedMichal Wajdeczko3-0/+63
Our debugfs allows to view and change VFs' provisioning configs. If we attempt to experiment with VFs provisioning before enabling them, this early config will affect fair provisioning calculations, and will also be overwritten, which is undesirable behavior. To improve this, check if the VFs configs are empty (unprovisioned) before starting the fair provisioning procedure. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-2-michal.wajdeczko@intel.com
2024-07-01drm/xe/pf: Remove inlined #ifdef CONFIG_PCI_IOVMichal Wajdeczko2-4/+7
We can remove #ifdef CONFIG_PCI_IOV in .c files if we provide dummy replacement of the xe_pci_sriov_configure() function. Suggested-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627104305.1477-1-michal.wajdeczko@intel.com
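The stub technique mentioned here is commonly written as a header-level `static inline` fallback, so that callers need no conditional compilation. A minimal sketch follows; `my_sriov_configure()` and `configure_vfs()` are hypothetical stand-ins, and returning 0 from the stub is an assumption for illustration.

```c
/* Header part: one declaration when the feature is built, a stub otherwise. */
#ifdef CONFIG_PCI_IOV
int my_sriov_configure(int num_vfs);
#else
/* Dummy replacement: compiles away and reports success (assumed). */
static inline int my_sriov_configure(int num_vfs)
{
	(void)num_vfs;
	return 0;
}
#endif

/* .c part: the caller is identical in both configurations, no #ifdef. */
int configure_vfs(int num_vfs)
{
	return my_sriov_configure(num_vfs);
}
```

The conditional lives in exactly one place, which keeps the .c files readable and lets the compiler type-check the call in every configuration.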
2024-07-01drm/xe/guc: Configure TLB timeout based on CT buffer sizeNirmoy Das3-8/+41
GuC TLB invalidation depends on GuC to process the request from the CT queue and then the real time to invalidate TLB. Add a function to return overestimated possible time a TLB inval H2G might take which can be used as timeout value for TLB invalidation wait time. v4: Make sure CTB is in 4K blocks(Michal) and other doc fixes v3: Pass CT to xe_guc_ct_queue_proc_time_jiffies() (Michal) Add tlb_timeout_jiffies() that replaces TLB_TIMEOUT(Michal) v2: Address reviews from Michal. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1622 Cc: Matthew Brost <matthew.brost@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240628085845.2369-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
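One way to picture the overestimate is as the time to drain a full H2G buffer plus the time for the invalidation itself. The sketch below only illustrates that arithmetic: the function names, the per-block processing time, and the hardware invalidation time are all hypothetical parameters, not values taken from the driver.

```c
#include <stdint.h>

#define CT_BLOCK_SIZE 4096u /* the commit notes the CTB is sized in 4K blocks */

/* Assumed worst case: every 4K block queued ahead of us takes ms_per_block. */
uint32_t ct_queue_proc_time_ms(uint32_t h2g_buf_bytes, uint32_t ms_per_block)
{
	return (h2g_buf_bytes / CT_BLOCK_SIZE) * ms_per_block;
}

/* Timeout = time to reach the request in the queue + time to invalidate. */
uint32_t tlb_timeout_ms(uint32_t h2g_buf_bytes, uint32_t ms_per_block,
			uint32_t hw_inval_ms)
{
	return ct_queue_proc_time_ms(h2g_buf_bytes, ms_per_block) + hw_inval_ms;
}
```

Deriving the timeout from the buffer size means a larger CT buffer automatically gets a longer wait, instead of relying on a fixed constant.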
2024-06-29drm/xe/mcr: Avoid clobbering DSS steeringMatt Roper1-3/+3
A couple copy/paste mistakes in the code that selects steering targets for OADDRM and INSTANCE0 unintentionally clobbered the steering target for DSS ranges in some cases. The OADDRM/INSTANCE0 values were also not assigned as intended, although that mistake wound up being harmless since the desired values for those specific ranges were '0' which the kzalloc of the GT structure should have already taken care of implicitly. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626210536.1620176-2-matthew.d.roper@intel.com
2024-06-29drm/xe/mocs: Clarify difference between hw and sw sizesMatt Roper2-31/+39
It's not very obvious what the difference is between the 'size' and 'n_entries' fields of the MOCS structure. Rename both fields slightly and add some comments explaining that one is the documentation-defined table size, while the other is the number of entries that can be programmed into the hardware (and the documented table size can potentially be smaller than the number of hardware entries). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627203741.2042752-4-matthew.d.roper@intel.com
2024-06-29drm/xe/mocs: Update MOCS assertions and remove redundant checksMatt Roper1-12/+2
Rely more heavily on assertions to describe the MOCS programming invariants. CI checks these assertions and will ensure no violations sneak in due to programmer error, so we can remove some of the redundant WARN and silent return checks from non-debug builds. Also tweak/augment some of the existing assertions: there's no reason we'd ever want a platform not to have a MOCS 'ops' structure hooked up so ensure info->ops is non-NULL. Likewise, we should never have a case where the bspec-defined MOCS setting table is larger than the number of MOCS registers exposed by the hardware, so add an extra assert on those sizes as well. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627203741.2042752-3-matthew.d.roper@intel.com
2024-06-28drm/xe: Get hwe domain specific FW to read RING_TIMESTAMPUmesh Nerlige Ramappa3-2/+11
Per client engine utilization uses RING_TIMESTAMP to return drm-total-cycles to the user. Current code uses XE_FW_GT to read this register on the first available engine in a GT. When testing on DG2, it is observed that this value is 0 when running test on some engines. To resolve that, get the hwe domain specific FW for reading the engine timestamp. v2: - update commit message - use domain specific FW (Matt) v3: - Drop check for hwe in the helper (Matt, Michal) v4: - checkpatch fixes v5: Rebase Fixes: 188ced1e0ff8 ("drm/xe/client: Print runtime to fdinfo") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240627235105.2631135-1-umesh.nerlige.ramappa@intel.com
2024-06-27drm/xe/client: Check return value of xe_force_wake_getNirmoy Das1-2/+6
xe_force_wake_get() can return an error, so check its return value before reading the gpu_timestamp value. v2: set HWE to NULL instead of setting timestamp to 0(Lucas) Add a warn on for xe_force_wake_put(Himal) Fixes: 188ced1e0ff8 ("drm/xe/client: Print runtime to fdinfo") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625094228.5327-1-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-06-27drm/xe/hwmon: Remove xe_hwmon_process_regKarthik Poosa1-49/+40
Remove xe_hwmon_process_reg as it is an umbrella function which can be avoided (Lucas). v2: Improve commit message. (Badal) v3: Add couple of comments. (Lucas) Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Suggested-by: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626170746.2926011-2-karthik.poosa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-06-27drm/xe: fix error handling in xe_migrate_update_pgtablesMatthew Auld1-4/+4
Don't call drm_suballoc_free with sa_bo pointing to PTR_ERR. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2120 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620102025.127699-2-matthew.auld@intel.com
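The bug class here is passing an error-encoded pointer to a free routine. The following userspace sketch mimics the kernel's `ERR_PTR`/`IS_ERR`/`PTR_ERR` convention with local re-implementations; `sub_alloc()`, `sub_free()`, and `update_tables()` are hypothetical stand-ins, not the driver's functions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

static int frees; /* counts calls to sub_free(), for demonstration */

/* Hypothetical allocator that reports failure as an error pointer. */
static void *sub_alloc(size_t size, int fail)
{
	void *p = fail ? NULL : malloc(size);

	return p ? p : ERR_PTR(-ENOMEM);
}

static void sub_free(void *p)
{
	frees++;
	free(p);
}

/* Return early on an error pointer; free only a valid allocation. */
long update_tables(int fail)
{
	void *sa = sub_alloc(64, fail);

	if (IS_ERR(sa))
		return PTR_ERR(sa); /* nothing was allocated, nothing to free */

	/* ... use sa ... */
	sub_free(sa);
	return 0;
}
```

The key point is that the error check happens before any cleanup path that would hand the pointer to the free routine.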
2024-06-27drm/xe/oa/uapi: Allow preemption to be disabled on the stream exec queueAshutosh Dixit3-1/+78
Mesa VK_KHR_performance_query use case requires preemption and timeslicing to be disabled for the stream exec queue. Implement this functionality here. v2: Minor change to debug print to print both ret values (Umesh) Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626181817.1516229-3-ashutosh.dixit@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/oa: Allow stream enable/disable functions to return errorAshutosh Dixit1-16/+22
Stream enable/disable functions previously had void return because failure during function execution was not possible. This will change when we introduce functionality to disable preemption on the stream exec queue. Therefore, in preparation for this functionality, prepare this code to be able to handle error returns. Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626181817.1516229-2-ashutosh.dixit@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/pf: Disable VFs on removeMichal Wajdeczko1-0/+5
We shouldn't leave VFs enabled when unloading the PF driver. Otherwise we will get a message like: [ ] xe 0000:4d:00.0: driver left SR-IOV enabled after remove Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240626111827.1389-2-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/irq: remove xe_irq_shutdownIlia Levi3-14/+4
The cleanup is done by devres in irq_uninstall. Commit bbc9651fe9f4 ("drm/xe/irq: move irq_uninstall over to devm") resolved the ordering issue where irq_uninstall (registered with drmm) was called after pci_free_irq_vectors (registered with devm upon calling pci_alloc_irq_vectors). This happened because drmm action list is registered with devm very early in the init flow - before pci_alloc_irq_vectors. Now that irq_uninstall is registered with devm, it will be called before pci_free_irq_vectors and we can remove xe_irq_shutdown. Signed-off-by: Ilia Levi <illevi@habana.ai> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240606124705.822451-1-illevi@habana.ai Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/pf: Trigger explicit FLR while disabling VFsMichal Wajdeczko3-0/+36
We attempt to unprovision all VFs GuC when disabling them, but GuC may reject such a request if the target VF was previously active but the VF driver didn't unload with an explicit VF reset H2G action or the VMM has not started the VF FLR. To avoid mismatches between configs maintained by the PF and GuC, trigger an explicit FLR sequence just before releasing resources. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625194546.1301-2-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/guc: Print GuC error codes as hex valueMichal Wajdeczko1-1/+1
We maintain GuC error code values in hex format. Also print them in that format for easier matching. While at it, slightly reformat the log and add missing \n. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625141258.1257-4-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/guc: Add more GuC error codes to ABIMichal Wajdeczko1-0/+31
There are many more error codes used that the GuC firmware can return in the RESPONSE_FAILURE message. Add to the ABI header those which are more likely to be seen by the PF or VF drivers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625141258.1257-3-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/guc: Demote the H2G retry log message to debugMichal Wajdeczko1-2/+2
The G2H RETRY message sent by the GuC does not necessarily indicate any serious problem and can be a part of the normal communication flow. Switch the log level from warning to the more appropriate debug. This will also let the CI ignore these logs, which were seen in a few SR-IOV scenarios. While at it, use hex to print the reason and add missing \n. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240625141258.1257-2-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/vf: Skip attempt to start GuC PC if VFMichal Wajdeczko1-4/+13
We have already marked the GuC PC feature as not applicable for VF devices, but we missed the fact that there may be still some privileged activities performed by this component, who does much more than its name suggests. Explicitly skip xe_guc_pc_start() if running as a VF driver and use a GT oriented message to report any error. v2: also skip xe_guc_pc_stop (Vinay) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240622094253.1081-1-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/oa: Fix kernel doc in xe_drm.hAshutosh Dixit2-5/+3
Fix kernel doc in xe_drm.h. Also eliminate private/non-abi enum definitions. v2: Remove __DRM_XE_PERF_TYPE_MAX since it is unused (Michal) v3: Also remove DRM_XE_OA_PROPERTY_MAX since it can also be eliminated (Michal) Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240623203119.3840283-1-ashutosh.dixit@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/huc: Use GT oriented error messages in xe_huc.cMichal Wajdeczko1-11/+11
If applicable, we prefer GT oriented dmesg messages. Update all HuC related messages and use more user friendly error codes. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240621172522.1037-1-michal.wajdeczko@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/guc: Request max GT freq during resumeVinay Belgaumkar3-3/+17
We already request max freq in the load path, moving it to __xe_guc_upload will ensure this speeds up GuC load in the resume path as well. v2: Rename xe_guc_pc_init_early since we now call it per GuC load (Michal W) v3: Keep pc_init_early() and init RPx values there (Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620224928.3986377-3-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27drm/xe/lnl: Apply Wa_22019338487Vinay Belgaumkar13-13/+159
This WA requires us to limit media GT frequency requests to a certain cap value during driver load. Freq limits are restored after load completes, so perf will not be affected during normal operations. During normal driver operation, this WA requires dummy writes to media offset 0x380D8C after every ~63 GGTT writes. This will ensure completion of the LMEM writes originating from Gunit. During driver unload(before FLR), the WA requires that we set requested frequency to the cap value again. v3: Do not use WA number in function name. Call WA wrapper from xe_device. Rename some variables, check for locks in the correct function (Rodrigo). Ensure reset path is also covered for this WA. v4: Fix BAT failure v5: Add a function pointer for ggtt_ops (Michal W) v6: Fix name collision and use static function (Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620224928.3986377-2-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-27Merge drm/drm-next into drm-xe-nextRodrigo Vivi616-18095/+27860
Need to sync some header include that propagated through drm-intel-next. v2: After some changes in drm/drm-next Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-06-24agp: add missing MODULE_DESCRIPTION() macrosJeff Johnson5-0/+5
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/amd64-agp.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/intel-agp.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/intel-gtt.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/sis-agp.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/char/agp/via-agp.o Add the missing invocations of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240603-md-agp-v1-1-9a1582114ced@quicinc.com