path: root/drivers/gpu/drm/xe/xe_gt.c
Age | Commit message | Author | Files | Lines
2025-06-19 | drm/xe/bmg: Update Wa_16023588340 | Vinay Belgaumkar | 1 | -1/+1
This allows for additional L2 caching modes. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250612-wa-14022085890-v4-2-94ba5dcc1e30@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 6ab42fa03d4c88a0ddf5e56e62794853b198e7bf) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-05-05 | drm/xe/gsc: do not flush the GSC worker from the reset path | Daniele Ceraolo Spurio | 1 | -1/+1
The workqueue used for the reset worker is marked as WQ_MEM_RECLAIM, while the GSC one isn't (and can't be, as we need to do memory allocations in the GSC worker). Therefore, we can't flush the latter from the former.

The reason why we had such a flush was to avoid interrupting either the GSC FW load or in-progress GSC proxy operations. GSC proxy operations fall into 2 categories:

1) GSC proxy init: this only happens once immediately after GSC FW load and does not support being interrupted. The only way to recover from an interruption of the proxy init is to do an FLR and re-load the GSC.
2) GSC proxy request: this can happen in response to a request that the driver sends to the GSC. If this is interrupted, the GSC FW will time out and the driver request will fail, but overall the GSC will keep working fine.

Flushing the work allowed us to avoid interruption in both cases (unless the hang came from the GSC engine itself, in which case we're toast anyway). However, a failure on a proxy request is tolerable if we're in a scenario where we're triggering a GT reset (i.e., something has already gone pretty wrong), so what we really need to avoid is interrupting the init flow, which we can do by polling on the register that reports when the proxy init is complete (as that assures us that all the load and init operations have completed).

Note that during suspend we still want to do a flush of the worker to make sure it completes any operations involving the HW before the power is cut.

v2: fix spelling in commit msg, rename waiter function (Julia)

Fixes: dd0e89e5edc2 ("drm/xe/gsc: GSC FW load")
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4830
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://lore.kernel.org/r/20250502155104.2201469-1-daniele.ceraolospurio@intel.com
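The idea of replacing a worker flush with a bounded poll on a completion register can be sketched in plain userspace C. This is only an illustration under stated assumptions: `PROXY_INIT_DONE`, `wait_for_proxy_init_done()`, `struct fake_reg` and `fake_read()` are hypothetical names, not the driver's identifiers, and a real kernel implementation would sleep between reads and use a time-based deadline rather than a poll count.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit meaning "proxy init finished" (illustration only). */
#define PROXY_INIT_DONE (1u << 0)

/* Callback that reads the status register; stands in for an MMIO read. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll the status register up to max_polls times. Returns true as soon as
 * the done bit is observed, false if the budget runs out first. No other
 * worker is flushed or waited on, so this is safe to call from a context
 * that must not depend on memory allocation making progress.
 */
static bool wait_for_proxy_init_done(read_reg_fn read_status, void *ctx,
				     unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (read_status(ctx) & PROXY_INIT_DONE)
			return true;
	}
	return false;
}

/* Fake register for demonstration: reports "done" after N earlier reads. */
struct fake_reg {
	unsigned int reads_until_done;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *reg = ctx;

	if (reg->reads_until_done == 0)
		return PROXY_INIT_DONE;
	reg->reads_until_done--;
	return 0;
}
```

With a fake register that completes after three reads, a budget of ten polls succeeds; one that needs fifty reads exhausts the same budget, and the caller can then treat the init as not completed.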
2025-03-12 | drm/xe: Avoid reading RMW registers in emit_wa_job | Michal Wajdeczko | 1 | -21/+63
To allow VFs to properly handle LRC WAs, we should postpone all RMW register operations and let them be run by the engine itself, since any attempt to read registers from within the driver will fail on the VF. Use MI_MATH and ALU for that.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250303173522.1822-4-michal.wajdeczko@intel.com
2025-03-01 | drm/xe: Add performance tunings to debugfs | Tvrtko Ursulin | 1 | -0/+4
Add a list of active tunings to debugfs, analogous to the existing list of workarounds. Rationale being that it seems to make sense to either have both or none. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-6-tvrtko.ursulin@igalia.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-03-01 | drm/xe: Fix GT "for each engine" workarounds | Tvrtko Ursulin | 1 | -2/+2
Any rules using engine matching are currently broken due to RTP processing happening too early in init, before the list of hardware engines has been initialised.

Fix this by moving workaround processing to later in the driver probe sequence, to just before the processed list is used for the first time.

Looking at the debugfs gt0/workarounds on ADL-P we notice 14011060649 should be present while we see, before:

 GT Workarounds
     14011059788
     14015795083

And with the patch:

 GT Workarounds
     14011060649
     14011059788
     14015795083

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: stable@vger.kernel.org # v6.11+
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250227101304.46660-2-tvrtko.ursulin@igalia.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-26 | drm/xe/eustall: Add support to init, enable and disable EU stall sampling | Harish Chegondi | 1 | -0/+5
Implement EU stall sampling APIs introduced in the previous patch for Xe_HPC (PVC). Add register definitions and the code that accesses these registers to the APIs. Add initialization and clean up functions and their implementations, EU stall enable and disable functions. v11: Move stream->xecore_buf alloc to xe_eu_stall_data_buf_alloc(). Register xe_eu_stall_fini() with devm_add_action_or_reset() instead of calling it from xe_gt_fini(). Changed a couple of variables in struct xe_eu_stall_data_stream from unsigned int to int. v10: Fixed error rewinding code Moved code around as per review feedback v9: Moved structure definitions from xe_eu_stall.h to xe_eu_stall.c Moved read and poll implementations to the next patch Used xe_bo_create_pin_map_at_aligned instead of xe_bo_create_pin_map Changed lock names as per review feedback Moved drop data handling into a subsequent patch Moved code around as per review feedback v8: Updated copyright year in xe_eu_stall_regs.h to 2025. Renamed struct drm_xe_eu_stall_data_pvc to struct xe_eu_stall_data_pvc since it is a local structure. v6: Fix buffer wrap around over write bug (Matt Olson) Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Harish Chegondi <harish.chegondi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b6aeca593d521828a0b4fbf6cfd2844716c4fc66.1740533885.git.harish.chegondi@intel.com
2025-02-18 | drm/xe: Add xe_mmio_init() initialization function | Ilia Levi | 1 | -4/+3
Add a convenience function for minimal initialization of struct xe_mmio. This function also validates that the entirety of the provided mmio region is usable with struct xe_reg. v2: Modify commit message, add kernel doc, refactor assert (Michal) v3: Fix off-by-one bug, add clarifying macro (Michal) v4: Derive bitfield width from size (Michal) Signed-off-by: Ilia Levi <ilia.levi@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213093559.204652-1-ilia.levi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 | drm/xe: Cleanup extra calls to xe_hw_fence_irq_finish() | Lucas De Marchi | 1 | -11/+4
Now that xe_gt_remove is handled entirely by xe_gt, it's clear there are some extra calls to xe_hw_fence_irq_finish() that aren't necessary. Neither all_fw_domain_init() nor gt_fw_domain_init() needs to do that since it's handled by the caller on any error.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-8-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-14 | drm/xe: Cleanup unwind of gt initialization | Lucas De Marchi | 1 | -20/+15
The only thing in xe_gt_remove() that really needs to happen on the device remove callback is the xe_uc_remove(). That's because of the following call chain: xe_gt_remove() xe_uc_remove() xe_gsc_remove() xe_gsc_proxy_remove() Move xe_gsc_proxy_remove() to be handled as a xe_device_remove_action, so it's recorded when it should run during device removal. The rest can be handled normally by devm infra. Besides removing the deep call chain above, xe_device_probe() doesn't have to unwind the gt loop and it's also more in line with the xe_device_probe() style. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250213192909.996148-7-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2025-02-04 | drm/xe/vf: Don't try to trigger a full GT reset if VF | Michal Wajdeczko | 1 | -0/+4
VFs don't have access to the GDRST(0x941c) register that the driver uses to reset a GT. An attempt to trigger a reset using debugfs:

 $ cat /sys/kernel/debug/dri/0000:00:02.1/gt0/force_reset

or due to a hang condition detected by the driver leads to:

 [ ] xe 0000:00:02.1: [drm] GT0: trying reset from force_reset [xe]
 [ ] xe 0000:00:02.1: [drm] GT0: reset queued
 [ ] xe 0000:00:02.1: [drm] GT0: reset started
 [ ] ------------[ cut here ]------------
 [ ] xe 0000:00:02.1: [drm] GT0: VF is trying to write 0x1 to an inaccessible register 0x941c+0x0
 [ ] WARNING: CPU: 3 PID: 3069 at drivers/gpu/drm/xe/xe_gt_sriov_vf.c:996 xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] RIP: 0010:xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] Call Trace:
 [ ] <TASK>
 [ ] ? show_regs+0x6c/0x80
 [ ] ? __warn+0x93/0x1c0
 [ ] ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] ? report_bug+0x182/0x1b0
 [ ] ? handle_bug+0x6e/0xb0
 [ ] ? exc_invalid_op+0x18/0x80
 [ ] ? asm_exc_invalid_op+0x1b/0x20
 [ ] ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] ? xe_gt_sriov_vf_write32+0xc6/0x580 [xe]
 [ ] ? xe_gt_tlb_invalidation_reset+0xef/0x110 [xe]
 [ ] ? __mutex_unlock_slowpath+0x41/0x2e0
 [ ] xe_mmio_write32+0x64/0x150 [xe]
 [ ] do_gt_reset+0x2f/0xa0 [xe]
 [ ] gt_reset_worker+0x14e/0x1e0 [xe]
 [ ] process_one_work+0x21c/0x740
 [ ] worker_thread+0x1db/0x3c0

Fix that by sending H2G VF_RESET(0x5507) action instead.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4078
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250131182502.852-1-michal.wajdeczko@intel.com
2025-01-21 | drm/xe/pf: Fix migration initialization | Michal Wajdeczko | 1 | -1/+3
The migration support only needs to be initialized once, but it was incorrectly called from the xe_gt_sriov_pf_init_hw(), which is part of the reset flow and may be called multiple times. Fixes: d86e3737c7ab ("drm/xe/pf: Add functions to save and restore VF GuC state") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250120232443.544-1-michal.wajdeczko@intel.com
2025-01-19 | drm/xe: Always setup GT MMIO adjustment data | Michal Wajdeczko | 1 | -0/+3
While we believed that xe_gt_mmio_init() would be called just once per GT, this might not be the case due to some tweaks that need to be performed by the VF driver during early probe. To avoid leaving any stale data in case of a re-run, reset the GT MMIO adjustment data for the non-media GT case.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241114175955.2299-2-michal.wajdeczko@intel.com
2025-01-03 | drm/xe: Fix tlb invalidation when wedging | Lucas De Marchi | 1 | -4/+4
If GuC fails to load, the driver wedges, but in the process it tries to do stuff that may not be initialized yet. This moves the xe_gt_tlb_invalidation_init() to be done earlier: as its own doc says, it's a software-only initialization and should have been named with the _early() suffix.

Move it to be called by xe_gt_init_early(), so the locks and seqno are initialized, avoiding a NULL ptr deref when wedging:

 xe 0000:03:00.0: [drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01
 xe 0000:03:00.0: [drm] *ERROR* GT0: firmware signature verification failed
 xe 0000:03:00.0: [drm] *ERROR* CRITICAL: Xe has declared device 0000:03:00.0 as wedged.
 ...
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 9 UID: 0 PID: 3908 Comm: modprobe Tainted: G U W 6.13.0-rc4-xe+ #3
 Tainted: [U]=USER, [W]=WARN
 Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-S ADP-S DDR5 UDIMM CRB, BIOS ADLSFWI1.R00.3275.A00.2207010640 07/01/2022
 RIP: 0010:xe_gt_tlb_invalidation_reset+0x75/0x110 [xe]

This can be easily triggered by poking the GuC binary to force a signature failure. There will still be an extra message,

 xe 0000:03:00.0: [drm] *ERROR* GT0: GuC mmio request 0x4100: no reply 0x4100

but that's better than a NULL ptr deref.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3956
Fixes: 7dbe8af13c18 ("drm/xe: Wedge the entire device")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250103001111.331684-2-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-12-11 | drm/xe: Apply whitelist to engine save-restore | Lucas De Marchi | 1 | -3/+1
Instead of handling the whitelist directly in the GuC ADS initialization, make it follow the same logic as other engine registers that are save-restored. Main benefit is that then the SW tracking then shows it in debugfs and there's no risk of an engine workaround to write to the same nopriv register that is being passed directly to GuC. This means that xe_reg_whitelist_process_engine() only has to process the RTP and convert them to entries for the hwe. With that all the registers should be covered by xe_reg_sr_apply_mmio() to write to the HW and there's no special handling in GuC ADS to also add these registers to the list of registers that is passed to GuC. Example for DG2: # cat /sys/kernel/debug/dri/0000\:03\:00.0/gt0/register-save-restore ... Engine rcs0 ... REG[0x24d0] clr=0xffffffff set=0x1000dafc masked=no mcr=no REG[0x24d4] clr=0xffffffff set=0x1000db01 masked=no mcr=no REG[0x24d8] clr=0xffffffff set=0x0000db1c masked=no mcr=no ... Whitelist rcs0 REG[0xdafc-0xdaff]: allow read access REG[0xdb00-0xdb1f]: allow read access REG[0xdb1c-0xdb1f]: allow rw access v2: - Use ~0u for clr bits so it's just a write (Matt Roper) - Simplify helpers now that unused slots are not written Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241209232739.147417-6-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
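The `clr=0xffffffff set=...` entries in the debugfs output above can be read as read-modify-write descriptors in which a full clear mask degenerates into a plain write. A minimal userspace sketch of that arithmetic follows; `struct sr_entry` and `sr_apply()` are hypothetical names chosen for illustration, and the formula is an assumption about how a non-masked save-restore entry is combined with the previous register value, not a copy of the driver code.

```c
#include <stdint.h>

/* Hypothetical model of one non-masked save-restore entry. */
struct sr_entry {
	uint32_t clr_bits; /* bits cleared before setting */
	uint32_t set_bits; /* bits set afterwards */
};

/*
 * Combine the old register value with an entry: clear, then set.
 * With clr_bits == ~0u nothing of the old value survives, so the
 * result is simply set_bits and no read of the old value is needed.
 */
static uint32_t sr_apply(uint32_t old, struct sr_entry entry)
{
	return (old & ~entry.clr_bits) | entry.set_bits;
}
```

For instance, applying `clr=0xffffffff set=0x1000dafc` to any previous value yields `0x1000dafc`, which matches the "so it's just a write" remark in the v2 note.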
2024-10-23 | drm/xe: Mark GT work queue with WQ_MEM_RECLAIM | Matthew Brost | 1 | -1/+2
The GT ordered work queue can be used to free memory via resets and fence signaling, thus we should allow this work queue to run during reclaim. Mark the GT ordered work queue with WQ_MEM_RECLAIM accordingly.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241021175705.1584521-5-matthew.brost@intel.com
2024-10-17 | drm/xe/gt: Update handling of xe_force_wake_get return | Himal Prasad Ghimiray | 1 | -47/+58
xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Use the helper xe_force_wake_ref_has_domain to verify whether all domains are initialized. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put().

v3
- return xe_wakeref_t instead of int in xe_force_wake_get()
- xe_force_wake_put() error doesn't need to be checked. It internally WARNS on domain ack failure.

v4
- Rebase fix

v5
- return unsigned int for xe_force_wake_get()
- remove redundant XE_WARN_ON()

v6
- use helper for checking all initialized domains are awake or not.

v7
- Fix commit message

v9
- Remove redundant WARN_ON (Badal)

Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Badal Nilawar <badal.nilawar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-10-himal.prasad.ghimiray@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
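The return-value semantics described in this commit can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the driver's code: the domain bits, `fw_get()` and `fw_ref_has_domain()` are hypothetical stand-ins for the force-wake domains, xe_force_wake_get() and xe_force_wake_ref_has_domain(), and the `failing` parameter exists only to simulate domains that do not acknowledge the wake request.

```c
#include <stdbool.h>

/* Hypothetical domain bits, for illustration only. */
enum fw_domain {
	FW_GT     = 1u << 0,
	FW_RENDER = 1u << 1,
	FW_MEDIA  = 1u << 2,
	FW_ALL    = FW_GT | FW_RENDER | FW_MEDIA,
};

/*
 * Model of the "get": the returned mask holds every requested domain
 * that woke up. A single failing domain yields 0, while a request for
 * all domains can yield a non-zero mask even though one of them failed.
 */
static unsigned int fw_get(unsigned int requested, unsigned int failing)
{
	return requested & ~failing;
}

/* True only when every bit of `domain` is present in the returned mask. */
static bool fw_ref_has_domain(unsigned int fw_ref, unsigned int domain)
{
	return (fw_ref & domain) == domain;
}
```

The point of the helper is that a caller asking for all domains cannot rely on `fw_ref != 0`; it has to check that the returned mask covers the whole request, and it should hand the same mask back to the matching "put" so only the domains that were actually taken are released.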
2024-10-17 | drm/xe: Add caller info to xe_gt_reset_async | Nirmoy Das | 1 | -1/+1
Add caller info to the xe_gt_reset_async() to help debug issues. v2: s/%pS/%ps(Matt) Cc: Matthew Auld <matthew.auld@intel.com> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2874 Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016141717.881143-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-10-09 | drm/xe/bmg: improve cache flushing behaviour | Matthew Auld | 1 | -1/+0
The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled on for manual global invalidation to take effect and actually flush device cache, however this also turns on flushing for things like pipecontrol, which occurs between submissions for compute/render. This sounds like massive overkill for our needs, where we already have the manual flushing on the display side with the global invalidation. Some observations on BMG: 1. Disabling l2 caching for host writes and stubbing out the driver global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has no impact on wb-transient-vs-display IGT, which makes sense since the pipecontrol is now flushing the device cache after the render copy. Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also expected since device cache is now dirty and display engine can't see the writes. 2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global invalidation also has no impact on wb-transient-vs-display. This suggests that the global invalidation still works as expected and is flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on. With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since we no longer flush the device cache between submissions as part of pipecontrol. Edit: We now also have clarification from HW side that BSpec was indeed wrong here. v2: - Rebase and update commit message. BSpec: 71718 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vitasta Wattal <vitasta.wattal@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com
2024-10-04 | Merge drm/drm-next into drm-xe-next | Thomas Hellström | 1 | -2/+0
Backmerging to resolve a conflict with core locally. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-10-03 | drm/xe: Restore GT freq on GSC load error | Vinay Belgaumkar | 1 | -1/+3
As part of a Wa_22019338487, ensure that GT freq is restored even when GSC reload is not successful. Fixes: 3b1592fb7835 ("drm/xe/lnl: Apply Wa_22019338487") Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240925204918.1989574-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-19 | Merge tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel | Linus Torvalds | 1 | -6/+6
Pull drm updates from Dave Airlie: "This adds a couple of patches outside the drm core, all should be acked appropriately, the string and pstore ones are the main ones that come to mind. Otherwise it's the usual drivers, xe is getting enabled by default on some new hardware, we've changed the device number handling to allow more devices, and we added some optional rust code to create QR codes in the panic handler, an idea first suggested I think 10 years ago :-) string: - add mem_is_zero() core: - support more device numbers - use XArray for minor ids - add backlight constants - Split dma fence array creation into alloc and arm fbdev: - remove usage of old fbdev hooks kms: - Add might_fault() to drm_modeset_lock priming - Add dynamic per-crtc vblank configuration support dma-buf: - docs cleanup buddy: - Add start address support for trim function printk: - pass description to kmsg_dump scheduler: - Remove full_recover from drm_sched_start ttm: - Make LRU walk restartable after dropping locks - Allow direct reclaim to allocate local memory panic: - add display QR code (in rust) displayport: - mst: GUID improvements bridge: - Silence error message on -EPROBE_DEFER - analogix: Clean aup - bridge-connector: Fix double free - lt6505: Disable interrupt when powered off - tc358767: Make default DP port preemphasis configurable - lt9611uxc: require DRM_BRIDGE_ATTACH_NO_CONNECTOR - anx7625: simplify OF array handling - dw-hdmi: simplify clock handling - lontium-lt8912b: fix mode validation - nwl-dsi: fix mode vsync/hsync polarity xe: - Enable LunarLake and Battlemage support - Introducing Xe2 ccs modifiers for integrated and discrete graphics - rename xe perf to xe observation - use wb caching on DGFX for system memory - add fence timeouts - Lunar Lake graphics/media/display workarounds - Battlemage workarounds - Battlemage GSC support - GSC and HuC fw updates for LL/BM - use dma_fence_chain_free - refactor hw engine lookup and mmio access - enable priority mem read for Xe2 
- Add first GuC BMG fw - fix dma-resv lock - Fix DGFX display suspend/resume - Use xe_managed for kernel BOs - Use reserved copy engine for user binds on faulting devices - Allow mixing dma-fence jobs and long-running faulting jobs - fix media TLB invalidation - fix rpm in TTM swapout path - track resources and VF state by PF i915: - Type-C programming fix for MTL+ - FBC cleanup - Calc vblank delay more accurately - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates - Fix DP LTTPR detection - limit relocations to INT_MAX - fix long hangs in buddy allocator on DG2/A380 amdgpu: - Per-queue reset support - SDMA devcoredump support - DCN 4.0.1 updates - GFX12/VCN4/JPEG4 updates - Convert vbios embedded EDID to drm_edid - GFX9.3/9.4 devcoredump support - process isolation framework for GFX 9.4.3/4 - take IOMMU mappings into account for P2P DMA amdkfd: - CRIU fixes - HMM fix - Enable process isolation support for GFX 9.4.3/4 - Allow users to target recommended SDMA engines - KFD support for targetting queues on recommended SDMA engines radeon: - remove .load and drm_dev_alloc - Fix vbios embedded EDID size handling - Convert vbios embedded EDID to drm_edid - Use GEM references instead of TTM - r100 cp init cleanup - Fix potential overflows in evergreen CS offset tracking msm: - DPU: - implement DP/PHY mapping on SC8180X - Enable writeback on SM8150, SC8180X, SM6125, SM6350 - DP: - Enable widebus on all relevant chipsets - MSM8998 HDMI support - GPU: - A642L speedbin support - A615/A306/A621 support - A7xx devcoredump support ast: - astdp: Support AST2600 with VGA - Clean up HPD - Fix timeout loop for DP link training - reorganize output code by type (VGA, DP, etc) - convert to struct drm_edid - fix BMC handling for all outputs exynos: - drop stale MAINTAINERS pattern - constify struct loongson: - use GEM refcount over TTM mgag200: - Improve BMC handling - Support VBLANK intterupts - transparently support BMC outputs nouveau: - Refactor and clean up internals - Use 
GEM refcount over TTM's gm12u320: - convert to struct drm_edid gma500: - update i2c terms lcdif: - pixel clock fix host1x: - fix syncpoint IRQ during resume - use iommu_paging_domain_alloc() imx: - ipuv3: convert to struct drm_edid omapdrm: - improve error handling - use common helper for_each_endpoint_of_node() panel: - add support for BOE TV101WUM-LL2 plus DT bindings - novatek-nt35950: improve error handling - nv3051d: improve error handling - panel-edp: - add support for BOE NE140WUM-N6G - revert support for SDC ATNA45AF01 - visionox-vtdr6130: - improve error handling - use devm_regulator_bulk_get_const() - boe-th101mb31ig002: - Support for starry-er88577 MIPI-DSI panel plus DT - Fix porch parameter - edp: Support AOU B116XTN02.3, AUO B116XAN06.1, AOU B116XAT04.1, BOE NV140WUM-N41, BOE NV133WUM-N63, BOE NV116WHM-A4D, CMN N116BCA-EA2, CMN N116BCP-EA2, CSW MNB601LS1-4 - himax-hx8394: Support Microchip AC40T08A MIPI Display panel plus DT - ilitek-ili9806e: Support Densitron DMT028VGHMCMI-1D TFT plus DT - jd9365da: - Support Melfas lmfbx101117480 MIPI-DSI panel plus DT - Refactor for code sharing - panel-edp: fix name for HKC MB116AN01 - jd9365da: fix "exit sleep" commands - jdi-fhd-r63452: simplify error handling with DSI multi-style helpers - mantix-mlaf057we51: simplify error handling with DSI multi-style helpers - simple: - support Innolux G070ACE-LH3 plus DT bindings - support On Tat Industrial Company KD50G21-40NT-A1 plus DT bindings - st7701: - decouple DSI and DRM code - add SPI support - support Anbernic RG28XX plus DT bindings mediatek: - support alpha blending - remove cl in struct cmdq_pkt - ovl adaptor fix - add power domain binding for mediatek DPI controller renesas: - rz-du: add support for RZ/G2UL plus DT bindings rockchip: - Improve DP sink-capability reporting - dw_hdmi: Support 4k@60Hz - vop: - Support RGB display on Rockchip RK3066 - Support 4096px width sti: - convert to struct drm_edid stm: - Avoid UAF wih managed plane and CRTC helpers - Fix 
module owner - Fix error handling in probe - Depend on COMMON_CLK - ltdc: - Fix transparency after disabling plane - Remove unused interrupt tegra: - gr3d: improve PM domain handling - convert to struct drm_edid - Call drm_atomic_helper_shutdown() vc4: - fix PM during detect - replace DRM_ERROR() with drm_error() - v3d: simplify clock retrieval v3d: - Clean up perfmon virtio: - add DRM capset" * tag 'drm-next-2024-09-19' of https://gitlab.freedesktop.org/drm/kernel: (1326 commits) drm/xe: Fix missing conversion to xe_display_pm_runtime_resume drm/xe/xe2hpg: Add Wa_15016589081 drm/xe: Don't keep stale pointer to bo->ggtt_node drm/xe: fix missing 'xe_vm_put' drm/xe: fix build warning with CONFIG_PM=n drm/xe: Suppress missing outer rpm protection warning drm/xe: prevent potential UAF in pf_provision_vf_ggtt() drm/amd/display: Add all planes on CRTC to state for overlay cursor drm/i915/bios: fix printk format width drm/i915/display: Fix BMG CCS modifiers drm/amdgpu: get rid of bogus includes of fdtable.h drm/amdkfd: CRIU fixes drm/amdgpu: fix a race in kfd_mem_export_dmabuf() drm: new helper: drm_gem_prime_handle_to_dmabuf() drm/amdgpu/atomfirmware: Silence UBSAN warning drm/amdgpu: Fix kdoc entry in 'amdgpu_vm_cpu_prepare' drm/amd/amdgpu: apply command submission parser for JPEG v1 drm/amd/amdgpu: apply command submission parser for JPEG v2+ drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3 drm/amd/pm: update the features set on smu v14.0.2/3 ...
2024-09-18 | drm/xe: Defer gt->mmio initialization until after multi-tile setup | Matt Roper | 1 | -0/+24
With the recent xe_mmio redesign, tiles and GTs each have their own MMIO accessor, with the GT inheriting some of the information (such as the iomap pointer) from their containing tile. Given that non-root tiles get initialized later than the root tile (and currently after the point at which GT MMIO is initialized for _all_ GTs), we wind up incorrectly inheriting uninitialized pointers for the initialization of GT MMIO for GTs that reside on non-root tiles. This causes a driver crash on multi-tile PVC platforms. With the general xe_mmio redesign, it's now only necessary to do the GT-level MMIO setup before the point we start reading/writing GT registers. Move initialization of gt->mmio out of xe_info_init (which runs before non-root tiles are initialized) and to the beginning of where we start actually accessing the GTs themselves. The high-level initialization flow now boils down to: - General device init, software-only setup - (no register access possible yet) - Root tile initialization - (access to device/tile0 registers possible via xe_root_tile_mmio()) - Initialization of non-root tiles - (access to any tile's registers possible via tile->mmio) - GT MMIO initialization, inheriting iomap from each GT's tile - (access to any GT's registers possible via gt->mmio) Fixes: fa599b8c95a7 ("drm/xe: Populate GT's mmio iomap from tile during init") Reported-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240917221615.875962-2-matthew.d.roper@intel.com
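The ordering problem described here, a GT copying its register mapping from a tile that has not been set up yet, can be reproduced with a tiny userspace model. All names below (`struct tile_model`, `struct gt_model`, `tile_init()`, `gt_mmio_init()`) are hypothetical and only mirror the shape of the relationship; they are not the driver's structures or functions.

```c
#include <stddef.h>

/* Hypothetical stand-ins: a tile owns the mapping, a GT inherits it. */
struct tile_model {
	void *regs;
};

struct gt_model {
	struct tile_model *tile;
	void *regs;
};

/* Tile setup: record the register mapping for this tile. */
static void tile_init(struct tile_model *tile, void *iomap)
{
	tile->regs = iomap;
}

/*
 * GT setup: copy the mapping from the containing tile. This is only
 * meaningful once tile_init() has run for that tile; earlier than
 * that, the GT inherits whatever the tile held (here NULL).
 */
static void gt_mmio_init(struct gt_model *gt)
{
	gt->regs = gt->tile->regs;
}
```

Running `gt_mmio_init()` before `tile_init()` leaves the GT with a NULL mapping, while running it afterwards gives the GT the tile's mapping, which is the reason the commit moves the GT-level step after all tiles are initialized.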
2024-09-12 | drm/xe/gt: Remove double include | Lucas De Marchi | 1 | -1/+0
The header generated/xe_wa_oob.h is included twice. Remove one. Fixes: 27cb2b7fec2a ("drm/xe/bmg: implement Wa_16023588340") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 3d122660dc70029d9cccb4e8670125f0affa959e) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-12 | drm/xe/gt: Convert register access to use xe_mmio | Matt Roper | 1 | -5/+5
Stop using GT pointers for register access. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240910234719.3335472-81-matthew.d.roper@intel.com
2024-09-12 | Merge drm/drm-next into drm-xe-next | Lucas De Marchi | 1 | -0/+2
Sync with drm-misc and drm-intel-next for common APIs and refactors. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-11 | drm/xe: Wire up device shutdown handler | Maarten Lankhorst | 1 | -0/+7
The system is turning off, and we should probably put the device in a safe power state. We don't need to evict VRAM or suspend running jobs to a safe state, as the device is rebooted anyway. This does not imply the system is necessarily reset, as we can kexec into a new kernel. Without shutting down, things like USB Type-C may mysteriously start failing. References: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/3500 Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> [mlankhorst: Add !xe_driver_flr_disabled assert] Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-4-maarten.lankhorst@linux.intel.com
2024-09-10 | Merge tag 'drm-xe-next-2024-09-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next | Dave Airlie | 1 | -5/+4
Cross-subsystem Changes:
- Split dma fence array creation into alloc and arm (Matthew Brost)

Driver Changes:
- Move kernel_lrc to execlist backend (Ilia)
- Fix type width for pcode command (Karthik)
- Make xe_drm.h include unambiguous (Jani)
- Fixes and debug improvements for GSC load (Daniele)
- Track resources and VF state by PF (Michal Wajdeczko)
- Fix memory leak on error path (Nirmoy)
- Cleanup header includes (Matt Roper)
- Move pcode logic to tile scope (Matt Roper)
- Move hwmon logic to device scope (Matt Roper)
- Fix media TLB invalidation (Matthew Brost)
- Threshold config fixes for PF (Michal Wajdeczko)
- Remove extra "[drm]" from logs (Michal Wajdeczko)
- Add missing runtime ref (Rodrigo Vivi)
- Fix circular locking on runtime suspend (Rodrigo Vivi)
- Fix rpm in TTM swapout path (Thomas)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/eirx5vdvoflbbqlrzi5cip6bpu3zjojm2pxseufu3rlq4pp6xv@eytjvhizfyu6
2024-09-04drm/xe: Add missing runtime reference to wedged upon gt_resetRodrigo Vivi1-2/+3
Fixes this missed case: xe 0000:00:02.0: [drm] Missing outer runtime PM protection WARNING: CPU: 99 PID: 1455 at drivers/gpu/drm/xe/xe_pm.c:564 xe_pm_runtime_get_noresume+0x48/0x60 [xe] Call Trace: <TASK> ? show_regs+0x67/0x70 ? __warn+0x94/0x1b0 ? xe_pm_runtime_get_noresume+0x48/0x60 [xe] ? report_bug+0x1b7/0x1d0 ? handle_bug+0x46/0x80 ? exc_invalid_op+0x19/0x70 ? asm_exc_invalid_op+0x1b/0x20 ? xe_pm_runtime_get_noresume+0x48/0x60 [xe] xe_device_declare_wedged+0x91/0x280 [xe] gt_reset_worker+0xa2/0x250 [xe] v2: Also move get and get the right Fixes tag (Himal, Brost) Fixes: fb74b205cdd2 ("drm/xe: Introduce a simple wedged state") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240830183507.298351-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit bc947d9a8c3ebd207e52c0e35cfc88f3e1abe54f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-03drm/xe: Add missing runtime reference to wedged upon gt_resetRodrigo Vivi1-2/+3
Fixes this missed case: xe 0000:00:02.0: [drm] Missing outer runtime PM protection WARNING: CPU: 99 PID: 1455 at drivers/gpu/drm/xe/xe_pm.c:564 xe_pm_runtime_get_noresume+0x48/0x60 [xe] Call Trace: <TASK> ? show_regs+0x67/0x70 ? __warn+0x94/0x1b0 ? xe_pm_runtime_get_noresume+0x48/0x60 [xe] ? report_bug+0x1b7/0x1d0 ? handle_bug+0x46/0x80 ? exc_invalid_op+0x19/0x70 ? asm_exc_invalid_op+0x1b/0x20 ? xe_pm_runtime_get_noresume+0x48/0x60 [xe] xe_device_declare_wedged+0x91/0x280 [xe] gt_reset_worker+0xa2/0x250 [xe] v2: Also move get and get the right Fixes tag (Himal, Brost) Fixes: fb74b205cdd2 ("drm/xe: Introduce a simple wedged state") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240830183507.298351-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-09-03drm/xe/pcode: Treat pcode as per-tile rather than per-GTMatt Roper1-2/+0
There's only one instance of the pcode per tile, and for GT-related accesses both the primary and media GT share the same register interface. Since Xe was using per-GT locking, the pcode mutex wasn't actually protecting everything that it should since concurrent accesses related to a tile's primary GT and media GT were possible. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240829220619.789159-5-matthew.d.roper@intel.com (cherry picked from commit 3034cc8107b8d0c7d1b56584394e215dab57f8a3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
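The locking change described above can be pictured with a small userspace sketch. The names below (`struct tile`, `struct gt`, `gt_pcode_lock`, `gt_pcode_trylock`) are hypothetical stand-ins rather than the driver's actual types, and an `atomic_flag` is assumed in place of the kernel mutex. The only point is that the primary and media GT of a tile resolve to one shared lock object.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model: one pcode lock per tile, shared by its GTs. */
struct tile {
	atomic_flag pcode_lock;	/* stands in for the kernel mutex */
};

struct gt {
	struct tile *tile;	/* back-pointer to the owning tile */
	int is_media;		/* 0 = primary GT, 1 = media GT */
};

/* Both the primary and the media GT resolve to the same per-tile lock. */
atomic_flag *gt_pcode_lock(struct gt *gt)
{
	return &gt->tile->pcode_lock;
}

/* Try to take the pcode lock; returns 1 on success, 0 if already held. */
int gt_pcode_trylock(struct gt *gt)
{
	return !atomic_flag_test_and_set(gt_pcode_lock(gt));
}

void gt_pcode_unlock(struct gt *gt)
{
	atomic_flag_clear(gt_pcode_lock(gt));
}
```

With a per-GT lock, the media GT's trylock would have succeeded while the primary GT held its own lock; with the lock on the tile, the second access is excluded.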
2024-08-30drm/xe/pcode: Treat pcode as per-tile rather than per-GTMatt Roper1-2/+0
There's only one instance of the pcode per tile, and for GT-related accesses both the primary and media GT share the same register interface. Since Xe was using per-GT locking, the pcode mutex wasn't actually protecting everything that it should since concurrent accesses related to a tile's primary GT and media GT were possible. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240829220619.789159-5-matthew.d.roper@intel.com
2024-08-30Merge tag 'drm-xe-next-2024-08-28' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-nextDave Airlie1-5/+5
UAPI Changes: - Fix OA format masks which were breaking build with gcc-5 Cross-subsystem Changes: Driver Changes: - Use dma_fence_chain_free in chain fence unused as a sync (Matthew Brost) - Refactor hw engine lookup and mmio access to be used in more places (Dominik, Matt Auld, Mika Kuoppala) - Enable priority mem read for Xe2 and later (Pallavi Mishra) - Fix PL1 disable flow in xe_hwmon_power_max_write (Karthik) - Fix refcount and speedup devcoredump (Matthew Brost) - Add performance tuning changes to Xe2 (Akshata, Shekhar) - Fix OA sysfs entry (Ashutosh) - Add first GuC firmware support for BMG (Julia) - Bump minimum GuC firmware for platforms under force_probe to match LNL and BMG (Julia) - Fix access check on user fence creation (Nirmoy) - Add/document workarounds for Xe2 (Julia, Daniele, John, Tejas) - Document workaround and use proper WA infra (Matt Roper) - Fix VF configuration on media GT (Michal Wajdeczko) - Fix VM dma-resv lock (Matthew Brost) - Allow suspend/resume exec queue backend op to be called multiple times (Matthew Brost) - Add GT stats to debugfs (Nirmoy) - Add hwconfig to debugfs (Matt Roper) - Compile out all debugfs code with CONFIG_DEBUG_FS=n (Lucas) - Remove dead kunit code (Jani Nikula) - Refactor drvdata storing to help display (Jani Nikula) - Cleanup unused xe parameter in pte handling (Himal) - Rename s/enable_display/probe_display/ for clarity (Lucas) - Fix missing MCR annotation in couple of registers (Tejas) - Fix DGFX display suspend/resume (Maarten) - Prepare exec_queue_kill for PXP handling (Daniele) - Fix devm/drmm issues (Daniele, Matthew Brost) - Fix tile and ggtt fini sequences (Matthew Brost) - Fix crashes when probing without firmware in place (Daniele, Matthew Brost) - Use xe_managed for kernel BOs (Daniele, Matthew Brost) - Future-proof dss_per_group calculation by using hwconfig (Matt Roper) - Use reserved copy engine for user binds on faulting devices (Matthew Brost) - Allow mixing dma-fence jobs and long-running faulting jobs (Francois) - Cleanup redundant arg when creating use BO (Nirmoy) - Prevent UAF around preempt fence (Auld) - Fix display suspend/resume (Maarten) - Use vma_pages() helper (Thorsten) - Calculate pagefault queue size (Stuart, Matthew Auld) - Fix missing pagefault wq destroy (Stuart) - Fix lifetime handling of HW fence ctx (Matthew Brost) - Fix destroy order for jobs (Matthew Brost) - Fix TLB invalidation for media GT (Matthew Brost) - Document GGTT (Rodrigo Vivi) - Refactor GGTT layering and fix runtime outer protection (Rodrigo Vivi) - Handle HPD polling on display pm runtime suspend/resume (Imre, Vinod) - Drop unrequired NULL checks (Apoorva, Himal) - Use separate rpm lockdep map for non-d3cold-capable devices (Thomas Hellström) - Support "nomodeset" kernel command-line option (Thomas Zimmermann) - Drop force_probe requirement for LNL and BMG (Lucas, Balasubramani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/wd42jsh4i3q5zlrmi2cljejohdsrqc6hvtxf76lbxsp3ibrgmz@y54fa7wwxgsd
2024-08-28drm/xe: replace #include <drm/xe_drm.h> with <uapi/drm/xe_drm.h>Jani Nikula1-1/+1
include/drm/xe_drm.h does not exist. Prefer the explicit uapi include. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240827091539.4136838-1-jani.nikula@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-27Merge v6.11-rc5 into drm-nextDaniel Vetter1-0/+2
amdgpu PR conflicts due to patches cherry-picked to -fixes, I might as well catch up with a backmerge and handle them all. Plus both misc and intel maintainers asked for a backmerge anyway. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2024-08-19drm/xe/xe2: Make subsequent L2 flush sequentialTejas Upadhyay1-0/+1
Issuing a flush on top of an ongoing flush is not desirable. Let's use a lock to make it sequential. Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit 71733b8d7f50b61403f940c6c9745fb3a9b98dcb) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-19drm/xe/bmg: implement Wa_16023588340Matthew Auld1-0/+54
This involves enabling l2 caching of host side memory access to VRAM through the CPU BAR. The main fallout here is with display since VRAM writes from CPU can now be cached in GPU l2, and display is never coherent with caches, so needs various manual flushing. In the case of fbc we disable it due to complications in getting this to work correctly (in a later patch). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com (cherry picked from commit 01570b446939c3538b1aa3d059837f49fa14a3ae) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-08-14drm/xe: Write all slices if its mcr registerTejas Upadhyay1-4/+4
Register GAMREQSTRM_CTRL should be treated as an MCR register, so writes to it must go to all slices, as per the documentation. Bspec: 71185 Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-3-tejas.upadhyay@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-08-14drm/xe: Move enable host l2 VRAM post MCR initTejas Upadhyay1-1/+1
xe_gt_enable_host_l2_vram() is reading the XE2_GAMREQSTRM_CTRL register that is currently missing the MCR annotation. However, just adding the annotation doesn't work as this function is called before MCR handling is initialized in xe_gt_mcr_init(). xe_gt_enable_host_l2_vram() is used to implement WA 16023588340 that needs to be done as early as possible during initialization in order to be effective since the MMIO writes impact it. In the failure scenario, driver would simply not be able to bind successfully. Moving xe_gt_enable_host_l2_vram() later, after MCR initialization is done, only incurs a few additional HW accesses, particularly when loading GuC for hwconfig. Binding/unbinding the driver 100 times in BMG still works so it should be ok to start handling the WA a little bit later. This is sufficient to allow adding the MCR annotation to XE2_GAMREQSTRM_CTRL. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240814095614.909774-2-tejas.upadhyay@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-18drm/xe: Wedge the entire deviceMatthew Brost1-0/+15
Wedge the entire device, not just the GT which may have triggered the wedge. To implement this, clean up the layering so xe_device_declare_wedged() calls into the lower layers (GT) to ensure the entire device is wedged. While we are here, also signal any pending GT TLB invalidations upon wedging the device. Lastly, short-circuit the reset wait if the device is wedged. v2: - Short circuit reset wait if device is wedged (Local testing) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com (cherry picked from commit 7dbe8af13c189f5937e87e9fb924d5bbc49e6f71) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-17drm/xe: Wedge the entire deviceMatthew Brost1-0/+15
Wedge the entire device, not just the GT which may have triggered the wedge. To implement this, clean up the layering so xe_device_declare_wedged() calls into the lower layers (GT) to ensure the entire device is wedged. While we are here, also signal any pending GT TLB invalidations upon wedging the device. Lastly, short-circuit the reset wait if the device is wedged. v2: - Short circuit reset wait if device is wedged (Local testing) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
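A rough userspace sketch of the layering described above: a device-level entry point fans out to every GT, clears pending TLB invalidations, and the reset wait bails out early once the device is wedged. All names here (`struct device`, `gt_declare_wedged`, `gt_reset_wait`, `MAX_GTS`) and the `-ECANCELED` return value are assumptions for illustration, not taken from the driver.

```c
#include <errno.h>
#include <stdbool.h>

#define MAX_GTS 2	/* illustrative: one primary and one media GT */

struct gt {
	bool wedged;
	int pending_tlb_invals;	/* invalidations still waiting for a signal */
};

struct device {
	bool wedged;
	struct gt gts[MAX_GTS];
};

/* Per-GT part: mark the GT wedged and signal anything still waiting. */
void gt_declare_wedged(struct gt *gt)
{
	gt->wedged = true;
	gt->pending_tlb_invals = 0;
}

/* Device-level entry point: wedge every GT, not only the faulting one. */
void device_declare_wedged(struct device *dev)
{
	dev->wedged = true;
	for (int i = 0; i < MAX_GTS; i++)
		gt_declare_wedged(&dev->gts[i]);
}

/* The reset wait short-circuits once the device is wedged. */
int gt_reset_wait(const struct device *dev)
{
	if (dev->wedged)
		return -ECANCELED;	/* assumed error code */
	return 0;	/* a real driver would wait for the reset here */
}
```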
2024-07-11drm/xe/xe2: Make subsequent L2 flush sequentialTejas Upadhyay1-0/+1
Issuing a flush on top of an ongoing flush is not desirable. Let's use a lock to make it sequential. Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-09drm/xe/gt: Remove double includeLucas De Marchi1-1/+0
The header generated/xe_wa_oob.h is included twice. Remove one. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-05drm/xe/bmg: implement Wa_16023588340Matthew Auld1-0/+54
This involves enabling l2 caching of host side memory access to VRAM through the CPU BAR. The main fallout here is with display since VRAM writes from CPU can now be cached in GPU l2, and display is never coherent with caches, so needs various manual flushing. In the case of fbc we disable it due to complications in getting this to work correctly (in a later patch). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
2024-07-02drm/xe/bmg: Apply Wa_22019338487Vinay Belgaumkar1-2/+1
Extend this WA to BMG GT as well. In this case media GT is not affected. The cap frequencies and max allowed ggtt writes are different as well. On BMG, we need to do a flush after 1100 GGTT writes, and we need to limit the GT frequency request to 2133 MHz during driver load and leave it at that value after driver unloads. v3: Fix checkpatch issue Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701231529.2582452-2-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-01drm/xe/pf: Restart VFs provisioning after GT resetMichal Wajdeczko1-0/+3
Any prior configurations pushed to the GuC are lost when the GT is reset. Push again all non-empty VF configurations to the GuC as part of the GuC reset procedure. This will also help restore early manual provisioning, when the PF was in the meantime suspended and then resumed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701102738.934-3-michal.wajdeczko@intel.com
2024-06-27drm/xe/lnl: Apply Wa_22019338487Vinay Belgaumkar1-0/+24
This WA requires us to limit media GT frequency requests to a certain cap value during driver load. Freq limits are restored after load completes, so perf will not be affected during normal operations. During normal driver operation, this WA requires dummy writes to media offset 0x380D8C after every ~63 GGTT writes. This will ensure completion of the LMEM writes originating from Gunit. During driver unload (before FLR), the WA requires that we set requested frequency to the cap value again. v3: Do not use WA number in function name. Call WA wrapper from xe_device. Rename some variables, check for locks in the correct function (Rodrigo). Ensure reset path is also covered for this WA. v4: Fix BAT failure v5: Add a function pointer for ggtt_ops (Michal W) v6: Fix name collision and use static function (Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620224928.3986377-2-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
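The "dummy write after every N GGTT writes" rule above amounts to a small counter. The sketch below is a userspace model under stated assumptions: `struct ggtt_wa` and `ggtt_wa_account_write` are hypothetical names, the threshold is a parameter (the message says roughly 63 for this platform, so an exact value of 63 is an assumption), and the register write itself is only counted, not performed.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for the "dummy write every N GGTT writes" rule. */
struct ggtt_wa {
	unsigned int threshold;		/* writes allowed between dummy writes */
	unsigned int writes;		/* GGTT writes since the last dummy write */
	unsigned int dummy_writes;	/* number of dummy writes issued so far */
};

/* Account for one GGTT write; returns true when a dummy write is due. */
bool ggtt_wa_account_write(struct ggtt_wa *wa)
{
	if (++wa->writes < wa->threshold)
		return false;

	wa->writes = 0;
	wa->dummy_writes++;	/* real code would write the register here */
	return true;
}
```

Keeping the threshold in the structure rather than hard-coding it lets the same accounting serve platforms with a different limit.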
2024-06-06drm/xe/vf: Custom GT restartMichal Wajdeczko1-0/+22
Only a few steps from the GT restart phase are applicable for the VF drivers, as initialization of PAT, WOPCM, MOCS or CCS mode can be done only by the native or PF drivers. Use custom GT restart function if running in VF mode. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240604212231.1416-5-michal.wajdeczko@intel.com
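As an illustration of the split described above, the following sketch selects restart steps based on whether the driver runs as a VF. The flag names and `gt_restart_steps` are hypothetical, and `STEP_COMMON` is a placeholder for whatever steps every mode still performs; the message does not enumerate those.

```c
#include <stdbool.h>

/* Bit flags for the restart steps named in the message above. */
enum restart_step {
	STEP_PAT    = 1 << 0,
	STEP_WOPCM  = 1 << 1,
	STEP_MOCS   = 1 << 2,
	STEP_CCS    = 1 << 3,
	STEP_COMMON = 1 << 4,	/* placeholder for steps run in every mode */
};

/* Returns the set of steps to run; a VF skips the native/PF-only ones. */
unsigned int gt_restart_steps(bool is_vf)
{
	unsigned int steps = STEP_COMMON;

	if (!is_vf)
		steps |= STEP_PAT | STEP_WOPCM | STEP_MOCS | STEP_CCS;

	return steps;
}
```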
2024-05-31drm/xe: Split MCR initializationMichal Wajdeczko1-2/+4
The initialization order of GT topology, MCR, PAT and GuC HWconfig as done today by the native/PF driver can't be followed as-is by the VF driver, since fuse registers used in GT topology discovery will be obtained by the VF driver from the GuC in the HWconfig step. While native/PF drivers need to program the HW PAT table prior to loading the GuC, this requires only multicast writes support from the MCR code, which could be initialized separately from the full MCR support that requires the GT topology to set up steering data. Split MCR initialization into two steps to avoid introducing VF specific code paths. This also fixes duplicated spin_lock inits. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240530115814.1284-1-michal.wajdeczko@intel.com
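One way to read the dependency chain above is as an ordering constraint: multicast-write support comes first, PAT programming depends on it, and full MCR steering waits until the topology is known. The sketch below models that with plain flags and asserts. The function names and the exact sequence in `gt_init_sequence` are an interpretation of the message, not the driver's actual call order.

```c
#include <assert.h>
#include <stdbool.h>

struct gt_state {
	bool mcr_early;		/* multicast writes usable */
	bool pat_programmed;
	bool hwconfig_loaded;	/* GuC HWconfig obtained */
	bool topology_known;
	bool mcr_full;		/* steering data set up */
};

void mcr_init_early(struct gt_state *s)
{
	s->mcr_early = true;
}

/* PAT programming needs only multicast writes, not steering. */
void pat_init(struct gt_state *s)
{
	assert(s->mcr_early);
	s->pat_programmed = true;
}

void guc_hwconfig_load(struct gt_state *s)
{
	s->hwconfig_loaded = true;
}

/* Topology may depend on data obtained in the HWconfig step. */
void topology_init(struct gt_state *s)
{
	assert(s->hwconfig_loaded);
	s->topology_known = true;
}

/* Full MCR support needs the topology to set up steering. */
void mcr_init(struct gt_state *s)
{
	assert(s->topology_known);
	s->mcr_full = true;
}

void gt_init_sequence(struct gt_state *s)
{
	mcr_init_early(s);
	pat_init(s);
	guc_hwconfig_load(s);
	topology_init(s);
	mcr_init(s);
}
```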
2024-05-30Revert "drm/xe: make gt_remove use devm"Daniele Ceraolo Spurio1-7/+9
This reverts commit cd506a33b0d9759e0a58556799b1b38650fa3698. The gt_remove function was explicitly added as part of the remove flow instead of using drmm/devm automatic cleanup due to it being illegal to remove a component after the driver has been detached from the pci device; the GSC proxy component is removed as part of gt_remove, so we need to do it in the pci cleanup flow. The function already has a comment above it to explain this. Note that the change to use the devm also caused an invalid pointer deref in the gsc_proxy unbind function, but I didn't bother to debug which pointer was bad since we shouldn't be calling the unbind that late anyway and this revert fixes it. Both issues were not seen in CI because the GSC loading is temporarily disabled due to a critical bug, which means we're not binding the component. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240528182354.1200424-1-daniele.ceraolospurio@intel.com
2024-05-30drm/xe: Decouple xe_exec_queue and xe_lrcNiranjana Vishwanathapura1-2/+2
Decouple xe_lrc from xe_exec_queue and reference count xe_lrc. Removing hard coupling between xe_exec_queue and xe_lrc allows flexible design where the user interface xe_exec_queue can be destroyed independent of the hardware/firmware interface xe_lrc. v2: Fix lrc indexing in wq_item_append() Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240530032211.29299-1-niranjana.vishwanathapura@intel.com
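The decoupling described above can be sketched as ordinary reference counting. This is a single-threaded userspace model with hypothetical names (`struct lrc`, `lrc_get`, `lrc_put`, `exec_queue_destroy`) and a plain `int` counter; kernel code would typically use an atomic counter such as `kref`. It shows only that destroying the queue drops the queue's reference while another holder keeps the context alive.

```c
#include <stdlib.h>

/* Hypothetical refcounted context, standing in for xe_lrc. */
struct lrc {
	int refcount;	/* not atomic: single-threaded sketch only */
};

struct exec_queue {
	struct lrc *lrc;
};

/* Allocate a context holding one initial reference. */
struct lrc *lrc_create(void)
{
	struct lrc *lrc = calloc(1, sizeof(*lrc));

	if (lrc)
		lrc->refcount = 1;
	return lrc;
}

struct lrc *lrc_get(struct lrc *lrc)
{
	lrc->refcount++;
	return lrc;
}

/* Drop one reference; returns 1 if the object was freed. */
int lrc_put(struct lrc *lrc)
{
	if (--lrc->refcount > 0)
		return 0;
	free(lrc);
	return 1;
}

/* Destroying the queue only drops the queue's own reference. */
int exec_queue_destroy(struct exec_queue *q)
{
	int freed = lrc_put(q->lrc);

	q->lrc = NULL;
	return freed;
}
```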