Age | Commit message | Author | Files | Lines
2024-07-19 | drm/xe: Fix use after free when client stats are captured | Umesh Nerlige Ramappa | 3 | -9/+13
xe_file_close triggers an asynchronous queue cleanup and then frees up the xef object. Since queue cleanup flushes all pending jobs and the KMD stores client usage stats into the xef object after jobs are flushed, we see a use-after-free for the xef object. Resolve this by taking a reference to xef from xe_exec_queue. While at it, revert an earlier change that contained a partial work around for this issue. v2: - Take a ref to xef even for the VM bind queue (Matt) - Squash patches relevant to that fix and work around (Lucas) v3: Fix typo (Lucas) Fixes: ce62827bc294 ("drm/xe: Do not access xe file when updating exec queue run_ticks") Fixes: 6109f24f87d7 ("drm/xe: Add helper to accumulate exec queue runtime") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/issues/1908 Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-5-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
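The lifetime problem described in this commit can be illustrated with a toy user-space model. This is only a sketch, not the driver code: `XeFile`, `ExecQueue`, and `close_then_cleanup` are hypothetical Python stand-ins, and the counter is a plain integer rather than the kernel's reference-counting primitives. It shows why the queue holding its own reference keeps the file object alive until the late stats update has run.

```python
class XeFile:
    """Toy stand-in for a reference-counted per-client object (hypothetical)."""

    def __init__(self):
        self.refcount = 1      # reference owned by whoever opened the file
        self.freed = False
        self.run_ticks = 0     # client usage stats stored in the object

    def get(self):
        assert not self.freed, "get() on a freed object"
        self.refcount += 1
        return self

    def put(self):
        assert not self.freed, "put() on a freed object"
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # models the final free


class ExecQueue:
    """Toy queue that keeps its own reference to the file object."""

    def __init__(self, xef):
        self.xef = xef.get()

    def destroy(self, ticks):
        # Stats are written after pending jobs are flushed, possibly after close.
        assert not self.xef.freed, "use-after-free"
        self.xef.run_ticks += ticks
        self.xef.put()


def close_then_cleanup(ticks):
    xef = XeFile()
    queue = ExecQueue(xef)
    xef.put()                           # file close drops the opener's reference
    alive_after_close = not xef.freed   # still alive thanks to the queue's ref
    queue.destroy(ticks)                # deferred cleanup runs later
    return alive_after_close, xef.run_ticks, xef.freed
```

Without the `xef.get()` in `ExecQueue.__init__`, the `put()` at close time would drop the count to zero and the later stats update would trip the use-after-free assertion.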
2024-07-19 | drm/xe: Take a ref to xe file when user creates a VM | Umesh Nerlige Ramappa | 1 | -1/+5
Take a reference to xef when user creates the VM and put the reference when user destroys the VM. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-4-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-19 | drm/xe: Add ref counting for xe_file | Umesh Nerlige Ramappa | 3 | -2/+37
Add ref counting for xe_file. v2: - Add kernel doc for exported functions (Matt) - Instead of xe_file_destroy, export the get/put helpers (Lucas) v3: Fixup the kernel-doc format and description (Matt, Lucas) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-3-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-19 | drm/xe: Move part of xe_file cleanup to a helper | Umesh Nerlige Ramappa | 1 | -11/+18
In order to make xe_file ref counted, move destruction of xe_file members to a helper. v2: Move xe_vm_close_and_put back into xe_file_close (Matt) Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718210548.3580382-2-umesh.nerlige.ramappa@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-18 | drm/xe/uapi: Expose SIMD16 EU mask in topology query | Lucas De Marchi | 4 | -7/+45
PVC, Xe2 and later platforms have 16-wide EUs. We were implicitly reporting for PVC the number of 16-wide EUs without giving userspace any hint that they were different than for other platforms. Xe2 and later also have 16-wide, but in those cases the reported number would correspond to the 8-wide count. To avoid confusion and make sure the right number is used by userspace depending on the platform, add a new item to the topology query and drop the one that is not available. The new mask reported for both PVC and Xe2 should now match the numbers reported via hwconfig. v2: Use a different topo item with EU type in its name to report the new mask instead of adding the type itself as the item (Matt Roper) Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Acked-by: Mateusz Jablonski <mateusz.jablonski@intel.com> Acked-by: Wenbin Lu <wenbin.lu@intel.com> Acked-by: Effie Yu <effie.yu@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710220446.2169797-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-18 | drm/xe: Remove unused xe_sync_entry_wait | Matthew Brost | 2 | -9/+0
xe_sync_entry_wait is no longer used, remove it. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240717140429.1396820-2-matthew.brost@intel.com
2024-07-18 | drm/xe: Validate user fence during creation | Matthew Brost | 1 | -4/+8
Fail invalid addresses during user fence creation. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240717140429.1396820-1-matthew.brost@intel.com
2024-07-18 | drm/xe/pm: Add trace for pm functions | Nirmoy Das | 2 | -0/+60
Add trace for xe pm functions for better debuggability. v2: Fix indentation and add trace for xe_pm_runtime_get_ioctl Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240717125950.9952-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-18 | drm/xe/fbdev: Limit the usage of stolen for LNL+ | Uma Shankar | 3 | -1/+12
As per recommendation in the workarounds: WA_22019338487 There is an issue with accessing Stolen memory pages due to a hardware limitation. Limit the usage of stolen memory for fbdev for LNL+. Don't use BIOS FB from stolen on LNL+ and assign the same from system memory. v2: Corrected the WA Number, limited WA to LNL and Adopted XE_WA framework as suggested by Lucas and Matt. v3: Introduced the waxxx_display to implement display side of WA changes on Lunarlake. Used xe_root_mmio_gt and avoid the for loop (Suggested by Lucas) v4: Fixed some nits (Luca) Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240717082252.3875909-1-uma.shankar@intel.com
2024-07-18 | drm/xe/xe2: Do not run xe_bo_test for xe2+ dgfx | Akshata Jahagirdar | 1 | -0/+6
In xe2+ dgfx, we don't need to handle the copying of ccs metadata during migration. This test validates the ccs data post clear and copy during evict/restore operation. Thus, we can skip this test on xe2+ dgfx. Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/57d9df82ad02e53c9b0d2a7d40bb27acce57b927.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/migrate: Add kunit to test migration functionality for BMG | Akshata Jahagirdar | 1 | -1/+119
This part of kunit verifies that - main data is decompressed and ccs data is clear post bo eviction. - main data is raw copied and ccs data is clear post bo restore. v2: Added missing bo_put()/bo_unlock() (Matt Auld) Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1d36d4377c566508e42b3fb80d3fe4a588fd00ca.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx | Akshata Jahagirdar | 1 | -8/+11
During eviction (vram->sysmem), we use compressed -> uncompressed mapping. During restore (sysmem->vram), we need to use mapping from uncompressed -> uncompressed. Handle logic for selecting the compressed identity map for eviction, and selecting uncompressed map for restore operations. v2: Move check of xe_migrate_ccs_emit() before calling xe_migrate_ccs_copy(). (Nirmoy) Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/79b3a016e686a662ae68c32b5fc7f0f2ac8043e9.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/xe2: Introduce identity map for compressed pat for vram | Akshata Jahagirdar | 2 | -24/+66
Xe2+ has unified compression (exactly one compression mode/format), where compression is now controlled via PAT at PTE level. This simplifies KMD operations, as it can now decompress freely without concern for the buffer's original compression format, unlike DG2, which had multiple compression formats and thus required copying the raw CCS state during VRAM eviction. In addition, mixed VRAM and system memory buffers were not supported with compression enabled. On Xe2 dGPU, compression is still only supported with VRAM; however, we can now support compression with VRAM and system memory buffers, with GPU access being seamless underneath, so long as the KMD uses compressed -> uncompressed when doing VRAM -> system memory, to decompress the data. This also allows CPU access to such buffers, assuming that userspace first decompresses the corresponding pages being accessed. If the pages are already in system memory then KMD would have already decompressed them. When restoring such buffers with sysmem -> VRAM the KMD can't easily know which pages were originally compressed, so we always use uncompressed -> uncompressed here. This also means we can drop all the raw CCS handling on such platforms (including needing to allocate extra CCS storage). In order to support this we now need to have two different identity mappings for compressed and uncompressed VRAM. In this patch, we set up the additional identity map for the VRAM with compressed pat_index. We then select the appropriate mapping during migration/clear. During eviction (vram->sysmem), we use the mapping from compressed -> uncompressed. During restore (sysmem->vram), we need the mapping from uncompressed -> uncompressed.
v2: Formatting nits, Updated code to match recent changes in xe_migrate_prepare_vm(). (Matt) v3: Move identity map loop to a helper function. (Matt Brost) v4: Split helper function in different patch, and add asserts and nits. (Matt Brost) v5: Convert the 2 bool arguments of pte_update_size to flags argument (Matt Brost) v6: Formatting nits (Matt Brost) Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b00db5c7267e54260cb6183ba24b15c1e6ae52a3.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/migrate: Add helper function to program identity map | Akshata Jahagirdar | 1 | -40/+48
Add a helper function to program identity map. v2: Formatting nits Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/91dc05f05bd33076fb9a9f74f8495b48d2abff53.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/migrate: Add kunit to test clear functionality | Akshata Jahagirdar | 1 | -0/+276
This test verifies if the main and ccs data are cleared during bo creation. The motivation to use Kunit instead of IGT is that, although we can verify whether the data is zero following bo creation, we cannot confirm whether the zero value after bo creation is the result of our clear function or simply because the initial data present was zero. v2: Updated the mutex_lock and unlock logic, Changed out_unlock to out_put. (Matt) v3: Added missing dma_fence_put(). (Nirmoy) v4: Rebase. v5: Add missing bo_put(), bo_unlock() calls. (Matt Auld) Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c07603439b88cfc99e78c0e2069327e65d5aa87d.1721250309.git.akshata.jahagirdar@intel.com
2024-07-18 | drm/xe/migrate: Handle clear ccs logic for xe2 dgfx | Akshata Jahagirdar | 1 | -3/+8
For Xe2 dGPU, we clear the bo by modifying the VRAM using an uncompressed pat index which then indirectly updates the compression status as uncompressed, i.e. zeroed CCS. So xe_migrate_clear() should be updated for BMG to not emit CCS surf copy commands. v2: Moved xe_device_needs_ccs_emit() to xe_migrate.c and changed name to xe_migrate_needs_ccs_emit() since it's very specific to migration. (Matt) Signed-off-by: Akshata Jahagirdar <akshata.jahagirdar@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/8dd869dd8dda5e17ace28c04f1a48675f5540874.1721250309.git.akshata.jahagirdar@intel.com
2024-07-17 | drm/xe: Don't suspend device upon wedge | Matthew Brost | 1 | -0/+14
When wedging a device we shouldn't be suspending device as state for debug will be lost. Also this appears to not work as the below stack trace pops upon trying to resume a wedged device:
[ 304.245044] INFO: task cat:12115 blocked for more than 151 seconds.
[ 304.251333] Tainted: G W 6.10.0-rc7-xe+ #3518
[ 304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 304.265459] task:cat state:D stack:13384 pid:12115 tgid:12115 ppid:3986 flags:0x00000006
[ 304.265465] Call Trace:
[ 304.265467] <TASK>
[ 304.265469] __schedule+0x3c4/0xdf0
[ 304.265478] schedule+0x3c/0x140
[ 304.265481] rpm_resume+0x1cc/0x740
[ 304.265484] ? __pfx_autoremove_wake_function+0x10/0x10
[ 304.265489] __pm_runtime_resume+0x49/0x80
[ 304.265494] guc_info+0x6b/0xb0 [xe]
[ 304.265538] ? __pfx___drm_printfn_seq_file+0x10/0x10
[ 304.265541] ? __pfx___drm_puts_seq_file+0x10/0x10
[ 304.265545] seq_read_iter+0x111/0x4c0
[ 304.265551] seq_read+0xfc/0x140
[ 304.265556] full_proxy_read+0x58/0x80
[ 304.265560] vfs_read+0xa7/0x360
[ 304.265563] ? find_held_lock+0x2b/0x80
[ 304.265568] ksys_read+0x64/0xe0
[ 304.265571] do_syscall_64+0x68/0x140
[ 304.265575] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 304.265578] RIP: 0033:0x7f4254d14992
[ 304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992
[ 304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003
[ 304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010
[ 304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[ 304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[ 304.265593] </TASK>
[ 304.265594] Showing all locks held in the system:
[ 304.265598] 1 lock held by khungtaskd/57:
[ 304.265599] #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0
[ 304.265607] 3 locks held by kworker/6:1/90:
[ 304.265610] 1 lock held by in:imklog/547:
[ 304.265611] #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0
[ 304.265620] 1 lock held by dmesg/1310:
v2: Drop local 'err' variable (Jonathan) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com
2024-07-17 | drm/xe: Wedge the entire device | Matthew Brost | 9 | -13/+80
Wedge the entire device, not just the GT which may have triggered the wedge. To implement this, clean up the layering so xe_device_declare_wedged() calls into the lower layers (GT) to ensure the entire device is wedged. While we are here, also signal any pending GT TLB invalidations upon wedging the device. Lastly, short circuit reset wait if device is wedged. v2: - Short circuit reset wait if device is wedged (Local testing) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com
2024-07-17 | drm/xe/gsc: add Battlemage support | Alexander Usyskin | 5 | -7/+43
Add heci_cscfi support bit for new CSC engine type. It has the same mmio offsets as DG2 GSC but a separate interrupt flow. Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708084906.2827024-1-alexander.usyskin@intel.com
2024-07-15 | drm/xe/vf: Track writes to inaccessible registers from VF | Michal Wajdeczko | 3 | -1/+32
Only a limited set of registers is accessible for the VF driver and the hardware will silently drop writes to inaccessible registers. To improve our VF driver, let's intercept all such writes to warn about them on debug builds, or optionally allow providing some substitution (as a potential future extension). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240713142643.1242-2-michal.wajdeczko@intel.com
2024-07-13 | drm/xe/xe2: Add Wa_15015404425 | Tejas Upadhyay | 1 | -0/+23
Wa_15015404425 asks us to perform four "dummy" writes to a non-existent register offset before every real register read. Although the specific offset of the writes doesn't directly matter, the workaround suggests offset 0x130030 as a good target so that these writes will be easy to recognize and filter out in debugging traces. V5(MattR): - Avoid negating an equality comparison V4(MattR): - Use writel and remove xe_reg usage V3(MattR): - Define dummy reg local to function - Avoid tracing dummy writes - Update commit message V2: - Add WA to 8/16/32bit reads also - MattR - Corrected dummy reg address - MattR - Use for loop to avoid mental pause - JaniN Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240709155606.2998941-1-tejas.upadhyay@intel.com
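The access pattern this workaround describes can be sketched against a fake register file. This is an illustration only: `FakeMmio` and `read32_with_wa` are hypothetical names, the written value of 0 is an assumption, and the real patch operates on MMIO through kernel accessors rather than a Python dictionary. Only the count of four writes and the 0x130030 offset come from the commit message.

```python
DUMMY_REG_OFFSET = 0x130030   # offset suggested in the commit message
DUMMY_WRITE_COUNT = 4         # "four dummy writes" per the commit message


class FakeMmio:
    """Hypothetical register file that records every access in order."""

    def __init__(self):
        self.regs = {}
        self.log = []

    def write32(self, offset, value):
        self.log.append(("write", offset))
        self.regs[offset] = value

    def read32(self, offset):
        self.log.append(("read", offset))
        return self.regs.get(offset, 0)


def read32_with_wa(mmio, offset):
    # Issue the dummy writes first, then perform the real read.
    for _ in range(DUMMY_WRITE_COUNT):
        mmio.write32(DUMMY_REG_OFFSET, 0)  # value 0 is an assumption
    return mmio.read32(offset)
```

Because every dummy write targets the same recognizable offset, a trace consumer can filter them out by offset, which is the motivation the commit message gives for choosing a fixed target.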
2024-07-12 | drm/xe/pf: Limit fair VF LMEM provisioning | Michal Wajdeczko | 1 | -0/+1
Due to the current design of the BO and VRAM manager, any object with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF LMEM provisioning, is created with the TTM_PL_FLAG_CONTIGUOUS flag, which may cause VRAM fragmentation that prevents subsequent allocations of larger objects, like fair VF LMEM provisioning. To avoid such failures, round down the fair VF LMEM provisioning size to the next power of two, to compensate for what xe_ttm_vram_mgr is doing to achieve contiguous allocations. Fixes: ac6598aed1b3 ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
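The rounding step mentioned here is plain arithmetic and can be sketched as follows. `round_down_pow2` is a hypothetical helper written for this illustration; the log does not show which helper the one-line patch actually calls.

```python
def round_down_pow2(size):
    """Return the largest power of two that is <= size (0 for size <= 0)."""
    if size <= 0:
        return 0
    # bit_length() is the position of the highest set bit, counted from 1.
    return 1 << (size.bit_length() - 1)


GiB = 1024 ** 3
fair_share = round_down_pow2(6 * GiB)  # a 6 GiB fair share becomes 4 GiB
```

A size that is already a power of two is returned unchanged, so only shares that fall between two powers of two shrink.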
2024-07-12 | drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanup | Ashutosh Dixit | 1 | -7/+7
Increment num_syncs after xe_sync_entry_parse() is successful to ensure the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly. v2: Use the same pattern as that in xe_vm.c (Matt Brost) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com
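The ordering rule in this fix, counting an entry only once it has parsed successfully, can be sketched in a toy form. All names here (`ParseError`, `parse_entry`, `parse_syncs`) are hypothetical; the sketch models only the counter discipline, not the driver's sync objects or its "err_syncs" label.

```python
class ParseError(Exception):
    pass


def parse_entry(raw):
    # Hypothetical parser: the string "bad" stands in for an invalid entry.
    if raw == "bad":
        raise ParseError(raw)
    return {"handle": raw}


def parse_syncs(raw_entries, cleaned):
    syncs = [None] * len(raw_entries)   # preallocated slots, like a C array
    num_syncs = 0
    try:
        for raw in raw_entries:
            syncs[num_syncs] = parse_entry(raw)
            num_syncs += 1              # incremented only after success
    except ParseError:
        # Error path: clean up exactly the entries that were fully parsed.
        for i in range(num_syncs):
            cleaned.append(syncs[i]["handle"])
        return None
    return syncs
```

If the counter were bumped before the parse call, the cleanup loop would also visit the slot whose parse failed, which in this model is still `None`.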
2024-07-12 | drm/xe/kunit: Simplify xe_mocs live tests code layout | Michal Wajdeczko | 5 | -42/+18
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-5-michal.wajdeczko@intel.com
2024-07-12 | drm/xe/kunit: Simplify xe_migrate live tests code layout | Michal Wajdeczko | 5 | -37/+15
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-4-michal.wajdeczko@intel.com
2024-07-12 | drm/xe/kunit: Simplify xe_dma_buf live tests code layout | Michal Wajdeczko | 5 | -37/+15
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-3-michal.wajdeczko@intel.com
2024-07-12 | drm/xe/kunit: Simplify xe_bo live tests code layout | Michal Wajdeczko | 5 | -41/+20
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-2-michal.wajdeczko@intel.com
2024-07-12 | drm/xe/kunit: Drop XE_TEST_EXPORT | Michal Wajdeczko | 1 | -2/+0
It's unused and can be replaced with VISIBLE_IF_KUNIT if needed. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-3-michal.wajdeczko@intel.com
2024-07-12 | drm/xe/kunit: Kill xe_cur_kunit() | Michal Wajdeczko | 6 | -16/+14
We shouldn't use a custom helper if there is an official one. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-2-michal.wajdeczko@intel.com
2024-07-11 | drm/xe: Add process name and PID to job timedout message | José Roberto de Souza | 1 | -2/+15
This will be very helpful for Mesa CI, which uses the PID to match the exact test that caused the timeout/GPU hang and mark that test as failing. Also print the process name, as it might be relevant for human readers. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710213149.57662-1-jose.souza@intel.com
2024-07-11 | drm/xe/xe2: Make subsequent L2 flush sequential | Tejas Upadhyay | 3 | -0/+9
Issuing a flush on top of an ongoing flush is not desirable. Let's use a lock to make flushes sequential. Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240710052750.3031586-1-tejas.upadhyay@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-10 | drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal | Nirmoy Das | 1 | -4/+8
Free arbiter allocated in intel_hdcp_gsc_init(). Fixes: 152f2df954d8 ("drm/xe/hdcp: Enable HDCP for XE") Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708125918.23573-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-07-10 | drm/xe: Generate oob before compiling anything | Lucas De Marchi | 1 | -21/+4
Instead of continuing to add more dependencies as WAs are needed in different places of the driver, just add a rule with all the objects so the code generation happens before anything else. While at it, group lines related to wa_oob in the Makefile. v2: Prefix $(obj) when declaring dependency Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708213041.1734028-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 | drm/xe/gt: Remove double include | Lucas De Marchi | 1 | -1/+0
The header generated/xe_wa_oob.h is included twice. Remove one. Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202407052122.AzuWSPuo-lkp@intel.com/ Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708173301.1543871-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 | drm/xe/xe2lpg: Extend workaround 14021402888 | Bommu Krishnaiah | 1 | -0/+4
Workaround 14021402888 also applies to Xe2_LPG. Replicate the existing entry to one specific for Xe2_LPG. Signed-off-by: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703090754.1323647-1-krishnaiah.bommu@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-09 | drm/xe: Drop trace_xe_hw_fence_free | Matthew Brost | 2 | -6/+0
fence->ctx may be stale memory when trace_xe_hw_fence_free is called, resulting in a UAF bug when deriving the device name. This tracepoint is not all that useful, so just drop it. Fixes: 501c4255c409 ("drm/xe/trace: Print device_id in xe_trace events") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com
2024-07-08 | drm/xe/xe2lpm: Extend Wa_16021639441 | Ngai-Mint Kwan | 1 | -0/+10
Wa_16021639441 applies to Xe2_LPM. Signed-off-by: Ngai-Mint Kwan <ngai-mint.kwan@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240701184637.531794-1-ngai-mint.kwan@linux.intel.com
2024-07-06 | drm/xe: Use write-back caching mode for system memory on DGFX | Thomas Hellström | 3 | -21/+37
The caching mode for buffer objects with VRAM as a possible placement was forced to write-combined, regardless of placement. However, write-combined system memory is expensive to allocate and even though it is pooled, the pool is expensive to shrink, since it involves global CPU TLB flushes. Moreover write-combined system memory from TTM is only reliably available on x86 and DGFX doesn't have an x86 restriction. So regardless of the cpu caching mode selected for a bo, internally use write-back caching mode for system memory on DGFX. Coherency is maintained, but user-space clients may perceive a difference in cpu access speeds. v2: - Update RB- and Ack tags. - Rephrase wording in xe_drm.h (Matt Roper) v3: - Really rephrase wording. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Cc: Pallavi Mishra <pallavi.mishra@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: dri-devel@lists.freedesktop.org Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Effie Yu <effie.yu@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jose Souza <jose.souza@intel.com> Cc: Michal Mrozek <michal.mrozek@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Acked-by: Matthew Auld <matthew.auld@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Acked-by: Michal Mrozek <michal.mrozek@intel.com> Acked-by: Effie Yu <effie.yu@intel.com> #On chat Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
2024-07-05 | drm/i915: disable fbc due to Wa_16023588340 | Matthew Auld | 4 | -1/+33
On BMG-G21 we need to disable fbc due to complications around the WA. v2: - Try to handle with i915_drv.h and compat layer. (Rodrigo) v3: - For simplicity retreat back to the original design for now. - Drop the extra \ from the Makefile (Jani) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: intel-gfx@lists.freedesktop.org Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-4-matthew.auld@intel.com
2024-07-05 | drm/xe/bmg: implement Wa_16023588340 | Matthew Auld | 9 | -1/+117
This involves enabling l2 caching of host side memory access to VRAM through the CPU BAR. The main fallout here is with display since VRAM writes from CPU can now be cached in GPU l2, and display is never coherent with caches, so needs various manual flushing. In the case of fbc we disable it due to complications in getting this to work correctly (in a later patch). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703124338.208220-3-matthew.auld@intel.com
2024-07-04drm/xe: Use VF_CAP_REG for device wmbMichal Wajdeczko1-1/+10
To force a write barrier on the device memory, we write to the SOFTWARE_FLAGS_SPR33 register, but this particular register was selected because it was one of the writable and unused registers. Since a write barrier should also work if we write to a read-only register, switch to the VF_CAP_REG register, which is also marked as accessible for VFs. While at it, add simple kernel-doc for the xe_device_wmb() function. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-4-michal.wajdeczko@intel.com
2024-07-04drm/xe: Kill regs/xe_sriov_regs.hMichal Wajdeczko6-26/+15
There is no real benefit to maintain a separate file. The register definitions related to SR-IOV can be placed in existing headers. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-3-michal.wajdeczko@intel.com
2024-07-04drm/xe: Fix register definition order in xe_regs.hMichal Wajdeczko1-3/+3
Swap XEHP_CLOCK_GATE_DIS(0x101014) with GU_DEBUG(0x101018). Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240702183704.1022-2-michal.wajdeczko@intel.com
2024-07-04drm/xe: Add VM bind IOCTL error injectionMatthew Brost4-1/+61
Add VM bind IOCTL error injection, which steals the MSB of the bind flags field; if set, it injects errors at various points in the VM bind IOCTL. Intended to validate error paths. Enabled by CONFIG_DRM_XE_DEBUG. v4: - Change define layout (Jonathan) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-8-matthew.brost@intel.com
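As an illustration of the flag-stealing approach described above, here is a minimal userspace sketch in C. It is not the driver's code: every name (`SKETCH_BIND_FLAG_INJECT_ERROR`, `sketch_bind_ctx`, the injection points), the 32-bit width of the flags field, and the choice of `-EINVAL` / `-ENOMEM` as error codes are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical define: the most significant bit of an assumed 32-bit flags field. */
#define SKETCH_BIND_FLAG_INJECT_ERROR (UINT32_C(1) << 31)

/* Hypothetical points in the bind path where a failure can be forced. */
enum sketch_inject_point {
	SKETCH_INJECT_ALLOC = 0,
	SKETCH_INJECT_PREPARE,
	SKETCH_INJECT_COMMIT,
};

struct sketch_bind_ctx {
	uint32_t flags;                  /* user flags with the stolen bit removed */
	bool inject;                     /* true when the caller set the stolen bit */
	enum sketch_inject_point target; /* the point that should fail */
};

/*
 * Split the raw flags into "real" flags and the injection request.
 * When debug support is off, the stolen bit is treated as an unknown flag.
 */
static int sketch_bind_parse_flags(uint32_t raw, bool debug_build,
				   enum sketch_inject_point target,
				   struct sketch_bind_ctx *ctx)
{
	bool inject = (raw & SKETCH_BIND_FLAG_INJECT_ERROR) != 0;

	assert(ctx != NULL);

	if (inject && !debug_build)
		return -EINVAL;

	ctx->flags = raw & ~SKETCH_BIND_FLAG_INJECT_ERROR;
	ctx->inject = inject;
	ctx->target = target;
	return 0;
}

/* Called at each instrumented point; returns an error only at the chosen one. */
static int sketch_check_point(const struct sketch_bind_ctx *ctx,
			      enum sketch_inject_point point)
{
	assert(ctx != NULL);

	if (ctx->inject && point == ctx->target)
		return -ENOMEM;
	return 0;
}
```

A caller would invoke `sketch_check_point()` before each stage of the bind and take the normal error path on a non-zero return. That is the point of the technique: the unwind code is exercised without needing a real allocation failure.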
2024-07-04drm/xe: Update PT layer with better error handlingMatthew Brost1-65/+167
Update the PT layer so that if a memory allocation for a PTE fails, the error can be propagated to the user without requiring the VM to be killed. v5: - change return value of invalidation_fence_init to void (Matthew Auld) v7: - Invert i,j usage in two places (Matthew Auld) - s/0/NULL (Matthew Auld) - Don't ignore return value of xe_pt_new_shared (Matthew Auld) - Don't check for NULL in xe_pt_entry (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-7-matthew.brost@intel.com
2024-07-04drm/xe: Update VM trace eventsMatthew Brost2-7/+45
The trace events have changed with the move to a single job per VM bind IOCTL; update the trace events to align with the old behavior as much as possible. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-6-matthew.brost@intel.com
2024-07-04drm/xe: Convert multiple bind ops into single jobMatthew Brost10-1025/+1063
This aligns with the uAPI of an array of binds or single bind that results in multiple GPUVA ops being considered a single atomic operation. The design is roughly: - xe_vma_ops is a list of xe_vma_op (GPUVA op) - each xe_vma_op resolves to 0-3 PT ops - xe_vma_ops creates a single job - if at any point during binding a failure occurs, xe_vma_ops contains the information necessary to unwind the PT and VMA (GPUVA) state v2: - add missing dma-resv slot reservation (CI, testing) v4: - Fix TLB invalidation (Paulo) - Add missing xe_sched_job_last_fence_add/test_dep check (Inspection) v5: - Invert i, j usage (Matthew Auld) - Add helper to test and add job dep (Matthew Auld) - Return on anything but -ETIME for cpu bind (Matthew Auld) - Return -ENOBUFS if suballoc of BB fails due to size (Matthew Auld) - s/do/Do (Matthew Auld) - Add missing comma (Matthew Auld) - Do not assign return value to xe_range_fence_insert (Matthew Auld) v6: - s/0x1ff/MAX_PTE_PER_SDI (Matthew Auld, CI) - Check too large of SA in Xe to avoid triggering WARN (Matthew Auld) - Fix checkpatch issues v7: - Rebase - Support more than 510 PTE updates in a bind job (Paulo, mesa testing) v8: - Rebase Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-5-matthew.brost@intel.com
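The design outlined above (a list of ops, each resolving to 0-3 PT ops, one job for the whole list, and unwind on any failure) can be sketched in plain C as follows. This is a simplified userspace model, not the driver's implementation: the `sketch_*` types and functions, the `fail_prepare` test knob, and the use of `-ENOMEM` are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one GPUVA op and the PT ops it resolves to. */
struct sketch_op {
	int pt_ops_needed;   /* 0..3 in this model */
	int pt_ops_prepared; /* state that must be undone on failure */
	int fail_prepare;    /* test knob: force this op to fail */
};

/* Hypothetical model of the list of ops submitted as one unit. */
struct sketch_ops {
	struct sketch_op *ops;
	size_t count;
	int jobs_created;    /* stays 0 unless every op prepared cleanly */
};

static int sketch_op_prepare(struct sketch_op *op)
{
	assert(op->pt_ops_needed >= 0 && op->pt_ops_needed <= 3);

	if (op->fail_prepare)
		return -ENOMEM;
	op->pt_ops_prepared = op->pt_ops_needed;
	return 0;
}

static void sketch_op_unwind(struct sketch_op *op)
{
	op->pt_ops_prepared = 0;
}

/*
 * Prepare every op, then create exactly one job for the whole list.
 * If any op fails, undo the already-prepared ops in reverse order and
 * create no job, so the list is applied entirely or not at all.
 */
static int sketch_ops_execute(struct sketch_ops *vops)
{
	size_t i;
	int err = 0;

	for (i = 0; i < vops->count; i++) {
		err = sketch_op_prepare(&vops->ops[i]);
		if (err)
			goto unwind;
	}

	vops->jobs_created++;
	return 0;

unwind:
	/* Ops [0, i) were prepared; op i failed before changing any state. */
	while (i--)
		sketch_op_unwind(&vops->ops[i]);
	return err;
}
```

The key property is that the job is created only after every op has prepared successfully, so a failure partway through leaves no partially applied state behind.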
2024-07-04drm/xe: Add xe_exec_queue_last_fence_test_depMatthew Brost2-0/+25
Helpful to determine if a bind can immediately use the CPU or needs to be deferred to a drm scheduler job. v7: - Better wording in kernel doc (Matthew Auld) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-4-matthew.brost@intel.com
2024-07-04drm/xe: Add xe_vm_pgtable_update_op to xe_vma_opsMatthew Brost3-2/+84
Each xe_vma_op resolves to 0-3 pt_ops. Add storage for the pt_ops to xe_vma_ops, which is dynamically allocated based on the number and types of xe_vma_op in the xe_vma_ops list. Only the allocation is implemented in this patch. This will help with converting xe_vma_ops (multiple xe_vma_op) into an atomic update unit. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-3-matthew.brost@intel.com
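To make the sizing idea concrete, the following userspace sketch counts how many PT ops a list of ops could need and allocates storage for them in one go. It is only a model under stated assumptions: the op types, the per-type counts in `sketch_pt_ops_for()` (chosen merely to span the 0-3 range), and all `sketch_*` names are hypothetical and do not reflect the driver's actual mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical op types; the real set and their PT op counts may differ. */
enum sketch_op_type {
	SKETCH_OP_NOP,
	SKETCH_OP_MAP,
	SKETCH_OP_UNMAP,
	SKETCH_OP_REMAP,
};

/* Placeholder for whatever a single page-table update needs to record. */
struct sketch_pt_op {
	int placeholder;
};

struct sketch_op_list {
	const enum sketch_op_type *types; /* one entry per op in the list */
	size_t count;
	struct sketch_pt_op *pt_ops;      /* storage sized from the list */
	size_t num_pt_ops;
};

/* Assumed upper bound of PT ops per op type, within the 0-3 range. */
static size_t sketch_pt_ops_for(enum sketch_op_type type)
{
	switch (type) {
	case SKETCH_OP_MAP:
	case SKETCH_OP_UNMAP:
		return 1;
	case SKETCH_OP_REMAP:
		return 3;
	case SKETCH_OP_NOP:
	default:
		return 0;
	}
}

/* Walk the list once to size the storage, then allocate it zeroed. */
static int sketch_alloc_pt_ops(struct sketch_op_list *list)
{
	size_t total = 0;
	size_t i;

	assert(list != NULL);

	for (i = 0; i < list->count; i++)
		total += sketch_pt_ops_for(list->types[i]);

	list->num_pt_ops = total;
	list->pt_ops = NULL;
	if (total == 0)
		return 0;

	list->pt_ops = calloc(total, sizeof(*list->pt_ops));
	return list->pt_ops ? 0 : -ENOMEM;
}

static void sketch_free_pt_ops(struct sketch_op_list *list)
{
	free(list->pt_ops);
	list->pt_ops = NULL;
	list->num_pt_ops = 0;
}
```

Allocating everything up front means the only allocation failure happens before any state is touched, which fits the goal of treating the whole list as one update unit.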
2024-07-04drm/xe: s/xe_tile_migrate_engine/xe_tile_migrate_exec_queueMatthew Brost2-6/+5
Engine is old nomenclature, replace with exec queue. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240704041652.272920-2-matthew.brost@intel.com