kernel/linux.git/drivers/gpu/drm/xe/xe_pt.c, branch linux-7.0.y

drm/xe: always keep track of remap prev/next

2026-03-25T12:05:33+00:00

During 3D workload, user is reporting hitting: [ 413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925 [ 413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy) [ 413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe] [ 413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282 [ 413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000 [ 413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000 [ 413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380 [ 413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380 [ 413.362083] FS: 00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000 [ 413.362085] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0 [ 413.362088] PKRU: 55555554 [ 413.362089] Call Trace: [ 413.362092] [ 413.362096] xe_vm_bind_ioctl+0xa9a/0xc60 [xe] Which seems to hint that the vma we are re-inserting for the ops unwind is either invalid or overlapping with something already inserted in the vm. It shouldn't be invalid since this is a re-insertion, so must have worked before. Leaving the likely culprit as something already placed where we want to insert the vma. Following from that, for the case where we do something like a rebind in the middle of a vma, and one or both mapped ends are already compatible, we skip doing the rebind of those vma and set next/prev to NULL. As well as then adjust the original unmap va range, to avoid unmapping the ends. However, if we trigger the unwind path, we end up with three va, with the two ends never being removed and the original va range in the middle still being the shrunken size. If this occurs, one failure mode is when another unwind op needs to interact with that range, which can happen with a vector of binds. For example, if we need to re-insert something in place of the original va. In this case the va is still the shrunken version, so when removing it and then doing a re-insert it can overlap with the ends, which were never removed, triggering a warning like above, plus leaving the vm in a bad state. With that, we need two things here: 1) Stop nuking the prev/next tracking for the skip cases. Instead relying on checking for skip prev/next, where needed. That way on the unwind path, we now correctly remove both ends. 2) Undo the unmap va shrinkage, on the unwind path. With the two ends now removed the unmap va should expand back to the original size again, before re-insertion. v2: - Update the explanation in the commit message, based on an actual IGT of triggering this issue, rather than conjecture. - Also undo the unmap shrinkage, for the skip case. With the two ends now removed, the original unmap va range should expand back to the original range. v3: - Track the old start/range separately. vma_size/start() uses the va info directly. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602 Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Signed-off-by: Matthew Auld Cc: Matthew Brost Cc: # v6.8+ Reviewed-by: Matthew Brost Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com (cherry picked from commit aec6969f75afbf4e01fd5fb5850ed3e9c27043ac) Signed-off-by: Rodrigo Vivi

drm/xe: Skip over non leaf pte for PRL generation

2026-03-19T13:22:53+00:00

The check using xe_child->base.children was insufficient in determining if a pte was a leaf node. So explicitly skip over every non-leaf pt and conditionally abort if there is a scenario where a non-leaf pt is interleaved between leaf pt, which results in the page walker skipping over some leaf pt. Note that the behavior being targeted for abort is PD[0] = 2M PTE PD[1] = PT -> 512 4K PTEs PD[2] = 2M PTE results in abort, page walker won't descend PD[1]. With new abort, ensuring valid PRL before handling a second abort. v2: - Revert to previous assert. - Revised non-leaf handling for interleaf child pt and leaf pte. - Update comments to specifications. (Stuart) - Remove unnecessary XE_PTE_PS64. (Matthew B) v3: - Modify secondary abort to only check non-leaf PTEs. (Matthew B) Fixes: b912138df299 ("drm/xe: Create page reclaim list on unbind") Signed-off-by: Brian Nguyen Reviewed-by: Matthew Brost Cc: Stuart Summers Link: https://patch.msgid.link/20260305171546.67691-6-brian3.nguyen@intel.com Signed-off-by: Matt Roper (cherry picked from commit 1d123587525db86cc8f0d2beb35d9e33ca3ade83) Signed-off-by: Thomas Hellström

Convert more 'alloc_obj' cases to default GFP_KERNEL arguments

2026-02-22T04:03:00+00:00

This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51+00:00

This was done entirely with mindless brute force, using git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28+00:00

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook

drm/xe: Add page reclamation related stats

2026-01-08T22:33:34+00:00

Add page reclaim list (PRL) related stats to GT stats to assist in debugging and tuning of page reclaim related actions. Include counters of page sizes added to PRL and if PRL action is issued. v2: - Add PRL_ABORTED_COUNT stats and corresponding changes. (Matthew B) Signed-off-by: Brian Nguyen Cc: Matthew Brost Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20260107010447.4125005-10-brian3.nguyen@intel.com

drm/xe: Fix page reclaim entry handling for large pages

2026-01-08T22:33:32+00:00

For 64KB pages, XE_PTE_PS64 is defined for all consecutive 4KB pages and are all considered leaf nodes, so existing check was falsely adding multiple 64KB pages to PRL. For larger entries such as 2MB PDE, the check for pte->base.children is insufficient since this array is always defined for page directory, level 1 and above, so perform a check on the entry itself pointing to the correct page. For unmaps, if the range is properly covered by the page full directory, page walker may finish without walking to the leaf nodes. For example, a 1G range can be fully covered by 512 2MB pages if alignment allows. In this case, the page walker will walk until it reaches this corresponding directory which can correlate to the 1GB range. Page walker will simply complete its walk and the individual 2MB PDE leaves won't get accessed. In this case, PRL invalidation is also required, so add a check to see if pt entry cover the entire range since the walker will complete the walk. There are possible race conditions that will cause driver to read a pte that hasn't been written to yet. The 2 scenarios are: - Another issued TLB invalidation such as from userptr or MMU notifier. - Dependencies on original bind that has yet to be executed with an unbind on that job. The expectation is these race conditions are likely rare cases so simply perform a fallback to full PPC flush invalidation instead. v2: - Reword commit and updated zero-pte handling. (Matthew B) v3: - Rework if statement for abort case with additional comments. (Matthew B) Fixes: b912138df299 ("drm/xe: Create page reclaim list on unbind") Signed-off-by: Brian Nguyen Cc: Matthew Brost Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20260107010447.4125005-9-brian3.nguyen@intel.com

drm/xe: Add explicit abort page reclaim list

2026-01-08T22:33:31+00:00

PRLs could be invalidated to indicate its getting dropped from current scope but are still valid. So standardize calls and add abort to clearly define when an invalidation is a real abort and PRL should fallback. v3: - Update abort function to macro. (Matthew B) Signed-off-by: Brian Nguyen Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20260107010447.4125005-8-brian3.nguyen@intel.com

drm/xe: Optimize flushing of L2$ by skipping unnecessary page reclaim

2025-12-13T00:59:10+00:00

There are additional hardware managed L2$ flushing such as the transient display. In those scenarios, page reclamation is unnecessary resulting in redundant cacheline flushes, so skip over those corresponding ranges. v2: - Elaborated on reasoning for page reclamation skip based on Tejas's discussion. (Matthew A, Tejas) v3: - Removed MEDIA_IS_ON due to racy condition resulting in removal of relevant registers and values. (Matthew A) - Moved l3 policy access to xe_pat. (Matthew A) v4: - Updated comments based on previous change. (Tejas) - Move back PAT index macros to xe_pat.c. Signed-off-by: Brian Nguyen Reviewed-by: Tejas Upadhyay Cc: Matthew Auld Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251212213225.3564537-21-brian3.nguyen@intel.com

drm/xe: Prep page reclaim in tlb inval job

2025-12-13T00:59:09+00:00

Use page reclaim list as indicator if page reclaim action is desired and pass it to tlb inval fence to handle. Job will need to maintain its own embedded copy to ensure lifetime of PRL exist until job has run. v2: - Use xe variant of WARN_ON (Michal) v3: - Add comments for PRL tile handling and flush behavior with media. (Matthew Brost) Signed-off-by: Brian Nguyen Reviewed-by: Matthew Brost Cc: Michal Wajdeczko Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251212213225.3564537-19-brian3.nguyen@intel.com