kernel/linux.git/include/uapi/drm/xe_drm.h, branch v6.12.80

drm/xe: Limit num_syncs to prevent oversized allocations

2026-01-08T09:14:05+00:00

[ Upstream commit 8e461304009135270e9ccf2d7e2dfe29daec9b60 ] The exec and vm_bind ioctl allow userspace to specify an arbitrary num_syncs value. Without bounds checking, a very large num_syncs can force an excessively large allocation, leading to kernel warnings from the page allocator as below. Introduce DRM_XE_MAX_SYNCS (set to 1024) and reject any request exceeding this limit. " ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1217 at mm/page_alloc.c:5124 __alloc_frozen_pages_noprof+0x2f8/0x2180 mm/page_alloc.c:5124 ... Call Trace: alloc_pages_mpol+0xe4/0x330 mm/mempolicy.c:2416 ___kmalloc_large_node+0xd8/0x110 mm/slub.c:4317 __kmalloc_large_node_noprof+0x18/0xe0 mm/slub.c:4348 __do_kmalloc_node mm/slub.c:4364 [inline] __kmalloc_noprof+0x3d4/0x4b0 mm/slub.c:4388 kmalloc_noprof include/linux/slab.h:909 [inline] kmalloc_array_noprof include/linux/slab.h:948 [inline] xe_exec_ioctl+0xa47/0x1e70 drivers/gpu/drm/xe/xe_exec.c:158 drm_ioctl_kernel+0x1f1/0x3e0 drivers/gpu/drm/drm_ioctl.c:797 drm_ioctl+0x5e7/0xc50 drivers/gpu/drm/drm_ioctl.c:894 xe_drm_ioctl+0x10b/0x170 drivers/gpu/drm/xe/xe_device.c:224 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl fs/ioctl.c:584 [inline] __x64_sys_ioctl+0x18b/0x210 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xbb/0x380 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... " v2: Add "Reported-by" and Cc stable kernels. v3: Change XE_MAX_SYNCS from 64 to 1024. (Matt & Ashutosh) v4: s/XE_MAX_SYNCS/DRM_XE_MAX_SYNCS/ (Matt) v5: Do the check at the top of the exec func. (Matt) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Koen Koning Reported-by: Peter Senna Tschudin Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6450 Cc: # v6.12+ Cc: Matthew Brost Cc: Michal Mrozek Cc: Carl Zhang Cc: José Roberto de Souza Cc: Lionel Landwerlin Cc: Ivan Briano Cc: Thomas Hellström Cc: Ashutosh Dixit Signed-off-by: Shuicheng Lin Reviewed-by: Matthew Brost Signed-off-by: Matthew Brost Link: https://patch.msgid.link/20251205234715.2476561-5-shuicheng.lin@intel.com (cherry picked from commit b07bac9bd708ec468cd1b8a5fe70ae2ac9b0a11c) Signed-off-by: Thomas Hellström Stable-dep-of: f8dd66bfb4e1 ("drm/xe/oa: Limit num_syncs to prevent oversized allocations") Signed-off-by: Sasha Levin

drm/xe/oa/uapi: Define and parse OA sync properties

2025-02-27T12:30:13+00:00

[ Upstream commit c8507a25cebd179db935dd266a33c51bef1b1e80 ] Now that we have laid the groundwork, introduce OA sync properties in the uapi and parse the input xe_sync array as is done elsewhere in the driver. Also add DRM_XE_OA_CAPS_SYNCS bit in OA capabilities for userspace. v2: Fix and document DRM_XE_SYNC_TYPE_USER_FENCE for OA (Matt B) Add DRM_XE_OA_CAPS_SYNCS bit to OA capabilities (Jose) Acked-by: José Roberto de Souza Reviewed-by: Jonathan Cavitt Signed-off-by: Ashutosh Dixit Link: https://patchwork.freedesktop.org/patch/msgid/20241022200352.1192560-3-ashutosh.dixit@intel.com Stable-dep-of: f0ed39830e60 ("xe/oa: Fix query mode of operation for OAR/OAC") Signed-off-by: Sasha Levin

drm/xe/ptl: L3bank mask is not available on the media GT

2024-12-14T19:03:44+00:00

[ Upstream commit 9ab440a9d0426cf7842240891cc457155db1a97e ] On PTL platforms with media version 30.00, the fuse registers for reporting L3 bank availability to the GT just read out as ~0 and do not provide proper values. Xe does not use the L3 bank mask for anything internally; it only passes the mask through to userspace via the GT topology query. Since we don't have any way to get the real L3 bank mask, we don't want to pass garbage to userspace. Passing a zeroed mask or a copy of the primary GT's L3 bank mask would also be inaccurate and likely to cause confusion for userspace. The best approach is to simply not include L3 in the list of masks returned by the topology query in cases where we aren't able to provide a meaningful value. This won't change the behavior for any existing platforms (where we can always obtain L3 masks successfully for all GTs), it will only prevent us from mis-reporting bad information on upcoming platform(s). There's a good chance this will become a formal workaround in the future, but for now we don't have a lineage number so "no_media_l3" is used in place of a lineage as the OOB workaround descriptor. v2: - Re-calculate query size to properly match data returned. (Gustavo) - Update kerneldoc to clarify that the L3bank mask may not be included in the query results if the hardware doesn't make it available. (Gustavo) Cc: Matt Atwood Cc: Gustavo Sousa Signed-off-by: Shekhar Chauhan Co-developed-by: Matt Roper Signed-off-by: Matt Roper Reviewed-by: Jonathan Cavitt Reviewed-by: Gustavo Sousa Acked-by: Francois Dugast Link: https://patchwork.freedesktop.org/patch/msgid/20241007154143.2021124-2-matthew.d.roper@intel.com Signed-off-by: Sasha Levin

drm/xe/oa/uapi: Make bit masks unsigned

2024-07-30T20:45:38+00:00

When building with gcc-5: In function ‘decode_oa_format.isra.26’, inlined from ‘xe_oa_set_prop_oa_format’ at drivers/gpu/drm/xe/xe_oa.c:1664:6: ././include/linux/compiler_types.h:510:38: error: call to ‘__compiletime_assert_1336’ declared with attribute error: FIELD_GET: mask is not constant [...] ./include/linux/bitfield.h:155:3: note: in expansion of macro ‘__BF_FIELD_CHECK’ __BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \ ^ drivers/gpu/drm/xe/xe_oa.c:1573:18: note: in expansion of macro ‘FIELD_GET’ u32 bc_report = FIELD_GET(DRM_XE_OA_FORMAT_MASK_BC_REPORT, fmt); ^ Fixes: b6fd51c62119 ("drm/xe/oa/uapi: Define and parse OA stream properties") Signed-off-by: Geert Uytterhoeven Reviewed-by: Lucas De Marchi Link: https://patchwork.freedesktop.org/patch/msgid/20240729092634.2227611-1-geert+renesas@glider.be Signed-off-by: Lucas De Marchi

drm/xe/uapi: Expose SIMD16 EU mask in topology query

2024-07-18T20:20:30+00:00

PVC, Xe2 and later platforms have 16-wide EUs. We were implicitly reporting for PVC the number of 16-wide EUs without giving userspace any hint that they were different than for other platforms. Xe2 and later also have 16-wide, but in those cases the reported number would correspond to the 8-wide count. To avoid confusion and make sure the right number is used by userspace depending on the platform, add a new item to the topology query and drop the one that is not available. The new mask reported for both PVC and Xe2 should now match the numbers reported via hwconfig. v2: Use a different topo item with EU type in its name to report the new mask instead of adding the type itself as the item (Matt Roper) Reviewed-by: Matt Roper Acked-by: José Roberto de Souza Acked-by: Mateusz Jablonski Acked-by: Wenbin Lu Acked-by: Effie Yu Acked-by: Rodrigo Vivi Link: https://patchwork.freedesktop.org/patch/msgid/20240710220446.2169797-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi

drm/xe: Use write-back caching mode for system memory on DGFX

2024-07-06T09:05:46+00:00

The caching mode for buffer objects with VRAM as a possible placement was forced to write-combined, regardless of placement. However, write-combined system memory is expensive to allocate and even though it is pooled, the pool is expensive to shrink, since it involves global CPU TLB flushes. Moreover write-combined system memory from TTM is only reliably available on x86 and DGFX doesn't have an x86 restriction. So regardless of the cpu caching mode selected for a bo, internally use write-back caching mode for system memory on DGFX. Coherency is maintained, but user-space clients may perceive a difference in cpu access speeds. v2: - Update RB- and Ack tags. - Rephrase wording in xe_drm.h (Matt Roper) v3: - Really rephrase wording. Signed-off-by: Thomas Hellström Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Cc: Pallavi Mishra Cc: Matthew Auld Cc: dri-devel@lists.freedesktop.org Cc: Joonas Lahtinen Cc: Effie Yu Cc: Matthew Brost Cc: Maarten Lankhorst Cc: Jose Souza Cc: Michal Mrozek Cc: # v6.8+ Acked-by: Matthew Auld Acked-by: José Roberto de Souza Reviewed-by: Rodrigo Vivi Fixes: 622f709ca629 ("drm/xe/uapi: Add support for CPU caching mode") Acked-by: Michal Mrozek Acked-by: Effie Yu #On chat Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com

drm/xe/uapi: Rename xe perf layer as xe observation layer

2024-07-03T23:46:02+00:00

In Xe, the perf layer allows capture of HW counter streams. These HW counters are generally performance related but don't have to be necessarily so. Also, the name "perf" is a carryover from i915 and is not preferred. Here we propose the name "observation" for this common layer which allows capture of different types of these counter streams. v2: Rename observability layer to observation layer (Lucas/Rodrigo) v3: Rename sysctl file to "observation_paranoid" (Jose) Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types") Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl") Acked-by: Lucas De Marchi Acked-by: Rodrigo Vivi Signed-off-by: Ashutosh Dixit Reviewed-by: Umesh Nerlige Ramappa Acked-by: José Roberto de Souza Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com

drm/xe/oa/uapi: Allow preemption to be disabled on the stream exec queue

2024-06-26T22:25:46+00:00

Mesa VK_KHR_performance_query use case requires preemption and timeslicing to be disabled for the stream exec queue. Implement this functionality here. v2: Minor change to debug print to print both ret values (Umesh) Acked-by: José Roberto de Souza Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Ashutosh Dixit Link: https://patchwork.freedesktop.org/patch/msgid/20240626181817.1516229-3-ashutosh.dixit@intel.com Signed-off-by: Rodrigo Vivi

drm/xe/oa: Fix kernel doc in xe_drm.h

2024-06-26T22:24:38+00:00

Fix kernel doc in xe_drm.h. Also eliminate private/non-abi enum definitions. v2: Remove __DRM_XE_PERF_TYPE_MAX since it is unused (Michal) v3: Also remove DRM_XE_OA_PROPERTY_MAX since it can also be eliminated (Michal) Suggested-by: Michal Wajdeczko Signed-off-by: Ashutosh Dixit Reviewed-by: Michal Wajdeczko Link: https://patchwork.freedesktop.org/patch/msgid/20240623203119.3840283-1-ashutosh.dixit@intel.com Signed-off-by: Rodrigo Vivi

drm/xe/oa/uapi: Query OA unit properties

2024-06-18T19:40:40+00:00

Implement query for properties of OA units present on a device. v2: Clean up reserved/pad fields (Umesh) Follow the same scheme as other query structs v3: Skip reporting reserved engines attached to OA units v4: Expose oa_buf_size via DRM_XE_PERF_IOCTL_INFO (Umesh) v5: Don't expose capabilities as OR of properties (Umesh) v6: Add extensions to query output structs: drm_xe_oa_unit, drm_xe_query_oa_units and drm_xe_oa_stream_info v7: Change oa_units[] array to __u64 type Acked-by: Rodrigo Vivi Reviewed-by: Umesh Nerlige Ramappa Signed-off-by: Ashutosh Dixit Link: https://patchwork.freedesktop.org/patch/msgid/20240618014609.3233427-13-ashutosh.dixit@intel.com