kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h, branch v6.19.12

drm/amdgpu: update the functions to use amdgpu version of hmm

2025-10-13T18:14:36+00:00

At times we need a bo reference for hmm and for that add a new struct amdgpu_hmm_range which will hold an optional bo member and hmm_range. Use amdgpu_hmm_range instead of hmm_range and let the bo as an optional argument for the caller if they want to the bo reference to be taken or they want to handle that explicitly. Signed-off-by: Sunil Khatri Reviewed-by: Christian König Signed-off-by: Alex Deucher

drm/amdkfd: add proper handling for S0ix

2025-09-18T13:43:02+00:00

When in S0i3, the GFX state is retained, so all we need to do is stop the runlist so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) Tested-by: David Perry Signed-off-by: Alex Deucher

drm/amdgpu/amdkfd: Avoid a couple hundred -Wflex-array-member-not-at-end warnings

2025-09-02T21:36:20+00:00

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Move the conflicting declarations to the end of the corresponding structures. Notice that `struct dev_pagemap` is a flexible structure, this is a structure that contains a flexible-array member. struct dev_pagemap always has room for at least one range. amdgpu only uses a single range. Therefore no change are needed to the allocation of struct amdgpu_device. Fix 283 of the following type of warnings: 283 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:111:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva Signed-off-by: Felix Kuehling Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher

drm/amdkfd: Move the process suspend and resume out of full access

2025-06-18T16:19:19+00:00

For the suspend and resume process, exclusive access is not required. Therefore, it can be moved out of the full access section to reduce the duration of exclusive access. v3: Move suspend processes before hardware fini. Remove twice call for bare metal. v4: Refine code Signed-off-by: Emily Deng Acked-by: Lijo Lazar Signed-off-by: Alex Deucher

drm/amdkfd: allow compute partition mode switch with cgroup exclusions

2025-06-18T16:19:17+00:00

The KFD currently bars a compute partition mode switch while a KFD process exists. Since cgroup excluded devices remain excluded for the lifetime of a KFD process and user space is able to mode switch single devices, allow users to mode switch a device with any running process that has been cgroup excluded from this device. Signed-off-by: Jonathan Kim Reviewed-by: Harish Kasiviswanathan Signed-off-by: Alex Deucher

drm/amdgpu: simplify xgmi peer info calls

2025-02-25T16:45:12+00:00

Deprecate KFD XGMI peer info calls in favour of calling directly from simplified XGMI peer info functions. Signed-off-by: Jonathan Kim Reviewed-by: Lijo Lazar Signed-off-by: Alex Deucher

drm/amdgpu: remove all KFD fences from the BO on release

2025-02-21T15:41:49+00:00

Remove all KFD BOs from the private dma_resv object. This prevents the KFD from being evict unecessarily when an exported BO is released. Signed-off-by: Christian König Signed-off-by: James Zhu Reviewed-by: Felix Kuehling Reviewed-and-tested-by: James Zhu Signed-off-by: Alex Deucher

drm/amdkfd: Fix pasid value leak

2025-02-13T02:05:50+00:00

Curret kfd does not allocate pasid values, instead uses pasid value for each vm from graphic driver. So should not prevent graphic driver from releasing pasid values since the values are allocated by graphic driver, not kfd driver anymore. This patch does not stop graphic driver release pasid values. Fixes: 8544374c0f82 ("drm/amdkfd: Have kfd driver use same PASID values from graphic driver") Signed-off-by: Xiaogang Chen Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher

drm/amdkfd: Have kfd driver use same PASID values from graphic driver

2025-02-13T02:02:55+00:00

Current kfd driver has its own PASID value for a kfd process and uses it to locate vm at interrupt handler or mapping between kfd process and vm. That design is not working when a physical gpu device has multiple spatial partitions, ex: adev in CPX mode. This patch has kfd driver use same pasid values that graphic driver generated which is per vm per pasid. These pasid values are passed to fw/hardware. We do not need change interrupt handler though more pasid values are used. Also, pasid values at log are replaced by user process pid; pasid values are not exposed to user. Users see their process pids that have meaning in user space. Signed-off-by: Xiaogang Chen Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher

drm/amdgpu: Optimize gfx v9 GPU page fault handling

2024-12-18T17:39:07+00:00

After GPU page fault, there are lots of page fault interrupts generated at short period even with CAM filter enabled because the fault address is different. Each page fault copy to KFD ih fifo to send event to user space by KFD interrupt worker, this could cause KFD ih fifo overflow while other processes generate events at same time. KFD process is aborted after GPU page fault, we only need one GPU page fault interrupt sent to KFD ih fifo to send memory exception event to user space. Incease KFD ih fifo size to 2 times of IH primary ring size, to handle the burst events case. This patch handle the gfx v9 path, cover retry on/off and CAM filter on/off cases. Signed-off-by: Philip Yang Reviewed-by: Felix Kuehling Signed-off-by: Alex Deucher