diff options
| author | Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> | 2026-02-09 20:54:45 +0300 |
|---|---|---|
| committer | Alex Deucher <alexander.deucher@amd.com> | 2026-02-12 23:22:03 +0300 |
| commit | b18fc0ab837381c1a6ef28386602cd888f2d9edf (patch) | |
| tree | 7456a86ec6a71685324b08ca3b970eb164d55faf | |
| parent | 9aca641143cec1b25e76f54fc49187b17393c12f (diff) | |
| download | linux-b18fc0ab837381c1a6ef28386602cd888f2d9edf.tar.xz | |
drm/amdgpu: fix sync handling in amdgpu_dma_buf_move_notify
Invalidating a dmabuf will impact other users of the shared BO.
In the scenario where process A moves the BO, it needs to inform
process B about the move and process B will need to update its
page table.
The commit fixes a synchronisation bug caused by the use of the
ticket: it made amdgpu_vm_handle_moved behave as if updating
the page table immediately was correct but in this case it's not.
An example is the following scenario, with 2 GPUs and glxgears
running on GPU0 and Xorg running on GPU1, on a system where P2P
PCI isn't supported:
glxgears:
export linear buffer from GPU0 and import using GPU1
submit frame rendering to GPU0
submit tiled->linear blit
Xorg:
copy of linear buffer
The sequence of jobs would be:
drm_sched_job_run # GPU0, frame rendering
drm_sched_job_queue # GPU0, blit
drm_sched_job_done # GPU0, frame rendering
drm_sched_job_run # GPU0, blit
move linear buffer for GPU1 access #
amdgpu_dma_buf_move_notify -> update pt # GPU0
It this point the blit job on GPU0 is still running and would
likely produce a page fault.
Cc: stable@vger.kernel.org
Fixes: a448cb003edc ("drm/amdgpu: implement amdgpu_gem_prime_move_notify v2")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
| -rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 9 |
1 files changed, 8 insertions, 1 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index b9c38a4fe546..656c267dbe58 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -514,8 +514,15 @@ amdgpu_dma_buf_move_notify(struct dma_buf_attachment *attach) r = dma_resv_reserve_fences(resv, 2); if (!r) r = amdgpu_vm_clear_freed(adev, vm, NULL); + + /* Don't pass 'ticket' to amdgpu_vm_handle_moved: we want the clear=true + * path to be used otherwise we might update the PT of another process + * while it's using the BO. + * With clear=true, amdgpu_vm_bo_update will sync to command submission + * from the same VM. + */ if (!r) - r = amdgpu_vm_handle_moved(adev, vm, ticket); + r = amdgpu_vm_handle_moved(adev, vm, NULL); if (r && r != -EBUSY) DRM_ERROR("Failed to invalidate VM page tables (%d))\n", |
