kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c, branch v6.1.168

drm/amdgpu: Update BO eviction priorities

2024-06-12T09:02:59+00:00

[ Upstream commit b0b13d532105e0e682d95214933bb8483a063184 ] Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling Acked-by: Mukul Joshi Tested-by: Mukul Joshi Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2

2024-05-17T09:56:16+00:00

commit d3a9331a6591e9df64791e076f6591f440af51c3 upstream. This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap. The basic problem here is that after the move the old location is simply not available any more. Some fixes were suggested, but essentially we should call the move notification before actually moving things because only this way we have the correct order for DMA-buf and VM move notifications as well. Also rework the statistic handling so that we don't update the eviction counter before the move. v2: add missing NULL check Signed-off-by: Christian König Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171 Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap

2024-02-05T20:12:56+00:00

[ Upstream commit 94aeb4117343d072e3a35b9595bcbfc0058ee724 ] Issue: during evict or validate happened on amdgpu_bo, the 'from' and 'to' is always same in ftrace event of amdgpu_bo_move where calling the 'trace_amdgpu_bo_move', the comment says move_notify is called before move happens, but actually it is called after move happens, here the new_mem is same as bo->resource Fix: move trace_amdgpu_bo_move from move_notify to amdgpu_bo_move Signed-off-by: Wang, Beyond Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: Remove unnecessary domain argument

2023-08-11T10:08:27+00:00

commit 3273f11675ef11959d25a56df3279f712bcd41b7 upstream Remove the "domain" argument to amdgpu_bo_create_kernel_at() since this function takes an "offset" argument which is the offset off of VRAM, and as such allocation always takes place in VRAM. Thus, the "domain" argument is unnecessary. Cc: Alex Deucher Cc: Christian König Cc: AMD Graphics Signed-off-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Mario Limonciello Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram

2023-06-14T09:15:22+00:00

[ Upstream commit 2a1eb1a343208ce7d6839b73d62aece343e693ff ] Use the function of amdgpu_bo_vm_destroy to handle the resource release of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but vmbo->shadow_list was not, which caused a null pointer reference error in amdgpu_device_recover_vram when GPU reset. Fixes: 6c032c37ac3e ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)") Signed-off-by: xinhui pan Signed-off-by: Horatio Zhang Acked-by: Feifei Xu Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: Fix call trace warning and hang when removing amdgpu device

2023-03-30T10:49:19+00:00

[ Upstream commit 93bb18d2a873d2fa9625c8ea927723660a868b95 ] On GPUs with RAS enabled, below call trace and hang are observed when shutting down device. v2: use DRM device unplugged flag instead of shutdown flag as the check to prevent memory wipe in shutdown stage. [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu] [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu] [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu] [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu] [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu] [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] [ +0.000090] drm_dev_release+0x28/0x50 [drm] [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm] [ +0.000011] devm_action_release+0x15/0x20 [ +0.000003] release_nodes+0x40/0xc0 [ +0.000001] devres_release_all+0x9e/0xe0 [ +0.000001] device_unbind_cleanup+0x12/0x80 [ +0.000003] device_release_driver_internal+0xff/0x160 [ +0.000001] driver_detach+0x4a/0x90 [ +0.000001] bus_remove_driver+0x6c/0xf0 [ +0.000001] driver_unregister+0x31/0x50 [ +0.000001] pci_unregister_driver+0x40/0x90 [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu] Signed-off-by: lyndonli Reviewed-by: Guchun Chen Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: Fix potential NULL dereference

2023-01-18T10:58:27+00:00

[ Upstream commit 0be7ed8e7eb15282b5d0f6fdfea884db594ea9bf ] Fix potential NULL dereference, in the case when "man", the resource manager might be NULL, when/if we print debug information. Cc: Alex Deucher Cc: Christian König Cc: AMD Graphics Cc: Dan Carpenter Cc: kernel test robot Fixes: 7554886daa31ea ("drm/amdgpu: Fix size validation for non-exclusive domains (v4)") Signed-off-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: Fix size validation for non-exclusive domains (v4)

2023-01-12T11:02:37+00:00

[ Upstream commit 7554886daa31eacc8e7fac9e15bbce67d10b8f1f ] Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the requested memory exists, else we get a kernel oops when dereferencing "man". v2: Make the patch standalone, i.e. not dependent on local patches. v3: Preserve old behaviour and just check that the manager pointer is not NULL. v4: Complain if GTT domain requested and it is uninitialized--most likely a bug. Cc: Alex Deucher Cc: Christian König Cc: AMD Graphics Signed-off-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin

drm/amdgpu: make display pinning more flexible (v2)

2023-01-07T10:12:03+00:00

commit 81d0bcf9900932633d270d5bc4a54ff599c6ebdb upstream. Only apply the static threshold for Stoney and Carrizo. This hardware has certain requirements that don't allow mixing of GTT and VRAM. Newer asics do not have these requirements so we should be able to be more flexible with where buffers end up. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255 Acked-by: Luben Tuikov Reviewed-by: Christian König Signed-off-by: Alex Deucher Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman

drm/amdgpu: Set vmbo destroy after pt bo is created

2022-10-06T16:08:09+00:00

Under VRAM usage pression, map to GPU may fail to create pt bo and vmbo->shadow_list is not initialized, then ttm_bo_release calling amdgpu_bo_vm_destroy to access vmbo->shadow_list generates below dmesg and NULL pointer access backtrace: Set vmbo destroy callback to amdgpu_bo_vm_destroy only after creating pt bo successfully, otherwise use default callback amdgpu_bo_destroy. amdgpu: amdgpu_vm_bo_update failed amdgpu: update_gpuvm_pte() failed amdgpu: Failed to map bo to gpuvm amdgpu 0000:43:00.0: amdgpu: Failed to map peer:0000:43:00.0 mem_domain:2 BUG: kernel NULL pointer dereference, address: RIP: 0010:amdgpu_bo_vm_destroy+0x4d/0x80 [amdgpu] Call Trace: ttm_bo_release+0x207/0x320 [amdttm] amdttm_bo_init_reserved+0x1d6/0x210 [amdttm] amdgpu_bo_create+0x1ba/0x520 [amdgpu] amdgpu_bo_create_vm+0x3a/0x80 [amdgpu] amdgpu_vm_pt_create+0xde/0x270 [amdgpu] amdgpu_vm_ptes_update+0x63b/0x710 [amdgpu] amdgpu_vm_update_range+0x2e7/0x6e0 [amdgpu] amdgpu_vm_bo_update+0x2bd/0x600 [amdgpu] update_gpuvm_pte+0x160/0x420 [amdgpu] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x313/0x1130 [amdgpu] kfd_ioctl_map_memory_to_gpu+0x115/0x390 [amdgpu] kfd_ioctl+0x24a/0x5b0 [amdgpu] Signed-off-by: Philip Yang Reviewed-by: Christian König Signed-off-by: Alex Deucher