summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/ttm
AgeCommit message (Collapse)AuthorFilesLines
2026-01-19drm/ttm: Avoid NULL pointer deref for evicted BOsSimon Richter1-0/+6
commit 491adc6a0f9903c32b05f284df1148de39e8e644 upstream. It is possible for a BO to exist that is not currently associated with a resource, e.g. because it has been evicted. When devcoredump tries to read the contents of all BOs for dumping, we need to expect this as well -- in this case, ENODATA is recorded instead of the buffer contents. Fixes: 7d08df5d0bd3 ("drm/ttm: Add ttm_bo_access") Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2") Cc: stable <stable@kernel.org> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6271 Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20251013161241.709916-1-Simon.Richter@hogyros.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-08-28drm/ttm: Respect the shrinker core free targetTvrtko Ursulin1-3/+5
[ Upstream commit eac21f8ebeb4f84d703cf41dc3f81d16fa9dc00a ] Currently the TTM shrinker aborts shrinking as soon as it frees pages from any of the page order pools and by doing so it can fail to respect the freeing target which was configured by the shrinker core. We use the wording "can fail" because the number of freed pages will depend on the presence of pages in the pools and the order of the pools on the LRU list. For example if there are no free pages in the high order pools the shrinker core may require multiple passes over the TTM shrinker before it will free the default target of 128 pages (assuming there are free pages in the low order pools). This inefficiency can be compounded by the pool LRU where multiple further calls into the TTM shrinker are required to end up looking at the pool with pages. Improve this by never freeing less than the shrinker core has requested. At the same time we start reporting the number of scanned pages (freed in this case), which prevents the core shrinker from giving up on the TTM shrinker too soon and moving on. v2: * Simplify loop logic. (Christian) * Improve commit message. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Cc: Christian König <christian.koenig@amd.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://lore.kernel.org/r/20250603112750.34997-2-tvrtko.ursulin@igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-28drm/ttm: Should to return the evict errorEmily Deng1-0/+3
[ Upstream commit 4e16a9a00239db5d819197b9a00f70665951bf50 ] For the evict fail case, the evict error should be returned. v2: Consider ENOENT case. v3: Abort directly when the eviction failed for some reason (except for -ENOENT) and not wait for the move to finish Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250603091154.3472646-1-Emily.Deng@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-27drm/ttm: add ttm_resource_fini v2Christian König3-0/+26
[ Upstream commit de3688e469b08be958914674e8b01cb0cea42388 ] Make sure we call the common cleanup function in all implementations of the resource manager. v2: fix missing case in i915, rudimentary kerneldoc, should be filled in more when we add more functionality Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-2-christian.koenig@amd.com Stable-dep-of: 89709105a609 ("drm/vmwgfx: fix a memleak in vmw_gmrid_man_get_node") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01drm/ttm: Fix an invalid freeing on already freed page in error pathThomas Hellström1-1/+1
commit 40510a941d27d405a82dc3320823d875f94625df upstream. If caching mode change fails due to, for example, OOM we free the allocated pages in a two-step process. First the pages for which the caching change has already succeeded. Secondly the pages for which a caching change did not succeed. However the second step was incorrectly freeing the pages already freed in the first step. Fix. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") Cc: Christian König <christian.koenig@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.4+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240221073324.3303-1-thomas.hellstrom@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-08drm/ttm: Reorder sys manager cleanup stepKarolina Stolarek1-4/+4
[ Upstream commit 3b401e30c249849d803de6c332dad2a595a58658 ] With the current cleanup flow, we could trigger a NULL pointer dereference if there is a delayed destruction of a BO with a system resource that gets executed on drain_workqueue() call, as we attempt to free a resource using an already released resource manager. Remove the device from the device list and drain its workqueue before releasing the system domain manager in ttm_device_fini(). Signed-off-by: Karolina Stolarek <karolina.stolarek@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231016121525.2237838-1-karolina.stolarek@intel.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11drm/ttm: check null pointer before accessing when swappingGuchun Chen1-1/+2
commit 2dedcf414bb01b8d966eb445db1d181d92304fb2 upstream. Add a check to avoid null pointer dereference as below: [ 90.002283] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 90.002292] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 90.002346] ? exc_general_protection+0x159/0x240 [ 90.002352] ? asm_exc_general_protection+0x26/0x30 [ 90.002357] ? ttm_bo_evict_swapout_allowable+0x322/0x5e0 [ttm] [ 90.002365] ? ttm_bo_evict_swapout_allowable+0x42e/0x5e0 [ttm] [ 90.002373] ttm_bo_swapout+0x134/0x7f0 [ttm] [ 90.002383] ? __pfx_ttm_bo_swapout+0x10/0x10 [ttm] [ 90.002391] ? lock_acquire+0x44d/0x4f0 [ 90.002398] ? ttm_device_swapout+0xa5/0x260 [ttm] [ 90.002412] ? lock_acquired+0x355/0xa00 [ 90.002416] ? do_raw_spin_trylock+0xb6/0x190 [ 90.002421] ? __pfx_lock_acquired+0x10/0x10 [ 90.002426] ? ttm_global_swapout+0x25/0x210 [ttm] [ 90.002442] ttm_device_swapout+0x198/0x260 [ttm] [ 90.002456] ? __pfx_ttm_device_swapout+0x10/0x10 [ttm] [ 90.002472] ttm_global_swapout+0x75/0x210 [ttm] [ 90.002486] ttm_tt_populate+0x187/0x3f0 [ttm] [ 90.002501] ttm_bo_handle_move_mem+0x437/0x590 [ttm] [ 90.002517] ttm_bo_validate+0x275/0x430 [ttm] [ 90.002530] ? __pfx_ttm_bo_validate+0x10/0x10 [ttm] [ 90.002544] ? kasan_save_stack+0x33/0x60 [ 90.002550] ? kasan_set_track+0x25/0x30 [ 90.002554] ? __kasan_kmalloc+0x8f/0xa0 [ 90.002558] ? amdgpu_gtt_mgr_new+0x81/0x420 [amdgpu] [ 90.003023] ? ttm_resource_alloc+0xf6/0x220 [ttm] [ 90.003038] amdgpu_bo_pin_restricted+0x2dd/0x8b0 [amdgpu] [ 90.003210] ? __x64_sys_ioctl+0x131/0x1a0 [ 90.003210] ? do_syscall_64+0x60/0x90 Fixes: a2848d08742c ("drm/ttm: never consider pinned BOs for eviction&swap") Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230724024229.1118444-1-guchun.chen@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03drm/ttm: never consider pinned BOs for eviction&swapChristian König1-0/+6
[ Upstream commit a2848d08742c8e8494675892c02c0d22acbe3cf8 ] There is a small window where we have already incremented the pin count but not yet moved the bo from the lru to the pinned list. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@amd.com> Tested-by: Pelloux-Prayer, Pierre-Eric <Pierre-eric.Pelloux-prayer@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230707120826.3701-1-christian.koenig@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03drm/ttm: Don't leak a resource on eviction errorThomas Hellström1-11/+11
[ Upstream commit e8188c461ee015ba0b9ab2fc82dbd5ebca5a5532 ] On eviction errors other than -EMULTIHOP we were leaking a resource. Fix. v2: - Avoid yet another goto (Andi Shyti) Fixes: 403797925768 ("drm/ttm: Fix multihop assert on eviction.") Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> #v1 Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230626091450.14757-4-thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-03drm/ttm: Don't print error message if eviction was interruptedThomas Hellström1-1/+2
[ Upstream commit 8ab3b0663e279ab550bc2c0b5d602960e8b94e02 ] Avoid printing an error message if eviction was interrupted by, for example, the user pressing CTRL-C. That may happen if eviction is waiting for something, like for example a free batch-buffer. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230307144621.10748-6-thomas.hellstrom@linux.intel.com Stable-dep-of: e8188c461ee0 ("drm/ttm: Don't leak a resource on eviction error") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-07-23drm/ttm: Don't leak a resource on swapout move errorThomas Hellström1-0/+1
commit a590f03d8de7c4cb7ce4916dc7f2fd10711faabe upstream. If moving the bo to system for swapout failed, we were leaking a resource. Fix. Fixes: bfa3357ef9ab ("drm/ttm: allocate resource object instead of embedding it v2") Cc: Christian König <christian.koenig@amd.com> Cc: "Christian König" <ckoenig.leichtzumerken@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.14+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230626091450.14757-5-thomas.hellstrom@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-11drm/ttm/pool: Fix ttm_pool_alloc error pathThomas Hellström1-30/+51
[ Upstream commit 379989e7cbdc7aa7496a00ee286ec146c7599cf0 ] When hitting an error, the error path forgot to unmap dma mappings and could call set_pages_wb() on already uncached pages. Fix this by introducing a common ttm_pool_free_range() function that does the right thing. v2: - Simplify that common function (Christian König) v3: - Rename that common function to ttm_pool_free_range() (Christian König) Fixes: d099fc8f540a ("drm/ttm: new TT backend allocation pool v3") Cc: Christian König <christian.koenig@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: dri-devel@lists.freedesktop.org Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230404200650.11043-2-thomas.hellstrom@linux.intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11drm/ttm: optimize pool allocations a bit v2Christian König1-24/+58
[ Upstream commit 735c466465eba51deaee3012d8403c10fc7c8c03 ] If we got a page pool use it as much as possible. If we can't get more pages from the pool allocate as much as possible. Only if that still doesn't work reduce the order and try again. v2: minor cleanups Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221107195808.1873-1-christian.koenig@amd.com Stable-dep-of: 379989e7cbdc ("drm/ttm/pool: Fix ttm_pool_alloc error path") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25drm/ttm: Fix dummy res NULL ptr deref bugArunpravin Paneer Selvam1-1/+1
commit cf4b7387c0a842d64bdd7c353e6d3298174a7740 upstream. Check the bo->resource value before accessing the resource mem_type. v2: Fix commit description unwrapped warning <log snip> [ 40.191227][ T184] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN PTI [ 40.192995][ T184] KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] [ 40.194411][ T184] CPU: 1 PID: 184 Comm: systemd-udevd Not tainted 5.19.0-rc4-00721-gb297c22b7070 #1 [ 40.196063][ T184] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014 [ 40.199605][ T184] RIP: 0010:ttm_bo_validate+0x1b3/0x240 [ttm] [ 40.200754][ T184] Code: e8 72 c5 ff ff 83 f8 b8 74 d4 85 c0 75 54 49 8b 9e 58 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 10 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 04 3c 03 7e 44 8b 53 10 31 c0 85 d2 0f 85 58 [ 40.203685][ T184] RSP: 0018:ffffc900006df0c8 EFLAGS: 00010202 [ 40.204630][ T184] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1102f4bb71b [ 40.205864][ T184] RDX: 0000000000000002 RSI: ffffc900006df208 RDI: 0000000000000010 [ 40.207102][ T184] RBP: 1ffff920000dbe1a R08: ffffc900006df208 R09: 0000000000000000 [ 40.208394][ T184] R10: ffff88817a5f0000 R11: 0000000000000001 R12: ffffc900006df110 [ 40.209692][ T184] R13: ffffc900006df0f0 R14: ffff88817a5db800 R15: ffffc900006df208 [ 40.210862][ T184] FS: 00007f6b1d16e8c0(0000) GS:ffff88839d700000(0000) knlGS:0000000000000000 [ 40.212250][ T184] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 40.213275][ T184] CR2: 000055a1001d4ff0 CR3: 00000001700f4000 CR4: 00000000000006e0 [ 40.214469][ T184] Call Trace: [ 40.214974][ T184] <TASK> [ 40.215438][ T184] ? ttm_bo_bounce_temp_buffer+0x140/0x140 [ttm] [ 40.216572][ T184] ? mutex_spin_on_owner+0x240/0x240 [ 40.217456][ T184] ? drm_vma_offset_add+0xaa/0x100 [drm] [ 40.218457][ T184] ttm_bo_init_reserved+0x3d6/0x540 [ttm] [ 40.219410][ T184] ? shmem_get_inode+0x744/0x980 [ 40.220231][ T184] ttm_bo_init_validate+0xb1/0x200 [ttm] [ 40.221172][ T184] ? bo_driver_evict_flags+0x340/0x340 [drm_vram_helper] [ 40.222530][ T184] ? ttm_bo_init_reserved+0x540/0x540 [ttm] [ 40.223643][ T184] ? __do_sys_finit_module+0x11a/0x1c0 [ 40.224654][ T184] ? __shmem_file_setup+0x102/0x280 [ 40.234764][ T184] drm_gem_vram_create+0x305/0x480 [drm_vram_helper] [ 40.235766][ T184] ? bo_driver_evict_flags+0x340/0x340 [drm_vram_helper] [ 40.236846][ T184] ? __kasan_slab_free+0x108/0x180 [ 40.237650][ T184] drm_gem_vram_fill_create_dumb+0x134/0x340 [drm_vram_helper] [ 40.238864][ T184] ? local_pci_probe+0xdf/0x180 [ 40.239674][ T184] ? drmm_vram_helper_init+0x400/0x400 [drm_vram_helper] [ 40.240826][ T184] drm_client_framebuffer_create+0x19c/0x400 [drm] [ 40.241955][ T184] ? drm_client_buffer_delete+0x200/0x200 [drm] [ 40.243001][ T184] ? drm_client_pick_crtcs+0x554/0xb80 [drm] [ 40.244030][ T184] drm_fb_helper_generic_probe+0x23f/0x940 [drm_kms_helper] [ 40.245226][ T184] ? __cond_resched+0x1c/0xc0 [ 40.245987][ T184] ? drm_fb_helper_memory_range_to_clip+0x180/0x180 [drm_kms_helper] [ 40.247316][ T184] ? mutex_unlock+0x80/0x100 [ 40.248005][ T184] ? __mutex_unlock_slowpath+0x2c0/0x2c0 [ 40.249083][ T184] drm_fb_helper_single_fb_probe+0x907/0xf00 [drm_kms_helper] [ 40.250314][ T184] ? drm_fb_helper_check_var+0x1180/0x1180 [drm_kms_helper] [ 40.251540][ T184] ? __cond_resched+0x1c/0xc0 [ 40.252321][ T184] ? mutex_lock+0x9f/0x100 [ 40.253062][ T184] __drm_fb_helper_initial_config_and_unlock+0xb9/0x2c0 [drm_kms_helper] [ 40.254394][ T184] drm_fbdev_client_hotplug+0x56f/0x840 [drm_kms_helper] [ 40.255477][ T184] drm_fbdev_generic_setup+0x165/0x3c0 [drm_kms_helper] [ 40.256607][ T184] bochs_pci_probe+0x6b7/0x900 [bochs] [ 40.257515][ T184] ? _raw_spin_lock_irqsave+0x87/0x100 [ 40.258312][ T184] ? bochs_hw_init+0x480/0x480 [bochs] [ 40.259244][ T184] ? bochs_hw_init+0x480/0x480 [bochs] [ 40.260186][ T184] local_pci_probe+0xdf/0x180 [ 40.260928][ T184] pci_call_probe+0x15f/0x500 [ 40.265798][ T184] ? _raw_spin_lock+0x81/0x100 [ 40.266508][ T184] ? pci_pm_suspend_noirq+0x980/0x980 [ 40.267322][ T184] ? pci_assign_irq+0x81/0x280 [ 40.268096][ T184] ? pci_match_device+0x351/0x6c0 [ 40.268883][ T184] ? kernfs_put+0x18/0x40 [ 40.269611][ T184] pci_device_probe+0xee/0x240 [ 40.270352][ T184] really_probe+0x435/0xa80 [ 40.271021][ T184] __driver_probe_device+0x2ab/0x480 [ 40.271828][ T184] driver_probe_device+0x49/0x140 [ 40.272627][ T184] __driver_attach+0x1bd/0x4c0 [ 40.273372][ T184] ? __device_attach_driver+0x240/0x240 [ 40.274273][ T184] bus_for_each_dev+0x11e/0x1c0 [ 40.275080][ T184] ? subsys_dev_iter_exit+0x40/0x40 [ 40.275951][ T184] ? klist_add_tail+0x132/0x280 [ 40.276767][ T184] bus_add_driver+0x39b/0x580 [ 40.277574][ T184] driver_register+0x20f/0x3c0 [ 40.278281][ T184] ? 0xffffffffc04a2000 [ 40.278894][ T184] do_one_initcall+0x8a/0x300 [ 40.279642][ T184] ? trace_event_raw_event_initcall_level+0x1c0/0x1c0 [ 40.280707][ T184] ? kasan_unpoison+0x23/0x80 [ 40.281479][ T184] ? kasan_unpoison+0x23/0x80 [ 40.282197][ T184] do_init_module+0x190/0x640 [ 40.282926][ T184] load_module+0x221b/0x2780 [ 40.283611][ T184] ? layout_and_allocate+0x5c0/0x5c0 [ 40.284401][ T184] ? kernel_read_file+0x286/0x6c0 [ 40.285216][ T184] ? __x64_sys_fspick+0x2c0/0x2c0 [ 40.286043][ T184] ? mmap_region+0x4e7/0x1300 [ 40.286832][ T184] ? __do_sys_finit_module+0x11a/0x1c0 [ 40.287743][ T184] __do_sys_finit_module+0x11a/0x1c0 [ 40.288636][ T184] ? __ia32_sys_init_module+0xc0/0xc0 [ 40.289557][ T184] ? __seccomp_filter+0x15e/0xc80 [ 40.290341][ T184] ? vm_mmap_pgoff+0x185/0x240 [ 40.291060][ T184] do_syscall_64+0x3b/0xc0 [ 40.291763][ T184] entry_SYSCALL_64_after_hwframe+0x46/0xb0 [ 40.292678][ T184] RIP: 0033:0x7f6b1d6279b9 [ 40.293438][ T184] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d a7 54 0c 00 f7 d8 64 89 01 48 [ 40.296302][ T184] RSP: 002b:00007ffe7f51b798 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 40.297633][ T184] RAX: ffffffffffffffda RBX: 00005642dcca2880 RCX: 00007f6b1d6279b9 [ 40.298890][ T184] RDX: 0000000000000000 RSI: 00007f6b1d7b2e2d RDI: 0000000000000016 [ 40.300199][ T184] RBP: 0000000000020000 R08: 0000000000000000 R09: 00005642dccd5530 [ 40.301547][ T184] R10: 0000000000000016 R11: 0000000000000246 R12: 00007f6b1d7b2e2d [ 40.302698][ T184] R13: 0000000000000000 R14: 00005642dcca4230 R15: 00005642dcca2880 Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reported-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220726162205.2778-1-Arunpravin.PaneerSelvam@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20220809095623.3569-1-Arunpravin.PaneerSelvam@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27drm/ttm: Put BO in its memory manager's lru listxinhui pan1-0/+2
commit 781050b0a3164934857c300bb0bc291e38c26b6f upstream. After we move BO to a new memory region, we should put it to the new memory manager's lru list regardless we unlock the resv or not. Cc: stable@vger.kernel.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211110043149.57554-1-xinhui.pan@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18drm/ttm: remove ttm_bo_vm_insert_huge()Jason Gunthorpe1-92/+2
[ Upstream commit 0d979509539ed1df883a30d442177ca7be609565 ] The huge page functionality in TTM does not work safely because PUD and PMD entries do not have a special bit. get_user_pages_fast() considers any page that passed pmd_huge() as usable: if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))) { And vmf_insert_pfn_pmd_prot() unconditionally sets entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE. As such gup_huge_pmd() will try to deref a struct page: head = try_grab_compound_head(pmd_page(orig), refs, flags); and thus crash. Thomas further notices that the drivers are not expecting the struct page to be used by anything - in particular the refcount incr above will cause them to malfunction. Thus everything about this is not able to fully work correctly considering GUP_fast. Delete it entirely. It can return someday along with a proper PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast. Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> [danvet: Update subject per Thomas' &Christian's review] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/ttm: stop calling tt_swapin in vm_accessMatthew Auld1-5/+0
[ Upstream commit f5d28856b89baab4232a9f841e565763fcebcdf9 ] In commit: commit 09ac4fcb3f255e9225967c75f5893325c116cdbe Author: Felix Kuehling <Felix.Kuehling@amd.com> Date: Thu Jul 13 17:01:16 2017 -0400 drm/ttm: Implement vm_operations_struct.access v2 we added the vm_access hook, where we also directly call tt_swapin for some reason. If something is swapped-out then the ttm_tt must also be unpopulated, and since access_kmap should also call tt_populate, if needed, then swapping-in will already be handled there. If anything, calling tt_swapin directly here would likely always fail since the tt->pages won't yet be populated, or worse since the tt->pages array is never actually cleared in unpopulate this might lead to a nasty uaf. Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210927114114.152310-1-matthew.auld@intel.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-21drm/ttm: fix memleak in ttm_transfered_destroyChristian König1-0/+1
We need to cleanup the fences for ghost objects as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Erhard F. <erhard_f@mailbox.org> Tested-by: Erhard F. <erhard_f@mailbox.org> Reviewed-by: Huang Rui <ray.huang@amd.com> Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214029 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214447 CC: <stable@vger.kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211020173211.2247-1-christian.koenig@amd.com
2021-09-14drm/ttm: fix type mismatch error on sparc64Huang Rui1-1/+2
On sparc64, __fls() returns an "int", but the drm TTM code expected it to be "unsigned long" as on x86. As a result, on sparc (and arc, and m68k) you get build errors because 'min()' checks that the types match. As suggested by Linus, it can use min_t instead of min to force the type to be "unsigned int". Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-10Merge tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds3-8/+6
Pull drm fixes from Dave Airlie: "Just an initial bunch of fixes for the merge window, amdgpu is most of them with a few ttm fixes and an fbdev avoid multiply overflow fix. core: - Make some dma-buf config options depend on DMA_SHARED_BUFFER - Handle multiplication overflow of fbdev xres/yres in the core ttm: - Fix ttm_bo_move_memcpy() when ttm_resource is subclassed - Fix ttm deadlock if target BO isn't idle - ttm build fix - ttm docs fix dma-buf: - config option fixes fbdev: - limit resolutions to avoid int overflow i915: - stddef change. amdgpu: - Misc cleanups, typo fixes - EEPROM fix - Add some new PCI IDs - Scatter/Gather display support for Yellow Carp - PCIe DPM fix for RKL platforms - RAS fix amdkfd: - SVM fix vc4: - static function fix mgag200: - fix uninit var panfrost: - lock_region fixes" * tag 'drm-next-2021-09-10' of git://anongit.freedesktop.org/drm/drm: (36 commits) drm/ttm: Fix a deadlock if the target BO is not idle during swap fbmem: don't allow too huge resolutions dma-buf: DMABUF_SYSFS_STATS should depend on DMA_SHARED_BUFFER dma-buf: DMABUF_DEBUG should depend on DMA_SHARED_BUFFER drm/i915: use linux/stddef.h due to "isystem: trim/fixup stdarg.h and other headers" dma-buf: DMABUF_MOVE_NOTIFY should depend on DMA_SHARED_BUFFER drm/amdkfd: drop process ref count when xnack disable drm/amdgpu: enable more pm sysfs under SRIOV 1-VF mode drm/amdgpu: fix fdinfo race with process exit drm/amdgpu: Fix a deadlock if previous GEM object allocation fails drm/amdgpu: stop scheduler when calling hw_fini (v2) drm/amdgpu: Clear RAS interrupt status on aldebaran drm/amd/display: Initialize lt_settings on instantiation drm/amd/display: cleanup idents after a revert drm/amd/display: Fix memory leak reported by coverity drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resource drm/amdgpu/swsmu: fix spelling mistake "minimun" -> "minimum" drm/amdgpu: Disable PCIE_DPM on Intel RKL Platform drm/amdgpu: show both cmd id and name when psp cmd failed drm/amd/display: setup system context for APUs ...
2021-09-10drm/ttm: Fix a deadlock if the target BO is not idle during swapxinhui pan1-3/+3
The ret value might be -EBUSY, caller will think lru lock is still locked but actually NOT. So return -ENOSPC instead. Otherwise we hit list corruption. ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Fixes: ebd59851c796 ("drm/ttm: move swapout logic around v3") Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhui.pan@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-09-01Merge tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-29/+37
Pull drm updates from Dave Airlie: "Highlights: - i915 has seen a lot of refactoring and uAPI cleanups due to a change in the upstream direction going forward This has all been audited with known userspace, but there may be some pitfalls that were missed. - i915 now uses common TTM to enable discrete memory on DG1/2 GPUs - i915 enables Jasper and Elkhart Lake by default and has preliminary XeHP/DG2 support - amdgpu adds support for Cyan Skillfish - lots of implicit fencing rules documented and fixed up in drivers - msm now uses the core scheduler - the irq midlayer has been removed for non-legacy drivers - the sysfb code now works on more than x86. Otherwise the usual smattering of stuff everywhere, panels, bridges, refactorings. Detailed summary: core: - extract i915 eDP backlight into core - DP aux bus support - drm_device.irq_enabled removed - port drivers to native irq interfaces - export gem shadow plane handling for vgem - print proper driver name in framebuffer registration - driver fixes for implicit fencing rules - ARM fixed rate compression modifier added - updated fb damage handling - rmfb ioctl logging/docs - drop drm_gem_object_put_locked - define DRM_FORMAT_MAX_PLANES - add gem fb vmap/vunmap helpers - add lockdep_assert(once) helpers - mark drm irq midlayer as legacy - use offset adjusted bo mapping conversion vgaarb: - cleanups fbdev: - extend efifb handling to all arches - div by 0 fixes for multiple drivers udmabuf: - add hugepage mapping support dma-buf: - non-dynamic exporter fixups - document implicit fencing rules amdgpu: - Initial Cyan Skillfish support - switch virtual DCE over to vkms based atomic - VCN/JPEG power down fixes - NAVI PCIE link handling fixes - AMD HDMI freesync fixes - Yellow Carp + Beige Goby fixes - Clockgating/S0ix/SMU/EEPROM fixes - embed hw fence in job - rework dma-resv handling - ensure eviction to system ram amdkfd: - uapi: SVM address range query added - sysfs leak fix - GPUVM TLB optimizations - vmfault/migration counters i915: - Enable JSL and EHL by default - preliminary XeHP/DG2 support - remove all CNL support (never shipped) - move to TTM for discrete memory support - allow mixed object mmap handling - GEM uAPI spring cleaning - add I915_MMAP_OBJECT_FIXED - reinstate ADL-P mmap ioctls - drop a bunch of unused by userspace features - disable and remove GPU relocations - revert some i915 misfeatures - major refactoring of GuC for Gen11+ - execbuffer object locking separate step - reject caching/set-domain on discrete - Enable pipe DMC loading on XE-LPD and ADL-P - add PSF GV point support - Refactor and fix DDI buffer translations - Clean up FBC CFB allocation code - Finish INTEL_GEN() and friends macro conversions nouveau: - add eDP backlight support - implicit fence fix msm: - a680/7c3 support - drm/scheduler conversion panfrost: - rework GPU reset virtio: - fix fencing for planes ast: - add detect support bochs: - move to tiny GPU driver vc4: - use hotplug irqs - HDMI codec support vmwgfx: - use internal vmware device headers ingenic: - demidlayering irq rcar-du: - shutdown fixes - convert to bridge connector helpers zynqmp-dsub: - misc fixes mgag200: - convert PLL handling to atomic mediatek: - MT8133 AAL support - gem mmap object support - MT8167 support etnaviv: - NXP Layerscape LS1028A SoC support - GEM mmap cleanups tegra: - new user API exynos: - missing unlock fix - build warning fix - use refcount_t" * tag 'drm-next-2021-08-31-1' of git://anongit.freedesktop.org/drm/drm: (1318 commits) drm/amd/display: Move AllowDRAMSelfRefreshOrDRAMClockChangeInVblank to bounding box drm/amd/display: Remove duplicate dml init drm/amd/display: Update bounding box states (v2) drm/amd/display: Update number of DCN3 clock states drm/amdgpu: disable GFX CGCG in aldebaran drm/amdgpu: Clear RAS interrupt status on aldebaran drm/amdgpu: Add support for RAS XGMI err query drm/amdkfd: Account for SH/SE count when setting up cu masks. drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domain drm/amdgpu: drop redundant cancel_delayed_work_sync call drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspend drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend drm/amdkfd: map SVM range with correct access permission drm/amdkfd: check access permisson to restore retry fault drm/amdgpu: Update RAS XGMI Error Query drm/amdgpu: Add driver infrastructure for MCA RAS drm/amd/display: Add Logging for HDMI color depth information drm/amd/amdgpu: consolidate PSP TA init shared buf functions drm/amd/amdgpu: add name field back to ras_common_if drm/amdgpu: Fix build with missing pm_suspend_target_state module export ...
2021-08-31drm/ttm: Fix ttm_bo_move_memcpy() for subclassed struct ttm_resourceThomas Hellström1-4/+3
The code was making a copy of a struct ttm_resource. However, recently the struct ttm_resources were allowed to be subclassed and also were allowed to be malloced, hence the driver could end up assuming the copy we handed it was subclassed and worse, the original could have been freed at this point. Fix this by using the original struct ttm_resource before it is potentially freed in ttm_bo_move_sync_cleanup() v2: Base on drm-misc-next-fixes rather than drm-tip. Reported-by: Ben Skeggs <skeggsb@gmail.com> Reported-by: Dave Airlie <airlied@gmail.com> Cc: Christian König <christian.koenig@amd.com> Cc: <stable@vger.kernel.org> Fixes: 3bf3710e3718 ("drm/ttm: Add a generic TTM memcpy move for page-based iomem") Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210831071536.80636-1-thomas.hellstrom@linux.intel.com
2021-08-16drm/ttm: Include pagemap.h from ttm_tt.hJason Ekstrand1-1/+0
It's needed for pgprot_t which is used in the header. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Cc: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210812203443.1725307-2-jason@jlekstrand.net Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-08-16drm: ttm: Don't bail from ttm_global_init if debugfs_create_dir failsDan Moulding1-2/+0
In 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()"), ttm_global_init was changed so that if creation of the debugfs global root directory fails, ttm_global_init will bail out early and return an error, leading to initialization failure of DRM drivers. However, not every system will be using debugfs. On such a system, debugfs directory creation can be expected to fail, but DRM drivers must still be usable. This changes it so that if creation of TTM's debugfs root directory fails, then no biggie: keep calm and carry on. Fixes: 69de4421bb4c ("drm/ttm: Initialize debugfs from ttm_global_init()") Signed-off-by: Dan Moulding <dmoulding@me.com> Tested-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210810195906.22220-2-dmoulding@me.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-26Backmerge tag 'v5.14-rc3' into drm-nextDave Airlie3-0/+8
Linux 5.14-rc3 Daniel said we should pull the nouveau fix from fixes in here, probably a good plan. Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-22drm/ttm: Initialize debugfs from ttm_global_init()Jason Ekstrand2-16/+12
We create a bunch of debugfs entries as a side-effect of ttm_global_init() and then never clean them up. This isn't usually a problem because we free the whole debugfs directory on module unload. However, if the global reference count ever goes to zero and then ttm_global_init() is called again, we'll re-create those debugfs entries and debugfs will complain in dmesg that we're creating entries that already exist. This patch fixes this problem by changing the lifetime of the whole TTM debugfs directory to match that of the TTM global state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210721152358.2893314-6-jason@jlekstrand.net
2021-07-21drm/ttm: add missing NULL checksPavel Skripkin2-0/+6
My local syzbot instance hit GPF in ttm_bo_release(). Unfortunately, syzbot didn't produce a reproducer for this, but I found out possible scenario: drm_gem_vram_create() <-- drm_gem_vram_object kzalloced (bo embedded in this object) ttm_bo_init() ttm_bo_init_reserved() ttm_resource_alloc() man->func->alloc() <-- allocation failure ttm_bo_put() ttm_bo_release() ttm_mem_io_free() <-- bo->resource == NULL passed as second argument *GPF* Added NULL check inside ttm_mem_io_free() to prevent reported GPF and make this function NULL save in future. Same problem was in ttm_bo_move_to_lru_tail() as Christian reported. ttm_bo_move_to_lru_tail() is called in ttm_bo_release() and mem pointer can be NULL as well as in ttm_mem_io_free(). Fail log: KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027] ... RIP: 0010:ttm_mem_io_free+0x28/0x170 drivers/gpu/drm/ttm/ttm_bo_util.c:66 .. Call Trace: ttm_bo_release+0xd94/0x10a0 drivers/gpu/drm/ttm/ttm_bo.c:422 kref_put include/linux/kref.h:65 [inline] ttm_bo_put drivers/gpu/drm/ttm/ttm_bo.c:470 [inline] ttm_bo_init_reserved+0x7cb/0x960 drivers/gpu/drm/ttm/ttm_bo.c:1050 ttm_bo_init+0x105/0x270 drivers/gpu/drm/ttm/ttm_bo.c:1074 drm_gem_vram_create+0x332/0x4c0 drivers/gpu/drm/drm_gem_vram_helper.c:228 Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708112518.17271-1-paskripkin@gmail.com
2021-07-21drm/ttm: Force re-init if ttm_global_init() failsJason Ekstrand1-0/+2
If we have a failure, decrement the reference count so that the next call to ttm_global_init() will actually do something instead of assume everything is all set up. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Fixes: 62b53b37e4b1 ("drm/ttm: use a static ttm_bo_global instance") Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720181357.2760720-5-jason@jlekstrand.net Signed-off-by: Christian König <christian.koenig@amd.com>
2021-07-21Merge tag 'drm-misc-next-2021-07-16' of ↵Dave Airlie1-29/+37
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15: UAPI Changes: Cross-subsystem Changes: - udmabuf: Add support for mapping hugepages - Add dma-buf stats to sysfs. - Assorted fixes to fbdev/omap2. - dma-buf: Document DMA_BUF_IOCTL_SYNC - Improve dma-buf non-dynamic exporter expectations better. - Add module parameters for dma-buf size and list limit. - Add HDMI codec support to vc4, to replace vc4's own codec. - Document dma-buf implicit fencing rules. - dma_resv_test_signaled test_all handling. Core Changes: - Extract i915's eDP backlight code into DRM helpers. - Assorted docbook updates. - Rework drm_dp_aux documentation. - Add support for the DP aux bus. - Shrink dma-fence-chain slightly. - Add alloc/free helpers for dma-fence-chain. - Assorted fixes to TTM., drm/of, bridge - drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers. - Small fix for scheduler completion. - Remove use of drm_device.irq_enabled. - Print the driver name to dmesg when registering framebuffer. - Export drm/gem's shadow plane handling, and use it in vkms. - Assorted small fixes. Driver Changes: - Add eDP backlight to nouveau. - Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625, amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm. - Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU, EDT ETM0350G0DH6, Innolux EJ030NA panels. - Fix some simple pannels missing bus_format and connector types. - Add mks-guest-stats instrumentation support to vmwgfx. - Merge i915-ttm topic branch. - Make s6e63m0 panel use Mipi-DBI helpers. - Add detect() supoprt for AST. - Use interrupts for hotplug on vc4. - vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now. - vmwgfx now uses copies of vmware's internal device headers. - Slowly convert ti-sn65dsi83 over to atomic. - Rework amdgpu dma-resv handling. - Fix virtio fencing for planes. - Ensure amdgpu can always evict to SYSTEM. - Many drivers fixed for implicit fencing rules. - Set default prepare/cleanup fb for tiny, vram and simple helpers too. - Rework panfrost gpu reset and related serialization. - Update VKMS todo list. - Make bochs a tiny gpu driver, and use vram helper. - Use linux irq interfaces instead of drm_irq in some drivers. - Add support for Raspberry Pi Pico to GUD. Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Fri 16 Jul 2021 21:06:04 AEST # gpg: using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3 # gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired] # gpg: aka "Maarten Lankhorst <maarten@debian.org>" [expired] # gpg: aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired] # gpg: Note: This key has expired! # Primary key fingerprint: B97B D6A8 0CAC 4981 091A E547 FE55 8C72 A670 13C3 From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/444811c3-cbec-e9d5-9a6b-9632eda7962a@linux.intel.com
2021-07-14drm/ttm: add a check against null pointer dereferenceZheyu Ma1-0/+3
When calling ttm_range_man_fini(), 'man' may be uninitialized, which may cause a null pointer dereference bug. Fix this by checking if it is a null pointer. This log reveals it: [ 7.902580 ] BUG: kernel NULL pointer dereference, address: 0000000000000058 [ 7.905721 ] RIP: 0010:ttm_range_man_fini+0x40/0x160 [ 7.911826 ] Call Trace: [ 7.911826 ] radeon_ttm_fini+0x167/0x210 [ 7.911826 ] radeon_bo_fini+0x15/0x40 [ 7.913767 ] rs400_fini+0x55/0x80 [ 7.914358 ] radeon_device_fini+0x3c/0x140 [ 7.914358 ] radeon_driver_unload_kms+0x5c/0xe0 [ 7.914358 ] radeon_driver_load_kms+0x13a/0x200 [ 7.914358 ] ? radeon_driver_unload_kms+0xe0/0xe0 [ 7.914358 ] drm_dev_register+0x1db/0x290 [ 7.914358 ] radeon_pci_probe+0x16a/0x230 [ 7.914358 ] local_pci_probe+0x4a/0xb0 Signed-off-by: Zheyu Ma <zheyuma97@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1626274459-8148-1-git-send-email-zheyuma97@gmail.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-06-23drm/ttm: add TTM_PL_FLAG_TEMPORARY flag v3Lang Yu1-0/+3
Sometimes drivers need to use bounce buffers to evict BOs. While those reside in some domain they are not necessarily suitable for CS. Add a flag so that drivers can note that a bounce buffers needs to be reallocated during validation. v2: add detailed comments v3 (chk): merge commits and rework commit message Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-1-andrey.grodzovsky@amd.com
2021-06-23drm/ttm: Fix multihop assert on eviction.Andrey Grodzovsky1-29/+34
Problem: Under memory pressure when GTT domain is almost full multihop assert will come up when trying to evict LRU BO from VRAM to SYSTEM. Fix: Don't assert on multihop error in evict code but rather do a retry as we do in ttm_bo_move_buffer Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-6-andrey.grodzovsky@amd.com
2021-06-23Backmerge tag 'v5.13-rc7' into drm-nextDave Airlie2-8/+5
Backmerge Linux 5.13-rc7 to make some pulls from later bases apply, and to bake in the conflicts so far.
2021-06-08drm/ttm: fix deref of bo->ttm without holding the lock v2Christian König2-8/+5
We need to grab the resv lock first before doing that check. v2 (chk): simplify the change for -fixes Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210528130041.1683-1-christian.koenig@amd.com
2021-06-08drm/ttm: nuke VM_MIXEDMAP on BO mappings v3Christian König1-30/+12
We discussed if that is really the right approach for quite a while now, but digging deeper into a bug report on arm turned out that this is actually horrible broken right now. The reason for this is that vmf_insert_mixed_prot() always tries to grab a reference to the underlaying page on architectures without ARCH_HAS_PTE_SPECIAL and as far as I can see also enabled GUP. So nuke using VM_MIXEDMAP here and use VM_PFNMAP instead. Also make sure to reject mappings without VM_SHARED. v2: reject COW mappings, merge function with only caller v3: adjust comment as suggested by Thomas Signed-off-by: Christian König <christian.koenig@amd.com> Bugs: https://gitlab.freedesktop.org/drm/amd/-/issues/1606#note_936174 Link: https://patchwork.freedesktop.org/patch/msgid/20210607135830.8574-1-christian.koenig@amd.com Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-06-08drm/ttm: fix pipelined gutting v2Christian König1-8/+20
We need to make sure to allocate the sys_mem resource before the point of no return. v2: add missing return value checking, also handle idle case Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210608081931.11339-1-christian.koenig@amd.com
2021-06-07drm/ttm: fix warning after moving resource to ghost objChristian König1-0/+1
After we moved the resource to the ghost the bo->resource pointer needs to be reset since the owner of the resource is now the ghost. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210607175737.1405-1-christian.koenig@amd.com
2021-06-07drm/ttm: fix access to uninitialized variable.Christian König1-1/+1
The resource is not allocated yet, so no chance that this will work. Use the placement instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210607171152.15914-1-christian.koenig@amd.com
2021-06-07drm/ttm, drm/amdgpu: Allow the driver some control over swappingThomas Hellström1-16/+30
We are calling the eviction_valuable driver callback at eviction time to determine whether we actually can evict a buffer object. The upcoming i915 TTM backend needs the same functionality for swapout, and that might actually be beneficial to other drivers as well. Add an eviction_valuable call also in the swapout path. Try to keep the current behaviour for all drivers by returning true if the buffer object is already in the TTM_PL_SYSTEM placement. We change behaviour for the case where a buffer object is in a TT backed placement when swapped out, in which case the drivers normal eviction_valuable path is run. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20210602083818.241793-8-thomas.hellstrom@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-8-thomas.hellstrom@linux.intel.com
2021-06-07drm/ttm: Document and optimize ttm_bo_pipeline_gutting()Thomas Hellström2-15/+59
If the bo is idle when calling ttm_bo_pipeline_gutting(), we unnecessarily create a ghost object and push it out to delayed destroy. Fix this by adding a path for idle, and document the function. Also avoid having the bo end up in a bad state vulnerable to user-space triggered kernel BUGs if the call to ttm_tt_create() fails. Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict(). Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20210602083818.241793-7-thomas.hellstrom@linux.intel.com
2021-06-07drm/ttm: Use drm_memcpy_from_wc for TTM bo movesThomas Hellström1-16/+3
Use fast wc memcpy for reading out of wc memory for TTM bo moves. Cc: Dave Airlie <airlied@gmail.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> #v4 Link: https://lore.kernel.org/r/20210602083818.241793-6-thomas.hellstrom@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-6-thomas.hellstrom@linux.intel.com
2021-06-07drm/ttm: Add a generic TTM memcpy move for page-based iomemThomas Hellström4-181/+371
The internal ttm_bo_util memcpy uses ioremap functionality, and while it probably might be possible to use it for copying in- and out of sglist represented io memory, using io_mem_reserve() / io_mem_free() callbacks, that would cause problems with fault(). Instead, implement a method mapping page-by-page using kmap_local() semantics. As an additional benefit we then avoid the occasional global TLB flushes of ioremap() and consuming ioremap space, elimination of a critical point of failure and with a slight change of semantics we could also push the memcpy out async for testing and async driver development purposes. A special linear iomem iterator is introduced internally to mimic the old ioremap behaviour for code-paths that can't immediately be ported over. This adds to the code size and should be considered a temporary solution. Looking at the code we have a lot of checks for iomap tagged pointers. Ideally we should extend the core memremap functions to also accept uncached memory and kmap_local functionality. Then we could strip a lot of code. Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20210602083818.241793-4-thomas.hellstrom@linux.intel.com
2021-06-07drm/ttm: fix missing res assignment in ttm_range_man_allocChristian König1-4/+6
That somehow got missing. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: cb1c81467af3 ("drm/ttm: flip the switch for driver allocated resources v2") Link: https://patchwork.freedesktop.org/patch/msgid/20210607104040.22017-1-christian.koenig@amd.com
2021-06-06dma-buf: drop the _rcu postfix on function names v3Christian König1-9/+9
The functions can be called both in _rcu context as well as while holding the lock. v2: add some kerneldoc as suggested by Daniel v3: fix indentation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
2021-06-06dma-buf: rename and cleanup dma_resv_get_list v2Christian König1-1/+1
When the comment needs to state explicitly that this is doesn't get a reference to the object then the function is named rather badly. Rename the function and use it in even more places. v2: use dma_resv_shared_list as new name Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-5-christian.koenig@amd.com
2021-06-06dma-buf: rename and cleanup dma_resv_get_excl v3Christian König1-1/+1
When the comment needs to state explicitly that this doesn't get a reference to the object then the function is named rather badly. Rename the function and use rcu_dereference_check(), this way it can be used from both rcu as well as lock protected critical sections. v2: improve kerneldoc as suggested by Daniel v3: use dma_resv_excl_fence as function name Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-4-christian.koenig@amd.com
2021-06-04drm/ttm: flip the switch for driver allocated resources v2Christian König3-38/+15
Instead of both driver and TTM allocating memory finalize embedding the ttm_resource object as base into the driver backends. v2: fix typo in vmwgfx grid mgr and double init in amdgpu_vram_mgr.c Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-10-christian.koenig@amd.com
2021-06-04drm/ttm: flip over the sys manager to self allocated nodesChristian König1-0/+7
Make sure to allocate a resource object here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-3-christian.koenig@amd.com
2021-06-04drm/ttm: flip over the range manager to self allocated nodesChristian König2-24/+58
Start with the range manager to make the resource object the base class for the allocated nodes. While at it cleanup a lot of the code around that. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-2-christian.koenig@amd.com