summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/ttm/ttm_bo.c
AgeCommit message (Collapse)AuthorFilesLines
2021-09-10drm/ttm: Fix a deadlock if the target BO is not idle during swapxinhui pan1-3/+3
The ret value might be -EBUSY, caller will think lru lock is still locked but actually NOT. So return -ENOSPC instead. Otherwise we hit list corruption. ttm_bo_cleanup_refs might fail too if BO is not idle. If we return 0, caller(ttm_tt_populate -> ttm_global_swapout ->ttm_device_swapout) will be stuck as we actually did not free any BO memory. This usually happens when the fence is not signaled for a long time. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Fixes: ebd59851c796 ("drm/ttm: move swapout logic around v3") Link: https://patchwork.freedesktop.org/patch/msgid/20210907040832.1107747-1-xinhui.pan@amd.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-26Backmerge tag 'v5.14-rc3' into drm-nextDave Airlie1-0/+3
Linux 5.14-rc3 Daniel said we should pull the nouveau fix from fixes in here, probably a good plan. Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-07-21drm/ttm: add missing NULL checksPavel Skripkin1-0/+3
My local syzbot instance hit GPF in ttm_bo_release(). Unfortunately, syzbot didn't produce a reproducer for this, but I found out possible scenario: drm_gem_vram_create() <-- drm_gem_vram_object kzalloced (bo embedded in this object) ttm_bo_init() ttm_bo_init_reserved() ttm_resource_alloc() man->func->alloc() <-- allocation failure ttm_bo_put() ttm_bo_release() ttm_mem_io_free() <-- bo->resource == NULL passed as second argument *GPF* Added NULL check inside ttm_mem_io_free() to prevent reported GPF and make this function NULL save in future. Same problem was in ttm_bo_move_to_lru_tail() as Christian reported. ttm_bo_move_to_lru_tail() is called in ttm_bo_release() and mem pointer can be NULL as well as in ttm_mem_io_free(). Fail log: KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027] ... RIP: 0010:ttm_mem_io_free+0x28/0x170 drivers/gpu/drm/ttm/ttm_bo_util.c:66 .. Call Trace: ttm_bo_release+0xd94/0x10a0 drivers/gpu/drm/ttm/ttm_bo.c:422 kref_put include/linux/kref.h:65 [inline] ttm_bo_put drivers/gpu/drm/ttm/ttm_bo.c:470 [inline] ttm_bo_init_reserved+0x7cb/0x960 drivers/gpu/drm/ttm/ttm_bo.c:1050 ttm_bo_init+0x105/0x270 drivers/gpu/drm/ttm/ttm_bo.c:1074 drm_gem_vram_create+0x332/0x4c0 drivers/gpu/drm/drm_gem_vram_helper.c:228 Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210708112518.17271-1-paskripkin@gmail.com
2021-07-21Merge tag 'drm-misc-next-2021-07-16' of ↵Dave Airlie1-29/+37
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15: UAPI Changes: Cross-subsystem Changes: - udmabuf: Add support for mapping hugepages - Add dma-buf stats to sysfs. - Assorted fixes to fbdev/omap2. - dma-buf: Document DMA_BUF_IOCTL_SYNC - Improve dma-buf non-dynamic exporter expectations better. - Add module parameters for dma-buf size and list limit. - Add HDMI codec support to vc4, to replace vc4's own codec. - Document dma-buf implicit fencing rules. - dma_resv_test_signaled test_all handling. Core Changes: - Extract i915's eDP backlight code into DRM helpers. - Assorted docbook updates. - Rework drm_dp_aux documentation. - Add support for the DP aux bus. - Shrink dma-fence-chain slightly. - Add alloc/free helpers for dma-fence-chain. - Assorted fixes to TTM., drm/of, bridge - drm_gem_plane_helper_prepare/cleanup_fb is now the default for gem drivers. - Small fix for scheduler completion. - Remove use of drm_device.irq_enabled. - Print the driver name to dmesg when registering framebuffer. - Export drm/gem's shadow plane handling, and use it in vkms. - Assorted small fixes. Driver Changes: - Add eDP backlight to nouveau. - Assorted fixes and cleanups to nouveau, panfrost, vmwgfx, anx7625, amdgpu, gma500, radeon, mgag200, vgem, vc4, vkms, omapdrm. - Add support for Samsung DB7430, Samsung ATNA33XC20, EDT ETMV570G2DHU, EDT ETM0350G0DH6, Innolux EJ030NA panels. - Fix some simple pannels missing bus_format and connector types. - Add mks-guest-stats instrumentation support to vmwgfx. - Merge i915-ttm topic branch. - Make s6e63m0 panel use Mipi-DBI helpers. - Add detect() supoprt for AST. - Use interrupts for hotplug on vc4. - vmwgfx is now moved to drm-misc-next, as sroland is no longer a maintainer for now. - vmwgfx now uses copies of vmware's internal device headers. - Slowly convert ti-sn65dsi83 over to atomic. - Rework amdgpu dma-resv handling. - Fix virtio fencing for planes. - Ensure amdgpu can always evict to SYSTEM. - Many drivers fixed for implicit fencing rules. - Set default prepare/cleanup fb for tiny, vram and simple helpers too. - Rework panfrost gpu reset and related serialization. - Update VKMS todo list. - Make bochs a tiny gpu driver, and use vram helper. - Use linux irq interfaces instead of drm_irq in some drivers. - Add support for Raspberry Pi Pico to GUD. Signed-off-by: Dave Airlie <airlied@redhat.com> # gpg: Signature made Fri 16 Jul 2021 21:06:04 AEST # gpg: using RSA key B97BD6A80CAC4981091AE547FE558C72A67013C3 # gpg: Good signature from "Maarten Lankhorst <maarten.lankhorst@linux.intel.com>" [expired] # gpg: aka "Maarten Lankhorst <maarten@debian.org>" [expired] # gpg: aka "Maarten Lankhorst <maarten.lankhorst@canonical.com>" [expired] # gpg: Note: This key has expired! # Primary key fingerprint: B97B D6A8 0CAC 4981 091A E547 FE55 8C72 A670 13C3 From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/444811c3-cbec-e9d5-9a6b-9632eda7962a@linux.intel.com
2021-06-23drm/ttm: add TTM_PL_FLAG_TEMPORARY flag v3Lang Yu1-0/+3
Sometimes drivers need to use bounce buffers to evict BOs. While those reside in some domain they are not necessarily suitable for CS. Add a flag so that drivers can note that a bounce buffers needs to be reallocated during validation. v2: add detailed comments v3 (chk): merge commits and rework commit message Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-1-andrey.grodzovsky@amd.com
2021-06-23drm/ttm: Fix multihop assert on eviction.Andrey Grodzovsky1-29/+34
Problem: Under memory pressure when GTT domain is almost full multihop assert will come up when trying to evict LRU BO from VRAM to SYSTEM. Fix: Don't assert on multihop error in evict code but rather do a retry as we do in ttm_bo_move_buffer Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-6-andrey.grodzovsky@amd.com
2021-06-23Backmerge tag 'v5.13-rc7' into drm-nextDave Airlie1-1/+4
Backmerge Linux 5.13-rc7 to make some pulls from later bases apply, and to bake in the conflicts so far.
2021-06-08drm/ttm: fix deref of bo->ttm without holding the lock v2Christian König1-1/+4
We need to grab the resv lock first before doing that check. v2 (chk): simplify the change for -fixes Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210528130041.1683-1-christian.koenig@amd.com
2021-06-07drm/ttm: fix access to uninitialized variable.Christian König1-1/+1
The resource is not allocated yet, so no chance that this will work. Use the placement instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210607171152.15914-1-christian.koenig@amd.com
2021-06-07drm/ttm, drm/amdgpu: Allow the driver some control over swappingThomas Hellström1-16/+30
We are calling the eviction_valuable driver callback at eviction time to determine whether we actually can evict a buffer object. The upcoming i915 TTM backend needs the same functionality for swapout, and that might actually be beneficial to other drivers as well. Add an eviction_valuable call also in the swapout path. Try to keep the current behaviour for all drivers by returning true if the buffer object is already in the TTM_PL_SYSTEM placement. We change behaviour for the case where a buffer object is in a TT backed placement when swapped out, in which case the drivers normal eviction_valuable path is run. Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20210602083818.241793-8-thomas.hellstrom@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210602083818.241793-8-thomas.hellstrom@linux.intel.com
2021-06-07drm/ttm: Document and optimize ttm_bo_pipeline_gutting()Thomas Hellström1-10/+10
If the bo is idle when calling ttm_bo_pipeline_gutting(), we unnecessarily create a ghost object and push it out to delayed destroy. Fix this by adding a path for idle, and document the function. Also avoid having the bo end up in a bad state vulnerable to user-space triggered kernel BUGs if the call to ttm_tt_create() fails. Finally reuse ttm_bo_pipeline_gutting() in ttm_bo_evict(). Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20210602083818.241793-7-thomas.hellstrom@linux.intel.com
2021-06-06dma-buf: drop the _rcu postfix on function names v3Christian König1-9/+9
The functions can be called both in _rcu context as well as while holding the lock. v2: add some kerneldoc as suggested by Daniel v3: fix indentation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-7-christian.koenig@amd.com
2021-06-06dma-buf: rename and cleanup dma_resv_get_list v2Christian König1-1/+1
When the comment needs to state explicitly that this is doesn't get a reference to the object then the function is named rather badly. Rename the function and use it in even more places. v2: use dma_resv_shared_list as new name Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-5-christian.koenig@amd.com
2021-06-06dma-buf: rename and cleanup dma_resv_get_excl v3Christian König1-1/+1
When the comment needs to state explicitly that this doesn't get a reference to the object then the function is named rather badly. Rename the function and use rcu_dereference_check(), this way it can be used from both rcu as well as lock protected critical sections. v2: improve kerneldoc as suggested by Daniel v3: use dma_resv_excl_fence as function name Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-4-christian.koenig@amd.com
2021-06-04drm/ttm: allocate resource object instead of embedding it v2Christian König1-55/+28
To improve the handling we want the establish the resource object as base class for the backend allocations. v2: add missing error handling Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210602100914.46246-1-christian.koenig@amd.com
2021-06-02drm/ttm: rename bo->mem and make it a pointerChristian König1-17/+20
When we want to decouble resource management from buffer management we need to be able to handle resources separately. Add a resource pointer and rename bo->mem so that all code needs to change to access the pointer instead. No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-4-christian.koenig@amd.com
2021-05-20drm/ttm: Explain why ttm_bo_add_move_fence uses a shared slotDaniel Vetter1-1/+3
Motivated because I got confused and Christian confirmed why this works. I think this is non-obvious enough that it merits a slightly longer comment. Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Christian König <ckoenig.leichtzumerken@gmail.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210519082409.672016-1-daniel.vetter@ffwll.ch
2021-05-03drm/ttm: properly allocate sys resource during swapoutChristian König1-5/+7
Drop the special handling here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-3-christian.koenig@amd.com
2021-05-03drm/ttm: always initialize the full ttm_resource v2Christian König1-22/+4
Init all fields in ttm_resource_alloc() when we create a new resource. v2: use place->mem_type instead of res->mem_type Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210430092508.60710-2-christian.koenig@amd.com
2021-04-23drm/ttm: move the page_alignment into the BO v2Christian König1-2/+1
The alignment is a constant property and shouldn't change. v2: move documentation as well as suggested by Matthew. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210413135248.1266-4-christian.koenig@amd.com
2021-04-23drm/ttm: remove special handling for non GEM driversChristian König1-11/+0
vmwgfx is the only driver actually using this. Move the handling into the driver instead. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Zack Rusin <zackr@vmware.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210419092853.1605-1-christian.koenig@amd.com
2021-04-22drm/ttm/ttm_bo: Fix incorrectly documented function 'ttm_bo_cleanup_refs'Lee Jones1-1/+1
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/ttm/ttm_bo.c:293: warning: expecting prototype for function ttm_bo_cleanup_refs(). Prototype was for ttm_bo_cleanup_refs() instead Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20210416143725.2769053-24-lee.jones@linaro.org Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com>
2021-04-19drm/ttm: warn stricter about freeing pinned BOsChristian König1-1/+3
So far we only warned when the BOs where pinned and not idle. Also warn if we see a pinned BO in general. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210415084730.2057-3-christian.koenig@amd.com
2021-04-08drm/ttm: Ignore signaled move fencesFelix Kuehling1-1/+2
Move fences that have already signaled should not prevent memory allocations with no_wait_gpu. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210227034524.21763-1-Felix.Kuehling@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-03-26drm/ttm: fix invalid NULL derefChristian König1-3/+3
The BO might be NULL in this function, use the bdev directly. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Colin Ian King <colin.king@canonical.com> Fixes: a1f091f8ef2b ("drm/ttm: switch to per device LRU lock") Link: https://patchwork.freedesktop.org/patch/msgid/20210325152740.82633-1-christian.koenig@amd.com
2021-03-24drm/ttm: switch to per device LRU lockChristian König1-26/+23
Instead of having a global lock for potentially less contention. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424010/
2021-03-24drm/ttm: remove swap LRU v3Christian König1-29/+0
Instead evict round robin from each devices SYSTEM and TT domain. v2: reorder num_pages access reported by Dan's script v3: fix rebase fallout, num_pages should be 32bit Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424009/
2021-03-24drm/ttm: move swapout logic around v3Christian König1-41/+16
Move the iteration of the global lru into the new function ttm_global_swapout() and use that instead in drivers. v2: consistently return int v3: fix build fail Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/424008/
2021-03-16Merge tag 'drm-misc-next-2021-03-03' of ↵Dave Airlie1-280/+55
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.13: UAPI Changes: Cross-subsystem Changes: Core Changes: - %p4cc printk format modifier - atomic: introduce drm_crtc_commit_wait, rework atomic plane state helpers to take the drm_commit_state structure - dma-buf: heaps rework to return a struct dma_buf - simple-kms: Add plate state helpers - ttm: debugfs support, removal of sysfs Driver Changes: - Convert drivers to shadow plane helpers - arc: Move to drm/tiny - ast: cursor plane reworks - gma500: Remove TTM and medfield support - mxsfb: imx8mm support - panfrost: MMU IRQ handling rework - qxl: rework to better handle resources deallocation, locking - sun4i: Add alpha properties for UI and VI layers - vc4: RPi4 CEC support - vmwgfx: doc cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20210303100600.dgnkadonzuvfnu22@gilmour
2021-03-11drm/ttm: soften TTM warningsChristian König1-2/+6
QXL indeed unrefs pinned BOs and the warnings are spamming peoples log files. Make sure we warn only once until the QXL driver is fixed. Signed-off-by: Christian König <christian.koenig@amd.com> References: https://lore.kernel.org/lkml/YD+eYcMMcdlXB8PY@alley/ Link: https://patchwork.freedesktop.org/patch/422834/ Reviewed-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-02-25drm/ttm: Do not add non-system domain BO into swap listxinhui pan1-0/+2
BO would be added into swap list if it is validated into system domain. If BO is validated again into non-system domain, say, VRAM domain. It actually should not be in the swap list. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210224032808.150465-1-xinhui.pan@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>
2021-02-25drm/ttm: Fix a memory leakxinhui pan1-3/+6
Free the memory on failure. Also no need to re-alloc memory on retry. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210219042547.44855-1-xinhui.pan@amd.com Reviewed-by: Christian König <christian.koenig@amd.com> CC: stable@vger.kernel.org # 5.11 Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2021-02-09drm/ttm: move memory accounting into vmwgfx v4Christian König1-32/+1
This is just another feature which is only used by VMWGFX, so move it into the driver instead. I've tried to add the accounting sysfs file to the kobject of the drm minor, but I'm not 100% sure if this works as expected. v2: fix typo in KFD and avoid 64bit divide v3: fix init order in VMWGFX v4: use pdev sysfs reference instead of drm Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Zack Rusin <zackr@vmware.com> (v3) Tested-by: Nirmoy Das <nirmoy.das@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-2-christian.koenig@amd.com
2021-02-09drm/ttm: rework ttm_tt page limit v4Christian König1-2/+2
TTM implements a rather extensive accounting of allocated memory. There are two reasons for this: 1. It tries to block userspace allocating a huge number of very small BOs without accounting for the kmalloced memory. 2. Make sure we don't over allocate and run into an OOM situation during swapout while trying to handle the memory shortage. This is only partially a good idea. First of all it is perfectly valid for an application to use all of system memory, limiting it to 50% is not really acceptable. What we need to take care of is that the application is held accountable for the memory it allocated. This is what control mechanisms like memcg and the normal Linux page accounting already do. Making sure that we don't run into an OOM situation while trying to cope with a memory shortage is still a good idea, but this is also not very well implemented since it means another opportunity of recursion from the driver back into TTM. So start to rework all of this by implementing a shrinker callback which allows for TT object to be swapped out if necessary. v2: Switch from limit to shrinker callback. v3: fix gfp mask handling, use atomic for swapable_pages, add debugfs v4: drop the extra gfp_mask checks Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210208133226.36955-1-christian.koenig@amd.com
2021-01-21drm/ttm: device naming cleanupChristian König1-205/+51
Rename ttm_bo_device to ttm_device. Rename ttm_bo_driver to ttm_device_funcs. Rename ttm_bo_global to ttm_global. Move global and device related functions to ttm_device.[ch]. No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/415222/
2021-01-20drm/ttm: add debugfs directory v2Christian König1-45/+3
As far as I can tell the buffer_count was never used by an userspace application. The number of BOs in the system is far better suited in debugfs than sysfs and we now should be able to add other information here as well. v2: add that additionally to sysfs Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/414954/
2021-01-06drm/ttm: Remove pinned bos from LRU in ttm_bo_move_to_lru_tail() v2Lyude Paul1-1/+3
Recently a regression was introduced which caused TTM's buffer eviction to attempt to evict already-pinned BOs, causing issues with buffer eviction under memory pressure along with suspend/resume: nouveau 0000:1f:00.0: DRM: evicting buffers... nouveau 0000:1f:00.0: DRM: Moving pinned object 00000000c428c3ff! nouveau 0000:1f:00.0: fifo: fault 00 [READ] at 0000000000200000 engine 04 [BAR1] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel -1 [00ffeaa000 unknown] nouveau 0000:1f:00.0: fifo: DROPPED_MMU_FAULT 00001000 nouveau 0000:1f:00.0: fifo: fault 01 [WRITE] at 0000000000020000 engine 0c [HOST6] client 07 [HUB/HOST_CPU] reason 02 [PTE] on channel 1 [00ffb28000 DRM] nouveau 0000:1f:00.0: fifo: channel 1: killed nouveau 0000:1f:00.0: fifo: runlist 0: scheduled for recovery [TTM] Buffer eviction failed nouveau 0000:1f:00.0: DRM: waiting for kernel channels to go idle... nouveau 0000:1f:00.0: DRM: failed to idle channel 1 [DRM] nouveau 0000:1f:00.0: DRM: resuming display... After some bisection and investigation, it appears this resulted from the recent changes to ttm_bo_move_to_lru_tail(). Previously when a buffer was pinned, the buffer would be removed from the LRU once ttm_bo_unreserve to maintain the LRU list when pinning or unpinning BOs. However, since: commit 3d1a88e1051f ("drm/ttm: cleanup LRU handling further") We've been exiting from ttm_bo_move_to_lru_tail() at the very beginning of the function if the bo we're looking at is pinned, resulting in the pinned BO never getting removed from the lru and as a result - causing issues when it eventually becomes time for eviction. So, let's fix this by calling ttm_bo_del_from_lru() from ttm_bo_move_to_lru_tail() in the event that we're dealing with a pinned buffer. v2 (chk): reduce to only the fixing one liner since we always want to call the callback whenever we would move on the LRU. Fixes: 3d1a88e1051f ("drm/ttm: cleanup LRU handling further") Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210105114505.38210-1-christian.koenig@amd.com
2020-12-15drm/ttm: cleanup LRU handling furtherChristian König1-35/+25
We only completely delete the BO from the LRU on destruction. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Link: https://patchwork.freedesktop.org/patch/404618/
2020-12-15drm/ttm: use pin_count more extensivelyChristian König1-2/+1
Check the pin_count instead of the lru list is empty here. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Link: https://patchwork.freedesktop.org/patch/404617/
2020-12-14drm/ttm: cleanup BO size handling v3Christian König1-24/+11
Based on an idea from Dave, but cleaned up a bit. We had multiple fields for essentially the same thing. Now bo->base.size is the original size of the BO in arbitrary units, usually bytes. bo->mem.num_pages is the size in number of pages in the resource domain of bo->mem.mem_type. v2: use the GEM object size instead of the BO size v3: fix printks in some places Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> (v1) Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/406831/
2020-12-01drm/ttm/drivers: remove unecessary ttm_module.h include v2Christian König1-1/+2
ttm_module.h deals with internals of TTM and should never be include outside of it. v2: also move the file around Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/404885/
2020-11-27drm/ttm: Warn on pinning without holding a referenceDaniel Vetter1-1/+1
Not technically a problem for ttm, but very likely a driver bug and pretty big time confusing for reviewing code. So warn about it, both at cleanup time (so we catch these for sure) and at pin/unpin time (so we know who's the culprit). Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201028113120.3641237-1-daniel.vetter@ffwll.ch
2020-11-17drm/ttm/ttm_bo: Fix one function header - demote lots of kernel-doc abusesLee Jones1-11/+12
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/ttm/ttm_bo.c:51: warning: Function parameter or member 'ttm_global_mutex' not described in 'DEFINE_MUTEX' drivers/gpu/drm/ttm/ttm_bo.c:286: warning: Function parameter or member 'bo' not described in 'ttm_bo_cleanup_memtype_use' drivers/gpu/drm/ttm/ttm_bo.c:359: warning: Function parameter or member 'bo' not described in 'ttm_bo_cleanup_refs' drivers/gpu/drm/ttm/ttm_bo.c:359: warning: Function parameter or member 'interruptible' not described in 'ttm_bo_cleanup_refs' drivers/gpu/drm/ttm/ttm_bo.c:359: warning: Function parameter or member 'no_wait_gpu' not described in 'ttm_bo_cleanup_refs' drivers/gpu/drm/ttm/ttm_bo.c:359: warning: Function parameter or member 'unlock_resv' not described in 'ttm_bo_cleanup_refs' drivers/gpu/drm/ttm/ttm_bo.c:424: warning: Function parameter or member 'bdev' not described in 'ttm_bo_delayed_delete' drivers/gpu/drm/ttm/ttm_bo.c:424: warning: Function parameter or member 'remove_all' not described in 'ttm_bo_delayed_delete' drivers/gpu/drm/ttm/ttm_bo.c:635: warning: Function parameter or member 'bo' not described in 'ttm_bo_evict_swapout_allowable' drivers/gpu/drm/ttm/ttm_bo.c:635: warning: Function parameter or member 'ctx' not described in 'ttm_bo_evict_swapout_allowable' drivers/gpu/drm/ttm/ttm_bo.c:635: warning: Function parameter or member 'locked' not described in 'ttm_bo_evict_swapout_allowable' drivers/gpu/drm/ttm/ttm_bo.c:635: warning: Function parameter or member 'busy' not described in 'ttm_bo_evict_swapout_allowable' drivers/gpu/drm/ttm/ttm_bo.c:769: warning: Function parameter or member 'bo' not described in 'ttm_bo_add_move_fence' drivers/gpu/drm/ttm/ttm_bo.c:769: warning: Function parameter or member 'man' not described in 'ttm_bo_add_move_fence' drivers/gpu/drm/ttm/ttm_bo.c:769: warning: Function parameter or member 'mem' not described in 'ttm_bo_add_move_fence' drivers/gpu/drm/ttm/ttm_bo.c:769: warning: Function parameter or member 'no_wait_gpu' not described in 'ttm_bo_add_move_fence' drivers/gpu/drm/ttm/ttm_bo.c:806: warning: Function parameter or member 'bo' not described in 'ttm_bo_mem_force_space' drivers/gpu/drm/ttm/ttm_bo.c:806: warning: Function parameter or member 'place' not described in 'ttm_bo_mem_force_space' drivers/gpu/drm/ttm/ttm_bo.c:806: warning: Function parameter or member 'mem' not described in 'ttm_bo_mem_force_space' drivers/gpu/drm/ttm/ttm_bo.c:806: warning: Function parameter or member 'ctx' not described in 'ttm_bo_mem_force_space' drivers/gpu/drm/ttm/ttm_bo.c:872: warning: Function parameter or member 'bo' not described in 'ttm_bo_mem_space' drivers/gpu/drm/ttm/ttm_bo.c:872: warning: Function parameter or member 'placement' not described in 'ttm_bo_mem_space' drivers/gpu/drm/ttm/ttm_bo.c:872: warning: Function parameter or member 'mem' not described in 'ttm_bo_mem_space' drivers/gpu/drm/ttm/ttm_bo.c:872: warning: Function parameter or member 'ctx' not described in 'ttm_bo_mem_space' drivers/gpu/drm/ttm/ttm_bo.c:1387: warning: Function parameter or member 'ctx' not described in 'ttm_bo_swapout' Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Huang Rui <ray.huang@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20201116174112.1833368-32-lee.jones@linaro.org
2020-11-11drm/ttm: add multihop infrastrucutre (v3)Dave Airlie1-9/+65
Currently drivers get called to move a buffer, but if they have to move it temporarily through another space (SYSTEM->VRAM via TT) then they can end up with a lot of ttm->driver->ttm call stacks, if the temprorary space moves requires eviction. Instead of letting the driver do all the placement/space for the temporary, allow it to report back (-EMULTIHOP) and a placement (hop) to the move code, which will then do the temporary move, and the correct placement move afterwards. This removes a lot of code from drivers, at the expense of adding some midlayering. I've some further ideas on how to turn it inside out, but I think this is a good solution to the call stack problems. v2: separate out the driver patches, add WARN for getting MULTHOP in paths we shouldn't (Daniel) v3: use memset (Christian) Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: hristian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201109005432.861936-2-airlied@gmail.com
2020-11-04drm/ttm: replace context flags with bools v2Christian König1-1/+1
The ttm_operation_ctx structure has a mixture of flags and bools. Drop the flags and replace them with bools as well. v2: fix typos, improve comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/398686/
2020-11-02Merge drm/drm-next into drm-misc-nextMaxime Ripard1-1/+1
Daniel needs -rc2 in drm-misc-next to merge some patches Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-10-29drm/ttm: nuke old page allocatorChristian König1-1/+0
Not used any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> Tested-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/397087/?series=83051&rev=1
2020-10-29drm/ttm: wire up the new pool as default one v2Christian König1-2/+6
Provide the necessary parameters by all drivers and use the new pool alloc when no driver specific function is provided. v2: fix the GEM VRAM helpers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> Tested-by: Huang Rui <ray.huang@amd.com> Link: https://patchwork.freedesktop.org/patch/397081/?series=83051&rev=1
2020-10-26drm/ttm: merge ttm_dma_tt back into ttm_ttChristian König1-1/+1
It makes no difference to kmalloc if the structure is 48 or 64 bytes in size. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/396950/
2020-10-22drm/ttm: replace last move_notify with delete_mem_notifyDave Airlie1-2/+2
The move notify callback is only used in one place, this should be removed in the future, but for now just rename it to the use case which is to notify the driver that the GPU memory is to be deleted. Drivers can be cleaned up after this separately. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201021044031.1752624-2-airlied@gmail.com