summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/v3d/v3d_sched.c
AgeCommit message (Collapse)AuthorFilesLines
2019-05-02drm/scheduler: rework job destructionChristian König1-1/+1
We now destroy finished jobs from the worker thread to make sure that we never destroy a job currently in timeout processing. By this we avoid holding lock around ring mirror list in drm_sched_stop which should solve a deadlock reported by a user. v2: Remove unused variable. v4: Move guilty job free into sched code. v5: Move sched->hw_rq_count to drm_sched_start to account for counter decrement in drm_sched_stop even when we don't call resubmit jobs if guily job did signal. v6: remove unused variable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109692 Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/1555599624-12285-3-git-send-email-andrey.grodzovsky@amd.com
2019-04-18drm/v3d: Add missing implicit synchronization.Eric Anholt1-36/+4
It is the expectation of existing userspace (X11 + Mesa, in particular) that jobs submitted to the kernel against a shared BO will get implicitly synchronized by their submission order. If we want to allow clever userspace to disable implicit synchronization, we should do that under its own submit flag (as amdgpu and lima do). Note that we currently only implicitly sync for the rendering pass, not binning -- if you texture-from-pixmap in the binning vertex shader (vertex coordinate generation), you'll miss out on synchronization. Fixes flickering when multiple clients are running in parallel, particularly GL apps and compositors. v2: Fix a missing refcount on the CSD done fence for L2 cleaning. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-6-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18drm/v3d: Add support for compute shader dispatch.Eric Anholt1-7/+114
The compute shader dispatch interface is pretty simple -- just pass in the regs that userspace has passed us, with no CLs to run. However, with no CL to run it means that we need to do manual cache flushing of the L2 after the HW execution completes (for SSBO, atomic, and image_load_store writes that are the output of compute shaders). This doesn't yet expose the L2 cache's ability to have a region of the address space not write back to memory (which could be used for shared_var storage). So far, the Mesa side has been tested on V3D v4.2 simpenrose (passing the ES31 tests), and on the kernel side on 7278 (failing atomic compswap tests in a way that doesn't reproduce on simpenrose). v2: Fix excessive allocation for the clean_job (reported by Dan Carpenter). Keep refs on jobs until clean_job is finished, to avoid spurious MMU errors if the output BOs are freed by userspace before L2 cleaning is finished. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-4-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-18drm/v3d: Refactor job management.Eric Anholt1-108/+151
The CL submission had two jobs embedded in an exec struct. When I added TFU support, I had to replicate some of the exec stuff and some of the job stuff. As I went to add CSD, it became clear that actually what was in exec should just be in the two CL jobs, and it would let us share a lot more code between the 4 queues. v2: Fix missing error path in TFU ioctl's bo[] allocation. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190416225856.20264-3-eric@anholt.net Acked-by: Rob Clark <robdclark@gmail.com>
2019-04-01drm/v3d: Rename the fence signaled from IRQs to "irq_fence".Eric Anholt1-6/+6
We have another thing called the "done fence" that tracks when the scheduler considers the job done, and having the shared name was confusing. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20190313235211.28995-2-eric@anholt.net Reviewed-by: Dave Emett <david.emett@broadcom.com>
2019-03-14drm/v3d: Fix calling drm_sched_resubmit_jobs for same sched.Andrey Grodzovsky1-8/+5
Also stop calling drm_sched_increase_karma multiple times. v2: Fix whitespace in the code we're moving (by anholt) Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/1552409822-17230-1-git-send-email-andrey.grodzovsky@amd.com Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Eric Anholt <eric@anholt.net> Fixes: 222b5f044159 ("drm/sched: Refactor ring mirror list handling.")
2019-01-26drm/sched: Refactor ring mirror list handling.Andrey Grodzovsky1-5/+8
Decauple sched threads stop and start and ring mirror list handling from the policy of what to do about the guilty jobs. When stoppping the sched thread and detaching sched fences from non signaled HW fenes wait for all signaled HW fences to complete before rerunning the jobs. v2: Fix resubmission of guilty job into HW after refactoring. v4: Full restart for all the jobs, not only from guilty ring. Extract karma increase into standalone function. v5: Rework waiting for signaled jobs without relying on the job struct itself as those might already be freed for non 'guilty' job's schedulers. Expose karma increase to drivers. v6: Use list_for_each_entry_safe_continue and drm_sched_process_job in case fence already signaled. Call drm_sched_increase_karma only once for amdgpu and add documentation. v7: Wait only for the latest job's fence. Suggested-by: Christian Koenig <Christian.Koenig@amd.com> Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-12-01drm/v3d: Add support for submitting jobs to the TFU.Eric Anholt1-19/+128
The TFU can copy from raster, UIF, and SAND input images to UIF output images, with optional mipmap generation. This will certainly be useful for media EGL image input, but is also useful immediately for mipmap generation without bogging the V3D core down. For now we only run the queue 1 job deep, and don't have any hang recovery (though I don't think we should need it, with TFU). Queuing multiple jobs in the HW will require synchronizing the YUV coefficient regs updates since they don't get FIFOed with the job. v2: Change the ioctl to IOW instead of IOWR, always set COEF0, explain why TFU is AUTH, clarify the syncing docs, drop the unused TFU interrupt regs (you're expected to use the hub's), don't take &bo->base for NULL bos. v3: Fix a little whitespace alignment (noticed by checkpatch), rebase on drm_sched_job_cleanup() changes. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Dave Emett <david.emett@broadcom.com> (v2) Link: https://patchwork.freedesktop.org/patch/264607/
2018-11-29Merge tag 'drm-misc-next-2018-11-28' of ↵Dave Airlie1-1/+1
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v4.21: Core Changes: - Merge drm_info.c into drm_debugfs.c - Complete the fake drm_crtc_commit's hw_done/flip_done sooner. - Remove deprecated drm_obj_ref/unref functions. All drivers use get/put now. - Decrease stack use of drm_gem_prime_mmap. - Improve documentation for dumb callbacks. Driver Changes: - Add edid support to virtio. - Wait on implicit fence in meson and sun4i. - Add support for BGRX8888 to sun4i. - Preparation patches for sun4i driver to start supporting linear and tiled YUV formats. - Add support for HDMI 1.4 4k modes to meson, and support for VIC alternate timings. - Drop custom dumb_map in vkms. - Small fixes and cleanups to v3d. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/151a3270-b1be-ed75-bd58-6b29d741f592@linux.intel.com
2018-11-27drm/v3d: Update a comment about what uses v3d_job_dependency().Eric Anholt1-1/+1
I merged bin and render's paths in a late refactoring. Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20181108161654.19888-3-eric@anholt.net Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com>
2018-11-05drm/scheduler: Add drm_sched_job_cleanupSharat Masetty1-0/+2
This patch adds a new API to clean up the scheduler job resources. This is primarliy needed in cases the job was created but was not queued to the scheduler queue. Additionally with this change, the layer which creates the scheduler job also gets to free up the job's resources and this entails moving the dma_fence_put(finished_fence) to the drivers ops free handler routines. Signed-off-by: Sharat Masetty <smasetty@codeaurora.org> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-11-05drm/sched: make sure timer is restartedChristian König1-3/+0
Make sure we always restart the timer after a timeout and remove the device specific workarounds. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-27drm/scheduler: remove timeout work_struct from drm_sched_job (v3)Nayan Deshmukh1-1/+1
having a delayed work item per job is redundant as we only need one per scheduler to track the time out the currently executing job. v2: the first element of the ring mirror list is the currently executing job so we don't need a additional variable for it v3: squash in fixes for v3d and etnaviv Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-05drm/v3d: Fix a grammar nit in the scheduler docs.Eric Anholt1-2/+2
Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-4-eric@anholt.net Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel@ffwll.ch>
2018-07-05drm/v3d: Delay the scheduler timeout if we're still making progress.Eric Anholt1-0/+18
GTF-GLES2.gtf.GL.acos.acos_float_vert_xvary submits jobs that take 4 seconds at maximum resolution, but we still want to reset quickly if a job is really hung. Sample the CL's current address and the return address (since we call into tile lists repeatedly) and if either has changed then assume we've made progress. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/20180703170515.6298-1-eric@anholt.net Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Daniel Vetter <daniel@ffwll.ch>
2018-05-21drm/v3d: Checking for NULL vs IS_ERR()Dan Carpenter1-2/+2
The v3d_fence_create() only returns error pointers on error. It never returns NULL. Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Anholt <eric@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20180518081041.GC28335@mwanda
2018-05-04drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+Eric Anholt1-0/+228
This driver will be used to support Mesa on the Broadcom 7268 and 7278 platforms. V3D 3.3 introduces an MMU, which means we no longer need CMA or vc4's complicated CL/shader validation scheme. This massively changes the GEM behavior, so I've forked off to a new driver. v2: Mark SUBMIT_CL as needing DRM_AUTH. coccinelle fixes from kbuild test robot. Drop personal git link from MAINTAINERS. Don't double-map dma-buf imported BOs. Add kerneldoc about needing MMU eviction. Drop prime vmap/unmap stubs. Delay mmap offset setup to mmap time. Use drm_dev_init instead of _alloc. Use ktime_get() for wait_bo timeouts. Drop drm_can_sleep() usage, since we don't modeset. Switch page tables back to WC (debug change to coherent had slipped in). Switch drm_gem_object_unreference_unlocked() to drm_gem_object_put_unlocked(). Simplify overflow mem handling by not sharing overflow mem between jobs. v3: no changes v4: align submit_cl to 64 bits (review by airlied), check zero flags in other ioctls. Signed-off-by: Eric Anholt <eric@anholt.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v4) Acked-by: Dave Airlie <airlied@linux.ie> (v3, requested submit_cl change) Link: https://patchwork.freedesktop.org/patch/msgid/20180430181058.30181-3-eric@anholt.net