summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/msm/adreno
AgeCommit message (Collapse)AuthorFilesLines
2023-03-17drm/msm/adreno: fix runtime PM imbalance at unbindJohan Hovold1-1/+2
[ Upstream commit 6153c44392b04ff2da1e9aa82ba87da9ab9a0fc1 ] A recent commit moved enabling of runtime PM from adreno_gpu_init() to adreno_load_gpu() (called on first open()), which means that unbind() may now be called with runtime PM disabled in case the device was never opened in between. Make sure to only forcibly suspend and disable runtime PM at unbind() in case runtime PM has been enabled to prevent a disable count imbalance. This specifically avoids leaving runtime PM disabled when the device is later opened after a successful bind: msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 Fixes: 4b18299b3365 ("drm/msm/adreno: Defer enabling runpm until hw_init()") Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com> Link: https://lore.kernel.org/lkml/20230203181245.3523937-1-quic_bjorande@quicinc.com Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Patchwork: https://patchwork.freedesktop.org/patch/523549/ Link: https://lore.kernel.org/r/20230221101430.14546-2-johan+linaro@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17adreno: Shutdown the GPU properlyJoel Fernandes (Google)1-2/+3
[ Upstream commit e752e5454e6417da3f40ec1306a041ea96c56423 ] During kexec on ARM device, we notice that device_shutdown() only calls pm_runtime_force_suspend() while shutting down the GPU. This means the GPU kthread is still running and further, there maybe active submits. This causes all kinds of issues during a kexec reboot: Warning from shutdown path: [ 292.509662] WARNING: CPU: 0 PID: 6304 at [...] adreno_runtime_suspend+0x3c/0x44 [ 292.509863] Hardware name: Google Lazor (rev3 - 8) with LTE (DT) [ 292.509872] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 292.509881] pc : adreno_runtime_suspend+0x3c/0x44 [ 292.509891] lr : pm_generic_runtime_suspend+0x30/0x44 [ 292.509905] sp : ffffffc014473bf0 [...] [ 292.510043] Call trace: [ 292.510051] adreno_runtime_suspend+0x3c/0x44 [ 292.510061] pm_generic_runtime_suspend+0x30/0x44 [ 292.510071] pm_runtime_force_suspend+0x54/0xc8 [ 292.510081] adreno_shutdown+0x1c/0x28 [ 292.510090] platform_shutdown+0x2c/0x38 [ 292.510104] device_shutdown+0x158/0x210 [ 292.510119] kernel_restart_prepare+0x40/0x4c And here from GPU kthread, an SError OOPs: [ 192.648789] el1h_64_error+0x7c/0x80 [ 192.648812] el1_interrupt+0x20/0x58 [ 192.648833] el1h_64_irq_handler+0x18/0x24 [ 192.648854] el1h_64_irq+0x7c/0x80 [ 192.648873] local_daif_inherit+0x10/0x18 [ 192.648900] el1h_64_sync_handler+0x48/0xb4 [ 192.648921] el1h_64_sync+0x7c/0x80 [ 192.648941] a6xx_gmu_set_oob+0xbc/0x1fc [ 192.648968] a6xx_hw_init+0x44/0xe38 [ 192.648991] msm_gpu_hw_init+0x48/0x80 [ 192.649013] msm_gpu_submit+0x5c/0x1a8 [ 192.649034] msm_job_run+0xb0/0x11c [ 192.649058] drm_sched_main+0x170/0x434 [ 192.649086] kthread+0x134/0x300 [ 192.649114] ret_from_fork+0x10/0x20 Fix by calling adreno_system_suspend() in the device_shutdown() path. [ Applied Rob Clark feedback on fixing adreno_unbind() similarly, also tested as above. ] Cc: Rob Clark <robdclark@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ricardo Ribalda <ribalda@chromium.org> Cc: Ross Zwisler <zwisler@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/517633/ Link: https://lore.kernel.org/r/20230109222547.1368644-1-joel@joelfernandes.org Signed-off-by: Rob Clark <robdclark@chromium.org> Stable-dep-of: 6153c44392b0 ("drm/msm/adreno: fix runtime PM imbalance at unbind") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/msm/a5xx: fix context faults during ring switchDmitry Baryshkov1-1/+1
[ Upstream commit 32e7083429d46f29080626fe387ff90c086b1fbe ] The rptr_addr is set in the preempt_init_ring(), which is called from a5xx_gpu_init(). It uses shadowptr() to set the address, however the shadow_iova is not yet initialized at that time. Move the rptr_addr setting to the a5xx_preempt_hw_init() which is called after setting the shadow_iova, getting the correct value for the address. Fixes: 8907afb476ac ("drm/msm: Allow a5xx to mark the RPTR shadow as privileged") Suggested-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522640/ Link: https://lore.kernel.org/r/20230214020956.164473-5-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/msm/a5xx: fix the emptyness check in the preempt codeDmitry Baryshkov1-1/+1
[ Upstream commit b4fb748f0b734ce1d2e7834998cc599fcbd25d67 ] Quoting Yassine: ring->memptrs->rptr is never updated and stays 0, so the comparison always evaluates to false and get_next_ring always returns ring 0 thinking it isn't empty. Fix this by calling get_rptr() instead of reading rptr directly. Reported-by: Yassine Oudjana <y.oudjana@protonmail.com> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522642/ Link: https://lore.kernel.org/r/20230214020956.164473-4-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/msm/a5xx: fix highest bank bit for a530Dmitry Baryshkov1-1/+1
[ Upstream commit 141f66ebbfa17cc7e2075f06c50107da978c965b ] A530 has highest bank bit equal to 15 (like A540). Fix values written to REG_A5XX_RB_MODE_CNTL and REG_A5XX_TPL1_MODE_CNTL registers. Fixes: 1d832ab30ce6 ("drm/msm/a5xx: Add support for Adreno 508, 509, 512 GPUs") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522639/ Link: https://lore.kernel.org/r/20230214020956.164473-3-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL registerDmitry Baryshkov1-2/+2
[ Upstream commit a7a4c19c36de1e4b99b06e4060ccc8ab837725bc ] Rather than writing CP_PREEMPT_ENABLE_GLOBAL twice, follow the vendor kernel and set CP_PREEMPT_ENABLE_LOCAL register instead. a5xx_submit() will override it during submission, but let's get the sequence correct. Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522638/ Link: https://lore.kernel.org/r/20230214020956.164473-2-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()Akhil P Oommen1-2/+2
[ Upstream commit dbeedbcb268d055d8895aceca427f897e12c2b50 ] Fix the below kernel panic due to null pointer access: [ 18.504431] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048 [ 18.513464] Mem abort info: [ 18.516346] ESR = 0x0000000096000005 [ 18.520204] EC = 0x25: DABT (current EL), IL = 32 bits [ 18.525706] SET = 0, FnV = 0 [ 18.528878] EA = 0, S1PTW = 0 [ 18.532117] FSC = 0x05: level 1 translation fault [ 18.537138] Data abort info: [ 18.540110] ISV = 0, ISS = 0x00000005 [ 18.544060] CM = 0, WnR = 0 [ 18.547109] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000112826000 [ 18.553738] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 18.562690] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP **Snip** [ 18.696758] Call trace: [ 18.699278] adreno_gpu_cleanup+0x30/0x88 [ 18.703396] a6xx_destroy+0xc0/0x130 [ 18.707066] a6xx_gpu_init+0x308/0x424 [ 18.710921] adreno_bind+0x178/0x288 [ 18.714590] component_bind_all+0xe0/0x214 [ 18.718797] msm_drm_bind+0x1d4/0x614 [ 18.722566] try_to_bring_up_aggregate_device+0x16c/0x1b8 [ 18.728105] __component_add+0xa0/0x158 [ 18.732048] component_add+0x20/0x2c [ 18.735719] adreno_probe+0x40/0xc0 [ 18.739300] platform_probe+0xb4/0xd4 [ 18.743068] really_probe+0xfc/0x284 [ 18.746738] __driver_probe_device+0xc0/0xec [ 18.751129] driver_probe_device+0x48/0x110 [ 18.755421] __device_attach_driver+0xa8/0xd0 [ 18.759900] bus_for_each_drv+0x90/0xdc [ 18.763843] __device_attach+0xfc/0x174 [ 18.767786] device_initial_probe+0x20/0x2c [ 18.772090] bus_probe_device+0x40/0xa0 [ 18.776032] deferred_probe_work_func+0x94/0xd0 [ 18.780686] process_one_work+0x190/0x3d0 [ 18.784805] worker_thread+0x280/0x3d4 [ 18.788659] kthread+0x104/0x1c0 [ 18.791981] ret_from_fork+0x10/0x20 [ 18.795654] Code: f9400408 aa0003f3 aa1f03f4 91142015 (f9402516) [ 18.801913] ---[ end trace 0000000000000000 ]--- [ 18.809039] Kernel panic - not syncing: Oops: Fatal exception Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515605/ Link: https://lore.kernel.org/r/20221221203925.v2.1.Ib978de92c4bd000b515486aad72e96c2481f84d0@changeid Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01drm/msm/a6xx: Avoid gx gbit halt during rpm suspendAkhil P Oommen3-6/+17
[ Upstream commit f4a75b5933c998e60fd812a7680e0971eb1c7cee ] As per the downstream driver, gx gbif halt is required only during recovery sequence. So lets avoid it during regular rpm suspend. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515279/ Link: https://lore.kernel.org/r/20221216223253.1.Ice9c47bfeb1fddb8dc377a3491a043a3ee7fca7d@changeid Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-01drm/msm/gpu: Fix potential double-freeRob Clark1-0/+4
[ Upstream commit a66f1efcf748febea7758c4c3c8b5bc5294949ef ] If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to set the COMM or CMDLINE param, it could trigger a race causing the previous value to be kfree'd multiple times. Fix this by serializing on the gpu lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline") Patchwork: https://patchwork.freedesktop.org/patch/517778/ Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18drm/msm/adreno: Make adreno quirks not overwrite each otherKonrad Dybcio1-6/+4
commit 13ef096e342b00e30b95a90c6c13eee1f0bec4c5 upstream. So far the adreno quirks have all been assigned with an OR operator, which is problematic, because they were assigned consecutive integer values, which makes checking them with an AND operator kind of no bueno.. Switch to using BIT(n) so that only the quirks that the programmer chose are taken into account when evaluating info->quirks & ADRENO_QUIRK_... Fixes: 370063ee427a ("drm/msm/adreno: Add A540 support") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/516456/ Link: https://lore.kernel.org/r/20230102100201.77286-1-konrad.dybcio@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31drm/msm/a6xx: Fix speed-bin detection vs probe-deferRob Clark1-7/+5
[ Upstream commit f6d1918794ef92b4e26b80c3d40365347b76b1fd ] If we get an error (other than -ENOENT) we need to propagate that up the stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up end up claiming that we support all the OPPs which is not likely to be true (and on some generations impossible to be true, ie. if there are conflicting OPPs). v2: Update commit msg, gc unused label, etc v3: Add previously missing \n's Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511690/ Link: https://lore.kernel.org/r/20221115154637.1613968-1-robdclark@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-14drm/msm/a6xx: Remove state objects from list before freeingRob Clark1-1/+3
Technically it worked as it was before, only because it was using the _safe version of the iterator. But it is sloppy practice to leave dangling pointers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507017/ Link: https://lore.kernel.org/r/20221013225520.371226-4-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Skip snapshotting unused GMU buffersRob Clark1-0/+3
Some buffers are unused on certain sub-generations of a6xx. So just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507013/ Link: https://lore.kernel.org/r/20221013225520.371226-3-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Fix kvzalloc vs state_kcalloc usageRob Clark2-2/+16
adreno_show_object() is a trap! It will re-allocate the pointer it is passed on first call, when the data is ascii85 encoded, using kvmalloc/ kvfree(). Which means the data *passed* to it must be kvmalloc'd, ie. we cannot use the state_kcalloc() helper. This partially reverts commit ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()"), but adds the missing kvfree() to fix the memory leak that was present previously. And adds a warning comment. Fixes: ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/20 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507014/ Link: https://lore.kernel.org/r/20221013225520.371226-2-robdclark@gmail.com
2022-09-30drm/msm/gpu: Fix crash during system suspend after unbindAkhil P Oommen1-1/+9
In adreno_unbind, we should clean up gpu device's drvdata to avoid accessing a stale pointer during system suspend. Also, check for NULL ptr in both system suspend/resume callbacks. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505075/ Link: https://lore.kernel.org/r/20220928124830.2.I5ee0ac073ccdeb81961e5ec0cce5f741a7207a71@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-30drm/msm/a6xx: Replace kcalloc() with kvzalloc()Akhil P Oommen1-9/+3
In order to reduce chance of allocation failure while capturing a6xx gpu state, use kvzalloc() instead of kcalloc() in state_kcalloc(). Indirectly, this patch helps to fix leaking memory allocated for gmu_debug object. Fixes: b859f9b009b (drm/msm/gpu: Snapshot GMU debug buffer) Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505074/ Link: https://lore.kernel.org/r/20220928124830.1.I8ea24a8d586b4978823b848adde000f92f74d5c2@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-01drm/msm/a6xx: Handle GMU prepare-slumber hfi failureAkhil P Oommen1-1/+5
When prepare-slumber hfi fails, we should follow a6xx_gmu_force_off() sequence. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498401/ Link: https://lore.kernel.org/r/20220819015030.v5.7.I54815c7c36b80d4725cd054e536365250454452f@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Improve gpu recovery sequenceAkhil P Oommen3-30/+58
We can do a few more things to improve our chance at a successful gpu recovery, especially during a hangcheck timeout: 1. Halt CP and GMU core 2. Do RBBM GBIF HALT sequence 3. Do a soft reset of GPU core Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498400/ Link: https://lore.kernel.org/r/20220819015030.v5.6.Idf2ba51078e87ae7ceb75cc77a5bd4ff2bd31eab@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Ensure CX collapse during gpu recoveryAkhil P Oommen1-0/+4
Because there could be transient votes from other drivers/tz/hyp which may keep the cx gdsc enabled, we should poll until cx gdsc collapses. We can use the reset framework to poll for cx gdsc collapse from gpucc clk driver. This feature requires support from the platform's gpucc driver. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Patchwork: https://patchwork.freedesktop.org/patch/498397/ Link: https://lore.kernel.org/r/20220819015030.v5.5.I176567525af2b9439a7e485d0ca130528666a55c@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: Fix cx collapse issue during recoveryAkhil P Oommen1-3/+29
There are some hardware logic under CX domain. For a successful recovery, we should ensure cx headswitch collapses to ensure all the stale states are cleard out. This is especially true to for a6xx family where we can GMU co-processor. Currently, cx doesn't collapse due to a devlink between gpu and its smmu. So the *struct gpu device* needs to be runtime suspended to ensure that the iommu driver removes its vote on cx gdsc. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498398/ Link: https://lore.kernel.org/r/20220819015030.v5.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: De-open-code some CP_EVENT_WRITERob Clark3-3/+3
Replace some open coding to improve readability. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/499272/ Link: https://lore.kernel.org/r/20220821155441.1092134-1-robdclark@gmail.com
2022-07-07drm/msm/adreno: Defer enabling runpm until hw_init()Rob Clark2-1/+6
To avoid preventing the display from coming up before the rootfs is mounted, without resorting to packing fw in the initrd, the GPU has this limbo state where the device is probed, but we aren't ready to start sending commands to it. This is particularly problematic for a6xx, since the GMU (which requires fw to be loaded) is the one that is controlling the power/clk/icc votes. So defer enabling runpm until we are ready to call gpu->hw_init(), as that is a point where we know we have all the needed fw and are ready to start sending commands to the coproc's. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489337/ Link: https://lore.kernel.org/r/20220613182036.2567963-1-robdclark@gmail.com
2022-07-07drm/msm/adreno: Do not propagate void return valuesGeert Uytterhoeven3-4/+4
With sparse ("make C=2"), lots of error: return expression in void function messages are seen. Fix this by removing the return statements to propagate void return values. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Patchwork: https://patchwork.freedesktop.org/patch/492529/ Link: https://lore.kernel.org/r/0083bc7e23753c19902580b902582ae499b44dbf.1657113388.git.geert@linux-m68k.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-07drm/msm/gpu: Add GEM debug label to devcoreRob Clark1-0/+1
When trying to understand an iova fault devcore, once you figure out which buffer we accessed beyond the end of, it is useful to see the buffer's debug label. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/491910/ Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com
2022-07-06drm/msm: Fix %d vs %uRob Clark1-5/+5
In debugging fence rollover, I noticed that GPU state capture and devcore dumps were showing me negative fence numbers. Let's fix that and some related signed vs unsigned confusion. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489621/ Link: https://lore.kernel.org/r/20220615163532.3013035-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Allow larger address space sizeRob Clark4-1/+24
The restriction to 4G was strictly to work around 64b math bug in some versions of SQE firmware. This appears to be fixed in a650+ SQE fw, so allow a larger address space size on these devices. Also, add a modparam override for debugging and igt. v2: Send the right version of the patch (ie. the one that actually compiles) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/487601/ Link: https://lore.kernel.org/r/20220529180428.2577832-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Fix up formattingKonrad Dybcio1-9/+8
Leading spaces are not something checkpatch likes, and it says so when they are present. Use tabs consistently to indent function body and unwrap a 83-char-long line, as 100 is cool nowadays. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487592/ Link: https://lore.kernel.org/r/20220528160353.157870-4-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/a6xx: Add speedbin support for A619 GPUKonrad Dybcio1-0/+19
There are various SKUs of A619, ranging from 565 MHz to 850 MHz, depending on the bin. Add support for distinguishing them, so that proper frequency ranges can be applied, depending on the HW. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487590/ Link: https://lore.kernel.org/r/20220528160353.157870-3-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/adreno: Add A619 supportKonrad Dybcio5-6/+169
Add support for the Adreno 619 GPU, as found in Snapdragon 690 (SM6350), 480 (SM4350) and 750G (SM7225). Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487588/ Link: https://lore.kernel.org/r/20220528160353.157870-2-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/adreno: Remove dead codeKonrad Dybcio1-2/+0
This BUG_ON will never be reached, and there is a comment 20 above explaining why. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487586/ Link: https://lore.kernel.org/r/20220528160353.157870-1-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm: Avoid unclocked GMU register access in 6xx gpu_busyDouglas Anderson4-24/+12
From testing on sc7180-trogdor devices, reading the GMU registers needs the GMU clocks to be enabled. Those clocks get turned on in a6xx_gmu_resume(). Confusingly enough, that function is called as a result of the runtime_pm of the GPU "struct device", not the GMU "struct device". Unfortunately the current a6xx_gpu_busy() grabs a reference to the GMU's "struct device". The fact that we were grabbing the wrong reference was easily seen to cause crashes that happen if we change the GPU's pm_runtime usage to not use autosuspend. It's also believed to cause some long tail GPU crashes even with autosuspend. We could look at changing it so that we do pm_runtime_get_if_in_use() on the GPU's "struct device", but then we run into a different problem. pm_runtime_get_if_in_use() will return 0 for the GPU's "struct device" the whole time when we're in the "autosuspend delay". That is, when we drop the last reference to the GPU but we're waiting a period before actually suspending then we'll think the GPU is off. One reason that's bad is that if the GPU didn't actually turn off then the cycle counter doesn't lose state and that throws off all of our calculations. Let's change the code to keep track of the suspend state of devfreq. msm_devfreq_suspend() is always called before we actually suspend the GPU and msm_devfreq_resume() after we resume it. This means we can use the suspended state to know if we're powered or not. NOTE: one might wonder when exactly our status function is called when devfreq is supposed to be disabled. The stack crawl I captured was: msm_devfreq_get_dev_status devfreq_simple_ondemand_func devfreq_update_target qos_notifier_call qos_max_notifier_call blocking_notifier_call_chain pm_qos_update_target freq_qos_apply apply_constraint __dev_pm_qos_update_request dev_pm_qos_update_request msm_devfreq_idle_work Fixes: eadf79286a4b ("drm/msm: Check for powered down HW in the devfreq callbacks") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/489124/ Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-29Merge tag 'drm-msm-fixes-2022-06-28' into msm-next-stagingRob Clark1-4/+10
Merge v5.19 fixes to avoid merge conflicts Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18drm/msm: Don't overwrite hw fence in hw_initRob Clark1-3/+8
Prior to the last commit, this could result in setting the GPU written fence value back to an older value, if we had missed updating completed_fence prior to suspend. This was mostly harmless as the GPU would eventually overwrite it again with the correct value. But we should just not do this. Instead just leave a sanity check that the fence looks plausible (in case the GPU scribbled on memory). Reported-by: Steev Klimaszewski <steev@kali.org> Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin") Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Patchwork: https://patchwork.freedesktop.org/patch/490138/ Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com
2022-06-07drm/msm: Fix double pm_runtime_disable() callMaximilian Luz1-1/+2
Following commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to adreno_unbind() will disable runtime PM twice, as indicated by the call trees below: adreno_unbind() -> pm_runtime_force_suspend() -> pm_runtime_disable() adreno_unbind() -> gpu->funcs->destroy() [= aNxx_destroy()] -> adreno_gpu_cleanup() -> pm_runtime_disable() Note that pm_runtime_force_suspend() is called right before gpu->funcs->destroy() and both functions are called unconditionally. With recent addition of the eDP AUX bus code, this problem manifests itself when the eDP panel cannot be found yet and probing is deferred. On the first probe attempt, we disable runtime PM twice as described above. This then causes any later probe attempt to fail with [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 preventing the driver from loading. As there seem to be scenarios where the aNxx_destroy() functions are not called from adreno_unbind(), simply removing pm_runtime_disable() from inside adreno_unbind() does not seem to be the proper fix. This is what commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check whether runtime PM is still enabled, and only disable it in that case. Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-20Merge tag 'msm-next-5.19-fixes' of ↵Dave Airlie1-0/+1
https://gitlab.freedesktop.org/abhinavk/msm into drm-next 5.19 fixes for msm-next - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Abhinav Kumar <quic_abhinavk@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com
2022-05-18drm/msm/a6xx: Fix refcount leak in a6xx_gpu_initMiaoqian Lin1-0/+1
of_parse_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. a6xx_gmu_init() passes the node to of_find_device_by_node() and of_dma_configure(), of_find_device_by_node() will takes its reference, of_dma_configure() doesn't need the node after usage. Add missing of_node_put() to avoid refcount leak. Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-11Merge tag 'drm-msm-next-2022-05-09' of ↵Dave Airlie5-32/+80
https://gitlab.freedesktop.org/drm/msm into drm-next - Fourcc modifier for tiled but not compressed layouts - Support for userspace allocated IOVA (GPU virtual address) - Devfreq clamp_to_idle fix - DPU: DSC (Display Stream Compression) support - DPU: inline rotation support on SC7280 - DPU: update DP timings to follow vendor recommendations - DP, DPU: add support for wide bus (on newer chipsets) - DP: eDP support - Merge DPU1 and MDP5 MDSS driver, make dpu/mdp device the master component - MDSS: optionally reset the IP block at the bootup to drop bootloader state - Properly register and unregister internal bridges in the DRM framework - Complete DPU IRQ cleanup - DP: conversion to use drm_bridge and drm_bridge_connector - eDP: drop old eDP parts again - DPU: writeback support - Misc small fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvJCr_1D8d0dgmyQC5HD4gmXeZw=bFV_CNCfceZbpMxRw@mail.gmail.com
2022-05-02drm/msm: Fix null pointer dereferences without iommuLuca Weiss1-1/+4
Check if 'aspace' is set before using it as it will stay null without IOMMU, such as on msm8974. Fixes: bc2112583a0b ("drm/msm/gpu: Track global faults per address-space") Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm: simplify gpu_busy callbackChia-I Wu2-22/+12
Move tracking and busy time calculation to msm_devfreq_get_dev_status. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220416003314.59211-2-olvaffe@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm: Add a way for userspace to allocate GPU iovaRob Clark1-0/+10
The motivation at this point is mainly native userspace mesa driver in a VM guest. The one remaining synchronous "hotpath" is buffer allocation, because guest needs to wait to know the bo's iova before it can start emitting cmdstream/state that references the new bo. By allocating the iova in the guest userspace, we no longer need to wait for a response from the host, but can just rely on the allocation request being processed before the cmdstream submission. Allocation failures (OoM, etc) would just be treated as context-lost (ie. GL_GUILTY_CONTEXT_RESET) or subsequent allocations (or readpix, etc) can raise GL_OUT_OF_MEMORY. v2: Fix inuse check v3: Change mismatched iova case to -EBUSY Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://lore.kernel.org/r/20220411215849.297838-11-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm/gem: Drop PAGE_SHIFT for address space mmRob Clark1-1/+1
Get rid of all the unnecessary conversion between address/size and page offsets. It just confuses things. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220411215849.297838-6-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm/gpu: Drop duplicate fence counterRob Clark3-4/+4
The ring seqno counter duplicates the fence-context last_fence counter. They end up getting incremented in lock-step, on the same scheduler thread, but the split just makes things less obvious. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm: Add a way to override processes comm/cmdlineRob Clark1-3/+40
In the cause of using the GPU via virtgpu, the host side process is really a sort of proxy, and not terribly interesting from the PoV of crash/fault logging. Add a way to override these per process so that we can see the guest process's name. v2: Handle kmalloc failure, add comment to explain kstrdup returns NULL if passed NULL [Dan Carpenter] Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-22drm/msm: Add support for pointer paramsRob Clark2-4/+12
The 64b value field is already suffient to hold a pointer instead of immediate, but we also need a length field. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-12drm/msm/gpu: Avoid -Wunused-function with !CONFIG_PM_SLEEPNathan Chancellor1-5/+2
When building with CONFIG_PM=y and CONFIG_PM_SLEEP=n (such as ARCH=riscv allmodconfig), the following warnings/errors occur: drivers/gpu/drm/msm/adreno/adreno_device.c:679:12: error: 'adreno_system_resume' defined but not used [-Werror=unused-function] 679 | static int adreno_system_resume(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/msm/adreno/adreno_device.c:655:12: error: 'adreno_system_suspend' defined but not used [-Werror=unused-function] 655 | static int adreno_system_suspend(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors These functions are only used in SET_SYSTEM_SLEEP_PM_OPS(), which evaluates to empty when CONFIG_PM_SLEEP is not set, making these functions unused. To resolve this, use the SYSTEM_SLEEP_PM_OPS() and RUNTIME_PM_OPS() macros, which were introduced in commit 1a3c7bb08826 ("PM: core: Add new *_PM_OPS macros, deprecate old ones"). They are designed to avoid these compiler warnings while still guarding their use on CONFIG_PM{,_SLEEP}=y. Fixes: 7e4167c9e021 ("drm/msm/gpu: Park scheduler threads for system suspend") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220411181249.2758344-1-nathan@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-11drm/msm: Fix range size vs end confusionRob Clark1-1/+1
The fourth param is size, rather than range_end. Note that we could increase the address space size if we had a way to prevent buffers from spanning a 4G split, mostly just to avoid fw bugs with 64b math. Fixes: 84c31ee16f90 ("drm/msm/a6xx: Add support for per-instance pagetables") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220407202836.1211268-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-03-24drm/msm/gpu: Remove mutex from wait_event conditionRob Clark1-10/+1
The mutex wasn't really protecting anything before. Before the previous patch we could still be racing with the scheduler's kthread, as that is not necessarily frozen yet. Now that we've parked the sched threads, the only race is with jobs retiring, and that is harmless, ie. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-4-robdclark@gmail.com
2022-03-24drm/msm/gpu: Park scheduler threads for system suspendRob Clark1-4/+64
In the system suspend path, we don't want to be racing with the scheduler kthreads pushing additional queued up jobs to the hw queue (ringbuffer). So park them first. While we are at it, move the wait for active jobs to complete into the new system- suspend path. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-3-robdclark@gmail.com
2022-03-24drm/msm/gpu: Rename runtime suspend/resume functionsRob Clark1-3/+3
Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-2-robdclark@gmail.com
2022-03-08drm/msm/adreno: fix cast in adreno_get_param()Dan Carpenter1-4/+4
These casts need to happen before the shift. The only time it would matter would be if "rev.core" is >= 128. In that case the sign bit would be extended and we do not want that. Fixes: afab9d91d872 ("drm/msm/adreno: Expose speedbin to userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220307133105.GA17534@kili Signed-off-by: Rob Clark <robdclark@chromium.org>