summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2023-05-17drm/amd: Use `amdgpu_ucode_*` helpers for MESMario Limonciello3-26/+4
commit 11e0b0067ec0707e8e598a5f9a547ab618ae7982 upstream. The `amdgpu_ucode_request` helper will ensure that the return code for missing firmware is -ENODEV so that early_init can fail. The `amdgpu_ucode_release` helper provides symmetry for releasing firmware. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amd: Add a new helper for loading/validating microcodeMario Limonciello2-0/+39
commit 2210af50ae7f4104269dfde7bafbbfbacdbe1a2b upstream. All microcode runs a basic validation after it's been loaded. Each IP block as part of init will run both. Introduce a wrapper for request_firmware and amdgpu_ucode_validate. This wrapper will also remap any error codes from request_firmware to -ENODEV. This is so that early_init will fail if firmware couldn't be loaded instead of the IP block being disabled. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amd: Load MES microcode during early_initMario Limonciello4-151/+100
commit cc42e76e7de5190a7da5dac9d7b2bbb458e050bf upstream. Add an early_init phase to MES for fetching and validating microcode from the filesystem. If MES microcode is required but not available during early init, the firmware framebuffer will have already been released and the screen will freeze. Move the request for MES microcode into the early_init phase so that if it's not available, early_init will fail. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: disable sdma ecc irq only when sdma RAS is enabled in suspendGuchun Chen1-3/+5
commit 8b229ada2669b74fdae06c83fbfda5a5a99fc253 upstream. sdma_v4_0_ip is shared on a few asics, but in sdma_v4_0_hw_fini, driver unconditionally disables ecc_irq which is only enabled on those asics enabling sdma ecc. This will introduce a warning in suspend cycle on those chips with sdma ip v4.0, while without sdma ecc. So this patch correct this. [ 7283.166354] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.167001] RSP: 0018:ffff9a5fc3967d08 EFLAGS: 00010246 [ 7283.167019] RAX: ffff98d88afd3770 RBX: 0000000000000001 RCX: 0000000000000000 [ 7283.167023] RDX: 0000000000000000 RSI: ffff98d89da30390 RDI: ffff98d89da20000 [ 7283.167025] RBP: ffff98d89da20000 R08: 0000000000036838 R09: 0000000000000006 [ 7283.167028] R10: ffffd5764243c008 R11: 0000000000000000 R12: ffff98d89da30390 [ 7283.167030] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.167032] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.167036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.167039] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.167041] Call Trace: [ 7283.167046] <TASK> [ 7283.167048] sdma_v4_0_hw_fini+0x38/0xa0 [amdgpu] [ 7283.167704] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.168296] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.168875] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.169464] pci_pm_freeze+0x54/0xc0 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)Lin.Cao1-1/+5
commit 6c032c37ac3ef3b7df30937c785ecc4da428edc0 upstream. v1: Vmbo->shadow is used to back vram bo up when vram lost. So that we should set shadow as vmbo->shadow to recover vmbo->bo v2: Modify if(vmbo->shadow) shadow = vmbo->shadow as if(!vmbo->shadow) continue; Fixes: e18aaea733da ("drm/amdgpu: move shadow_list to amdgpu_bo_vm") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lin.Cao <lincao12@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: change gfx 11.0.4 external_id rangeYifan Zhang1-1/+1
commit 996e93a3fe74dcf9d467ae3020aea42cc3ff65e3 upstream. gfx 11.0.4 range starts from 0x80. Fixes: 311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC 11.0.4") Cc: stable@vger.kernel.org Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reported-by: Yogesh Mohan Marimuthu <Yogesh.Mohanmarimuthu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu/jpeg: Remove harvest checking for JPEG3Saleemkhan Jamadar1-0/+1
commit 5b94db73e45e2e6c2840f39c022fd71dfa47fc58 upstream. Register CC_UVD_HARVESTING is obsolete for JPEG 3.1.2 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu/gfx: disable gfx9 cp_ecc_error_irq only when enabling legacy gfx rasGuchun Chen1-1/+2
commit 4a76680311330aefe5074bed8f06afa354b85c48 upstream. gfx9 cp_ecc_error_irq is only enabled when legacy gfx ras is assert. So in gfx_v9_0_hw_fini, interrupt disablement for cp_ecc_error_irq should be executed under such condition, otherwise, an amdgpu_irq_put calltrace will occur. [ 7283.170322] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.170964] RSP: 0018:ffff9a5fc3967d00 EFLAGS: 00010246 [ 7283.170967] RAX: ffff98d88afd3040 RBX: ffff98d89da20000 RCX: 0000000000000000 [ 7283.170969] RDX: 0000000000000000 RSI: ffff98d89da2bef8 RDI: ffff98d89da20000 [ 7283.170971] RBP: ffff98d89da20000 R08: ffff98d89da2ca18 R09: 0000000000000006 [ 7283.170973] R10: ffffd5764243c008 R11: 0000000000000000 R12: 0000000000001050 [ 7283.170975] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.170978] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.170981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.170983] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.170986] Call Trace: [ 7283.170988] <TASK> [ 7283.170989] gfx_v9_0_hw_fini+0x1c/0x6d0 [amdgpu] [ 7283.171655] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.172245] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.172823] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.173412] pci_pm_freeze+0x54/0xc0 [ 7283.173419] ? __pfx_pci_pm_freeze+0x10/0x10 [ 7283.173425] dpm_run_callback+0x98/0x200 [ 7283.173430] __device_suspend+0x164/0x5f0 v2: drop gfx11 as it's fixed in a different solution by retiring cp_ecc_irq funcs(Hawking) Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_finiHoratio Zhang1-1/+0
commit 13af556104fa93b1945c70bbf8a0a62cd2c92879 upstream. The gmc.ecc_irq is enabled by firmware per IFWI setting, and the host driver is not privileged to enable/disable the interrupt. So, it is meaningless to use the amdgpu_irq_put function in gmc_v11_0_hw_fini, which also leads to the call trace. [ 102.980303] Call Trace: [ 102.980303] <TASK> [ 102.980304] gmc_v11_0_hw_fini+0x54/0x90 [amdgpu] [ 102.980357] gmc_v11_0_suspend+0xe/0x20 [amdgpu] [ 102.980409] amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu] [ 102.980459] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu] [ 102.980520] amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu] [ 102.980573] amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu] [ 102.980687] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu] [ 102.980740] process_one_work+0x21f/0x3f0 [ 102.980741] worker_thread+0x200/0x3e0 [ 102.980742] ? process_one_work+0x3f0/0x3f0 [ 102.980743] kthread+0xfd/0x130 [ 102.980743] ? kthread_complete_and_exit+0x20/0x20 [ 102.980744] ret_from_fork+0x22/0x30 Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Fixes: c8b5a95b5709 ("drm/amdgpu: Fix desktop freezed after gpu-reset") Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini()Hamza Mahfooz1-1/+0
commit 922a76ba31adf84e72bc947267385be420c689ee upstream. As made mention of in commit 08c677cb0b43 ("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v10_0_hw_fini") and commit 13af556104fa ("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini"). It is meaningless to call amdgpu_irq_put() for gmc.ecc_irq. So, remove it from gmc_v9_0_hw_fini(). Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Fixes: 3029c855d79f ("drm/amdgpu: Fix desktop freezed after gpu-reset") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v10_0_hw_finiHoratio Zhang1-1/+0
commit 08c677cb0b436a96a836792bb35a8ec5de4999c2 upstream. The gmc.ecc_irq is enabled by firmware per IFWI setting, and the host driver is not privileged to enable/disable the interrupt. So, it is meaningless to use the amdgpu_irq_put function in gmc_v10_0_hw_fini, which also leads to the call trace. [ 82.340264] Call Trace: [ 82.340265] <TASK> [ 82.340269] gmc_v10_0_hw_fini+0x83/0xa0 [amdgpu] [ 82.340447] gmc_v10_0_suspend+0xe/0x20 [amdgpu] [ 82.340623] amdgpu_device_ip_suspend_phase2+0x127/0x1c0 [amdgpu] [ 82.340789] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu] [ 82.340955] amdgpu_device_pre_asic_reset+0xdd/0x2b0 [amdgpu] [ 82.341122] amdgpu_device_gpu_recover.cold+0x4dd/0xbb2 [amdgpu] [ 82.341359] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu] [ 82.341529] process_one_work+0x21d/0x3f0 [ 82.341535] worker_thread+0x1fa/0x3c0 [ 82.341538] ? process_one_work+0x3f0/0x3f0 [ 82.341540] kthread+0xff/0x130 [ 82.341544] ? kthread_complete_and_exit+0x20/0x20 [ 82.341547] ret_from_fork+0x22/0x30 Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Fixes: c8b5a95b5709 ("drm/amdgpu: Fix desktop freezed after gpu-reset") Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: drop redundant sched job cleanup when cs is abortedGuchun Chen1-10/+3
commit 1253685f0d3eb3eab0bfc4bf15ab341a5f3da0c8 upstream. Once command submission failed due to userptr invalidation in amdgpu_cs_submit, legacy code will perform cleanup of scheduler job. However, it's not needed at all, as former commit has integrated job cleanup stuff into amdgpu_job_free. Otherwise, because of double free, a NULL pointer dereference will occur in such scenario. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2457 Fixes: f7d66fb2ea43 ("drm/amdgpu: cleanup scheduler job initialization v2") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17drm/amdgpu: add a missing lock for AMDGPU_SCHEDChia-I Wu1-1/+5
[ Upstream commit 2397e3d8d2e120355201a8310b61929f5a8bd2c0 ] mgr->ctx_handles should be protected by mgr->lock. v2: improve commit message v3: add a Fixes tag Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Fixes: 52c6a62c64fa ("drm/amdgpu: add interface for editing a foreign process's priority v3") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11drm/amdgpu: register a vga_switcheroo client for MacBooks with apple-gmuxOrlando Chamberlain1-5/+16
[ Upstream commit d37a3929ca0363ed1dce02b2772cd5bc547ca66d ] Commit 3840c5bcc245 ("drm/amdgpu: disentangle runtime pm and vga_switcheroo") made amdgpu only register a vga_switcheroo client for GPU's with PX, however AMD GPUs in dual gpu Apple Macbooks do need to register, but don't have PX. Instead of AMD's PX, they use apple-gmux. Use apple_gmux_detect() to identify these gpus, and pci_is_thunderbolt_attached() to ensure eGPUs connected to Dual GPU Macbooks don't register with vga_switcheroo. Fixes: 3840c5bcc245 ("drm/amdgpu: disentangle runtime pm and vga_switcheroo") Link: https://lore.kernel.org/amd-gfx/20230210044826.9834-10-orlandoch.dev@gmail.com/ Signed-off-by: Orlando Chamberlain <orlandoch.dev@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-26drm/amdgpu: Fix desktop freezed after gpu-resetAlan Liu1-0/+3
commit c8b5a95b570949536a2b75cd8fc4f1de0bc60629 upstream. [Why] After gpu-reset, sometimes the driver fails to enable vblank irq, causing flip_done timed out and the desktop freezed. During gpu-reset, we disable and enable vblank irq in dm_suspend() and dm_resume(). Later on in amdgpu_irq_gpu_reset_resume_helper(), we check irqs' refcount and decide to enable or disable the irqs again. However, we have 2 sets of API for controling vblank irq, one is dm_vblank_get/put() and another is amdgpu_irq_get/put(). Each API has its own refcount and flag to store the state of vblank irq, and they are not synchronized. In drm we use the first API to control vblank irq but in amdgpu_irq_gpu_reset_resume_helper() we use the second set of API. The failure happens when vblank irq was enabled by dm_vblank_get() before gpu-reset, we have vblank->enabled true. However, during gpu-reset, in amdgpu_irq_gpu_reset_resume_helper() vblank irq's state checked from amdgpu_irq_update() is DISABLED. So finally it disables vblank irq again. After gpu-reset, if there is a cursor plane commit, the driver will try to enable vblank irq by calling drm_vblank_enable(), but the vblank->enabled is still true, so it fails to turn on vblank irq and causes flip_done can't be completed in vblank irq handler and desktop become freezed. [How] Combining the 2 vblank control APIs by letting drm's API finally calls amdgpu_irq's API, so the irq's refcount and state of both APIs can be synchronized. Also add a check to prevent refcount from being less then 0 in amdgpu_irq_put(). v2: - Add warning in amdgpu_irq_enable() if the irq is already disabled. - Call dc_interrupt_set() in dm_set_vblank() to avoid refcount change if it is in gpu-reset. v3: - Improve commit message and code comments. Signed-off-by: Alan Liu <HaoPing.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-20drm/amdgpu/gfx: set cg flags to enter/exit safe modeJane Jian1-0/+5
[ Upstream commit e06bfcc1a1c41bcb8c31470d437e147ce9f0acfd ] sriov needs to enter/exit safe mode in update umd p state add the cg flag to let it enter or exit while needed Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobsYuBiao Wang1-0/+9
[ Upstream commit 033c56474acf567a450f8bafca50e0b610f2b716 ] [Why] For engines not supporting soft reset, i.e. VCN, there will be a failed ib test before mode 1 reset during asic reset. The fences in this case are never signaled and next time when we try to free the sa_bo, kernel will hang. [How] During pre_asic_reset, driver will clear job fences and afterwards the fences' refcount will be reduced to 1. For drm_sched_jobs it will be released in job_free_cb, and for non-sched jobs like ib_test, it's meant to be released in sa_bo_free but only when the fences are signaled. So we have to force signal the non_sched bad job's fence during pre_asic_reset or the clear is not complete. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-20drm/amdgpu: add mes resume when do gfx post soft resetTong Liu011-0/+9
[ Upstream commit 4eb0b49a0ad3e004a6a65b84efe37bc7e66d560f ] [why] when gfx do soft reset, mes will also do reset, if mes is not resumed when do recover from soft reset, mes is unable to respond in later sequence [how] resume mes when do gfx post soft reset Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-13drm/amdgpu: skip psp suspend for IMU enabled ASICs mode2 resetTim Huang1-0/+12
commit e11c775030c5585370fda43035204bb5fa23b139 upstream. The psp suspend & resume should be skipped to avoid destroy the TMR and reload FWs again for IMU enabled APU ASICs. Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13drm/amdgpu: for S0ix, skip SDMA 5.x+ suspend/resumeAlex Deucher1-0/+6
commit 2a7798ea7390fd78f191c9e9bf68f5581d3b4a02 upstream. SDMA 5.x is part of the GFX block so it's controlled via GFXOFF. Skip suspend as it should be handled the same as GFX. v2: drop SDMA 4.x. That requires special handling. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: "Limonciello, Mario" <Mario.Limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-06drm/amdgpu: allow more APUs to do mode2 reset when go to S4Tim Huang1-1/+6
commit 2fec9dc8e0acc3dfb56d1389151bcf405f087b10 upstream. Skip mode2 reset only for IMU enabled APUs when do S4. This patch is to fix the regression issue https://gitlab.freedesktop.org/drm/amd/-/issues/2483 It is generated by commit b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4"). Fixes: b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2483 Tested-by: Yuan Perry <Perry.Yuan@amd.com> Signed-off-by: Tim Huang <tim.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-06drm/amdgpu/vcn: custom video info caps for sriovJane Jian3-11/+99
[ Upstream commit d71e38df3b730a17ab6b25cabb2ccfe8a7f04385 ] for sriov, we added a new flag to indicate av1 support, this will override the original caps info. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30drm/amdgpu: reposition the gpu reset checking for reuseTim Huang2-20/+25
commit aaee0ce460b954e08b6e630d7e54b2abb672feb8 upstream. Move the amdgpu_acpi_should_gpu_reset out of CONFIG_SUSPEND to share it with hibernate case. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30drm/amdgpu: skip ASIC reset for APUs when go to S4Tim Huang1-1/+4
commit b589626674de94d977e81c99bf7905872b991197 upstream. For GC IP v11.0.4/11, PSP TMR need to be reserved for ASIC mode2 reset. But for S4, when psp suspend, it will destroy the TMR that fails the ASIC reset. [ 96.006101] amdgpu 0000:62:00.0: amdgpu: MODE2 reset [ 100.409717] amdgpu 0000:62:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x00000011 SMN_C2PMSG_82:0x00000002 [ 100.411593] amdgpu 0000:62:00.0: amdgpu: Mode2 reset failed! [ 100.412470] amdgpu 0000:62:00.0: PM: pci_pm_freeze(): amdgpu_pmops_freeze+0x0/0x50 [amdgpu] returns -62 [ 100.414020] amdgpu 0000:62:00.0: PM: dpm_run_callback(): pci_pm_freeze+0x0/0xd0 returns -62 [ 100.415311] amdgpu 0000:62:00.0: PM: pci_pm_freeze+0x0/0xd0 returned -62 after 4623202 usecs [ 100.416608] amdgpu 0000:62:00.0: PM: failed to freeze async: error -62 We can skip the reset on APUs, assuming we can resume them properly. Verified on some GFX11, GFX10 and old GFX9 APUs. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD NaviKai-Heng Feng4-17/+18
commit 2b072442f4962231a8516485012bb2d2551ef2fe upstream. S2idle resume freeze can be observed on Intel ADL + AMD WX5500. This is caused by commit 0064b0ce85bb ("drm/amd/pm: enable ASPM by default"). The root cause is still not clear for now. So extend and apply the ASPM quirk from commit e02fe3bc7aba ("drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems"), to workaround the issue on Navi cards too. Fixes: 0064b0ce85bb ("drm/amd/pm: enable ASPM by default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2458 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-30drm/amd: Fix initialization mistake for NBIO 7.3.0Mario Limonciello1-5/+9
[ Upstream commit 1717cc5f2962a4652c76ed3858b499ccae6c277c ] The same strapping initialization issue that happened on NBIO 7.5.1 appears to be happening on NBIO 7.3.0. Apply the same fix to 7.3.0 as well. Note: This workaround relies upon the integrated GPU being enabled in BIOS. If the integrated GPU is disabled in BIOS a different workaround will be required. Reported-by: Thomas Glanzmann <thomas@glanzmann.de> Cc: Basavaraj Natikar <Basavaraj.Natikar@amd.com> Link: https://lore.kernel.org/linux-usb/Y%2Fz9GdHjPyF2rNG3@glanzmann.de/T/#u Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30drm/amdgpu: Fix call trace warning and hang when removing amdgpu devicelyndonli1-1/+1
[ Upstream commit 93bb18d2a873d2fa9625c8ea927723660a868b95 ] On GPUs with RAS enabled, below call trace and hang are observed when shutting down device. v2: use DRM device unplugged flag instead of shutdown flag as the check to prevent memory wipe in shutdown stage. [ +0.000000] RIP: 0010:amdgpu_vram_mgr_fini+0x18d/0x1c0 [amdgpu] [ +0.000001] PKRU: 55555554 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000002] amdgpu_ttm_fini+0x140/0x1c0 [amdgpu] [ +0.000183] amdgpu_bo_fini+0x27/0xa0 [amdgpu] [ +0.000184] gmc_v11_0_sw_fini+0x2b/0x40 [amdgpu] [ +0.000163] amdgpu_device_fini_sw+0xb6/0x510 [amdgpu] [ +0.000152] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] [ +0.000090] drm_dev_release+0x28/0x50 [drm] [ +0.000016] devm_drm_dev_init_release+0x38/0x60 [drm] [ +0.000011] devm_action_release+0x15/0x20 [ +0.000003] release_nodes+0x40/0xc0 [ +0.000001] devres_release_all+0x9e/0xe0 [ +0.000001] device_unbind_cleanup+0x12/0x80 [ +0.000003] device_release_driver_internal+0xff/0x160 [ +0.000001] driver_detach+0x4a/0x90 [ +0.000001] bus_remove_driver+0x6c/0xf0 [ +0.000001] driver_unregister+0x31/0x50 [ +0.000001] pci_unregister_driver+0x40/0x90 [ +0.000003] amdgpu_exit+0x15/0x120 [amdgpu] Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-22drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSesGuilherme G. Piccoli1-0/+19
commit 542a56e8eb4467ae654eefab31ff194569db39cd upstream. The VCN firmware loading path enables the indirect SRAM mode if it's advertised as supported. We might have some cases of FW issues that prevents this mode to working properly though, ending-up in a failed probe. An example below, observed in the Steam Deck: [...] [drm] failed to load ucode VCN0_RAM(0x3A) [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000) amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110) [drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110 amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init [...] Disabling the VCN block circumvents this, but it's a very invasive workaround that turns off the entire feature. So, let's add a quirk on VCN loading that checks for known problematic BIOSes on Vangogh, so we can proactively disable the indirect SRAM mode and allow the HW proper probe and VCN IP block to work fine. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2385 Fixes: 82132ecc5432 ("drm/amdgpu: enable Vangogh VCN indirect sram mode") Cc: stable@vger.kernel.org Cc: James Zhu <James.Zhu@amd.com> Cc: Leo Liu <leo.liu@amd.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-22drm/amdgpu: fix ttm_bo calltrace warning in psp_hw_finiHoratio Zhang1-3/+3
[ Upstream commit 23f4a2d29ba57bf88095f817de5809d427fcbe7e ] The call trace occurs when the amdgpu is removed after the mode1 reset. During mode1 reset, from suspend to resume, there is no need to reinitialize the ta firmware buffer which caused the bo pin_count increase redundantly. [ 489.885525] Call Trace: [ 489.885525] <TASK> [ 489.885526] amdttm_bo_put+0x34/0x50 [amdttm] [ 489.885529] amdgpu_bo_free_kernel+0xe8/0x130 [amdgpu] [ 489.885620] psp_free_shared_bufs+0xb7/0x150 [amdgpu] [ 489.885720] psp_hw_fini+0xce/0x170 [amdgpu] [ 489.885815] amdgpu_device_fini_hw+0x2ff/0x413 [amdgpu] [ 489.885960] ? blocking_notifier_chain_unregister+0x56/0xb0 [ 489.885962] amdgpu_driver_unload_kms+0x51/0x60 [amdgpu] [ 489.886049] amdgpu_pci_remove+0x5a/0x140 [amdgpu] [ 489.886132] ? __pm_runtime_resume+0x60/0x90 [ 489.886134] pci_device_remove+0x3e/0xb0 [ 489.886135] __device_release_driver+0x1ab/0x2a0 [ 489.886137] driver_detach+0xf3/0x140 [ 489.886138] bus_remove_driver+0x6c/0xf0 [ 489.886140] driver_unregister+0x31/0x60 [ 489.886141] pci_unregister_driver+0x40/0x90 [ 489.886142] amdgpu_exit+0x15/0x451 [amdgpu] Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Signed-off-by: longlyao <Longlong.Yao@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4Veerabadhran Gopalakrishnan1-0/+1
[ Upstream commit 6ce2ea07c5ff0a8188eab0e5cd1f0e4899b36835 ] Added the video capability query support for VCN version 4_0_4 Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu/soc21: don't expose AV1 if VCN0 is harvestedAlex Deucher1-13/+48
[ Upstream commit a6de636eb04f146d23644dbbb7173e142452a9b7 ] Only VCN0 supports AV1. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 6ce2ea07c5ff ("drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nvAlex Deucher1-3/+4
commit b42fee5e0b44344cfe4c38e61341ee250362c83f upstream. Properly skip non-existent registers as well. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21Alex Deucher1-3/+4
commit 2915e43a033a778816fa4bc621f033576796521e upstream. Properly skip non-existent registers as well. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15Alex Deucher1-2/+3
commit 0dcdf8498eae2727bb33cef3576991dc841d4343 upstream. Properly skip non-existent registers as well. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2442 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10drm/amd: Fix initialization for nbio 7.5.1Mario Limonciello1-0/+5
commit 65a24000808f70ac69bd2a96381fa0c7341f20c0 upstream. A mistake has been made in the BIOS for some ASICs with NBIO 7.5.1 where some NBIO registers aren't properly setup. Ensure that they're set during initialization. Tested-by: Richard Gong <richard.gong@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-10Revert "drm/amdgpu: TA unload messages are not actually sent to psp when ↵Vitaly Prosyak2-3/+4
amdgpu is uninstalled" [ Upstream commit 39934d3ed5725c5e3570ed1b67f612f1ea60ce03 ] This reverts commit fac53471d0ea9693d314aa2df08d62b2e7e3a0f8. The following change: move the drm_dev_unplug call after amdgpu_driver_unload_kms in amdgpu_pci_remove. The reason is the following: amdgpu_pci_remove calls drm_dev_unregister and it should be called first to ensure userspace can't access the device instance anymore. If we call drm_dev_unplug after amdgpu_driver_unload_kms then we observe IGT PCI software unplug test failure (kernel hung) for all ASICs. This is how this regression was found. After this revert, the following commands do work not, but it would be fixed in the next commit: - sudo modprobe -r amdgpu - sudo modprobe amdgpu Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10drm/amdkfd: Page aligned memory reserve sizePhilip Yang2-6/+8
[ Upstream commit 0c2dece8fb541ab07b68c3312a1065fa9c927a81 ] Use page aligned size to reserve memory usage because page aligned TTM BO size is used to unreserve memory usage, otherwise no page aligned size causes memory usage accounting unbalanced. Change vram_used definition type to int64_t to be able to trigger WARN_ONCE(adev && adev->kfd.vram_used < 0, "..."), to help debug the accounting issue with warning and backtrace. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10drm/amd: Avoid BUG() for case of SRIOV missing IP versionMario Limonciello1-1/+1
[ Upstream commit 93fec4f8c158584065134b4d45e875499bf517c8 ] No need to crash the kernel. AMDGPU will now fail to probe. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-10drm/amdgpu: Use the sched from entity for amdgpu_cs traceLeo Liu1-2/+2
[ Upstream commit cf22ef78f22ce4df4757472c5dbd33c430c5b659 ] The problem is that base sched hasn't been assigned yet at this moment, causing something like "ring=0" all the time from trace. mpv:cs0-3473 [002] ..... 129.047431: amdgpu_cs: ring=0, dw=48, fences=0 mpv:cs0-3473 [002] ..... 129.089125: amdgpu_cs: ring=0, dw=48, fences=0 mpv:cs0-3473 [002] ..... 129.130987: amdgpu_cs: ring=0, dw=48, fences=0 mpv:cs0-3473 [002] ..... 129.172478: amdgpu_cs: ring=0, dw=48, fences=0 Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6") Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-16drm/amd/amdgpu: fix warning during suspendJack Xiao2-1/+4
Freeing memory was warned during suspend. Move the self test out of suspend. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2151825 Cc: jfalempe@redhat.com Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com> Tested-by: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-02-10Merge tag 'amd-drm-fixes-6.2-2023-02-09' of ↵Dave Airlie2-0/+12
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.2-2023-02-09: amdgpu: - Add a parameter to disable S/G display - Re-enable S/G display on all DCNs Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230209174504.7577-1-alexander.deucher@amd.com
2023-02-10Merge tag 'drm-misc-fixes-2023-02-09' of ↵Dave Airlie1-1/+4
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A fix for a circular refcounting in drm/client, one for a memory leak in amdgpu and a virtio fence fix when interrupted Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20230209083600.7hi6roht6xxgldgz@houat
2023-02-09drm/amdgpu: add S/G display parameterAlex Deucher2-0/+12
Some users have reported flickerng with S/G display. We've tried extensively to reproduce and debug the issue on a wide variety of platform configurations (DRAM bandwidth, etc.) and a variety of monitors, but so far have not been able to. We disabled S/G display on a number of platforms to address this but that leads to failure to pin framebuffers errors and blank displays when there is memory pressure or no displays at all on systems with limited carveout (e.g., Chromebooks). Add a option to disable this as a debugging option as a way for users to disable this, depending on their use case, and for us to help debug this further. v2: fix typo Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-09amd/amdgpu: remove test ib on hw ringJesseZhang2-2/+1
test ib function is not necessary on hw ring, so remove it. v2: squash in NULL check fix Signed-off-by: JesseZhang <Jesse.Zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-09drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/finiGuilherme G. Piccoli1-1/+7
Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after the respective init function - drm_sched_init() - was executed successfully. Happens that we faced a driver probe failure in the Steam Deck recently, and the function drm_sched_fini() was called even without its counter-part had been previously called, causing the following oops: amdgpu: probe of 0000:04:00.0 failed with error -110 BUG: kernel NULL pointer dereference, address: 0000000000000090 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022 RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace: <TASK> amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu] amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] devm_drm_dev_init_release+0x49/0x70 [...] To prevent that, check if the drm_sched was properly initialized for a given ring before calling its fini counter-part. Notice ideally we'd use sched.ready for that; such field is set as the latest thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such field - in the above oops for example, it was a GFX ring causing the crash, and the sched.ready field was set to true in the ring init routine, regardless of the state of the DRM scheduler. Hence, we ended-up using sched.ops as per Christian's suggestion [0], and also removed the no_scheduler check [1]. [0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/ [1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/ Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)") Suggested-by: Christian König <christian.koenig@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-02-09drm/amdgpu: Use the TGID for trace_amdgpu_vm_update_ptesFriedrich Vock1-1/+1
The pid field corresponds to the result of gettid() in userspace. However, userspace cannot reliably attribute PTE events to processes with just the thread id. This patch allows userspace to easily attribute PTE update events to specific processes by comparing this field with the result of getpid(). For attributing events to specific threads, the thread id is also contained in the common fields of each trace event. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-02-09drm/amd/amdgpu: enable athub cg 11.0.3Kenneth Feng1-1/+3
enable athub cg on gc 11.0.3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-03drm/amdgpu: fix memory leak in amdgpu_cs_sync_ringsBert Karwatzki1-1/+4
amdgpu_sync_get_fence deletes the returned fence from the syncobj, so the refcount of fence needs to lowered to avoid a memory leak. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2360 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/3b590ba0f11d24b8c6c39c3d38250129c1116af4.camel@web.de
2023-02-02drm/amd: Fix initialization for nbio 4.3.0Mario Limonciello1-1/+7
A mistake has been made on some boards with NBIO 4.3.0 where some NBIO registers aren't properly set by the hardware. Ensure that they're set during initialization. Cc: Natikar Basavaraj <Basavaraj.Natikar@amd.com> Tested-by: Satyanarayana ReddyTVN <Satyanarayana.ReddyTVN@amd.com> Tested-by: Rutvij Gajjar <Rutvij.Gajjar@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-02-02drm/amdgpu: enable HDP SD for gfx 11.0.3Evan Quan1-1/+2
Enable HDP clock gating control for gfx 11.0.3. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>