summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2021-08-30drm/amdgpu: show both cmd id and name when psp cmd failedLang Yu1-4/+4
To cover the corner case that people want to know the ID of an UNKNOWN CMD. Suggested-by: John Clements <john.clements@amd.com> Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amdgpu: Enable S/G for Yellow CarpNicholas Kazlauskas1-0/+1
Missing code for Yellow Carp to enable scatter gather - follows how DCN21 support was added. Tested that 8k framebuffer allocation and display can now succeed after applying the patch. v2: Add hookup in DM Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-30drm/amdgpu: reenable BACO support for 699F:C7 polaris12 SKUEvan Quan1-8/+1
This reverts the commit below: "drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily". As the S3 hang issue has been fixed by another commit: "drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend". Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amd/amdgpu: Add ready_to_reset resp for vega10YuBiao Wang2-0/+3
Send response to host after received the flr notification from host. Port NV change to vega10. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amdgpu: add some additional RDNA2 PCI IDsAlex Deucher1-0/+17
New PCI IDs. Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amdgpu: correct comments in memory type managersYifan Zhang2-5/+5
The parameters were renamed. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amdgpu: Process any VBIOS RAS EEPROM addressLuben Tuikov1-12/+13
We can now process any RAS EEPROM address from VBIOS. Generalize so as to compute the top three bits of the 19-bit EEPROM address, from any byte returned as the "i2c address" from VBIOS. Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/amdgpu: Fixes to returning VBIOS RAS EEPROM addressLuben Tuikov1-17/+33
1) Generalize the function--if the user didn't set i2c_address, still return true/false to indicate whether VBIOS contains the RAS EEPROM address. This function shouldn't evaluate whether the user set the i2c_address pointer or not. 2) Don't touch the caller's i2c_address, unless you have to--this function shouldn't have side effects. 3) Correctly set the function comment as a kernel-doc comment. Cc: John Clements <john.clements@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-30drm/sched: drop entity parameter from drm_sched_push_jobDaniel Vetter2-2/+2
Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. v2: Rebase on top of msm adopting drm/sched Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Steven Price <steven.price@arm.com> (v1) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Emma Anholt <emma@anholt.net> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Chen Li <chenli@uniontech.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: Melissa Wen <mwen@igalia.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30drm/sched: Split drm_sched_job_initDaniel Vetter2-0/+4
This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. v5: Reshuffle the split between init/arm once more, amdgpu abuses drm_sched.ready to signal gpu reset failures. Also document this somewhat. (Christian) v6: Rebase on top of the msm drm/sched support. Note that the drm_sched_job_init() call is completely misplaced, and hence also the split-out drm_sched_entity_push_job(). I've put in a FIXME which the next patch will address. v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't be a problem where it is now. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Melissa Wen <mwen@igalia.com> Cc: Melissa Wen <melissa.srw@gmail.com> Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Steven Price <steven.price@arm.com> (v2) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Nick Terrell <terrelln@fb.com> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Chen Li <chenli@uniontech.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Sonny Jiang <sonny.jiang@amd.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Tian Tao <tiantao6@hisilicon.com> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Emma Anholt <emma@anholt.net> Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-30Merge tag 'amd-drm-next-5.15-2021-08-27' of ↵Dave Airlie40-143/+761
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.15-2021-08-27: amdgpu: - PLL fix for SI - Misc code cleanups - RAS fixes - PSP cleanups - Polaris UVD/VCE suspend fixes - aldebaran fixes - DCN3.x mclk fixes amdkfd: - CWSR fixes for arcturus and aldebaran - SVM fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210827192336.4649-1-alexander.deucher@amd.com
2021-08-29Merge tag 'irqchip-5.15' of ↵Thomas Gleixner1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - API updates: - Treewide conversion to generic_handle_domain_irq() for anything that looks like a chained interrupt controller - Update the irqdomain documentation - Use of bitmap_zalloc() throughout the tree - New functionalities: - Support for GICv3 EPPI partitions - Fixes: - Qualcomm PDC hierarchy fixes - Yet another priority decoding fix for the GICv3 pseudo-NMIs - Fix the apple-aic driver irq_eoi() callback to always unmask the interrupt - Properly handle edge interrupts on loongson-pch-pic - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE Link: https://lore.kernel.org/r/20210828121013.2647964-1-maz@kernel.org
2021-08-26drm/amdgpu: disable GFX CGCG in aldebaranHawking Zhang1-2/+0
disable GFX CGCG and CGLS to workaround a hardware issue found in aldebaran. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26drm/amdgpu: Clear RAS interrupt status on aldebaranJohn Clements1-5/+29
resolve register address issue for detecting/clearing RAS interrupt Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26drm/amdgpu: Add support for RAS XGMI err queryJohn Clements1-0/+65
Update XGMI RAS to support error query on aldebaran Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26Merge tag 'amd-drm-next-5.15-2021-08-20' of ↵Dave Airlie45-439/+716
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.15-2021-08-20: amdgpu: - embed hw fence into job - Misc SMU fixes - PSP TA code cleanup - RAS fixes - PWM fan speed fixes - DC workqueue cleanups - SR-IOV fixes - gfxoff delayed work fix - Pin domain check fix amdkfd: - SVM fixes radeon: - Code cleanup Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210820172335.4190-1-alexander.deucher@amd.com
2021-08-26drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domainYifan Zhang4-7/+7
amdgpu_bo_get_preferred_pin_domain is used for page tables creation, which is not involved with page pinning. And it is used in more cases than display scanout, modify its documentation as well. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26drm/amdgpu: drop redundant cancel_delayed_work_sync callEvan Quan4-6/+0
As those _sw_fini() APIs follow just after _suspend() APIs. And the cancel_delayed_work_sync was already called in latter. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspendEvan Quan6-1/+144
This is a supplement for commit below: "drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend". Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-26drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspendEvan Quan2-0/+47
Perform proper cleanups on UVD/VCE suspend: powergate enablement, clockgating enablement and dpm disablement. This can fix some hangs observed on suspending when UVD/VCE still using(e.g. issue "pm-suspend" when video is still playing). Signed-off-by: Evan Quan <evan.quan@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdkfd: check access permisson to restore retry faultPhilip Yang4-5/+8
Check range access permission to restore GPU retry fault, if GPU retry fault on address which belongs to VMA, and VMA has no read or write permission requested by GPU, failed to restore the address. The vm fault event will pass back to user space. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdgpu: Update RAS XGMI Error QueryJohn Clements1-1/+3
Resolve bug querying error on unsupported ASIC Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdgpu: Add driver infrastructure for MCA RASJohn Clements9-2/+388
Add MCA specific IP blocks targetting RAS features Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amd/amdgpu: consolidate PSP TA init shared buf functionsCandice Li1-99/+43
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amd/amdgpu: add name field back to ras_common_ifCandice Li1-0/+1
Adding name field back to ras_common_if to work around error injection failure with amdgpuras tool. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdgpu: Fix build with missing pm_suspend_target_state module exportBorislav Petkov1-1/+1
Building a randconfig here triggered: ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND. The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this: config PM_SLEEP def_bool y depends on SUSPEND || HIBERNATE_CALLBACKS and that randconfig has: # CONFIG_SUSPEND is not set CONFIG_HIBERNATE_CALLBACKS=y leading to the module export missing. Change the ifdeffery to depend directly on CONFIG_SUSPEND. Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-24drm/amdgpu: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-3/+3
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdkfd: CWSR with sw scheduler on Aldebaran and ArcturusMukul Joshi4-2/+6
Program trap handler settings to enable CWSR with software scheduler on Aldebaran and Arcturus. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdgpu/OLAND: clip the ref divider max valueShashank Sharma3-9/+16
This patch limits the ref_div_max value to 100, during the calculation of PLL feedback reference divider. With current value (128), the produced fb_ref_div value generates unstable output at particular frequencies. Radeon driver limits this value at 100. On Oland, when we try to setup mode 2048x1280@60 (a bit weird, I know), it demands a clock of 221270 Khz. It's been observed that the PLL calculations using values 128 and 100 are vastly different, and look like this: +------------------------------------------+ |Parameter |AMDGPU |Radeon | | | | | +-------------+----------------------------+ |Clock feedback | | |divider max | 128 | 100 | |cap value | | | | | | | | | | | +------------------------------------------+ |ref_div_max | | | | | 42 | 20 | | | | | | | | | +------------------------------------------+ |ref_div | 42 | 20 | | | | | +------------------------------------------+ |fb_div | 10326 | 8195 | +------------------------------------------+ |fb_div | 1024 | 163 | +------------------------------------------+ |fb_dev_p | 4 | 9 | |frac fb_de^_p| | | +----------------------------+-------------+ With ref_div_max value clipped at 100, AMDGPU driver can also drive videmode 2048x1280@60 (221Mhz) and produce proper output without any blanking and distortion on the screen. PS: This value was changed from 128 to 100 in Radeon driver also, here: https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806 V1: Got acks from: Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> V2: - Restricting the changes only for OLAND, just to avoid any regression for other cards. - Changed unsigned -> unsigned int to make checkpatch quiet. V3: Apply the change on SI family (not only oland) (Christian) Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Eddy Qin <Eddy.Qin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24drm/amdgpu: Fix build with missing pm_suspend_target_state module exportBorislav Petkov1-1/+1
Building a randconfig here triggered: ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND. The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this: config PM_SLEEP def_bool y depends on SUSPEND || HIBERNATE_CALLBACKS and that randconfig has: # CONFIG_SUSPEND is not set CONFIG_HIBERNATE_CALLBACKS=y leading to the module export missing. Change the ifdeffery to depend directly on CONFIG_SUSPEND. Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-23drm/ttm: remove ttm_tt_destroy_common v2Christian König1-1/+0
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead. We don't need this any more since we removed the unbind from the destroy code paths in the drivers. Also add a warning to ttm_tt_fini() if we try to fini a still populated TT object. v2: instead of reverting the patch move the functionality to different places. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com
2021-08-23drm/amdgpu: unbind in amdgpu_ttm_tt_unpopulateChristian König1-1/+2
Doing this in amdgpu_ttm_backend_destroy() is to late. It turned out that this is not a good idea at all because it leaves pointers to freed up system memory pages in the GART tables of the drivers. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-2-christian.koenig@amd.com
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer2-17/+30
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3 Acked-by: Christian König <christian.koenig@amd.com> # v3 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20drm/amdgpu: use the preferred pin domain after the checkChristian König1-5/+5
For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer2-17/+30
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3 Acked-by: Christian König <christian.koenig@amd.com> # v3 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20drm/amdgpu: use the preferred pin domain after the checkChristian König1-5/+5
For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-19drm/amd/amdgpu:flush ttm delayed work before cancel_syncYuBiao Wang1-1/+3
[Why] In some cases when we unload driver, warning call trace will show up in vram_mgr_fini which claims that LRU is not empty, caused by the ttm bo inside delay deleted queue. [How] We should flush delayed work to make sure the delay deleting is done. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19drm/amd: consolidate TA shared memory structuresCandice Li4-162/+134
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19drm/amdgpu: increase max xgmi physical node for aldebaranHawking Zhang1-3/+2
aldebaran supports up to 16 xgmi physical nodes. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarilyEvan Quan1-1/+8
We have a S3 issue on that SKU with BACO enabled. Will bring back this when that root caused. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19drm/amdgpu: correct MMSCH 1.0 versionZhigang Luo1-3/+1
MMSCH 1.0 doesn't have major/minor version, only verison. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19drm/amdgpu: get extended xgmi topology dataJonathan Kim4-14/+145
The TA has a limit to the amount of data that can be retrieved from GET_TOPOLOGY. For setups that exceed this limit, the xGMI topology needs to be re-initialized and data needs to be re-fetched from the extended link records by setting a flag in the shared command buffer. The number of hops and the number of links must be accumulated by the driver. Other data points are all fetched from the first request. Because the TA has already exceeded its link record limit, it cannot hold bidirectional information. Otherwise the driver would have to do more than two fetches so the driver has to reflect the topology information in the opposite direction. v2: squashed with internal reviewed fix Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amd/pm: drop the unnecessary intermediate percent-based transitionEvan Quan1-0/+2
Currently, the readout of fan speed pwm is transited into percent-based and then pwm-based. However, the transition into percent-based is totally unnecessary and make the final output less accurate. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amd/amdgpu: remove unnecessary RAS context fieldCandice Li10-14/+6
Delete ras_if->name in the RAS ctx structure and remove related lines. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amd/amdgpu: consolidate PSP TA contextCandice Li11-187/+158
Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.Jiange Zhao2-1/+4
When guest received FLR notification from host, it would lock adapter into reset state. There will be no more job submission and hardware access after that. Then it should send a response to host that it has prepared for host reset. Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16drm/amd/amdgpu embed hw_fence into amdgpu_jobJack Zhang9-37/+119
Why: Previously hw fence is alloced separately with job. It caused historical lifetime issues and corner cases. The ideal situation is to take fence to manage both job and fence's lifetime, and simplify the design of gpu-scheduler. How: We propose to embed hw_fence into amdgpu_job. 1. We cover the normal job submission by this method. 2. For ib_test, and submit without a parent job keep the legacy way to create a hw fence separately. v2: use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is embedded in a job. v3: remove redundant variable ring in amdgpu_job v4: add tdr sequence support for this feature. Add a job_run_counter to indicate whether this job is a resubmit job. v5 add missing handling in amdgpu_fence_enable_signaling Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Signed-off-by: Jack Zhang <Jack.Zhang7@hotmail.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16Merge tag 'drm-misc-next-2021-08-12' of ↵Dave Airlie3-9/+15
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15: UAPI Changes: Cross-subsystem Changes: - Add lockdep_assert(once) helpers. Core Changes: - Add lockdep assert to drm_is_current_master_locked. - Fix typos in dma-buf documentation. - Mark drm irq midlayer as legacy only. - Fix GPF in udmabuf_create. - Rename member to correct value in drm_edid.h Driver Changes: - Build fix to make nouveau build with NOUVEAU_BACKLIGHT. - Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels. - Assorted fixes to bridge/it66121, anx7625. - Add custom crtc_state to simple helpers, and use it to convert pll handling in mgag200 to atomic. - Convert drivers to use offset-adjusted framebuffer bo mappings. - Assorted small fixes and fix for a use-after-free in vmwgfx. - Convert remaining callers of non-legacy drivers to use linux irqs directly. - Small cleanup in ingenic. - Small fixes to virtio and ti-sn65dsi86. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1cf2d7fc-402d-1852-574a-21cbbd2eaebf@linux.intel.com
2021-08-12gpu: Bulk conversion to generic_handle_domain_irq()Marc Zyngier1-1/+1
Wherever possible, replace constructs that match either generic_handle_irq(irq_find_mapping()) or generic_handle_irq(irq_linear_revmap()) to a single call to generic_handle_domain_irq(). Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-08-12gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable ↵Tuo Li1-1/+1
access in amdgpu_i2c_router_select_ddc_port() The variable val is declared without initialization, and its address is passed to amdgpu_i2c_get_byte(). In this function, the value of val is accessed in: DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n", addr, *val); Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized, but it is accessed in: val &= ~amdgpu_connector->router.ddc_mux_control_pin; To fix this possible uninitialized-variable access, initialize val to 0 in amdgpu_i2c_router_select_ddc_port(). Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>