kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-05-22	drm/amdkfd: fix restore worker race condition	Philip Yang	1	-3/+3
	In free memory of gpu path, remove bo from validate_list to make sure restore worker don't access the BO any more, then unregister bo MMU interval notifier. Otherwise, the restore worker will crash in the middle of validating BO user pages if MMU interval notifer is gone. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu: off by one in amdgpu_device_attr_create_groups() error handling	Dan Carpenter	1	-2/+1
	This loop in the error handling code should start a "i - 1" and end at "i == 0". Currently it starts a "i" and ends at "i == 1". The result is that it removes one attribute that wasn't created yet, and leaks the zeroeth attribute. Fixes: 4e01847c38f7 ("drm/amdgpu: optimize amdgpu device attribute code") Acked-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu: add condition to set MP1 state on gpu reset	Likun Gao	1	-1/+2
	Only ras supportted need to set MP1 state to prepare for unload before reloading SMU FW. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu fix incorrect sysfs remove behavior for xgmi	Jack Zhang	1	-7/+16
	Under xgmi setup,some sysfs fail to create for the second time of kmd driver loading. It's due to sysfs nodes are not removed appropriately in the last unlod time. Changes of this patch: 1. remove sysfs for dev_attr_xgmi_error 2. remove sysfs_link adev->dev->kobj with target name. And it only needs to be removed once for a xgmi setup 3. remove sysfs_link hive->kobj with target name In amdgpu_xgmi_remove_device: 1. amdgpu_xgmi_sysfs_rem_dev_info needs to be run per device 2. amdgpu_xgmi_sysfs_destroy needs to be run on the last node of device. v2: initialize array with memset Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu/vcn2.5: Remove old DPG workaround	James Zhu	1	-9/+0
	SCRATCH2 is used to keep decode wptr as a workaround which fix a hardware DPG decode wptr update bug for vcn2.5 beforehand. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu/jpeg2.5: Remove JPEG_ENC_MASK from clock ungating	James Zhu	1	-1/+0
	Remove JPEG_ENC_MASK from clock ungating since MJPEG encoder hasn't been support yet. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amd/display: Add DC Debug mask to disable features for bringup	Harry Wentland	2	-0/+9
	[Why] At bringup we want to be able to disable various power features. [How] These features are already exposed as dc_debug_options and exercised on other OSes. Create a new dc_debug_mask module parameter and expose relevant bits, in particular * DC_DISABLE_PIPE_SPLIT * DC_DISABLE_STUTTER * DC_DISABLE_DSC * DC_DISABLE_CLOCK_GATING Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-21	drm/amdgpu: cleanup unnecessary virt sriov check in amdgpu attribute	Kevin Wang	1	-105/+0
	the amdgpu device attribute node will be created accordding to sriov vf mode at runtime. cleanup unnecessary sriov check in attribute operation function. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-20	drm/amd: remove _unlocked suffix in drm_gem_object_put_unlocked	Emil Velikov	10	-29/+29
	Spelling out _unlocked for each and every driver is a annoying. Especially if we consider how many drivers, do not know (or need to) about the horror stories involving struct_mutex. Just drop the suffix. It makes the API cleaner. Done via the following script: __from=drm_gem_object_put_unlocked __to=drm_gem_object_put for __file in $(git grep --name-only $__from); do sed -i "s/$__from/$__to/g" $__file; done Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-15-emil.l.velikov@gmail.com
2020-05-20	drm/amdgpu: use the unlocked drm_gem_object_put	Emil Velikov	1	-1/+1
	The driver does not hold struct_mutex, thus using the locked version of the helper is incorrect. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: amd-gfx@lists.freedesktop.org Fixes: a39414716ca0 ("drm/amdgpu: add independent DMA-buf import v9") Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-8-emil.l.velikov@gmail.com
2020-05-18	drm/amdgpu: Add a UAPI flag for user to call mem_sync	Andrey Grodzovsky	2	-1/+5
	When this flag is set in the CS IB flags, it causes a memory cache flush of the GFX. v2: Move new flag to drm_amdgpu_cs_chunk_ib.flags Bump up UAPI version Remove condition on job != null to emit mem_sync Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: apply AMDGPU_IB_FLAG_EMIT_MEM_SYNC to compute IBs too (v3)	Marek Olšák	7	-7/+46
	Compute IBs need this too. v2: split out version bump v3: squash in emit frame count fixes Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: Add mem_sync implementation for all the ASICs.	Andrey Grodzovsky	5	-5/+94
	Implement the .mem_sync hook defined earlier. v2: Rename functions Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: Add new ring callback to insert memory sync	Andrey Grodzovsky	1	-0/+1
	Used to flush and invalidate various caches. v2: Rename function hook Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: optimize amdgpu device attribute code	Kevin Wang	2	-278/+262
	unified amdgpu device attribute node functions: 1. add some helper functions to create amdgpu device attribute node. 2. create device node according to device attr flags on different VF mode. 3. rename some functions name to adapt a new interface. v2: 1. remove ATTR_STATE_DEAD, ATTR_STATE_ALIVE enum. 2. rename callback function perform to attr_update. 3. modify some variable names Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: add amdgpu_virt_get_vf_mode helper function	Kevin Wang	2	-0/+24
	the swsmu or powerplay(hwmgr) need to handle task according to different VF mode, this function to help query vf mode. vf mode: 1. SRIOV_VF_MODE_BARE_METAL: the driver work on host OS (PF) 2. SRIOV_VF_MODE_ONE_VF : the driver work on guest OS with one VF 3. SRIOV_VF_MODE_MULTI_VF : the driver work on guest OS with multi VF Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-18	drm/amdgpu: Add autodump debugfs node for gpu reset v8	Jiange Zhao	4	-1/+87
	When GPU got timeout, it would notify an interested part of an opportunity to dump info before actual GPU reset. A usermode app would open 'autodump' node under debugfs system and poll() for readable/writable. When a GPU reset is due, amdgpu would notify usermode app through wait_queue_head and give it 10 minutes to dump info. After usermode app has done its work, this 'autodump' node is closed. On node closure, amdgpu gets to know the dump is done through the completion that is triggered in release(). There is no write or read callback because necessary info can be obtained through dmesg and umr. Messages back and forth between usermode app and amdgpu are unnecessary. v2: (1) changed 'registered' to 'app_listening' (2) add a mutex in open() to prevent race condition v3 (chk): grab the reset lock to avoid race in autodump_open, rename debugfs file to amdgpu_autodump, provide autodump_read as well, style and code cleanups v4: add 'bool app_listening' to differentiate situations, so that the node can be reopened; also, there is no need to wait for completion when no app is waiting for a dump. v5: change 'bool app_listening' to 'enum amdgpu_autodump_state' add 'app_state_mutex' for race conditions: (1)Only 1 user can open this file node (2)wait_dump() can only take effect after poll() executed. (3)eliminated the race condition between release() and wait_dump() v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex' removed state checking in amdgpu_debugfs_wait_dump Improve on top of version 3 so that the node can be reopened. v7: move reinit_completion into open() so that only one user can open it. v8: remove complete_all() from amdgpu_debugfs_wait_dump(). Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-15	drm/amdgpu: Update RAS XGMI error inject sequence	John Clements	1	-1/+29
	Disable XGMI link power down prior to issuing a XGMI RAS error Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-15	drm/amdgpu: Add DPM function for XGMI link power down control	John Clements	2	-0/+12
	Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14	drm/amdgpu: remove redundant assignment to variable ret	Colin Ian King	1	-1/+1
	The variable ret is being initializeed with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14	drm/amdgpu: turn back rlcg write for gfx_v10	Yintian Tao	1	-8/+6
	There is no need to use amdgpu_mm_wreg_mmio_rlc() during initialization time because this interface is only designed for debugfs case to access the registers which are only permitted by RLCG during run-time. Therefore, turn back rlcg write for gfx_v10. If we not turn back it, it will raise amdgpu load failure. [ 54.904333] amdgpu: SMU driver if version not matched [ 54.904393] amdgpu: SMU is initialized successfully! [ 54.905971] [drm] kiq ring mec 2 pipe 1 q 0 [ 55.115416] amdgpu 0000:00:06.0: [drm:amdgpu_ring_test_helper [amdgpu]] ERROR ring gfx_0.0.0 test failed (-110) [ 55.118877] [drm:amdgpu_device_init [amdgpu]] ERROR hw_init of IP block <gfx_v10_0> failed -110 [ 55.126587] amdgpu 0000:00:06.0: amdgpu_device_ip_init failed [ 55.133466] amdgpu 0000:00:06.0: Fatal error during GPU init Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14	drm/amdgpu: Add AQUIRE_MEM PACKET3 fields defintion	Andrey Grodzovsky	2	-1/+72
	Add this for gfx10 and gfx9. v2: Fix identation Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-14	Merge tag 'amd-drm-next-5.8-2020-05-12' of ↵	Dave Airlie	29	-689/+771
	git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.8-2020-05-12: amdgpu: - Misc cleanups - RAS fixes - Expose FP16 for modesetting - DP 1.4 compliance test fixes - Clockgating fixes - MAINTAINERS update - Soft recovery for gfx10 - Runtime PM cleanups - PSP code cleanups amdkfd: - Track GPU memory utilization per process - Report PCI domain in topology Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200512213703.4039-1-alexander.deucher@amd.com
2020-05-12	drm/amd/amdgpu: add raven1 part to the gfxoff quirk list	Tom St Denis	1	-0/+2
	On my raven1 system (rev c6) with VBIOS 113-RAVEN-114 GFXOFF is not stable (resulting in large block tiling noise in some applications). Disabling GFXOFF via the quirk list fixes the problems for me. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-05-12	drm/amd/amdgpu: remove defined but not used 'crtc_offsets'	Jason Yan	1	-11/+0
	Fix the following gcc warning: drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:65:18: warning: ‘crtc_offsets’ defined but not used [-Wunused-const-variable=] static const u32 crtc_offsets[6] = ^~~~~~~~~~~~ Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-12	drm/amd/amdgpu: add raven1 part to the gfxoff quirk list	Tom St Denis	1	-0/+2
	On my raven1 system (rev c6) with VBIOS 113-RAVEN-114 GFXOFF is not stable (resulting in large block tiling noise in some applications). Disabling GFXOFF via the quirk list fixes the problems for me. Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-11	mm/hmm: remove the customizable pfn format from hmm_range_fault	Jason Gunthorpe	1	-25/+10
	Presumably the intent here was that hmm_range_fault() could put the data into some HW specific format and thus avoid some work. However, nothing actually does that, and it isn't clear how anything actually could do that as hmm_range_fault() provides CPU addresses which must be DMA mapped. Perhaps there is some special HW that does not need DMA mapping, but we don't have any examples of this, and the theoretical performance win of avoiding an extra scan over the pfns array doesn't seem worth the complexity. Plus pfns needs to be scanned anyhow to sort out any DEVICE_PRIVATE pages. This version replaces the uint64_t with an usigned long containing a pfn and fixed flags. On input flags is filled with the HMM_PFN_REQ_* values, on successful output it is filled with HMM_PFN_* values, describing the state of the pages. amdgpu is simple to convert, it doesn't use snapshot and doesn't use per-page flags. nouveau uses only 16 hmm_pte entries at most (ie fits in a few cache lines), and it sweeps over its pfns array a couple of times anyhow. It also has a nasty call chain before it reaches the dma map and hardware suggesting performance isn't important: nouveau_svm_fault(): args.i.m.method = NVIF_VMM_V0_PFNMAP nouveau_range_fault() nvif_object_ioctl() client->driver->ioctl() struct nvif_driver nvif_driver_nvkm: .ioctl = nvkm_client_ioctl nvkm_ioctl() nvkm_ioctl_path() nvkm_ioctl_v0[type].func(..) nvkm_ioctl_mthd() nvkm_object_mthd() struct nvkm_object_func nvkm_uvmm: .mthd = nvkm_uvmm_mthd nvkm_uvmm_mthd() nvkm_uvmm_mthd_pfnmap() nvkm_vmm_pfn_map() nvkm_vmm_ptes_get_map() func == gp100_vmm_pgt_pfn struct nvkm_vmm_desc_func gp100_vmm_desc_spt: .pfn = gp100_vmm_pgt_pfn nvkm_vmm_iter() REF_PTES == func == gp100_vmm_pgt_pfn() dma_map_page() Link: https://lore.kernel.org/r/5-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11	mm/hmm: remove HMM_PFN_SPECIAL	Jason Gunthorpe	1	-1/+0
	This is just an alias for HMM_PFN_ERROR, nothing cares that the error was because of a special page vs any other error case. Link: https://lore.kernel.org/r/4-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11	drm/amdgpu: remove dead code after hmm_range_fault()	Jason Gunthorpe	1	-10/+6
	Since amdgpu does not use the snapshot mode of hmm_range_fault() a successful return already proves that all entries in the pfns are HMM_PFN_VALID, there is no need to check the return result of hmm_device_entry_to_page(). Link: https://lore.kernel.org/r/3-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11	mm/hmm: make hmm_range_fault return 0 or -1	Jason Gunthorpe	1	-2/+2
	hmm_vma_walk->last is supposed to be updated after every write to the pfns, so that it can be returned by hmm_range_fault(). However, this is not done consistently. Fortunately nothing checks the return code of hmm_range_fault() for anything other than error. More importantly last must be set before returning -EBUSY as it is used to prevent reading an output pfn as an input flags when the loop restarts. For clarity and simplicity make hmm_range_fault() return 0 or -ERRNO. Only set last when returning -EBUSY. Link: https://lore.kernel.org/r/2-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-08	drm/amdgpu: implement soft_recovery for gfx10	Alex Deucher	1	-0/+14
	Same as gfx9. This allows us to kill the waves for hung shaders. Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: enable hibernate support on Navi1X	Evan Quan	2	-0/+3
	BACO is needed to support hibernate on Navi1X. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-05-08	drm/amdgpu: Use GEM obj reference for KFD BOs	Felix Kuehling	1	-2/+3
	Releasing the AMDGPU BO ref directly leads to problems when BOs were exported as DMA bufs. Releasing the GEM reference makes sure that the AMDGPU/TTM BO is not freed too early. Also take a GEM reference when importing BOs from DMABufs to keep references to imported BOs balances properly. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Sierra <alex.sierra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: force fbdev into vram	Alex Deucher	1	-2/+1
	We set the fb smem pointer to the offset into the BAR, so keep the fbdev bo in vram. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=207581 Fixes: 6c8d74caa2fa33 ("drm/amdgpu: Enable scatter gather display support") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-05-08	drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate	Evan Quan	2	-14/+4
	As this is already properly handled in amdgpu_gfx_off_ctrl(). In fact, this unnecessary cancel_delayed_work_sync may leave a small time window for race condition and is dangerous. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate	Evan Quan	1	-1/+1
	Otherwise, MGCG/MGLS will be left enabled. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: only set DPM_FLAG_NEVER_SKIP for legacy ATPX BOCO	Alex Deucher	1	-1/+4
	We only need to set DPM_FLAG_NEVER_SKIP for the legacy ATPX BOCO case. D3cold and BACO work as expected. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: drop extra runtime pm handling in resume pmop	Alex Deucher	1	-8/+0
	The core handles this for us. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: fix runpm logic in amdgpu_pmops_resume	Alex Deucher	1	-2/+2
	We should be checking whether the driver enabled runtime pm rather than whether the asic supports BOCO or BACO. That said in general they are equivalent unless the user has disabled runpm or it has been disabled for a specific asic. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: drop pm_runtime_set_active	Alex Deucher	1	-1/+0
	The pci core handles this for us in pci_pm_init. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: implement soft_recovery for gfx10	Alex Deucher	1	-0/+14
	Same as gfx9. This allows us to kill the waves for hung shaders. Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: cleanup sysfs file handling	Nirmoy Das	1	-24/+12
	Create sysfs file using attributes. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: enable hibernate support on Navi1X	Evan Quan	2	-0/+3
	BACO is needed to support hibernate on Navi1X. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: use node_id and node_size to calcualte dram_base_address	Hawking Zhang	3	-82/+2
	physical_node_id * node_segment_size should be the dram_base_address for current gpu node in xgmi config Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: switch to common rlc_autoload helper	Hawking Zhang	3	-10/+1
	drop IP specific psp function for rlc autoload since the autoload_supported was introduced to mark ASICs that support rlc_autoload Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: drop unused ras ta helper function	Hawking Zhang	2	-32/+0
	cure posion command was replaced by ras recovery solution and was not a formal command supported by ras ta anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: switch to common ras ta helper	Hawking Zhang	3	-33/+30
	TRIGGER_ERROR is common ras ta command for all the ASICs that support RAS feature. switch to common helper to avoid duplicate implementation per IP generation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	drm/amdgpu: switch to common xgmi ta helpers	Hawking Zhang	3	-137/+123
	get_hive_id/get_node_id/get_topology_info/set_topology_info are common xgmi command supported by TA for all the ASICs that support xgmi link. They should be implemented as common helper functions to avoid duplicated code per IP generation Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-05-08	Merge tag 'amd-drm-next-5.8-2020-04-30' of ↵	Dave Airlie	52	-354/+739
	git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.8-2020-04-30: amdgpu: - SR-IOV fixes - SDMA fix for Navi - VCN 2.5 DPG fixes - Display fixes - Display stuttering fixes for pageflip and cursor - Add support for handling encrypted GPU memory - Add UAPI for encrypted GPU memory - Rework IB pool handling amdkfd: - Expose asic revision in topology - Add UAPI for GWS (Global Wave Sync) resource management UAPI: - Add amdgpu UAPI for encrypted GPU memory Used by: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401 - Add amdkfd UAPI for GWS (Global Wave Sync) resource management Thunk usage of KFD ioctl: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/blob/roc-2.8.0/src/queues.c#L840 ROCr usage of Thunk API: https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/roc-3.1.0/src/core/runtime/amd_gpu_agent.cpp#L597 HCC code using ROCr API: https://github.com/RadeonOpenCompute/hcc/blob/98ee9f34945d3b5f572d7a4c15cbffa506487734/lib/hsa/mcwamp_hsa.cpp#L2161 HIP code using HCC API: https://github.com/ROCm-Developer-Tools/HIP/blob/cf8589b8c8a40ddcc55fa3a51e23390a49824130/src/hip_module.cpp#L567 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430212951.3902-1-alexander.deucher@amd.com
2020-05-07	drm/amd/amdgpu: cleanup coding style a bit	Bernard Zhao	1	-30/+13
	There is DEVICE_ATTR mechanism in separate attribute define. So this change is to use attr array, also use sysfs_create_files in init function & sysfs_remove_files in fini function. This maybe make the code a bit readable. Signed-off-by: Bernard Zhao <bernard@vivo.com> Changes since V1: Use DEVICE_ATTR mechanism Link for V1: https://lore.kernel.org/patchwork/patch/1228076/ V2: make array const to fix build errors Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>