BMC/Intel-BMC/linux.git - Intel OpenBMC Linux kernel source tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2021-04-09	drm/amdgpu: implement umc query error count callback	Hawking Zhang	2	-0/+92
	umc query_ras_error_count will be invoked to query umc correctable and uncorrectable error. It will reset the umc ras error counter after the query. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: add helper funtion to query umc ras error	Hawking Zhang	2	-0/+77
	Add helper functions to query correctable and uncorrectable umc ras error. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: create umc_v6_7_funcs for aldebaran	Hawking Zhang	3	-1/+58
	umc_v6_7_funcs are callbacks to support umc ras functionalities in aldebaran Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: initialze ras caps per paltform config	Hawking Zhang	1	-12/+23
	Driver only manages GFX/SDMA/MMHUB RAS in platforms that gpu node is connected to cpu through XGMI, other than that, it queries VBIOS for RAS capabilities. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: drop some unused atombios functions	Alex Deucher	2	-163/+0
	These were leftover from the old CI dpm code which was retired a while ago. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd: cleanup coding style a bit	Bernard Zhao	1	-4/+3
	Fix patch check warning: WARNING: suspect code indent for conditional statements (8, 17) + if (obj && obj->use < 0) { + DRM_ERROR("RAS ERROR: Unbalance obj(%s) use\n", obj->head.name); WARNING: braces {} are not necessary for single statement blocks + if (obj && obj->use < 0) { + DRM_ERROR("RAS ERROR: Unbalance obj(%s) use\n", obj->head.name); + } Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: support sdma error injection	Stanley.Yang	1	-0/+1
	Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reivewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: reserve fence slot to update page table	Philip Yang	1	-2/+8
	Forgot to reserve a fence slot to use sdma to update page table, cause below kernel BUG backtrace to handle vm retry fault while application is exiting. [ 133.048143] kernel BUG at /home/yangp/git/compute_staging/kernel/drivers/dma-buf/dma-resv.c:281! [ 133.048487] Workqueue: events amdgpu_irq_handle_ih1 [amdgpu] [ 133.048506] RIP: 0010:dma_resv_add_shared_fence+0x204/0x280 [ 133.048672] amdgpu_vm_sdma_commit+0x134/0x220 [amdgpu] [ 133.048788] amdgpu_vm_bo_update_range+0x220/0x250 [amdgpu] [ 133.048905] amdgpu_vm_handle_fault+0x202/0x370 [amdgpu] [ 133.049031] gmc_v9_0_process_interrupt+0x1ab/0x310 [amdgpu] [ 133.049165] ? kgd2kfd_interrupt+0x9a/0x180 [amdgpu] [ 133.049289] ? amdgpu_irq_dispatch+0xb6/0x240 [amdgpu] [ 133.049408] amdgpu_irq_dispatch+0xb6/0x240 [amdgpu] [ 133.049534] amdgpu_ih_process+0x9b/0x1c0 [amdgpu] [ 133.049657] amdgpu_irq_handle_ih1+0x21/0x60 [amdgpu] [ 133.049669] process_one_work+0x29f/0x640 [ 133.049678] worker_thread+0x39/0x3f0 [ 133.049685] ? process_one_work+0x640/0x640 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: indirect register access for nv12 sriov	Peng Ju Zhou	5	-63/+164
	1. expand rlcg interface for gc & mmhub indirect access 2. add rlcg interface for no kiq v2: squash in fix for gfx9 (Changfeng) Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: indirect register access for nv12 sriov	Peng Ju Zhou	2	-0/+19
	using the control bits got from host to control registers access. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: indirect register access for nv12 sriov	Peng Ju Zhou	2	-0/+13
	get pf2vf msg info at it's earliest time so that guest driver can use these info to decide whether register indirect access enabled. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: indirect register access for nv12 sriov	Peng Ju Zhou	1	-3/+3
	unify host driver and guest driver indirect access control bits names Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: check alignment on CPU page for bo map	Xℹ Ruoyao	1	-4/+4
	The page table of AMDGPU requires an alignment to CPU page so we should check ioctl parameters for it. Return -EINVAL if some parameter is unaligned to CPU page, instead of corrupt the page table sliently. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Set a suitable dev_info.gart_page_size	Huacai Chen	1	-2/+2
	In Mesa, dev_info.gart_page_size is used for alignment and it was set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU driver requires an alignment on CPU pages. So, for non-4KB page system, gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE). Signed-off-by: Rui Wang <wangr@lemote.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1 [Xi: rebased for drm-next, use max_t for checkpatch, and reworded commit message.] Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549 Tested-by: Dan Horák <dan@danny.cz> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: fix compiler warning(v2)	Guchun Chen	1	-3/+1
	warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] int write = !(gtt->userflags & AMDGPU_GEM_USERPTR_READONLY); v2: put short variable declaration last Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: fix NULL pointer dereference	Guchun Chen	1	-1/+1
	ttm->sg needs to be checked before accessing its child member. Call Trace: amdgpu_ttm_backend_destroy+0x12/0x70 [amdgpu] ttm_bo_cleanup_memtype_use+0x3a/0x60 [ttm] ttm_bo_release+0x17d/0x300 [ttm] amdgpu_bo_unref+0x1a/0x30 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x78b/0x8b0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x118/0x220 [amdgpu] kfd_ioctl+0x222/0x400 [amdgpu] ? kfd_dev_is_large_bar+0x90/0x90 [amdgpu] __x64_sys_ioctl+0x8e/0xd0 ? __context_tracking_exit+0x52/0x90 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97f264d317 Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 RSP: 002b:00007ffdb402c338 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f97f3cc63a0 RCX: 00007f97f264d317 RDX: 00007ffdb402c380 RSI: 00000000c0284b16 RDI: 0000000000000003 RBP: 00007ffdb402c380 R08: 00007ffdb402c428 R09: 00000000c4000004 R10: 00000000c4000004 R11: 0000000000000246 R12: 00000000c0284b16 R13: 0000000000000003 R14: 00007f97f3cc63a0 R15: 00007f8836200000 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Add new PF2VF flags for VF register access method	Rohit Khaire	2	-2/+26
	Add 3 sub flags to notify guest for indirect reg access of gc, mmhub and ih The host sets these flags depending on L1 RAP version, asic and other scenarios. These flags ensure that there is compatibility between different guest/host/vbios versions. Signed-off-by: Rohit Khaire <rohit.khaire@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: skip PP_MP1_STATE_UNLOAD on aldebaran	Feifei Xu	1	-2/+1
	This message is not needed on Aldebaran. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd/amdgpu: set MP1 state to UNLOAD before reload its FW for ↵	Chengming Gui	1	-1/+3
	vega20/ALDEBARAN When resume from gpu reset, need set MP1 state to UNLOAD before reload SMU FW otherwise will cause following errors: [ 121.642772] [drm] reserve 0x400000 from 0x87fec00000 for PSP TMR [ 123.801051] [drm] failed to load ucode id (24) [ 123.801055] [drm] psp command (0x6) failed and response status is (0x0) [ 123.801214] [drm:psp_load_smu_fw [amdgpu]] ERROR PSP load smu failed! [ 123.801398] [drm:psp_resume [amdgpu]] ERROR PSP resume failed [ 123.801536] [drm:amdgpu_device_fw_loading [amdgpu]] ERROR resume of IP block <psp> failed -22 [ 123.801632] amdgpu 0000:04:00.0: amdgpu: GPU reset(9) failed [ 123.801691] amdgpu 0000:07:00.0: amdgpu: GPU reset(9) failed [ 123.802899] amdgpu 0000:04:00.0: amdgpu: GPU reset end with ret = -22 v2: add error info and including ALDEBARAN also Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Reset error code for 'no handler' case	Lijo Lazar	1	-3/+8
	If reset handler is not implemented, reset error before proceeding. Fixes issue with the following trace - [ 106.508592] amdgpu 0000:b1:00.0: amdgpu: ASIC reset failed with error, -38 for drm dev, 0000:b1:00.0 [ 106.508972] amdgpu 0000:b1:00.0: amdgpu: GPU reset succeeded, trying to resume [ 106.509116] [drm] PCIE GART of 512M enabled. [ 106.509120] [drm] PTB located at 0x0000008000000000 [ 106.509136] [drm] VRAM is lost due to GPU reset! [ 106.509332] [drm] PSP is resuming... Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-and-tested-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: ih reroute for newer asics than vega20	Alex Sierra	1	-3/+3
	Starting Arcturus, it supports ih reroute through mmio directly in bare metal environment. This is also valid for newer asics such as Aldebaran. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()	Nirmoy Das	1	-1/+1
	Offset calculation wasn't correct as start addresses are in pfn not in bytes. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd/pm: unify the interface for gfx state setting	Evan Quan	1	-10/+6
	No need to have special handling for swSMU supported ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd/pm: unify the interface for loading SMU microcode	Evan Quan	1	-8/+2
	No need to have special handling for swSMU supported ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Fix build warnings	Lijo Lazar	2	-2/+2
	Fix header guard and make internal functions static. Fixes the below warnings: drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_reset.h:24:9: warning: '__AMDUGPU_RESET_H__' is used as a header guard here, followed by #define of a different macro [-Wheader-guard] drivers/gpu/drm/amd/amdgpu/aldebaran.c:110:6: warning: no previous prototype for function 'aldebaran_async_reset' [-Wmissing-prototypes] drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1435:5: warning: no previous prototype for function 'aldebaran_mode2_reset' [-Wmissing-prototypes] Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Enable recovery on aldebaran	Lijo Lazar	1	-0/+1
	Add aldebaran to devices which support recovery Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Add mode2 reset support for aldebaran	Lijo Lazar	5	-2/+463
	v1: Aldebaran uses reset control to support mode2 reset. The sequences to reset and restore hardware context are specific to a particular configuration. v2: Clear bus mastering before reset. Fix coding style issues, drop unwanted variables and info log. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Make set PG/CG state functions public	Lijo Lazar	2	-3/+9
	Expose PG/CG set states functions for other clients Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Add PSP public function to load a list of FWs	Lijo Lazar	2	-0/+19
	v1: Adds a function to load a list of FWs as passed by the caller. This is needed as only a select need to loaded for some use cases. v2: Omit unrelated change, remove info log, fix return value when count is 0 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Add reset control handling to reset workflow	Lijo Lazar	3	-39/+97
	This prefers reset control based handling if it's implemented for a particular ASIC. If not, it takes the legacy path. It uses the legacy method of preparing environment (job, scheduler tasks) and restoring environment. v2: remove unused variable (Alex) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Add reset control to amdgpu_device	Lijo Lazar	4	-0/+175
	v1: Add generic amdgpu_reset_control to handle different types of resets. It may be added at device, hive or ip level. Each reset control has a list of handlers associated with it to handle different types of reset. Reset control is responsible for choosing the right handler given a particular reset context. Handler objects may implement a set of functions on how to handle a particular type of reset. prepare_env = Prepare environment/software context (not used currently). prepare_hwcontext = Prepare hardware context for the reset. perform_reset = Perform the type of reset. restore_hwcontext = Restore the hw context after reset. restore_env = Restore the environment after reset (not used currently). Reset context carries the context of reset, as of now this is based on the parameters used for current set of resets. v2: Fix coding style Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd/amdgpu implement tdr advanced mode	Jack Zhang	2	-1/+82
	[Why] Previous tdr design treats the first job in job_timeout as the bad job. But sometimes a later bad compute job can block a good gfx job and cause an unexpected gfx job timeout because gfx and compute ring share internal GC HW mutually. [How] This patch implements an advanced tdr mode.It involves an additinal synchronous pre-resubmit step(Step0 Resubmit) before normal resubmit step in order to find the real bad job. 1. At Step0 Resubmit stage, it synchronously submits and pends for the first job being signaled. If it gets timeout, we identify it as guilty and do hw reset. After that, we would do the normal resubmit step to resubmit left jobs. 2. For whole gpu reset(vram lost), do resubmit as the old way. v2: squash in build fix (Alex) Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: make BO type check less restrictive	Nirmoy Das	1	-4/+4
	BO with ttm_bo_type_sg type can also have tiling_flag and metadata. So so BO type check for only ttm_bo_type_kernel. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reported-by: Tom StDenis <Tom.StDenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: use amdgpu_bo_user bo for metadata and tiling flag	Nirmoy Das	3	-22/+35
	Tiling flag and metadata are only needed for BOs created by amdgpu_gem_object_create(), so we can remove those from the base class. v2: * squash tiling_flags and metadata relared patches into one * use BUG_ON for non ttm_bo_type_device type when accessing tiling_flags and metadata._ v3: *include to_amdgpu_bo_user Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: use amdgpu_bo_create_user() for when possible	Nirmoy Das	2	-2/+6
	Use amdgpu_bo_create_user() for all the BO allocations for ttm_bo_type_device type. v2: include amdgpu_amdkfd_alloc_gws() as well it calls amdgpu_bo_create() for ttm_bo_type_device Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: introduce struct amdgpu_bo_user	Nirmoy Das	2	-0/+42
	Implement a new struct amdgpu_bo_user as subclass of struct amdgpu_bo and a function to created amdgpu_bo_user bo with a flag to identify the owner. v2: amdgpu_bo_to_amdgpu_bo_user -> to_amdgpu_bo_user() Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: allow variable BO struct creation	Nirmoy Das	9	-2/+19
	Allow allocating BO structures with different structure size than struct amdgpu_bo. v2: Check bo_ptr_size in all amdgpu_bo_create() caller. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: load balance VCN3 decode as well v8	Christian König	1	-2/+130
	Add VCN3 IB parsing to figure out to which instance we can send the stream for decode. v2: remove VCN instance limit as well, fix amdgpu_cs_find_mapping, check supported formats instead of unsupported. v3: fix typo and error handling v4: make sure the message BO is CPU accessible v5: fix addr calculation once more v6: only check message buffers v7: fix constant and use defines v8: fix create msg calculation Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: share scheduler score on VCN3 instances	Christian König	2	-4/+8
	The VCN3 instances can do both decode as well as encode. Share the scheduler load balancing score and remove fixing encode to only the second instance. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: add the sched_score to amdgpu_ring_init	Christian König	33	-99/+93
	Allow separate ring to share the same scheduler score. No functional change. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amd/amdgpu/gfx_v7_0: Trivial typo fixes	Bhaskar Chowdhury	1	-11/+11
	s/acccess/access/ s/inferface/interface/ s/sequnce/sequence/ .....two different places. s/retrive/retrieve/ s/sheduling/scheduling/ s/independant/independent/ s/wether/whether/ ......two different places. s/emmit/emit/ s/synce/sync/ Reviewed-by: Nirmoy Das<nirmoy.das@amd.com> Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	amdgpu: securedisplay: simplify i2c hexdump output	Arnd Bergmann	1	-8/+3
	A previous fix I did left a rather complicated loop in amdgpu_securedisplay_debugfs_write() for what could be expressed in a simple sprintf, as Rasmus pointed out. This drops the leading 0x for each byte, but is otherwise much nicer. Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Ensure that the modifier requested is supported by plane.	Mark Yacoub	1	-0/+13
	On initializing the framebuffer, call drm_any_plane_has_format to do a check if the modifier is supported. drm_any_plane_has_format calls dm_plane_format_mod_supported which is extended to validate that the modifier is on the list of the plane's supported modifiers. The bug was caught using igt-gpu-tools test: kms_addfb_basic.addfb25-bad-modifier Tested on ChromeOS Zork by turning on the display, running an overlay test, and running a YT video. === Changes from v1 === Explicitly handle DRM_FORMAT_MOD_INVALID modifier. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Mark Yacoub <markyacoub@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Convert sysfs sprintf/snprintf family to sysfs_emit	Tian Tao	8	-35/+29
	Fix the following coccicheck warning: drivers/gpu//drm/amd/amdgpu/amdgpu_ras.c:434:9-17: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:220:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_xgmi.c:249:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/df_v3_6.c:208:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_psp.c:2973:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:75:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:112:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:58:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:93:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_vram_mgr.c:125:9-17: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:52:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_gtt_mgr.c:71:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:140:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:164:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:186:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_device.c:208:8-16: WARNING: use scnprintf or sprintf drivers/gpu//drm/amd/amdgpu/amdgpu_atombios.c:1916:8-16: WARNING: use scnprintf or sprintf Signed-off-by: Tian Tao <tiantao6@hisilicon.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: remove irq_src->data handling	Christian König	2	-6/+0
	That is unused for quite some time now. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Fix check for RAS support	Luben Tuikov	1	-9/+6
	Use positive logic to check for RAS support. Rename the function to actually indicate what it is testing for. Essentially, make the function a predicate with the correct name. Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: Set amdgpu.noretry=1 for Arcturus	Philip Cox	1	-0/+1
	Setting amdgpu.noretry=1 as default for Arcturus. Signed-off-by: Philip Cox <Philip.Cox@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: added support for dynamic GECC	John Clements	1	-0/+23
	updated host to send boot config to psp to enable GECC for sienna cichlid Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: update host to psp interface	John Clements	1	-0/+25
	added interface support for setting boot config Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-04-09	drm/amdgpu: move vram recover into sriov full access	Horace Chen	1	-1/+1
	[what] currently driver recover vram after full access, which may hit a corner case that meanwhile another whole gpu reset may be triggered by another VF, which will cause vram recover fail then fail the whole device reset. [how] move the recover vram into full access. So another bad VF will not disturb the recover sequence for this vf. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>