summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2022-09-28drm/amdgpu: use dirty framebuffer helperHamza Mahfooz1-0/+2
[ Upstream commit 66f99628eb24409cb8feb5061f78283c8b65f820 ] Currently, we aren't handling DRM_IOCTL_MODE_DIRTYFB. So, use drm_atomic_helper_dirtyfb() as the dirty callback in the amdgpu_fb_funcs struct. Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28drm/amdgpu: Fix check for RAS supportLuben Tuikov1-9/+6
[ Upstream commit 084e2640e51626f413f85663e3ba7e32d4272477 ] Use positive logic to check for RAS support. Rename the function to actually indicate what it is testing for. Essentially, make the function a predicate with the correct name. Cc: Stanley Yang <Stanley.Yang@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 6c2049066355 ("drm/amdgpu: Don't enable LTR if not supported") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOVJingwen Chen2-14/+8
commit 9a458402fb69bda886aa6cbe067311b6e3d9c52a upstream. [Why] This fixes 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit 892deb48269c breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Reviewed-by: Horace Chen <horace.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-28drm/amdgpu: make sure to init common IP before gmcAlex Deucher1-3/+11
[ Upstream commit a8671493d2074950553da3cf07d1be43185ef6c6 ] Move common IP init before GMC init so that HDP gets remapped before GMC init which uses it. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28drm/amdgpu: Separate vf2pf work item init from virt data exchangeVictor Skvortsov3-13/+30
[ Upstream commit 892deb48269c65376f3eeb5b4c032ff2c2979bd7 ] We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28drm/amdgpu: indirect register access for nv12 sriovPeng Ju Zhou2-0/+13
[ Upstream commit 8b8a162da820d48bb94261ae4684f2c839ce148c ] unify host driver and guest driver indirect access control bits names Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vegaAlex Deucher2-25/+5
[ Upstream commit e3163bc8ffdfdb405e10530b140135b2ee487f89 ] This mirrors what we do for other asics and this way we are sure the sdma doorbell range is properly initialized. There is a comment about the way doorbells on gfx9 work that requires that they are initialized for other IPs before GFX is initialized. However, the statement says that it applies to multimedia as well, but the VCN code currently initializes doorbells after GFX and there are no known issues there. In my testing at least I don't see any problems on SDMA. This is a prerequisite for fixing the Unsupported Request error reported through AER during driver load. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-20drm/amd/amdgpu: skip ucode loading if ucode_size == 0Chengming Gui1-1/+1
[ Upstream commit 39c84b8e929dbd4f63be7e04bf1a2bcd92b44177 ] Restrict the ucode loading check to avoid frontdoor loading error. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15drm/amdgpu: mmVM_L2_CNTL3 register not initialized correctlyQu Huang1-0/+1
[ Upstream commit b8983d42524f10ac6bf35bbce6a7cc8e45f61e04 ] The mmVM_L2_CNTL3 register is not assigned an initial value Signed-off-by: Qu Huang <jinsdb@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15drm/amdgpu: Check num_gfx_rings for gfx v9_0 rb setup.Candice Li1-1/+2
[ Upstream commit c351938350ab9b5e978dede2c321da43de7eb70c ] No need to set up rb when no gfx rings. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-15drm/amdgpu: Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to ↵YiPeng Chai2-1/+4
psp_hw_fini [ Upstream commit 9d705d7741ae70764f3d6d87e67fad3b5c30ffd0 ] V1: The amdgpu_xgmi_remove_device function will send unload command to psp through psp ring to terminate xgmi, but psp ring has been destroyed in psp_hw_fini. V2: 1. Change the commit title. 2. Restore amdgpu_xgmi_remove_device to its original calling location. Move psp_xgmi_terminate call from amdgpu_xgmi_remove_device to psp_hw_fini. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-05drm/amdgpu: Increase tlb flush timeout for sriovDusica Milinkovic3-3/+5
[ Upstream commit 373008bfc9cdb0f050258947fa5a095f0657e1bc ] [Why] During multi-vf executing benchmark (Luxmark) observed kiq error timeout. It happenes because all of VFs do the tlb invalidation at the same time. Although each VF has the invalidate register set, from hardware side the invalidate requests are queue to execute. [How] In case of 12 VF increase timeout on 12*100ms Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com> Acked-by: Shaoyun Liu <shaoyun.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21drm/amdgpu: Check BO's requested pinning domains against its preferred_domainsLeo Li1-0/+4
commit f5ba14043621f4afdf3ad5f92ee2d8dbebbe4340 upstream. When pinning a buffer, we should check to see if there are any additional restrictions imposed by bo->preferred_domains. This will prevent the BO from being moved to an invalid domain when pinning. For example, this can happen if the user requests to create a BO in GTT domain for display scanout. amdgpu_dm will allow pinning to either VRAM or GTT domains, since DCN can scanout from either or. However, in amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is adequate carveout. This can lead to pinning to VRAM despite the user requesting GTT placement for the BO. v2: Allow the kernel to override the domain, which can happen when exporting a BO to a V4L camera (for example). Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-07drm/amdgpu: To flush tlb for MMHUB of RAVEN seriesRuili Ji1-1/+2
commit 5cb0e3fb2c54eabfb3f932a1574bff1774946bc0 upstream. amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305) amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC) amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051 amdgpu: Faulty UTCL2 client ID: MP1 (0x0) amdgpu: MORE_FAULTS: 0x1 amdgpu: WALKER_ERROR: 0x0 amdgpu: PERMISSION_FAULTS: 0x5 amdgpu: MAPPING_ERROR: 0x0 amdgpu: RW: 0x1 When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0. There is page fault from MMHUB0. v2:fix indentation v3:change subject and fix indentation Signed-off-by: Ruili Ji <ruiliji2@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Reviewed-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.Dave Airlie1-1/+1
commit 31ab27b14daaa75541a415c6794d6f3567fea44a upstream. Submitting a cs with 0 chunks, causes an oops later, found trying to execute the wrong userspace driver. MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo [172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8 [172536.665188] #PF: supervisor read access in kernel mode [172536.665189] #PF: error_code(0x0000) - not-present page [172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0 [172536.665195] Oops: 0000 [#1] SMP NOPTI [172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P O 5.10.81 #1-NixOS [172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015 [172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu] [172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10 [172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246 [172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68 [172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38 [172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40 [172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28 [172536.665283] FS: 00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000 [172536.665284] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0 [172536.665287] Call Trace: [172536.665322] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665332] drm_ioctl_kernel+0xaa/0xf0 [drm] [172536.665338] drm_ioctl+0x201/0x3b0 [drm] [172536.665369] ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu] [172536.665372] ? selinux_file_ioctl+0x135/0x230 [172536.665399] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] [172536.665403] __x64_sys_ioctl+0x83/0xb0 [172536.665406] do_syscall_64+0x33/0x40 [172536.665409] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018 Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09drm/amdgpu/ucode: Remove firmware load type check in amdgpu_ucode_free_boAlice Wong1-2/+1
[ Upstream commit ab0cd4a9ae5b4679b714d8dbfedc0901fecdce9f ] When psp_hw_init failed, it will set the load_type to AMDGPU_FW_LOAD_DIRECT. During amdgpu_device_ip_fini, amdgpu_ucode_free_bo checks that load_type is AMDGPU_FW_LOAD_DIRECT and skips deallocating fw_buf causing memory leak. Remove load_type check in amdgpu_ucode_free_bo. Signed-off-by: Alice Wong <shiwei.wong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20drm/amdgpu: Enable gfxoff quirk on MacBook ProTomasz Moń1-0/+2
commit 4593c1b6d159f1e5c35c07a7f125e79e5a864302 upstream. Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Tomasz Moń <desowin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-20drm/amdgpu/vcn: improve vcn dpg stop procedureTianci Yin1-0/+3
[ Upstream commit 6ea239adc2a712eb318f04f5c29b018ba65ea38a ] Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming. Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Tianci Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20drm/amdkfd: Fix Incorrect VMIDs passed to HWSTushar Patel1-1/+1
[ Upstream commit b7dfbd2e601f3fee545bc158feceba4f340fe7cf ] Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <tushar.patel@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20drm/amd: Add USBC connector IDAurabindo Pillai1-0/+1
[ Upstream commit c5c948aa894a831f96fccd025e47186b1ee41615 ] [Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-20drm/amdkfd: Use drm_priv to pass VM from KFD to amdgpuFelix Kuehling1-3/+7
commit b40a6ab2cf9213923bf8e821ce7fa7f6a0a26990 upstream. amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap to access the BO through the corresponding file descriptor. The VM can also be extracted from drm_priv, so drm_priv can replace the vm parameter in the kfd2kgd interface. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [This is a partial cherry-pick of the upstream commit.] Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-13drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire()Dan Carpenter1-1/+1
[ Upstream commit 1647b54ed55d4d48c7199d439f8834626576cbe9 ] This post-op should be a pre-op so that we do not pass -1 as the bit number to test_bit(). The current code will loop downwards from 63 to -1. After changing to a pre-op, it loops from 63 to 0. Fixes: 71c37505e7ea ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13drm/amdgpu: Fix recursive locking warningRajneesh Bhardwaj1-1/+2
[ Upstream commit 447c7997b62a5115ba4da846dcdee4fc12298a6a ] Noticed the below warning while running a pytorch workload on vega10 GPUs. Change to trylock to avoid conflicts with already held reservation locks. [ +0.000003] WARNING: possible recursive locking detected [ +0.000003] 5.13.0-kfd-rajneesh #1030 Not tainted [ +0.000004] -------------------------------------------- [ +0.000002] python/4822 is trying to acquire lock: [ +0.000004] ffff932cd9a259f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000203] but task is already holding lock: [ +0.000003] ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000017] other info that might help us debug this: [ +0.000002] Possible unsafe locking scenario: [ +0.000003] CPU0 [ +0.000002] ---- [ +0.000002] lock(reservation_ww_class_mutex); [ +0.000004] lock(reservation_ww_class_mutex); [ +0.000003] *** DEADLOCK *** [ +0.000002] May be due to missing lock nesting notation [ +0.000003] 7 locks held by python/4822: [ +0.000003] #0: ffff932c4ac028d0 (&process->mutex){+.+.}-{3:3}, at: kfd_ioctl_map_memory_to_gpu+0x10b/0x320 [amdgpu] [ +0.000232] #1: ffff932c55e830a8 (&info->lock#2){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x64/0xf60 [amdgpu] [ +0.000241] #2: ffff932cc45b5e68 (&(*mem)->lock){+.+.}-{3:3}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0xdf/0xf60 [amdgpu] [ +0.000236] #3: ffffb2b35606fd28 (reservation_ww_class_acquire){+.+.}-{0:0}, at: amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x232/0xf60 [amdgpu] [ +0.000235] #4: ffff932cbb7181f8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: ttm_eu_reserve_buffers+0x270/0x470 [ttm] [ +0.000015] #5: ffffffffc045f700 (*(sspp++)){....}-{0:0}, at: drm_dev_enter+0x5/0xa0 [drm] [ +0.000038] #6: ffff932c52da7078 (&vm->eviction_lock){+.+.}-{3:3}, at: amdgpu_vm_bo_update_mapping+0xd5/0x4f0 [amdgpu] [ +0.000195] stack backtrace: [ +0.000003] CPU: 11 PID: 4822 Comm: python Not tainted 5.13.0-kfd-rajneesh #1030 [ +0.000005] Hardware name: GIGABYTE MZ01-CE0-00/MZ01-CE0-00, BIOS F02 08/29/2018 [ +0.000003] Call Trace: [ +0.000003] dump_stack+0x6d/0x89 [ +0.000010] __lock_acquire+0xb93/0x1a90 [ +0.000009] lock_acquire+0x25d/0x2d0 [ +0.000005] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000184] __ww_mutex_lock.constprop.17+0xca/0x1060 [ +0.000007] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] ? lock_release+0x13f/0x270 [ +0.000005] ? lock_is_held_type+0xa2/0x110 [ +0.000006] ? amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000183] amdgpu_bo_release_notify+0xc4/0x160 [amdgpu] [ +0.000185] ttm_bo_release+0x4c6/0x580 [ttm] [ +0.000010] amdgpu_bo_unref+0x1a/0x30 [amdgpu] [ +0.000183] amdgpu_vm_free_table+0x76/0xa0 [amdgpu] [ +0.000189] amdgpu_vm_free_pts+0xb8/0xf0 [amdgpu] [ +0.000189] amdgpu_vm_update_ptes+0x411/0x770 [amdgpu] [ +0.000191] amdgpu_vm_bo_update_mapping+0x324/0x4f0 [amdgpu] [ +0.000191] amdgpu_vm_bo_update+0x251/0x610 [amdgpu] [ +0.000191] update_gpuvm_pte+0xcc/0x290 [amdgpu] [ +0.000229] ? amdgpu_vm_bo_map+0xd7/0x130 [amdgpu] [ +0.000190] amdgpu_amdkfd_gpuvm_map_memory_to_gpu+0x912/0xf60 [amdgpu] [ +0.000234] kfd_ioctl_map_memory_to_gpu+0x182/0x320 [amdgpu] [ +0.000218] kfd_ioctl+0x2b9/0x600 [amdgpu] [ +0.000216] ? kfd_ioctl_unmap_memory_from_gpu+0x270/0x270 [amdgpu] [ +0.000216] ? lock_release+0x13f/0x270 [ +0.000006] ? __fget_files+0x107/0x1e0 [ +0.000007] __x64_sys_ioctl+0x8b/0xd0 [ +0.000007] do_syscall_64+0x36/0x70 [ +0.000004] entry_SYSCALL_64_after_hwframe+0x44/0xae [ +0.000007] RIP: 0033:0x7fbff90a7317 [ +0.000004] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 [ +0.000005] RSP: 002b:00007fbe301fe648 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000006] RAX: ffffffffffffffda RBX: 00007fbcc402d820 RCX: 00007fbff90a7317 [ +0.000003] RDX: 00007fbe301fe690 RSI: 00000000c0184b18 RDI: 0000000000000004 [ +0.000003] RBP: 00007fbe301fe690 R08: 0000000000000000 R09: 00007fbcc402d880 [ +0.000003] R10: 0000000002001000 R11: 0000000000000246 R12: 00000000c0184b18 [ +0.000003] R13: 0000000000000004 R14: 00007fbf689593a0 R15: 00007fbcc402d820 Cc: Christian König <christian.koenig@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13drm/amd/amdgpu/amdgpu_cs: fix refcount leak of a dma_fence objXin Xiong1-0/+1
[ Upstream commit dfced44f122c500004a48ecc8db516bb6a295a1b ] This issue takes place in an error path in amdgpu_cs_fence_to_handle_ioctl(). When `info->in.what` falls into default case, the function simply returns -EINVAL, forgetting to decrement the reference count of a dma_fence obj, which is bumped earlier by amdgpu_cs_get_fence(). This may result in reference count leaks. Fix it by decreasing the refcount of specific object before returning the error code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08drm/amdgpu: fix suspend/resume hang regressionQiang Yu1-1/+2
[ Upstream commit f1ef17011c765495c876fa75435e59eecfdc1ee4 ] Regression has been reported that suspend/resume may hang with the previous vm ready check commit. So bring back the evicted list check as a temp fix. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922 Fixes: c1a66c3bc425 ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-08drm/amdgpu: check vm ready by amdgpu_vm->evicting flagQiang Yu1-2/+7
[ Upstream commit c1a66c3bc425ff93774fb2f6eefa67b83170dd7e ] Workstation application ANSA/META v21.1.4 get this error dmesg when running CI test suite provided by ANSA/META: [drm:amdgpu_gem_va_ioctl [amdgpu]] *ERROR* Couldn't update BO_VA (-16) This is caused by: 1. create a 256MB buffer in invisible VRAM 2. CPU map the buffer and access it causes vm_fault and try to move it to visible VRAM 3. force visible VRAM space and traverse all VRAM bos to check if evicting this bo is valuable 4. when checking a VM bo (in invisible VRAM), amdgpu_vm_evictable() will set amdgpu_vm->evicting, but latter due to not in visible VRAM, won't really evict it so not add it to amdgpu_vm->evicted 5. before next CS to clear the amdgpu_vm->evicting, user VM ops ioctl will pass amdgpu_vm_ready() (check amdgpu_vm->evicted) but fail in amdgpu_vm_bo_update_mapping() (check amdgpu_vm->evicting) and get this error log This error won't affect functionality as next CS will finish the waiting VM ops. But we'd better clear the error log by checking the amdgpu_vm->evicting flag in amdgpu_vm_ready() to stop calling amdgpu_vm_bo_update_mapping() later. Another reason is amdgpu_vm->evicted list holds all BOs (both user buffer and page table), but only page table BOs' eviction prevent VM ops. amdgpu_vm->evicting flag is set only for page table BOs, so we should use evicting flag instead of evicted list in amdgpu_vm_ready(). The side effect of this change is: previously blocked VM op (user buffer in "evicted" list but no page table in it) gets done immediately. v2: update commit comments. Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Qiang Yu <qiang.yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-03-02drm/amdgpu: disable MMHUB PG for PicassoEvan Quan1-1/+4
commit f626dd0ff05043e5a7154770cc7cda66acee33a3 upstream. MMHUB PG needs to be disabled for Picasso for stability reasons. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-23drm/amdgpu: fix logic inversion in checkChristian König1-1/+1
[ Upstream commit e8ae38720e1a685fd98cfa5ae118c9d07b45ca79 ] We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-02-16drm/amdgpu: Set a suitable dev_info.gart_page_sizeHuacai Chen1-2/+2
commit f4d3da72a76a9ce5f57bba64788931686a9dc333 upstream. In Mesa, dev_info.gart_page_size is used for alignment and it was set to AMDGPU_GPU_PAGE_SIZE(4KB). However, the page table of AMDGPU driver requires an alignment on CPU pages. So, for non-4KB page system, gart_page_size should be max_t(u32, PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE). Signed-off-by: Rui Wang <wangr@lemote.com> Signed-off-by: Huacai Chen <chenhc@lemote.com> Link: https://github.com/loongson-community/linux-stable/commit/caa9c0a1 [Xi: rebased for drm-next, use max_t for checkpatch, and reworded commit message.] Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1549 Tested-by: Dan Horák <dan@danny.cz> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [Salvatore Bonaccorso: Backport to 5.10.y which does not contain a5a52a43eac0 ("drm/amd/amdgpu/amdgpu_kms: Remove 'struct drm_amdgpu_info_device dev_info' from the stack") which removes dev_info from the stack and places it on the heap.] Tested-by: Timothy Pearson <tpearson@raptorengineering.com> Signed-off-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27drm/amdgpu: fixup bad vram size on gmc v8Zongmin Zhou1-3/+10
[ Upstream commit 11544d77e3974924c5a9c8a8320b996a3e9b2f8b ] Some boards(like RX550) seem to have garbage in the upper 16 bits of the vram size register. Check for this and clamp the size properly. Fixes boards reporting bogus amounts of vram. after add this patch,the maximum GPU VRAM size is 64GB, otherwise only 64GB vram size will be used. Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27drm/amdgpu/display: set vblank_disable_immediate for DCAlex Deucher1-1/+0
[ Upstream commit 92020e81ddbeac351ea4a19bcf01743f32b9c800 ] Disable vblanks immediately to save power. I think this was missed when we merged DC support. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1781 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode()Zhou Qingyang1-0/+6
[ Upstream commit b220110e4cd442156f36e1d9b4914bb9e87b0d00 ] In amdgpu_connector_lcd_native_mode(), the return value of drm_mode_duplicate() is assigned to mode, and there is a dereference of it in amdgpu_connector_lcd_native_mode(), which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Fix this bug add a check of mode. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_DRM_AMDGPU=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-05drm/amdgpu: add support for IP discovery gc_info table v2Alex Deucher1-22/+54
commit 5e713c6afa34c0fd6f113bf7bb1c2847172d7b20 upstream. Used on gfx9 based systems. Fixes incorrect CU counts reported in the kernel log. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1833 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-05drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly ↵chen gong1-0/+7
enabled commit b7865173cf6ae59942e2c69326a06e1c1df5ecf6 upstream. Play a video on the raven (or PCO, raven2) platform, and then do the S3 test. When resume, the following error will be reported: amdgpu 0000:02:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec test failed (-110) [drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <vcn_v1_0> failed -110 amdgpu 0000:02:00.0: amdgpu: amdgpu_device_ip_resume failed (-110). PM: dpm_run_callback(): pci_pm_resume+0x0/0x90 returns -110 [why] When playing the video: The power state flag of the vcn block is set to POWER_STATE_ON. When doing suspend: There is no change to the power state flag of the vcn block, it is still POWER_STATE_ON. When doing resume: Need to open the power gate of the vcn block and set the power state flag of the VCN block to POWER_STATE_ON. But at this time, the power state flag of the vcn block is already POWER_STATE_ON. The power status flag check in the "8f2cdef drm/amd/pm: avoid duplicate powergate/ungate setting" patch will return the amdgpu_dpm_set_powergating_by_smu function directly. As a result, the gate of the power was not opened, causing the subsequent ring test to fail. [how] In the suspend function of the vcn block, explicitly change the power state flag of the vcn block to POWER_STATE_OFF. BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828 Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORELe Ma1-2/+2
commit f3a8076eb28cae1553958c629aecec479394bbe2 upstream. should count on GC IP base address Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amdkfd: fix boot failure when iommu is disabled in Picasso.Yifan Zhang1-4/+0
commit afd18180c07026f94a80ff024acef5f4159084a4 upstream. When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling <youling257@gmail.com> Tested-by: youling <youling257@gmail.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amdgpu: init iommu after amdkfd device initYifan Zhang1-4/+4
commit 714d9e4574d54596973ee3b0624ee4a16264d700 upstream. This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Tested-by: James Zhu <James.Zhu@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+12
commit f02abeb0779700c308e661a412451b38962b8a0b upstream. Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amdgpu: add amdgpu_amdkfd_resume_iommuJames Zhu2-0/+11
commit 8066008482e533e91934bee49765bf8b4a7c40db upstream. Add amdgpu_amdkfd_resume_iommu for amdgpu. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amdkfd: separate kfd_iommu_resume from kfd_resumeJames Zhu1-0/+6
commit fefc01f042f44ede373ee66773b8238dd8fdcb55 upstream. Separate kfd_iommu_resume from kfd_resume for fine-tuning of amdgpu device init/resume/reset/recovery sequence. v2: squash in fix for !CONFIG_HSA_AMD Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-14drm/amd/amdkfd: adjust dummy functions' placementLang Yu2-106/+119
commit cd63989e0e6aa2eb66b461f2bae769e2550e47ac upstream. Move all the dummy functions in amdgpu_amdkfd.c to amdgpu_amdkfd.h as inline functions. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-08drm/amd/amdgpu: fix potential memleakBernard Zhao1-0/+1
[ Upstream commit 27dfaedc0d321b4ea4e10c53e4679d6911ab17aa ] In function amdgpu_get_xgmi_hive, when kobject_init_and_add failed There is a potential memleak if not call kobject_put. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01drm/amdgpu/gfx9: switch to golden tsc registers for renoir+Alex Deucher1-11/+35
commit 53af98c091bc42fd9ec64cfabc40da4e5f3aae93 upstream. Renoir and newer gfx9 APUs have new TSC register that is not part of the gfxoff tile, so it can be read without needing to disable gfx off. Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-26drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga ↵hongao1-0/+1
and dvi connectors commit bf552083916a7f8800477b5986940d1c9a31b953 upstream. amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bitsAlex Deucher1-2/+2
[ Upstream commit 403475be6d8b122c3e6b8a47e075926d7299e5ef ] The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: fix warning for overflow checkArnd Bergmann2-2/+2
[ Upstream commit 335aea75b0d95518951cad7c4c676e6f1c02c150 ] The overflow check in amdgpu_bo_list_create() causes a warning with clang-14 on 64-bit architectures, since the limit can never be exceeded. drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list)) ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The check remains useful for 32-bit architectures, so just avoid the warning by using size_t as the type for the count. Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/amdgpu: Fix MMIO access page faultAndrey Grodzovsky2-8/+17
[ Upstream commit c03509cbc01559549700e14c4a6239f2572ab4ba ] Add more guards to MMIO access post device unbind/unplug Bug: https://bugs.archlinux.org/task/72092?project=1&order=dateopened&sort=desc&pagenum=1 Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-17drm/amdgpu: fix gart.bo pin_count leakLeslie Shi2-2/+4
[ Upstream commit 66805763a97f8f7bdf742fc0851d85c02ed9411f ] gmc_v{9,10}_0_gart_disable() isn't called matched with correspoding gart_enbale function in SRIOV case. This will lead to gart.bo pin_count leak on driver unload. Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-06drm/amdgpu: correct initial cp_hqd_quantum for gfx9Hawking Zhang1-1/+1
commit 9f52c25f59b504a29dda42d83ac1e24d2af535d4 upstream. didn't read the value of mmCP_HQD_QUANTUM from correct register offset Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10Ernst Sjöstrand1-1/+1
commit 67a44e659888569a133a8f858c8230e9d7aad1d5 upstream. Seems like newer cards can have even more instances now. Found by UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29 index 8 is out of range for type 'uint32_t *[8]' Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697 Cc: stable@vger.kernel.org Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>