summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2022-02-23drm/amdgpu: fix logic inversion in checkChristian König1-1/+1
[ Upstream commit e8ae38720e1a685fd98cfa5ae118c9d07b45ca79 ] We probably never trigger this, but the logic inside the check is inverted. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27drm/amdgpu: fixup bad vram size on gmc v8Zongmin Zhou1-3/+10
[ Upstream commit 11544d77e3974924c5a9c8a8320b996a3e9b2f8b ] Some boards(like RX550) seem to have garbage in the upper 16 bits of the vram size register. Check for this and clamp the size properly. Fixes boards reporting bogus amounts of vram. after add this patch,the maximum GPU VRAM size is 64GB, otherwise only 64GB vram size will be used. Signed-off-by: Zongmin Zhou<zhouzongmin@kylinos.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-27drm/amdgpu: Fix a NULL pointer dereference in amdgpu_connector_lcd_native_mode()Zhou Qingyang1-0/+6
[ Upstream commit b220110e4cd442156f36e1d9b4914bb9e87b0d00 ] In amdgpu_connector_lcd_native_mode(), the return value of drm_mode_duplicate() is assigned to mode, and there is a dereference of it in amdgpu_connector_lcd_native_mode(), which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Fix this bug add a check of mode. This bug was found by a static analyzer. The analysis employs differential checking to identify inconsistent security operations (e.g., checks or kfrees) between two code paths and confirms that the inconsistent operations are not recovered in the current function or the callers, so they constitute bugs. Note that, as a bug found by static analysis, it can be a false positive or hard to trigger. Multiple researchers have cross-reviewed the bug. Builds with CONFIG_DRM_AMDGPU=m show no new warnings, and our static analyzer no longer warns about this code. Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-22drm/amdgpu: correct register access for RLC_JUMP_TABLE_RESTORELe Ma1-2/+2
commit f3a8076eb28cae1553958c629aecec479394bbe2 upstream. should count on GC IP base address Signed-off-by: Le Ma <le.ma@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-26drm/amdgpu: fix set scaling mode Full/Full aspect/Center not works on vga ↵hongao1-0/+1
and dvi connectors commit bf552083916a7f8800477b5986940d1c9a31b953 upstream. amdgpu_connector_vga_get_modes missed function amdgpu_get_native_mode which assign amdgpu_encoder->native_mode with *preferred_mode result in amdgpu_encoder->native_mode.clock always be 0. That will cause amdgpu_connector_set_property returned early on: if ((rmx_type != DRM_MODE_SCALE_NONE) && (amdgpu_encoder->native_mode.clock == 0)) when we try to set scaling mode Full/Full aspect/Center. Add the missing function to amdgpu_connector_vga_get_mode can fix this. It also works on dvi connectors because amdgpu_connector_dvi_helper_funcs.get_mode use the same method. Signed-off-by: hongao <hongao@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-17drm/amdgpu/gmc6: fix DMA mask from 44 to 40 bitsAlex Deucher1-2/+2
[ Upstream commit 403475be6d8b122c3e6b8a47e075926d7299e5ef ] The DMA mask on SI parts is 40 bits not 44. Copy paste typo. Fixes: 244511f386ccb9 ("drm/amdgpu: simplify and cleanup setting the dma mask") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1762 Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-17drm/amdgpu: fix warning for overflow checkArnd Bergmann2-2/+2
[ Upstream commit 335aea75b0d95518951cad7c4c676e6f1c02c150 ] The overflow check in amdgpu_bo_list_create() causes a warning with clang-14 on 64-bit architectures, since the limit can never be exceeded. drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c:74:18: error: result of comparison of constant 256204778801521549 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if (num_entries > (SIZE_MAX - sizeof(struct amdgpu_bo_list)) ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The check remains useful for 32-bit architectures, so just avoid the warning by using size_t as the type for the count. Fixes: 920990cb080a ("drm/amdgpu: allocate the bo_list array after the list") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-10-17drm/amdgpu: fix gart.bo pin_count leakLeslie Shi2-2/+4
[ Upstream commit 66805763a97f8f7bdf742fc0851d85c02ed9411f ] gmc_v{9,10}_0_gart_disable() isn't called matched with correspoding gart_enbale function in SRIOV case. This will lead to gart.bo pin_count leak on driver unload. Cc: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-22drm/amd/amdgpu: Increase HWIP_MAX_INSTANCE to 10Ernst Sjöstrand1-1/+1
commit 67a44e659888569a133a8f858c8230e9d7aad1d5 upstream. Seems like newer cards can have even more instances now. Found by UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c:318:29 index 8 is out of range for type 'uint32_t *[8]' Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1697 Cc: stable@vger.kernel.org Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22drm/amdgpu: Fix BUG_ON assertAndrey Grodzovsky1-1/+1
commit ea7acd7c5967542353430947f3faf699e70602e5 upstream. With added CPU domain to placement you can have now 3 placemnts at once. CC: stable@kernel.org Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622162339.761651-5-andrey.grodzovsky@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable ↵Tuo Li1-1/+1
access in amdgpu_i2c_router_select_ddc_port() [ Upstream commit a211260c34cfadc6068fece8c9e99e0fe1e2a2b6 ] The variable val is declared without initialization, and its address is passed to amdgpu_i2c_get_byte(). In this function, the value of val is accessed in: DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n", addr, *val); Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized, but it is accessed in: val &= ~amdgpu_connector->router.ddc_mux_control_pin; To fix this possible uninitialized-variable access, initialize val to 0 in amdgpu_i2c_router_select_ddc_port(). Reported-by: TOTE Robot <oslab@tsinghua.edu.cn> Signed-off-by: Tuo Li <islituo@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-22drm/amdgpu: Fix amdgpu_ras_eeprom_init()Luben Tuikov1-1/+1
[ Upstream commit dce4400e6516d18313d23de45b5be8a18980b00e ] No need to account for the 2 bytes of EEPROM address--this is now well abstracted away by the fixes the the lower layers. Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alexander Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15drm/amdgpu/acp: Make PM domain really workKai-Heng Feng1-28/+26
[ Upstream commit aff890288de2d818e4f83ec40c9315e2d735df07 ] Devices created by mfd_add_hotplug_devices() don't really increase the index of its name, so get_mfd_cell_dev() cannot find any device, hence a NULL dev is passed to pm_genpd_add_device(): [ 56.974926] (NULL device *): amdgpu: device acp_audio_dma.0.auto added to pm domain [ 56.974933] (NULL device *): amdgpu: Failed to add dev to genpd [ 56.974941] [drm:amdgpu_device_ip_init [amdgpu]] *ERROR* hw_init of IP block <acp_ip> failed -22 [ 56.975810] amdgpu 0000:00:01.0: amdgpu: amdgpu_device_ip_init failed [ 56.975839] amdgpu 0000:00:01.0: amdgpu: Fatal error during GPU init [ 56.977136] ------------[ cut here ]------------ [ 56.977143] kernel BUG at mm/slub.c:4206! [ 56.977158] invalid opcode: 0000 [#1] SMP NOPTI [ 56.977167] CPU: 1 PID: 1648 Comm: modprobe Not tainted 5.12.0-051200rc8-generic #202104182230 [ 56.977175] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./FM2A68M-HD+, BIOS P5.20 02/13/2019 [ 56.977180] RIP: 0010:kfree+0x3bf/0x410 [ 56.977195] Code: 89 e7 48 d3 e2 f7 da e8 5f 0d 02 00 80 e7 02 75 3e 44 89 ee 4c 89 e7 e8 ef 5f fd ff e9 fa fe ff ff 49 8b 44 24 08 a8 01 75 b7 <0f> 0b 4c 8b 4d b0 48 8b 4d a8 48 89 da 4c 89 e6 41 b8 01 00 00 00 [ 56.977202] RSP: 0018:ffffa48640ff79f0 EFLAGS: 00010246 [ 56.977210] RAX: 0000000000000000 RBX: ffff9286127d5608 RCX: 0000000000000000 [ 56.977215] RDX: 0000000000000000 RSI: ffffffffc099d0fb RDI: ffff9286127d5608 [ 56.977220] RBP: ffffa48640ff7a48 R08: 0000000000000001 R09: 0000000000000001 [ 56.977224] R10: 0000000000000000 R11: ffff9286087d8458 R12: fffff3ae0449f540 [ 56.977229] R13: 0000000000000000 R14: dead000000000122 R15: dead000000000100 [ 56.977234] FS: 00007f9de5929540(0000) GS:ffff928612e80000(0000) knlGS:0000000000000000 [ 56.977240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 56.977245] CR2: 00007f697dd97160 CR3: 00000001110f0000 CR4: 00000000001506e0 [ 56.977251] Call Trace: [ 56.977261] amdgpu_dm_encoder_destroy+0x1b/0x30 [amdgpu] [ 56.978056] drm_mode_config_cleanup+0x4f/0x2e0 [drm] [ 56.978147] ? kfree+0x3dd/0x410 [ 56.978157] ? drm_managed_release+0xc8/0x100 [drm] [ 56.978232] drm_mode_config_init_release+0xe/0x10 [drm] [ 56.978311] drm_managed_release+0x9d/0x100 [drm] [ 56.978388] devm_drm_dev_init_release+0x4d/0x70 [drm] [ 56.978450] devm_action_release+0x15/0x20 [ 56.978459] release_nodes+0x77/0xc0 [ 56.978469] devres_release_all+0x3f/0x50 [ 56.978477] really_probe+0x245/0x460 [ 56.978485] driver_probe_device+0xe9/0x160 [ 56.978492] device_driver_attach+0xab/0xb0 [ 56.978499] __driver_attach+0x8f/0x150 [ 56.978506] ? device_driver_attach+0xb0/0xb0 [ 56.978513] bus_for_each_dev+0x7e/0xc0 [ 56.978521] driver_attach+0x1e/0x20 [ 56.978528] bus_add_driver+0x135/0x1f0 [ 56.978534] driver_register+0x91/0xf0 [ 56.978540] __pci_register_driver+0x54/0x60 [ 56.978549] amdgpu_init+0x77/0x1000 [amdgpu] [ 56.979246] ? 0xffffffffc0dbc000 [ 56.979254] do_one_initcall+0x48/0x1d0 [ 56.979265] ? kmem_cache_alloc_trace+0x120/0x230 [ 56.979274] ? do_init_module+0x28/0x280 [ 56.979282] do_init_module+0x62/0x280 [ 56.979288] load_module+0x71c/0x7a0 [ 56.979296] __do_sys_finit_module+0xc2/0x120 [ 56.979305] __x64_sys_finit_module+0x1a/0x20 [ 56.979311] do_syscall_64+0x38/0x90 [ 56.979319] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 56.979328] RIP: 0033:0x7f9de54f989d [ 56.979335] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c3 f5 0c 00 f7 d8 64 89 01 48 [ 56.979342] RSP: 002b:00007ffe3c395a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 56.979350] RAX: ffffffffffffffda RBX: 0000560df3ef4330 RCX: 00007f9de54f989d [ 56.979355] RDX: 0000000000000000 RSI: 0000560df3a07358 RDI: 000000000000000f [ 56.979360] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000000 [ 56.979365] R10: 000000000000000f R11: 0000000000000246 R12: 0000560df3a07358 [ 56.979369] R13: 0000000000000000 R14: 0000560df3ef4460 R15: 0000560df3ef4330 [ 56.979377] Modules linked in: amdgpu(+) iommu_v2 gpu_sched drm_ttm_helper ttm drm_kms_helper cec rc_core i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt nft_counter xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set nf_tables libcrc32c nfnetlink ip6_tables iptable_filter bpfilter input_leds binfmt_misc edac_mce_amd kvm_amd ccp kvm snd_hda_codec_realtek snd_hda_codec_generic crct10dif_pclmul snd_hda_codec_hdmi ledtrig_audio ghash_clmulni_intel aesni_intel snd_hda_intel snd_intel_dspcfg snd_seq_midi crypto_simd snd_intel_sdw_acpi cryptd snd_hda_codec snd_seq_midi_event snd_rawmidi snd_hda_core snd_hwdep snd_seq fam15h_power k10temp snd_pcm snd_seq_device snd_timer snd mac_hid soundcore sch_fq_codel nct6775 hwmon_vid drm ip_tables x_tables autofs4 dm_mirror dm_region_hash dm_log hid_generic usbhid hid uas usb_storage r8169 crc32_pclmul realtek ahci xhci_pci i2c_piix4 [ 56.979521] xhci_pci_renesas libahci video [ 56.979541] ---[ end trace cb8f6a346f18da7b ]--- Instead of finding MFD hotplugged device by its name, simply iterate over the child devices to avoid the issue. Squash in unused variable removal (Alex) BugLink: https://bugs.launchpad.net/bugs/1920674 Fixes: 25030321ba28 ("drm/amd: add pm domain for ACP IP sub blocks") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19drm/amdkfd: use allowed domain for vmbo validationNirmoy Das1-17/+4
[ Upstream commit bc05716d4fdd065013633602c5960a2bf1511b9c ] Fixes handling when page tables are in system memory. v3: remove struct amdgpu_vm_parser. v2: remove unwanted variable. change amdgpu_amdkfd_validate instead of amdgpu_amdkfd_bo_validate. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-19drm/amd/amdgpu/sriov disable all ip hw status by defaultJack Zhang1-1/+1
[ Upstream commit 95ea3dbc4e9548d35ab6fbf67675cef8c293e2f5 ] Disable all ip's hw status to false before any hw_init. Only set it to true until its hw_init is executed. The old 5.9 branch has this change but somehow the 5.11 kernrel does not have this fix. Without this change, sriov tdr have gfx IB test fail. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Review-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-30Revert "drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full ↵Yifan Zhang1-5/+1
doorbell." commit baacf52a473b24e10322b67757ddb92ab8d86717 upstream. This reverts commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3. Reason for revert: Side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-30Revert "drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue."Yifan Zhang1-5/+1
commit ee5468b9f1d3bf48082eed351dace14598e8ca39 upstream. This reverts commit 4cbbe34807938e6e494e535a68d5ff64edac3f20. Reason for revert: side effect of enlarging CP_MEC_DOORBELL_RANGE may cause some APUs fail to enter gfxoff in certain user cases. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue.Yifan Zhang1-1/+5
commit 4cbbe34807938e6e494e535a68d5ff64edac3f20 upstream. If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-23drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell.Yifan Zhang1-1/+5
commit 1c0b0efd148d5b24c4932ddb3fa03c8edd6097b3 upstream. If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10drm/amdgpu: make sure we unpin the UVD BONirmoy Das1-0/+1
commit 07438603a07e52f1c6aa731842bd298d2725b7be upstream. Releasing pinned BOs is illegal now. UVD 6 was missing from: commit 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Fixes: 2f40801dc553 ("drm/amdgpu: make sure we unpin the UVD BO") Cc: stable@vger.kernel.org Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-10drm/amdgpu: Don't query CE and UE errorsLuben Tuikov1-16/+0
commit dce3d8e1d070900e0feeb06787a319ff9379212c upstream. On QUERY2 IOCTL don't query counts of correctable and uncorrectable errors, since when RAS is enabled and supported on Vega20 server boards, this takes insurmountably long time, in O(n^3), which slows the system down to the point of it being unusable when we have GUI up. Fixes: ae363a212b14 ("drm/amdgpu: Add a new flag to AMDGPU_CTX_OP_QUERY_STATE2") Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03drm/amd/amdgpu: fix a potential deadlock in gpu resetLang Yu1-1/+0
[ Upstream commit 9c2876d56f1ce9b6b2072f1446fb1e8d1532cb3d ] When amdgpu_ib_ring_tests failed, the reset logic called amdgpu_device_ip_suspend twice, then deadlock occurred. Deadlock log: [ 805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110). [ 806.290952] [drm] free PSP TMR buffer [ 806.319406] ============================================ [ 806.320315] WARNING: possible recursive locking detected [ 806.321225] 5.11.0-custom #1 Tainted: G W OEL [ 806.322135] -------------------------------------------- [ 806.323043] cat/2593 is trying to acquire lock: [ 806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.325668] but task is already holding lock: [ 806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.328430] other info that might help us debug this: [ 806.329539] Possible unsafe locking scenario: [ 806.330549] CPU0 [ 806.330983] ---- [ 806.331416] lock(&adev->dm.dc_lock); [ 806.332086] lock(&adev->dm.dc_lock); [ 806.332738] *** DEADLOCK *** [ 806.333747] May be due to missing lock nesting notation [ 806.334899] 3 locks held by cat/2593: [ 806.335537] #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110 [ 806.337009] #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu] [ 806.339018] #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.340869] stack backtrace: [ 806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G W OEL 5.11.0-custom #1 [ 806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020 [ 806.344413] Call Trace: [ 806.344849] dump_stack+0x93/0xbd [ 806.345435] __lock_acquire.cold+0x18a/0x2cf [ 806.346179] lock_acquire+0xca/0x390 [ 806.346807] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.347813] __mutex_lock+0x9b/0x930 [ 806.348454] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.349434] ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu] [ 806.350581] ? _raw_spin_unlock_irqrestore+0x47/0x50 [ 806.351437] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.352437] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.353252] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.354064] mutex_lock_nested+0x1b/0x20 [ 806.354747] ? mutex_lock_nested+0x1b/0x20 [ 806.355457] dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.356427] ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu] [ 806.357736] amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu] [ 806.360394] amdgpu_device_ip_suspend+0x21/0x70 [amdgpu] [ 806.362926] amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu] [ 806.365560] amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu] Signed-off-by: Lang Yu <Lang.Yu@amd.com> Acked-by: Christian KÃnig <christian.koenig@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03drm/amdgpu: Fix a use-after-freexinhui pan1-0/+1
[ Upstream commit 1e5c37385097c35911b0f8a0c67ffd10ee1af9a2 ] looks like we forget to set ttm->sg to NULL. Hit panic below [ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 1235.989074] Call Trace: [ 1235.991751] sg_free_table+0x17/0x20 [ 1235.995667] amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu] [ 1236.002288] amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu] [ 1236.008464] ttm_tt_destroy+0x1e/0x30 [ttm] [ 1236.013066] ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm] [ 1236.018783] ttm_bo_release+0x262/0xa50 [ttm] [ 1236.023547] ttm_bo_put+0x82/0xd0 [ttm] [ 1236.027766] amdgpu_bo_unref+0x26/0x50 [amdgpu] [ 1236.032809] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu] [ 1236.040400] kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu] [ 1236.046912] kfd_ioctl+0x463/0x690 [amdgpu] Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03drm/amd/amdgpu: fix refcount leakJingwen Chen1-0/+3
[ Upstream commit fa7e6abc75f3d491bc561734312d065dc9dc2a77 ] [Why] the gem object rfb->base.obj[0] is get according to num_planes in amdgpufb_create, but is not put according to num_planes [How] put rfb->base.obj[0] in amdgpu_fbdev_destroy according to num_planes Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-03drm/amdgpu/vcn2.5: add cancel_delayed_work_sync before power gateJames Zhu1-0/+2
commit 2fb536ea42d557f39f70c755f68e1aa1ad466c55 upstream. Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03drm/amdgpu/vcn2.0: add cancel_delayed_work_sync before power gateJames Zhu1-0/+2
commit 0c6013377b4027e69d8f3e63b6bf556b6cb87802 upstream. Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-06-03drm/amdgpu/vcn1: add cancel_delayed_work_sync before power gateJames Zhu1-1/+5
commit b95f045ea35673572ef46d6483ad8bd6d353d63c upstream. Add cancel_delayed_work_sync before set power gating state to avoid race condition issue when power gating. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26drm/amdgpu: update sdma golden setting for Navi12Guchun Chen1-0/+4
commit 77194d8642dd4cb7ea8ced77bfaea55610574c38 upstream. Current golden setting is out of date. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26drm/amdgpu: update gc golden setting for Navi12Guchun Chen1-2/+4
commit 99c45ba5799d6b938bd9bd20edfeb6f3e3e039b9 upstream. Current golden setting is out of date. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-26drm/amdgpu: disable 3DCGCG on picasso/raven1 to avoid compute hangChangfeng2-5/+7
commit dbd1003d1252db5973dddf20b24bb0106ac52aa2 upstream. There is problem with 3DCGCG firmware and it will cause compute test hang on picasso/raven1. It needs to disable 3DCGCG in driver to avoid compute hang. Signed-off-by: Changfeng <Changfeng.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11drm/amdgpu: fix NULL pointer dereferenceGuchun Chen1-1/+1
[ Upstream commit 3c3dc654333f6389803cdcaf03912e94173ae510 ] ttm->sg needs to be checked before accessing its child member. Call Trace: amdgpu_ttm_backend_destroy+0x12/0x70 [amdgpu] ttm_bo_cleanup_memtype_use+0x3a/0x60 [ttm] ttm_bo_release+0x17d/0x300 [ttm] amdgpu_bo_unref+0x1a/0x30 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x78b/0x8b0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x118/0x220 [amdgpu] kfd_ioctl+0x222/0x400 [amdgpu] ? kfd_dev_is_large_bar+0x90/0x90 [amdgpu] __x64_sys_ioctl+0x8e/0xd0 ? __context_tracking_exit+0x52/0x90 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f97f264d317 Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48 RSP: 002b:00007ffdb402c338 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007f97f3cc63a0 RCX: 00007f97f264d317 RDX: 00007ffdb402c380 RSI: 00000000c0284b16 RDI: 0000000000000003 RBP: 00007ffdb402c380 R08: 00007ffdb402c428 R09: 00000000c4000004 R10: 00000000c4000004 R11: 0000000000000246 R12: 00000000c0284b16 R13: 0000000000000003 R14: 00007f97f3cc63a0 R15: 00007f8836200000 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11amdgpu: avoid incorrect %hu format stringArnd Bergmann1-1/+1
[ Upstream commit 7d98d416c2cc1c1f7d9508e887de4630e521d797 ] clang points out that the %hu format string does not match the type of the variables here: drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c:263:7: warning: format specifies type 'unsigned short' but the argument has type 'unsigned int' [-Wformat] version_major, version_minor); ^~~~~~~~~~~~~ include/drm/drm_print.h:498:19: note: expanded from macro 'DRM_ERROR' __drm_err(fmt, ##__VA_ARGS__) ~~~ ^~~~~~~~~~~ Change it to a regular %u, the same way a previous patch did for another instance of the same warning. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tom Rix <trix@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11drm/amdgpu : Fix asic reset regression issue introduce by 8f211fe8ac7c4fshaoyunl1-1/+1
[ Upstream commit c8941550aa66b2a90f4b32c45d59e8571e33336e ] This recent change introduce SDMA interrupt info printing with irq->process function. These functions do not require a set function to enable/disable the irq Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11drm/amdgpu: mask the xgmi number of hops reported from psp to kfdJonathan Kim1-1/+8
[ Upstream commit 4ac5617c4b7d0f0a8f879997f8ceaa14636d7554 ] The psp supplies the link type in the upper 2 bits of the psp xgmi node information num_hops field. With a new link type, Aldebaran has these bits set to a non-zero value (1 = xGMI3) so the KFD topology will report the incorrect IO link weights without proper masking. The actual number of hops is located in the 3 least significant bits of this field so mask if off accordingly before passing it to the KFD. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Amber Lin <amber.lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07drm/amdgpu: check alignment on CPU page for bo mapXℹ Ruoyao1-4/+4
commit e3512fb67093fabdf27af303066627b921ee9bd8 upstream. The page table of AMDGPU requires an alignment to CPU page so we should check ioctl parameters for it. Return -EINVAL if some parameter is unaligned to CPU page, instead of corrupt the page table sliently. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Xi Ruoyao <xry111@mengyan1223.wang> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07drm/amdgpu: fix offset calculation in amdgpu_vm_bo_clear_mappings()Nirmoy Das1-1/+1
commit 5e61b84f9d3ddfba73091f9fbc940caae1c9eb22 upstream. Offset calculation wasn't correct as start addresses are in pfn not in bytes. CC: stable@vger.kernel.org Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-30drm/amdgpu: fb BO should be ttm_bo_type_deviceNirmoy Das1-1/+1
[ Upstream commit 521f04f9e3ffc73ef96c776035f8a0a31b4cdd81 ] FB BO should not be ttm_bo_type_kernel type and amdgpufb_create_pinned_object() pins the FB BO anyway. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-09drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcieKevin Wang1-2/+2
commit 1aa46901ee51c1c5779b3b239ea0374a50c6d9ff upstream. the register offset isn't needed division by 4 to pass RREG32_PCIE() Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-07drm/amdgpu: Add check to prevent IH overflowDefang Bo3-39/+71
[ Upstream commit e4180c4253f3f2da09047f5139959227f5cf1173 ] Similar to commit <b82175750131>("drm/amdgpu: fix IH overflow on Vega10 v2"). When an ring buffer overflow happens the appropriate bit is set in the WPTR register which is also written back to memory. But clearing the bit in the WPTR doesn't trigger another memory writeback. So what can happen is that we end up processing the buffer overflow over and over again because the bit is never cleared. Resulting in a random system lockup because of an infinite loop in an interrupt handler. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Defang Bo <bodefang@126.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)Alex Deucher1-0/+2
commit 6e80fb8ab04f6c4f377e2fd422bdd1855beb7371 upstream. Fixes the rlc reference clock used for GPU timestamps. Value is 100Mhz. Confirmed with hardware team. v2: reword commit message. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1480 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04drm/amdgpu: Prevent shift wrapping in amdgpu_read_mask()Dan Carpenter1-3/+3
[ Upstream commit c915ef890d5dc79f483e1ca3b3a5b5f1a170690c ] If the user passes a "level" value which is higher than 31 then that leads to shift wrapping. The undefined behavior will lead to a syzkaller stack dump. Fixes: 5632708f4452 ("drm/amd/powerplay: add dpm force multiple levels on cz/tonga/fiji/polaris (v2)") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04drm/amdgpu: Fix macro name _AMDGPU_TRACE_H_ in preprocessor if conditionChenyang Li1-1/+1
[ Upstream commit 956e20eb0fbb206e5e795539db5469db099715c8 ] Add an underscore in amdgpu_trace.h line 24 "_AMDGPU_TRACE_H". Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Chenyang Li <lichenyang@loongson.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-27drm/amdgpu/psp: fix psp gfx ctrl cmdsVictor Zhao1-1/+1
[ Upstream commit f14a5c34d143f6627f0be70c0de1d962f3a6ff1c ] psp GFX_CTRL_CMD_ID_CONSUME_CMD different for windows and linux, according to psp, linux cmds are not correct. v2: only correct GFX_CTRL_CMD_ID_CONSUME_CMD. Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19drm/amdgpu: fix a GPU hang issue when remove deviceDennis Li1-2/+2
[ Upstream commit 88e21af1b3f887d217f2fb14fc7e7d3cd87ebf57 ] When GFXOFF is enabled and GPU is idle, driver will fail to access some registers. Therefore change to disable power gating before all access registers with MMIO. Dmesg log is as following: amdgpu 0000:03:00.0: amdgpu: amdgpu: finishing device. amdgpu: cp queue pipe 4 queue 0 preemption failed amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 amdgpu 0000:03:00.0: amdgpu: failed to write reg 2890 wait reg 28a2 amdgpu 0000:03:00.0: amdgpu: failed to write reg 1a6f4 wait reg 1a706 Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-18amd/amdgpu: Disable VCN DPG mode for PicassoVeerabadhran Gopalakrishnan1-2/+1
[ Upstream commit c6d2b0fbb893d5c7dda405aa0e7bcbecf1c75f98 ] Concurrent operation of VCN and JPEG decoder in DPG mode is causing ring timeout due to power state. Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-18drm/amdgpu: perform srbm soft reset always on SDMA resumeEvan Quan1-15/+12
[ Upstream commit 253475c455eb5f8da34faa1af92709e7bb414624 ] This can address the random SDMA hang after pci config reset seen on Hawaii. Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Sandeep Raghuraman <sandy.8925@gmail.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-10drm/amdgpu: add DID for navi10 blockchain SKUTianci.Yin1-0/+1
[ Upstream commit 8942881144a7365143f196f5eafed24783a424a3 ] Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Tianci.Yin <tianci.yin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-05drm/amdgpu: increase the reserved VM size to 2MBChristian König1-2/+2
commit 55bb919be4e4973cd037a04f527ecc6686800437 upstream. Ideally this should be a multiple of the VM block size. 2MB should at least fit for Vega/Navi. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05drm/amdgpu: correct the gpu reset handling for job != NULL caseEvan Quan1-1/+1
commit 207ac684792560acdb9e06f9d707ebf63c84b0e0 upstream. Current code wrongly treat all cases as job == NULL. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-and-tested-by: Jane Jian <Jane.Jian@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05drm/amdgpu: don't map BO in reserved regionMadhav Chauhan1-0/+10
commit c4aa8dff6091cc9536aeb255e544b0b4ba29faf4 upstream. 2MB area is reserved at top inside VM. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Madhav Chauhan <madhav.chauhan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>