kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2020-08-26	drm/amd/display: Add debugfs for connector's FEC & DSC capabilities	Eryk Brol	1	-1/+72
	[why & how] Useful entry to understand if link has DSC or FEC capabilities, implemented to read DPCD caps stored on the link. Better than manually reading the registers with aux dpcd helper. Signed-off-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Mikita Lipski <mikita.lipski@amd.com> Reviewed-by: Mikita Lipski <Mikita.Lipski@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Revert HDCP disable sequence change	Jaehyun Chung	1	-1/+1
	[Why] Revert HDCP disable sequence change that blanks stream before disabling HDCP. PSP and HW teams are currently investigating the root cause of why HDCP cannot be disabled before stream blank, which is expected to work without issues. Signed-off-by: Jaehyun Chung <jaehyun.chung@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Send H14b-VSIF specified in HDMI	Wayne Lin	3	-16/+5
	[Why] Current function excludes the logic to generate H14b-VSIF. Now it constructs HF-VSIF only and causes HDMI compliace test fail. [How] According to HDMI spec, source devices shall utilize the H14b-VSIF whenever the signaling capabilities of the H14b-VSIF allow this. Here keep the logic for HF-VSIF and add H14b-VSIF construction part. Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Call DMUB for eDP power control	Chris Park	1	-0/+28
	[Why] If DMUB is used, LVTMA VBIOS call can be used to control eDP instead of tranditional transmitter control. Interface is agreed with VBIOS for eDP to use this new path to program LVTMA registers. [How] Expose DAL interface to send DMUB command for LVTMA control that VBIOS currently uses. Signed-off-by: Chris Park <Chris.Park@amd.com> Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: 3.2.99	Aric Cyr	1	-1/+1
	Signed-off-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Send DISPLAY_OFF after power down on boot	Sung Lee	3	-22/+43
	[WHY] update_clocks might not be called on headless adapters. This means DISPLAY_OFF may not be sent in headless cases. [HOW] If hardware is powered down on boot because it is headless (mode set does not happen on that adapter) also send DISPLAY_OFF notification. Signed-off-by: Sung Lee <sung.lee@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdgpu/gfx10: refine mgcg setting	Jiansong Chen	1	-4/+2
	1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang. 2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdkfd: implement the dGPU fallback path for apu (v6)	Huang Rui	9	-15/+64
	We still have a few iommu issues which need to address, so force raven as "dgpu" path for the moment. This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled or ACPI CRAT table not correct. v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2. v3: Align with existed thunk, don't change the way of raven, only renoir will use "dgpu" path by default. v4: don't update global ignore_crat in the driver, and revise fallback function if CRAT is broken. v5: refine acpi crat good but no iommu support case, and rename the title. v6: fix the issue of dGPU initialized firstly, just modify the report value in the node_show(). Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/pm: correct Vega20 swctf limit setting	Evan Quan	1	-2/+4
	Correct the Vega20 thermal swctf limit. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/pm: correct Vega12 swctf limit setting	Evan Quan	1	-2/+4
	Correct the Vega12 thermal swctf limit. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/pm: correct Vega10 swctf limit setting	Evan Quan	1	-2/+5
	Correct the Vega10 thermal swctf limit. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1267 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdgpu: Embed drm_device into amdgpu_device (v3)	Luben Tuikov	4	-46/+45
	a) Embed struct drm_device into struct amdgpu_device. b) Modify the inline-f drm_to_adev() accordingly. c) Modify the inline-f adev_to_drm() accordingly. d) Eliminate the use of drm_device.dev_private, in amdgpu. e) Switch from using drm_dev_alloc() to drm_dev_init(). f) Add a DRM driver release function, which frees the container amdgpu_device after all krefs on the contained drm_device have been released. v2: Split out adding adev_to_drm() into its own patch (previous commit), making this patch more succinct and clear. More detailed commit description. v3: squash in fix to call drmm_add_final_kfree() to avoid a warning. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Fix memleak in amdgpu_dm_mode_config_init	Dinghao Liu	1	-2/+8
	When amdgpu_display_modeset_create_props() fails, state and state->context should be freed to prevent memleak. It's the same when amdgpu_dm_audio_init() fails. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdgpu: disable runtime pm for navy_flounder	Jiansong Chen	1	-0/+1
	Disable runtime pm for navy_flounder temporarily. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Retry AUX write when fail occurs	Wayne Lin	1	-1/+1
	[Why] In dm_dp_aux_transfer() now, we forget to handle AUX_WR fail cases. We suppose every write wil get done successfully and hence some AUX commands might not sent out indeed. [How] Check if AUX_WR success. If not, retry it. Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdgpu: Fix buffer overflow in INFO ioctl	Alex Deucher	1	-0/+4
	The values for "se_num" and "sh_num" come from the user in the ioctl. They can be in the 0-255 range but if they're more than AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in an out of bounds read. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/powerplay: Fix hardmins not being sent to SMU for RV	Nicholas Kazlauskas	1	-6/+3
	[Why] DC uses these to raise the voltage as needed for higher dispclk/dppclk and to ensure that we have enough bandwidth to drive the displays. There's a bug preventing these from actuially sending messages since it's checking the actual clock (which is 0) instead of the incoming clock (which shouldn't be 0) when deciding to send the hardmin. [How] Check the clocks != 0 instead of the actual clocks. Fixes: 9ed9203c3ee7 ("drm/amd/powerplay: rv dal-pplib interface refactor powerplay part") Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amdgpu: use MODE1 reset for navy_flounder by default	Jiansong Chen	1	-0/+1
	Switch default gpu reset method to MODE1 for navy_flounder. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/pm: correct the thermal alert temperature limit settings	Evan Quan	3	-24/+21
	Do the maths in celsius degree. This can fix the issues caused by the changes below: drm/amd/pm: correct Vega20 swctf limit setting drm/amd/pm: correct Vega12 swctf limit setting drm/amd/pm: correct Vega10 swctf limit setting Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amdgpu: add asd fw check before loading asd	Tao Zhou	1	-2/+1
	asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Keep current gain when ABM disable immediately	Brandon Syu	1	-1/+1
	[Why] When system enters s3/s0i3, backlight PWM would set user level. [How] ABM disable function add keep current gain to avoid it. Signed-off-by: Brandon Syu <Brandon.Syu@amd.com> Reviewed-by: Josip Pavic <Josip.Pavic@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Fix passive dongle mistaken as active dongle in EDID emulation	Samson Tam	1	-0/+1
	[Why] dongle_type is set during dongle connection but for passive dongles, dongle_type is not set. If user starts with an active dongle and then switches to a passive dongle, it will still report as an active dongle. Trying to emulate the wrong connecter type results in display not lighting up. [How] Set dpcd_caps.dongle_type for passive dongles in detect_dp(). Signed-off-by: Samson Tam <Samson.Tam@amd.com> Reviewed-by: Joshua Aberback <Joshua.Aberback@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Revert HDCP disable sequence change	Jaehyun Chung	1	-1/+1
	[Why] Revert HDCP disable sequence change that blanks stream before disabling HDCP. PSP and HW teams are currently investigating the root cause of why HDCP cannot be disabled before stream blank, which is expected to work without issues. Signed-off-by: Jaehyun Chung <jaehyun.chung@amd.com> Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Send DISPLAY_OFF after power down on boot	Sung Lee	3	-22/+43
	[WHY] update_clocks might not be called on headless adapters. This means DISPLAY_OFF may not be sent in headless cases. [HOW] If hardware is powered down on boot because it is headless (mode set does not happen on that adapter) also send DISPLAY_OFF notification. Signed-off-by: Sung Lee <sung.lee@amd.com> Reviewed-by: Yongqiang Sun <yongqiang.sun@amd.com> Acked-by: Eryk Brol <eryk.brol@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amdgpu/gfx10: refine mgcg setting	Jiansong Chen	1	-4/+2
	1. enable ENABLE_CGTS_LEGACY to fix specviewperf11 random hang. 2. remove obsolete RLC_CGTT_SCLK_OVERRIDE workaround. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/pm: correct Vega20 swctf limit setting	Evan Quan	1	-2/+4
	Correct the Vega20 thermal swctf limit. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/pm: correct Vega12 swctf limit setting	Evan Quan	1	-2/+4
	Correct the Vega12 thermal swctf limit. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/pm: correct Vega10 swctf limit setting	Evan Quan	1	-2/+5
	Correct the Vega10 thermal swctf limit. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1267 Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/pm: set VCN pg per instances	Jiansong Chen	1	-2/+2
	When deciding whether to set pg for vcn1, instances number is more generic than chip name. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/pm: enable run_btc callback for sienna_cichlid	Jiansong Chen	1	-0/+7
	DC BTC support for sienna_cichlid is added, it provides the DC tolerance and aging measurements. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in ↵	Furquan Shaikh	1	-0/+2
	amdgpu_dm_update_backlight_caps In `amdgpu_dm_update_backlight_caps()`, there is a local `amdgpu_dm_backlight_caps` object that is filled in by `amdgpu_acpi_get_backlight_caps()`. However, this object is uninitialized before the call and hence the subsequent check for aux_support can fail since it is not initialized by `amdgpu_acpi_get_backlight_caps()` as well. This change initializes this local `amdgpu_dm_backlight_caps` object to 0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Furquan Shaikh <furquan@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: Reject overlay plane configurations in multi-display scenarios	Nicholas Kazlauskas	1	-0/+8
	[Why] These aren't stable on some platform configurations when driving multiple displays, especially on higher resolution. In particular the delay in asserting p-state and validating from x86 outweights any power or performance benefit from the hardware composition. Under some configurations this will manifest itself as extreme stutter or unresponsiveness especially when combined with cursor movement. [How] Disable these for now. Exposing overlays to userspace doesn't guarantee that they'll be able to use them in any and all configurations and it's part of the DRM contract to have userspace gracefully handle validation failures when they occur. Valdiation occurs as part of DC and this in particular affects RV, so disable this in dcn10_global_validation. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26	drm/amd/display: use correct scale for actual_brightness	Alexander Monakov	1	-41/+40
	Documentation for sysfs backlight level interface requires that values in both 'brightness' and 'actual_brightness' files are interpreted to be in range from 0 to the value given in the 'max_brightness' file. With amdgpu, max_brightness gives 255, and values written by the user into 'brightness' are internally rescaled to a wider range. However, reading from 'actual_brightness' gives the raw register value without inverse rescaling. This causes issues for various userspace tools such as PowerTop and systemd that expect the value to be in the correct range. Introduce a helper to retrieve internal backlight range. Use it to reimplement 'convert_brightness' as 'convert_brightness_from_user' and introduce 'convert_brightness_to_user'. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=203905 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1242 Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alexander Monakov <amonakov@ispras.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2020-08-26	drm/amd/display: should check error using DC_OK	Tong Zhang	1	-4/+4
	core_link_read_dpcd returns only DC_OK(1) and DC_ERROR_UNEXPECTED(-1), the caller should check error using DC_OK instead of checking against 0 Signed-off-by: Tong Zhang <ztong0001@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: Get DRM dev from adev by inline-f	Luben Tuikov	33	-423/+428
	Add a static inline adev_to_drm() to obtain the DRM device pointer from an amdgpu_device pointer. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)	Luben Tuikov	44	-300/+303
	Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: enable HDP clock gatting	Prike.Liang	1	-1/+2
	Enabe HDP SD/DS clock gatting in Renoir series. Signed-off-by: Prike.Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: enable ATHUB clock gatting	Prike.Liang	1	-0/+1
	Enable ATHUB clock gatting set in Renoir series. Signed-off-by: Prike.Liang <Prike.Liang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amd/pm: set VCN pg per instances	Jiansong Chen	1	-2/+2
	When deciding whether to set pg for vcn1, instances number is more generic than chip name. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: annotate a false positive recursive locking	Dennis Li	1	-3/+8
	Re-apply commit 72e14ebf9fc09e33b28b70f00a2ed9821c198633 [ 584.110304] ============================================ [ 584.110590] WARNING: possible recursive locking detected [ 584.110876] 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 Tainted: G OE [ 584.111164] -------------------------------------------- [ 584.111456] kworker/38:1/553 is trying to acquire lock: [ 584.111721] ffff9b15ff0a47a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.112112] but task is already holding lock: [ 584.112673] ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.113068] other info that might help us debug this: [ 584.113689] Possible unsafe locking scenario: [ 584.114350] CPU0 [ 584.114685] ---- [ 584.115014] lock(&adev->reset_sem); [ 584.115349] lock(&adev->reset_sem); [ 584.115678] * DEADLOCK * [ 584.116624] May be due to missing lock nesting notation [ 584.117284] 4 locks held by kworker/38:1/553: [ 584.117616] #0: ffff9ad635c1d348 ((wq_completion)events){+.+.}, at: process_one_work+0x21f/0x630 [ 584.117967] #1: ffffac708e1c3e58 ((work_completion)(&con->recovery_work)){+.+.}, at: process_one_work+0x21f/0x630 [ 584.118358] #2: ffffffffc1c2a5d0 (&tmp->hive_lock){+.+.}, at: amdgpu_device_gpu_recover+0xae/0x1030 [amdgpu] [ 584.118786] #3: ffff9b1603d247a0 (&adev->reset_sem){++++}, at: amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.119222] stack backtrace: [ 584.119990] CPU: 38 PID: 553 Comm: kworker/38:1 Kdump: loaded Tainted: G OE 5.6.0-deli-v5.6-2848-g3f3109b0e75f #1 [ 584.120782] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 584.121223] Workqueue: events amdgpu_ras_do_recovery [amdgpu] [ 584.121638] Call Trace: [ 584.122050] dump_stack+0x98/0xd5 [ 584.122499] __lock_acquire+0x1139/0x16e0 [ 584.122931] ? trace_hardirqs_on+0x3b/0xf0 [ 584.123358] ? cancel_delayed_work+0xa6/0xc0 [ 584.123771] lock_acquire+0xb8/0x1c0 [ 584.124197] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.124599] down_write+0x49/0x120 [ 584.125032] ? amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125472] amdgpu_device_gpu_recover+0x262/0x1030 [amdgpu] [ 584.125910] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu] [ 584.126367] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu] [ 584.126789] process_one_work+0x29e/0x630 [ 584.127208] worker_thread+0x3c/0x3f0 [ 584.127621] ? __kthread_parkme+0x61/0x90 [ 584.128014] kthread+0x12f/0x150 [ 584.128402] ? process_one_work+0x630/0x630 [ 584.128790] ? kthread_park+0x90/0x90 [ 584.129174] ret_from_fork+0x3a/0x50 Each adev has owned lock_class_key to avoid false positive recursive locking. v2: 1. register adev->lock_key into lockdep, otherwise lockdep will report the below warning [ 1216.705820] BUG: key ffff890183b647d0 has not been registered! [ 1216.705924] ------------[ cut here ]------------ [ 1216.705972] DEBUG_LOCKS_WARN_ON(1) [ 1216.705997] WARNING: CPU: 20 PID: 541 at kernel/locking/lockdep.c:3743 lockdep_init_map+0x150/0x210 v3: change to use down_write_nest_lock to annotate the false dead-lock warning. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: refine create and release logic of hive info	Dennis Li	5	-106/+131
	Change to dynamically create and release hive info object, which help driver support more hives in the future. v2: Change to save hive object pointer in adev, to avoid locking xgmi_mutex every time when calling amdgpu_get_xgmi_hive. v3: 1. Change type of hive object pointer in adev from void* to amdgpu_hive_info*. 2. remove unnecessary variable initialization. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: refine message print for devices of hive	Dennis Li	5	-21/+21
	Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device of a hive is the message for. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: fix the nullptr issue when reenter GPU recovery	Dennis Li	1	-2/+3
	in single gpu system, if driver reenter gpu recovery, amdgpu_device_lock_adev will return false, but hive is nullptr now. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: change reset lock from mutex to rw_semaphore	Dennis Li	5	-35/+32
	clients don't need reset-lock for synchronization when no GPU recovery. v2: change to return the return value of down_read_killable. v3: if GPU recovery begin, VF ignore FLR notification. Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amd/pm: enable run_btc callback for sienna_cichlid	Jiansong Chen	1	-0/+7
	DC BTC support for sienna_cichlid is added, it provides the DC tolerance and aging measurements. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drivers: gpu: amd: Initialize amdgpu_dm_backlight_caps object to 0 in ↵	Furquan Shaikh	1	-0/+2
	amdgpu_dm_update_backlight_caps In `amdgpu_dm_update_backlight_caps()`, there is a local `amdgpu_dm_backlight_caps` object that is filled in by `amdgpu_acpi_get_backlight_caps()`. However, this object is uninitialized before the call and hence the subsequent check for aux_support can fail since it is not initialized by `amdgpu_acpi_get_backlight_caps()` as well. This change initializes this local `amdgpu_dm_backlight_caps` object to 0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Furquan Shaikh <furquan@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amd/pm: Remove unnecessary cast	Alex Dewar	1	-1/+1
	In init_powerplay_table_information() the value returned from kmalloc() is cast unnecessarily. Remove cast. Issue identified with Coccinelle. Signed-off-by: Alex Dewar <alex.dewar90@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amd/powerplay: remove duplicate include	Wang Hai	1	-1/+0
	Remove asic_reg/nbio/nbio_6_1_offset.h which is included more than once Reviewed-by: Evan Quan <evan.quan@amd.com> Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amd/display: remove unintended executable mode	Lukas Bulwahn	6	-0/+0
	Besides the intended change, commit 4cc1178e166a ("drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhub") also set the source files mmhub_v1_0.c and gfx_v9_4.c to be executable, i.e., changed fromold mode 644 to new mode 755. Commit 241b2ec9317e ("drm/amd/display: Add dcn30 Headers (v2)") added the four header files {dpcs,dcn}_3_0_0_{offset,sh_mask}.h as executable, i.e., mode 755. Set to the usual modes for source and headers files and clean up those mistakes. No functional change. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-24	drm/amdgpu: refine codes to avoid reentering GPU recovery	Dennis Li	28	-122/+127
	if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>