summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2023-07-25drm/amd/display: Add VESA SCR case for default aux backlightIswara Nagulendran1-4/+14
[How & Why] When determining default aux backlight level, read from DPCD address 0x734 for VESA SCR on OLED. Reviewed-by: Felipe Clark <felipe.clark@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Iswara Nagulendran <iswara.nagulendran@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amd/amdgpu: Fix warnings in amdgpu/amdgpu_display.cSrinivasan Shanmugam1-17/+25
Fixes the below checkpatch.pl warnings: WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line WARNING: suspect code indent for conditional statements (8, 12) WARNING: braces {} are not necessary for single statement blocks Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amd/display: Guard DCN31 PHYD32CLK logic against chip familyGeorge Shen1-1/+2
[Why] Current yellow carp B0 PHYD32CLK logic is incorrectly applied to other ASICs. [How] Add guard to check chip family is yellow carp before applying logic. Reviewed-by: Hansen Dsouza <hansen.dsouza@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amd/display: Correct grammar mistakesReza Amini1-9/+11
[Why] There are grammer mistakes in comments [How] Correct grammar mistakes Reviewed-by: Anthony Koo <anthony.koo@amd.com> Acked-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Reza Amini <reza.amini@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amdgpu: Return -ENOMEM when there is no memory in 'amdgpu_gfx_mqd_sw_init'Srinivasan Shanmugam1-3/+8
Return -ENOMEM, when there is no sufficient dynamically allocated memory to create MQD backup for ring Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amdgpu: Fix do not add new typedefs in amdgpu_fw_attestation.cSrinivasan Shanmugam1-20/+18
Fixes the following to align to coding style: WARNING: do not add new typedefs +typedef struct FW_ATT_DB_HEADER WARNING: do not add new typedefs +typedef struct FW_ATT_RECORD WARNING: Symbolic permissions 'S_IRUSR' are not preferred. Consider using octal permissions '0400'. + S_IRUSR, ERROR: "(foo*)" should be "(foo *)" WARNING: please, no space before tabs Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amdgpu: Prefer #if IS_ENABLED over #if defined in amdgpu_drv.cSrinivasan Shanmugam1-2/+2
Adhere to linux coding style Fixes the following: WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> || CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) WARNING: Prefer IS_ENABLED(<FOO>) to CONFIG_<FOO> || CONFIG_<FOO>_MODULE +#if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE) Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amdkfd: enable cooperative groups for gfx11Jonathan Kim9-13/+27
MES can concurrently schedule queues on the device that require exclusive device access if marked exclusively_scheduled without the requirement of GWS. Similar to the F32 HWS, MES will manage quality of service for these queues. Use this for cooperative groups since cooperative groups are device occupancy limited. Since some GFX11 devices can only be debugged with partial CUs, do not allow the debugging of cooperative groups on these devices as the CU occupancy limit will change on attach. In addition, zero initialize the MES add queue submission vector for MES initialization tests as we do not want these to be cooperative dispatches. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amdgpu: set sw state to gfxoff after SR-IOV resetHorace Chen3-0/+14
[Why] Current SR-IOV will not set GC to off state, while it is a real GC hard reset. Whthout GFX off flag, driver may do gfxhub invalidation before firmware load and gfxhub gart enable. This operation may cause CP to become busy because GC is not in the right state for invalidation. [How] Add a function for SR-IOV to clean up some sw state before recover. Set adev->gfx.is_poweron to false to prevent gfxhub invalidation before gfx firmware autoload complete. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: HaiJun Chang <HaiJun.Chang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-25drm/amd/smu: use AverageGfxclkFrequency* to replace previous GFX Curr ClockJane Jian1-1/+1
Report current GFX clock also from average clock value as the original CurrClock data is not valid/accurate any more as per FW team Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amdgpu: Fix one kernel-doc commentYang Li1-1/+1
Use colon to separate parameter name from their specific meaning. silence the warning: drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:793: warning: Function parameter or member 'adev' not described in 'amdgpu_vm_pte_update_noretry_flags' Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amd: Fix an error handling mistake in psp_sw_init()Mario Limonciello1-3/+3
If the second call to amdgpu_bo_create_kernel() fails, the memory allocated from the first call should be cleared. If the third call fails, the memory from the second call should be cleared. Fixes: b95b5391684b ("drm/amdgpu/psp: move PSP memory alloc from hw_init to sw_init") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amdgpu: Fix infinite loop in gfxhub_v1_2_xcc_gart_enable (v2)Victor Lu1-4/+1
An instance of for_each_inst() was not changed to match its new behaviour and is causing a loop. v2: remove tmp_mask variable Fixes: b579ea632fca ("drm/amdgpu: Modify for_each_inst macro") Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amdgpu: Program xcp_ctl registers as neededLijo Lazar1-11/+12
XCP_CTL register is expected to be programmed by firmware. Under certain conditions FW may not have programmed it correctly. As a workaround, program it when FW has not programmed the right values. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amdkfd: fix trap handling work around for debuggingJonathan Kim3-7/+10
Update the list of devices that require the cwsr trap handling workaround for debugging use cases. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Acked-by: Ruili Ji <ruili.ji@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amd/display: Allow building DC with clang on RISC-VSamuel Holland1-1/+1
clang on RISC-V appears to be unaffected by the bug causing excessive stack usage in calculate_bandwidth(). clang 16 with -fstack-usage reports a 304 byte stack frame size with CONFIG_ARCH_RV32I, and 512 bytes with CONFIG_ARCH_RV64I. Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amd/display: remove an unused fileAurabindo Pillai2-185/+0
[Why&How] Internal subvp state is not referenced in driver code, so it can be removed. Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amdgpu: allow secure submission on VCN4 ringsguttula1-2/+6
This patch will enable secure decode playback on VCN4_0_2 Signed-off-by: sguttula <Suresh.Guttula@amd.com> Reviewed-by: Leo Liu <leo.liiu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-21drm/amd: Avoid reading the VBIOS part number twiceMario Limonciello7-34/+22
The VBIOS part number is read both in amdgpu_atom_parse() as well as in atom_get_vbios_pn() and stored twice in the `struct atom_context` structure. Remove the first unnecessary read and move the `pr_info` line from that read into the second. v2: squash in unused variable removal Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: use a macro to define no xcp partition caseGuchun Chen4-5/+8
~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu/vm: use the same xcp_id from root PDGuchun Chen1-1/+2
Other PDs/PTs allocation should just use the same xcp_id as that stored in root PD. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_createGuchun Chen5-11/+14
Recent code set xcp_id stored from file private data when opening device to amdgpu bo for accounting memory usage etc, but not all VMs are attached to this fpriv structure like the vm cases in amdgpu_mes_self_test, otherwise, KASAN will complain below out of bound access. And more importantly, VM code should not touch fpriv structure, so drop fpriv code handling from amdgpu_vm_pt. [ 77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069 [ 77.294146] Call Trace: [ 77.294178] <TASK> [ 77.294208] dump_stack_lvl+0x49/0x63 [ 77.294260] print_report+0x16f/0x4a6 [ 77.294307] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.295979] ? kasan_complete_mode_report_info+0x3c/0x200 [ 77.296057] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.297556] kasan_report+0xb4/0x130 [ 77.297609] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.299202] __asan_load4+0x6f/0x90 [ 77.299272] amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.300796] ? amdgpu_init+0x6e/0x1000 [amdgpu] [ 77.302222] ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu] [ 77.303721] ? preempt_count_sub+0x18/0xc0 [ 77.303786] amdgpu_vm_init+0x39e/0x870 [amdgpu] [ 77.305186] ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu] [ 77.306683] ? kasan_set_track+0x25/0x30 [ 77.306737] ? kasan_save_alloc_info+0x1b/0x30 [ 77.306795] ? __kasan_kmalloc+0x87/0xa0 [ 77.306852] amdgpu_mes_self_test+0x169/0x620 [amdgpu] v2: without specifying xcp partition for PD/PT bo, the xcp id is -1. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686 Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: Allocate root PD on correct partitionGuchun Chen1-3/+3
file_priv needs to be setup firstly, otherwise, root PD will always be allocated on partition 0, even if opening the device from other partitions. Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Keep PHY active for DP displays on DCN31Nicholas Kazlauskas1-0/+5
[Why & How] Port of a change that went into DCN314 to keep the PHY enabled when we have a connected and active DP display. The PHY can hang if PHY refclk is disabled inadvertently. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Josip Pavic <josip.pavic@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Prevent vtotal from being set to 0Daniel Miess1-1/+5
[Why] In dcn314 DML the destination pipe vtotal was being set to the crtc adjustment vtotal_min value even in cases where that value is 0. [How] Only set vtotal to the crtc adjustment vtotal_min value in cases where the value is non-zero. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Disable MPC split by default on special asicZhikai Zhai1-1/+1
[WHY] All of pipes will be used when the MPC split enable on the dcn which just has 2 pipes. Then MPO enter will trigger the minimal transition which need programe dcn from 2 pipes MPC split to 2 pipes MPO. This action will cause lag if happen frequently. [HOW] Disable the MPC split for the platform which dcn resource is limited Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Zhikai Zhai <zhikai.zhai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: check TG is non-null before checking if enabledTaimur Hassan1-1/+2
[Why & How] If there is no TG allocation we can dereference a NULL pointer when checking if the TG is enabled. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Taimur Hassan <syed.hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Add polling method to handle MST reply packetWayne Lin4-86/+159
[Why] Specific TBT4 dock doesn't send out short HPD to notify source that IRQ event DOWN_REP_MSG_RDY is set. Which violates the spec and cause source can't send out streams to mst sinks. [How] To cover this misbehavior, add an additional polling method to detect DOWN_REP_MSG_RDY is set. HPD driven handling method is still kept. Just hook up our handler to drm mgr->cbs->poll_hpd_irq(). Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jerry Zuo <jerry.zuo@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Clean up errors & warnings in amdgpu_dm.cSrinivasan Shanmugam1-68/+65
Fix the following errors & warnings reported by checkpatch: ERROR: space required before the open brace '{' ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line ERROR: space prohibited before that ',' (ctx:WxW) ERROR: else should follow close brace '}' ERROR: open brace '{' following function definitions go on the next line ERROR: code indent should use tabs where possible WARNING: braces {} are not necessary for single statement blocks WARNING: void function return statements are not generally useful WARNING: Block comments use * on subsequent lines WARNING: Block comments use a trailing */ on a separate line Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: Allow the initramfs generator to include psp_13_0_6_taCandice Li1-0/+1
Allow the initramfs generator to automatically include psp_13_0_6_ta firmware to initramfs. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu/pm: make mclk consistent for smu 13.0.7Alex Deucher1-1/+1
Use current uclk to be consistent with other dGPUs. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-18drm/amdgpu/pm: make gfxclock consistent for sienna cichlidAlex Deucher1-2/+6
Use average gfxclock for consistency with other dGPUs. Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-07-18drm/amd/display: only accept async flips for fast updatesSimon Ser2-0/+20
Up until now, amdgpu was silently degrading to vsync when user-space requested an async flip but the hardware didn't support it. The hardware doesn't support immediate flips when the update changes the FB pitch, the DCC state, the rotation, enables or disables CRTCs or planes, etc. This is reflected in the dm_crtc_state.update_type field: UPDATE_TYPE_FAST means that immediate flip is supported. Silently degrading async flips to vsync is not the expected behavior from a uAPI point-of-view. Xorg expects async flips to fail if unsupported, to be able to fall back to a blit. i915 already behaves this way. This patch aligns amdgpu with uAPI expectations and returns a failure when an async flip is not possible. Signed-off-by: Simon Ser <contact@emersion.fr> Reviewed-by: André Almeida <andrealmeid@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2023-07-18drm/amdgpu/vkms: relax timer deactivation by hrtimer_try_to_cancelGuchun Chen1-2/+3
In below thousands of screen rotation loop tests with virtual display enabled, a CPU hard lockup issue may happen, leading system to unresponsive and crash. do { xrandr --output Virtual --rotate inverted xrandr --output Virtual --rotate right xrandr --output Virtual --rotate left xrandr --output Virtual --rotate normal } while (1); NMI watchdog: Watchdog detected hard LOCKUP on cpu 1 ? hrtimer_run_softirq+0x140/0x140 ? store_vblank+0xe0/0xe0 [drm] hrtimer_cancel+0x15/0x30 amdgpu_vkms_disable_vblank+0x15/0x30 [amdgpu] drm_vblank_disable_and_save+0x185/0x1f0 [drm] drm_crtc_vblank_off+0x159/0x4c0 [drm] ? record_print_text.cold+0x11/0x11 ? wait_for_completion_timeout+0x232/0x280 ? drm_crtc_wait_one_vblank+0x40/0x40 [drm] ? bit_wait_io_timeout+0xe0/0xe0 ? wait_for_completion_interruptible+0x1d7/0x320 ? mutex_unlock+0x81/0xd0 amdgpu_vkms_crtc_atomic_disable It's caused by a stuck in lock dependency in such scenario on different CPUs. CPU1 CPU2 drm_crtc_vblank_off hrtimer_interrupt grab event_lock (irq disabled) __hrtimer_run_queues grab vbl_lock/vblank_time_block amdgpu_vkms_vblank_simulate amdgpu_vkms_disable_vblank drm_handle_vblank hrtimer_cancel grab dev->event_lock So CPU1 stucks in hrtimer_cancel as timer callback is running endless on current clock base, as that timer queue on CPU2 has no chance to finish it because of failing to hold the lock. So NMI watchdog will throw the errors after its threshold, and all later CPUs are impacted/blocked. So use hrtimer_try_to_cancel to fix this, as disable_vblank callback does not need to wait the handler to finish. And also it's not necessary to check the return value of hrtimer_try_to_cancel, because even if it's -1 which means current timer callback is running, it will be reprogrammed in hrtimer_start with calling enable_vblank to make it works. v2: only re-arm timer when vblank is enabled (Christian) and add a Fixes tag as well v3: drop warn printing (Christian) v4: drop superfluous check of blank->enabled in timer function, as it's guaranteed in drm_handle_vblank (Christian) Fixes: 84ec374bd580 ("drm/amdgpu: create amdgpu_vkms (v4)") Cc: stable@vger.kernel.org Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: add DCN301 specific logic for OTG programmingAurabindo Pillai4-3/+225
[Why&How] DCN301 does not have FAMS hence the workaround needed on other DCN3x variants related to OTG min/max selector programming is not applicable for it. Hence isolate it and have it use the old sequence without workaround. Fixes: 1598fc576420 ("drm/amd/display: Program OTG vtotal min/max selectors unconditionally for DCN1+") Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: export some optc function for reuseAurabindo Pillai2-2/+5
[Why&How] Make a few functions non static so that they can be reused for other asic. This is in preparation for separating out OTG programming sequence for DCN301 Fixes: 1598fc576420 ("drm/amd/display: Program OTG vtotal min/max selectors unconditionally for DCN1+") Reviewed-by: Swapnil Patel <swapnil.patel@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd: Use amdgpu_device_pcie_dynamic_switching_supported() for SMU7Mario Limonciello1-12/+2
SMU7 does a check if the dGPU is inserted into a Rocket Lake system, to turn off DPM. Extend this check to all systems that have problems with dynamic switching by using the amdgpu_device_pcie_dynamic_switching_supported() helper. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: use a macro to define no xcp partition caseGuchun Chen4-5/+8
~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu/vm: use the same xcp_id from root PDGuchun Chen1-1/+2
Other PDs/PTs allocation should just use the same xcp_id as that stored in root PD. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_createGuchun Chen5-11/+14
Recent code set xcp_id stored from file private data when opening device to amdgpu bo for accounting memory usage etc, but not all VMs are attached to this fpriv structure like the vm cases in amdgpu_mes_self_test, otherwise, KASAN will complain below out of bound access. And more importantly, VM code should not touch fpriv structure, so drop fpriv code handling from amdgpu_vm_pt. [ 77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069 [ 77.294146] Call Trace: [ 77.294178] <TASK> [ 77.294208] dump_stack_lvl+0x49/0x63 [ 77.294260] print_report+0x16f/0x4a6 [ 77.294307] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.295979] ? kasan_complete_mode_report_info+0x3c/0x200 [ 77.296057] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.297556] kasan_report+0xb4/0x130 [ 77.297609] ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.299202] __asan_load4+0x6f/0x90 [ 77.299272] amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu] [ 77.300796] ? amdgpu_init+0x6e/0x1000 [amdgpu] [ 77.302222] ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu] [ 77.303721] ? preempt_count_sub+0x18/0xc0 [ 77.303786] amdgpu_vm_init+0x39e/0x870 [amdgpu] [ 77.305186] ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu] [ 77.306683] ? kasan_set_track+0x25/0x30 [ 77.306737] ? kasan_save_alloc_info+0x1b/0x30 [ 77.306795] ? __kasan_kmalloc+0x87/0xa0 [ 77.306852] amdgpu_mes_self_test+0x169/0x620 [amdgpu] v2: without specifying xcp partition for PD/PT bo, the xcp id is -1. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686 Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: Allocate root PD on correct partitionGuchun Chen1-3/+3
file_priv needs to be setup firstly, otherwise, root PD will always be allocated on partition 0, even if opening the device from other partitions. Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amdgpu: Add RLCG interface driver implementation for gfx v9.4.3 (v3)Victor Lu11-51/+81
Add RLCG interface support for gfx v9.4.3 and multiple XCCs. Do not enable it yet. v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs in amdgpu_mm_wreg_mmio_rlc v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Promote DAL to 3.2.243Aric Cyr1-1/+1
This version brings along following fixes: - Update 128b/132b downspread factor to 0.3% - Add helpers to get DMUB FW boot options - Initialize necessary uninitialized variables - Add stream overhead in BW calculations for 128b/132b - Add link encoding to timing BW calculation parameters - Prevent vtotal from being set to 0 - Fix race condition when turning off an output alone - Keep PHY active for DP displays on DCN31 - Fix ASIC check in aux timeout workaround - ABM pause toggle - Add missing triggers for full updates - Disable MPC split by default on special asic - Add additional refresh rate conditions for SubVP cases - Fix DP2 link training failure with RCO - Reenable all root clock gating options - Cache backlight_millinits in link structure and setting brightness accordingly - Refine to decide the verified link setting - Update SW cursor fallback for subvp high refresh Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Add missing triggers for full updatesAlvin Lee1-8/+16
[Description] - Full update was missed for the following cases: - Idle optimization is enabled - Plane is not in current context - Also don't clear surface updates at end of commit_plane_for_stream_fast as they are cleared at the beginning of each flip (only stream updates need to be cleared in case there is no stream update in the next flip) Reviewed-by: Samson Tam <samson.tam@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Alvin Lee <alvin.lee2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: ABM pause toggleReza Amini7-0/+277
[why] Allow ABM states to be transferred across display adapters for smooth display transitions. [how] We call DMUB to pause and get ABM states. We transfer data to other gpu, and deliver data and ask ABM to un-pause. Reviewed-by: Harry Vanzylldejong <harry.vanzylldejong@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Reza Amini <reza.amini@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Fix ASIC check in aux timeout workaroundTaimur Hassan1-1/+1
[Why] Aux write was meant to be ASIC specific, and is causing compliance failures on newer parts. [How] Make workaround specific to single ASIC. Reviewed-by: Michael Strauss <michael.strauss@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Taimur Hassan <syed.hassan@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Keep PHY active for DP displays on DCN31Nicholas Kazlauskas1-0/+5
[Why & How] Port of a change that went into DCN314 to keep the PHY enabled when we have a connected and active DP display. The PHY can hang if PHY refclk is disabled inadvertently. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Josip Pavic <josip.pavic@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Prevent vtotal from being set to 0Daniel Miess1-1/+5
[Why] In dcn314 DML the destination pipe vtotal was being set to the crtc adjustment vtotal_min value even in cases where that value is 0. [How] Only set vtotal to the crtc adjustment vtotal_min value in cases where the value is non-zero. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: Daniel Miess <daniel.miess@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Add link encoding to timing BW calculation parametersGeorge Shen12-22/+101
[Why] There certain cases where the timing BW is dependent on the type of link encoding in use. Thus to calculate the correct BW required for a given timing, the link encoding should be added as a parameter. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-07-18drm/amd/display: Add stream overhead in BW calculations for 128b/132bGeorge Shen2-0/+44
[Why] Current BW calculations do not account for the additional padding added for uncompressed pixel-to-symbol packing. This results in X.Y being too low for 128b/132b SST streams in certain scenarios. If X.Y is too low, end user can observe image corruption. [How] Add function to calculate stream overhead to timing BW calculation for 128b/132b SST cases. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Acked-by: Alan Liu <haoping.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>