summaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2023-04-14drm/amdgpu: add gc v9_4_3 rlc_funcs implementationHawking Zhang3-0/+364
all the gc v9_4_3 registers fall in gc_rlcpdec address range have different relative offsets and base_idx from the ones defined in gc v9_0 ip headers. gc_v9_0_rlc_funcs can not be reused anymore for gc v9_4_3 v2: drop unused handshake function (Alex) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-14drm/i915/gt: Avoid out-of-bounds access when loading HuCLucas De Marchi1-4/+17
When HuC is loaded by GSC, there is no header definition for the kernel to look at and firmware is just handed to GSC. However when reading the version, it should still check the size of the blob to guarantee it's not incurring into out-of-bounds array access. If firmware is smaller than expected, the following message is now printed: # echo boom > /lib/firmware/i915/dg2_huc_gsc.bin # dmesg | grep -i huc [drm] GT0: HuC firmware i915/dg2_huc_gsc.bin: invalid size: 5 < 184 [drm] *ERROR* GT0: HuC firmware i915/dg2_huc_gsc.bin: fetch failed -ENODATA ... Even without this change the size, header and signature are still checked by GSC when loading, so this only avoids the out-of-bounds array access. Fixes: a7b516bd981f ("drm/i915/huc: Add fetch support for gsc-loaded HuC binary") Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230413200349.3492571-1-lucas.demarchi@intel.com
2023-04-14drm/ttm: revert "Reduce the number of used allocation orders for TTM pages"Christian König1-19/+11
This reverts commit 322458c2bb1a0398c5775333e1e71e1ece8a461f. PMD_SHIFT is not necessary constant on all architectures resulting in build failures. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/CAKMK7uHgUuqWJuqmZKrxi2mNiqExhmMif-naYnzUSj-puW-x+A@mail.gmail.com
2023-04-14drm/i915/color: Fix typo for Plane CSC indexesChaitanya Kumar Borah1-2/+2
Replace _PLANE_INPUT_CSC_RY_GY_2_* with _PLANE_CSC_RY_GY_2_* for Plane CSC Fixes: 6eba56f64d5d ("drm/i915/pxp: black pixels on pxp disabled") Cc: <stable@vger.kernel.org> Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Animesh Manna <animesh.manna@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230330150104.2923519-1-chaitanya.kumar.borah@intel.com (cherry picked from commit e39c76b2160bbd005587f978d29603ef790aefcd) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2023-04-14drm/sched: Check scheduler ready before calling timeout handlingVitaly Prosyak1-1/+2
During an IGT GPU reset test we see the following oops, [ +0.000003] ------------[ cut here ]------------ [ +0.000000] WARNING: CPU: 9 PID: 0 at kernel/workqueue.c:1656 __queue_delayed_work+0x6d/0xa0 [ +0.000004] Modules linked in: iptable_filter bpfilter amdgpu(OE) nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic intel_rapl_msr ledtrig_audio snd_hda_codec_hdmi intel_rapl_common snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core iommu_v2 gpu_sched(OE) kvm_amd drm_buddy snd_hwdep kvm video drm_ttm_helper snd_pcm ttm snd_seq_midi drm_display_helper snd_seq_midi_event snd_rawmidi cec crct10dif_pclmul ghash_clmulni_intel sha512_ssse3 snd_seq aesni_intel rc_core crypto_simd cryptd binfmt_misc drm_kms_helper rapl snd_seq_device input_leds joydev snd_timer i2c_algo_bit syscopyarea snd ccp sysfillrect sysimgblt wmi_bmof k10temp soundcore mac_hid sch_fq_codel msr parport_pc ppdev drm lp parport ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 hid_generic usbhid hid r8169 ahci xhci_pci gpio_amdpt realtek i2c_piix4 wmi crc32_pclmul xhci_pci_renesas libahci gpio_generic [ +0.000070] CPU: 9 PID: 0 Comm: swapper/9 Tainted: G W OE 6.1.11+ #2 [ +0.000003] Hardware name: Gigabyte Technology Co., Ltd. AB350-Gaming 3/AB350-Gaming 3-CF, BIOS F7 06/16/2017 [ +0.000001] RIP: 0010:__queue_delayed_work+0x6d/0xa0 [ +0.000003] Code: 7a 50 48 01 c1 48 89 4a 30 81 ff 00 20 00 00 75 38 4c 89 cf e8 64 3e 0a 00 5d e9 1e c5 11 01 e8 99 f7 ff ff 5d e9 13 c5 11 01 <0f> 0b eb c1 0f 0b 48 81 7a 38 70 5c 0e 81 74 9f 0f 0b 48 8b 42 28 [ +0.000002] RSP: 0018:ffffc90000398d60 EFLAGS: 00010007 [ +0.000002] RAX: ffff88810d589c60 RBX: 0000000000000000 RCX: 0000000000000000 [ +0.000002] RDX: ffff88810d589c58 RSI: 0000000000000000 RDI: 0000000000002000 [ +0.000001] RBP: ffffc90000398d60 R08: 0000000000000000 R09: ffff88810d589c78 [ +0.000002] R10: 72705f305f39765f R11: 7866673a6d72645b R12: ffff88810d589c58 [ +0.000001] R13: 0000000000002000 R14: 0000000000000000 R15: 0000000000000000 [ +0.000002] FS: 0000000000000000(0000) GS:ffff8887fee40000(0000) knlGS:0000000000000000 [ +0.000001] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000002] CR2: 00005562c4797fa0 CR3: 0000000110da0000 CR4: 00000000003506e0 [ +0.000002] Call Trace: [ +0.000001] <IRQ> [ +0.000001] mod_delayed_work_on+0x5e/0xa0 [ +0.000004] drm_sched_fault+0x23/0x30 [gpu_sched] [ +0.000007] gfx_v9_0_fault.isra.0+0xa6/0xd0 [amdgpu] [ +0.000258] gfx_v9_0_priv_reg_irq+0x29/0x40 [amdgpu] [ +0.000254] amdgpu_irq_dispatch+0x1ac/0x2b0 [amdgpu] [ +0.000243] amdgpu_ih_process+0x89/0x130 [amdgpu] [ +0.000245] amdgpu_irq_handler+0x24/0x60 [amdgpu] [ +0.000165] __handle_irq_event_percpu+0x4f/0x1a0 [ +0.000003] handle_irq_event_percpu+0x15/0x50 [ +0.000001] handle_irq_event+0x39/0x60 [ +0.000002] handle_edge_irq+0xa8/0x250 [ +0.000003] __common_interrupt+0x7b/0x150 [ +0.000002] common_interrupt+0xc1/0xe0 [ +0.000003] </IRQ> [ +0.000000] <TASK> [ +0.000001] asm_common_interrupt+0x27/0x40 [ +0.000002] RIP: 0010:native_safe_halt+0xb/0x10 [ +0.000003] Code: 46 ff ff ff cc cc cc cc cc cc cc cc cc cc cc eb 07 0f 00 2d 69 f2 5e 00 f4 e9 f1 3b 3e 00 90 eb 07 0f 00 2d 59 f2 5e 00 fb f4 <e9> e0 3b 3e 00 0f 1f 44 00 00 55 48 89 e5 53 e8 b1 d4 fe ff 66 90 [ +0.000002] RSP: 0018:ffffc9000018fdc8 EFLAGS: 00000246 [ +0.000002] RAX: 0000000000004000 RBX: 000000000002e5a8 RCX: 000000000000001f [ +0.000001] RDX: 0000000000000001 RSI: ffff888101298800 RDI: ffff888101298864 [ +0.000001] RBP: ffffc9000018fdd0 R08: 000000527f64bd8b R09: 000000000001dc90 [ +0.000001] R10: 000000000001dc90 R11: 0000000000000003 R12: 0000000000000001 [ +0.000001] R13: ffff888101298864 R14: ffffffff832d9e20 R15: ffff888193aa8c00 [ +0.000003] ? acpi_idle_do_entry+0x5e/0x70 [ +0.000002] acpi_idle_enter+0xd1/0x160 [ +0.000003] cpuidle_enter_state+0x9a/0x6e0 [ +0.000003] cpuidle_enter+0x2e/0x50 [ +0.000003] call_cpuidle+0x23/0x50 [ +0.000002] do_idle+0x1de/0x260 [ +0.000002] cpu_startup_entry+0x20/0x30 [ +0.000002] start_secondary+0x120/0x150 [ +0.000003] secondary_startup_64_no_verify+0xe5/0xeb [ +0.000004] </TASK> [ +0.000000] ---[ end trace 0000000000000000 ]--- [ +0.000003] BUG: kernel NULL pointer dereference, address: 0000000000000102 [ +0.006233] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring gfx_low timeout, signaled seq=3, emitted seq=4 [ +0.000734] #PF: supervisor read access in kernel mode [ +0.009670] [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process amd_deadlock pid 2002 thread amd_deadlock pid 2002 [ +0.005135] #PF: error_code(0x0000) - not-present page [ +0.000002] PGD 0 P4D 0 [ +0.000002] Oops: 0000 [#1] PREEMPT SMP NOPTI [ +0.000002] CPU: 9 PID: 0 Comm: swapper/9 Tainted: G W OE 6.1.11+ #2 [ +0.000002] Hardware name: Gigabyte Technology Co., Ltd. AB350-Gaming 3/AB350-Gaming 3-CF, BIOS F7 06/16/2017 [ +0.012101] amdgpu 0000:0c:00.0: amdgpu: GPU reset begin! [ +0.005136] RIP: 0010:__queue_work+0x1f/0x4e0 [ +0.000004] Code: 87 cd 11 01 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 49 89 d5 41 54 49 89 f4 53 48 83 ec 10 89 7d d4 <f6> 86 02 01 00 00 01 0f 85 6c 03 00 00 e8 7f 36 08 00 8b 45 d4 48 For gfx_rings the schedulers may not be initialized by amdgpu_device_init_schedulers() due to ring->no_scheduler flag being set to true and thus the timeout_wq is NULL. As a result, since all ASICs call drm_sched_fault() unconditionally even for schedulers which have not been initialized, it is simpler to use the ready condition which indicates whether the given scheduler worker thread runs and whether the timeout_wq of the reset domain has been initialized. Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20230406200054.633379-1-luben.tuikov@amd.com
2023-04-13Merge tag 'drm-misc-fixes-2023-04-13' of ↵Daniel Vetter6-3/+13
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Short summary of fixes pull: * armada: Fix double free * fb: Clear FB_ACTIVATE_KD_TEXT in ioctl * nouveau: Add missing callbacks * scheduler: Fix use-after-free error Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230413184233.GA8148@linux-uq9g
2023-04-13Merge tag 'drm-intel-next-fixes-2023-04-13' of ↵Daniel Vetter2-0/+20
git://anongit.freedesktop.org/drm/drm-intel into drm-next Just one Cc:stable fix for sampler indirect state in bindless heap. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZDfxo+PXyw9ivFLI@jlahtine-mobl.ger.corp.intel.com
2023-04-13Merge tag 'drm-intel-fixes-2023-04-13' of ↵Daniel Vetter1-4/+16
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for v6.3-rc7: - Fix dual link DSI for TGL+ Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/877cugckzu.fsf@intel.com
2023-04-13drm/amdgpu: drop temp programming for pagefault handlingHawking Zhang1-22/+0
Was introduced as workaround. not needed anymore Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: include protection for doorbell.hShashank Sharma1-0/+4
This patch adds double include protection for doorbell.h Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: rename num_doorbellsShashank Sharma3-15/+17
Rename doorbell.num_doorbells to doorbell.num_kernel_doorbells to make it more readable. Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian Koenig <christian.koenig@amd.com> Acked-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: switch to golden tsc registers for raven/raven2Jesse Zhang1-0/+40
Due to raven/raven2 maybe enable  sclk slow down, they cannot get clock count by the RLC at the auto level of dpm performance. So switch to golden tsc register. Suggested-by: shanshengwang <shansheng.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amd/pm: correct the pcie link state check for SMU13Evan Quan3-4/+10
Update the driver implementations to fit those data exposed by PMFW. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: add gfx v11_0_3 fed irq handling for sriovYiPeng Chai1-3/+11
Add gfx v11_0_3 fed irq handling for sriov. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: Rework retry fault removalMukul Joshi2-3/+34
Rework retry fault removal from the software filter by storing an expired timestamp for a fault that is being removed. When a new fault comes, and it matches an entry in the sw filter, it will be added as a new fault only when its timestamp is greater than the timestamp expiry of the fault in the sw filter. This helps in avoiding stale faults being added back into the filter and preventing legitimate faults from being handled. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: Enable IH retry CAM on GFX9Mukul Joshi7-49/+88
This patch enables the IH retry CAM on GFX9 series cards. This retry filter is used to prevent sending lots of retry interrupts in a short span of time and overflowing the IH ring buffer. This will also help reduce CPU interrupt workload. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amd/pm: remove unused num_of_active_display variableTom Rix1-7/+0
clang with W=1 reports drivers/gpu/drm/amd/amdgpu/../pm/swsmu/amdgpu_smu.c:1700:6: error: variable 'num_of_active_display' set but not used [-Werror,-Wunused-but-set-variable] int num_of_active_display = 0; ^ This variable is not used so remove it. Fixes: 75145aab7a0d ("drm/amdgpu/swsmu: clean up a bunch of stale interfaces") Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-13drm/amdgpu: simplify amdgpu_ras_eeprom.cAlex Deucher1-52/+20
All chips that support RAS also support IP discovery, so use the IP versions rather than a mix of IP versions and asic types. Checking the validity of the atom_ctx pointer is not required as the vbios is already fetched at this point. v2: add comments to id asic types based on feedback from Luben Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com>
2023-04-12drm/amd/pm: correct the pcie link state check for SMU13Evan Quan3-4/+10
Update the driver implementations to fit those data exposed by PMFW. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-04-12drm/amd/pm: correct SMU13.0.7 max shader clock reportingHoratio Zhang1-1/+60
Correct the max shader clock reporting on SMU 13.0.7. Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-04-12drm/amd/pm: correct SMU13.0.7 pstate profiling clock settingsHoratio Zhang1-7/+15
Correct the pstate standard/peak profiling mode clock settings for SMU13.0.7. Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.1.x
2023-04-12drm/amd/display: Pass the right info to drm_dp_remove_payloadWayne Lin1-7/+50
[Why & How] drm_dp_remove_payload() interface was changed. Correct amdgpu dm code to pass the right parameter to the drm helper function. Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com> Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12Merge tag 'drm-misc-next-2023-04-12' of ↵Daniel Vetter18-211/+209
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.4-rc1: Cross-subsystem Changes: - Convert MIPI DSIM bridge dt to yaml. Core Changes: - Fix UAF race in drm scheduler. Driver Changes: - Add primary plane positioning support to VKMS. - Convert omapdrm fbdev emulation to in-kernel client. - Assorted small fixes to vkms, vc4, nouveau, vmwgfx. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/b7c37d4e-8f16-85dc-0f5f-3bd98f961395@linux.intel.com
2023-04-12drm/vkms: Use drmm_mode_config_init()Maíra Canal1-1/+5
Use drmm_mode_config_init() instead of drm_mode_config_init(), as it allows us to assure that the resource will be properly cleaned. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Maíra Canal <mairacanal@riseup.net> Link: https://patchwork.freedesktop.org/patch/msgid/20230116205800.1266227-2-mcanal@igalia.com
2023-04-12drm/vkms: Use drmm_crtc_init_with_planes()Maíra Canal1-3/+2
Use drmm_crtc_init_with_planes() instead of drm_crtc_init_with_planes() to get rid of the explicit destroy hook in struct drm_crtc_funcs. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Maíra Canal <mairacanal@riseup.net> Link: https://patchwork.freedesktop.org/patch/msgid/20230116205800.1266227-1-mcanal@igalia.com
2023-04-12Merge remote-tracking branch 'drm/drm-fixes' into drm-misc-fixesMaarten Lankhorst70-333/+874
We were stuck on rc2, should at least attempt to track drm-fixes slightly. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
2023-04-12drm/i915: disable sampler indirect state in bindless heapLionel Landwerlin2-0/+20
By default the indirect state sampler data (border colors) are stored in the same heap as the SAMPLER_STATE structure. For userspace drivers that can be 2 different heaps (dynamic state heap & bindless sampler state heap). This means that border colors have to copied in 2 different places so that the same SAMPLER_STATE structure find the right data. This change is forcing the indirect state sampler data to only be in the dynamic state pool (more convenient for userspace drivers, they only have to have one copy of the border colors). This is reproducing the behavior of the Windows drivers. BSpec: 46052 Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230407093237.3296286-1-lionel.g.landwerlin@intel.com (cherry picked from commit 16fc9c08f0ec7b1c95f1ea4a16097acdb3fc943d) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2023-04-12drm/amdgpu: Enable GFX11 SDMA context empty interruptGraham Sider2-10/+22
Enable SDMA queue empty context switching. SDMA context switch due to quantum programming no longer done here (as of sdma v6), so re-name sdma_v6_0_ctx_switch_enable to sdma_v6_0_ctxempty_int_enable to reflect this. Also program SDMAx_QUEUEx_SCHEDULE_CNTL for context switch due to quantum in KFD. Set to amdgpu_sdma_phase_quantum (defaults to 32 i.e. 3200us). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdkfd: Check PCIe atomics support on GFX11 to set CP_HQD_HQ_STATUS0[29]Sreekant Somasekharan3-1/+16
CP_HQD_HQ_STATUS0[29] bit will be used by CPFW to acknowledge whether PCIe atomics are supported. The default value of this bit is set to 0. Driver will check whether PCIe atomics are supported and set the bit to 1 if supported. This will force CPFW to use real atomic ops. If the bit is not set, CPFW will default to read/modify/write using the firmware itself. This is applicable only to GFX11 RS64 CP with MEC FW >= 509. If MEC FW < 509 and for all GFX11 F32 CP, PCIe atomics needs to be supported else it will skip the device. This commit also involves moving amdgpu_amdkfd_device_probe() function call after per-IP early_init loop in amdgpu_device_ip_early_init() function so as to check for RS64 enabled device. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Graham Sider <Graham.Sider@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display: Add logging for DP link traning Test Pattern SeqeuncesSrinivasan Shanmugam1-0/+9
Add some more logging for DP link traning test pattern seqeunces for better debugging. Cc: Fangzhi Zuo <Jerry.Zuo@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: correct ras enabled flagStanley.Yang1-0/+7
XGMI RAS should be according to the gmc xgmi physical nodes number, XGMI RAS should not be enabled if xgmi num_physical_nodes is zero. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: fix unexpected block idStanley.Yang2-0/+6
Aldebaran supports VCN and JPEG RAS, it reports unexpected block id message during VCN and JPEG RAS initialization if VCN and JPEG block id not defined. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: use sdma_v6 single packet invalidationPierre-Eric Pelloux-Prayer1-1/+22
This achieves the same result as the sequence used in emit_flush_gpu_tlb but the invalidation is now a single packet instead of the 3 packets required to implement reg_write_reg_wait. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display : Log DP link training downspread infoSrinivasan Shanmugam1-6/+8
Update the existing log with DP LT downspread info: [Downstream devices shall support down spreading of the link clock. The down-spread amplitude shall either be disabled (0.0%) or up to 0.5%, as written by the upstream device to the DOWNSPREAD_CTRL register (DPCD 00107h). The modulation frequency range shall be 30 to 33 kHz] Besides, fix checkpatch warning: CHECK: Alignment should match open parenthesis Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display: remove unused matching_stream_ptrs variableTom Rix1-4/+1
clang with W=1 reports drivers/gpu/drm/amd/amdgpu/../display/dc/core/dc_link_enc_cfg.c:625:6: error: variable 'matching_stream_ptrs' set but not used [-Werror,-Wunused-but-set-variable] int matching_stream_ptrs = 0; ^ This variable is not used so remove it. Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display: set variables dml*_funcs storage-class-specifier to staticTom Rix1-12/+12
smatch reports drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:44:24: warning: symbol 'dml20_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:51:24: warning: symbol 'dml20v2_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:58:24: warning: symbol 'dml21_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:65:24: warning: symbol 'dml30_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:72:24: warning: symbol 'dml31_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:79:24: warning: symbol 'dml314_funcs' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dml/display_mode_lib.c:86:24: warning: symbol 'dml32_funcs' was not declared. Should it be static? These variables are only used in one file so should be static. Cleanup whitespace, use tabs consistently for indents. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display: set variables aperture_default_system and ↵Tom Rix1-2/+2
context0_default_system storage-class-specifier to static smatch reports drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hubp.c:758:10: warning: symbol 'aperture_default_system' was not declared. Should it be static? drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_hubp.c:759:10: warning: symbol 'context0_default_system' was not declared. Should it be static? These variables are only used in one file so should be static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/display: set variable dcn3_14_soc storage-class-specifier to staticTom Rix1-1/+1
smatch reports drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn314/dcn314_fpu.c:100:37: warning: symbol 'dcn3_14_soc' was not declared. Should it be static? This variable is only used in one file so should be static. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: Fix warningsLijo Lazar1-1/+1
Fix below warning due to incompatible types in conditional operator ../pm/swsmu/smu13/smu_v13_0_6_ppt.c:315:17: sparse: sparse: incompatible types in conditional expression (different base types): Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Link: https://lore.kernel.org/oe-kbuild-all/202303082135.NjdX1Bij-lkp@intel.com/ Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/pm: correct SMU13.0.7 max shader clock reportingHoratio Zhang1-1/+60
Correct the max shader clock reporting on SMU 13.0.7. Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/pm: correct SMU13.0.7 pstate profiling clock settingsHoratio Zhang1-7/+15
Correct the pstate standard/peak profiling mode clock settings for SMU13.0.7. Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: refine get gpu clock counter methodTong Liu011-2/+15
[why] regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER are protected and unaccessible under sriov. The clock counter high bit may update during reading process. [How] Replace regGOLDEN_TSC_COUNT_LOWER/regGOLDEN_TSC_COUNT_UPPER with regCP_MES_MTIME_LO/regCP_MES_MTIME_HI to get gpu clock under sriov. Refine get gpu clock counter method to make the result more precise. Signed-off-by: Tong Liu01 <Tong.Liu01@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: optimize redundant code in umc_v6_7YiPeng Chai1-91/+70
Optimize redundant code in umc_v6_7. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: Fix sdma v4 sw fini errorlyndonli1-1/+1
Fix sdma v4 sw fini error for sdma 4.2.2 to solve the following general protection fault [ +0.108196] general protection fault, probably for non-canonical address 0xd5e5a4ae79d24a32: 0000 [#1] PREEMPT SMP PTI [ +0.000018] RIP: 0010:free_fw_priv+0xd/0x70 [ +0.000022] Call Trace: [ +0.000012] <TASK> [ +0.000011] release_firmware+0x55/0x80 [ +0.000021] amdgpu_ucode_release+0x11/0x20 [amdgpu] [ +0.000415] amdgpu_sdma_destroy_inst_ctx+0x4f/0x90 [amdgpu] [ +0.000360] sdma_v4_0_sw_fini+0xce/0x110 [amdgpu] Signed-off-by: lyndonli <Lyndon.Li@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: DROP redundant drm_prime_sg_to_dma_addr_arrayShane Xiao1-3/+0
For DMA-MAP userptr on other GPUs, the dma address array will be populated in amdgpu_ttm_backend_bind. Remove the redundant call from the driver. v2: update the comment Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: optimize redundant code in umc_v8_10YiPeng Chai3-120/+115
Optimize redundant code in umc_v8_10 Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12amd/amdgpu: Inherit coherence flags base on original BO flagsShane Xiao1-4/+8
For SG BO to DMA-map userptrs on other GPUs, the SG BO need inherit MTYPEs in PTEs from original BO. If we set the flags, the device can be coherent with the CPUs and other GPUs. v2: 1. Drop unnecessary flags check 2. Remove local variable align Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/pm: Fix incorrect comment about Vangogh power cap supportGuilherme G. Piccoli1-2/+2
The comment mentions that power1 cap attributes are not supported on Vangogh, but the opposite is indeed valid: for APUs, only Vangogh is supported. While at it, also fixed the Renoir comment below (thanks Melissa for noticing that!). Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Melissa Wen <mwen@igalia.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amdgpu: Add userptr bo support for mGPUs when iommu is onShane Xiao1-4/+23
For userptr bo with iommu on, multiple GPUs use same system memory dma mapping address when both adev and bo_adev are in identity mode or in the same iommu group. If RAM direct map to one GPU, other GPUs can share the original BO in order to reduce dma address array usage when RAM can direct map to these GPUs. However, we should explicit check whether RAM can direct map to all these GPUs. This patch fixes a potential issue that where RAM is direct mapped on some but not all GPUs. v2: 1. Update comment 2. Add helper function reuse_dmamap Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-04-12drm/amd/amdgpu: Drop the hang limit parameterSrinivasan Shanmugam3-10/+1
The driver doesn't resubmit jobs on hangs any more, hence drop the hang limit parameter - amdgpu_job_hang_limit, wherever it is used. Suggested-by: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Kent Russell <kent.russell@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>