add HDP_SD support on gc 12.0.0/1
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 61cffacb3a1c590b15c0e9ff987de02d293e0dd8)
|
|
kmd_fw_shared changed in VCN5
Signed-off-by: Yinjie Yao <yinjie.yao@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa02486fb18cecbaca0c4fd393d1a03f1d4c3f9a)
|
|
Add JPEG IB command parser to ensure registers
in the command are within the JPEG IP block.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a7f670d5d8e77b092404ca8a35bb0f8f89ed3117)
Cc: stable@vger.kernel.org
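The check described in this commit can be pictured as a range test over every register write in the IB. The following is only a minimal sketch: the register window, the `ex_ib_write` layout, and the function name are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register window; the log does not give the real JPEG range. */
#define EX_JPEG_REG_FIRST 0x1000u
#define EX_JPEG_REG_LAST  0x1fffu

/* Hypothetical decoded IB entry: one register write. */
struct ex_ib_write {
	uint32_t reg;
	uint32_t val;
};

/* Return true only if every register write targets the allowed window. */
static bool ex_jpeg_ib_is_valid(const struct ex_ib_write *w, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (w[i].reg < EX_JPEG_REG_FIRST || w[i].reg > EX_JPEG_REG_LAST)
			return false;
	}
	return true;
}
```

A submission would be rejected as a whole as soon as one write falls outside the window, rather than filtering individual writes.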
|
|
Use mes pipe to unmap kcq and kgq.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f7fb9d677faf0460131bc2af15afd766d48a1f47)
|
|
Free memory for two pipes and unmap pipe0 via pipe1.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 98cae695a8ae0e4291b1fa7feef9b54fabefe885)
|
|
Configure two pipes with different hardware resources.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ea5d6db17a8e3635ad91e8c53faa1fdc9570fbbb)
|
|
Adjust mes12 sw/hw initialization for both pipe0 and pipe1
enablement. The two pipes are almost identical. Pipe0
behaves like schq and pipe1 like kiq; pipe0 is mapped by pipe1.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa539da8aff07ab08def6490e8c9b441439e70ba)
|
|
Add mes pipe switch to let caller choose pipe
to submit packet.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit b2dee0837a4be63e8d3e00550a9f057644f962c4)
|
|
Enable unified mes firmware to load on pipe0 and pipe1.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit e69c2dd7534f3fcabf7bb801db2a7ac71e7e5da6)
|
|
Add multiple mes ring instances in mes structure to support
multiple mes pipes.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c7d4355648ffa02a1551495b05c71ea6c884d29c)
|
|
Missing validation ...
Checked libdrm and it clears all the structs, so we should be
safe to just check everything.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c6b86421f1f9ddf9d706f2453159813ee39d0cf9)
Cc: stable@vger.kernel.org
|
|
This needs to be set as well if the IB uses atomics.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c6c2e8b6a427d4fecc7c36cffccb908185afcab2)
Cc: stable@vger.kernel.org
|
|
This needs to be set as well if the IB uses atomics.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 35c628774e50b3784c59e8ca7973f03bcb067132)
Cc: stable@vger.kernel.org
|
|
Wait until there is enough room in the ring before writing mes packets
to avoid ring buffer overflow.
v2: squash in sched_hw_submission fix
Fixes: de3246254156 ("drm/amdgpu: cleanup MES11 command submission")
Fixes: fffe347e1478 ("drm/amdgpu: cleanup MES12 command submission")
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 34e087e8920e635c62e2ed6a758b0cd27f836d13)
Cc: stable@vger.kernel.org
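The idea of waiting for room before writing can be sketched with a generic ring buffer. This is an illustration only, assuming a power-of-two ring size in dwords, masked read/write pointers, and one slot kept unused; the struct, the poll callback, and all names are hypothetical and do not mirror the MES code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring state; size_dw must be a power of two. */
struct ex_ring {
	uint32_t rptr;    /* consumer position, in dwords */
	uint32_t wptr;    /* producer position, in dwords */
	uint32_t size_dw; /* ring size, in dwords */
};

typedef void (*ex_poll_fn)(struct ex_ring *ring);

/* Free dwords, keeping one slot unused so full and empty differ. */
static uint32_t ex_ring_free_dw(const struct ex_ring *ring)
{
	return (ring->rptr - ring->wptr - 1u) & (ring->size_dw - 1u);
}

/* Poll until ndw dwords are free; return 0 on success, -1 on timeout. */
static int ex_ring_wait_room(struct ex_ring *ring, uint32_t ndw,
			     unsigned int max_polls, ex_poll_fn poll)
{
	for (unsigned int i = 0; i <= max_polls; i++) {
		if (ex_ring_free_dw(ring) >= ndw)
			return 0;
		if (poll)
			poll(ring);
	}
	return -1;
}

/* Stand-in for the consumer: retires one dword per poll. */
static void ex_fake_consume(struct ex_ring *ring)
{
	ring->rptr = (ring->rptr + 1u) & (ring->size_dw - 1u);
}
```

A writer would call `ex_ring_wait_room()` with the packet size before copying the packet in, and fail the submission if it times out.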
|
|
We require the AMDGPU_GEM_CREATE_GFX12_DCC flag, or some other
kernel-level GFX12 DCC flag, to differentiate DCC buffers from other
pinned display buffers (which have TTM_PL_FLAG_CONTIGUOUS enabled).
If we use the TTM_PL_FLAG_CONTIGUOUS flag for DCC buffers, we may
unnecessarily over-allocate for all pinned display buffers, which leads
to memory allocation failures.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 46142cc1b9272d664e0258e105b537735bfeeccc)
|
|
Correct sdma7 max dw to 8.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 86598c3819fdc70e59d28221bfa7bc36e9f5777e)
|
|
Add address alignment support to the DCC VRAM buffers.
v2:
- adjust size based on the max_texture_channel_caches values
only for GFX12 DCC buffers.
- used AMDGPU_GEM_CREATE_GFX12_DCC flag to apply change only
for DCC buffers.
- roundup non power of two DCC buffer adjusted size to nearest
power of two number as the buddy allocator does not support non
power of two alignments. This applies only to the contiguous
DCC buffers.
v3:(Alex)
- rewrite the max texture channel caches comparison code in an
algorithmic way to determine the alignment size.
v4:(Alex)
- Move the logic from amdgpu_vram_mgr_dcc_alignment() to gmc_v12_0.c
and add a new gmc func callback for dcc alignment. If the callback
is non-NULL, call it to get the alignment, otherwise, use the default.
v5:(Alex)
- Set the Alignment to a default value if the callback doesn't exist.
- Add the callback to amdgpu_gmc_funcs.
v6:
- Fix checkpatch warning reported by Intel CI.
v7:(Christian)
- remove the AMDGPU_GEM_CREATE_GFX12_DCC flag and keep a flag that
checks the BO pinning and for a specific hw generation.
v8:(Christian)
- move this check into gmc_v12_0_get_dcc_alignment.
v9:
- Fix 32bit build errors
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit aa94b623cb9233b91ed342dd87ecd62e56ff4938)
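Two points from the changelog above lend themselves to a small sketch: rounding a non-power-of-two value up to a power of two (v2), and using an optional callback with a default fallback (v4/v5). The code below is an assumption-laden illustration; the struct, the function names, and the 768 KiB example value are hypothetical and are not the amdgpu_gmc_funcs definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback table, loosely modeled on a "gmc func callback". */
struct ex_gmc_funcs {
	uint64_t (*get_dcc_alignment)(void);
};

/* Round v up to the next power of two; returns 0 if that would overflow. */
static uint64_t ex_roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	if (v > (UINT64_C(1) << 63))
		return 0;
	while (p < v)
		p <<= 1;
	return p;
}

/* Use the callback when present, otherwise fall back to the default. */
static uint64_t ex_dcc_alignment(const struct ex_gmc_funcs *funcs,
				 uint64_t default_align)
{
	if (funcs && funcs->get_dcc_alignment)
		return ex_roundup_pow2(funcs->get_dcc_alignment());
	return default_align;
}

/* Example callback returning a non-power-of-two value (768 KiB). */
static uint64_t ex_example_dcc_alignment(void)
{
	return UINT64_C(768) * 1024;
}
```

With the example callback installed, the 768 KiB value is rounded up to 1 MiB, which is the kind of alignment a buddy allocator can satisfy.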
|
|
Without setting the cpv bit and the 7th ib dw, non-dcc buffer copies will have
random corruption.
So set the cpv bit and clear the 7th ib dw when copying non-dcc buffers.
Signed-off-by: Frank Min <Frank.Min@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5aacf8917fde5bc2a640f3cd49130c0e2e85e726)
|
|
As we discussed before[1], soft recovery should be
forwarded to userspace, or we can get into a really
bad state where apps will keep submitting hanging
command buffers cascading us to a hard reset.
1: https://lore.kernel.org/all/bf23d5ed-9a6b-43e7-84ee-8cbfd0d60f18@froggi.es/
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 434967aadbbbe3ad9103cc29e9a327de20fdba01)
Cc: stable@vger.kernel.org
|
|
Adding Manual GDB golden setting for gc v12
revision 0 ASIC.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c9875d0a789060facc274dee0d4eb6500d471772)
|
|
MMHUB v4.1.0 only supports fixed cache mode, so
only use legacy invalidation accordingly.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9192c7613ca53572908ba23a4c3f39c7f8ba8021)
|
|
MES firmware requires larger log buffer for gfx12. Allocate
proper buffer respectively for gfx11 and gfx12.
Signed-off-by: Michael Chen <michael.chen@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 739d0f3e1f36738d4cd84166784a8f7a58d69612)
|
|
Otherwise we won't get correct access to the IB.
v2: keep setting AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS to avoid problems in
the VRAM backend.
Signed-off-by: Christian König <christian.koenig@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3501
Fixes: e362b7c8f8c7 ("drm/amdgpu: Modify the contiguous flags behaviour")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Tested-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit fbfb5f0342253d92c4e446588c428a9d90c3f610)
|
|
[Why]
The page table of a compute VM in VRAM will be lost after a gpu reset.
VRAM won't be restored since the compute VM has no shadows.
[How]
Use the higher 32 bits of vm->generation to record a vram_lost_counter.
Reset the VM state machine when vm->generation is not equal to
the new generation token.
v2: Check vm->generation instead of calling drm_sched_entity_error
in amdgpu_vm_validate.
v3: Use new generation token instead of vram_lost_counter for check.
Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 47c0388b0589cb481c294dcb857d25a214c46eb3)
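The token scheme in [How] can be sketched as packing a counter into the upper half of a 64-bit value and comparing tokens. This is only an illustration: the function names and the meaning of the lower 32 bits are assumptions, not the driver's actual layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a 64-bit token: upper 32 bits carry the VRAM-lost counter. */
static uint64_t ex_generation_token(uint32_t vram_lost_counter,
				    uint32_t low_bits)
{
	return ((uint64_t)vram_lost_counter << 32) | low_bits;
}

/* The VM state must be reset whenever the stored token is stale. */
static bool ex_vm_needs_reset(uint64_t vm_generation, uint64_t new_token)
{
	return vm_generation != new_token;
}
```

Because the counter sits in the upper bits, a VRAM loss changes the token even when the lower 32 bits are unchanged.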
|
|
To prevent below probe failure, add a check for models with VCN
IP v4.0.6 where VCN1 may be harvested.
v2:
Apply the same check to VCN IP v4.0 and v5.0.
[ 54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[ 54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43 c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02 00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00
[ 54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286
[ 54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX: 0000000000000000
[ 54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI: ffff99a82f680000
[ 54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09: 0000000000000000
[ 54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12: 0000000000000000
[ 54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15: 0000000000000001
[ 54.075400] FS: 00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000) knlGS:0000000000000000
[ 54.075988] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4: 0000000000750ef0
[ 54.076927] PKRU: 55555554
[ 54.077132] Call Trace:
[ 54.077319] <TASK>
[ 54.077484] ? show_regs+0x69/0x80
[ 54.077747] ? __die+0x28/0x70
[ 54.077979] ? page_fault_oops+0x180/0x4b0
[ 54.078286] ? do_user_addr_fault+0x2d2/0x680
[ 54.078610] ? exc_page_fault+0x84/0x190
[ 54.078910] ? asm_exc_page_fault+0x2b/0x30
[ 54.079224] ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu]
[ 54.079941] ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu]
[ 54.080617] vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu]
[ 54.081316] amdgpu_device_ip_set_powergating_state+0x64/0xc0 [amdgpu]
[ 54.082057] amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu]
[ 54.082727] amdgpu_ring_alloc+0x44/0x70 [amdgpu]
[ 54.083351] amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu]
[ 54.084054] amdgpu_ring_test_helper+0x22/0x90 [amdgpu]
[ 54.084698] vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu]
[ 54.085307] amdgpu_device_init+0x1f96/0x2780 [amdgpu]
[ 54.085951] amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu]
[ 54.086591] amdgpu_pci_probe+0x19f/0x550 [amdgpu]
[ 54.087215] local_pci_probe+0x48/0xa0
[ 54.087509] pci_device_probe+0xc9/0x250
[ 54.087812] really_probe+0x1a4/0x3f0
[ 54.088101] __driver_probe_device+0x7d/0x170
[ 54.088443] driver_probe_device+0x24/0xa0
[ 54.088765] __driver_attach+0xdd/0x1d0
[ 54.089068] ? __pfx___driver_attach+0x10/0x10
[ 54.089417] bus_for_each_dev+0x8e/0xe0
[ 54.089718] driver_attach+0x22/0x30
[ 54.090000] bus_add_driver+0x120/0x220
[ 54.090303] driver_register+0x62/0x120
[ 54.090606] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
[ 54.091255] __pci_register_driver+0x62/0x70
[ 54.091593] amdgpu_init+0x67/0xff0 [amdgpu]
[ 54.092190] do_one_initcall+0x5f/0x330
[ 54.092495] do_init_module+0x68/0x240
[ 54.092794] load_module+0x201c/0x2110
[ 54.093093] init_module_from_file+0x97/0xd0
[ 54.093428] ? init_module_from_file+0x97/0xd0
[ 54.093777] idempotent_init_module+0x11c/0x2a0
[ 54.094134] __x64_sys_finit_module+0x64/0xc0
[ 54.094476] do_syscall_64+0x58/0x120
[ 54.094767] entry_SYSCALL_64_after_hwframe+0x6e/0x76
Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit 0b071245ddd98539d4f7493bdd188417fcf2d629)
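The kind of check this commit adds can be sketched as skipping any instance whose bit is set in a harvest mask before touching it. The names and the mask convention (bit i set means instance i is harvested) are assumptions made for illustration, not the driver's fields.

```c
#include <stdint.h>

/* Return a bitmask of instances that are safe to start.
 * Assumed convention: bit i set in harvest_mask = instance i is harvested. */
static uint32_t ex_vcn_usable_instances(unsigned int num_inst,
					uint32_t harvest_mask)
{
	uint32_t usable = 0;

	for (unsigned int i = 0; i < num_inst && i < 32; i++) {
		if (harvest_mask & (UINT32_C(1) << i))
			continue; /* harvested: never touch this instance */
		usable |= UINT32_C(1) << i;
	}
	return usable;
}
```

With two instances and VCN1 harvested (mask 0x2), only instance 0 would be started, so no code path dereferences state belonging to the missing instance.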
|
|
The eeprom table is empty before initialization, so
set the eeprom table version first before initializing.
Changed from V1:
Reuse amdgpu_ras_set_eeprom_table_version function
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 015b8a2fdf39a4c288ff24e7b715b8d9198e56dc)
|
|
The ras command shared memory is allocated from
VRAM and the response status of the command
buffer will not be zero due to gpu being in
fatal error state after ras UE error injection.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8284951a6e79c6806c675e5f68a4cd425dd56bc4)
|
|
For VCN/JPEG 4.0.3, use only the local addressing scheme.
- Mask bits higher than the AID0 range
v2:
Keep the existing case where mmhub uses the master XCC.
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit caaf576292f8ccef5cdc0ac16e77b87dbf6e17ab)
|
|
VCN 4.0.3 does not support HDP flush with RRMT enabled. Instead, mmsch
will do the HDP flush.
This change is necessary for VCN v4.0.3; there is no need for backward compatibility.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 49cfaebe48e97500a68d5322a8194736b0a2c3cf)
|
|
JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead,
mmsch fw will do the flush.
This change is necessary for JPEG v4.0.3; there is no need for backward compatibility.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 585e3fdb36f59c5cfed0ae06c852dc1df22b1d60)
|
|
Return 0 to avoid returning an uninitialized variable r.
Cc: stable@vger.kernel.org
Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10")
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 6472de66c0aa18d50a4b5ca85f8272e88a737676)
|
|
If PCIe supports atomics, configure register to prevent DF from
breaking atomics in separate load/store operations.
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 666f14cab21b17ccc1bdfe1e82458aa429b3b7e0)
|
|
We seem to have a case where SDMA will sometimes miss a doorbell
if GFX is entering the powergating state when the doorbell comes in.
To work around this, we can update the wptr via MMIO; however,
this is only safe because we disallow gfxoff in begin_ring() for
SDMA 5.2 and then allow it again in end_ring().
Enable this workaround while we are root causing the issue with
the HW team.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440
Tested-by: Friedrich Vock <friedrich.vock@gmx.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
(cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700)
|
|
Add mutex to protect ras shared memory.
v2:
Add TA_RAS_COMMAND__TRIGGER_ERROR command call
status check.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For unified queue, DPG pause for encoding is done inside VCN firmware,
so there is no need to pause dpg based on ring type in kernel.
For VCN3 and below, pausing DPG for encoding in kernel is still needed.
v2: add more comments
v3: update commit message
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Determine whether VCN is using a unified queue in sw_init, instead of calling
functions later on.
v2: fix coding style
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Increase the KMS minor version to indicate GFX12 DCC support since this
contains a major change in how DCC is managed across IPs like GFX, DCN
etc. This will be used mainly by userspace like Mesa to figure out
DCC support on GFX12 hardware.
v2: fix version number (Alex)
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fixes the indexing of the string array.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add new packet.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable it by default.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The problem case is as follows:
1. GPU A triggers a gpu ras reset, and GPU A drives
GPU B to also perform a gpu ras reset.
2. After GPU B's ras reset started, GPU B queried DE
data. Since the DE data was queried in the ras reset
thread instead of the page retirement thread, bad
page retirement work would not be triggered. Then
even if all gpu resets are completed, the bad pages
will be cached in RAM until GPU B's bad page retirement
work is triggered again and then saved to eeprom.
This patch can save the bad pages to eeprom in time after gpu
ras reset is completed.
v2:
1. Add the above description to code comments.
2. Reuse existing function.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Before uninstalling gpu driver, flush all cached ras
bad pages to eeprom.
v2:
Put the same code into a function and reuse the function.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is currently only one GFX ME, but this could change in
future SoCs. Use the number of GFX MEs as the starting
ME index for compute on GFX12.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is currently only one GFX ME, but this could change in
future SoCs. Use the number of GFX MEs as the starting
ME index for compute on GFX11.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use the dev_info/err variants so we get per device logging
in multi-GPU cases.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is currently only one GFX ME, but this could change in
future SoCs. Use the number of GFX MEs as the starting
ME index for compute on GFX10.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For SOCs with GFX v9.4.3, a VF may have multiple compute partitions.
Fetch the partition information during init and initialize partition
nodes. There is no support to switch partition mode in VF mode, hence
disable the same.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
sdma has 2 instances in SRIOV cpx mode. Odd numbered VFs have
sdma0/sdma1 instances. Even numbered VFs have sdma2/sdma3. For
even numbered VFs, sdma2 & sdma3 (irq source id
CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1.
Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
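The mapping described above can be sketched as folding the client index onto a two-entry sequence range. The enum, the flag, and the function name below are hypothetical stand-ins; they only illustrate the arithmetic, not the driver's interrupt handling.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SDMA interrupt client ids. */
enum ex_sdma_client {
	EX_CLIENT_SDMA0 = 0,
	EX_CLIENT_SDMA1,
	EX_CLIENT_SDMA2,
	EX_CLIENT_SDMA3,
};

/* Map a client id to an irq sequence number.
 * Assumption: in VF cpx mode only two instances are visible, so
 * SDMA2/SDMA3 act as the VF's local instances 0/1. */
static int ex_sdma_irq_seq(enum ex_sdma_client client, bool vf_cpx_mode)
{
	int seq = (int)client;

	if (vf_cpx_mode && seq >= 2)
		seq -= 2;
	return seq;
}
```

Outside of that mode the client index is used unchanged, so the sketch leaves the four-instance case untouched.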
|
|
To avoid reading the wrong WPTR from the doorbell in an sriov vf, set
CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD.
Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
add amdgpu ras 'event_state' sysfs device attribute support
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|