author: Yunxiang Li <Yunxiang.Li@amd.com> 2024-04-22 22:04:52 +0300
committer: Alex Deucher <alexander.deucher@amd.com> 2024-05-02 22:41:05 +0300
commit: 6e4aa08fa9c6c0c027fc86f242517c925d159393
tree: 4ddddca78ac62e823b8852fef3c32a41a56483dc /include
parent: a5b843269a8f664df85948ec41db1dbcbc2a2d8b
drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic
The retry loop for SRIOV reset has refcount and memory leak issues. Depending on which function call fails, it can call amdgpu_amdkfd_pre/post_reset a different number of times, which leaves the kfd_locked count wrong. This blocks all future attempts at opening /dev/kfd. The retry loop also leaks resources by calling amdgpu_virt_init_data_exchange multiple times without calling the corresponding fini function.

Align with the bare-metal reset path, which doesn't have these issues. This means taking the amdgpu_amdkfd_pre/post_reset functions out of the reset loop and calling amdgpu_device_pre_asic_reset on each retry, which properly frees the resources from the previous try by calling amdgpu_virt_fini_data_exchange.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
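
The loop shape described in the message can be illustrated with a small user-space model. This is only a sketch, not the driver code: struct fake_dev and every function below are hypothetical stand-ins that count calls, and the choice to run the post-reset step once regardless of the outcome is an assumption made to keep the model simple.

```c
#include <stdbool.h>

/* Hypothetical model of the device state; not the real amdgpu structures. */
struct fake_dev {
    int kfd_locked;     /* models the kfd_locked count */
    int exchange_live;  /* data-exchange setups not yet torn down */
    int failures_left;  /* how many reset attempts should fail */
};

/* Stand-ins for amdgpu_amdkfd_pre/post_reset: must stay balanced. */
static void model_kfd_pre_reset(struct fake_dev *d)  { d->kfd_locked++; }
static void model_kfd_post_reset(struct fake_dev *d) { d->kfd_locked--; }

/* Stand-ins for amdgpu_virt_init/fini_data_exchange. */
static void model_init_data_exchange(struct fake_dev *d) { d->exchange_live++; }
static void model_fini_data_exchange(struct fake_dev *d)
{
    if (d->exchange_live > 0)
        d->exchange_live--;
}

/* Stand-in for amdgpu_device_pre_asic_reset: frees the previous try's state. */
static void model_pre_asic_reset(struct fake_dev *d)
{
    model_fini_data_exchange(d);
}

/* One reset attempt: sets up the data exchange, then succeeds or fails. */
static bool model_try_reset(struct fake_dev *d)
{
    model_init_data_exchange(d);
    if (d->failures_left > 0) {
        d->failures_left--;
        return false;
    }
    return true;
}

/* The pre/post pair sits outside the loop; cleanup runs on every attempt. */
static bool model_reset_with_retry(struct fake_dev *d, int max_retries)
{
    bool ok = false;

    model_kfd_pre_reset(d);              /* exactly once */
    for (int i = 0; i <= max_retries && !ok; i++) {
        model_pre_asic_reset(d);         /* releases the previous try's setup */
        ok = model_try_reset(d);
    }
    model_kfd_post_reset(d);             /* exactly once (assumed: on failure too) */

    return ok;
}
```

In this model, however many attempts fail, kfd_locked returns to its starting value and at most one data-exchange setup is outstanding afterwards, which is the invariant the commit message says the old loop violated.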
Diffstat (limited to 'include')
0 files changed, 0 insertions, 0 deletions