summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorYifan Zha <Yifan.Zha@amd.com>2025-03-05 08:14:55 +0300
committerAlex Deucher <alexander.deucher@amd.com>2025-03-11 19:36:52 +0300
commit42c854b8fb0cce512534aa2b7141948e80c6ebb0 (patch)
tree543b4294578eb9a4029d6dafdbd4a5f6adb72aeb
parent760632fa2e3dbb13a9b55acbb960592628d274dd (diff)
downloadlinux-42c854b8fb0cce512534aa2b7141948e80c6ebb0.tar.xz
drm/amd/amdkfd: Evict all queues even HWS remove queue failed
[Why] If reset is detected and kfd need to evict working queues, HWS moving queue will be failed. Then remaining queues are not evicted and in active state. After reset done, kfd uses HWS to termination remaining activated queues but HWS is resetted. So remove queue will be failed again. [How] Keep removing all queues even if HWS returns failed. It will not affect cpsch as it checks reset_domain->sem. v2: If any queue failed, evict queue returns error. v3: Declare err inside the if-block. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
-rw-r--r--drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c8
1 files changed, 5 insertions, 3 deletions
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 885e0e9cf21b..2ed003d3ff0e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -1221,11 +1221,13 @@ static int evict_process_queues_cpsch(struct device_queue_manager *dqm,
decrement_queue_count(dqm, qpd, q);
if (dqm->dev->kfd->shared_resources.enable_mes) {
- retval = remove_queue_mes(dqm, q, qpd);
- if (retval) {
+ int err;
+
+ err = remove_queue_mes(dqm, q, qpd);
+ if (err) {
dev_err(dev, "Failed to evict queue %d\n",
q->properties.queue_id);
- goto out;
+ retval = err;
}
}
}