kernel/linux.git/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c, branch v6.6.132

drm/amdgpu: trigger flr_work if reading pf2vf data failed

2025-05-22T12:12:14+00:00

[ Upstream commit ab66c832847fcdffc97d4591ba5547e3990d9d33 ]

If reading pf2vf data fails 30 times in a row, something is wrong, so trigger flr_work to recover. Also use dev_err to print the error message so that the affected device can be identified, and add a warning message if waiting for IDH_FLR_NOTIFICATION_CMPL times out.

Signed-off-by: Zhigang Luo
Acked-by: Hawking Zhang
Signed-off-by: Alex Deucher
Stable-dep-of: d0ce1aaa8531 ("Revert "drm/amd: Stop evicting resources on APUs in suspend"")
Signed-off-by: Sasha Levin
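The failure-counting behaviour described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: `struct vf_state`, `note_pf2vf_read()` and `request_flr()` are hypothetical names, and resetting the counter after a recovery request is an assumption. Only the threshold of 30 consecutive failures comes from the commit message.

```c
#include <stdbool.h>
#include <stdio.h>

/* Threshold taken from the commit message: 30 consecutive failures. */
#define PF2VF_MAX_CONSECUTIVE_FAILURES 30

/* Hypothetical per-device state; not the driver's real structure. */
struct vf_state {
	int id;                       /* used only to name the device in logs */
	unsigned int read_fail_count; /* consecutive failed pf2vf reads */
	unsigned int flr_requests;    /* how many recoveries were requested */
};

/* Hypothetical stand-in for queueing flr_work. */
static void request_flr(struct vf_state *vf)
{
	vf->flr_requests++;
}

/*
 * Record the outcome of one pf2vf read.
 * Returns true if this call requested a recovery.
 */
static bool note_pf2vf_read(struct vf_state *vf, bool read_ok)
{
	if (read_ok) {
		/* Any success breaks the run of consecutive failures. */
		vf->read_fail_count = 0;
		return false;
	}

	if (++vf->read_fail_count < PF2VF_MAX_CONSECUTIVE_FAILURES)
		return false;

	/* Name the device in the message, in the spirit of dev_err(). */
	fprintf(stderr, "device %d: pf2vf read failed %u times in a row\n",
		vf->id, vf->read_fail_count);

	/* Assumption: start counting afresh after asking for recovery. */
	vf->read_fail_count = 0;
	request_flr(vf);
	return true;
}
```

A caller would invoke `note_pf2vf_read()` once per periodic read attempt. Because a successful read clears the counter, isolated failures never add up to a recovery request.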

drm/amdgpu: Clean up errors in mxgpu_nv.c

2023-08-09T13:43:32+00:00

Fix the following errors reported by checkpatch:

ERROR: else should follow close brace '}'
ERROR: that open brace { should be on the previous line

Signed-off-by: Ran Sun
Signed-off-by: Alex Deucher
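For readers unfamiliar with these two checkpatch errors, the fragment below shows the brace placement that kernel coding style expects. The function `sign_of()` is a hypothetical example and is not taken from mxgpu_nv.c.

```c
/*
 * Hypothetical function illustrating kernel brace style:
 *  - the opening brace of a control statement sits on the same line
 *    as the statement ("that open brace { should be on the previous line")
 *  - "else" follows the closing brace on the same line
 *    ("else should follow close brace '}'")
 */
static int sign_of(int value)
{
	int sign;

	if (value < 0) {
		sign = -1;
	} else if (value == 0) {
		sign = 0;
	} else {
		sign = 1;
	}

	return sign;
}
```

The opening brace of the function itself stays on its own line; in kernel style the same-line rule applies to control statements, not to function definitions.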

drm/amdgpu: add RAS poison consumption handler for NV SRIOV

2022-12-15T17:18:19+00:00

Send handling request to host.

Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher

Revert "drm/amdgpu: let mode2 reset fallback to default when failure"

2022-10-19T02:08:33+00:00

This reverts commit dac6b80818ac2353631c5a33d140d8d5508e2957.

The AMDGPU_SKIP_MODE2_RESET flag is reverted because it conflicts with the original design of the reset handler. It will be redesigned.

Fixes: dac6b80818ac23 ("drm/amdgpu: let mode2 reset fallback to default when failure")
Signed-off-by: Victor Zhao
Reviewed-by: Lijo Lazar
Signed-off-by: Alex Deucher

drm/amdgpu: let mode2 reset fallback to default when failure

2022-08-16T22:14:31+00:00

- introduce AMDGPU_SKIP_MODE2_RESET flag
- let mode2 reset fallback to default reset method if failed

v2: move this part out from the asic specific part

Signed-off-by: Victor Zhao
Acked-by: Andrey Grodzovsky
Signed-off-by: Alex Deucher

drm/amdgpu: support reset flag set for gpu reset

2022-07-13T15:25:17+00:00

Move reset_context out of the GPU recover function so that it can be configured for different reset purposes. A reset requested through the gpu_recovery sysfs entry is forced to use the full reset method. Otherwise, soft reset is tried by default if the ASIC supports it, and full reset is used if the soft reset fails.

Signed-off-by: Likun Gao
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher
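The selection logic in this message can be illustrated with a small userspace sketch. Everything here is hypothetical (`struct reset_context`, `RESET_FLAG_NEED_FULL`, `struct fake_asic`, `recover()`); it only models the rule that a caller-supplied flag forces a full reset, while the default path tries a soft reset first and falls back to a full reset on failure.

```c
#include <stdbool.h>

enum reset_method {
	RESET_SOFT,
	RESET_FULL,
};

/* Hypothetical flag: the caller insists on a full reset. */
#define RESET_FLAG_NEED_FULL (1u << 0)

/* Hypothetical context built by the caller, not inside recover(). */
struct reset_context {
	unsigned int flags;
};

/* Hypothetical ASIC model used only to drive the example. */
struct fake_asic {
	bool supports_soft_reset;
	bool soft_reset_succeeds;
};

/* Returns the reset method that ultimately handled the recovery. */
static enum reset_method recover(const struct fake_asic *asic,
				 const struct reset_context *ctx)
{
	/* A forced full reset skips the soft reset attempt entirely. */
	if (ctx->flags & RESET_FLAG_NEED_FULL)
		return RESET_FULL;

	/* Default path: try soft reset if the ASIC supports it. */
	if (asic->supports_soft_reset && asic->soft_reset_succeeds)
		return RESET_SOFT;

	/* Soft reset unsupported or failed: fall back to full reset. */
	return RESET_FULL;
}
```

Building the context at the call site is what makes the behaviour configurable: a manual recovery request can set the flag, while an automatic recovery can leave it clear.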

drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover

2022-06-10T19:26:12+00:00

We removed the wrapper that queued the recover function into the reset domain queue, and that wrapper was the one using this name.

Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
Signed-off-by: Alex Deucher

drm/amdgpu: Move in_gpu_reset into reset_domain

2022-02-09T17:17:57+00:00

We should have a single instance per entire reset domain.

Signed-off-by: Andrey Grodzovsky
Suggested-by: Lijo Lazar
Reviewed-by: Christian König
Link: https://www.spinics.net/lists/amd-gfx/msg74116.html

drm/amdgpu: Move reset sem into reset_domain

2022-02-09T17:17:32+00:00

We want a single instance of the reset sem across all reset clients because, in the case of XGMI, we should stop cross-device MMIO access, since any of the devices could be in reset at that moment.

Signed-off-by: Andrey Grodzovsky
Reviewed-by: Christian König
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html

drm/amdgpu: Rework reset domain to be refcounted.

2022-02-09T17:17:09+00:00

The reset domain now contains the register access semaphore, so it must exist for as long as any device in a hive needs it and cannot be bound to the XGMI hive life cycle. Address this by making the reset domain refcounted and pointed to by each member of the hive and by the hive itself.

v4: Fix crash on boot with XGMI hive by adding a type to reset_domain. XGMI will only create a new reset_domain if the previous one was of single device type, meaning it is the first boot. Otherwise it will take a refcount on the existing reset_domain from the amdgpu device. Add a wrapper around reset_domain->refcount get/put and a wrapper around send to reset wq (Lijo)

Signed-off-by: Andrey Grodzovsky
Acked-by: Christian König
Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
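The lifetime rule described here, where the domain outlives the hive as long as any device still holds a reference, can be sketched in userspace C. The names below (`struct reset_domain`, `reset_domain_create()`, `reset_domain_get()`, `reset_domain_put()`, `live_domains`) are hypothetical, and the plain integer counter is a simplification: kernel code would normally use an atomic reference count rather than this single-threaded one.

```c
#include <stdlib.h>

/* Hypothetical domain types, mirroring the "type" mentioned in v4. */
enum domain_type {
	DOMAIN_SINGLE_DEVICE,
	DOMAIN_XGMI_HIVE,
};

/* Hypothetical, simplified reset domain. */
struct reset_domain {
	enum domain_type type;
	unsigned int refcount; /* not atomic: single-threaded sketch only */
};

/* Counts allocated domains so the example can observe frees. */
static int live_domains;

/* Allocate a domain; the creator holds the first reference. */
static struct reset_domain *reset_domain_create(enum domain_type type)
{
	struct reset_domain *domain = malloc(sizeof(*domain));

	if (!domain)
		return NULL;

	domain->type = type;
	domain->refcount = 1;
	live_domains++;
	return domain;
}

/* Wrapper around taking a reference. */
static struct reset_domain *reset_domain_get(struct reset_domain *domain)
{
	domain->refcount++;
	return domain;
}

/* Wrapper around dropping a reference; frees on the last put. */
static void reset_domain_put(struct reset_domain *domain)
{
	if (--domain->refcount == 0) {
		live_domains--;
		free(domain);
	}
}
```

In this model the hive and each member device hold one reference apiece, so dropping the hive's reference first leaves the domain, and anything stored in it, valid until the last device releases it.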