kernel/linux.git/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h, branch v6.12.80

drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT

2024-08-13T16:12:51+00:00

On MI300/MI308 UBB products, a mode1 reset requires a GPU to wait until all 8 GPUs have finished their mode1 reset before it re-initializes. As observed, the GPU that triggered the reset sometimes needs to wait 15s for all GPUs to finish. If the message poll times out, the guest driver sends the reset message again, which may disrupt the subsequent re-init sequence on the other GPUs. So extend the timeout to cover the maximum time needed to recover.

Signed-off-by: Victor Zhao
Acked-by: Alex Deucher
Signed-off-by: Alex Deucher
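The effect described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `reset_requests_sent` is a hypothetical helper, not a driver function, and it assumes the guest re-sends the reset request each time one poll window expires without a completion message.

```c
/*
 * Toy model (hypothetical, not driver code): count the reset requests a
 * guest sends when it re-sends after every expired poll window.
 *
 * timeout_ms: length of one poll window
 * done_ms:    time at which the host reports that all GPUs finished
 */
static unsigned int reset_requests_sent(unsigned int timeout_ms,
                                        unsigned int done_ms)
{
    unsigned int sent;
    unsigned int waited;

    if (timeout_ms == 0)
        return 0; /* no valid poll window, nothing is modelled */

    sent = 1;            /* the initial reset request */
    waited = timeout_ms; /* end of the first poll window */

    /* Each window that ends before completion triggers another request. */
    while (waited < done_ms) {
        sent++;
        waited += timeout_ms;
    }
    return sent;
}
```

Under this model, a window shorter than the 15s recovery time produces repeated requests, while a window that covers the full recovery time produces exactly one.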

drm/amdgpu: process RAS fatal error MB notification

2024-06-27T21:31:37+00:00

In a RAS error scenario, the VF guest driver checks the mailbox and sets the fed flag to avoid unnecessary HW accesses. Additionally, it polls for the reset completion message first to avoid accidentally spamming the host with multiple reset requests.

v2: add another mailbox check to handle the case where kfd detects the timeout first
v3: set host_flr bit and use wait_for_reset

Signed-off-by: Vignesh Chander
Reviewed-by: Zhigang Luo
Signed-off-by: Alex Deucher

drm/amdgpu: Add RAS_POISON_READY host response message

2024-01-25T19:58:03+00:00

In a non-FLR page avoidance scenario, the host driver provides the bad pages in the pf2vf exchange region. Add a new host response message to indicate when the pf2vf exchange region has been updated.

Signed-off-by: Victor Skvortsov
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher

drm/amdgpu: add RAS poison consumption handler for NV SRIOV

2022-12-15T17:18:19+00:00

Send handling request to host.

Signed-off-by: Tao Zhou
Reviewed-by: Hawking Zhang
Signed-off-by: Alex Deucher

drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.

2021-08-16T19:17:57+00:00

When the guest receives an FLR notification from the host, it locks the adapter into the reset state. No further job submission or hardware access happens after that. The guest should then send a response telling the host that it is prepared for the host reset.

Signed-off-by: Jiange Zhao
Signed-off-by: Peng Ju Zhou
Reviewed-by: Emily.Deng
Signed-off-by: Alex Deucher
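The ordering described above (lock the adapter first, then answer the host) can be sketched as follows. All names here (`vf_adapter`, `submit_job`, `on_flr_notification`, `VF_MSG_READY_TO_RESET`) are hypothetical stand-ins for illustration and are not the driver's actual types or functions.

```c
#include <stdbool.h>

/* Hypothetical message IDs; the real request enum lives in the driver. */
enum vf_msg {
    VF_MSG_NONE,
    VF_MSG_READY_TO_RESET,
};

/* Hypothetical, minimal view of the guest adapter state. */
struct vf_adapter {
    bool in_reset;         /* true once the adapter is locked for reset */
    enum vf_msg last_sent; /* last message sent to the host */
};

/* Job submission is refused while the adapter is locked. */
static int submit_job(const struct vf_adapter *a)
{
    return a->in_reset ? -1 : 0;
}

/* Handle an FLR notification: lock first, then tell the host we are ready. */
static void on_flr_notification(struct vf_adapter *a)
{
    a->in_reset = true;                   /* stop submissions and HW access */
    a->last_sent = VF_MSG_READY_TO_RESET; /* respond only after locking */
}
```

Responding only after the lock is taken means the host never sees "ready" while the guest could still be touching the hardware.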

drm/amdgpu/SRIOV: Extend VF reset request wait period

2020-12-15T16:35:35+00:00

In the virtualization case, when one VF sends too many FLR requests, the hypervisor stops responding to that VF's requests for a long period of time. This is called event guard. During this cooling period, the guest driver should wait instead of doing other things. Afterwards, the guest driver resumes the reset process and returns to normal.

Currently, the guest driver waits 12 seconds and returns failure if it gets no response from the host.

Solution: extend the waiting time in the guest driver and poll for the response periodically. A poll happens every 6 seconds, for up to 60 seconds in total.

v2: change the max repetition count from a literal number to a macro.

Signed-off-by: Jiange Zhao
Acked-by: Hawking Zhang
Signed-off-by: Alex Deucher
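The periodic poll described above can be sketched as a bounded retry loop. This is a minimal userspace model, not the driver code: `fake_host`, `poll_once`, `wait_for_host`, and the two macros are hypothetical names, with values chosen only to match the 6 second interval and 60 second total stated in the message.

```c
#include <stdbool.h>

/* Hypothetical values matching the commit text: 6 s per poll, 60 s total. */
#define POLL_INTERVAL_MS 6000u
#define POLL_REP_MAX     10u

/* Hypothetical stand-in for the host: it responds at ready_at_ms. */
struct fake_host {
    unsigned int now_ms;      /* simulated current time */
    unsigned int ready_at_ms; /* simulated time of the host response */
};

/* Poll for at most one interval; return true if the response arrived. */
static bool poll_once(struct fake_host *h)
{
    unsigned int deadline = h->now_ms + POLL_INTERVAL_MS;

    if (h->ready_at_ms <= deadline) {
        /* Response arrives within this window; stop waiting early. */
        if (h->ready_at_ms > h->now_ms)
            h->now_ms = h->ready_at_ms;
        return true;
    }
    h->now_ms = deadline; /* window expired without a response */
    return false;
}

/* Return 0 if the host responded within the whole window, -1 otherwise. */
static int wait_for_host(struct fake_host *h)
{
    unsigned int i;

    for (i = 0; i < POLL_REP_MAX; i++) {
        if (poll_once(h))
            return 0;
    }
    return -1;
}
```

Keeping the repetition count in a macro, as the v2 note describes, lets the total wait be tuned in one place without touching the loop.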

drm/amdgpu: extent threshold of waiting FLR_COMPLETE

2020-04-24T15:42:11+00:00

Extend the threshold to 5s to accommodate a WHOLE GPU reset, which needs 3+ seconds to finish.

Signed-off-by: Monk Liu
Acked-by: Yintian Tao
Signed-off-by: Alex Deucher

drm/amdgpu: use static mmio offset for NV mailbox

2020-04-01T18:44:43+00:00

what: with the new "req_init_data" handshake we need to use the mailbox before doing IP discovery, so in mxgpu_nv.c the original SOC15_REG method won't work because it depends on IP discovery having completed first.

how: always use static MMIO offsets for the NV+ mailbox registers. The HW team confirmed that all MAILBOX registers are at the same offset on all ASICs, so no IP discovery is needed for those registers.

Signed-off-by: Monk Liu
Reviewed-by: Emily Deng
Signed-off-by: Alex Deucher
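The difference between the two lookup styles can be sketched as below. Everything here is hypothetical: `ip_table`, both functions, and the offset value `0x10` are invented for illustration and do not reflect the real register offsets or the SOC15_REG macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed offset; NOT the real mailbox register offset. */
#define MAILBOX_CONTROL_STATIC_OFFSET 0x10u

/* Hypothetical result of IP discovery: a base address per IP block. */
struct ip_table {
    bool discovered; /* false until IP discovery has run */
    uint32_t base;   /* base offset reported by discovery */
};

/* Discovery-based lookup: unusable before discovery has completed. */
static int mailbox_offset_discovered(const struct ip_table *t, uint32_t reg,
                                     uint32_t *out)
{
    if (!t->discovered)
        return -1; /* base is not known yet */
    *out = t->base + reg;
    return 0;
}

/* Static lookup: usable at any time, including before discovery. */
static uint32_t mailbox_offset_static(void)
{
    return MAILBOX_CONTROL_STATIC_OFFSET;
}
```

The static form works for the early handshake precisely because it has no dependency on state that the handshake itself is needed to obtain.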

drm/amdgpu: introduce new request and its function

2020-04-01T18:44:43+00:00

1) modify xgpu_nv_send_access_requests to support the new idh request
2) introduce a new function, req_gpu_init_data(), which is used to notify the host to prepare the vbios/ip-discovery/pfvf exchange

Signed-off-by: Monk Liu
Reviewed-by: Emily Deng
Signed-off-by: Alex Deucher

drm/amdgpu: introduce new idh_request/event enum

2020-04-01T18:44:43+00:00

Add new idh_request and idh_event enums to prepare for the new handshake protocol implementation that follows.

Signed-off-by: Monk Liu
Reviewed-by: Emily Deng
Signed-off-by: Alex Deucher