<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-08-13T16:12:51+00:00</updated>
<entry>
<title>drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT</title>
<updated>2024-08-13T16:12:51+00:00</updated>
<author>
<name>Victor Zhao</name>
<email>Victor.Zhao@amd.com</email>
</author>
<published>2024-08-07T09:32:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ef6c2cb349c708676b7820c36a5beb75868ad544'/>
<id>urn:sha1:ef6c2cb349c708676b7820c36a5beb75868ad544</id>
<content type='text'>
On MI300/MI308 UBB products, when doing a mode1 reset, one GPU needs to
wait for all 8 GPUs to finish the mode1 reset before doing re-init. As
observed, the GPU that triggered the reset sometimes needs to wait 15s
for all GPUs to finish.

If the message poll times out, the guest driver will send the reset
message again, which may disrupt the subsequent re-init sequence on the
other GPUs.

So extend the time to cover the maximum time needed to recover.

Signed-off-by: Victor Zhao &lt;Victor.Zhao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: process RAS fatal error MB notification</title>
<updated>2024-06-27T21:31:37+00:00</updated>
<author>
<name>Vignesh Chander</name>
<email>Vignesh.Chander@amd.com</email>
</author>
<published>2024-06-24T21:44:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cbda2758d8bfae323b846210a3e52f0ad5fe7164'/>
<id>urn:sha1:cbda2758d8bfae323b846210a3e52f0ad5fe7164</id>
<content type='text'>
For the RAS error scenario, the VF guest driver will check the mailbox
and set the fed flag to avoid unnecessary HW accesses.
Additionally, poll for the reset completion message first
to avoid accidentally spamming the host with multiple reset requests.

v2: add another mailbox check to handle the case where kfd detects
the timeout first

v3: set host_flr bit and use wait_for_reset

Signed-off-by: Vignesh Chander &lt;Vignesh.Chander@amd.com&gt;
Reviewed-by: Zhigang Luo &lt;Zhigang.Luo@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add RAS_POISON_READY host response message</title>
<updated>2024-01-25T19:58:03+00:00</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2024-01-21T15:25:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2474414c60b7ed1f90293facdc4d94ef7cf61a3b'/>
<id>urn:sha1:2474414c60b7ed1f90293facdc4d94ef7cf61a3b</id>
<content type='text'>
In a non-FLR page avoidance scenario, the host driver will
provide the bad pages in the pf2vf exchange region.

Add a new host response message to indicate when the
pf2vf exchange region has been updated.

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add RAS poison consumption handler for NV SRIOV</title>
<updated>2022-12-15T17:18:19+00:00</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2022-12-06T09:04:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ae844dd79ffc60f419b32a8d6026128f18021650'/>
<id>urn:sha1:ae844dd79ffc60f419b32a8d6026128f18021650</id>
<content type='text'>
Send handling request to host.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.</title>
<updated>2021-08-16T19:17:57+00:00</updated>
<author>
<name>Jiange Zhao</name>
<email>Jiange.Zhao@amd.com</email>
</author>
<published>2021-03-19T02:32:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3e183e2faea97fb284f82861286de09aa16e3630'/>
<id>urn:sha1:3e183e2faea97fb284f82861286de09aa16e3630</id>
<content type='text'>
When the guest receives an FLR notification from the host, it
locks the adapter into the reset state. No more jobs are
submitted and no hardware is accessed after that.

It should then send a response to the host indicating that it is
prepared for the host reset.

Signed-off-by: Jiange Zhao &lt;Jiange.Zhao@amd.com&gt;
Signed-off-by: Peng Ju Zhou &lt;PengJu.Zhou@amd.com&gt;
Reviewed-by: Emily.Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/SRIOV: Extend VF reset request wait period</title>
<updated>2020-12-15T16:35:35+00:00</updated>
<author>
<name>Jiange Zhao</name>
<email>Jiange.Zhao@amd.com</email>
</author>
<published>2020-11-25T13:56:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3aa883ac8eea38281f97a7409d2922e6f343bf6c'/>
<id>urn:sha1:3aa883ac8eea38281f97a7409d2922e6f343bf6c</id>
<content type='text'>
In the virtualization case, when one VF sends too many
FLR requests, the hypervisor stops responding to this
VF's requests for a long period of time. This is called
event guard. During this cooling period, the guest
driver should wait instead of doing other things. After
this period, the guest driver resumes the reset
process and returns to normal.

Currently, the guest driver waits 12 seconds and returns failure
if it doesn't get a response from the host.

Solution: extend this waiting time in the guest driver and poll
for the response periodically. A poll happens every 6 seconds,
for up to 60 seconds in total.
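
The polling schedule described above can be sketched as follows. This
is a minimal Python illustration, not the driver code: the names
POLL_INTERVAL_S, TOTAL_WAIT_S, MAX_RETRIES, wait_for_host and poll_once
are hypothetical, and only the 6 second and 60 second figures come from
this message.

```python
# Figures taken from the commit text; every name here is hypothetical.
POLL_INTERVAL_S = 6    # one poll waits up to 6 seconds
TOTAL_WAIT_S = 60      # overall waiting budget
MAX_RETRIES = TOTAL_WAIT_S // POLL_INTERVAL_S  # 10 polls in total


def wait_for_host(poll_once):
    """Poll until the host responds or the retry budget is used up.

    poll_once is an assumed callable that waits up to the given number
    of seconds and returns True once the host has responded.
    Returns the number of polls used, or None if the host never answered.
    """
    for attempt in range(1, MAX_RETRIES + 1):
        if poll_once(POLL_INTERVAL_S):
            return attempt
    return None
```

Deriving the retry count from the two durations keeps the total wait in
one place, which is the same idea as replacing a bare repetition number
with a named constant.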

v2: change the max repetition count from a literal number to a macro.

Signed-off-by: Jiange Zhao &lt;Jiange.Zhao@amd.com&gt;
Acked-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu:  extent threshold of waiting FLR_COMPLETE</title>
<updated>2020-04-24T15:42:11+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2020-04-21T10:04:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=312a79b6eafe5c45e3e232506a4a6e97d7cdbba4'/>
<id>urn:sha1:312a79b6eafe5c45e3e232506a4a6e97d7cdbba4</id>
<content type='text'>
Extend the threshold to 5s to accommodate a WHOLE GPU reset, which
needs 3+ seconds to finish.

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Acked-by: Yintian Tao &lt;yttao@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use static mmio offset for NV mailbox</title>
<updated>2020-04-01T18:44:43+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2020-03-04T15:46:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ff1f03a7b8c4787faefdb44b189e39cbf4f7611c'/>
<id>urn:sha1:ff1f03a7b8c4787faefdb44b189e39cbf4f7611c</id>
<content type='text'>
what:
With the new "req_init_data" handshake we need to use the mailbox
before doing IP discovery, so in mxgpu_nv.c the original
SOC15_REG method won't work because it depends on IP discovery
completing first.

how:
The solution is to always use static MMIO offsets for NV+ mailbox
registers.
The HW team confirmed that all MAILBOX registers will be at the same
offset for all ASICs, so no IP discovery is needed for those registers.

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: introduce new request and its function</title>
<updated>2020-04-01T18:44:43+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2020-03-04T03:38:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=aa53bc2edb66624ac05902910c41d8b4f685b8bc'/>
<id>urn:sha1:aa53bc2edb66624ac05902910c41d8b4f685b8bc</id>
<content type='text'>
1) modify xgpu_nv_send_access_requests to support
the new idh request

2) introduce a new function, req_gpu_init_data(), which
is used to notify the host to prepare vbios/ip-discovery/pfvf exchange

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: introduce new idh_request/event enum</title>
<updated>2020-04-01T18:44:43+00:00</updated>
<author>
<name>Monk Liu</name>
<email>Monk.Liu@amd.com</email>
</author>
<published>2020-03-03T10:13:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c27cbdd2d073baf77deaf3e73dace7945a72dde7'/>
<id>urn:sha1:c27cbdd2d073baf77deaf3e73dace7945a72dde7</id>
<content type='text'>
Add new idh_request and idh_event enums to prepare for the
new handshake protocol implementation later

Signed-off-by: Monk Liu &lt;Monk.Liu@amd.com&gt;
Reviewed-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
