<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c, branch v5.16.1</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.16.1</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.16.1'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2021-10-13T18:14:48+00:00</updated>
<entry>
<title>drm/amdgpu: Fix RAS page retirement with mode2 reset on Aldebaran</title>
<updated>2021-10-13T18:14:48+00:00</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2021-09-21T00:48:23+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=91a1a52d03aa0f1f2b51c7df8a7bf437e906e29f'/>
<id>urn:sha1:91a1a52d03aa0f1f2b51c7df8a7bf437e906e29f</id>
<content type='text'>
During mode2 reset, the GPU is temporarily removed from the
mgpu_info list. As a result, page retirement fails because it
cannot find the GPU in the GPU list.
To fix this, create our own list of GPUs that support MCE notifier
based page retirement and use that list to check if the UMC error
occurred on a GPU that supports MCE notifier based page retirement.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Register MCE notifier for Aldebaran RAS</title>
<updated>2021-10-06T19:53:38+00:00</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2021-09-22T18:49:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=12b2cab79017ebe598c74493ac1cfc5934d3ccc2'/>
<id>urn:sha1:12b2cab79017ebe598c74493ac1cfc5934d3ccc2</id>
<content type='text'>
On Aldebaran, the GPU driver handles bad page retirement
for GPU memory even though the UMC is host managed. As a result,
register a bad page retirement handler on the MCE notifier
chain to retire bad pages on Aldebaran.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: resolve RAS query bug</title>
<updated>2021-10-04T19:22:57+00:00</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2021-09-29T07:06:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=eb601e61d3492d809cb82a19560a6c31c36fd48a'/>
<id>urn:sha1:eb601e61d3492d809cb82a19560a6c31c36fd48a</id>
<content type='text'>
clear error count when persistent harvesting is not enabled

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: skip umc ras irq handling in poison mode (v2)</title>
<updated>2021-09-28T13:30:07+00:00</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2021-09-17T10:40:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f524dd54a78924b59acd8f251788889129b3a2e9'/>
<id>urn:sha1:f524dd54a78924b59acd8f251788889129b3a2e9</id>
<content type='text'>
In RAS poison mode, a UMC uncorrectable error will be ignored until
the corrupted data is consumed by another RAS module (such as gfx, sdma).

v2: update the debug message and replace dev_warn with dev_info.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: set poison supported flag for RAS (v2)</title>
<updated>2021-09-28T13:30:07+00:00</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2021-09-17T10:24:09+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e43488493cbb46e862f83c66887f3e6cb854c6f0'/>
<id>urn:sha1:e43488493cbb46e862f83c66887f3e6cb854c6f0</id>
<content type='text'>
Add RAS poison supported flag and tell PSP RAS TA about the info.

v2: rename poison mode to poison supported, since poison mode can be
disabled even if it is supported.
    print the value of poison supported if RAS feature enablement fails.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Remove all code paths under the EAGAIN path in RAS late init</title>
<updated>2021-09-23T20:35:13+00:00</updated>
<author>
<name>Candice Li</name>
<email>candice.li@amd.com</email>
</author>
<published>2021-09-15T07:14:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9080a18fc554cea0858fae6692a7003c5f0365fc'/>
<id>urn:sha1:9080a18fc554cea0858fae6692a7003c5f0365fc</id>
<content type='text'>
All code paths under the EAGAIN path in RAS late init are unused.

Signed-off-by: Candice Li &lt;candice.li@amd.com&gt;
Reviewed-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Updated RAS infrastructure</title>
<updated>2021-09-23T20:34:43+00:00</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2021-09-22T06:04:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=640ae42efb828be69a9ee6ac88fb3d5a3e678ddf'/>
<id>urn:sha1:640ae42efb828be69a9ee6ac88fb3d5a3e678ddf</id>
<content type='text'>
Update RAS infrastructure to support RAS query for MCA subblocks

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Update RAS trigger error block support</title>
<updated>2021-09-14T19:37:48+00:00</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2021-09-09T08:05:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3771449bc80fa494c15f366ce1fa9e3168332b6a'/>
<id>urn:sha1:3771449bc80fa494c15f366ce1fa9e3168332b6a</id>
<content type='text'>
Added trigger error support for MP0/MP1/MPIO blocks

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: add mpio to ras block</title>
<updated>2021-09-01T20:55:11+00:00</updated>
<author>
<name>Candice Li</name>
<email>candice.li@amd.com</email>
</author>
<published>2021-08-27T11:14:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=a0a2f7bb220945e369de77ea004d96236e9463a6'/>
<id>urn:sha1:a0a2f7bb220945e369de77ea004d96236e9463a6</id>
<content type='text'>
Add MPIO to RAS block

Signed-off-by: Candice Li &lt;candice.li@amd.com&gt;
Reviewed-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: remove unnecessary RAS context field</title>
<updated>2021-08-16T19:35:55+00:00</updated>
<author>
<name>Candice Li</name>
<email>candice.li@amd.com</email>
</author>
<published>2021-08-13T11:06:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=893cf382c0403d7c4581f0f01f6d06c76485123d'/>
<id>urn:sha1:893cf382c0403d7c4581f0f01f6d06c76485123d</id>
<content type='text'>
Delete ras_if-&gt;name in the RAS ctx structure and remove related lines.

Signed-off-by: Candice Li &lt;candice.li@amd.com&gt;
Reviewed-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
