<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/gpu/drm/amd/amdgpu/amdgpu_mca.h, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2024-06-19T16:50:31+00:00</updated>
<entry>
<title>Revert "drm/amdgpu: change bank cache lock type to spinlock"</title>
<updated>2024-06-19T16:50:31+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-06-18T01:47:36+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8c9ee180196fb2a04e28891578ae608f772eab9c'/>
<id>urn:sha1:8c9ee180196fb2a04e28891578ae608f772eab9c</id>
<content type='text'>
This reverts commit 258ed689bc3163f86204f75df6c23f92b59b3fad

Revert this patch to change the lock type back to 'mutex' and avoid the
following kernel call trace, in which a sleep (usleep_range_state) triggers
__schedule_bug.

[  602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu]
[  602.668939] Call Trace:
[  602.668940]  &lt;TASK&gt;
[  602.668941]  dump_stack_lvl+0x4c/0x70
[  602.668945]  dump_stack+0x14/0x20
[  602.668946]  __schedule_bug+0x5a/0x70
[  602.668950]  __schedule+0x940/0xb30
[  602.668952]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668955]  ? hrtimer_reprogram+0x77/0xb0
[  602.668957]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.668959]  ? hrtimer_start_range_ns+0x126/0x370
[  602.668961]  schedule+0x39/0xe0
[  602.668962]  schedule_hrtimeout_range_clock+0xb1/0x140
[  602.668964]  ? __pfx_hrtimer_wakeup+0x10/0x10
[  602.668966]  schedule_hrtimeout_range+0x17/0x20
[  602.668967]  usleep_range_state+0x69/0x90
[  602.668970]  psp_cmd_submit_buf+0x132/0x570 [amdgpu]
[  602.669066]  psp_ras_invoke+0x75/0x1a0 [amdgpu]
[  602.669156]  psp_ras_query_address+0x9c/0x120 [amdgpu]
[  602.669245]  umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu]
[  602.669337]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669339]  ? stack_depot_save+0x12/0x20
[  602.669342]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669343]  ? set_track_prepare+0x52/0x70
[  602.669346]  ? kmemleak_alloc+0x4f/0x90
[  602.669348]  ? __kmalloc_node+0x34b/0x450
[  602.669352]  amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu]
[  602.669438]  mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu]
[  602.669554]  mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu]
[  602.669655]  amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu]
[  602.669743]  ? kmemleak_free+0x36/0x60
[  602.669745]  ? kvfree+0x32/0x40
[  602.669747]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669749]  ? kfree+0x15d/0x2a0
[  602.669752]  amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu]
[  602.669839]  amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu]
[  602.669924]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.669925]  ? __call_rcu_common.constprop.0+0xa6/0x2b0
[  602.669929]  amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu]
[  602.670014]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.670017]  amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu]
[  602.670103]  amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu]
[  602.670187]  ? srso_alias_return_thunk+0x5/0xfbef5
[  602.670189]  ? __schedule+0x37d/0xb30
[  602.670191]  process_one_work+0x176/0x350
[  602.670194]  worker_thread+0x2f7/0x420
[  602.670197]  ?

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: YiPeng Chai &lt;YiPeng.Chai@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: change bank cache lock type to spinlock</title>
<updated>2024-05-17T21:40:39+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-05-16T23:56:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=258ed689bc3163f86204f75df6c23f92b59b3fad'/>
<id>urn:sha1:258ed689bc3163f86204f75df6c23f92b59b3fad</id>
<content type='text'>
Change the lock type to 'spinlock' to avoid scheduling issues
in interrupt context.

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: avoid dump mca bank log muti times during ras ISR</title>
<updated>2024-04-30T13:58:47+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-04-23T02:14:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=5eccab32c15f1e5cf9651d865fb20012d3563c96'/>
<id>urn:sha1:5eccab32c15f1e5cf9651d865fb20012d3563c96</id>
<content type='text'>
Because the UE valid mca count is only cleared after a gpu reset,
dump the mca log only the first time the mca banks are fetched after a RAS interrupt is received.

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add MCA smu cache support</title>
<updated>2024-04-30T13:58:41+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-04-18T07:46:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76ad30f51aa0d1bd99f12658d4775a86df6e4282'/>
<id>urn:sha1:76ad30f51aa0d1bd99f12658d4775a86df6e4282</id>
<content type='text'>
v1:
Because the SMU clears valid CE mca banks after they are read,
this patch adds an mca cache at the driver level to ensure that mca banks are not lost.

v2:
Refine the amdgpu_mca_init/fini/reset() function names.

v3:
Add mca_cache.lock support.
Only add CE banks to the mca bank cache.

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
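
Illustration (not part of the patch): a minimal Python sketch of the
caching idea described above, assuming a source whose entries disappear
once they are read. ReadOnceSource, BankCache and query are hypothetical
names chosen for this example and do not correspond to amdgpu functions;
the lock only mirrors the role that mca_cache.lock is described as playing.

```python
import threading


class ReadOnceSource:
    """Hypothetical stand-in for a source that clears entries once read."""

    def __init__(self, banks):
        self._banks = list(banks)

    def read_all(self):
        # Return everything currently stored and leave the source empty.
        banks, self._banks = self._banks, []
        return banks


class BankCache:
    """Cache kept on the reader's side, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._banks = []

    def add(self, banks):
        with self._lock:
            self._banks.extend(banks)

    def snapshot(self):
        with self._lock:
            return list(self._banks)

    def reset(self):
        with self._lock:
            self._banks.clear()


def query(source, cache):
    # Move newly reported entries into the cache, then answer from the cache.
    cache.add(source.read_all())
    return cache.snapshot()


source = ReadOnceSource(["bank0", "bank1"])
cache = BankCache()
first = query(source, cache)   # drains the source into the cache
second = query(source, cache)  # source is empty now, cache still answers
```

In this sketch the second query returns the same entries as the first even
though the source has already been drained, which is the property the
cache is meant to provide.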
</content>
</entry>
<entry>
<title>drm/amdgpu: remove unused MCA driver codes</title>
<updated>2024-04-30T13:51:44+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-04-18T04:07:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7e0357bef402875425de0296800c34c41842ba82'/>
<id>urn:sha1:7e0357bef402875425de0296800c34c41842ba82</id>
<content type='text'>
- remove unused callback functions.
- make some mca functions static and refine the function order.

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add ras event id support</title>
<updated>2024-03-20T17:38:13+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2024-03-13T04:50:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9dc57c2adf2c307a672f15b4be17c6c14e37cfb9'/>
<id>urn:sha1:9dc57c2adf2c307a672f15b4be17c6c14e37cfb9</id>
<content type='text'>
Add amdgpu ras event id support to better distinguish different
error information sources in dmesg logs.

The following logs will be identified by event id:
{event_id} interrupt to inform RAS event
{event_id} ACA logs
{event_id} errors statistic since from current injection/error query
{event_id} errors statistic since from gpu load

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
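
Illustration (not part of the patch): a small Python sketch of tagging
related log lines with a shared event id so they can be grouped later.
new_event_id and tag are hypothetical helpers, and the "{id} message"
layout is assumed from the list above rather than taken from the driver.

```python
import itertools

# Hypothetical monotonically increasing id source for this example.
_event_ids = itertools.count(1)


def new_event_id():
    """Allocate the next event id."""
    return next(_event_ids)


def tag(event_id, message):
    """Prefix a log message with its event id, e.g. '{1} ACA logs'."""
    return "{{{}}} {}".format(event_id, message)


# All lines produced while handling one event carry the same id.
event_id = new_event_id()
lines = [
    tag(event_id, "interrupt to inform RAS event"),
    tag(event_id, "ACA logs"),
]
```

Filtering a log on one id prefix then yields every line that belongs to
the same event, regardless of which component printed it.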
</content>
</entry>
<entry>
<title>drm/amdgpu: add interface to check mca umc status</title>
<updated>2024-01-22T22:13:25+00:00</updated>
<author>
<name>YiPeng Chai</name>
<email>YiPeng.Chai@amd.com</email>
</author>
<published>2024-01-15T02:56:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=afb617f38f221e88dc5b3f3fc2d87cc749175609'/>
<id>urn:sha1:afb617f38f221e88dc5b3f3fc2d87cc749175609</id>
<content type='text'>
Add interface to check mca umc status.

Signed-off-by: YiPeng Chai &lt;YiPeng.Chai@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Switch to aca bank for xgmi pcs err cnt</title>
<updated>2023-12-13T20:28:47+00:00</updated>
<author>
<name>Hawking Zhang</name>
<email>Hawking.Zhang@amd.com</email>
</author>
<published>2023-12-12T08:46:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=058eb51912ca3a5fb121668b30e8e94d976afb27'/>
<id>urn:sha1:058eb51912ca3a5fb121668b30e8e94d976afb27</id>
<content type='text'>
Instead of software managed counters.

Signed-off-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Reviewed-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Stanley.Yang &lt;Stanley.Yang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/pm: support new mca smu error code decoding</title>
<updated>2023-12-06T21:05:32+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2023-12-04T02:17:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=37c57631c18661c4c0dc415e75afd143ed89e098'/>
<id>urn:sha1:37c57631c18661c4c0dc415e75afd143ed89e098</id>
<content type='text'>
support new mca smu error code decoding from smu 85.86.0 for smu v13.0.6

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add smu v13.0.6 pcs xgmi ras error query support</title>
<updated>2023-11-09T22:02:59+00:00</updated>
<author>
<name>Yang Wang</name>
<email>kevinyang.wang@amd.com</email>
</author>
<published>2023-11-07T10:03:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=76d2da18afde2c78e9fc1fbcc9dc57c27ac77ac5'/>
<id>urn:sha1:76d2da18afde2c78e9fc1fbcc9dc57c27ac77ac5</id>
<content type='text'>
add pcs xgmi ras error query support for smu v13.0.6.

Signed-off-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
