<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/accel/amdxdna, branch v6.19.12</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.19.12'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2026-03-19T15:14:56+00:00</updated>
<entry>
<title>accel/amdxdna: Fix runtime suspend deadlock when there is pending job</title>
<updated>2026-03-19T15:14:56+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-03-10T18:00:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ac72e7385a2c7533dd766de4197134d96230be85'/>
<id>urn:sha1:ac72e7385a2c7533dd766de4197134d96230be85</id>
<content type='text'>
[ Upstream commit 6b13cb8f48a42ddf6dd98865b673a82e37ff238b ]

The runtime suspend callback drains the running job workqueue before
suspending the device. If a job is still executing and calls
pm_runtime_resume_and_get(), it can deadlock with the runtime suspend
path.

Fix this by moving pm_runtime_resume_and_get() from the job execution
routine to the job submission routine, ensuring the device is resumed
before the job is queued and avoiding the deadlock during runtime
suspend.

Fixes: 063db451832b ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260310180058.336348-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fix NULL pointer dereference of mgmt_chann</title>
<updated>2026-03-12T11:09:49+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-26T21:38:57+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=032ca7a9059c4ba6c329e0f1b442dab54dd9c3e5'/>
<id>urn:sha1:032ca7a9059c4ba6c329e0f1b442dab54dd9c3e5</id>
<content type='text'>
[ Upstream commit 6270ee26e1edd862ea17e3eba148ca8fb2c99dc9 ]

mgmt_chann may be set to NULL if the firmware returns an unexpected
error in aie2_send_mgmt_msg_wait(). This can later lead to a NULL
pointer dereference in aie2_hw_stop().

Fix this by introducing a dedicated helper to destroy mgmt_chann
and by adding proper NULL checks before accessing it.

Fixes: b87f920b9344 ("accel/amdxdna: Support hardware mailbox")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260226213857.3068474-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fill invalid payload for failed command</title>
<updated>2026-03-12T11:09:45+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-27T00:48:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=84a4b37eb03699a068205414636d510d3dc1e5e7'/>
<id>urn:sha1:84a4b37eb03699a068205414636d510d3dc1e5e7</id>
<content type='text'>
[ Upstream commit 89ff45359abbf9d8d3c4aa3f5a57ed0be82b5a12 ]

Newer userspace applications may read the payload of a failed command
to obtain detailed error information. However, the driver and old firmware
versions may not support returning advanced error information.
In this case, initialize the command payload with an invalid value so
userspace can detect that no detailed error information is available.

Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260227004841.3080241-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Validate command buffer payload count</title>
<updated>2026-03-12T11:09:15+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-19T21:19:46+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3ed2ae6b3fe869f99b75afd02045ba5c0c0773e2'/>
<id>urn:sha1:3ed2ae6b3fe869f99b75afd02045ba5c0c0773e2</id>
<content type='text'>
[ Upstream commit 901ec3470994006bc8dd02399e16b675566c3416 ]

The count field in the command header is used to determine the valid
payload size. Verify that the valid payload does not exceed the remaining
buffer space.

Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260219211946.1920485-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Prevent ubuf size overflow</title>
<updated>2026-03-12T11:09:15+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-17T19:28:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=972bf4a23478fcb247b4f507d47a584bc8aea5bd'/>
<id>urn:sha1:972bf4a23478fcb247b4f507d47a584bc8aea5bd</id>
<content type='text'>
[ Upstream commit 03808abb1d868aed7478a11a82e5bb4b3f1ca6d6 ]

The ubuf size calculation may overflow, resulting in an undersized
allocation and possible memory corruption.

Use check_add_overflow() helpers to validate the size calculation before
allocation.

Fixes: bd72d4acda10 ("accel/amdxdna: Support user space allocated buffer")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260217192815.1784689-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fix out-of-bounds memset in command slot handling</title>
<updated>2026-03-12T11:09:15+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-17T18:54:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cca770d710d5e03bc814af585cd6975eb6d74074'/>
<id>urn:sha1:cca770d710d5e03bc814af585cd6975eb6d74074</id>
<content type='text'>
[ Upstream commit 1110a949675ebd56b3f0286e664ea543f745801c ]

The remaining space in a command slot may be smaller than the size of
the command header. Clearing the command header with memset() before
verifying the available slot space can result in an out-of-bounds write
and memory corruption.

Fix this by moving the memset() call after the size validation.

Fixes: 3d32eb7a5ecf ("accel/amdxdna: Fix cu_idx being cleared by memset() during command setup")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260217185415.1781908-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fix command hang on suspended hardware context</title>
<updated>2026-03-12T11:09:14+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-11T20:53:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=105caae0ee7b0395bb88017872a351f1b8c59c00'/>
<id>urn:sha1:105caae0ee7b0395bb88017872a351f1b8c59c00</id>
<content type='text'>
[ Upstream commit 07efce5a6611af6714ea3ef65694e0c8dd7e44f5 ]

When a hardware context is suspended, the job scheduler is stopped. If a
command is submitted while the context is suspended, the job is queued in
the scheduler but aie2_sched_job_run() is never invoked to restart the
hardware context. As a result, the command hangs.

Fix this by modifying the hardware context suspend routine to keep the job
scheduler running so that queued jobs can trigger context restart properly.

Fixes: aac243092b70 ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260211205341.722982-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fix suspend failure after enabling turbo mode</title>
<updated>2026-03-12T11:09:14+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-11T20:47:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9aecc37ef8dc805fe60bfaea47454cf14d7fcf57'/>
<id>urn:sha1:9aecc37ef8dc805fe60bfaea47454cf14d7fcf57</id>
<content type='text'>
[ Upstream commit fdb65acfe655f844ae1e88696b9656d3ef5bb8fb ]

Enabling turbo mode disables hardware clock gating. Suspend requires
hardware clock gating to be re-enabled, otherwise suspend will fail.
Fix this by calling aie2_runtime_cfg() from aie2_hw_stop() to
re-enable clock gating during suspend. Also ensure that firmware is
initialized in aie2_hw_start() before modifying clock-gating
settings during resume.

Fixes: f4d7b8a6bc8c ("accel/amdxdna: Enhance power management settings")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260211204716.722788-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Fix dead lock for suspend and resume</title>
<updated>2026-03-12T11:09:14+00:00</updated>
<author>
<name>Lizhi Hou</name>
<email>lizhi.hou@amd.com</email>
</author>
<published>2026-02-11T20:46:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ac24537478dd8eb2fd3984b4652bb19461e5e74c'/>
<id>urn:sha1:ac24537478dd8eb2fd3984b4652bb19461e5e74c</id>
<content type='text'>
[ Upstream commit 1aa82181a3c285c7351523d587f7981ae4c015c8 ]

When an application issues a query IOCTL while auto suspend is running,
a deadlock can occur. The query path holds dev_lock and then calls
pm_runtime_resume_and_get(), which waits for the ongoing suspend to
complete. Meanwhile, the suspend callback attempts to acquire dev_lock
and blocks, resulting in a deadlock.

Fix this by releasing dev_lock before calling pm_runtime_resume_and_get()
and reacquiring it after the call completes. Also acquire dev_lock in the
resume callback to keep the locking consistent.

Fixes: 063db451832b ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) &lt;superm1@kernel.org&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260211204644.722758-1-lizhi.hou@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>accel/amdxdna: Reduce log noise during process termination</title>
<updated>2026-03-12T11:09:14+00:00</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2026-02-10T16:42:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c503a8b3de4892caf63b5baea5e38c1af4cf3b9f'/>
<id>urn:sha1:c503a8b3de4892caf63b5baea5e38c1af4cf3b9f</id>
<content type='text'>
[ Upstream commit 57aa3917a3b3bd805a3679371f97a1ceda3c5510 ]

During process termination, several error messages are logged that are
not actual errors but expected conditions when a process is killed or
interrupted. This creates unnecessary noise in the kernel log.

The specific scenarios are:

1. HMM invalidation returns -ERESTARTSYS when the wait is interrupted by
   a signal during process cleanup. This is expected when a process is
   being terminated and should not be logged as an error.

2. Context destruction returns -ENODEV when the firmware or device has
   already stopped, which commonly occurs during cleanup if the device
   was already torn down. This is also an expected condition during
   orderly shutdown.

Downgrade these expected error conditions from error level to debug level
to reduce log noise while still keeping genuine errors visible.

Fixes: 97f27573837e ("accel/amdxdna: Fix potential NULL pointer dereference in context cleanup")
Reviewed-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Lizhi Hou &lt;lizhi.hou@amd.com&gt;
Link: https://patch.msgid.link/20260210164521.1094274-3-mario.limonciello@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
