| Age | Commit message (Collapse) | Author | Files | Lines |
|
Timeline Management v7
[ Upstream commit efdc66fe12b07e7b7d28650bd8d4f7e3bb92c5d4 ]
When GPU memory mappings are updated, the driver returns a fence so
userspace knows when the update is finished.
The previous refactor could pick the wrong fence or rely on checks that
are not safe for GPU mappings that stay valid even when memory is
missing. In some cases this could return an invalid fence or cause fence
reference counting problems.
Fix this by (v5,v6, per Christian):
- Starting from the VM’s existing last update fence, so a valid and
meaningful fence is always returned even when no new work is required.
- Selecting the VM-level fence only for always-valid / PRT mappings using
the required combined bo_va + bo guard.
- Using the per-BO page table update fence for normal MAP and REPLACE
operations.
- For UNMAP and CLEAR, returning the fence provided by
amdgpu_vm_clear_freed(), which may remain unchanged when nothing needs
clearing.
- Keeping fence reference counting balanced.
v7: Drop the extra bo_va/bo NULL guard since
amdgpu_vm_is_bo_always_valid() handles NULL BOs correctly (including
PRT). (Christian)
This makes VM timeline fences correct and prevents crashes caused by
incorrect fence handling.
Fixes: bd8150a1b337 ("drm/amdgpu: Refactor amdgpu_gem_va_ioctl for Handling Last Fence Update and Timeline Management v4")
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 524696a19e34598c9173fdd5b32fb7e5d16a91d3 ]
Commit 469c1c9eb6c9 ("kernel-doc: Issue warnings that were silently
discarded") started emitting warnings for cases that were previously
silently discarded. One such case is in intel_wakeref.h:
Warning: drivers/gpu/drm/i915/intel_wakeref.h:156 expecting prototype
for __intel_wakeref_put(). Prototype was for INTEL_WAKEREF_PUT_ASYNC()
instead
Arguably kernel-doc should be able to handle this, as it's valid C, but
having the flags defined between the function declarator and the body is
just asking for trouble. Move the INTEL_WAKEREF_PUT_* macros away from
there, making kernel-doc's life easier.
While at it, reduce the unnecessary abstraction levels by removing the
enum, and append _MASK to INTEL_WAKEREF_PUT_DELAY for clarity.
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251215120908.3515578-1-jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 096bb75e13cc508d3915b7604e356bcb12b17766 ]
On Intel MacBookPros with switchable graphics, when the iGPU
is enabled, the address of VRAM gets put at 0 in the dGPU's
virtual address space. This is non-standard and seems to cause
issues with the cursor if it ends up at 0. We have the framework
to reserve memory at 0 in the address space, so enable it here if
the vram start address is 0.
Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4302
Cc: stable@vger.kernel.org
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b6a65009e7ce3f0cc72da18f186adb60717b51a0 ]
[Why]
Fix fastboot broken in driver.
This is caused by an open source backport change 7495962c.
from the comment, the intended check is to disable fastboot
for pre-DCN10. but the logic check is reversed, and causes
fastboot to be disabled on all DCN10 and after.
fastboot is for driver trying to pick up bios used hw setting
and bypass reprogramming the hw if dc_validate_boot_timing()
condition meets.
Fixes: 7495962cbceb ("drm/amd/display: Disable fastboot on DCE 6 too")
Cc: stable@vger.kernel.org
Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com>
Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Charlene Liu <Charlene.Liu@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit fbbe32618e97eff81577a01eb7d9adcd64a216d7 ]
When user provides a bogus pat_index value through the madvise IOCTL, the
xe_pat_index_get_coh_mode() function performs an array access without
validating bounds. This allows a malicious user to trigger an out-of-bounds
kernel read from the xe->pat.table array.
The vulnerability exists because the validation in madvise_args_are_sane()
directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without
first checking if pat_index is within [0, xe->pat.n_entries).
Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug
builds, it still performs the unsafe array access in production kernels.
v2(Matthew Auld)
- Using array_index_nospec() to mitigate spectre attacks when the value
is used
v3(Matthew Auld)
- Put the declarations at the start of the block
Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Jia Yao <jia.yao@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com
(cherry picked from commit 944a3329b05510d55c69c2ef455136e2fc02de29)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b18fc0ab837381c1a6ef28386602cd888f2d9edf ]
Invalidating a dmabuf will impact other users of the shared BO.
In the scenario where process A moves the BO, it needs to inform
process B about the move and process B will need to update its
page table.
The commit fixes a synchronisation bug caused by the use of the
ticket: it made amdgpu_vm_handle_moved behave as if updating
the page table immediately was correct but in this case it's not.
An example is the following scenario, with 2 GPUs and glxgears
running on GPU0 and Xorg running on GPU1, on a system where P2P
PCI isn't supported:
glxgears:
export linear buffer from GPU0 and import using GPU1
submit frame rendering to GPU0
submit tiled->linear blit
Xorg:
copy of linear buffer
The sequence of jobs would be:
drm_sched_job_run # GPU0, frame rendering
drm_sched_job_queue # GPU0, blit
drm_sched_job_done # GPU0, frame rendering
drm_sched_job_run # GPU0, blit
move linear buffer for GPU1 access #
amdgpu_dma_buf_move_notify -> update pt # GPU0
It this point the blit job on GPU0 is still running and would
likely produce a page fault.
Cc: stable@vger.kernel.org
Fixes: a448cb003edc ("drm/amdgpu: implement amdgpu_gem_prime_move_notify v2")
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 318917e1d8ecc89f820f4fabf79935f4fed718cd ]
[Why & How]
On Framework laptops with DDR5 modules, underflow can be observed.
It's unclear why it only occurs on specific desktop contents. However,
increasing enter/exit latencies by 3us seems to resolve it.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4463
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 510e7261a7bcd6232e90f0b6b9f93303bdd29f8a ]
Update the device ID for Dell XPS 13 7390 2-in-1 in the quirk
`QUIRK_EDP_LIMIT_RATE_HBR2` entry. The previous ID (0x8a12) was
incorrect; the correct ID is 0x8a52.
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/5969
Fixes: 21c586d9233a ("drm/i915/dp: Add device specific quirk to limit eDP rate to HBR2")
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://patch.msgid.link/20251226043359.2553-1-ankit.k.nautiyal@intel.com
(cherry picked from commit c7c30c4093cc11ff66672471f12599a555708343)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 39fc2bc4da0082c226cbee331f0a5d44db3997da ]
Ungate GPU CG/PG in device_fini_hw and device_halt to protect GPU
register accesses, e.g. GC registers are accessed in amdgpu_irq_disable_all()
and amdgpu_fence_driver_hw_fini().
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8a70a26c9f34baea6c3199a9862ddaff4554a96d ]
The kfd_event_page_set() function writes KFD_SIGNAL_EVENT_LIMIT * 8
bytes via memset without checking the buffer size parameter. This allows
unprivileged userspace to trigger an out-of bounds kernel memory write
by passing a small buffer, leading to potential privilege
escalation.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1d5362145de96b5d00d590605cc94cdfa572b405 ]
DRM checks EDID block count against allocated size in drm_edid_valid
function. We have to allocate the right EDID size instead of the max
size to prevent the EDID to be reported as invalid.
Cc: stable@kernel.org
Fixes: 7c585f9a71aa ("drm/bridge: anx7625: use struct drm_edid more")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Link: https://patch.msgid.link/20251218151307.95491-1-loic.poulain@oss.qualcomm.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5488a29596cdba93a60a79398dc9b69d5bdadf92 ]
When DRM_BUDDY_CONTIGUOUS_ALLOCATION is set, the requested size is
rounded up to the next power-of-two via roundup_pow_of_two().
Similarly, for non-contiguous allocations with large min_block_size,
the size is aligned up via round_up(). Both operations can produce a
rounded size that exceeds mm->size, which later triggers
BUG_ON(order > mm->max_order).
Example scenarios:
- 9G CONTIGUOUS allocation on 10G VRAM memory:
roundup_pow_of_two(9G) = 16G > 10G
- 9G allocation with 8G min_block_size on 10G VRAM memory:
round_up(9G, 8G) = 16G > 10G
Fix this by checking the rounded size against mm->size. For
non-contiguous or range allocations where size > mm->size is invalid,
return -EINVAL immediately. For contiguous allocations without range
restrictions, allow the request to fall through to the existing
__alloc_contig_try_harder() fallback.
This ensures invalid user input returns an error or uses the fallback
path instead of hitting BUG_ON.
v2: (Matt A)
- Add Fixes, Cc stable, and Closes tags for context
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/6712
Fixes: 0a1844bf0b53 ("drm/buddy: Improve contiguous memory allocation")
Cc: <stable@vger.kernel.org> # v6.7+
Cc: Christian König <christian.koenig@amd.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patch.msgid.link/20260108113227.2101872-5-sanjay.kumar.yadav@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 793e8f7d52814e096f63373eca643d2672366a5a ]
The `..IRQ..` register is printed here. Not the `..INT..` one.
Correct this.
Cc: stable@vger.kernel.org
Fixes: cf4fd52e3236 ("rust: drm: Introduce the Tyr driver for Arm Mali GPUs")
Link: https://lore.kernel.org/rust-for-linux/A04F0357-896E-4ACC-BC0E-DEE8608CE518@collabora.com/
Signed-off-by: Dirk Behme <dirk.behme@de.bosch.com>
Link: https://patch.msgid.link/20260119070838.3219739-1-dirk.behme@de.bosch.com
[aliceryhl: update commit message prefix]
[aliceryhl: add cc stable as per Miguel's suggestion]
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 69f83f167463bad26104af7fbc114ce1f80366b0 ]
With some panels informing support for Panel Replay we are observing
problems if having Panel Replay enable bit set on sink when forced to use
PSR instead of Panel Replay. Avoid these problems by not setting Panel
Replay enable bit in sink when Panel Replay is globally disabled during
link training. I.e. disabled by module parameter.
The enable bit is still set when disabling Panel Replay via debugfs
interface. Added note comment about this.
Fixes: 68f3a505b367 ("drm/i915/psr: Enable Panel Replay on sink always when it's supported")
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: <stable@vger.kernel.org> # v6.15+
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patch.msgid.link/20260115070039.368965-1-jouni.hogander@intel.com
(cherry picked from commit c5a52cd04e24f0ae53fda26f74ab027b8c548e0e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a61bf068f1fe359203f1af191cb523b77dc32752 ]
Pass the correct alignment from intel_fb_pin_to_ggtt() down to
__xe_pin_fb_vma().
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Closes: https://lore.kernel.org/intel-xe/aNL_RgLy13fXJbYx@intel.com/
Cc: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: b0228a337de8 ("drm/xe/display: align framebuffers according to hw requirements")
Cc: <stable@vger.kernel.org> # v6.13+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20251208181550.6618-1-tursulin@igalia.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3f41307d589c2f25d556d47b165df808124cd0c4 ]
Acquire and release the GEM object's reservation lock around calls
to the object's purge operation. The tests use
drm_gem_shmem_purge_locked(), which led to errors such as show below.
[ 58.709128] WARNING: CPU: 1 PID: 1354 at drivers/gpu/drm/drm_gem_shmem_helper.c:515 drm_gem_shmem_purge_locked+0x51c/0x740
Only export the new helper drm_gem_shmem_purge() for Kunit tests.
This is not an interface for regular drivers.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 954907f7147d ("drm/shmem-helper: Refactor locked/unlocked functions")
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.16+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20251212160317.287409-6-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 607d07d8cc0b835a8701259f08a03dc149b79b4f ]
Acquire and release the GEM object's reservation lock around calls
to the object's madvide operation. The tests use
drm_gem_shmem_madvise_locked(), which led to errors such as show below.
[ 58.339389] WARNING: CPU: 1 PID: 1352 at drivers/gpu/drm/drm_gem_shmem_helper.c:499 drm_gem_shmem_madvise_locked+0xde/0x140
Only export the new helper drm_gem_shmem_madvise() for Kunit tests.
This is not an interface for regular drivers.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 954907f7147d ("drm/shmem-helper: Refactor locked/unlocked functions")
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.16+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20251212160317.287409-5-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit cda83b099f117f2a28a77bf467af934cb39e49cf ]
Acquire and release the GEM object's reservation lock around vmap and
vunmap operations. The tests use vmap_locked, which led to errors such
as show below.
[ 122.292030] WARNING: CPU: 3 PID: 1413 at drivers/gpu/drm/drm_gem_shmem_helper.c:390 drm_gem_shmem_vmap_locked+0x3a3/0x6f0
[ 122.468066] WARNING: CPU: 3 PID: 1413 at drivers/gpu/drm/drm_gem_shmem_helper.c:293 drm_gem_shmem_pin_locked+0x1fe/0x350
[ 122.563504] WARNING: CPU: 3 PID: 1413 at drivers/gpu/drm/drm_gem_shmem_helper.c:234 drm_gem_shmem_get_pages_locked+0x23c/0x370
[ 122.662248] WARNING: CPU: 2 PID: 1413 at drivers/gpu/drm/drm_gem_shmem_helper.c:452 drm_gem_shmem_vunmap_locked+0x101/0x330
Only export the new vmap/vunmap helpers for Kunit tests. These are
not interfaces for regular drivers.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 954907f7147d ("drm/shmem-helper: Refactor locked/unlocked functions")
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.16+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20251212160317.287409-4-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b47b9ecef309459278eb52f02b50eefdeaac4f6d ]
Automatically unpin pages on cleanup. The test currently fails with
the error
[ 58.246263] drm-kunit-mock-device drm_gem_shmem_test_get_sg_table.drm-kunit-mock-device: [drm] drm_WARN_ON(refcount_read(&shmem->pages_pin_count))
while cleaning up the GEM object. The pin count has to be zero at this
point.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: d586b535f144 ("drm/shmem-helper: Add and use pages_pin_count")
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.16+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20251212160317.287409-3-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 89f23d42006630dd94c01a8c916f8c648141ad8e ]
GEM SHMEM has 2 helpers for exporting S/G tables. Swap the names of
the rsp. tests, so that each matches the helper it tests.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 93032ae634d4 ("drm/test: add a test suite for GEM objects backed by shmem")
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v6.8+
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://patch.msgid.link/20251212160317.287409-2-tzimmermann@suse.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit efe24898485c5c831e629d9c6fb9350c35cb576f ]
Commit 506aa8b02a8d6 ("dma-fence: Add safe access helpers and document
the rules") details the dma-fence safe access rules. The most common
culprit is that drm_sched_fence_get_timeline_name may race with
group_free_queue.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Cc: stable@vger.kernel.org # v6.17+
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patch.msgid.link/20251204174545.399059-1-olvaffe@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5cc7bbd9f1b74d9fe2f7ac08d6ba0477e8d2d65f ]
sdma ring reset is not supported in SRIOV. kfd driver does not check
reset mask, and could queue sdma ring reset during unmap_queues_cpsch.
Avoid the ring reset for sriov.
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1b38a87b8f8020e8ef4563e7752a64182b5a39b9 ]
[Why]
Shaper programming has high chance to fail on first time after
power-on or reboot. This can be verified by running IGT's kms_colorop.
[How]
Always power on the shaper and 3DLUT before programming by
removing the debug flag of low power mode.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 908d318f23d6b5d625bea093c5fc056238cdb7ff ]
This patch limits the clock speeds of the AMD Radeon R5 M420 GPU from
850/1000MHz (core/memory) to 800/950 MHz, making it work stably. This
patch is for radeon.
Signed-off-by: decce6 <decce6@proton.me>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7d9ec9dc20ecdb1661f4538cd9112cd3d6a5f15a ]
[Why]
For RGB BT2020 full and limited color spaces, overlay adjustments were
applied twice (once by MM and once by DAL). This results in incorrect
colours and a noticeable difference between mpo and non-mpo cases.
[How]
Add RGB BT2020 full and limited color spaces to list that bypasses post
csc adjustment.
Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Clay King <clayking@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 49fe2c57bdc0acff9d2551ae337270b6fd8119d9 ]
This patch limits the clock speeds of the AMD Radeon R5 M420 GPU from
850/1000MHz (core/memory) to 800/950 MHz, making it work stably. This
patch is for amdgpu.
Signed-off-by: decce6 <decce6@proton.me>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3ee1c72606bd2842f0f377fd4b118362af0323ae ]
Tune the sleep interval in the PSP fence wait loop from 10-100us to
60-100us.This adjustment results in an overall wait window of 1.2s
(60us * 20000 iterations) to 2 seconds (100us * 20000 iterations),
which guarantees that we can retrieve the correct fence value
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 044f8d3b1fac6ac89c560f61415000e6bdab3a03 ]
end the function flow when ras table checksum is error
Signed-off-by: Gangliang Xie <ganglxie@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 1a38ded4bc8ac09fd029ec656b1e2c98cc0d238c ]
[Why & How]
Although it's dummy updates of surface update for committing stream
updates, we should not have dummy_updates[j].surface all indicating
to the same surface under multiple surfaces case. Otherwise,
copy_surface_update_to_plane() in update_planes_and_stream_state()
will update to the same surface only.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 64c94cd9be2e188ed07efeafa6a109bce638c967 ]
[Why]
System will try to apply idle power optimizations setting during
system resume. But system power state is still in D3 state, and
it will cause the idle power optimizations command not actually
to be sent to DMUB and cause some platforms to go into IPS.
[How]
Set power state to D0 first before calling the
dc_dmub_srv_apply_idle_power_optimizations(dm->dc, false)
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8980be03b3f9a4b58197ef95d3b37efa41a25331 ]
VF doesn't enable VCN poison irq in VCNv2.5. Skip releasing it and avoid
call trace during deinitialization.
[ 71.913601] [drm] clean up the vf2pf work item
[ 71.915088] ------------[ cut here ]------------
[ 71.915092] WARNING: CPU: 3 PID: 1079 at /tmp/amd.aFkFvSQl/amd/amdgpu/amdgpu_irq.c:641 amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[ 71.915355] Modules linked in: amdgpu(OE-) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_display_helper cec rc_core i2c_algo_bit video wmi binfmt_misc nls_iso8859_1 intel_rapl_msr intel_rapl_common input_leds joydev serio_raw mac_hid qemu_fw_cfg sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 hid_generic crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel usbhid 8139too sha256_ssse3 sha1_ssse3 hid psmouse bochs i2c_i801 ahci drm_vram_helper libahci i2c_smbus lpc_ich drm_ttm_helper 8139cp mii ttm aesni_intel crypto_simd cryptd
[ 71.915484] CPU: 3 PID: 1079 Comm: rmmod Tainted: G OE 6.8.0-87-generic #88~22.04.1-Ubuntu
[ 71.915489] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-2.el9_5.1 04/01/2014
[ 71.915492] RIP: 0010:amdgpu_irq_put+0xc6/0xe0 [amdgpu]
[ 71.915768] Code: 75 84 b8 ea ff ff ff eb d4 44 89 ea 48 89 de 4c 89 e7 e8 fd fc ff ff 5b 41 5c 41 5d 41 5e 5d 31 d2 31 f6 31 ff e9 55 30 3b c7 <0f> 0b eb d4 b8 fe ff ff ff eb a8 e9 b7 3b 8a 00 66 2e 0f 1f 84 00
[ 71.915771] RSP: 0018:ffffcf0800eafa30 EFLAGS: 00010246
[ 71.915775] RAX: 0000000000000000 RBX: ffff891bda4b0668 RCX: 0000000000000000
[ 71.915777] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 71.915779] RBP: ffffcf0800eafa50 R08: 0000000000000000 R09: 0000000000000000
[ 71.915781] R10: 0000000000000000 R11: 0000000000000000 R12: ffff891bda480000
[ 71.915782] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
[ 71.915792] FS: 000070cff87c4c40(0000) GS:ffff893abfb80000(0000) knlGS:0000000000000000
[ 71.915795] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 71.915797] CR2: 00005fa13073e478 CR3: 000000010d634006 CR4: 0000000000770ef0
[ 71.915800] PKRU: 55555554
[ 71.915802] Call Trace:
[ 71.915805] <TASK>
[ 71.915809] vcn_v2_5_hw_fini+0x19e/0x1e0 [amdgpu]
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9ef84a307582a92ef055ef0bd3db10fd8ac75960 ]
[WHAT]
1. Set no scaling for writeback as they are hardcoded in DCN3.2+.
2. Set no fast plane update for writeback commits.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8079b87c02e531cc91601f72ea8336dd2262fdf1 ]
Add validation to ensure user queue sizes meet hardware requirements:
- Size must be a power of two for efficient ring buffer wrapping
- Size must be at least AMDGPU_GPU_PAGE_SIZE to prevent undersized allocations
This prevents invalid configurations that could lead to GPU faults or
unexpected behavior.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 15b1d7b77e9836ff4184093163174a1ef28bbdd7 ]
[Why]
When usb4 link training fails, the dpia sym clock will be disabled and SYMCLK
source should be changed back to phy clock. In enable_streams, it is
assumed that link training succeeded and will switch from refclk to
phy clock. But phy clk here might not be on. Dig reg access timeout
will occur.
[How]
When enable_stream is hit, check if link training failed for usb4.
If it did, fall back to the ref clock to avoid reg access timeout.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bdad08670278829771626ea7b57c4db531e2544f ]
Using >=, <= for checking the family is not always correct.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a625dc4989a2affb8f06e7b418bf30e1474b99c1 ]
[Why&How]
This reverts commit 14bb17cc37e0.
Due to the change, the display shows garbage on startup.
We have an alternative solution for the original issue:
d24203bb629f ("drm/amd/display: Re-check seamless boot can be enabled or not")
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wang, Sung-huai <Danny.Wang@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit bc847787233277a337788568e90a6ee1557595eb ]
The atmel_hlcdc_plane_atomic_duplicate_state() callback was copying
the atmel_hlcdc_plane state structure without properly duplicating the
drm_plane_state. In particular, state->commit remained set to the old
state commit, which can lead to a use-after-free in the next
drm_atomic_commit() call.
Fix this by calling
__drm_atomic_helper_duplicate_plane_state(), which correctly clones
the base drm_plane_state (including the ->commit pointer).
It has been seen when closing and re-opening the device node while
another DRM client (e.g. fbdev) is still attached:
=============================================================================
BUG kmalloc-64 (Not tainted): Poison overwritten
-----------------------------------------------------------------------------
0xc611b344-0xc611b344 @offset=836. First byte 0x6a instead of 0x6b
FIX kmalloc-64: Restoring Poison 0xc611b344-0xc611b344=0x6b
Allocated in drm_atomic_helper_setup_commit+0x1e8/0x7bc age=178 cpu=0
pid=29
drm_atomic_helper_setup_commit+0x1e8/0x7bc
drm_atomic_helper_commit+0x3c/0x15c
drm_atomic_commit+0xc0/0xf4
drm_framebuffer_remove+0x4cc/0x5a8
drm_mode_rmfb_work_fn+0x6c/0x80
process_one_work+0x12c/0x2cc
worker_thread+0x2a8/0x400
kthread+0xc0/0xdc
ret_from_fork+0x14/0x28
Freed in drm_atomic_helper_commit_hw_done+0x100/0x150 age=8 cpu=0
pid=169
drm_atomic_helper_commit_hw_done+0x100/0x150
drm_atomic_helper_commit_tail+0x64/0x8c
commit_tail+0x168/0x18c
drm_atomic_helper_commit+0x138/0x15c
drm_atomic_commit+0xc0/0xf4
drm_atomic_helper_set_config+0x84/0xb8
drm_mode_setcrtc+0x32c/0x810
drm_ioctl+0x20c/0x488
sys_ioctl+0x14c/0xc20
ret_fast_syscall+0x0/0x54
Slab 0xef8bc360 objects=21 used=16 fp=0xc611b7c0
flags=0x200(workingset|zone=0)
Object 0xc611b340 @offset=832 fp=0xc611b7c0
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Link: https://patch.msgid.link/20251024-lcd_fixes_mainlining-v1-2-79b615130dc3@microchip.com
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 06682206e2a1883354ed758c09efeb51f435adbd ]
Don’t reject the commit when the source rectangle has fractional parts.
This can occur due to scaling: drm_atomic_helper_check_plane_state() calls
drm_rect_clip_scaled(), which may introduce fractional parts while
computing the clipped source rectangle. This does not imply the commit is
invalid, so we should accept it instead of discarding it.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Link: https://patch.msgid.link/20251120-lcd_scaling_fix-v1-1-5ffc98557923@microchip.com
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f12352471061df83a36edf54bbb16284793284e4 ]
After several commits, the slab memory increases. Some drm_crtc_commit
objects are not freed. The atomic_destroy_state callback only put the
framebuffer. Use the __drm_atomic_helper_plane_destroy_state() function
to put all the objects that are no longer needed.
It has been seen after hours of usage of a graphics application or using
kmemleak:
unreferenced object 0xc63a6580 (size 64):
comm "egt_basic", pid 171, jiffies 4294940784
hex dump (first 32 bytes):
40 50 34 c5 01 00 00 00 ff ff ff ff 8c 65 3a c6 @P4..........e:.
8c 65 3a c6 ff ff ff ff 98 65 3a c6 98 65 3a c6 .e:......e:..e:.
backtrace (crc c25aa925):
kmemleak_alloc+0x34/0x3c
__kmalloc_cache_noprof+0x150/0x1a4
drm_atomic_helper_setup_commit+0x1e8/0x7bc
drm_atomic_helper_commit+0x3c/0x15c
drm_atomic_commit+0xc0/0xf4
drm_atomic_helper_set_config+0x84/0xb8
drm_mode_setcrtc+0x32c/0x810
drm_ioctl+0x20c/0x488
sys_ioctl+0x14c/0xc20
ret_fast_syscall+0x0/0x54
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Reviewed-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Link: https://patch.msgid.link/20251024-lcd_fixes_mainlining-v1-1-79b615130dc3@microchip.com
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 4589712e0111352973131bad975023b25569287c ]
[Why]
We're missing the code to actually disable the link output when we have
to leave the SYMCLK_ON but the TX remains OFF.
[How]
Port the code from DCN401 that detects SYMCLK_ON_TX_OFF and disable
the link output when the backend is reset.
Reviewed-by: Ovidiu (Ovi) Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8cee62904caf95e5698fa0f2d420f5f22b4dea15 ]
[why & how]
VBIOS DMCUB FW can enable FEC for capable eDPs, but S/W DC state is
only updated for link0 when transitioning into OS with driver loaded.
This causes issues when the eDP is immediately hidden and DIG0 is
assigned to another link that does not support FEC. Driver will
attempt to disable FEC but FEC enablement occurs based on the link
state, which does not have fec_state updated since it is a different
link. Thus, FEC disablement on DIG0 will get skipped and cause no
light up.
Reviewed-by: Karen Chen <karen.chen@amd.com>
Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 64aa8b3a60a825134f7d866adf05c024bbe0c24c ]
Since commit 56de5e305d4b ("clk: renesas: r9a07g044: Add MSTOP for RZ/G2L")
we may get the following kernel panic, for some panels, when rebooting:
systemd-shutdown[1]: Rebooting.
Call trace:
...
do_serror+0x28/0x68
el1h_64_error_handler+0x34/0x50
el1h_64_error+0x6c/0x70
rzg2l_mipi_dsi_host_transfer+0x114/0x458 (P)
mipi_dsi_device_transfer+0x44/0x58
mipi_dsi_dcs_set_display_off_multi+0x9c/0xc4
ili9881c_unprepare+0x38/0x88
drm_panel_unprepare+0xbc/0x108
This happens for panels that need to send MIPI-DSI commands in their
unprepare() callback. Since the MIPI-DSI interface is stopped at that
point, rzg2l_mipi_dsi_host_transfer() triggers the kernel panic.
Fix by moving rzg2l_mipi_dsi_stop() to new callback function
rzg2l_mipi_dsi_atomic_post_disable().
With this change we now have the correct power-down/stop sequence:
systemd-shutdown[1]: Rebooting.
rzg2l-mipi-dsi 10850000.dsi: rzg2l_mipi_dsi_atomic_disable(): entry
ili9881c-dsi 10850000.dsi.0: ili9881c_unprepare(): entry
rzg2l-mipi-dsi 10850000.dsi: rzg2l_mipi_dsi_atomic_post_disable(): entry
reboot: Restarting system
Suggested-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patch.msgid.link/20260112154333.655352-1-hugo@hugovil.com
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 26b4309a3ab82a0697751cde52eb336c29c19035 ]
DRM_IOCTL_MODE_CREATEPROPBLOB allows userspace to allocate arbitrary-sized
property blobs backed by kernel memory.
Currently, the blob data allocation is not accounted to the allocating
process's memory cgroup, allowing unprivileged users to trigger unbounded
kernel memory consumption and potentially cause system-wide OOM.
Mark the property blob data allocation with GFP_KERNEL_ACCOUNT so that the memory
is properly charged to the caller's memcg. This ensures existing cgroup
memory limits apply and prevents uncontrolled kernel memory growth without
introducing additional policy or per-file limits.
Signed-off-by: Xiao Kan <814091656@qq.com>
Signed-off-by: Xiao Kan <xiao.kan@samsung.com>
Link: https://patch.msgid.link/tencent_D12AA2DEDE6F359E1AF59405242FB7A5FD05@qq.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6c160001661b6c4e20f5c31909c722741e14c2d8 ]
In svm_migrate_gart_map(), while migrating GART mapping, the number of
bytes copied for the GART table only accounts for CPU pages. On non-4K
systems, each CPU page can contain multiple GPU pages, and the GART
requires one 8-byte PTE per GPU page. As a result, an incorrect size was
passed to the DMA, causing only a partial update of the GART table.
Fix this function to work correctly on non-4K page-size systems by
accounting for the number of GPU pages per CPU page when calculating the
number of bytes to be copied.
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 42ea9cf2f16b7131cb7302acb3dac510968f8bdc ]
HW-supported EOP buffer sizes are 4K and 32K. On systems that do not
use 4K pages, the minimum buffer object (BO) allocation size is
PAGE_SIZE (for example, 64K). During queue buffer acquisition, the driver
currently checks the allocated BO size against the supported EOP buffer
size. Since the allocated BO is larger than the expected size, this check
fails, preventing queue creation.
Relax the strict size validation and allow PAGE_SIZE-sized BOs to be used.
Only the required 4K region of the buffer will be used as the EOP buffer
and avoids queue creation failures on non-4K page systems.
Acked-by: Christian König <christian.koenig@amd.com>
Suggested-by: Philip Yang <yangp@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 95eed73b871111123a8b1d31cb1fce7e902e49ea ]
In jdi_panel_dsi_remove(), jdi is explicitly checked, indicating that it
may be NULL:
if (!jdi)
mipi_dsi_detach(dsi);
However, when jdi is NULL, the function does not return and continues by
calling jdi_panel_disable():
err = jdi_panel_disable(&jdi->base);
Inside jdi_panel_disable(), jdi is dereferenced unconditionally, which can
lead to a NULL-pointer dereference:
struct jdi_panel *jdi = to_panel_jdi(panel);
backlight_disable(jdi->backlight);
To prevent such a potential NULL-pointer dereference, return early from
jdi_panel_dsi_remove() when jdi is NULL.
Signed-off-by: Tuo Li <islituo@gmail.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patch.msgid.link/20251218120955.11185-1-islituo@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dd1ef5e2456558876244795bb22a4d90cb24f160 ]
If the firmware is not running during TDR (e.g., when the driver is
unloading), there's no need to toggle scheduling in the GuC. In such
cases, skip this step.
v4:
- Bail on wait UC not running (Niranjana)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patch.msgid.link/20260110012739.2888434-4-matthew.brost@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0839d8d24e6f1fc2587c4a976f44da9fa69ae3d0 ]
This avoids any issues with dpia endpoints
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Matthew Stewart <matthew.stewart2@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 39c21b81112321cbe1267b02c77ecd2161ce19aa ]
VFs use the PF SDMA ucode and are unable to load SDMA_RS64.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Signed-off-by: Victor Skvortsov <Victor.Skvortsov@amd.com>
Reviewed-by: Gavin Wan <gavin.wan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Timeline Management v4
[ Upstream commit bd8150a1b3370a9f7761c5814202a3fe5a79f44f ]
This commit simplifies the amdgpu_gem_va_ioctl function, key updates
include:
- Moved the logic for managing the last update fence directly into
amdgpu_gem_va_update_vm.
- Introduced checks for the timeline point to enable conditional
replacement or addition of fences.
v2: Addressed review comments from Christian.
v3: Updated comments (Christian).
v4: The previous version selected the fence too early and did not manage its
reference correctly, which could lead to stale or freed fences being used.
This resulted in refcount underflows and could crash when updating GPU
timelines.
The fence is now chosen only after the VA mapping work is completed, and its
reference is taken safely. After exporting it to the VM timeline syncobj, the
driver always drops its local fence reference, ensuring balanced refcounting
and avoiding use-after-free on dma_fence.
Crash signature:
[ 205.828135] refcount_t: underflow; use-after-free.
[ 205.832963] WARNING: CPU: 30 PID: 7274 at lib/refcount.c:28 refcount_warn_saturate+0xbe/0x110
...
[ 206.074014] Call Trace:
[ 206.076488] <TASK>
[ 206.078608] amdgpu_gem_va_ioctl+0x6ea/0x740 [amdgpu]
[ 206.084040] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu]
[ 206.089994] drm_ioctl_kernel+0x86/0xe0 [drm]
[ 206.094415] drm_ioctl+0x26e/0x520 [drm]
[ 206.098424] ? __pfx_amdgpu_gem_va_ioctl+0x10/0x10 [amdgpu]
[ 206.104402] amdgpu_drm_ioctl+0x4b/0x80 [amdgpu]
[ 206.109387] __x64_sys_ioctl+0x96/0xe0
[ 206.113156] do_syscall_64+0x66/0x2d0
...
[ 206.553351] BUG: unable to handle page fault for address: ffffffffc0dfde90
...
[ 206.553378] RIP: 0010:dma_fence_signal_timestamp_locked+0x39/0xe0
...
[ 206.553405] Call Trace:
[ 206.553409] <IRQ>
[ 206.553415] ? __pfx_drm_sched_fence_free_rcu+0x10/0x10 [gpu_sched]
[ 206.553424] dma_fence_signal+0x30/0x60
[ 206.553427] drm_sched_job_done.isra.0+0x123/0x150 [gpu_sched]
[ 206.553434] dma_fence_signal_timestamp_locked+0x6e/0xe0
[ 206.553437] dma_fence_signal+0x30/0x60
[ 206.553441] amdgpu_fence_process+0xd8/0x150 [amdgpu]
[ 206.553854] sdma_v4_0_process_trap_irq+0x97/0xb0 [amdgpu]
[ 206.554353] edac_mce_amd(E) ee1004(E)
[ 206.554270] amdgpu_irq_dispatch+0x150/0x230 [amdgpu]
[ 206.554702] amdgpu_ih_process+0x6a/0x180 [amdgpu]
[ 206.555101] amdgpu_irq_handler+0x23/0x60 [amdgpu]
[ 206.555500] __handle_irq_event_percpu+0x4a/0x1c0
[ 206.555506] handle_irq_event+0x38/0x80
[ 206.555509] handle_edge_irq+0x92/0x1e0
[ 206.555513] __common_interrupt+0x3e/0xb0
[ 206.555519] common_interrupt+0x80/0xa0
[ 206.555525] </IRQ>
[ 206.555527] <TASK>
...
[ 206.555650] RIP: 0010:dma_fence_signal_timestamp_locked+0x39/0xe0
...
[ 206.555667] Kernel panic - not syncing: Fatal exception in interrupt
Link: https://patchwork.freedesktop.org/patch/654669/
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|