Age | Commit message (Collapse) | Author | Files | Lines |
|
Prefetch for SVM ranges can have more than one operation to increment,
hence modify the function to accept an increment value as input.
v2:
- Call xe_vma_ops_incr_pt_update_ops only once for REMAP (Matthew Brost)
- Add check for 0 ops
v3:
- s/u8/int for inc_val and num_remap_ops (Matthew Brost)
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-7-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
These functions will be used in prefetch too, therefore make them public.
v2
- Fix kernel doc
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-6-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
The to_xe_range function will be used in other files. Therefore, make it
public and add kernel-doc documentation
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-5-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Introduce a helper to add tile mask of binding present and invalidated
for the range. Add a lockdep_assert to ensure it is protected by GPU SVM
notifier lock.
-v7
rebased
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-4-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
This function will be used in prefetch too, hence make it public.
v2:
- Add kernel-doc (Matthew Brost)
- Rebase
v3:
- Move CONFIG_DRM_XE_DEVMEM_MIRROR stub out to xe_svm.c (Matthew Brost)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-3-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
Add xe_vma_op_prefetch_range struct for svm ranges prefetching, including
an xarray of SVM range pointers, range count, and target memory region.
-v2: Fix doc
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://lore.kernel.org/r/20250513040228.470682-2-himal.prasad.ghimiray@intel.com
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
|
|
The PTL XELPDP_PORT_CLOCK_CTL register XELPDP_DDI_CLOCK_SELECT field's
size is 5 bits vs. the earlier platforms where its size is 4 bits. Make
sure the field is read-out/programmed everywhere correctly, according to
the above.
Cc: Mika Kahola <mika.kahola@intel.com>
Cc: stable@vger.kernel.org # v6.13+
Tested-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20250512142600.824347-1-imre.deak@intel.com
|
|
Currently we are seeing these on PTL:
xe 0000:00:02.0: [drm] *ERROR* Timeout waiting for DDI BUF A to get active
These seem to be caused by writing ALPM registers while Panel Replay is
enabled.
Fix this by writing ALPM registers only when Panel Replay is about to be
enabled.
v4: improve comment on intel_psr_panel_replay_enable_sink call
v3: enable/disable ALPM from PSR code
Fixes: 172757acd6f6 ("drm/i915/lobf: Add lobf enablement in post plane update")
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250513054814.3702977-3-jouni.hogander@intel.com
(cherry picked from commit a8eb102ce0944a9de2a62aa9d195861b7f26668a)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
We want to enable sink ALPM from PSR code. Make intel_alpm_enable_sink
available for PSR.
v2: do not add kerneldoc comments
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://lore.kernel.org/r/20250513054814.3702977-2-jouni.hogander@intel.com
(cherry picked from commit 2d278488761f0b5be651a3db41e615a964123d6c)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
vmwgfx enables its PCI device with pcim_enable_device(). This,
implicitly, switches the function pci_request_regions() into managed
mode, where it becomes a devres function.
The PCI subsystem wants to remove this hybrid nature from its
interfaces. To do so, users of the aforementioned combination of
functions must be ported to non-hybrid functions.
Moreover, since both functions are already managed in this driver, the
calls to pci_release_regions() are unnecessary.
Remove the calls to pci_release_regions().
Replace the call to sometimes-managed pci_request_regions() with one to
always-managed pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/r/20250514073126.85443-2-phasta@kernel.org
|
|
These includes have become unnecessary. Drop them.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/6ca3be3e3fbbd99c169345c3add4b76315390e77.1747128495.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
The include is not needed since commit 44a34dec43e8 ("drm/i915:
Calculate the VT-d guard size in the display code").
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/80ea203e004b7378c14f2367258b5785e40bf126.1747128495.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
We've accumulated lots of forward declarations in intel_display.h that
are no longer necessary. Clean them up.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/5ad046b74040e84fab51786c346ff9a445e351bc.1747128495.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Avoid passing drm_i915_private to DISPLAY_VER().
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/5e97ee7675b32397163eb4fba17184fc1c5a04cd.1747128495.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
All the PPS register users have been converted to struct
intel_display. The backward compat conversion to struct drm_i915_private
is no longer needed. Drop it, along with the include, and convert the
dev_priv macro parameter names to display while at it.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://lore.kernel.org/r/4c23fd8dfcadefeeb52189045421084bcfd50d57.1747128495.git.jani.nikula@intel.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
|
Currently we are seeing these on PTL:
xe 0000:00:02.0: [drm] *ERROR* Timeout waiting for DDI BUF A to get active
These seem to be caused by writing ALPM registers while Panel Replay is
enabled.
Fix this by writing ALPM registers only when Panel Replay is about to be
enabled.
v4: improve comment on intel_psr_panel_replay_enable_sink call
v3: enable/disable ALPM from PSR code
Fixes: 172757acd6f6 ("drm/i915/lobf: Add lobf enablement in post plane update")
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Link: https://lore.kernel.org/r/20250513054814.3702977-3-jouni.hogander@intel.com
|
|
We want to enable sink ALPM from PSR code. Make intel_alpm_enable_sink
available for PSR.
v2: do not add kerneldoc comments
Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com>
Signed-off-by: Jouni Högander <jouni.hogander@intel.com>
Link: https://lore.kernel.org/r/20250513054814.3702977-2-jouni.hogander@intel.com
|
|
i.MX8qxp Display Controller(DC) is comprised of three main components that
include a blit engine for 2D graphics accelerations, display controller for
display output processing, as well as a command sequencer. Add kernel
mode setting support for the display controller part with two CRTCs and
two primary planes(backed by FetchLayer and FetchWarp respectively). The
registers of the display controller are accessed without command sequencer
involved, instead just by using CPU. The command sequencer is supposed to
be used by the blit engine.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Link: https://lore.kernel.org/r/20250414035028.1561475-13-victor.liu@nxp.com
|
|
i.MX8qxp Display Controller has a built-in interrupt controller to support
Enable/Status/Preset/Clear interrupt bit. Add driver for it.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Link: https://lore.kernel.org/r/20250414035028.1561475-12-victor.liu@nxp.com
|
|
i.MX8qxp Display Controller pixel engine consists of all processing
units that operate in the AXI bus clock domain. Add drivers for
ConstFrame, ExtDst, FetchLayer, FetchWarp and LayerBlend units, as
well as a pixel engine driver, so that two displays with primary
planes can be supported. The pixel engine driver and those unit
drivers are components to be aggregated by a master registered in
the upcoming DRM driver.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20250414035028.1561475-11-victor.liu@nxp.com
|
|
i.MX8qxp Display Controller display engine consists of all processing
units that operate in a display clock domain. Add minimal feature
support with FrameGen and TCon so that the engine can output display
timings. The FrameGen driver, TCon driver and display engine driver
are components to be aggregated by a master registered in the upcoming
DRM driver.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20250414035028.1561475-10-victor.liu@nxp.com
|
|
This patch is based on the commit 73eb5476df72 ("drm: rcar-du: Support
panels connected directly to the DPAD outputs").
The RZ DU driver assumes that a bridge is always connected to the DU
output. This is valid for the HDMI output, but the DPAD output can be
connected directly to a panel, in which case no bridge is available.
To support this use case, detect whether the entities connected to the DU
DPAD output is encoders or panels based on the number of ports of their DT
node, and retrieve the corresponding type of DRM objects. For panels,
additionally create panel bridge instances.
Signed-off-by: hienhuynh <hien.huynh.px@renesas.com>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Tested-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Link: https://lore.kernel.org/r/20250508095042.25164-1-biju.das.jz@bp.renesas.com
|
|
Since commit 21aa27ddc582 ("drm/shmem-helper: Switch to reservation
lock"), the drm_gem_shmem_vmap and drm_gem_shmem_vunmap functions
require that the caller holds the DMA reservation lock for the object.
Add lockdep assertions to help validate this.
Signed-off-by: Asahi Lina <lina@asahilina.net>
Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20250318-drm-gem-shmem-v1-1-64b96511a84f@collabora.com
|
|
The xe buffer object shrinker name is visible in the
<debugfs>/shrinker directory and most if not all other shinkers
follow a naming convention that looks like
<subsystem>-<driver>_<objects>:<unique>
Follow the same convention for xe, changing the name to
drm-xe_gem:<unique>.
Other shrinkers typically use the device node for <unique> but
since drm drivers typically don't have a single unique device-
node, instead use the unique name in the drm device.
Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos")
Cc: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Francois Dugast <francois.dugast@intel.com>
Link: https://lore.kernel.org/r/20250508112931.3347-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 243bf99e2fe75edf8df1711c1377b6fc020b806c)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
|
|
It's expected that we'll encounter temporary exceptions
during aux transactions. Adjust logging from drm_info to
drm_dbg_dp to prevent flooding with unnecessary log messages.
Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9a9c3e1fe5256da14a0a307dff0478f90c55fc8c)
Cc: stable@vger.kernel.org
|
|
Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for
pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null
pointer dereference on dcn20_update_dchubp_dpp. This is the same
function hooked for update_dchubp_dpp in dcn401, with the same issue.
Fix possible null pointer deference on dcn401_program_pipe too.
Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d8d47f739752227957d8efc0cb894761bfe1d879)
|
|
[Why & How]
Fix a false positive warning which occurs due to lack of correct checks
when querying plane_id in DML21. This fixes the warning when performing a
mode1 reset (cat /sys/kernel/debug/dri/1/amdgpu_gpu_recover):
[ 35.751250] WARNING: CPU: 11 PID: 326 at /tmp/amd.PHpyAl7v/amd/amdgpu/../display/dc/dml2/dml2_dc_resource_mgmt.c:91 dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[ 35.751434] Modules linked in: amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) amddrm_buddy(OE) amdxcp(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) drm_suballoc_helper drm_ttm_helper ttm drm_display_helper cec rc_core i2c_algo_bit rfcomm qrtr cmac algif_hash algif_skcipher af_alg bnep amd_atl intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi snd_hda_intel edac_mce_amd snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core snd_hwdep snd_pcm kvm snd_seq_midi snd_seq_midi_event snd_rawmidi crct10dif_pclmul polyval_clmulni polyval_generic btusb ghash_clmulni_intel sha256_ssse3 btrtl sha1_ssse3 snd_seq btintel aesni_intel btbcm btmtk snd_seq_device crypto_simd sunrpc cryptd bluetooth snd_timer ccp binfmt_misc rapl snd i2c_piix4 wmi_bmof gigabyte_wmi k10temp i2c_smbus soundcore gpio_amdpt mac_hid sch_fq_codel msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_generic usbhid hid crc32_pclmul igc ahci xhci_pci libahci xhci_pci_renesas video wmi
[ 35.751501] CPU: 11 UID: 0 PID: 326 Comm: kworker/u64:9 Tainted: G OE 6.11.0-21-generic #21~24.04.1-Ubuntu
[ 35.751504] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 35.751505] Hardware name: Gigabyte Technology Co., Ltd. X670E AORUS PRO X/X670E AORUS PRO X, BIOS F30 05/22/2024
[ 35.751506] Workqueue: amdgpu-reset-dev amdgpu_debugfs_reset_work [amdgpu]
[ 35.751638] RIP: 0010:dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[ 35.751794] Code: 6d 0c 00 00 8b 84 24 88 00 00 00 41 3b 44 9c 20 0f 84 fc 07 00 00 48 83 c3 01 48 83 fb 06 75 b3 4c 8b 64 24 68 4c 8b 6c 24 40 <0f> 0b b8 06 00 00 00 49 8b 94 24 a0 49 00 00 89 c3 83 f8 07 0f 87
[ 35.751796] RSP: 0018:ffffbfa3805d7680 EFLAGS: 00010246
[ 35.751798] RAX: 0000000000010000 RBX: 0000000000000006 RCX: 0000000000000000
[ 35.751799] RDX: 0000000000000000 RSI: 0000000000000005 RDI: 0000000000000000
[ 35.751800] RBP: ffffbfa3805d78f0 R08: 0000000000000000 R09: 0000000000000000
[ 35.751801] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbfa383249000
[ 35.751802] R13: ffffa0e68f280000 R14: ffffbfa383249658 R15: 0000000000000000
[ 35.751803] FS: 0000000000000000(0000) GS:ffffa0edbe580000(0000) knlGS:0000000000000000
[ 35.751804] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.751805] CR2: 00005d847ef96c58 CR3: 000000041de3e000 CR4: 0000000000f50ef0
[ 35.751806] PKRU: 55555554
[ 35.751807] Call Trace:
[ 35.751810] <TASK>
[ 35.751816] ? show_regs+0x6c/0x80
[ 35.751820] ? __warn+0x88/0x140
[ 35.751822] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[ 35.751964] ? report_bug+0x182/0x1b0
[ 35.751969] ? handle_bug+0x6e/0xb0
[ 35.751972] ? exc_invalid_op+0x18/0x80
[ 35.751974] ? asm_exc_invalid_op+0x1b/0x20
[ 35.751978] ? dml2_map_dc_pipes+0x243d/0x3f40 [amdgpu]
[ 35.752117] ? math_pow+0x48/0xa0 [amdgpu]
[ 35.752256] ? srso_alias_return_thunk+0x5/0xfbef5
[ 35.752260] ? math_pow+0x48/0xa0 [amdgpu]
[ 35.752400] ? srso_alias_return_thunk+0x5/0xfbef5
[ 35.752403] ? math_pow+0x11/0xa0 [amdgpu]
[ 35.752524] ? srso_alias_return_thunk+0x5/0xfbef5
[ 35.752526] ? core_dcn4_mode_programming+0xe4d/0x20d0 [amdgpu]
[ 35.752663] ? srso_alias_return_thunk+0x5/0xfbef5
[ 35.752669] dml21_validate+0x3d4/0x980 [amdgpu]
Reviewed-by: Austin Zheng <austin.zheng@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f8ad62c0a93e5dd94243e10f1b742232e4d6411e)
|
|
[Why & How]
When MST config is unplugged/replugged too quickly, it can potentially
result in a scenario where previous DC state has not been reset before
the HPD link detection sequence begins. In this case, driver will
disable the streams/link prior to re-enabling the link for link
training.
There is a bug in the current logic that does not account for the fact
that current_state can be released and cleared prior to swapping to a
new state (resulting in the pipe_ctx stream pointers to be cleared) in
between disabling streams.
To resolve this, cache the original streams prior to committing any
stream updates.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: George Shen <george.shen@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 1561782686ccc36af844d55d31b44c938dd412dc)
|
|
[Why & How]
Instead of dropping DRR updates, defer them. This fixes issues where
monitor continues to see incorrect refresh rate after VRR was turned off
by userspace.
Fixes: 32953485c558 ("drm/amd/display: Do not update DRR while BW optimizations pending")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3546
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: John Olender <john.olender@gmail.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 53761b7ecd83e6fbb9f2206f8c980a6aa308c844)
Cc: stable@vger.kernel.org
|
|
This reverts commit 756c85e4d0dd ("drm/amd/display: Enable urgent latency adjustment on DCN35")
Reason for revert: Negative power impact.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9334c491cd8f388232b9a187bf0ddb728482bd6f)
|
|
[Why]
Now forcing aux->transfer to return 0 when incomplete AUX write is
inappropriate. It should return bytes have been transferred.
[How]
aux->transfer is asked not to change original msg except reply field of
drm_dp_aux_msg structure. Copy the msg->buffer when it's write request,
and overwrite the first byte when sink reply 1 byte indicating partially
written byte number. Then we can return the correct value without
changing the original msg.
Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Ray Wu <ray.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ac37f0dcd2e0b729fa7b5513908dc8ab802b540)
Cc: stable@vger.kernel.org
|
|
On GFX1151, the reported MALL cache size reflects only
half of its actual size; this adjustment corrects the discrepancy.
Signed-off-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0a5c060b593ad152318f89e5564bfdfcff8a6ac0)
Cc: stable@vger.kernel.org
|
|
After process exit to unmap csa and free GPU vm, if signal is accepted
and then waiting to take vm lock is interrupted and return, it causes
memory leaking and below warning backtrace.
Change to use uninterruptible wait lock fix the issue.
WARNING: CPU: 69 PID: 167800 at amd/amdgpu/amdgpu_kms.c:1525
amdgpu_driver_postclose_kms+0x294/0x2a0 [amdgpu]
Call Trace:
<TASK>
drm_file_free.part.0+0x1da/0x230 [drm]
drm_close_helper.isra.0+0x65/0x70 [drm]
drm_release+0x6a/0x120 [drm]
amdgpu_drm_release+0x51/0x60 [amdgpu]
__fput+0x9f/0x280
____fput+0xe/0x20
task_work_run+0x67/0xa0
do_exit+0x217/0x3c0
do_group_exit+0x3b/0xb0
get_signal+0x14a/0x8d0
arch_do_signal_or_restart+0xde/0x100
exit_to_user_mode_loop+0xc1/0x1a0
exit_to_user_mode_prepare+0xf4/0x100
syscall_exit_to_user_mode+0x17/0x40
do_syscall_64+0x69/0xc0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7dbbfb3c171a6f63b01165958629c9c26abf38ab)
Cc: stable@vger.kernel.org
|
|
Fix DEBUG_LOCKS_WARN_ON(lock->magic != lock) warning logs.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The ttm_bo_pin and ttm_bo_unpin warnings are resolved by moving the
doorbell bo reserve up before pin/unpin.
WARNING: CPU: 11 PID: 1818 at drivers/gpu/drm/ttm/ttm_bo.c:592 ttm_bo_pin+0x1f6/0x270 [ttm]
[ +0.000277] CPU: 11 UID: 1000 PID: 1818 Comm: Xwayland Tainted: G W 6.12.0+ #15
[ +0.000006] Tainted: [W]=WARN
[ +0.000004] Hardware name: ASUS System Product Name/TUF GAMING B650-PLUS, BIOS 3072 12/20/2024
[ +0.000004] RIP: 0010:ttm_bo_pin+0x1f6/0x270 [ttm]
[ +0.000005] RSP: 0018:ffff88846ca879d0 EFLAGS: 00010246
[ +0.000007] RAX: 0000000000000000 RBX: ffff88810b7ca848 RCX: 0000000000000000
[ +0.000004] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ +0.000005] RBP: ffff88846ca879e8 R08: 0000000000000000 R09: 0000000000000000
[ +0.000004] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810b7ca848
[ +0.000004] R13: ffff88846c666250 R14: 1ffff1108d950f44 R15: ffff88846ca87aa0
[ +0.000005] FS: 00007c45ff436d00(0000) GS:ffff888409580000(0000) knlGS:0000000000000000
[ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000005] CR2: 00005b0c142a60e0 CR3: 000000012ce5a000 CR4: 0000000000f50ef0
[ +0.000004] PKRU: 55555554
[ +0.000004] Call Trace:
[ +0.000004] <TASK>
[ +0.000005] ? show_regs+0x6c/0x80
[ +0.000007] ? __warn+0xd2/0x2d0
[ +0.000007] ? ttm_bo_pin+0x1f6/0x270 [ttm]
[ +0.000031] ? report_bug+0x282/0x2f0
[ +0.000012] ? handle_bug+0x6e/0xc0
[ +0.000007] ? exc_invalid_op+0x18/0x50
[ +0.000007] ? asm_exc_invalid_op+0x1b/0x20
[ +0.000017] ? ttm_bo_pin+0x1f6/0x270 [ttm]
[ +0.000014] amdgpu_bo_pin+0x365/0x9d0 [amdgpu]
[ +0.000191] ? __pfx_amdgpu_bo_pin+0x10/0x10 [amdgpu]
[ +0.000185] ? drm_gem_object_lookup+0x81/0xc0
[ +0.000008] ? kasan_save_alloc_info+0x37/0x60
[ +0.000007] ? __kasan_kmalloc+0xc3/0xd0
[ +0.000013] amdgpu_userqueue_get_doorbell_index+0xee/0x5f0 [amdgpu]
[ +0.000209] amdgpu_userq_ioctl+0x6b4/0xd40 [amdgpu]
[ +0.000193] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000211] ? lock_acquire+0x7c/0xc0
[ +0.000006] ? drm_dev_enter+0x51/0x190
[ +0.000015] drm_ioctl_kernel+0x18b/0x330
[ +0.000007] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000190] ? __pfx_drm_ioctl_kernel+0x10/0x10
[ +0.000005] ? lock_acquire+0x7c/0xc0
[ +0.000009] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? __kasan_check_write+0x14/0x30
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000011] drm_ioctl+0x589/0xd00
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000006] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000194] ? __pfx_drm_ioctl+0x10/0x10
[ +0.000006] ? __pm_runtime_resume+0x80/0x110
[ +0.000021] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? trace_hardirqs_on+0x53/0x60
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? _raw_spin_unlock_irqrestore+0x51/0x80
[ +0.000013] amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu]
[ +0.000185] __x64_sys_ioctl+0x13a/0x1c0
[ +0.000010] x64_sys_call+0x11ad/0x25f0
[ +0.000007] do_syscall_64+0x91/0x180
[ +0.000007] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? irqentry_exit+0x77/0xb0
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? exc_page_fault+0x93/0x150
[ +0.000009] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000005] RIP: 0033:0x7c45ff924ded
[ +0.000005] RSP: 002b:00007ffff7167810 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ +0.000008] RAX: ffffffffffffffda RBX: 00000000c0486456 RCX: 00007c45ff924ded
[ +0.000004] RDX: 00007ffff7167870 RSI: 00000000c0486456 RDI: 000000000000000b
[ +0.000004] RBP: 00007ffff7167860 R08: ffff800100000000 R09: 0000000000010000
[ +0.000005] R10: 00007ffff7167950 R11: 0000000000000246 R12: 00005b0c2a51bc48
[ +0.000004] R13: 000000000000000b R14: 0000000000000000 R15: 00007ffff7167950
[ +0.000022] </TASK>
[ +0.000004] irq event stamp: 80693
[ +0.000004] hardirqs last enabled at (80699): [<ffffffff86a693a9>] __up_console_sem+0x79/0xa0
[ +0.000005] hardirqs last disabled at (80704): [<ffffffff86a6938e>] __up_console_sem+0x5e/0xa0
[ +0.000005] softirqs last enabled at (80390): [<ffffffff8687377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000005] softirqs last disabled at (80385): [<ffffffff8687377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000006] ---[ end trace 0000000000000000 ]---
------------------------------------------------------------------------------------------------------
[ +0.000006] WARNING: CPU: 10 PID: 1818 at drivers/gpu/drm/ttm/ttm_bo.c:611 ttm_bo_unpin+0x21f/0x2c0 [ttm]
[ +0.000280] CPU: 10 UID: 1000 PID: 1818 Comm: Xwayland Tainted: G W 6.12.0+ #15
[ +0.000006] Tainted: [W]=WARN
[ +0.000004] Hardware name: ASUS System Product Name/TUF GAMING B650-PLUS, BIOS 3072 12/20/2024
[ +0.000004] RIP: 0010:ttm_bo_unpin+0x21f/0x2c0 [ttm]
[ +0.000005] RSP: 0018:ffff88846ca87888 EFLAGS: 00010246
[ +0.000007] RAX: 0000000000000000 RBX: ffff88810b7ca848 RCX: 0000000000000000
[ +0.000005] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ +0.000004] RBP: ffff88846ca878a0 R08: 0000000000000000 R09: 0000000000000000
[ +0.000004] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888164e90050
[ +0.000005] R13: ffff88846c666200 R14: 0000000000000001 R15: ffff888168402d28
[ +0.000004] FS: 00007c45ff436d00(0000) GS:ffff888409500000(0000) knlGS:0000000000000000
[ +0.000005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ +0.000004] CR2: 00007c45f7373b20 CR3: 000000012ce5a000 CR4: 0000000000f50ef0
[ +0.000005] PKRU: 55555554
[ +0.000004] Call Trace:
[ +0.000004] <TASK>
[ +0.000005] ? show_regs+0x6c/0x80
[ +0.000008] ? __warn+0xd2/0x2d0
[ +0.000007] ? ttm_bo_unpin+0x21f/0x2c0 [ttm]
[ +0.000012] ? report_bug+0x282/0x2f0
[ +0.000013] ? handle_bug+0x6e/0xc0
[ +0.000006] ? exc_invalid_op+0x18/0x50
[ +0.000008] ? asm_exc_invalid_op+0x1b/0x20
[ +0.000017] ? ttm_bo_unpin+0x21f/0x2c0 [ttm]
[ +0.000011] ? ttm_bo_unpin+0x217/0x2c0 [ttm]
[ +0.000011] amdgpu_bo_unpin+0x45/0x250 [amdgpu]
[ +0.000216] amdgpu_userq_ioctl+0x2c3/0xd40 [amdgpu]
[ +0.000226] ? drm_dev_exit+0x2d/0x60
[ +0.000010] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000201] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? lock_acquire+0x7c/0xc0
[ +0.000006] ? drm_dev_enter+0x51/0x190
[ +0.000015] drm_ioctl_kernel+0x18b/0x330
[ +0.000007] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000188] ? __pfx_drm_ioctl_kernel+0x10/0x10
[ +0.000006] ? lock_acquire+0x7c/0xc0
[ +0.000008] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? __kasan_check_write+0x14/0x30
[ +0.000006] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000010] drm_ioctl+0x589/0xd00
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000006] ? __pfx_amdgpu_userq_ioctl+0x10/0x10 [amdgpu]
[ +0.000211] ? __pfx_drm_ioctl+0x10/0x10
[ +0.000006] ? __pm_runtime_resume+0x80/0x110
[ +0.000020] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000006] ? trace_hardirqs_on+0x53/0x60
[ +0.000005] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? _raw_spin_unlock_irqrestore+0x51/0x80
[ +0.000013] amdgpu_drm_ioctl+0xd2/0x1c0 [amdgpu]
[ +0.000186] __x64_sys_ioctl+0x13a/0x1c0
[ +0.000010] x64_sys_call+0x11ad/0x25f0
[ +0.000007] do_syscall_64+0x91/0x180
[ +0.000007] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? do_syscall_64+0x9d/0x180
[ +0.000007] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000010] ? __pfx___rseq_handle_notify_resume+0x10/0x10
[ +0.000005] ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
[ +0.000013] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000009] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000008] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? syscall_exit_to_user_mode+0x95/0x260
[ +0.000008] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? do_syscall_64+0x9d/0x180
[ +0.000007] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? do_syscall_64+0x9d/0x180
[ +0.000011] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000010] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000009] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000008] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? irqentry_exit_to_user_mode+0x8b/0x260
[ +0.000007] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000006] ? irqentry_exit+0x77/0xb0
[ +0.000004] ? srso_alias_return_thunk+0x5/0xfbef5
[ +0.000005] ? exc_page_fault+0x93/0x150
[ +0.000010] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000005] RIP: 0033:0x7c45ff924ded
[ +0.000005] RSP: 002b:00007ffff7168790 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ +0.000008] RAX: ffffffffffffffda RBX: 00000000c0486456 RCX: 00007c45ff924ded
[ +0.000005] RDX: 00007ffff71687f0 RSI: 00000000c0486456 RDI: 000000000000000b
[ +0.000004] RBP: 00007ffff71687e0 R08: 00005b0c2a49b010 R09: 0000000000000007
[ +0.000004] R10: 00005b0c2a4d7140 R11: 0000000000000246 R12: 000000000000000b
[ +0.000004] R13: 00007c45ff19e5cc R14: 00005b0c2a51c538 R15: 00005b0c2a51bbd8
[ +0.000022] </TASK>
[ +0.000005] irq event stamp: 87419
[ +0.000004] hardirqs last enabled at (87425): [<ffffffff86a693a9>] __up_console_sem+0x79/0xa0
[ +0.000005] hardirqs last disabled at (87430): [<ffffffff86a6938e>] __up_console_sem+0x5e/0xa0
[ +0.000005] softirqs last enabled at (87058): [<ffffffff8687377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000006] softirqs last disabled at (87053): [<ffffffff8687377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000005] ---[ end trace 0000000000000000 ]---
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix lockdep warnings.
[ +0.000637] ================================
[ +0.000004] WARNING: inconsistent lock state
[ +0.000004] 6.12.0+ #18 Tainted: G W OE
[ +0.000004] --------------------------------
[ +0.000004] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ +0.000004] Xwayland/1952 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ +0.000005] ffff8884636f4740 (&fence_drv->fence_list_lock){?...}-{2:2}, at: amdgpu_userq_fence_driver_destroy+0xb8/0x540 [amdgpu]
[ +0.000208] {IN-HARDIRQ-W} state was registered at:
[ +0.000004] lock_acquire.part.0+0x116/0x360
[ +0.000005] lock_acquire+0x7c/0xc0
[ +0.000005] _raw_spin_lock+0x2f/0x60
[ +0.000005] amdgpu_userq_fence_driver_process+0x75/0x400 [amdgpu]
[ +0.000185] gfx_v12_0_eop_irq+0x29f/0x420 [amdgpu]
[ +0.000210] amdgpu_irq_dispatch+0x2a4/0x7b0 [amdgpu]
[ +0.000191] amdgpu_ih_process+0x1e1/0x3d0 [amdgpu]
[ +0.000185] amdgpu_irq_handler+0x28/0xc0 [amdgpu]
[ +0.000183] __handle_irq_event_percpu+0x1bb/0x590
[ +0.000005] handle_irq_event+0xab/0x1d0
[ +0.000005] handle_edge_irq+0x1fd/0xc10
[ +0.000005] __common_interrupt+0x83/0x190
[ +0.000004] common_interrupt+0xb1/0xe0
[ +0.000005] asm_common_interrupt+0x27/0x40
[ +0.000004] cpuidle_enter_state+0x2ba/0x530
[ +0.000005] cpuidle_enter+0x4f/0xb0
[ +0.000006] call_cpuidle+0x46/0xd0
[ +0.000005] do_idle+0x367/0x430
[ +0.000004] cpu_startup_entry+0x58/0x70
[ +0.000005] start_secondary+0x224/0x2b0
[ +0.000005] common_startup_64+0x13e/0x141
[ +0.000005] irq event stamp: 88271
[ +0.000004] hardirqs last enabled at (88271): [<ffffffffad9ca7a1>] _raw_spin_unlock_irqrestore+0x51/0x80
[ +0.000005] hardirqs last disabled at (88270): [<ffffffffad9ca424>] _raw_spin_lock_irqsave+0x74/0x80
[ +0.000005] softirqs last enabled at (87858): [<ffffffffaa67377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000005] softirqs last disabled at (87849): [<ffffffffaa67377e>] __irq_exit_rcu+0x17e/0x1d0
[ +0.000005]
other info that might help us debug this:
[ +0.000004] Possible unsafe locking scenario:
[ +0.000003] CPU0
[ +0.000004] ----
[ +0.000003] lock(&fence_drv->fence_list_lock);
v2:
Drop fence_list_flags and use xa_lock_irqsave() flags parameter (Christian)
Fix merge conflicts.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It's expected that we'll encounter temporary exceptions
during aux transactions. Adjust logging from drm_info to
drm_dbg_dp to prevent flooding with unnecessary log messages.
Fixes: 3637e457eb00 ("drm/amd/display: Fix wrong handling for AUX_DEFER case")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://lore.kernel.org/r/20250513032026.838036-1-Wayne.Lin@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The kthread header doesn't need to be included anymore. It's a relict
from commit a6149f039369 ("drm/sched: Convert drm scheduler to use a
work queue rather than kthread").
Remove the unneeded includes.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250314101023.111248-3-phasta@kernel.org
|
|
The GPU scheduler's comments refer to a "thread" at various places.
Those are leftovers from commit a6149f039369 ("drm/sched: Convert drm
scheduler to use a work queue rather than kthread").
Replace all references to kthreads.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250314101023.111248-2-phasta@kernel.org
|
|
limit on MST
Atm, on an MST link in DSC mode
intel_dp_compute_config_link_bpp_limits() calculates the maximum link
bpp limit using the MST root connector's DSC capabilities. That's not
correct in general: the decompression could be performed by a branch
device downstream of the root branch device or the sink itself.
Fix the above by passing to intel_dp_compute_config_link_bpp_limits()
the actual connector being modeset, containing the correct DSC
capabilities.
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Fixes: 1c5b72daff46 ("drm/i915/dp: Set the DSC link limits in intel_dp_compute_config_link_bpp_limits")
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20250509180340.554867-2-imre.deak@intel.com
(cherry picked from commit 266e2fcfe2ea0d062ea392cd22f6250ae0d11c04)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
This patch enables per-queue and per-pipe reset functionality for
GFX IP v9.5.0 when using MEC firmware version 21 (0x15) or later.
This change:
1. Refactors the pipe reset support check in gfx_v9_4_3_pipe_reset_support()
to use the compute_supported_reset flags instead of hardcoding
version checks.
2. Adds support for GFX9.5.0 (IP 9.5.0) with MEC firmware version >= 21
to enable per-queue and per-pipe reset capabilities.
v2: Replaced mec version check with !!(adev->gfx.compute_supported_reset & AMDGPU_RESET_TYPE_PER_PIPE) (Lijo)
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Set vram type so we can take different actions according to the type.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make the code more general, user doesn't need to pay attention to the
detail of flip bits setting.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The major of dcn and dce irqs share a copy-pasted collection
of copy-pasted function, which is: hpd_ack.
This patch removes the multiple copy-pasted by moving them to
the irq_service.c and make the irq_service's
calls the functions implemented by the irq_service.c
instead.
The hpd_ack function is replaced by hpd0_ack and hpd1_ack, the
required constants are also added.
The changes were not tested on actual hardware. I am only able
to verify that the changes keep the code compileable and do my
best to look repeatedly if I am not actually changing any code.
Signed-off-by: Sebastian Aguilera Novoa <saguileran@ime.usp.br>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for
pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null
pointer dereference on dcn20_update_dchubp_dpp. This is the same
function hooked for update_dchubp_dpp in dcn401, with the same issue.
Fix possible null pointer deference on dcn401_program_pipe too.
Fixes: 63ab80d9ac0a ("drm/amd/display: DML2.1 Post-Si Cleanup")
Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The number of newly added de counts and the number of
newly added error addresses remain consistent
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Commit ded8b3c36f17 ("drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init()")
enables all 16 vmids for MMHUB on GC 10 and newer for KGD since
there are no KFD resources using MMHUB. With this change, KFD
starts seeing MMHUB vmids in it's range with no pasid set. As
such there is no need to warn, we can just ignore those interrupts.
Fixes: aded8b3c36f1 ("drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init()")
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
During driver load, RAS event manager may not be initialized. This will
cause any ATHUB event during driver load to be skipped in dmesg log. Log
the error in dmesg log for easier diagnosis.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This resolves a deadlock between user queue management and GPU reset
paths by enforcing consistent lock ordering.
The deadlock occurred when:
1. Process exit path (amdgpu_userq_mgr_fini) would:
- Take uqm->userq_mutex
- Then try to take adev->userq_mutex for list operations
2. GPU reset path (amdgpu_userq_pre_reset) would:
- Take adev->userq_mutex first (for list traversal)
- Then take uqm->userq_mutex
The solution establishes a strict top-down locking order:
1. Always take adev->userq_mutex before any uqm->userq_mutex
2. Maintain this order consistently across all code paths
Changes made:
- Reordered locking in amdgpu_userq_mgr_fini() to take device lock first
- Kept existing proper order in amdgpu_userq_pre_reset()
- Simplified the fini flow by removing redundant operations
This prevents circular dependencies while maintaining thread safety
during both normal operation and GPU reset scenarios.
Fixes: 4ce60dbada96 ("drm/amdgpu: store userq_managers in a list in adev")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Arvind Yadav <Arvind.Yadav@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
kernel panic caused by RAS records exceeding the threshold when load driver
specifying RMA(bad_page_threshold=128)
1.Fix the warnings caused by disabling the interrupt source
before it was enabled
2.Fix kernel panic when xcp sysfs is not initialized,null pointer
appears during fini
3.Fix the memory leak caused by the device's early exit due to rma
The first reason:
[ 2744.246650] ------------[ cut here ]------------
[ 2744.246651] WARNING: CPU: 0 PID: 289 at /tmp/amd.BkfTLqYV/amd/amdgpu/amdgpu_irq.c:635 amdgpu_irq_put.cold+0x42/0x6e [amdgpu]
[ 2744.247108] Modules linked in: amdgpu(OE+) amddrm_ttm_helper(OE) amdttm(OE) amdxcp(OE) amddrm_buddy(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay binfmt_misc intel_rapl_msr intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel nls_iso8859_1 kvm rapl isst_if_mbox_pci isst_if_mmio pmt_telemetry pmt_crashlog isst_if_common pmt_class mei_me mei acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[ 2744.247167] linear mlx5_ib ib_uverbs ib_core ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul syscopyarea crc32_pclmul ghash_clmulni_intel mlx5_core sysfillrect sysimgblt aesni_intel mlxfw fb_sys_fops psample cec crypto_simd cryptd rc_core i2c_i801 nvme xhci_pci tls intel_pmt drm pci_hyperv_intf nvme_core i2c_smbus i2c_ismt xhci_pci_renesas wmi pinctrl_emmitsburg
[ 2744.247194] CPU: 0 PID: 289 Comm: kworker/0:1 Tainted: G OE 5.15.0-70-generic #77-Ubuntu
[ 2744.247197] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C23.AG.2 11/21/2024
[ 2744.247198] Workqueue: events work_for_cpu_fn
[ 2744.247206] RIP: 0010:amdgpu_irq_put.cold+0x42/0x6e [amdgpu]
[ 2744.247634] Code: 79 7f ff 44 89 ee 48 c7 c7 4d 5a 42 c2 89 55 d4 e8 90 09 bc bf 8b 55 d4 4c 89 e6 4c 89 ff e8 3c 76 7f ff 8b 55 d4 84 c0 75 07 <0f> 0b e9 95 79 7f ff 49 03 5c 24 08 f0 ff 0b 75 13 4c 89 e6 4c 89
[ 2744.247636] RSP: 0018:ffa0000019e27cb0 EFLAGS: 00010246
[ 2744.247639] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ff11000150fa87c0
[ 2744.247641] RDX: 0000000000000000 RSI: ffffffffc2222430 RDI: ff1100019f200000
[ 2744.247642] RBP: ffa0000019e27ce0 R08: 0000000000000003 R09: ffffffffffe41a08
[ 2744.247643] R10: 0000000000ffff0a R11: 0000000000000001 R12: ff1100019f22ce60
[ 2744.247644] R13: 0000000000000000 R14: 00000000ffffffea R15: ff1100019f200000
[ 2744.247645] FS: 0000000000000000(0000) GS:ff11007e7e400000(0000) knlGS:0000000000000000
[ 2744.247647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2744.247649] CR2: 00007f3d2002819c CR3: 0000000006810003 CR4: 0000000000771ef0
[ 2744.247650] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2744.247651] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 2744.247652] PKRU: 55555554
[ 2744.247653] Call Trace:
[ 2744.247654] <TASK>
[ 2744.247656] sdma_v4_4_2_hw_fini+0x7a/0xc0 [amdgpu]
[ 2744.247997] ? vcn_v4_0_3_hw_fini+0x5f/0xa0 [amdgpu]
[ 2744.248336] amdgpu_ip_block_hw_fini+0x31/0x61 [amdgpu]
[ 2744.248776] amdgpu_device_fini_hw+0x3bb/0x47b [amdgpu]
[ 2744.249197] ? blocking_notifier_chain_unregister+0x56/0xb0
[ 2744.249202] amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
[ 2744.249482] amdgpu_driver_load_kms.cold+0x18/0x2e [amdgpu]
[ 2744.249913] amdgpu_pci_probe+0x23e/0x590 [amdgpu]
[ 2744.250187] local_pci_probe+0x48/0x90
[ 2744.250191] work_for_cpu_fn+0x17/0x30
[ 2744.250196] process_one_work+0x228/0x3d0
[ 2744.250198] worker_thread+0x223/0x420
[ 2744.250200] ? process_one_work+0x3d0/0x3d0
[ 2744.250201] kthread+0x127/0x150
[ 2744.250204] ? set_kthread_struct+0x50/0x50
[ 2744.250207] ret_from_fork+0x1f/0x30
[ 2744.250212] </TASK>
[ 2744.250213] ---[ end trace 488c997a88508bc3 ]---
The second reason:
[ 5139.303446] Memory manager not clean during takedown.
[ 5139.303509] WARNING: CPU: 145 PID: 117699 at drivers/gpu/drm/drm_mm.c:998 drm_mm_takedown+0x27/0x30 [drm]
[ 5139.303542] Modules linked in: amdgpu(OE+) amddrm_ttm_helper(OE) amdttm(OE) amdxcp(OE) amddrm_buddy(OE) amddrm_exec(OE) amd_sched(OE) amdkcl(OE) xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nf_tables nfnetlink br_netfilter bridge stp llc overlay intel_rapl_msr intel_rapl_common i10nm_edac nfit x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel binfmt_misc kvm nls_iso8859_1 rapl isst_if_mbox_pci pmt_telemetry pmt_crashlog isst_if_mmio pmt_class isst_if_common mei_me mei acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad acpi_power_meter mac_hid sch_fq_codel dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua msr ramoops reed_solomon pstore_blk pstore_zone efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath
[ 5139.303572] linear mlx5_ib ib_uverbs ib_core crct10dif_pclmul ast crc32_pclmul i2c_algo_bit ghash_clmulni_intel aesni_intel crypto_simd drm_vram_helper cryptd drm_ttm_helper mlx5_core ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core mlxfw psample intel_pmt nvme xhci_pci drm tls i2c_i801 pci_hyperv_intf nvme_core i2c_smbus i2c_ismt xhci_pci_renesas wmi pinctrl_emmitsburg [last unloaded: amdkcl]
[ 5139.303588] CPU: 145 PID: 117699 Comm: modprobe Tainted: G U OE 5.15.0-70-generic #77-Ubuntu
[ 5139.303590] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C23.AG.2 11/21/2024
[ 5139.303591] RIP: 0010:drm_mm_takedown+0x27/0x30 [drm]
[ 5139.303605] Code: cc 66 90 0f 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 f8 75 05 c3 cc cc cc cc 55 48 c7 c7 18 d0 10 c0 48 89 e5 e8 5a bc c3 c1 <0f> 0b 5d c3 cc cc cc cc 90 0f 1f 44 00 00 55 b9 15 00 00 00 48 89
[ 5139.303607] RSP: 0018:ffa00000325c3940 EFLAGS: 00010286
[ 5139.303608] RAX: 0000000000000000 RBX: ff1100012f5cecb0 RCX: 0000000000000027
[ 5139.303609] RDX: ff11007e7fa60588 RSI: 0000000000000001 RDI: ff11007e7fa60580
[ 5139.303610] RBP: ffa00000325c3940 R08: 0000000000000003 R09: fffffffff00c2b78
[ 5139.303610] R10: 000000000000002b R11: 0000000000000001 R12: ff1100012f5cec00
[ 5139.303611] R13: ff1100012138f068 R14: 0000000000000000 R15: ff1100012f5cec90
[ 5139.303611] FS: 00007f42ffca0000(0000) GS:ff11007e7fa40000(0000) knlGS:0000000000000000
[ 5139.303612] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5139.303613] CR2: 00007f23d945ab68 CR3: 00000001212ce005 CR4: 0000000000771ee0
[ 5139.303614] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5139.303615] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 5139.303615] PKRU: 55555554
[ 5139.303616] Call Trace:
[ 5139.303617] <TASK>
[ 5139.303619] amdttm_range_man_fini_nocheck+0xfe/0x1c0 [amdttm]
[ 5139.303625] amdgpu_ttm_fini+0x2ed/0x390 [amdgpu]
[ 5139.303800] amdgpu_bo_fini+0x27/0xc0 [amdgpu]
[ 5139.303959] gmc_v9_0_sw_fini+0x63/0x90 [amdgpu]
[ 5139.304144] amdgpu_device_fini_sw+0x125/0x6a0 [amdgpu]
[ 5139.304302] amdgpu_driver_release_kms+0x16/0x30 [amdgpu]
[ 5139.304455] devm_drm_dev_init_release+0x4a/0x80 [drm]
[ 5139.304472] devm_action_release+0x12/0x20
[ 5139.304476] release_nodes+0x3d/0xb0
[ 5139.304478] devres_release_all+0x9b/0xd0
[ 5139.304480] really_probe+0x11d/0x420
[ 5139.304483] __driver_probe_device+0x119/0x190
[ 5139.304485] driver_probe_device+0x23/0xc0
[ 5139.304487] __driver_attach+0xf7/0x1f0
[ 5139.304489] ? __device_attach_driver+0x140/0x140
[ 5139.304491] bus_for_each_dev+0x7c/0xd0
[ 5139.304493] driver_attach+0x1e/0x30
[ 5139.304494] bus_add_driver+0x148/0x220
[ 5139.304496] driver_register+0x95/0x100
[ 5139.304498] __pci_register_driver+0x68/0x70
[ 5139.304500] amdgpu_init+0xbc/0x1000 [amdgpu]
[ 5139.304655] ? 0xffffffffc0b8f000
[ 5139.304657] do_one_initcall+0x46/0x1e0
[ 5139.304659] ? kmem_cache_alloc_trace+0x19e/0x2e0
[ 5139.304663] do_init_module+0x52/0x260
[ 5139.304665] load_module+0xb2b/0xbc0
[ 5139.304667] __do_sys_finit_module+0xbf/0x120
[ 5139.304669] __x64_sys_finit_module+0x18/0x20
[ 5139.304670] do_syscall_64+0x59/0xc0
[ 5139.304673] ? exit_to_user_mode_prepare+0x37/0xb0
[ 5139.304676] ? syscall_exit_to_user_mode+0x27/0x50
[ 5139.304678] ? __x64_sys_mmap+0x33/0x50
[ 5139.304680] ? do_syscall_64+0x69/0xc0
[ 5139.304681] entry_SYSCALL_64_after_hwframe+0x61/0xcb
[ 5139.304684] RIP: 0033:0x7f42ffdbf88d
[ 5139.304686] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
[ 5139.304687] RSP: 002b:00007ffcb7427158 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
[ 5139.304688] RAX: ffffffffffffffda RBX: 000055ce8b8f3150 RCX: 00007f42ffdbf88d
[ 5139.304689] RDX: 0000000000000000 RSI: 000055ce8b8f9a70 RDI: 000000000000000a
[ 5139.304690] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000011
[ 5139.304690] R10: 000000000000000a R11: 0000000000000246 R12: 000055ce8b8f9a70
[ 5139.304691] R13: 000055ce8b8f2ec0 R14: 000055ce8b8f2ab0 R15: 000055ce8b8f9aa0
[ 5139.304692] </TASK>
[ 5139.304693] ---[ end trace 8536b052f7883003 ]---
Signed-off-by: Ce Sun <cesun102@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|