diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-01-22 03:09:47 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-01-22 03:09:47 +0300 |
commit | 96c84703f1cf6ea43617f9565166681cd71df104 (patch) | |
tree | 3979738a0ede41e5bb5a21b957b82667a6ef39d7 /drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | |
parent | c0e75905caf368e19aab585d20151500e750de89 (diff) | |
parent | 951a6bf30667307e7901aac5e74e50dadd5ccfc7 (diff) | |
download | linux-96c84703f1cf6ea43617f9565166681cd71df104.tar.xz |
Merge tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
"There are two external interactions of note, the msm tree pull in some
opp tree, hopefully the opp tree arrives from the same git tree
however it normally does.
There is also a new cgroup controller for device memory, that is used
by drm, so is merging through my tree. This will hopefully help open
up gpu cgroup usage a bit more and move us forward.
There is a new accelerator driver for the AMD XDNA Ryzen AI NPUs.
Then the usual xe/amdgpu/i915/msm leaders and lots of changes and
refactors across the board:
core:
- device memory cgroup controller added
- Remove driver date from drm_driver
- Add drm_printer based hex dumper
- drm memory stats docs update
- scheduler documentation improvements
new driver:
- amdxdna - Ryzen AI NPU support
connector:
- add a mutex to protect ELD
- make connector setup two-step
panels:
- Introduce backlight quirks infrastructure
- New panels: KDB KD116N2130B12, Tianma TM070JDHG34-00,
- Multi-Inno Technology MI1010Z1T-1CP11
bridge:
- ti-sn65dsi83: Add ti,lvds-vod-swing optional properties
- Provide default implementation of atomic_check for HDMI bridges
- it605: HDCP improvements, MCCS Support
xe:
- make OA buffer size configurable
- GuC capture fixes
- add ufence and g2h flushes
- restore system memory GGTT mappings
- ioctl fixes
- SRIOV PF scheduling priority
- allow fault injection
- lots of improvements/refactors
- Enable GuC's WA_DUAL_QUEUE for newer platforms
- IRQ related fixes and improvements
i915:
- More accurate engine busyness metrics with GuC submission
- Ensure partial BO segment offset never exceeds allowed max
- Flush GuC CT receive tasklet during reset preparation
- Some DG2 refactor to fix DG2 bugs when operating with certain CPUs
- Fix DG1 power gate sequence
- Enabling uncompressed 128b/132b UHBR SST
- Handle hdmi connector init failures, and no HDMI/DP cases
- More robust engine resets on Haswell and older
i915/xe display:
- HDCP fixes for Xe3Lpd
- New GSC FW ARL-H/ARL-U
- support 3 VDSC engines 12 slices
- MBUS joining sanitisation
- reconcile i915/xe display power mgmt
- Xe3Lpd fixes
- UHBR rates for Thunderbolt
amdgpu:
- DRM panic support
- track BO memory stats at runtime
- Fix max surface handling in DC
- Cleaner shader support for gfx10.3 dGPUs
- fix drm buddy trim handling
- SDMA engine reset updates
- Fix doorbell ttm cleanup
- RAS updates
- ISP updates
- SDMA queue reset support
- Rework DPM powergating interfaces
- Documentation updates and cleanups
- DCN 3.5 updates
- Use a pm notifier to more gracefully handle VRAM eviction on
suspend or hibernate
- Add debugfs interfaces for forcing scheduling to specific engine
instances
- GG 9.5 updates
- IH 4.4 updates
- Make missing optional firmware less noisy
- PSP 13.x updates
- SMU 13.x updates
- VCN 5.x updates
- JPEG 5.x updates
- GC 12.x updates
- DC FAMS updates
amdkfd:
- GG 9.5 updates
- Logging improvements
- Shader debugger fixes
- Trap handler cleanup
- Cleanup includes
- Eviction fence wq fix
msm:
- MDSS:
- properly described UBWC registers
- added SM6150 (aka QCS615) support
- DPU:
- added SM6150 (aka QCS615) support
- enabled wide planes if virtual planes are enabled (by using two
SSPPs for a single plane)
- added CWB hardware blocks support
- DSI:
- added SM6150 (aka QCS615) support
- GPU:
- Print GMU core fw version
- GMU bandwidth voting for a740 and a750
- Expose uche trap base via uapi
- UAPI error reporting
rcar-du:
- Add r8a779h0 Support
ivpu:
- Fix qemu crash when using passthrough
nouveau:
- expose GSP-RM logging buffers via debugfs
panfrost:
- Add MT8188 Mali-G57 MC3 support
rockchip:
- Gamma LUT support
hisilicon:
- new HIBMC support
virtio-gpu:
- convert to helpers
- add prime support for scanout buffers
v3d:
- Add DRM_IOCTL_V3D_PERFMON_SET_GLOBAL
vc4:
- Add support for BCM2712
vkms:
- line-per-line compositing algorithm to improve performance
zynqmp:
- Add DP audio support
mediatek:
- dp: Add sdp path reset
- dp: Support flexible length of DP calibration data
etnaviv:
- add fdinfo memory support
- add explicit reset handling"
* tag 'drm-next-2025-01-17' of https://gitlab.freedesktop.org/drm/kernel: (1070 commits)
drm/bridge: fix documentation for the hdmi_audio_prepare() callback
doc/cgroup: Fix title underline length
drm/doc: Include new drm-compute documentation
cgroup/dmem: Fix parameters documentation
cgroup/dmem: Select PAGE_COUNTER
kernel/cgroup: Remove the unused variable climit
drm/display: hdmi: Do not read EDID on disconnected connectors
drm/tests: hdmi: Add connector disablement test
drm/connector: hdmi: Do atomic check when necessary
drm/amd/display: 3.2.316
drm/amd/display: avoid reset DTBCLK at clock init
drm/amd/display: improve dpia pre-train
drm/amd/display: Apply DML21 Patches
drm/amd/display: Use HW lock mgr for PSR1
drm/amd/display: Revised for Replay Pseudo vblank control
drm/amd/display: Add a new flag for replay low hz
drm/amd/display: Remove unused read_ono_state function from Hwss module
drm/amd/display: Do not elevate mem_type change to full update
drm/amd/display: Do not wait for PSR disable on vbl enable
drm/amd/display: Remove unnecessary eDP power down
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 90 |
1 files changed, 82 insertions, 8 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c index 1d155463d044..784b03abb3a4 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c @@ -515,7 +515,7 @@ int amdgpu_gfx_disable_kcq(struct amdgpu_device *adev, int xcc_id) if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues) return -EINVAL; - if (!kiq_ring->sched.ready || adev->job_hang || amdgpu_in_reset(adev)) + if (!kiq_ring->sched.ready || amdgpu_in_reset(adev)) return 0; spin_lock(&kiq->ring_lock); @@ -567,7 +567,7 @@ int amdgpu_gfx_disable_kgq(struct amdgpu_device *adev, int xcc_id) if (!kiq->pmf || !kiq->pmf->kiq_unmap_queues) return -EINVAL; - if (!adev->gfx.kiq[0].ring.sched.ready || adev->job_hang) + if (!adev->gfx.kiq[0].ring.sched.ready || amdgpu_in_reset(adev)) return 0; if (amdgpu_gfx_is_master_xcc(adev, xcc_id)) { @@ -806,7 +806,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) /* If going to s2idle, no need to wait */ if (adev->in_s0ix) { if (!amdgpu_dpm_set_powergating_by_smu(adev, - AMD_IP_BLOCK_TYPE_GFX, true)) + AMD_IP_BLOCK_TYPE_GFX, true, 0)) adev->gfx.gfx_off_state = true; } else { schedule_delayed_work(&adev->gfx.gfx_off_delay_work, @@ -818,7 +818,7 @@ void amdgpu_gfx_off_ctrl(struct amdgpu_device *adev, bool enable) cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work); if (adev->gfx.gfx_off_state && - !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false)) { + !amdgpu_dpm_set_powergating_by_smu(adev, AMD_IP_BLOCK_TYPE_GFX, false, 0)) { adev->gfx.gfx_off_state = false; if (adev->gfx.funcs->init_spm_golden) { @@ -1484,6 +1484,24 @@ static int amdgpu_gfx_run_cleaner_shader(struct amdgpu_device *adev, int xcp_id) return 0; } +/** + * amdgpu_gfx_set_run_cleaner_shader - Execute the AMDGPU GFX Cleaner Shader + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer containing the input data + * @count: The size of the input data + * + * Provides the sysfs interface to manually run a cleaner shader, which is + * used to clear the GPU state between different tasks. Writing a value to the + * 'run_cleaner_shader' sysfs file triggers the cleaner shader execution. + * The value written corresponds to the partition index on multi-partition + * devices. On single-partition devices, the value should be '0'. + * + * The cleaner shader clears the Local Data Store (LDS) and General Purpose + * Registers (GPRs) to ensure data isolation between GPU workloads. + * + * Return: The number of bytes written to the sysfs file. + */ static ssize_t amdgpu_gfx_set_run_cleaner_shader(struct device *dev, struct device_attribute *attr, const char *buf, @@ -1532,6 +1550,19 @@ static ssize_t amdgpu_gfx_set_run_cleaner_shader(struct device *dev, return count; } +/** + * amdgpu_gfx_get_enforce_isolation - Query AMDGPU GFX Enforce Isolation Settings + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer to store the output data + * + * Provides the sysfs read interface to get the current settings of the 'enforce_isolation' + * feature for each GPU partition. Reading from the 'enforce_isolation' + * sysfs file returns the isolation settings for all partitions, where '0' + * indicates disabled and '1' indicates enabled. + * + * Return: The number of bytes read from the sysfs file. + */ static ssize_t amdgpu_gfx_get_enforce_isolation(struct device *dev, struct device_attribute *attr, char *buf) @@ -1555,6 +1586,20 @@ static ssize_t amdgpu_gfx_get_enforce_isolation(struct device *dev, return size; } +/** + * amdgpu_gfx_set_enforce_isolation - Control AMDGPU GFX Enforce Isolation + * @dev: The device structure + * @attr: The device attribute structure + * @buf: The buffer containing the input data + * @count: The size of the input data + * + * This function allows control over the 'enforce_isolation' feature, which + * serializes access to the graphics engine. Writing '1' or '0' to the + * 'enforce_isolation' sysfs file enables or disables process isolation for + * each partition. The input should specify the setting for all partitions. + * + * Return: The number of bytes written to the sysfs file. + */ static ssize_t amdgpu_gfx_set_enforce_isolation(struct device *dev, struct device_attribute *attr, const char *buf, size_t count) @@ -1940,6 +1985,17 @@ void amdgpu_gfx_enforce_isolation_handler(struct work_struct *work) mutex_unlock(&adev->enforce_isolation_mutex); } +/** + * amdgpu_gfx_enforce_isolation_wait_for_kfd - Manage KFD wait period for process isolation + * @adev: amdgpu_device pointer + * @idx: Index of the GPU partition + * + * When kernel submissions come in, the jobs are given a time slice and once + * that time slice is up, if there are KFD user queues active, kernel + * submissions are blocked until KFD has had its time slice. Once the KFD time + * slice is up, KFD user queues are preempted and kernel submissions are + * unblocked and allowed to run again. + */ static void amdgpu_gfx_enforce_isolation_wait_for_kfd(struct amdgpu_device *adev, u32 idx) @@ -1985,6 +2041,15 @@ amdgpu_gfx_enforce_isolation_wait_for_kfd(struct amdgpu_device *adev, msleep(GFX_SLICE_PERIOD_MS); } +/** + * amdgpu_gfx_enforce_isolation_ring_begin_use - Begin use of a ring with enforced isolation + * @ring: Pointer to the amdgpu_ring structure + * + * Ring begin_use helper implementation for gfx which serializes access to the + * gfx IP between kernel submission IOCTLs and KFD user queues when isolation + * enforcement is enabled. The kernel submission IOCTLs and KFD user queues + * each get a time slice when both are active. + */ void amdgpu_gfx_enforce_isolation_ring_begin_use(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; @@ -2016,6 +2081,15 @@ void amdgpu_gfx_enforce_isolation_ring_begin_use(struct amdgpu_ring *ring) amdgpu_gfx_kfd_sch_ctrl(adev, idx, false); } +/** + * amdgpu_gfx_enforce_isolation_ring_end_use - End use of a ring with enforced isolation + * @ring: Pointer to the amdgpu_ring structure + * + * Ring end_use helper implementation for gfx which serializes access to the + * gfx IP between kernel submission IOCTLs and KFD user queues when isolation + * enforcement is enabled. The kernel submission IOCTLs and KFD user queues + * each get a time slice when both are active. + */ void amdgpu_gfx_enforce_isolation_ring_end_use(struct amdgpu_ring *ring) { struct amdgpu_device *adev = ring->adev; @@ -2058,7 +2132,7 @@ static int amdgpu_debugfs_gfx_sched_mask_set(void *data, u64 val) if (!adev) return -ENODEV; - mask = (1 << adev->gfx.num_gfx_rings) - 1; + mask = (1ULL << adev->gfx.num_gfx_rings) - 1; if ((val & mask) == 0) return -EINVAL; @@ -2086,7 +2160,7 @@ static int amdgpu_debugfs_gfx_sched_mask_get(void *data, u64 *val) for (i = 0; i < adev->gfx.num_gfx_rings; ++i) { ring = &adev->gfx.gfx_ring[i]; if (ring->sched.ready) - mask |= 1 << i; + mask |= 1ULL << i; } *val = mask; @@ -2128,7 +2202,7 @@ static int amdgpu_debugfs_compute_sched_mask_set(void *data, u64 val) if (!adev) return -ENODEV; - mask = (1 << adev->gfx.num_compute_rings) - 1; + mask = (1ULL << adev->gfx.num_compute_rings) - 1; if ((val & mask) == 0) return -EINVAL; @@ -2157,7 +2231,7 @@ static int amdgpu_debugfs_compute_sched_mask_get(void *data, u64 *val) for (i = 0; i < adev->gfx.num_compute_rings; ++i) { ring = &adev->gfx.compute_ring[i]; if (ring->sched.ready) - mask |= 1 << i; + mask |= 1ULL << i; } *val = mask; |