kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2024-09-13	drm/xe: Fix missing conversion to xe_display_pm_runtime_resume	Maarten Lankhorst	1	-1/+1
	This error path was missed when converting away from xe_display_pm_resume with second argument. Fixes: 66a0f6b9f5fc ("drm/xe/display: handle HPD polling in display runtime suspend/resume") Cc: Arun R Murthy <arun.r.murthy@intel.com> Cc: Vinod Govindapillai <vinod.govindapillai@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-2-maarten.lankhorst@linux.intel.com (cherry picked from commit 474f64cb988a410db8a0b779d6afdaa2a7fc5759) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-13	drm/xe: fix build warning with CONFIG_PM=n	Arnd Bergmann	1	-0/+4
	The 'runtime_status' field is an implementation detail of the power management code, so a device driver should not normally touch this: drivers/gpu/drm/xe/xe_pm.c: In function 'xe_pm_suspending_or_resuming': drivers/gpu/drm/xe/xe_pm.c:606:26: error: 'struct dev_pm_info' has no member named 'runtime_status' 606 \| return dev->power.runtime_status == RPM_SUSPENDING \|\| \| ^ drivers/gpu/drm/xe/xe_pm.c:607:27: error: 'struct dev_pm_info' has no member named 'runtime_status' 607 \| dev->power.runtime_status == RPM_RESUMING; \| ^ drivers/gpu/drm/xe/xe_pm.c:608:1: error: control reaches end of non-void function [-Werror=return-type] Add an #ifdef check to avoid the build regression. Fixes: ad92f5231261 ("drm/xe: Suppress missing outer rpm protection warning") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240909202521.1018439-1-arnd@kernel.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 1c129ed07de47684ff2471e32b52fa823533aa06) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-12	drm/xe: Suppress missing outer rpm protection warning	Rodrigo Vivi	1	-1/+16
	Do not raise a WARN if we are likely within suspending or resuming path. This is likely this false positive: rpm_status: 0000:03:00.0 status=RPM_SUSPENDING console: xe_bo_evict_all (called from suspend) xe_sched_job_create: dev=0000:03:00.0, ... xe_sched_job_exec: dev=0000:03:00.0, ... xe_pm_runtime_put: dev=0000:03:00.0, ... xe_sched_job_run: dev=0000:03:00.0, ... rpm_usage: 0000:03:00.0 flags-0 cnt-2 ... rpm_usage: 0000:03:00.0 flags-0 cnt-2 ... rpm_usage: 0000:03:00.0 flags-0 cnt-2 ... console: xe 0000:03:00.0: [drm] Missing outer runtime PM protection console: xe_guc_ct_send+0x15/0x50 [xe] console: guc_exec_queue_run_job+0x1509/0x3950 [xe] [snip] console: drm_sched_run_job_work+0x649/0xc20 At this point, BOs are getting evicted from VRAM with rpm usage-counter = 2, but rpm status = SUSPENDING. The xe->pm_callback_task won't be equal 'current' because this call is coming from a work queue. So, pm_runtime_get_if_active() will be called and return 0 because rpm status != ACTIVE (but equal SUSPENDING or RESUMING). v2: Still get the reference even on non suspending/resuming path (Jonathan, Brost). Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905140215.56404-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit cb85e39dc5d1717fab82810984cce0e54712a3c2) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-09-04	drm/xe: Use xe_pm_runtime_get in xe_bo_move() if reclaim-safe.	Thomas Hellström	1	-1/+8
	xe_bo_move() might be called in the TTM swapout path from validation by another TTM device. If so, we are not likely to have a RPM reference. So iff xe_pm_runtime_get() is safe to call from reclaim, use it instead of xe_pm_runtime_get_noresume(). Strictly this is currently needed only if handle_system_ccs is true, but use xe_pm_runtime_get() if possible anyway to increase test coverage. At the same time warn if handle_system_ccs is true and we can't call xe_pm_runtime_get() from reclaim context. This will likely trip if someone tries to enable SRIOV on LNL, without fixing Xe SRIOV runtime resume / suspend. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240903094232.166342-1-thomas.hellstrom@linux.intel.com
2024-08-28	drm/xe: Use separate rpm lockdep map for non-d3cold-capable devices	Thomas Hellström	1	-14/+70
	For non-d3cold-capable devices we'd like to be able to wake up the device from reclaim. In particular, for Lunar Lake we'd like to be able to blit CCS metadata to system at shrink time; at least from kswapd where it's reasonable OK to wait for rpm resume and a preceding rpm suspend. Therefore use a separate lockdep map for such devices and prime it reclaim-tainted. v2: - Rename lockmap acquire- and release functions. (Rodrigo Vivi). - Reinstate the old xe_pm_runtime_lockdep_prime() function and rename it to xe_rpm_might_enter_cb(). (Matthew Auld). - Introduce a separate xe_pm_runtime_lockdep_prime function called from module init for known required locking orders. v3: - Actually hook up the prime function at module init. v4: - Rebase. v5: - Don't use reclaim-safe RPM with sriov. Cc: "Vivi, Rodrigo" <rodrigo.vivi@intel.com> Cc: "Auld, Matthew" <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240826143450.92511-1-thomas.hellstrom@linux.intel.com
2024-08-23	drm/xe/display: handle HPD polling in display runtime suspend/resume	Vinod Govindapillai	1	-3/+5
	In XE, display runtime suspend / resume routines are called only if d3cold is allowed. This makes the driver unable to detect any HPDs once the device goes into runtime suspend state in platforms like LNL. Update the display runtime suspend / resume routines to include HPD polling regardless of d3cold status. While xe_display_pm_suspend/resume() performs steps during runtime suspend/resume that shouldn't happen, like suspending MST and they are missing other steps like enabling DC9, this patchset is meant to keep the current behavior wrt. these, leaving the corresponding updates for a follow-up v2: have a separate function for display runtime s/r (Rodrigo) v3: better streamlining of system s/r and runtime s/r calls (Imre) v4: rebased Reviewed-by: Arun R Murthy <arun.r.murthy@intel.com> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823112148.327015-4-vinod.govindapillai@intel.com
2024-08-19	drm/xe/display: Make display suspend/resume work on discrete	Maarten Lankhorst	1	-5/+6
	We should unpin before evicting all memory, and repin after GT resume. This way, we preserve the contents of the framebuffers, and won't hang on resume due to migration engine not being restored yet. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240806105044.596842-3-maarten.lankhorst@linux.intel.com Signed-off-by: Maarten Lankhorst,,, <maarten.lankhorst@linux.intel.com>
2024-07-18	drm/xe/pm: Add trace for pm functions	Nirmoy Das	1	-0/+8
	Add trace for xe pm function for better debuggability. v2: Fix indentation and add trace for xe_pm_runtime_get_ioctl Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240717125950.9952-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
2024-05-23	drm/xe: Stop checking for power_lost on D3Cold	Rodrigo Vivi	1	-10/+2
	GuC reset status is not reliable for this purpose and it is once in a while ending up in a situation of D3Cold, where power_reset is false and without the proper memory restoration the GuC reload and Display will fail to come back from D3Cold. So, let's do a full restoration of everything if we have a risk of losing power, without further optimizations. v2: also remove the gut_in_reset function (Anshuman) Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-6-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-23	drm/xe: Prepare display for D3Cold	Rodrigo Vivi	1	-3/+12
	Prepare power-well and DC handling for a full power lost during D3Cold, then sanitize it upon D3->D0. Otherwise we get a bunch of state mismatch. Ideally we could leave DC9 enabled and wouldn't need to move DC9->DC0 on every runtime resume, however, the disable_DC is part of the power-well checks and intrinsic to the dc_off power well. In the future that can be detangled so we can have even bigger power savings. But for now, let's focus on getting a D3Cold, which saves much more power by itself. v2: create new functions to avoid full-suspend-resume path, which would result in a deadlock between xe_gem_fault and the modeset-ioctl. v3: Only avoid the full modeset to avoid the race, for a more robust suspend-resume. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Uma Shankar <uma.shankar@intel.com> Tested-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-5-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-23	drm/xe: Fix xe_pm_runtime_get_if_in_use documentation	Rodrigo Vivi	1	-2/+3
	Let's be clear on what it is actually doing and align with xe_pm_runtime_get_if_active doc style. Tested-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-05-23	drm/xe: Fix xe_pm_runtime_get_if_active return	Rodrigo Vivi	1	-4/+4
	Current callers of this function are already taking the result to a boolean and using in an if. It might be a problem because current function might return negative error codes on failure, without increasing the reference counter. In this scenario we could end up with extra 'put' call ending in unbalanced scenarios. Let's fix it, while aligning with the current xe_pm_get_if_in_use style. Tested-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240522170105.327472-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-23	drm/xe: make xe_pm_runtime_lockdep_map a static struct	Rodrigo Vivi	1	-1/+1
	Fix the new sparse warning: >> drivers/gpu/drm/xe/xe_pm.c:72:20: sparse: sparse: symbol 'xe_pm_runtime_lockdep_map' was not declared. Should it be static? Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404191329.EZzOTzwK-lkp@intel.com/ Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240422201454.699089-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18	drm/xe/pm: Capture errors and handle them	Himal Prasad Ghimiray	1	-8/+28
	xe_pm_init may encounter failures for various reasons, such as a failure in initializing drmm_mutex, or when dealing with a d3cold-capable device for vram_threshold sysfs creation and setting default threshold. Presently, all these potential failures are disregarded. Move d3cold.lock initialization to xe_pm_init_early and cause driver abort if mutex initialization has failed. For xe_pm_init failures cleanup the driver and return error code -v2 Make mutex init cleaner (Lucas) Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240412181211.1155732-8-himal.prasad.ghimiray@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-04-18	drm/xe: Removing extra mem_access protection from runtime pm	Rodrigo Vivi	1	-3/+0
	This is not needed any longer, now that we have all the protection in place with the runtime pm itself. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-7-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18	drm/xe: Move lockdep protection from mem_access to xe_pm_runtime	Rodrigo Vivi	1	-8/+37
	The mem_access itself is not holding any lock, but attempting to train lockdep with possible scarring locks happening during runtime pm. We are going soon to kill the mem_access get and put helpers in favor of direct xe_pm_runtime calls, so let's just move this lock around to where it now belongs. v2: s/lockdep_training/lockdep_prime (Matt Auld) Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-4-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-18	drm/xe: Introduce xe_pm_runtime_get_noresume for inner callers	Rodrigo Vivi	1	-0/+20
	Let's ensure that we have an option for inner callers that will raise WARN if device is not active and not protected by outer callers. Make this also a void function forcing every caller to unconditionally put the reference back afterwards. This will be very important for cases where we want to hold the reference before scheduling a work in a queue. Then the work job will be responsible for putting it back. While at this, already convert a case from mem_access_get_ongoing where it is not checking for the reference and put it back, what would cause the underflow. v2: Fix identation. v3: Convert equivalent missing put from mem_access towards pm_runtime. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240417203952.25503-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-04-12	Merge drm/drm-next into drm-xe-next	Thomas Hellström	1	-1/+1
	Backmerging drm-next in order to get up-to-date and in particular to access commit 9ca5facd0400f610f3f7f71aeb7fc0b949a48c67. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-04-10	drm/xe: check pcode init status only on root gt of root tile	Riana Tauro	1	-10/+6
	The root tile indicates the pcode initialization is complete when all tiles have completed their initialization. So the mailbox can be polled only on the root tile. Check pcode init status only on root tile and move it to device probe early as root tile is initialized there. Also make similar changes in resume paths. v2: add lock/unlocked version of pcode_mailbox_rw to allow pcode init to be called in device early probe (Rodrigo) v3: add code description about using root tile change function names to xe_pcode_probe_early and xe_pcode_init (Rodrigo) Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240410085005.1126343-2-riana.tauro@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-19	drm/xe: Add dbg messages on the suspend resume functions.	Rodrigo Vivi	1	-5/+17
	In case of the suspend/resume flow getting locked up we can get reports with some useful hints on where it might get locked and if that has failed. Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240318180141.267458-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-14	Merge tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	1	-7/+31
	Pull drm updates from Dave Airlie: "Highlights are usual, more AMD IP blocks for future hw, i915/xe changes, Displayport tunnelling support for i915, msm YUV over DP changes, new tests for ttm, but its mostly a lot of stuff all over the place from lots of people. core: - EDID cleanups - scheduler error handling fixes - managed: add drmm_release_action() with tests - add ratelimited drm debug print - DPCD PSR early transport macro - DP tunneling and bandwidth allocation helpers - remove built-in edids - dp: Avoid AUX transfers on powered-down displays - dp: Add VSC SDP helpers cross drivers: - use new drm print helpers - switch to ->read_edid callback - gem: add stats for shared buffers plus updates to amdgpu, i915, xe syncobj: - fixes to waiting and sleeping ttm: - add tests - fix errno codes - simply busy-placement handling - fix page decryption media: - tc358743: fix v4l device registration video: - move all kernel parameters for video behind CONFIG_VIDEO sound: - remove <drm/drm_edid.h> include from header ci: - add tests for msm - fix apq8016 runner efifb: - use copy of global screen_info state vesafb: - use copy of global screen_info state simplefb: - fix logging bridge: - ite-6505: fix DP link-training bug - samsung-dsim: fix error checking in probe - samsung-dsim: add bsh-smm-s2/pro boards - tc358767: fix regmap usage - imx: add i.MX8MP HDMI PVI plus DT bindings - imx: add i.MX8MP HDMI TX plus DT bindings - sii902x: fix probing and unregistration - tc358767: limit pixel PLL input range - switch to new drm_bridge_read_edid() interface panel: - ltk050h3146w: error-handling fixes - panel-edp: support delay between power-on and enable; use put_sync in unprepare; support Mediatek MT8173 Chromebooks, BOE NV116WHM-N49 V8.0, BOE NV122WUM-N41, CSO MNC207QS1-1 plus DT bindings - panel-lvds: support EDT ETML0700Z9NDHA plus DT bindings - panel-novatek: FRIDA FRD400B25025-A-CTK plus DT bindings - add BOE TH101MB31IG002-28A plus DT bindings - add EDT ETML1010G3DRA plus DT bindings - add Novatek NT36672E LCD DSI plus DT bindings - nt36523: support 120Hz timings, fix includes - simple: fix display timings on RK32FN48H - visionox-vtdr6130: fix initialization - add Powkiddy RGB10MAX3 plus DT bindings - st7703: support panel rotation plus DT bindings - add Himax HX83112A plus DT bindings - ltk500hd1829: add support for ltk101b4029w and admatec 9904370 - simple: add BOE BP082WX1-100 8.2" panel plus DT bindungs panel-orientation-quirks: - GPD Win Mini amdgpu: - Validate DMABuf imports in compute VMs - Add RAS ACA framework - PSP 13 fixes - Misc code cleanups - Replay fixes - Atom interpretor PS, WS bounds checking - DML2 fixes - Audio fixes - DCN 3.5 Z state fixes - Remove deprecated ida_simple usage - UBSAN fixes - RAS fixes - Enable seq64 infrastructure - DC color block enablement - Documentation updates - DC documentation updates - DMCUB updates - ATHUB 4.1 support - LSDMA 7.0 support - JPEG DPG support - IH 7.0 support - HDP 7.0 support - VCN 5.0 support - SMU 13.0.6 updates - NBIO 7.11 updates - SDMA 6.1 updates - MMHUB 3.3 updates - DCN 3.5.1 support - NBIF 6.3.1 support - VPE 6.1.1 support amdkfd: - Validate DMABuf imports in compute VMs - SVM fixes - Trap handler updates and enhancements - Fix cache size reporting - Relocate the trap handler radeon: - Atom interpretor PS, WS bounds checking - Misc code cleanups xe: - new query for GuC submission version - Remove unused persistent exec_queues - Add vram frequency sysfs attributes - Add the flag XE_VM_BIND_FLAG_DUMPABLE - Drop pre-production workarounds - Drop kunit tests for unsupported platforms - Start pumbling SR-IOV support with memory based interrupts for VF - Allow to map BO in GGTT with PAT index corresponding to XE_CACHE_UC to work with memory based interrupts - Add GuC Doorbells Manager as prep work SR-IOV - Implement additional workarounds for xe2 and MTL - Program a few registers according to perfomance guide spec for Xe2 - Fix remaining 32b build issues and enable it back - Fix build with CONFIG_DEBUG_FS=n - Fix warnings from GuC ABI headers - Introduce Relay Communication for SR-IOV for VF <-> GuC <-> PF - Release mmap mappings on rpm suspend - Disable mid-thread preemption when not properly supported by hardware - Fix xe_exec by reserving extra fence slot for CPU bind - Fix xe_exec with full long running exec queue - Canonicalize addresses where needed for Xe2 and add to devcoredum - Toggle USM support for Xe2 - Only allow 1 ufence per exec / bind IOCTL - Add GuC firmware loading for Lunar Lake - Add XE_VMA_PTE_64K VMA flag i915: - Add more ADL-N PCI IDs - Enable fastboot also on older platforms - Early transport for panel replay and PSR - New ARL PCI IDs - DP TPS4 PHY test pattern support - Unify and improve VSC SDP for PSR and non-PSR cases - Refactor memory regions and improve debug logging - Rework global state serialization - Remove unused CDCLK divider fields - Unify HDCP connector logging format - Use display instead of graphics version in display code - Move VBT and opregion debugfs next to the implementation - Abstract opregion interface, use opaque type - MTL fixes - HPD handling fixes - Add GuC submission interface version query - Atomically invalidate userptr on mmu-notifier - Update handling of MMIO triggered reports - Don't make assumptions about intel_wakeref_t type - Extend driver code of Xe_LPG to Xe_LPG+ - Add flex arrays to struct i915_syncmap - Allow for very slow HuC loading - DP tunneling and bandwidth allocation support msm: - Correct bindings for MSM8976 and SM8650 platforms - Start migration of MDP5 platforms to DPU driver - X1E80100 MDSS support - DPU: - Improve DSC allocation, fixing several important corner cases - Add support for SDM630/SDM660 platforms - Simplify dpu_encoder_phys_ops - Apply fixes targeting DSC support with a single DSC encoder - Apply fixes for HCTL_EN timing configuration - X1E80100 support - Add support for YUV420 over DP - GPU: - fix sc7180 UBWC config - fix a7xx LLC config - new gpu support: a305B, a750, a702 - machine support: SM7150 (different power levels than other a618) - a7xx devcoredump support habanalabs: - configure IRQ affinity according to NUMA node - move HBM MMU page tables inside the HBM - improve device reset - check extended PCIe errors ivpu: - updates to firmware API - refactor BO allocation imx: - use devm_ functions during init hisilicon: - fix EDID includes mgag200: - improve ioremap usage - convert to struct drm_edid - Work around PCI write bursts nouveau: - disp: use kmemdup() - fix EDID includes - documentation fixes qaic: - fixes to BO handling - make use of DRM managed release - fix order of remove operations rockchip: - analogix_dp: get encoder port from DT - inno_hdmi: support HDMI for RK3128 - lvds: error-handling fixes ssd130x: - support SSD133x plus DT bindings tegra: - fix error handling tilcdc: - make use of DRM managed release v3d: - show memory stats in debugfs - Support display MMU page size vc4: - fix error handling in plane prepare_fb - fix framebuffer test in plane helpers virtio: - add venus capset defines vkms: - fix OOB access when programming the LUT - Kconfig improvements vmwgfx: - unmap surface before changing plane state - fix memory leak in error handling - documentation fixes - list command SVGA_3D_CMD_DEFINE_GB_SURFACE_V4 as invalid - fix null-pointer deref in execbuf - refactor display-mode probing - fix fencing for creating cursor MOBs - fix cursor-memory lifetime xlnx: - fix live video input for ZynqMP DPSUB lima: - fix memory leak loongson: - fail if no VRAM present meson: - switch to new drm_bridge_read_edid() interface renesas: - add RZ/G2L DU support plus DT bindings mxsfb: - Use managed mode config sun4i: - HDMI: updates to atomic mode setting mediatek: - Add display driver for MT8188 VDOSYS1 - DSI driver cleanups - Filter modes according to hardware capability - Fix a null pointer crash in mtk_drm_crtc_finish_page_flip etnaviv: - enhancements for NPU and MRT support" * tag 'drm-next-2024-03-13' of https://gitlab.freedesktop.org/drm/kernel: (1420 commits) drm/amd/display: Removed redundant @ symbol to fix kernel-doc warnings in -next repo drm/amd/pm: wait for completion of the EnableGfxImu message drm/amdgpu/soc21: add mode2 asic reset for SMU IP v14.0.1 drm/amdgpu: add smu 14.0.1 support drm/amdgpu: add VPE 6.1.1 discovery support drm/amdgpu/vpe: add VPE 6.1.1 support drm/amdgpu/vpe: don't emit cond exec command under collaborate mode drm/amdgpu/vpe: add collaborate mode support for VPE drm/amdgpu/vpe: add PRED_EXE and COLLAB_SYNC OPCODE drm/amdgpu/vpe: add multi instance VPE support drm/amdgpu/discovery: add nbif v6_3_1 ip block drm/amdgpu: Add nbif v6_3_1 ip block support drm/amdgpu: Add pcie v6_1_0 ip headers (v5) drm/amdgpu: Add nbif v6_3_1 ip headers (v5) arch/powerpc: Remove <linux/fb.h> from backlight code macintosh/via-pmu-backlight: Include <linux/backlight.h> fbdev/chipsfb: Include <linux/backlight.h> drm/etnaviv: Restore some id values drm/amdkfd: make kfd_class constant drm/amdgpu: add ring timeout information in devcoredump ...
2024-03-04	drm/xe: Convert xe_pm_runtime_{get, put} to void and protect from recursion	Rodrigo Vivi	1	-11/+14
	With mem_access going away and pm_runtime getting called instead, we need to protect these against recursions. The put is asynchronous so there's no need to block it. However, for a proper balance, we need to ensure that the references are taken and restored regardless of the flow. So, let's convert them all to void and use some direct linux/pm_runtime functions. v2: Rebased and update commit message (Matt). Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-3-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-04	drm/xe: Create a xe_pm_runtime_resume_and_get variant for display	Rodrigo Vivi	1	-0/+17
	Introduce the resume and get to fulfill the display need for checking if the device was actually resumed (or it is awake) and the reference was taken. Then we can convert the remaining cases to a void function and have individual functions for individual cases. Also, already start this new function protected from the runtime recursion, since runtime_pm will need to call for display functions for a proper D3Cold flow. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-2-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-03-04	drm/xe: Fix display runtime_pm handling	Rodrigo Vivi	1	-0/+17
	i915's intel_runtime_pm_get_if_in_use actually calls the pm_runtime_get_if_active() with ign_usage_count = false, but Xe was erroneously calling it with true because of the mem_access cases. This can lead to unnecessary references getting hold here and device never getting into the runtime suspended state. Let's use directly the 'if_in_use' function provided by linux/pm_runtime. Also, already start this new function protected from the runtime recursion, since runtime_pm will need to call for display functions for a proper D3Cold flow. v2: Update commit message based on Matt's feedback. Fix return condition of pm_runtime_get_if_in_use (Matt) Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240301180526.643505-1-rodrigo.vivi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-02-26	drm/xe: Runtime PM wake on every IOCTL	Rodrigo Vivi	1	-0/+15
	Let's ensure our PCI device is awaken on every IOCTL entry. Let's increase the runtime_pm protection and start moving that to the outer bounds. v2: minor typo fix and renaming function to make it clear that is intended to be used by ioctl only. (Matt) v3: Make it NULL if CONFIG_COMPAT is not selected. Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-3-rodrigo.vivi@intel.com
2024-02-26	drm/xe: Convert mem_access assertion towards the runtime_pm state	Rodrigo Vivi	1	-0/+16
	The mem_access helpers are going away and getting replaced by direct calls of the xe_pm_runtime_{get,put} functions. However, an assertion with a warning splat is desired when we hit the worst case of a memory access with the device really in the 'suspended' state. Also, this needs to be the first step. Otherwise, the upcoming conversion would be really noise with warn splats of missing mem_access gets. v2: Minor doc changes as suggested by Matt Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-2-rodrigo.vivi@intel.com
2024-02-26	drm/xe: Document Xe PM component	Rodrigo Vivi	1	-11/+98
	Replace outdated information with a proper PM documentation. Already establish the rules for the runtime PM get and put that Xe needs to follow. Also add missing function documentation to all the "exported" functions. v2: updated after Francois' feedback. s/grater/greater (Matt) v3: detach D3 from runtime_pm remove opportunistic S0iX (Anshuman) Cc: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Acked-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> #v2 Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222163937.138342-1-rodrigo.vivi@intel.com
2024-02-12	PM: runtime: Simplify pm_runtime_get_if_active() usage	Sakari Ailus	1	-1/+1
	There are two ways to opportunistically increment a device's runtime PM usage count, calling either pm_runtime_get_if_active() or pm_runtime_get_if_in_use(). The former has an argument to tell whether to ignore the usage count or not, and the latter simply calls the former with ign_usage_count set to false. The other users that want to ignore the usage_count will have to explicitly set that argument to true which is a bit cumbersome. To make this function more practical to use, remove the ign_usage_count argument from the function. The main implementation is in a static function called pm_runtime_get_conditional() and implementations of pm_runtime_get_if_active() and pm_runtime_get_if_in_use() are moved to runtime.c. Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com> Reviewed-by: Alex Elder <elder@linaro.org> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Takashi Iwai <tiwai@suse.de> # sound/ Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> # drivers/accel/ivpu/ Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> # drivers/gpu/drm/i915/ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-02-06	drm/xe/pm: add debug logs for D3cold	Riana Tauro	1	-6/+13
	add additional debug logs for PME# capability and presence of ACPI _PR3 resources. This is to identify the reason why the card is not capable of D3cold. No functional changes Signed-off-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240206055917.2629027-1-riana.tauro@intel.com
2024-01-31	drm/xe: drop display/ subdir from include directories	Jani Nikula	1	-1/+1
	There are very few places that need to include anything from under display/. Require the display/ prefix in #include directives, and drop the subdirectory from the header search path. Sort the include lists while at it. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240122101428.2683468-2-jani.nikula@intel.com
2024-01-09	drm/xe/dgfx: Release mmap mappings on rpm suspend	Badal Nilawar	1	-0/+17
	Release all mmap mappings for all vram objects which are associated with userfault such that, while pcie function in D3hot, any access to memory mappings will raise a userfault. Upon userfault, in order to access memory mappings, if graphics function is in D3 then runtime resume of dgpu will be triggered to transition to D0. v2: - Avoid iomem check before bo migration check as bo can migrate to system memory (Matthew Auld) v3: - Delete bo userfault link during bo destroy - Upon bo move (vram-smem), do bo userfault link deletion in xe_bo_move_notify instead of xe_bo_move (Thomas Hellström) - Grab lock in rpm hook while deleting bo userfault link (Matthew Auld) v4: - Add kernel doc and wrap vram_userfault related stuff in the structure (Matthew Auld) - Get rpm wakeref before taking dma reserve lock (Matthew Auld) - In suspend path apply lock for entire list op including list iteration (Matthew Auld) v5: - Use mutex lock instead of spin lock v6: - Fix review comments (Matthew Auld) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #For the xe_bo_move_notify() changes Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20240104130702.950078-1-badal.nilawar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: add some debug info for d3cold	Matthew Auld	1	-0/+3
	From the CI logs we want to easily know if the machine is capable and allowed to enter d3cold, and can therefore potentially trigger the d3cold RPM suspend and resume path. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/display: Implement display support	Maarten Lankhorst	1	-1/+12
	As for display, the intent is to share the display code with the i915 driver so that there is maximum reuse there. We do this by recompiling i915/display code twice. Now that i915 has been adapted to support the Xe build, we can add the xe/display support. This initial work is a collaboration of many people and unfortunately this squashed patch won't fully honor the proper credits. But let's try to add a few from the squashed patches: Co-developed-by: Matthew Brost <matthew.brost@intel.com> Co-developed-by: Jani Nikula <jani.nikula@intel.com> Co-developed-by: Lucas De Marchi <lucas.demarchi@intel.com> Co-developed-by: Matt Roper <matthew.d.roper@intel.com> Co-developed-by: Mauro Carvalho Chehab <mchehab@kernel.org> Co-developed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Co-developed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2023-12-21	drm/xe: do not register to PM if GuC is disabled	Ohad Sharabi	1	-0/+4
	When working without GuC (i.e. working with execlists), the flow attempts to perform suspend operation which is failing due to a lack of support without GuC. If PM ops are not supported without GuC we may as well avoid PM registration rather than returning errors from various PM flows. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/wa: Apply tile workarounds at probe/resume	Matt Roper	1	-0/+5
	Although the vast majority of workarounds the driver needs to implement are either GT-based or display-based, there are occasionally workarounds that reside outside those parts of the hardware (i.e., in they target registers in the sgunit/soc); we can consider these to be "tile" workarounds since there will be instance of these registers per tile. The registers in question should only lose their values during a function-level reset, so they only need to be applied during probe and resume; the registers will not be affected by GT/engine resets. Tile workarounds are rare (there's only one, 22010954014, that's relevant to Xe at the moment) so it's probably not worth updating the xe_rtp design to handle tile-level workarounds yet, although we may want to consider that in the future if/when more of these show up on future platforms. Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/20230913231411.291933-13-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Add vram_d3cold_threshold for d3cold capable device	Anshuman Gupta	1	-2/+5
	Do not register vram_d3cold_threshold device sysfs universally for each gfx device, only register sysfs and set the threshold value for d3cold capable devices. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/all/20230802070449.2426563-1-anshuman.gupta@intel.com/ Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: Ensure memory eviction on s2idle.	Rodrigo Vivi	1	-0/+11
	On discrete cards we cannot allow the pci subsystem to skip the regular suspend and we need to unblock the d3cold. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: Only init runtime PM after all d3cold config is in place.	Rodrigo Vivi	1	-1/+3
	We cannot allow runtime pm suspend after we configured the d3cold capable and threshold. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: Fix the runtime_idle call and d3cold.allowed decision.	Rodrigo Vivi	1	-2/+2
	According to Documentation/power/runtime_pm.txt: int pm_runtime_put(struct device dev); - decrement the device's usage counter; if the result is 0 then run pm_request_idle(dev) and return its result int pm_runtime_put_autosuspend(struct device dev); - decrement the device's usage counter; if the result is 0 then run pm_request_autosuspend(dev) and return its result We need to ensure that the idle function is called before suspending so we take the right d3cold.allowed decision and respect the values set on vram_d3cold_threshold sysfs. So we need pm_runtime_put() instead of pm_runtime_put_autosuspend(). Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Tested-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: Move d3cold_allowed decision all together.	Rodrigo Vivi	1	-0/+5
	And let's use the VRAM threshold to keep d3cold temporarily disabled. With this we have the ability to run D3Cold experiments just by touching the vram_d3cold_threshold sysfs entry. Cc: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: add lockdep annotation for xe_device_mem_access_put()	Matthew Auld	1	-0/+27
	The main motivation is with d3cold which will make the suspend and resume callbacks even more scary, but is useful regardless. We already have the needed annotation on the acquire side with xe_device_mem_access_get(), and by adding the annotation on the release side we should have a lot more confidence that our locking hierarchy is correct. v2: - Move the annotation into both callbacks for better symmetry. Also don't hold over the entire mem_access_get(); we only need to lockep to understand what is being held upon entering mem_access_get(), and how that matches up with locks in the callbacks. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe: fix xe_device_mem_access_get() races	Matthew Auld	1	-25/+43
	It looks like there is at least one race here, given that the pm_runtime_suspended() check looks to return false if we are in the process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED). We later also do xe_pm_runtime_get_if_active(), but since the device is suspending or has now suspended, this doesn't do anything either. Following from this we can potentially return from xe_device_mem_access_get() with the device suspended or about to be, leading to broken behaviour. Attempt to fix this by always grabbing the runtime ref when our internal ref transitions from 0 -> 1. The hard part is then dealing with the runtime_pm callbacks also calling xe_device_mem_access_get() and deadlocking, which the pm_runtime_suspended() check prevented. v2: - ct->lock looks to be primed with fs_reclaim, so holding that and then allocating memory will cause lockdep to complain. Now that we unconditionally grab the mem_access.lock around mem_access_{get,put}, we need to change the ordering wrt to grabbing the ct->lock, since some of the runtime_pm routines can allocate memory (or at least that's what lockdep seems to suggest). Hopefully not a big deal. It might be that there were already issues with this, just that the atomics where "hiding" the potential issues. v3: - Use Thomas Hellström' idea with tracking the active task that is executing in the resume or suspend callback, in order to avoid recursive resume/suspend calls deadlocking on itself. - Split the ct->lock change. v4: - Add smb_mb() around accessing the pm_callback_task for extra safety. (Thomas Hellström) v5: - Clarify the kernel-doc for the mem_access.lock, given that it is quite strange in what it protects (data vs code). The real motivation is to aid lockdep. (Rodrigo Vivi) v6: - Split out the lock change. We still want this as a lockdep aid but only for the xe_device_mem_access_get() path. Sticking a lock on the put() looks be a no-go, also the runtime_put() there is always async. - Now that the lock is gone move to atomics and rely on the pm code serialising multiple callers on the 0 -> 1 transition. - g2h_worker_func() looks to be the next issue, given that suspend-resume callbacks are using CT, so try to handle that. v7: - Add xe_device_mem_access_get_if_ongoing(), and use it in g2h_worker_func(). v8 (Anshuman): - Just always grab the rpm, instead of just on the 0 -> 1 transition, which is a lot clearer and simplifies the code quite a bit. v9: - Make sure we also adjust the CT fast-path with if-active. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Anshuman Gupta <anshuman.gupta@intel.com> Acked-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Init pcode and restore vram on power lost	Anshuman Gupta	1	-2/+11
	Don't init pcode and restore VRAM objects in vain. We can rely on primary GT GUC_STATUS to detect whether card has really lost power even when d3cold is allowed by xe. Adding d3cold.lost_power flag to avoid pcode init and vram restoration. Also cleaning up the TODO code comment. v2: - %s/xe_guc_has_lost_power()/xe_guc_in_reset(). - Used existing gt instead of new variable. [Rodrigo] - Added kernel-doc function comment. [Rodrigo] - xe_guc_in_reset() return true if failed to get fw. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-6-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Toggle d3cold_allowed using vram_usages	Anshuman Gupta	1	-0/+25
	Adding support to control d3cold by using vram_usages metric from ttm resource manager. When root port is capable of d3cold but xe has disallowed d3cold due to vram_usages above vram_d3ccold_threshol. It is required to disable d3cold to avoid any resume failure because root port can still transition to d3cold when all of pcie endpoints and {upstream, virtual} switch ports will transition to d3hot. Also cleaning up the TODO code comment. v2: - Modify d3cold.allowed in xe_pm_d3cold_allowed_toggle. [Riana] - Cond changed (total_vram_used_mb < xe->d3cold.vram_threshold) according to doc comment. v3: - Added enum instead of true/false argument in d3cold_toggle(). [Rodrigo] - Removed TODO comment. [Rodrigo] Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-5-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Add vram_d3cold_threshold Sysfs	Anshuman Gupta	1	-4/+33
	Add per pci device vram_d3cold_threshold Sysfs to control the d3cold allowed knob. Adding a d3cold structure embedded in xe_device to encapsulate d3cold related stuff. v2: - Check total vram before initializing default threshold. [Riana] - Add static scope to vram_d3cold_threshold DEVICE_ATTR. [Riana] v3: - Fixed cosmetics review comment. [Riana] - Fixed CI Hook failures. - Used drmm_mutex_init(). v4: - Fixed kernel-doc warnings. v5: - Added doc explaining need for the device sysfs. [Rodrigo] - Removed TODO comment. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-4-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Refactor xe_pm_runtime_init	Anshuman Gupta	1	-1/+2
	Wrap xe_pm_runtime_init inside xe_pm_init. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-3-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Add pci d3cold_capable support	Anshuman Gupta	1	-0/+22
	Adding pci d3cold_capable check in order to initialize d3cold_allowed as false statically. It avoids vram save/restore latency during runtime suspend/resume v2: - Added else block to xe_pci_runtime_idle. [Rodrigo] Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-2-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-21	drm/xe/pm: Disable PM on unbounded pcie parent bridge	Anshuman Gupta	1	-0/+14
	Intel Discrete GFX cards gfx may have multiple PCIe endpoints, they connects to root port via pcie upstream switch port(USP) and virtual pcie switch port(VSP), sometimes VSP pcie devices doesn't bind to pcieport driver. Without pcieport driver, pcie PM comes without any warranty and with unbounded VSP gfx card won't transition to low power pcie Device and Link states therefore assert drm_warn on unbounded VSP and disable xe driver PM support. v2: - Disable Xe PCI PM support. [Rodrigo] v3: - Changed subject and Rebase. v4: - %s/xe_pci_unbounded_bridge_disable_pm/xe_assert_on_unbounded_bridge. [Rodrigo] - Use device_set_pm_not_required() instead of dev_pm_ops NULL assignment. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230524090653.1192566-1-anshuman.gupta@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-20	drm/xe: Sort includes	Lucas De Marchi	1	-2/+3
	Sort includes and split them in blocks: 1) .h corresponding to the .c. Example: xe_bb.c should have a "#include "xe_bb.h" first. 2) #include <linux/...> 3) #include <drm/...> 4) local includes 5) i915 includes This is accomplished by running `clang-format --style=file -i --sort-includes drivers/gpu/drm/xe/*.[ch]` and ignoring all the changes after the includes. There are also some manual tweaks to split the blocks. v2: Also sort includes in headers Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-20	drm/xe/pm: fix unbalanced ref handling	Matthew Auld	1	-0/+8
	In local_pci_probe() the core kernel increments the rpm for the device, just before calling into the probe hook. If the driver/device supports runtime pm it is then meant to put this ref during probe (like we do in xe_pm_runtime_init()). However when removing the device we then also need to take the reference back, otherwise the ref that is put in pci_device_remove() will be unbalanced when for example unloading the driver, leading to warnings like: [ 3808.596345] xe 0000:03:00.0: Runtime PM usage count underflow! Fix this by incrementing the rpm ref when removing the device. v2: - Improve the terminology in the commit message; s/drop/put/ etc (Lucas & Rodrigo) - Also call pm_runtime_forbid(dev) (Rodrigo) Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/193 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>