Age | Commit message (Collapse) | Author | Files | Lines |
|
To acquire all modeset locks requires a ww_ctx to be allocated. As this
is the legacy path and the allocation small, to reduce the changes
required (and complex untested error handling) to the legacy drivers, we
simply assume that the allocation succeeds. At present, it relies on the
too-small-to-fail rule, but syzbot found that by injecting a failure
here we would hit the WARN. Document that this allocation must succeed
with __GFP_NOFAIL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171031115535.15166-1-chris@chris-wilson.co.uk
|
|
Kasan spotted
[IGT] gem_tiled_pread_pwrite: exiting, ret=0
==================================================================
BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182
CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G U 4.14.0-rc6-CI-Custom_3340+ #1
Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
Workqueue: events __i915_gem_free_work [i915]
Call Trace:
dump_stack+0x68/0xa0
print_address_description+0x78/0x290
? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
kasan_report+0x23d/0x350
__asan_report_load8_noabort+0x19/0x20
__i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
? i915_gem_object_truncate+0x100/0x100 [i915]
? lock_acquire+0x380/0x380
__i915_gem_object_put_pages+0x30d/0x530 [i915]
__i915_gem_free_objects+0x551/0xbd0 [i915]
? lock_acquire+0x13e/0x380
__i915_gem_free_work+0x4e/0x70 [i915]
process_one_work+0x6f6/0x1590
? pwq_dec_nr_in_flight+0x2b0/0x2b0
worker_thread+0xe6/0xe90
? pci_mmcfg_check_reserved+0x110/0x110
kthread+0x309/0x410
? process_one_work+0x1590/0x1590
? kthread_create_on_node+0xb0/0xb0
ret_from_fork+0x27/0x40
Allocated by task 1801:
save_stack_trace+0x1b/0x20
kasan_kmalloc+0xee/0x190
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0xdc/0x2e0
radix_tree_node_alloc.constprop.12+0x48/0x330
__radix_tree_create+0x274/0x480
__radix_tree_insert+0xa2/0x610
i915_gem_object_get_sg+0x224/0x670 [i915]
i915_gem_object_get_page+0xb5/0x1c0 [i915]
i915_gem_pread_ioctl+0x822/0xf60 [i915]
drm_ioctl_kernel+0x13f/0x1c0
drm_ioctl+0x6cf/0x980
do_vfs_ioctl+0x184/0xf30
SyS_ioctl+0x41/0x70
entry_SYSCALL_64_fastpath+0x1c/0xb1
Freed by task 37:
save_stack_trace+0x1b/0x20
kasan_slab_free+0xaf/0x190
kmem_cache_free+0xbf/0x340
radix_tree_node_rcu_free+0x79/0x90
rcu_process_callbacks+0x46d/0xf40
__do_softirq+0x21c/0x8d3
The buggy address belongs to the object at ffff8801359da0f0
which belongs to the cache radix_tree_node of size 576
The buggy address is located 544 bytes inside of
576-byte region [ffff8801359da0f0, ffff8801359da330)
The buggy address belongs to the page:
page:ffffea0004d67600 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
^
ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
which looks like the slab containing the radixtree iter was freed as we
traversed the tree, taking the rcu read lock across the loop should
prevent that (deferring all the frees until the end).
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Fixes: d1b48c1e7184 ("drm/i915: Replace execbuf vma ht with an idr")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-2-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
(cherry picked from commit 547da76b5777859f98bb78e6b57f19463f803c04)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Kasan spotted
[IGT] gem_tiled_pread_pwrite: exiting, ret=0
==================================================================
BUG: KASAN: use-after-free in __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
Read of size 8 at addr ffff8801359da310 by task kworker/3:2/182
CPU: 3 PID: 182 Comm: kworker/3:2 Tainted: G U 4.14.0-rc6-CI-Custom_3340+ #1
Hardware name: Intel Corp. Geminilake/GLK RVP1 DDR4 (05), BIOS GELKRVPA.X64.0062.B30.1708222146 08/22/2017
Workqueue: events __i915_gem_free_work [i915]
Call Trace:
dump_stack+0x68/0xa0
print_address_description+0x78/0x290
? __i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
kasan_report+0x23d/0x350
__asan_report_load8_noabort+0x19/0x20
__i915_gem_object_reset_page_iter+0x15c/0x170 [i915]
? i915_gem_object_truncate+0x100/0x100 [i915]
? lock_acquire+0x380/0x380
__i915_gem_object_put_pages+0x30d/0x530 [i915]
__i915_gem_free_objects+0x551/0xbd0 [i915]
? lock_acquire+0x13e/0x380
__i915_gem_free_work+0x4e/0x70 [i915]
process_one_work+0x6f6/0x1590
? pwq_dec_nr_in_flight+0x2b0/0x2b0
worker_thread+0xe6/0xe90
? pci_mmcfg_check_reserved+0x110/0x110
kthread+0x309/0x410
? process_one_work+0x1590/0x1590
? kthread_create_on_node+0xb0/0xb0
ret_from_fork+0x27/0x40
Allocated by task 1801:
save_stack_trace+0x1b/0x20
kasan_kmalloc+0xee/0x190
kasan_slab_alloc+0x12/0x20
kmem_cache_alloc+0xdc/0x2e0
radix_tree_node_alloc.constprop.12+0x48/0x330
__radix_tree_create+0x274/0x480
__radix_tree_insert+0xa2/0x610
i915_gem_object_get_sg+0x224/0x670 [i915]
i915_gem_object_get_page+0xb5/0x1c0 [i915]
i915_gem_pread_ioctl+0x822/0xf60 [i915]
drm_ioctl_kernel+0x13f/0x1c0
drm_ioctl+0x6cf/0x980
do_vfs_ioctl+0x184/0xf30
SyS_ioctl+0x41/0x70
entry_SYSCALL_64_fastpath+0x1c/0xb1
Freed by task 37:
save_stack_trace+0x1b/0x20
kasan_slab_free+0xaf/0x190
kmem_cache_free+0xbf/0x340
radix_tree_node_rcu_free+0x79/0x90
rcu_process_callbacks+0x46d/0xf40
__do_softirq+0x21c/0x8d3
The buggy address belongs to the object at ffff8801359da0f0
which belongs to the cache radix_tree_node of size 576
The buggy address is located 544 bytes inside of
576-byte region [ffff8801359da0f0, ffff8801359da330)
The buggy address belongs to the page:
page:ffffea0004d67600 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 0000000000000000 0000000000000000 0000000100110011
raw: ffffea0004b52920 ffffea0004b38020 ffff88015b416a80 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8801359da200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801359da280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8801359da300: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
^
ffff8801359da380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8801359da400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Disabling lock debugging due to kernel taint
which looks like the slab containing the radixtree iter was freed as we
traversed the tree, taking the rcu read lock across the loop should
prevent that (deferring all the frees until the end).
Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com>
Fixes: 96d776345277 ("drm/i915: Use a radixtree for random access to the object's backing storage")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026130032.10677-1-chris@chris-wilson.co.uk
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
(cherry picked from commit bea6e987c1ff358224e7bef7084be7650f5d1c38)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Per my reading of the eDP spec, DP_DPCD_DISPLAY_CONTROL_CAPABLE bit in
DP_EDP_CONFIGURATION_CAP should be set if the eDP display control
registers starting at offset DP_EDP_DPCD_REV are "enabled". Currently we
check the bit before reading the registers, and DP_EDP_DPCD_REV is the
only way to detect eDP revision.
Turns out there are (likely buggy) displays that require eDP 1.4+
features, such as supported link rates and link rate select, but do not
have the bit set. Read the display control registers
unconditionally. They are supposed to read zero anyway if they are not
supported, so there should be no harm in this.
This fixes the referenced bug by enabling the eDP version check, and
thus reading of the supported link rates. The panel in question has 0 in
DP_MAX_LINK_RATE which is only supported in eDP 1.4+. Without the
supported link rates method we default to RBR which is insufficient for
the panel native mode. As a curiosity, the panel also has a bogus value
of 0x12 in DP_EDP_DPCD_REV, but that passes our check for >= DP_EDP_14
(which is 0x03).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103400
Reported-and-tested-by: Nicolas P. <issun.artiste@gmail.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171026142932.17737-1-jani.nikula@intel.com
(cherry picked from commit 0501a3b0eb01ac2209ef6fce76153e5d6b07034e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The original intent was to preserve watermarks as much as possible
in intel_pipe_wm.raw_wm, and put the validated ones in intel_pipe_wm.wm.
It seems this approach is insufficient and we don't always preserve
the raw watermarks, so just use the atomic iterator we're already using
to get a const pointer to all bound planes on the crtc.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102373
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: stable@vger.kernel.org #v4.8+
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171019151341.4579-1-maarten.lankhorst@linux.intel.com
(cherry picked from commit 28283f4f359cd7cfa9e65457bb98c507a2cd0cd0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
During modeset cleanup on driver unload we may have a pending
hotplug work. This needs to be canceled early during the teardown
so that it does not fire after we have freed the connector.
We do this after drm_kms_helper_poll_fini(dev) since this might
trigger modeset retry work due to link retrain and before
intel_fbdev_fini() since this work requires the lock from fbdev.
If this is not done we may see something like:
DEBUG_LOCKS_WARN_ON(mutex_is_locked(lock))
------------[ cut here ]------------
WARNING: CPU: 4 PID: 5010 at kernel/locking/mutex-debug.c:103 mutex_destroy+0x4e/0x60
Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178
+a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid
+[last unloaded: snd_hda_intel]
CPU: 4 PID: 5010 Comm: drv_module_relo Tainted: G U 4.14.0-rc3-CI-CI_DRM_3186+ #1
Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
task: ffff8803c827aa40 task.stack: ffffc90000520000
RIP: 0010:mutex_destroy+0x4e/0x60
RSP: 0018:ffffc90000523d58 EFLAGS: 00010292
RAX: 000000000000002a RBX: ffff88044fbef648 RCX: 0000000000000000
RDX: 0000000080000001 RSI: 0000000000000001 RDI: ffffffff810f0cf0
RBP: ffffc90000523d60 R08: 0000000000000001 R09: 0000000000000001
R10: 000000000f21cb81 R11: 0000000000000000 R12: ffff88044f71efc8
R13: ffffffffa02b3d20 R14: ffffffffa02b3d90 R15: ffff880459b29308
FS: 00007f5df4d6e8c0(0000) GS:ffff88045d300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ec51f00a18 CR3: 0000000451782006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
drm_fb_helper_fini+0xd9/0x130
intel_fbdev_destroy+0x12/0x60 [i915]
intel_fbdev_fini+0x28/0x30 [i915]
intel_modeset_cleanup+0x45/0xa0 [i915]
i915_driver_unload+0x92/0x180 [i915]
i915_pci_remove+0x19/0x30 [i915]
i915_driver_unload+0x92/0x180 [i915]
i915_pci_remove+0x19/0x30 [i915]
pci_device_remove+0x39/0xb0
device_release_driver_internal+0x15d/0x220
driver_detach+0x40/0x80
bus_remove_driver+0x58/0xd0
driver_unregister+0x2c/0x40
pci_unregister_driver+0x36/0xb0
i915_exit+0x1a/0x8b [i915]
SyS_delete_module+0x18c/0x1e0
entry_SYSCALL_64_fastpath+0x1c/0xb1
RIP: 0033:0x7f5df3286287
RSP: 002b:00007fff8e107cc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000b0
RAX: ffffffffffffffda RBX: ffffffff81493a03 RCX: 00007f5df3286287
RDX: 0000000000000001 RSI: 0000000000000800 RDI: 0000564c7be02e48
RBP: ffffc90000523f88 R08: 0000000000000000 R09: 0000000000000080
R10: 00007f5df4d6e8c0 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff8e107eb0 R14: 0000000000000000 R15: 0000000000000000
Or a GPF like:
general protection fault: 0000 [#1] PREEMPT SMP
Modules linked in: i915(-) snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec snd_hwdep snd_hda_core snd_pcm vgem ax88179_178
+a usbnet mii x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e ptp pps_core prime_numbers i2c_hid
+[last unloaded: snd_hda_intel]
CPU: 0 PID: 82 Comm: kworker/0:1 Tainted: G U W 4.14.0-rc3-CI-CI_DRM_3186+ #1
Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWX1.R00.X104.A03.1709140524 09/14/2017
Workqueue: events intel_dp_modeset_retry_work_fn [i915]
task: ffff88045a5caa40 task.stack: ffffc90000378000
RIP: 0010:drm_setup_crtcs+0x143/0xbf0
RSP: 0018:ffffc9000037bd20 EFLAGS: 00010202
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000001
RDX: 0000000000000001 RSI: 0000000000000780 RDI: 00000000ffffffff
RBP: ffffc9000037bdb8 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000780 R11: 0000000000000000 R12: 0000000000000002
R13: ffff88044fbef4e8 R14: 0000000000000780 R15: 0000000000000438
FS: 0000000000000000(0000) GS:ffff88045d200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055ec51ee5168 CR3: 000000044c89d003 CR4: 00000000003606f0
Call Trace:
drm_fb_helper_hotplug_event.part.18+0x7e/0xc0
drm_fb_helper_hotplug_event+0x1a/0x20
intel_fbdev_output_poll_changed+0x1a/0x20 [i915]
drm_kms_helper_hotplug_event+0x27/0x30
intel_dp_modeset_retry_work_fn+0x77/0x80 [i915]
process_one_work+0x233/0x660
worker_thread+0x206/0x3b0
kthread+0x152/0x190
? process_one_work+0x660/0x660
? kthread_create_on_node+0x40/0x40
ret_from_fork+0x27/0x40
Code: 06 00 00 45 8b 45 20 31 db 45 31 e4 45 85 c0 0f 8e 91 06 00 00 44 8b 75 94 44 8b 7d 90 49 8b 45 28 49 63 d4 44 89 f6 41 83 c4 01 <48> 8b 04 d0 44
+89 fa 48 8b 38 48 8b 87 a8 01 00 00 ff 50 20 01
RIP: drm_setup_crtcs+0x143/0xbf0 RSP: ffffc9000037bd20
---[ end trace 08901ff1a77d30c7 ]---
v2:
* Rename it to intel_hpd_poll_fini() and call drm_kms_helper_fini() inside it
as the first step before cancel work (Chris Wilson)
* Add GPF trace in commit message and make the function static (Maarten Lankhorst)
Suggested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9301397a63b3 ("drm/i915: Implement Link Rate fallback on Link training failure")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tony Cheng <tony.cheng@amd.com>
Cc: Harry Wentland <Harry.wentland@amd.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1509054720-25325-1-git-send-email-manasi.d.navare@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 886c6b8692ba5f71b578097524b3b082e2e02119)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
To quote Felix: "For testing KV with current user mode stack, please use
amdgpu. I don't expect this to work with radeon and I'm not planning to
spend any effort on making radeon work with a current user mode stack."
Only compile tested, but should be straight forward.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
When a plane moves out of bounds (i.e, outside the crtc clip region), the
plane state's "visible" parameter changes to false. When this happens, we
(a) release the hwpipe resources away from it, and
(b) unstage the corresponding hwpipe(s) from the Layer Mixers in the CRTC.
(a) requires use to acquire the global atomic state and assign a new
hwpipe. (b) requires us to re-configure the Layer Mixer, which is done in
the CRTC. We don't want to do these things in the async plane update path,
so return an error if the new state's "visible" isn't the same as the
current state's "visible".
Cc: Gustavo Padovan <gustavo.padovan@collabora.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
MDP5 on newer SoCs support cursor planes (i.e, cursor SSPPs). They are a
separate entity unlike the cursors within LM.
Do not try to restore the MDP5 LM cursor registers, or the corresponding
CTL bits if we are not using LM cursors.
Also, since we've introduced a new variable 'lm_cursor_enabled', we can
now use it to avoid creating a different sets of crtc_funcs for CRTCs
with LM cursors and CRTCs with cursor planes.
Fixes: "drm/msm/mdp5: restore cursor state when enabling crtc"
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We currently call mdp5_pipe_assign() twice to assign the left and right
hwpipes for our drm_plane. When merging 2 hwpipes, there are a few
constraints that we need to keep in mind:
- Only the same types of SSPPs are preferred. I.e, a RGB pipe should
be paired with another RGB pipe, VIG with VIG etc.
- The hwpipe staged on the left should have a higher priority than
the hwpipe staged on the right. The priorities are as follows:
VIG0 > VIG1 > VIG2 > VIG3
RGB0 > RGB1 > RGB2 > RGB3
DMA0 > DMA1
We can't apply these constraints easily if mdp5_pipe_assign() is
called twice. Update mdp5_pipe_assign() to find both hwpipes in
one go, and add the extra constraints needed.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
mdp5_pipe_assign currently returns the hwpipe pointer for the drm_plane.
Return it indirectly by setting a pointer passed as an argument. This
is needed because we want the func to find out the right hwpipe too.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
After converting legacy cursor updates to atomic async commits
mdp5_cursor_plane_funcs just duplicates mdp5_plane_funcs now.
Cc: Rob Clark <robdclark@gmail.com>
Cc: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Tested-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Add support to async updates of cursors by using the new atomic
interface for that. Basically what this commit does is do what
mdp5_update_cursor_plane_legacy() did but through atomic.
v5: call drm_atomic_helper_async_check() from the check hook
v4: add missing atomic async commit call to msm_atomic_commit(Archit Taneja)
v3: move size checks back to drivers (Ville Syrjälä)
v2: move fb setting to core and use new state (Eric Anholt)
Cc: Rob Clark <robdclark@gmail.com>
Cc: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Tested-by: Archit Taneja <architt@codeaurora.org> (v4)
[added comment about not hitting async update path if hwpipes are
re-assigned or global state is touched]
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Since we enabled runtime PM, we cannot count on cursor registers to
retain their values. This can result in situations where we think the
cursor is enabled when we enable the CRTC but it is trying to scan out
null (and the rest of cursor position/size is lost), resulting in faults
and generally angering the hw when coming out of DPMS with a cursor
enabled.
stable backport note: reverting 774e39ee3572 is also a suitable fix
Fixes: 774e39ee3572 drm/msm/mdp5: Set up runtime PM for MDSS
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Archit Taneja <architt@codeaurora.org>
|
|
It's only likely to paper over bugs. Unlike the gpu, where we want to
keep things alive a bit longer in expectation of the next frame's
submit, when the display is shut down we can power off immediately.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Archit Taneja <architt@codeaurora.org>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Note we need to move update_fences() to after msm_rd_dump_submit(),
otherwise the bo's referenced by the submit may no longer be valid.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We need this if we want to dump the submit after cleanup (ie. from hang
or fault). But in the backoff/unpin case we want to clear them. So add
a flag so we can skip clearing the IOVAs in at cleanup.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
For faults or hangs, it is nice to be able to include a bit more
information.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Split into two instances, the existing $debugfs/rd which continues to
dump all submits, and $debugfs/hangrd which will be used to dump just
submits that cause gpu hangs (and eventually faults, but that will
require some iommu framework enhancements).
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Prep work for adding a debugfs file that dumps just submits which
trigger hangs/faults. In this case the bo may already be in the
MADV_DONTNEED state, but will be still on the active list (since
the submit hasn't completed yet). So the normal check that the
bo is in the WILLNEED state does not apply. (But of course the bo
should definitely not be in the PURGED state!)
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Now that freedreno gallium driver defaults to using submit_queue task
(render reordering), just showing task->comm is not so useful (ie. it is
always "flush_queue:0"), so also dump the cmdline. This should also be
more useful for piglit/shader_runner.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Currently the rd dump avoids any buffers marked as WRITE under
the assumption that the contents are not interesting. While it
is true that the contents are uninteresting we should still print
the iova and size for all buffers so that any listening replay
tools can correctly construct the submission.
Print the header for all buffers but only dump the contents for
buffers marked as READ.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Recent changes to locking have rendered struct_mutex_task
unused.
Unused since 0e08270a1f01.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Implement preemption for A5XX targets - this allows multiple
ringbuffers for different priorities with automatic preemption
of a lower priority ringbuffer if a higher one is ready.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We use a global ringbuffer size and block size for all targets and
at least for 5XX preemption we need to know the value the RB_CNTL
in several locations so it makes sense to calculate it once and use
it everywhere.
The only monkey wrench is that we need to disable the RPTR shadow
for A430 targets but that only needs to be done once and doesn't
affect A5XX so we can or in the value at init time.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Add a shadow pointer to track the current command being written into
the ring. Don't commit it as 'cur' until the command is submitted.
Because 'cur' is used to construct the software copy of the wptr this
ensures that somebody peeking in on the ring doesn't assume that a
command is inflight while it is being written. This isn't a huge deal
with a single ring (though technically the hangcheck could assume
the system is prematurely busy when it isn't) but it will be rather
important for preemption where the decision to preempt is based
on a non-empty ringbuffer. Without a shadow an aggressive preemption
scheme could assume that the ringbuffer is non empty and switch to it
before the CPU is done writing the command and boom.
Even though preemption won't be supported for all targets because of
the way the code is organized it is simpler to make this generic for
all targets. The extra load for non-preemption targets should be
minimal.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
In order to manage ringbuffer priority to its fullest userspace
should know how many ringbuffers it has to work with. Add a
parameter to return the number of active rings.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.
The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.
The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
When we move to multiple ringbuffers we're going to store the data
in the memptrs on a per-ring basis. In order to prepare for that
move the current memptrs from the adreno namespace into msm_gpu.
This is way cleaner and immediately lets us kill off some sub
functions so there is much less cost later when we do move to
per-ring structs.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Currently the behavior of a command stream is provided by the user
application during submission and the application is expected to internally
maintain the settings for each 'context' or 'rendering queue' and specify
the correct ones.
This works okay for simple cases but as applications become more
complex we will want to set context specific flags and do various
permission checks to allow certain contexts to enable additional
privileges.
Add kernel-side submit queues to be analogous to 'contexts' or
'rendering queues' on the application side. Each file descriptor
instance will maintain its own list of queues. Queues cannot be
shared between file descriptors.
For backwards compatibility context id '0' is defined as a default
context specifying no priority and no special flags. This is
intended to be the usual configuration for 99% of applications so
that a garden variety application can function correctly without
creating a queue. Only those applications requiring the specific
benefit of different queues need create one.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We already have, as a result of upstreaming the gpu bindings,
msm_clk_get() which will try to get the clock both without and with a
"_clk" suffix. Use this in HDMI code so we can drop the "_clk" suffix
in bindings while maintaing backwards compatibility.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
|
|
We already have, as a result of upstreaming the gpu bindings,
msm_clk_get() which will try to get the clock both without and with a
"_clk" suffix. Use this in eDP code so we can drop the "_clk" suffix
in bindings while maintaing backwards compatibility.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
|
|
We already have, as a result of upstreaming the gpu bindings,
msm_clk_get() which will try to get the clock both without and with a
"_clk" suffix. Use this in DSI code so we can drop the "_clk" suffix
in bindings while maintaing backwards compatibility.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
|
|
This is useful to see in the log, without requiring drm.debug.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
When firmware was added to linux-firmware, it was put in a qcom sub-
directory, unlike what we'd been using before. For a300_pfp.fw and
a300_pm4.fw symlinks were created, but we'd prefer not to have to do
this in the future. So add support to look in both places when
loading firmware.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Prep work for the next patch.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
Previously, in an effort to defer initializing the gpu until firmware
was available (ie. rootfs mounted), the gpu was not loaded at when the
subdevice was bound. Which resulted that clks/etc were requested in a
place that devm couldn't really help unwind if something failed.
Instead move request_firmware() to gpu->hw_init() and construct the gpu
earlier in adreno_bind(). To avoid the rest of the driver needing to
be aware of a gpu that hasn't managed to load firmware and hw_init()
yet, stash the gpu ptr in the adreno device's drvdata, and don't set
priv->gpu() until hw_init() succeeds.
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
This was used as a placeholder. It was never really input to the MDSS/HDMI
clocks.
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
We need to call reservation_object_reserve_shared() in both cases, but
this wasn't happening in the _NO_IMPLICIT submit case.
Fixes: f0a42bb ("drm/msm: submit support for in-fences")
Reported-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
|
|
In systems under heavy load the IH work may experience significant
scheduling delays.
Under load + system workqueue:
Max Latency: 7.023695 ms
Avg Latency: 0.263994 ms
Under load + high priority workqueue:
Max Latency: 1.162568 ms
Avg Latency: 0.163213 ms
Further work is required to measure the impact of per-cpu settings on IH
performance.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
We don't need to wait for all work to complete in the IH exit function.
We only need to make sure the interrupt_work has finished executing to
guarantee that ih_kfifo is no longer in use.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
A larger buffer will let us accommodate applications with a large amount
of semi-simultaneous event signals.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Replace our implementation of a lockless ring buffer with the standard
linux kernel kfifo.
We shouldn't maintain our own version of a standard data structure.
Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
This allows increasing the KFD_SIGNAL_EVENT_LIMIT in kfd_ioctl.h
without breaking processes built with older kfd_ioctl.h versions.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
This speeds up signal lookup when the IH ring entry includes a
valid context ID or partial context ID. Only if the context ID is
found to be invalid, fall back to an exhaustive search of all
signaled events.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
|
Signal slots are identical to event IDs.
Replace the used_slot_bitmap and events hash table with an IDR to
allocate and lookup event IDs and signal slots more efficiently.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|