| Age | Commit message (Collapse) | Author | Files | Lines |
|
commit e84cb605e02f1b3d0aee8d7157419cd8aaa06038 upstream.
The calculated ideal rate can easily overflow an unsigned long, thus
making the best div selection buggy as soon as no ideal match is found
before the overflow occurs.
Fixes: 4731a72df273 ("drm/sun4i: request exact rates to our parents")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181018100250.12565-1-boris.brezillon@bootlin.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit db05c481977599236f12a85e55de9f5ab37b0a2c upstream.
drm fbdev emulation doesn't support changing the pixel format at all,
so reject all pixel format changing requests.
Cc: stable@vger.kernel.org
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181003164538.5534-1-Eugeniy.Paltsev@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 9068e02f58740778d8270840657f1e250a2cc60f upstream.
HDMI Forum VSDB YCBCR420 deep color capability bits are 2:0. Correct
definitions in the header for the mask to work correctly.
Fixes: e6a9a2c3dc43 ("drm/edid: parse ycbcr 420 deep color information")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107893
Cc: <stable@vger.kernel.org> # v4.14+
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1538776335-12569-1-git-send-email-clinton.a.taylor@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 0711a43b6d84ff9189adfbf83c8bbf56eef794bf upstream.
There's another panel that reports "DFP 1.x compliant TMDS" but it
supports 6bpc instead of 8 bpc.
Apply 6 bpc quirk for the panel to fix it.
BugLink: https://bugs.launchpad.net/bugs/1794387
Cc: <stable@vger.kernel.org> # v4.8+
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181002152911.4370-1-kai.heng.feng@canonical.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 987bf116445db5d63a5c2ed94c4479687d9c9973 ]
In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
we signal hw_done().
[Why]
This is to temporarily address a paging error that occurs when a
nonblocking commit contends with another commit, particularly in a
mirrored display configuration where at least 2 CRTCs are updated.
The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
attempt to access the contents of new_crtc_state->commit.
Here's the sequence for a mirrored 2 display setup (irrelevant steps
left out for clarity):
**THREAD 1** | **THREAD 2**
|
Initialize atomic state for flip |
|
Queue worker |
...
| Do work for flip
|
| Signal hw_done() on CRTC 1
| Signal hw_done() on CRTC 2
|
| Wait for flip_done() on CRTC 1
<---- **PREEMPTED BY THREAD 1**
Initialize atomic state for cursor |
update (1) |
|
Do cursor update work on both CRTCs |
|
Clear atomic state (2) |
**DONE** |
...
|
| Wait for flip_done() on CRTC 2
| *ERROR*
|
The issue starts with (1). When the atomic state is initialized, the
current CRTC states are duplicated to be the new_crtc_states, and
referenced to be the old_crtc_states. (The new_crtc_states are to be
filled with update data.)
Some things to note:
* Due to the mirrored configuration, the cursor updates on both CRTCs.
* At this point, the pflip IRQ has already been handled, and flip_done
signaled on all CRTCs. The cursor commit can therefore continue.
* The old_crtc_states used by the cursor update are the **same states**
as the new_crtc_states used by the flip worker.
At (2), the old_crtc_state is freed (*), and the cursor commit
completes. We then context switch back to the flip worker, where we
attempt to access the new_crtc_state->commit object. This is
problematic, as this state has already been freed.
(*) Technically, 'state->crtcs[i].state' is freed, which was made to
reference old_crtc_state in drm_atomic_helper_swap_state()
[How]
By moving hw_done() after wait_for_flip_done(), we're guaranteed that
the new_crtc_state (from the flip worker's perspective) still exists.
This is because any other commit will be blocked, waiting for the
hw_done() signal.
Note that both the i915 and imx drivers have this sequence flipped
already, masking this problem.
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e0dccce1193f87597548d0db6ecc942fb92c04cd ]
The CEC_TX_STATUS_MAX_RETRIES should be set for errors only to
prevent the CEC framework from retrying the transmit. If the
transmit was successful, then don't set this flag.
Found by running 'cec-compliance -A' on a beaglebone box.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d98627d1360d55e3b28f702caca8b6342c4a4e45 ]
Currently the check to see if the timeout has reached zero is incorrect
and the check is instead checking if the timeout is non-zero and not
zero, hence it will break out of the loop on the first iteration and
the msleep is never executed. Fix this by breaking from the loop when
timeout is zero.
Detected by CoverityScan, CID#1469404 ("Logically Dead Code")
Fixes: f0316f93897c ("drm/i2c: tda9950: add CEC driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 69be1984ded00a11b1ed0888c6d8e4f35370372f ]
Currently, if userspace calls drm_wait_vblank before the crtc is
activated the crtc vblank_enable hook is called, which in case of
malidp driver triggers some warninngs. This happens because on
device init we don't inform the drm core about the vblank state
by calling drm_crtc_vblank_on/off/reset which together with
drm_vblank_get have some magic that prevents calling drm_vblank_enable
when crtc is off.
Signed-off-by: Alexandru Gheorghe <alexandru-cosmin.gheorghe@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit e46368cf77f2cb6304c51d9ff5f147cfb7dc0074 upstream.
While we currently grab a runtime PM ref in nouveau's normal connector
detection code, we apparently don't do this for MST. This means if we're
in a scenario where the GPU is suspended and userspace attempts to do a
connector probe on an MSTC connector, the probe will fail entirely due
to the DP aux channel and GPU not being woken up:
[ 316.633489] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[ 316.635713] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
[ 316.637785] nouveau 0000:01:00.0: i2c: aux 000a: begin idle timeout ffffffff
...
So, grab a runtime PM ref here.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 44d8cc6f1a905e4bb1d4221a898abb0d7e9d100a ]
Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.
For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 15426dbb65c5b37680d27e84d58a1ed3b8532518 ]
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.
The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit caaa4c8a6be2a275bd14f2369ee364978ff74704 ]
A wrong register bit was examinated for checking SDMA status so it reports
false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct
bit.
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7eb33224572636248d5b6cfa1a6b2472207be5c4 ]
We prefer to of_device_id tables are NULL terminated. So make
vexpress_muxfpga_match is NULL terminated.
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/1533379767-15629-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 12d43deb1ee639d01a2a8d2a7a4cc8ad31224475 upstream.
fd_install() moves the reference given to it into the file descriptor table
of the current process. If the current process is multithreaded, then
immediately after fd_install(), another thread can close() the file
descriptor and cause the file's resources to be cleaned up.
Since the reference to "lessee" is held by the file, we must not access
"lessee" after the fd_install() call.
As far as I can tell, to reach this codepath, the caller must have an open
file descriptor to a DRI device in master mode. I'm not sure what the
requirements for that are.
Signed-off-by: Jann Horn <jannh@google.com>
Fixes: 62884cd386b8 ("drm: Add four ioctls for managing drm mode object leases [v7]")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181001153117.216923-1-jannh@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 337fe9f5c1e7de1f391c6a692531379d2aa2ee11 upstream.
We attempt to get fences earlier in the hopes that everything will
already have fences and no callbacks will be needed. If we do succeed
in getting a fence, getting one a second time will result in a duplicate
ref with no unref. This is causing memory leaks in Vulkan applications
that create a lot of fences; playing for a few hours can, apparently,
bring down the system.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107899
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180926071703.15257-1-jason.ekstrand@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 61ea6f5831974ebd1a57baffd7cc30600a2e26fc upstream.
The vce cancel_delayed_work_sync never be called.
driver call the function in error path.
This caused the A+A suspend hang when runtime pm enebled.
As we will visit the smu in the idle queue. this will cause
smu hang because the dgpu has been suspend, and the dgpu also
will be waked up. As the smu has been hang, so the dgpu resume
will failed.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This reverts commit 93b100ddda3be284be160e9ccba28c7f8f21ab73 which was
commit c3cb424a086921f6bb0449b10d998352a756d6d5 upstream.
It was not needed for 4.18.y and caused problems there.
Reported-by: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0165de983272d1fae0809ed9db47c46a412279bc ]
Slowly leaking memory one page at a time :)
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 68ebc13ea40656fddd3803735d621921a2d74a5e ]
Fix SDMA hang in prt mode, clear XNACK_WATERMARK in reg SDMA0_UTCL1_WATERMK to avoid the issue
Affected ASICs: VEGA10 VEGA12 RV1 RV2
v2: add reg clear for SDMA1
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Tested-by: Yukun Li <yukun1.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
panels
[ Upstream commit 53b0cc46f27cfc2cadca609b503a7d92b5185a47 ]
Fixes eDP backlight issues on more recent laptops.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit e04cfdc9b7398c60dbc70212415ea63b6c6a93ae ]
If a HPD pulse signalling the need to retrain the link occurs between
the KMS driver releasing the output and the supervisor interrupt that
finishes the teardown, it was possible get a NULL-ptr deref.
Avoid this by marking the link as inactive earlier.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 0a6986c6595e9afd20ff7280dab36431c1e467f8 ]
This Falcon application doesn't appear to be present on some newer
systems, so let's not fail init if we can't find it.
TBD: is there a way to determine whether it *should* be there?
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 51ed833c881b9d96557c773f6a37018d79e29a46 ]
Fixes oopses in certain failure paths.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit a43b16dda2d7485f5c5aed075c1dc9785e339515 ]
The NV_ERROR macro requires drm->client to be initialised, which it may not
be at this stage of the init process.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6ddd9769db4fc11a98bd7e58be1764e47fdb8384 ]
Fix the VMC page fault when the running sequence is as below:
1.amdgpu_gem_create_ioctl
2.ttm_bo_swapout->amdgpu_vm_bo_invalidate, as not called
amdgpu_vm_bo_base_init, so won't called
list_add_tail(&base->bo_list, &bo->va). Even the bo was evicted,
it won't set the bo_base->moved.
3.drm_gem_open_ioctl->amdgpu_vm_bo_base_init, here only called
list_move_tail(&base->vm_status, &vm->evicted), but not set the
bo_base->moved.
4.amdgpu_vm_bo_map->amdgpu_vm_bo_insert_map, as the bo_base->moved is
not set true, the function amdgpu_vm_bo_insert_map will call
list_move(&bo_va->base.vm_status, &vm->moved)
5.amdgpu_cs_ioctl won't validate the swapout bo, as it is only in the
moved list, not in the evict list. So VMC page fault occurs.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2ab4d0e74256fc49b7b270f63c1d1e47c2455abc ]
For SI/Kv, the power state is managed by function
amdgpu_pm_compute_clocks.
when dpm enabled, we should call amdgpu_pm_compute_clocks
to update current power state instand of set boot state.
this change can fix the oops when kfd driver was enabled on Kv.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8ef23364b654d44244400d79988e677e504b21ba ]
This is required by gfx hw and can fix the rlc hang when
do s3 stree test on Cz/St.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hang Zhou <hang.zhou@amd.com>
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 2f40c6eac74a2a60921cdec9e9a8a57e88e31434 ]
SWDEV-146499: hang during multi vulkan process testing
cause:
the second frame's PREAMBLE_IB have clear-state
and LOAD actions, those actions ruin the pipeline
that is still doing process in the previous frame's
work-load IB.
fix:
need insert pipeline sync if have context switch for
SRIOV (because only SRIOV will report PREEMPTION flag
to UMD)
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d98ff24e8e9be3329eea7c84d5e244d0c1cd0ab3 ]
At this point the command submission can still be interrupted.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8604ffcbf04f8f4f3f55a9e46e5ff948b2ed4290 ]
We need to figure out the address after validating the BO, not before.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 3257ec797d3a8c5232389eb1952d4451e80f3931 ]
The vc4 HVS uses an internal RGB888 representation of the frames, and will
by default expand formats using a lower depth using zeros.
This causes an issue when we try to use other compositing software such as
pixman that fill the missing bits by repeating the higher significant bits.
As such, we can't check the display output in a reliable way by doing a
software composition and an hardware one and compare both.
To prevent this, force the same behaviour so that we can do such things.
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180517133759.25626-1-maxime.ripard@bootlin.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 1e871d65e375280757833d9fce91dda71980bdf5 ]
Daniel's format_mod_supported() patch predated Dave's for NV21/61, and
I didn't catch that when rebasing. This is a problem since the
formats are now getting validated before being passed to the driver's
atomic hooks.
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Daniel Stone <daniels@collabora.com>
Cc: Dave Stevenson <dave.stevenson@raspberrypi.org>
Fixes: 423ad7b3cbd1 ("drm/vc4: Advertise supported modifiers for planes")
Link: https://patchwork.freedesktop.org/patch/msgid/20180316220435.31416-2-eric@anholt.net
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 7122b68b8a9692dcc3acf89595f04c492872115f ]
Between creation and queueing of a job, you need to prevent any other
job from being created and queued. Otherwise the scheduler's fences
may be signaled out of seqno order.
v2: move mutex unlock to the error label.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: 57692c94dcbe ("drm/v3d: Introduce a new DRM driver for Broadcom V3D V3.x+")
Link: https://patchwork.freedesktop.org/patch/msgid/20180606174851.12433-1-eric@anholt.net
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 5117bd898e8c0a31e8ab3a9b8523aecf0706e997 ]
- None of the list walkings where protected.
- Switch to a mutex since the list walking at device resume time can
sleep when pinning buffers through the tiler.
Only thing we need to be careful with here is that while we walk the
list we can't unreference any gem objects, since the final unref would
result in a recursive deadlock. But the only functions that walk the
list is the device resume and debugfs dumping, so all safe.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 367c359aa8637b15ee8df6335c5a29b7623966ec ]
sun4i_drv_add_endpoints() has a memory leak since it uses of_node_put()
when remote is equal to NULL and does nothing when remote has a valid
pointer.
Invert the logic to fix memory leak.
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180625120304.7543-7-jernej.skrabec@siol.net
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 09773c532d30187f86a142901c27c93e629ce6ba ]
Current DW HDMI PHY code never prepares and enables PHY clock after it is
created. It's just used as it is. This may work in some cases, but it's
clearly wrong. Fix it by adding proper calls to enable/disable PHY
clock.
Fixes: 4f86e81748fe ("drm/sun4i: Add support for H3 HDMI PHY variant")
Signed-off-by: Jernej Skrabec <jernej.skrabec@siol.net>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180625120304.7543-17-jernej.skrabec@siol.net
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit f3e077d95ca0a016fdf3d6b1e97a9910dfdaff17 ]
DML does not calculate chroma values for RQ when surface is not YUV, but DC
will unconditionally use the uninitialized values for HW programming.
This does not cause visual corruption since HW will ignore garbage chroma
values when surface is not YUV, but causes presubmission tests to fail
golden value comparison.
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Signed-off-by: Eryk Brol <eryk.brol@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 6f3472a993e7cb63cde5d818dcabc8e42fc03744 ]
Add suffix ULL to constant 5 and cast variables target_pix_clk_khz and
feedback_divider to uint64_t in order to avoid multiple potential integer
overflows and give the compiler complete information about the proper
arithmetic to use.
Notice that such constant and variables are used in contexts that
expect expressions of type uint64_t (64 bits, unsigned). The current
casts to uint64_t effectively apply to each expression as a whole,
but they do not prevent them from being evaluated using 32-bit
arithmetic instead of 64-bit arithmetic.
Also, once the expressions are properly evaluated using 64-bit
arithmentic, there is no need for the parentheses that enclose
them.
Addresses-Coverity-ID: 1460245 ("Unintentional integer overflow")
Addresses-Coverity-ID: 1460286 ("Unintentional integer overflow")
Addresses-Coverity-ID: 1460401 ("Unintentional integer overflow")
Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 30f3984ede683b98a4e8096e200df78bf0609b4f upstream.
Add new pci id.
Reviewed-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fcb74da1eb8edd3a4ef9b9724f88ed709d684227 upstream.
This fixes a NULL pointer dereference that can happen if the UDL
driver is unloaded before the framebuffer is initialized. This can
happen e.g. if the USB device is unplugged right after it was plugged
in.
As explained by Stéphane Marchesin:
It happens when fbdev is disabled (which is the case for Chrome OS).
Even though intialization of the fbdev part is optional (it's done in
udlfb_create which is the callback for fb_probe()), the teardown isn't
optional (udl_driver_unload -> udl_fbdev_cleanup ->
udl_fbdev_destroy).
Note that udl_fbdev_cleanup *tries* to be conditional (you can see it
does if (!udl->fbdev)) but that doesn't work, because udl->fbdev is
always set during udl_fbdev_init.
Cc: stable@vger.kernel.org
Suggested-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Emil Lundmark <lndmrk@chromium.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20180528142711.142466-1-lndmrk@chromium.org
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 658d8cbd07dae22ccecf49399e18c609c4e85c53 upstream.
When there's no scaling requested ->is_unity should be true no matter
the format.
Also, when no scaling is requested and we have a multi-planar YUV
format, we should leave ->y_scaling[0] to VC4_SCALING_NONE and only
set ->x_scaling[0] to VC4_SCALING_PPF.
Doing this fixes an hardly visible artifact (seen when using modetest
and a rather big overlay plane in YUV420).
Fixes: fc04023fafec ("drm/vc4: Add support for YUV planes.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180725122907.13702-1-boris.brezillon@bootlin.com
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 79e765ad665da4b8aa7e9c878bd2fef837f6fea5 upstream.
On most systems with ACPI hotplugging support, it seems that we always
receive a hotplug event once we re-enable EC interrupts even if the GPU
hasn't even been resumed yet.
This can cause problems since even though we schedule hpd_work to handle
connector reprobing for us, hpd_work synchronizes on
pm_runtime_get_sync() to wait until the device is ready to perform
reprobing. Since runtime suspend/resume callbacks are disabled before
the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
this period will grab a runtime PM ref and return immediately with
-EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
a connector reprobe immediately even if the GPU isn't actually resumed
just yet. This causes various warnings in dmesg and occasionally, also
prevents some displays connected to the dedicated GPU from coming back
up after suspend. Example:
usb 1-4: USB disconnect, device number 14
usb 1-4.1: USB disconnect, device number 15
WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
Workqueue: events nouveau_display_hpd_work [nouveau]
RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
FS: 0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? _cond_resched+0x15/0x30
nouveau_connector_detect+0x2ce/0x520 [nouveau]
? _cond_resched+0x15/0x30
? ww_mutex_lock+0x12/0x40
drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
nouveau_display_hpd_work+0x2a/0x60 [nouveau]
process_one_work+0x187/0x340
worker_thread+0x2e/0x380
? pwq_unbound_release_workfn+0xd0/0xd0
kthread+0x112/0x130
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
---[ end trace 55d811b38fc8e71a ]---
So, to fix this we attempt to grab a runtime PM reference in the ACPI
handler itself asynchronously. If the GPU is already awake (it will have
normal hotplugging at this point) or runtime PM callbacks are currently
disabled on the device, we drop our reference without updating the
autosuspend delay. We only schedule connector reprobes when we
successfully managed to queue up a resume request with our asynchronous
PM ref.
This also has the added benefit of preventing redundant connector
reprobes from ACPI while the GPU is runtime resumed!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <kherbst@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6833fb1ec120bf078e1a527c573a09d4de286224 upstream.
It's true we can't resume the device from poll workers in
nouveau_connector_detect(). We can however, prevent the autosuspend
timer from elapsing immediately if it hasn't already without risking any
sort of deadlock with the runtime suspend/resume operations. So do that
instead of entirely avoiding grabbing a power reference.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 7fec8f5379fb6eddabc0aaef6d2304c366808f97 upstream.
Currently, nouveau uses the generic drm_fb_helper_output_poll_changed()
function provided by DRM as it's output_poll_changed callback.
Unfortunately however, this function doesn't grab runtime PM references
early enough and even if it did-we can't block waiting for the device to
resume in output_poll_changed() since it's very likely that we'll need
to grab the fb_helper lock at some point during the runtime resume
process. This currently results in deadlocking like so:
[ 246.669625] INFO: task kworker/4:0:37 blocked for more than 120 seconds.
[ 246.673398] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.675271] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.676527] kworker/4:0 D 0 37 2 0x80000000
[ 246.677580] Workqueue: events output_poll_execute [drm_kms_helper]
[ 246.678704] Call Trace:
[ 246.679753] __schedule+0x322/0xaf0
[ 246.680916] schedule+0x33/0x90
[ 246.681924] schedule_preempt_disabled+0x15/0x20
[ 246.683023] __mutex_lock+0x569/0x9a0
[ 246.684035] ? kobject_uevent_env+0x117/0x7b0
[ 246.685132] ? drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.686179] mutex_lock_nested+0x1b/0x20
[ 246.687278] ? mutex_lock_nested+0x1b/0x20
[ 246.688307] drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.689420] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.690462] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.691570] output_poll_execute+0x198/0x1c0 [drm_kms_helper]
[ 246.692611] process_one_work+0x231/0x620
[ 246.693725] worker_thread+0x214/0x3a0
[ 246.694756] kthread+0x12b/0x150
[ 246.695856] ? wq_pool_ids_show+0x140/0x140
[ 246.696888] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.697998] ret_from_fork+0x3a/0x50
[ 246.699034] INFO: task kworker/0:1:60 blocked for more than 120 seconds.
[ 246.700153] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.701182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.702278] kworker/0:1 D 0 60 2 0x80000000
[ 246.703293] Workqueue: pm pm_runtime_work
[ 246.704393] Call Trace:
[ 246.705403] __schedule+0x322/0xaf0
[ 246.706439] ? wait_for_completion+0x104/0x190
[ 246.707393] schedule+0x33/0x90
[ 246.708375] schedule_timeout+0x3a5/0x590
[ 246.709289] ? mark_held_locks+0x58/0x80
[ 246.710208] ? _raw_spin_unlock_irq+0x2c/0x40
[ 246.711222] ? wait_for_completion+0x104/0x190
[ 246.712134] ? trace_hardirqs_on_caller+0xf4/0x190
[ 246.713094] ? wait_for_completion+0x104/0x190
[ 246.713964] wait_for_completion+0x12c/0x190
[ 246.714895] ? wake_up_q+0x80/0x80
[ 246.715727] ? get_work_pool+0x90/0x90
[ 246.716649] flush_work+0x1c9/0x280
[ 246.717483] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 246.718442] __cancel_work_timer+0x146/0x1d0
[ 246.719247] cancel_delayed_work_sync+0x13/0x20
[ 246.720043] drm_kms_helper_poll_disable+0x1f/0x30 [drm_kms_helper]
[ 246.721123] nouveau_pmops_runtime_suspend+0x3d/0xb0 [nouveau]
[ 246.721897] pci_pm_runtime_suspend+0x6b/0x190
[ 246.722825] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.723737] __rpm_callback+0x7a/0x1d0
[ 246.724721] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.725607] rpm_callback+0x24/0x80
[ 246.726553] ? pci_has_legacy_pm_support+0x70/0x70
[ 246.727376] rpm_suspend+0x142/0x6b0
[ 246.728185] pm_runtime_work+0x97/0xc0
[ 246.728938] process_one_work+0x231/0x620
[ 246.729796] worker_thread+0x44/0x3a0
[ 246.730614] kthread+0x12b/0x150
[ 246.731395] ? wq_pool_ids_show+0x140/0x140
[ 246.732202] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.732878] ret_from_fork+0x3a/0x50
[ 246.733768] INFO: task kworker/4:2:422 blocked for more than 120 seconds.
[ 246.734587] Not tainted 4.18.0-rc5Lyude-Test+ #2
[ 246.735393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.736113] kworker/4:2 D 0 422 2 0x80000080
[ 246.736789] Workqueue: events_long drm_dp_mst_link_probe_work [drm_kms_helper]
[ 246.737665] Call Trace:
[ 246.738490] __schedule+0x322/0xaf0
[ 246.739250] schedule+0x33/0x90
[ 246.739908] rpm_resume+0x19c/0x850
[ 246.740750] ? finish_wait+0x90/0x90
[ 246.741541] __pm_runtime_resume+0x4e/0x90
[ 246.742370] nv50_disp_atomic_commit+0x31/0x210 [nouveau]
[ 246.743124] drm_atomic_commit+0x4a/0x50 [drm]
[ 246.743775] restore_fbdev_mode_atomic+0x1c8/0x240 [drm_kms_helper]
[ 246.744603] restore_fbdev_mode+0x31/0x140 [drm_kms_helper]
[ 246.745373] drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xb0 [drm_kms_helper]
[ 246.746220] drm_fb_helper_set_par+0x2d/0x50 [drm_kms_helper]
[ 246.746884] drm_fb_helper_hotplug_event.part.28+0x96/0xb0 [drm_kms_helper]
[ 246.747675] drm_fb_helper_output_poll_changed+0x23/0x30 [drm_kms_helper]
[ 246.748544] drm_kms_helper_hotplug_event+0x2a/0x30 [drm_kms_helper]
[ 246.749439] nv50_mstm_hotplug+0x15/0x20 [nouveau]
[ 246.750111] drm_dp_send_link_address+0x177/0x1c0 [drm_kms_helper]
[ 246.750764] drm_dp_check_and_send_link_address+0xa8/0xd0 [drm_kms_helper]
[ 246.751602] drm_dp_mst_link_probe_work+0x51/0x90 [drm_kms_helper]
[ 246.752314] process_one_work+0x231/0x620
[ 246.752979] worker_thread+0x44/0x3a0
[ 246.753838] kthread+0x12b/0x150
[ 246.754619] ? wq_pool_ids_show+0x140/0x140
[ 246.755386] ? kthread_create_worker_on_cpu+0x70/0x70
[ 246.756162] ret_from_fork+0x3a/0x50
[ 246.756847]
Showing all locks held in the system:
[ 246.758261] 3 locks held by kworker/4:0/37:
[ 246.759016] #0: 00000000f8df4d2d ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.759856] #1: 00000000e6065461 ((work_completion)(&(&dev->mode_config.output_poll_work)->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.760670] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_hotplug_event.part.28+0x20/0xb0 [drm_kms_helper]
[ 246.761516] 2 locks held by kworker/0:1/60:
[ 246.762274] #0: 00000000fff6be0f ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.762982] #1: 000000005ab44fb4 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.763890] 1 lock held by khungtaskd/64:
[ 246.764664] #0: 000000008cb8b5c3 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 246.765588] 5 locks held by kworker/4:2/422:
[ 246.766440] #0: 00000000232f0959 ((wq_completion)"events_long"){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.767390] #1: 00000000bb59b134 ((work_completion)(&mgr->work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 246.768154] #2: 00000000cb66735f (&helper->lock){+.+.}, at: drm_fb_helper_restore_fbdev_mode_unlocked+0x4c/0xb0 [drm_kms_helper]
[ 246.768966] #3: 000000004c8f0b6b (crtc_ww_class_acquire){+.+.}, at: restore_fbdev_mode_atomic+0x4b/0x240 [drm_kms_helper]
[ 246.769921] #4: 000000004c34a296 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_backoff+0x8a/0x1b0 [drm]
[ 246.770839] 1 lock held by dmesg/1038:
[ 246.771739] 2 locks held by zsh/1172:
[ 246.772650] #0: 00000000836d0438 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 246.773680] #1: 000000001f4f4d48 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 246.775522] =============================================
After trying dozens of different solutions, I found one very simple one
that should also have the benefit of preventing us from having to fight
locking for the rest of our lives. So, we work around these deadlocks by
deferring all fbcon hotplug events that happen after the runtime suspend
process starts until after the device is resumed again.
Changes since v7:
- Fixup commit message - Daniel Vetter
Changes since v6:
- Remove unused nouveau_fbcon_hotplugged_in_suspend() - Ilia
Changes since v5:
- Come up with the (hopefully final) solution for solving this dumb
problem, one that is a lot less likely to cause issues with locking in
the future. This should work around all deadlock conditions with fbcon
brought up thus far.
Changes since v4:
- Add nouveau_fbcon_hotplugged_in_suspend() to workaround deadlock
condition that Lukas described
- Just move all of this out of drm_fb_helper. It seems that other DRM
drivers have already figured out other workarounds for this. If other
drivers do end up needing this in the future, we can just move this
back into drm_fb_helper again.
Changes since v3:
- Actually check if fb_helper is NULL in both new helpers
- Actually check drm_fbdev_emulation in both new helpers
- Don't fire off a fb_helper hotplug unconditionally; only do it if
the following conditions are true (as otherwise, calling this in the
wrong spot will cause Bad Things to happen):
- fb_helper hotplug handling was actually inhibited previously
- fb_helper actually has a delayed hotplug pending
- fb_helper is actually bound
- fb_helper is actually initialized
- Add __must_check to drm_fb_helper_suspend_hotplug(). There's no
situation where a driver would actually want to use this without
checking the return value, so enforce that
- Rewrite and clarify the documentation for both helpers.
- Make sure to return true in the drm_fb_helper_suspend_hotplug() stub
that's provided in drm_fb_helper.h when CONFIG_DRM_FBDEV_EMULATION
isn't enabled
- Actually grab the toplevel fb_helper lock in
drm_fb_helper_resume_hotplug(), since it's possible other activity
(such as a hotplug) could be going on at the same time the driver
calls drm_fb_helper_resume_hotplug(). We need this to check whether or
not drm_fb_helper_hotplug_event() needs to be called anyway
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit d77ef138ff572409ab93d492e5e6c826ee6fb21d upstream.
Turns out this part is my fault for not noticing when reviewing
9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling"). Currently
we call drm_kms_helper_poll_enable() from nouveau_display_hpd_work().
This makes basically no sense however, because that means we're calling
drm_kms_helper_poll_enable() every time we schedule the hotplug
detection work. This is also against the advice mentioned in
drm_kms_helper_poll_enable()'s documentation:
Note that calls to enable and disable polling must be strictly ordered,
which is automatically the case when they're only call from
suspend/resume callbacks.
Of course, hotplugs can't really be ordered. They could even happen
immediately after we called drm_kms_helper_poll_disable() in
nouveau_display_fini(), which can lead to all sorts of issues.
Additionally; enabling polling /after/ we call
drm_helper_hpd_irq_event() could also mean that we'd miss a hotplug
event anyway, since drm_helper_hpd_irq_event() wouldn't bother trying to
probe connectors so long as polling is disabled.
So; simply move this back into nouveau_display_init() again. The race
condition that both of these patches attempted to work around has
already been fixed properly in
d61a5c106351 ("drm/nouveau: Fix deadlock on runtime suspend")
Fixes: 9a2eba337cace ("drm/nouveau: Fix drm poll_helper handling")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2f7ca781fd382cf8dde73ed36dfdd93fd05b3332 upstream.
Currently, there's nothing in nouveau that actually cancels this work
struct. So, cancel it on suspend/unload. Otherwise, if we're unlucky
enough hpd_work might try to keep running up until the system is
suspended.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 3e1a12754d4df5804bfca5dedf09d2ba291bdc2a upstream.
When we disable hotplugging on the GPU, we need to be able to
synchronize with each connector's hotplug interrupt handler before the
interrupt is finally disabled. This can be a problem however, since
nouveau_connector_detect() currently grabs a runtime power reference
when handling connector probing. This will deadlock the runtime suspend
handler like so:
[ 861.480896] INFO: task kworker/0:2:61 blocked for more than 120 seconds.
[ 861.483290] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.485158] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.486332] kworker/0:2 D 0 61 2 0x80000000
[ 861.487044] Workqueue: events nouveau_display_hpd_work [nouveau]
[ 861.487737] Call Trace:
[ 861.488394] __schedule+0x322/0xaf0
[ 861.489070] schedule+0x33/0x90
[ 861.489744] rpm_resume+0x19c/0x850
[ 861.490392] ? finish_wait+0x90/0x90
[ 861.491068] __pm_runtime_resume+0x4e/0x90
[ 861.491753] nouveau_display_hpd_work+0x22/0x60 [nouveau]
[ 861.492416] process_one_work+0x231/0x620
[ 861.493068] worker_thread+0x44/0x3a0
[ 861.493722] kthread+0x12b/0x150
[ 861.494342] ? wq_pool_ids_show+0x140/0x140
[ 861.494991] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.495648] ret_from_fork+0x3a/0x50
[ 861.496304] INFO: task kworker/6:2:320 blocked for more than 120 seconds.
[ 861.496968] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.497654] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.498341] kworker/6:2 D 0 320 2 0x80000080
[ 861.499045] Workqueue: pm pm_runtime_work
[ 861.499739] Call Trace:
[ 861.500428] __schedule+0x322/0xaf0
[ 861.501134] ? wait_for_completion+0x104/0x190
[ 861.501851] schedule+0x33/0x90
[ 861.502564] schedule_timeout+0x3a5/0x590
[ 861.503284] ? mark_held_locks+0x58/0x80
[ 861.503988] ? _raw_spin_unlock_irq+0x2c/0x40
[ 861.504710] ? wait_for_completion+0x104/0x190
[ 861.505417] ? trace_hardirqs_on_caller+0xf4/0x190
[ 861.506136] ? wait_for_completion+0x104/0x190
[ 861.506845] wait_for_completion+0x12c/0x190
[ 861.507555] ? wake_up_q+0x80/0x80
[ 861.508268] flush_work+0x1c9/0x280
[ 861.508990] ? flush_workqueue_prep_pwqs+0x1b0/0x1b0
[ 861.509735] nvif_notify_put+0xb1/0xc0 [nouveau]
[ 861.510482] nouveau_display_fini+0xbd/0x170 [nouveau]
[ 861.511241] nouveau_display_suspend+0x67/0x120 [nouveau]
[ 861.511969] nouveau_do_suspend+0x5e/0x2d0 [nouveau]
[ 861.512715] nouveau_pmops_runtime_suspend+0x47/0xb0 [nouveau]
[ 861.513435] pci_pm_runtime_suspend+0x6b/0x180
[ 861.514165] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.514897] __rpm_callback+0x7a/0x1d0
[ 861.515618] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.516313] rpm_callback+0x24/0x80
[ 861.517027] ? pci_has_legacy_pm_support+0x70/0x70
[ 861.517741] rpm_suspend+0x142/0x6b0
[ 861.518449] pm_runtime_work+0x97/0xc0
[ 861.519144] process_one_work+0x231/0x620
[ 861.519831] worker_thread+0x44/0x3a0
[ 861.520522] kthread+0x12b/0x150
[ 861.521220] ? wq_pool_ids_show+0x140/0x140
[ 861.521925] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.522622] ret_from_fork+0x3a/0x50
[ 861.523299] INFO: task kworker/6:0:1329 blocked for more than 120 seconds.
[ 861.523977] Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.524644] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 861.525349] kworker/6:0 D 0 1329 2 0x80000000
[ 861.526073] Workqueue: events nvif_notify_work [nouveau]
[ 861.526751] Call Trace:
[ 861.527411] __schedule+0x322/0xaf0
[ 861.528089] schedule+0x33/0x90
[ 861.528758] rpm_resume+0x19c/0x850
[ 861.529399] ? finish_wait+0x90/0x90
[ 861.530073] __pm_runtime_resume+0x4e/0x90
[ 861.530798] nouveau_connector_detect+0x7e/0x510 [nouveau]
[ 861.531459] ? ww_mutex_lock+0x47/0x80
[ 861.532097] ? ww_mutex_lock+0x47/0x80
[ 861.532819] ? drm_modeset_lock+0x88/0x130 [drm]
[ 861.533481] drm_helper_probe_detect_ctx+0xa0/0x100 [drm_kms_helper]
[ 861.534127] drm_helper_hpd_irq_event+0xa4/0x120 [drm_kms_helper]
[ 861.534940] nouveau_connector_hotplug+0x98/0x120 [nouveau]
[ 861.535556] nvif_notify_work+0x2d/0xb0 [nouveau]
[ 861.536221] process_one_work+0x231/0x620
[ 861.536994] worker_thread+0x44/0x3a0
[ 861.537757] kthread+0x12b/0x150
[ 861.538463] ? wq_pool_ids_show+0x140/0x140
[ 861.539102] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.539815] ret_from_fork+0x3a/0x50
[ 861.540521]
Showing all locks held in the system:
[ 861.541696] 2 locks held by kworker/0:2/61:
[ 861.542406] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.543071] #1: 0000000076868126 ((work_completion)(&drm->hpd_work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.543814] 1 lock held by khungtaskd/64:
[ 861.544535] #0: 0000000059db4b53 (rcu_read_lock){....}, at: debug_show_all_locks+0x23/0x185
[ 861.545160] 3 locks held by kworker/6:2/320:
[ 861.545896] #0: 00000000d9e1bc59 ((wq_completion)"pm"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.546702] #1: 00000000c9f92d84 ((work_completion)(&dev->power.work)){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.547443] #2: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: nouveau_display_fini+0x96/0x170 [nouveau]
[ 861.548146] 1 lock held by dmesg/983:
[ 861.548889] 2 locks held by zsh/1250:
[ 861.549605] #0: 00000000348e3cf6 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40
[ 861.550393] #1: 000000007009a7a8 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xc1/0x870
[ 861.551122] 6 locks held by kworker/6:0/1329:
[ 861.551957] #0: 000000002dbf8af5 ((wq_completion)"events"){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.552765] #1: 00000000ddb499ad ((work_completion)(¬ify->work)#2){+.+.}, at: process_one_work+0x1b3/0x620
[ 861.553582] #2: 000000006e013cbe (&dev->mode_config.mutex){+.+.}, at: drm_helper_hpd_irq_event+0x6c/0x120 [drm_kms_helper]
[ 861.554357] #3: 000000004afc5de1 (drm_connector_list_iter){.+.+}, at: drm_helper_hpd_irq_event+0x78/0x120 [drm_kms_helper]
[ 861.555227] #4: 0000000044f294d9 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x3d/0x100 [drm_kms_helper]
[ 861.556133] #5: 00000000db193642 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x4b/0x130 [drm]
[ 861.557864] =============================================
[ 861.559507] NMI backtrace for cpu 2
[ 861.560363] CPU: 2 PID: 64 Comm: khungtaskd Tainted: G O 4.18.0-rc6Lyude-Test+ #1
[ 861.561197] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[ 861.561948] Call Trace:
[ 861.562757] dump_stack+0x8e/0xd3
[ 861.563516] nmi_cpu_backtrace.cold.3+0x14/0x5a
[ 861.564269] ? lapic_can_unplug_cpu.cold.27+0x42/0x42
[ 861.565029] nmi_trigger_cpumask_backtrace+0xa1/0xae
[ 861.565789] arch_trigger_cpumask_backtrace+0x19/0x20
[ 861.566558] watchdog+0x316/0x580
[ 861.567355] kthread+0x12b/0x150
[ 861.568114] ? reset_hung_task_detector+0x20/0x20
[ 861.568863] ? kthread_create_worker_on_cpu+0x70/0x70
[ 861.569598] ret_from_fork+0x3a/0x50
[ 861.570370] Sending NMI from CPU 2 to CPUs 0-1,3-7:
[ 861.571426] NMI backtrace for cpu 6 skipped: idling at intel_idle+0x7f/0x120
[ 861.571429] NMI backtrace for cpu 7 skipped: idling at intel_idle+0x7f/0x120
[ 861.571432] NMI backtrace for cpu 3 skipped: idling at intel_idle+0x7f/0x120
[ 861.571464] NMI backtrace for cpu 5 skipped: idling at intel_idle+0x7f/0x120
[ 861.571467] NMI backtrace for cpu 0 skipped: idling at intel_idle+0x7f/0x120
[ 861.571469] NMI backtrace for cpu 4 skipped: idling at intel_idle+0x7f/0x120
[ 861.571472] NMI backtrace for cpu 1 skipped: idling at intel_idle+0x7f/0x120
[ 861.572428] Kernel panic - not syncing: hung_task: blocked tasks
So: fix this by making it so that normal hotplug handling /only/ happens
so long as the GPU is currently awake without any pending runtime PM
requests. In the event that a hotplug occurs while the device is
suspending or resuming, we can simply defer our response until the GPU
is fully runtime resumed again.
Changes since v4:
- Use a new trick I came up with using pm_runtime_get() instead of the
hackish junk we had before
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 611ce855420a6e8b9ff47af5f47431d52c7709f8 upstream.
Since actual hotplug notifications don't get disabled until
nouveau_display_fini() is called, all this will do is cause any hotplugs
that happen between this drm_kms_helper_poll_disable() call and the
actual hotplug disablement to potentially be dropped if ACPI isn't
around to help us.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Cc: stable@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b26b4590dd53e012526342e749c423e6c0e73437 upstream.
Currently, nouveau will re-write the DP_MSTM_CTRL register for an MST
hub every time it receives a long HPD pulse on DP. This isn't actually
necessary and additionally, has some unintended side effects.
With the P50 I've got here, rewriting DP_MSTM_CTRL constantly seems to
make it rather likely (1 out of 5 times usually) that bringing up MST
with it's ThinkPad dock will fail and result in sideband messages timing
out in the middle. Afterwards, successive probes don't manage to get the
dock to communicate properly over MST sideband properly.
Many times sideband message timeouts from MST hubs are indicative of
either the source or the sink dropping an ESI event, which can cause
DRM's perspective of the topology's current state to go out of sync with
reality. While it's tough to really know for sure what's happening to
the dock, using userspace tools to write to DP_MSTM_CTRL in the middle
of the MST link probing process does appear to make things flaky. It's
possible that when we write to DP_MSTM_CTRL, the function that gets
triggered to respond in the dock's firmware temporarily puts it in a
state where it might end up not reporting an ESI to the source, or ends
up dropping a sideband message we sent it.
So, to fix this we make it so that when probing an MST topology, we
respect it's current state. If the dock's already enabled, we simply
read DP_MSTM_CTRL and disable the topology if it's value is not what we
expected. Otherwise, we perform the normal MST probing dance. We avoid
taking any action except if the state of the MST topology actually
changes.
This fixes MST sideband message timeouts and detection failures on my
P50 with its ThinkPad dock.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit fa3cdf8d0b092c4561f9f017dfac409eb7644737 upstream.
When probing a new MST device, it's not safe to make any assumptions
about it's current state. While most well mannered MST hubs will just
disable the branching unit on hotplug disconnects, this isn't enough to
save us from various other scenarios that might have resulted in
something writing to the MST branching unit before we got control of it.
This could happen if a previous probe we tried failed, if we're booting
in kexec context and the hub is still in the state the last kernel put
it in, etc.
Luckily; there is no reason we can't just reset the branching unit
every time we enable a new topology. So, fix this by resetting it on
enabling new topologies to ensure that we always start off with a clean,
unmodified topology state on MST sinks.
This fixes occasional hard-lockups on my P50's laptop dock (e.g. AUX
times out all DPCD trasactions) observed after multiple docks, undocks,
and module reloads.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|