summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/nouveau
AgeCommit message (Collapse)AuthorFilesLines
2021-07-11drm/nouveau: fix dma_address check for CPU/GPU syncChristian König1-2/+2
[ Upstream commit d330099115597bbc238d6758a4930e72b49ea9ba ] AGP for example doesn't have a dma_address array. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210614110517.1624-1-christian.koenig@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-30drm/nouveau/i2c/gm200: increase width of aux semaphore owner fieldsBen Skeggs1-4/+4
[ Upstream commit ba6e9ab0fcf3d76e3952deb12b5f993991621d9c ] Noticed while debugging GA102. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-30drm/nouveau/bios: fix issue shadowing expansion ROMsBen Skeggs1-1/+1
[ Upstream commit 402a89660e9dc880710b12773076a336c9dab3d7 ] This issue has generally been covered up by the presence of additional expansion ROMs after the ones we're interested in, with header fetches of subsequent images loading enough of the ROM to hide the issue. Noticed on GA102, which lacks a type 0x70 image compared to TU102,. [ 906.364197] nouveau 0000:09:00.0: bios: 00000000: type 00, 65024 bytes [ 906.381205] nouveau 0000:09:00.0: bios: 0000fe00: type 03, 91648 bytes [ 906.405213] nouveau 0000:09:00.0: bios: 00026400: type e0, 22016 bytes [ 906.410984] nouveau 0000:09:00.0: bios: 0002ba00: type e0, 366080 bytes vs [ 22.961901] nouveau 0000:09:00.0: bios: 00000000: type 00, 60416 bytes [ 22.984174] nouveau 0000:09:00.0: bios: 0000ec00: type 03, 71168 bytes [ 23.010446] nouveau 0000:09:00.0: bios: 00020200: type e0, 48128 bytes [ 23.028220] nouveau 0000:09:00.0: bios: 0002be00: type e0, 140800 bytes [ 23.080196] nouveau 0000:09:00.0: bios: 0004e400: type 70, 7168 bytes Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03drm/nouveau: Fix reference count leak in nouveau_connector_detectAditya Pakki1-1/+3
[ Upstream commit 990a1162986e8eff7ca18cc5a0e03b4304392ae2 ] nouveau_connector_detect() calls pm_runtime_get_sync and in turn increments the reference count. In case of failure, decrement the ref count before returning the error. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-03drm/nouveau/drm/noveau: fix reference count leak in nouveau_fbcon_openAditya Pakki1-1/+3
[ Upstream commit bfad51c7633325b5d4b32444efe04329d53297b2 ] nouveau_fbcon_open() calls calls pm_runtime_get_sync() that increments the reference count. In case of failure, decrement the ref count before returning the error. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21drm/nouveau: fix multiple instances of reference count leaksAditya Pakki2-3/+9
[ Upstream commit 659fb5f154c3434c90a34586f3b7aa1c39cf6062 ] On calling pm_runtime_get_sync() the reference count of the device is incremented. In case of failure, decrement the ref count before returning the error. Signed-off-by: Aditya Pakki <pakki001@umn.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-21drm/nouveau/fbcon: fix module unload when fbcon init has failed for some reasonBen Skeggs1-0/+1
[ Upstream commit 498595abf5bd51f0ae074cec565d888778ea558f ] Stale pointer was tripping up the unload path. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-07-31drm/nouveau/i2c/g94-: increase NV_PMGR_DP_AUXCTL_TRANSACTREQ timeoutBen Skeggs2-4/+4
[ Upstream commit 0156e76d388310a490aeb0f2fbb5b284ded3aecc ] Tegra TRM says worst-case reply time is 1216us, and this should fix some spurious timeouts that have been popping up. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28drm/nouveau/disp/nv50-: prevent oops when no channel method map providedBen Skeggs1-0/+2
[ Upstream commit 0e6176c6d286316e9431b4f695940cfac4ffe6c2 ] The implementations for most channel types contains a map of methods to priv registers in order to provide debugging info when a disp exception has been raised. This info is missing from the implementation of PIO channels as they're rather simplistic already, however, if an exception is raised by one of them, we'd end up triggering a NULL-pointer deref. Not ideal... Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=206299 Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-28drm/nouveau: Fix copy-paste error in nouveau_fence_wait_uevent_handlerYueHaibing1-1/+1
[ Upstream commit 1eb013473bff5f95b6fe1ca4dd7deda47257b9c2 ] Like other cases, it should use rcu protected 'chan' rather than 'fence->channel' in nouveau_fence_wait_uevent_handler. Fixes: 0ec5f02f0e2c ("drm/nouveau: prevent stale fence->channel pointers, and protect with rcu") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-29drm/nouveau/pmu: don't print reply values if exec is falseColin Ian King1-2/+2
[ Upstream commit b1d03fc36ec9834465a08c275c8d563e07f6f6bf ] Currently the uninitialized values in the array reply are printed out when exec is false and nvkm_pmu_send has not updated the array. Avoid confusion by only dumping out these values if they have been actually updated. Detected by CoverityScan, CID#1271291 ("Uninitialized scaler variable") Fixes: ebb58dc2ef8c ("drm/nouveau/pmu: rename from pwr (no binary change)") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-01-29drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RONColin Ian King1-1/+1
[ Upstream commit 13649101a25c53c87f4ab98a076dfe61f3636ab1 ] Currently, the expression for calculating RON is always going to result in zero no matter the value of ram->mr[1] because the ! operator has higher precedence than the shift >> operator. I believe the missing parentheses around the expression before appying the ! operator will result in the desired result. [ Note, not tested ] Detected by CoveritScan, CID#1324005 ("Operands don't affect result") Fixes: c25bf7b6155c ("drm/nouveau/bios/ramcfg: Separate out RON pull value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-08-04drm/nouveau/i2c: Enable i2c pads & busses during preinitLyude Paul1-0/+20
commit 7cb95eeea6706c790571042a06782e378b2561ea upstream. It turns out that while disabling i2c bus access from software when the GPU is suspended was a step in the right direction with: commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") We also ended up accidentally breaking the vbios init scripts on some older Tesla GPUs, as apparently said scripts can actually use the i2c bus. Since these scripts are executed before initializing any subdevices, we end up failing to acquire access to the i2c bus which has left a number of cards with their fan controllers uninitialized. Luckily this doesn't break hardware - it just means the fan gets stuck at 100%. This also means that we've always been using our i2c busses before initializing them during the init scripts for older GPUs, we just didn't notice it until we started preventing them from being used until init. It's pretty impressive this never caused us any issues before! So, fix this by initializing our i2c pad and busses during subdev pre-init. We skip initializing aux busses during pre-init, as those are guaranteed to only ever be used by nouveau for DP aux transactions. Signed-off-by: Lyude Paul <lyude@redhat.com> Tested-by: Marc Meledandri <m.meledandri@gmail.com> Fixes: 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()") Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-11drm/nouveau/i2c: Disable i2c bus access after ->fini()Lyude Paul6-2/+65
commit 342406e4fbba9a174125fbfe6aeac3d64ef90f76 upstream. For a while, we've had the problem of i2c bus access not grabbing a runtime PM ref when it's being used in userspace by i2c-dev, resulting in nouveau spamming the kernel log with errors if anything attempts to access the i2c bus while the GPU is in runtime suspend. An example: [ 130.078386] nouveau 0000:01:00.0: i2c: aux 000d: begin idle timeout ffffffff Since the GPU is in runtime suspend, the MMIO region that the i2c bus is on isn't accessible. On x86, the standard behavior for accessing an unavailable MMIO region is to just return ~0. Except, that turned out to be a lie. While computers with a clean concious will return ~0 in this scenario, some machines will actually completely hang a CPU on certian bad MMIO accesses. This was witnessed with someone's Lenovo ThinkPad P50, where sensors-detect attempting to access the i2c bus while the GPU was suspended would result in a CPU hang: CPU: 5 PID: 12438 Comm: sensors-detect Not tainted 5.0.0-0.rc4.git3.1.fc30.x86_64 #1 Hardware name: LENOVO 20EQS64N17/20EQS64N17, BIOS N1EET74W (1.47 ) 11/21/2017 RIP: 0010:ioread32+0x2b/0x30 Code: 81 ff ff ff 03 00 77 20 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3 48 c7 c6 e1 0c 36 96 e8 2d ff ff ff b8 ff ff ff ff c3 8b 07 <c3> 0f 1f 40 00 49 89 f0 48 81 fe ff ff 03 00 76 04 40 88 3e c3 48 RSP: 0018:ffffaac3c5007b48 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff13 RAX: 0000000001111000 RBX: 0000000001111000 RCX: 0000043017a97186 RDX: 0000000000000aaa RSI: 0000000000000005 RDI: ffffaac3c400e4e4 RBP: ffff9e6443902c00 R08: ffffaac3c400e4e4 R09: ffffaac3c5007be7 R10: 0000000000000004 R11: 0000000000000001 R12: ffff9e6445dd0000 R13: 000000000000e4e4 R14: 00000000000003c4 R15: 0000000000000000 FS: 00007f253155a740(0000) GS:ffff9e644f600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005630d1500358 CR3: 0000000417c44006 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: g94_i2c_aux_xfer+0x326/0x850 [nouveau] nvkm_i2c_aux_i2c_xfer+0x9e/0x140 [nouveau] __i2c_transfer+0x14b/0x620 i2c_smbus_xfer_emulated+0x159/0x680 ? _raw_spin_unlock_irqrestore+0x1/0x60 ? rt_mutex_slowlock.constprop.0+0x13d/0x1e0 ? __lock_is_held+0x59/0xa0 __i2c_smbus_xfer+0x138/0x5a0 i2c_smbus_xfer+0x4f/0x80 i2cdev_ioctl_smbus+0x162/0x2d0 [i2c_dev] i2cdev_ioctl+0x1db/0x2c0 [i2c_dev] do_vfs_ioctl+0x408/0x750 ksys_ioctl+0x5e/0x90 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x60/0x1e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7f25317f546b Code: 0f 1e fa 48 8b 05 1d da 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed d9 0c 00 f7 d8 64 89 01 48 RSP: 002b:00007ffc88caab68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00005630d0fe7260 RCX: 00007f25317f546b RDX: 00005630d1598e80 RSI: 0000000000000720 RDI: 0000000000000003 RBP: 00005630d155b968 R08: 0000000000000001 R09: 00005630d15a1da0 R10: 0000000000000070 R11: 0000000000000246 R12: 00005630d1598e80 R13: 00005630d12f3d28 R14: 0000000000000720 R15: 00005630d12f3ce0 watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [sensors-detect:12438] Yikes! While I wanted to try to make it so that accessing an i2c bus on nouveau would wake up the GPU as needed, airlied pointed out that pretty much any usecase for userspace accessing an i2c bus on a GPU (mainly for the DDC brightness control that some displays have) is going to only be useful while there's at least one display enabled on the GPU anyway, and the GPU never sleeps while there's displays running. Since teaching the i2c bus to wake up the GPU on userspace accesses is a good deal more difficult than it might seem, mostly due to the fact that we have to use the i2c bus during runtime resume of the GPU, we instead opt for the easiest solution: don't let userspace access i2c busses on the GPU at all while it's in runtime suspend. Changes since v1: * Also disable i2c busses that run over DP AUX Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-11-10drm/nouveau/fbcon: fix oops without fbdev emulationPavel Roskin1-3/+4
[ Upstream commit 4813766325374af6ed0b66879ba6a0bbb05c83b6 ] This is similar to an earlier commit 52dfcc5ccfbb ("drm/nouveau: fix for disabled fbdev emulation"), but protects all occurrences of helper.fbdev in the source. I see oops in nouveau_fbcon_accel_save_disable() called from nouveau_fbcon_set_suspend_work() on Linux 3.13 when CONFIG_DRM_FBDEV_EMULATION option is disabled. Signed-off-by: Pavel Roskin <plroskin@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-10-10drm/nouveau/TBDdevinit: don't fail when PMU/PRE_OS is missing from VBIOSBen Skeggs1-1/+2
[ Upstream commit 0a6986c6595e9afd20ff7280dab36431c1e467f8 ] This Falcon application doesn't appear to be present on some newer systems, so let's not fail init if we can't find it. TBD: is there a way to determine whether it *should* be there? Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-29drm/nouveau/drm/nouveau: Use pm_runtime_get_noresume() in connector_detect()Lyude Paul1-9/+11
commit 6833fb1ec120bf078e1a527c573a09d4de286224 upstream. It's true we can't resume the device from poll workers in nouveau_connector_detect(). We can however, prevent the autosuspend timer from elapsing immediately if it hasn't already without risking any sort of deadlock with the runtime suspend/resume operations. So do that instead of entirely avoiding grabbing a power reference. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Cc: stable@vger.kernel.org Cc: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-26drm/nouveau: tegra: Detach from ARM DMA/IOMMU mappingThierry Reding1-0/+13
[ Upstream commit b59fb482b52269977ee5de205308e5b236a03917 ] Depending on the kernel configuration, early ARM architecture setup code may have attached the GPU to a DMA/IOMMU mapping that transparently uses the IOMMU to back the DMA API. Tegra requires special handling for IOMMU backed buffers (a special bit in the GPU's MMU page tables indicates the memory path to take: via the SMMU or directly to the memory controller). Transparently backing DMA memory with an IOMMU prevents Nouveau from properly handling such memory accesses and causes memory access faults. As a side-note: buffers other than those allocated in instance memory don't need to be physically contiguous from the GPU's perspective since the GPU can map them into contiguous buffers using its own MMU. Mapping these buffers through the IOMMU is unnecessary and will even lead to performance degradation because of the additional translation. One exception to this are compressible buffers which need large pages. In order to enable these large pages, multiple small pages will have to be combined into one large (I/O virtually contiguous) mapping via the IOMMU. However, that is a topic outside the scope of this fix and isn't currently supported. An implementation will want to explicitly create these large pages in the Nouveau driver, so detaching from a DMA/IOMMU mapping would still be required. Signed-off-by: Thierry Reding <treding@nvidia.com> Acked-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Tested-by: Nicolas Chauvet <kwizart@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-09drm/drivers: add support for using the arch wc mapping API.Dave Airlie1-0/+8
commit 7cf321d118a825c1541b43ca45294126fd474efa upstream. This fixes a regression in all these drivers since the cache mode tracking was fixed for mixed mappings. It uses the new arch API to add the VRAM range to the PAT mapping tracking tables. Fixes: 87744ab3832 (mm: fix cache mode tracking in vm_insert_mixed()) Reviewed-by: Christian König <christian.koenig@amd.com>. Signed-off-by: Dave Airlie <airlied@redhat.com> Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-08-24drm/nouveau/gem: off by one bugs in nouveau_gem_pushbuf_reloc_apply()Dan Carpenter1-2/+2
[ Upstream commit 7f073d011f93e92d4d225526b9ab6b8b0bbd6613 ] The bo array has req->nr_buffers elements so the > should be >= so we don't read beyond the end of the array. Fixes: a1606a9596e5 ("drm/nouveau: new gem pushbuf interface, bump to 0.0.16") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-24drm/nouveau/kms: Increase max retries in scanout position queries.Mario Kleiner1-1/+1
[ Upstream commit 60b95d709525e3ce1c51e1fc93175dcd1755d345 ] So far we only allowed for 1 retry and just failed the query - and thereby high precision vblank timestamping - if we did not get a reasonable result, as such a failure wasn't considered all too horrible. There are a few NVidia gpu models out there which may need a bit more than 1 retry to get a successful query result under some conditions. Since Linux 4.4 the update code for vblank counter and timestamp in drm_update_vblank_count() changed so that the implementation assumes that high precision vblank timestamping of a kms driver either consistently succeeds or consistently fails for a given video mode and encoder/connector combo. Iow. switching from success to fail or vice versa on a modeset or connector change is ok, but spurious temporary failure for a given setup can confuse the core code and potentially cause bad miscounting of vblanks and confusion or hangs in userspace clients which rely on vblank stuff, e.g., desktop compositors. Therefore change the max retry count to a larger number - more than any gpu so far is known to need to succeed, but still low enough so that these queries which do also happen in vblank interrupt are still fast enough to be not disastrously long if something would go badly wrong with them. As such sporadic retries only happen seldom even on affected gpu's, this could mean a vblank irq could take a few dozen microseconds longer every few hours of uptime -- better than a desktop compositor randomly hanging every couple of hours or days of uptime in a hard to reproduce manner. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-18drm/nouveau: Fix deadlock on runtime suspendLukas Wunner1-5/+13
commit d61a5c1063515e855bedb1b81e20e50b0ac3541e upstream. nouveau's ->runtime_suspend hook calls drm_kms_helper_poll_disable(), which waits for the output poll worker to finish if it's running. The output poll worker meanwhile calls pm_runtime_get_sync() in nouveau_connector_detect() which waits for the ongoing suspend to finish, causing a deadlock. Fix by not acquiring a runtime PM ref if nouveau_connector_detect() is called in the output poll worker's context. This is safe because the poll worker is only enabled while runtime active and we know that ->runtime_suspend waits for it to finish. Other contexts calling nouveau_connector_detect() do require a runtime PM ref, these comprise: status_store() drm sysfs interface ->fill_modes drm callback drm_fb_helper_probe_connector_modes() drm_mode_getconnector() nouveau_connector_hotplug() nouveau_display_hpd_work() nv17_tv_set_property() Stack trace for posterity: INFO: task kworker/0:1:58 blocked for more than 120 seconds. Workqueue: events output_poll_execute [drm_kms_helper] Call Trace: schedule+0x28/0x80 rpm_resume+0x107/0x6e0 __pm_runtime_resume+0x47/0x70 nouveau_connector_detect+0x7e/0x4a0 [nouveau] nouveau_connector_detect_lvds+0x132/0x180 [nouveau] drm_helper_probe_detect_ctx+0x85/0xd0 [drm_kms_helper] output_poll_execute+0x11e/0x1c0 [drm_kms_helper] process_one_work+0x184/0x380 worker_thread+0x2e/0x390 INFO: task kworker/0:2:252 blocked for more than 120 seconds. Workqueue: pm pm_runtime_work Call Trace: schedule+0x28/0x80 schedule_timeout+0x1e3/0x370 wait_for_completion+0x123/0x190 flush_work+0x142/0x1c0 nouveau_pmops_runtime_suspend+0x7e/0xd0 [nouveau] pci_pm_runtime_suspend+0x5c/0x180 vga_switcheroo_runtime_suspend+0x1e/0xa0 __rpm_callback+0xc1/0x200 rpm_callback+0x1f/0x70 rpm_suspend+0x13c/0x640 pm_runtime_work+0x6e/0x90 process_one_work+0x184/0x380 worker_thread+0x2e/0x390 Bugzilla: https://bugs.archlinux.org/task/53497 Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870523 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388#c33 Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)") Cc: stable@vger.kernel.org # v3.12+: 27d4ee03078a: workqueue: Allow retrieval of current task's work struct Cc: stable@vger.kernel.org # v3.12+: 25c058ccaf2e: drm: Allow determining if current task is output poll worker Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Dave Airlie <airlied@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://patchwork.freedesktop.org/patch/msgid/b7d2cbb609a80f59ccabfdf479b9d5907c603ea1.1518338789.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-03-03drm/nouveau/pci: do a msi rearm on initKarol Herbst1-0/+7
[ Upstream commit a121027d2747168df0aac0c3da35509eea39f61c ] On my GP107 when I load nouveau after unloading it, for some reason the GPU stopped sending or the CPU stopped receiving interrupts if MSI was enabled. Doing a rearm once before getting any interrupts fixes this. Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-02-25drm/nouveau: hide gcc-4.9 -Wmaybe-uninitializedArnd Bergmann1-1/+1
commit b74c0a9969f25217a5e5bbcac56a11bee16718d3 upstream. gcc-4.9 notices that the validate_init() function returns unintialized data when called with a zero 'nr_buffers' argument, when called with the -Wmaybe-uninitialized flag: drivers/gpu/drm/nouveau/nouveau_gem.c: In function ‘validate_init.isra.6’: drivers/gpu/drm/nouveau/nouveau_gem.c:457:5: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized] However, the only caller of this function always passes a nonzero argument, and gcc-6 is clever enough to take this into account and not warn about it any more. Adding an explicit initialization to -EINVAL here is correct even if the caller changed, and it avoids the warning on gcc-4.9 as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-By: Karol Herbst <karolherbst@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-27drm/nouveau/mmu: flush tlbs before deleting page tablesBen Skeggs1-0/+2
commit 77913bbcb43ac9a07a6fe849c2fd3bf85fc8bdd8 upstream. Even though we've zeroed the PDE, the GPU may have cached the PD, so we need to flush when deleting them. Noticed while working on replacement MMU code, but a backport might be a good idea, so let's fix it in the current code too. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-27drm/nouveau/bsp/g92: disable by defaultIlia Mirkin1-1/+1
commit 194d68dd051c2dd5ac2b522ae16100e774e8d869 upstream. G92's seem to require some additional bit of initialization before the BSP engine can work. It feels like clocks are not set up for the underlying VLD engine, which means that all commands submitted to the xtensa chip end up hanging. VP seems to work fine though. This still allows people to force-enable the bsp engine if they want to play around with it, but makes it harder for the card to hang by default. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-09-14drm/nouveau/pci/msi: disable MSI on big-endian platforms by defaultIlia Mirkin1-0/+4
commit bc60c90f472b6e762ea96ef384072145adc8d4af upstream. It appears that MSI does not work on either G5 PPC nor on a E5500-based platform, where other hardware is reported to work fine with MSI. Both tests were conducted with NV4x hardware, so perhaps other (or even this) hardware can be made to work. It's still possible to force-enable with config=NvMSI=1 on load. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-08-07drm/nouveau/bar/gf100: fix access to upper half of BAR2Ben Skeggs1-1/+1
commit 38bcb208f60924a031b9f809f7cd252ea4a94e5f upstream. Bit 30 being set causes the upper half of BAR2 to stay in physical mode, mapped over the end of VRAM, even when the rest of the BAR has been set to virtual mode. We inherited our initial value from RM, but I'm not aware of any reason we need to keep it that way. This fixes severe GPU hang/lockup issues revealed by Wayland on F26. Shout-out to NVIDIA for the quick response with the potential cause! Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17drm/nouveau: Don't enabling polling twice on runtime resumeLyude Paul2-2/+6
[ Upstream commit cae9ff036eea577856d5b12860b4c79c5e71db4a ] As it turns out, on cards that actually have CRTCs on them we're already calling drm_kms_helper_poll_enable(drm_dev) from nouveau_display_resume() before we call it in nouveau_pmops_runtime_resume(). This leads us to accidentally trying to enable polling twice, which results in a potential deadlock between the RPM locks and drm_dev->mode_config.mutex if we end up trying to enable polling the second time while output_poll_execute is running and holding the mode_config lock. As such, make sure we only enable polling in nouveau_pmops_runtime_resume() if we need to. This fixes hangs observed on the ThinkPad W541 Signed-off-by: Lyude <lyude@redhat.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Kilian Singer <kilian.singer@quantumtechnology.info> Cc: Lukas Wunner <lukas@wunner.de> Cc: David Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17drm/nouveau/fence/g84-: protect against concurrent access to semaphore buffersBen Skeggs2-0/+7
[ Upstream commit 96692b097ba76d0c637ae8af47b29c73da33c9d0 ] Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-17drm/nouveau: prevent userspace from deleting client objectBen Skeggs1-1/+2
[ Upstream commit c966b6279f610a24ac1d42dcbe30e10fa61220b2 ] Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-06-14drm/nouveau/tmr: fully separate alarm execution/pending listsBen Skeggs2-3/+5
commit b4e382ca7586a63b6c1e5221ce0863ff867c2df6 upstream. Reusing the list_head for both is a bad idea. Callback execution is done with the lock dropped so that alarms can be rescheduled from the callback, which means that with some unfortunate timing, lists can get corrupted. The execution list should not require its own locking, the single function that uses it can only be called from a single context. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: handle races with hw when updating the next alarm timeBen Skeggs1-10/+16
commit 1b0f84380b10ee97f7d2dd191294de9017e94d1d upstream. If the time to the next alarm is short enough, we could race with HW and end up with an ~4 second delay until it triggers. Fix this by checking again after we update HW. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: avoid processing completed alarms when adding a new oneBen Skeggs1-3/+13
commit 330bdf62fe6a6c5b99a647f7bf7157107c9348b3 upstream. The idea here was to avoid having to "manually" program the HW if there's a new earliest alarm. This was lazy and bad, as it leads to loads of fun races between inter-related callers (ie. therm). Turns out, it's not so difficult after all. Go figure ;) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarmBen Skeggs1-7/+10
commit 9fc64667ee48c9a25e7dca1a6bcb6906fec5bcc5 upstream. At least therm/fantog "attempts" to work around this issue, which could lead to corruption of the pending alarm list. Fix it properly by not updating the timestamp without the lock held, or trying to add an already pending alarm to the pending alarm list.... Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/tmr: ack interrupt before processing alarmsBen Skeggs1-1/+1
commit 3733bd8b407211739e72d051e5f30ad82a52c4bc upstream. Fixes a race where we can miss an alarm that triggers while we're already processing previous alarms. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-05-25drm/nouveau/therm: remove ineffective workarounds for alarm bugsBen Skeggs4-4/+4
commit e4311ee51d1e2676001b2d8fcefd92bdd79aad85 upstream. These were ineffective due to touching the list without the alarm lock, but should no longer be required. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/nouveau/mmu/nv4a: use nv04 mmu rather than the nv44 oneIlia Mirkin1-1/+1
commit f94773b9f5ecd1df7c88c2e921924dd41d2020cc upstream. The NV4A (aka NV44A) is an oddity in the family. It only comes in AGP and PCI varieties, rather than a core PCIE chip with a bridge for AGP/PCI as necessary. As a result, it appears that the MMU is also non-functional. For AGP cards, the vast majority of the NV4A lineup, this worked out since we force AGP cards to use the nv04 mmu. However for PCI variants, this did not work. Switching to the NV04 MMU makes it work like a charm. Thanks to mwk for the suggestion. This should be a no-op for NV4A AGP boards, as they were using it already. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-21drm/nouveau/mpeg: mthd returns true on success nowIlia Mirkin2-2/+2
commit 83bce9c2baa51e439480a713119a73d3c8b61083 upstream. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Fixes: 590801c1a3 ("drm/nouveau/mpeg: remove dependence on namedb/engctx lookup") Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09drm/nouveau/nv1a,nv1f/disp: fix memory clock rate retrievalIlia Mirkin1-1/+2
commit 24bf7ae359b8cca165bb30742d2b1c03a1eb23af upstream. Based on the xf86-video-nv code, NFORCE (NV1A) and NFORCE2 (NV1F) have a different way of retrieving clocks. See the nv_hw.c:nForceUpdateArbitrationSettings function in the original code for how these clocks were accessed. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54587 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-02-09drm/nouveau/disp/gt215: Fix HDA ELD handling (thus, HDMI audio) on gt215Alastair Bridgewater1-1/+1
commit d347583a39e2df609a9e40c835f72d3614665b53 upstream. Store the ELD correctly, not just enough copies of the first byte to pad out the given ELD size. Signed-off-by: Alastair Bridgewater <alastair.bridgewater@gmail.com> Fixes: 120b0c39c756 ("drm/nv50-/disp: audit and version SOR_HDA_ELD method") Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09drm/nouveau/i2c/gk110b,gm10x: use the correct implementationBen Skeggs1-2/+2
commit 5b3800a6b763874e4a23702fb9628d3bd3315ce9 upstream. DPAUX registers moved on Kepler, these chipsets were still using the Fermi implementation for some reason. This fixes detection of hotplug/sink IRQs on DP connectors. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09drm/nouveau/fifo/gf100-: protect channel preempt with subdev mutexBen Skeggs2-6/+11
commit b27add13f500469127afdf011dbcc9c649e16e54 upstream. This avoids an issue that occurs when we're attempting to preempt multiple channels simultaneously. HW seems to ignore preempt requests while it's still processing a previous one, which, well, makes sense. Fixes random "fifo: SCHED_ERROR 0d []" + GPCCS page faults during parallel piglit runs on (at least) GM107. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09drm/nouveau/ltc: protect clearing of comptags with mutexBen Skeggs1-0/+2
commit f4e65efc88b64c1dbca275d42a188edccedb56c6 upstream. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09drm/nouveau/bios: require checksum to match for fast acpi shadow methodBen Skeggs3-2/+7
commit 5dc7f4aa9d84ea94b54a9bfcef095f0289f1ebda upstream. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-09drm/nouveau/kms: lvds panel strap moved again on maxwellBen Skeggs1-0/+3
commit 768e847759d551c96e129e194588dbfb11a1d576 upstream. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-10-07drm/nouveau/fifo/nv04: avoid ramht race against cookie insertionIlia Mirkin1-0/+3
commit 666ca3d8f19082f40745d75f3cc7cc0200ee87e3 upstream. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-09-30nouveau: fix nv40_perfctr_next() cleanup regressionArnd Bergmann1-2/+4
commit 86d65b7e7a0c927d07d18605c276d0f142438ead upstream. gcc-6 warns about code in the nouveau driver that is obviously silly: drivers/gpu/drm/nouveau/nvkm/engine/pm/nv40.c: In function 'nv40_perfctr_next': drivers/gpu/drm/nouveau/nvkm/engine/pm/nv40.c:62:19: warning: self-comparison always evaluats to false [-Wtautological-compare] if (pm->sequence != pm->sequence) { The behavior was accidentally introduced in a patch described as "This is purely preparation for upcoming commits, there should be no code changes here.". As far as I can tell, that was true for the rest of that patch except for this one function, which has been changed to a NOP. This patch restores the original behavior. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: 8c1aeaa13954 ("drm/nouveau/pm: cosmetic changes") Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-20drm/nouveau/fbcon: fix font width not divisible by 8Mikulas Patocka3-4/+4
commit 28668f43b8e421634e1623f72a879812288dd06b upstream. The patch f045f459d925 ("drm/nouveau/fbcon: fix out-of-bounds memory accesses") tries to fix some out of memory accesses. Unfortunatelly, the patch breaks the display when using fonts with width that is not divisiable by 8. The monochrome bitmap for each character is stored in memory by lines from top to bottom. Each line is padded to a full byte. For example, for 22x11 font, each line is padded to 16 bits, so each character is consuming 44 bytes total, that is 11 32-bit words. The patch f045f459d925 changed the logic to "dsize = ALIGN(image->width * image->height, 32) >> 5", that is just 8 words - this is incorrect and it causes display corruption. This patch adds the necesary padding of lines to 8 bytes. This patch should be backported to stable kernels where f045f459d925 was backported. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Fixes: f045f459d925 ("drm/nouveau/fbcon: fix out-of-bounds memory accesses") Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-08-20drm/nouveau/gr/nv3x: fix instobj write offsets in gr setupIlia Mirkin2-4/+4
commit d0e62ef6ed257715a88d0e5d7cd850a1695429e2 upstream. This should fix some unaligned access warnings. This is also likely to fix non-descript issues on nv30/nv34 as a result of incorrect channel setup. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96836 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>