Age | Commit message (Collapse) | Author | Files | Lines |
|
For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.
This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:
a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation)
After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.
For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.
So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.
As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.15, we'll save that for a later patch.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
- Fixes addition of stolen memory base address to PTEs.
- Removes support for compression.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Pierre Moreau <pierre.morrow@free.fr>
|
|
Commit bbb163e18960 ("drm/nouveau/bar: implement bar1 teardown")
introduced add a teardown helper function for BAR1. During
initialisation of the Nouveau, initially all the teardown helpers are
called once, before calling their init counterparts. For gk20a, after
the BAR1 teardown function is called, the device is hanging during the
initialisation of the FB sub-device. At this point it is unclear why
this is happening and this is still under investigation. However, this
change is preventing Tegra124 devices from booting when Nouveau is
enabled. To allow Tegra124 to boot, remove the teardown helper for
gk20a.
This is based upon a previous patch by Guillaume Tucker but limits
the workaround to only gk20a GPUs.
Fixes: bbb163e18960 ("drm/nouveau/bar: implement bar1 teardown")
Reported-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This is obviously wrong in the current code. Make sure to record the
correct size of the arguments and pass the actual arguments to the
nvif_object_map_handle() function.
Suggested-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Fixes broken dp on GF119:
Call Trace:
? nvkm_dp_train_drive+0x183/0x2c0 [nouveau]
nvkm_dp_acquire+0x4f3/0xcd0 [nouveau]
nv50_disp_super_2_2+0x5d/0x470 [nouveau]
? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
gf119_disp_super+0x19c/0x2f0 [nouveau]
process_one_work+0x193/0x3c0
worker_thread+0x35/0x3b0
kthread+0x125/0x140
? process_one_work+0x3c0/0x3c0
? kthread_park+0x60/0x60
ret_from_fork+0x25/0x30
Code: Bad RIP value.
RIP: (null) RSP: ffffb1e243e4bc38
CR2: 0000000000000000
Fixes: af85389c614a drm/nouveau/disp: shuffle functions around
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103421
Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
kernel.org bz#198221.
Reported-by: Petr Vandrovec <petr@vandrovec.name>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
fdo#104340.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Fixes bug on Tegra where we'd strip kind information from system memory
(ie. all) buffers, resulting in misrendering.
Behaviour on dGPU should be unchanged.
Reported-by: Thierry Reding <treding@nvidia.com>
Fixes: d7722134b8 ("drm/nouveau: switch over to new memory and vmm interfaces")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Thierry Reding <treding@nvidia.com>
|
|
While the Tegra (GK20A, GM20B, GP10B) MMUs support large pages in host
memory, we're currently lacking IOMMU support for merging system pages
into large enough chunks to be mapped as such by the GPU.
The core VMM code actually supports automatically determining the best
page size to map with, which is intended for these situations, but for
various complicated reasons the DRM is currently forcing the page size
selection on a per-BO basis.
This should fix breakage reported on Tegra GPUs in the meantime, until
one or both of the above issues are resolved properly.
Reported-by: Mikko Perttunen <cyndis@kapsi.fi>
Fixes: 7dc6a446da7c ("drm/nouveau: improve selection of GPU page size")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Thierry Reding <treding@nvidia.com>
|
|
Reported-by: Mikko Perttunen <cyndis@kapsi.fi>
Fixes: 6359c98224 ("drm/nouveau/mmu/gp10b: fork from gf100")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Tested-by: Thierry Reding <treding@nvidia.com>
|
|
On my GP107 when I load nouveau after unloading it, for some reason the
GPU stopped sending or the CPU stopped receiving interrupts if MSI was
enabled.
Doing a rearm once before getting any interrupts fixes this.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Reported-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
When the fbcon object is initialized, but nouveau_fbcon_create is not
called, we run into a NULL pointer access within nouveau_fbcon_create when
unloading nouveau.
The call to drm_fb_helper_funcs.fb_probe is deferred until there is a
display for real since 4.14, that's why fbcon->helper.fb is still not set.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 1260018
Addresses-Coverity-ID: 1260019
Addresses-Coverity-ID: 1260022
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 143119
Addresses-Coverity-ID: 143120
Addresses-Coverity-ID: 143121
Addresses-Coverity-ID: 143122
Addresses-Coverity-ID: 143123
Addresses-Coverity-ID: 143124
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Don't populate arrays hwsq_signature and edid_sig on the stack but
instead make them static. Makes the object code smaller by over 190
bytes:
Before:
text data bss dec hex filename
35676 3312 64 39052 988c nouveau_bios.o
After:
text data bss dec hex filename
35319 3472 64 38855 97c7 nouveau_bios.o
(gcc version 7.2.0 x86_64)
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Rounding value is guaranteed to be power-of-two, so this is better
anyway.
Fixes build on 32-bit.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Enables the use of Pascal's 2MiB pages for larger buffers.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
VMAs are about to not take references on the VMM they belong to, which
means more care is required when handling delayed unmapping.
Queuing it on the client workqueue ensures all pending VMA unmaps will
have completed before the VMM is destroyed.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
This is already handled in the top-level gem_new() ioctl in another manner,
but this will be removed in a future commit.
Ideally we'd not need to check up-front at all, and let the VMM code handle
error checking, but there are paths in the current BO management code where
this isn't possible due to map() not always being called during BO creation,
and map() calls not being allowed to fail during buffer migration.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
If the VMA is being deleted, we don't need to explicity unmap first
anymore. The MMU code will automatically merge the operations into
a single page tree walk.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Could be useful for if/when a future GPU removes support for the GF100
PT layout.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|
|
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
|