summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/nouveau
AgeCommit message (Collapse)AuthorFilesLines
2020-05-22drm/nouveau/kms/gv100-: Add support for interlaced modesLyude Paul2-4/+6
We advertise being able to set interlaced modes, so let's actually make sure to do that. Otherwise, we'll end up hanging the display engine due to trying to set a mode with timings adjusted for interlacing without telling the hardware it's actually an interlaced mode. Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms/nv50-: Probe SOR and PIOR caps for DP interlacing supportLyude Paul15-3/+118
Right now, we make the mistake of allowing interlacing on all connectors. Nvidia hardware does not always support interlacing with DP though, so we need to make sure that we don't allow interlaced modes to be set in such situations as otherwise we'll end up accidentally hanging the display HW. This fixes some hangs with Turing, which would be caused by attempting to set an interlaced mode on hardware that doesn't support it. This patch likely fixes other hardware hanging in the same way as well. Note that we say we probe PIOR caps, but they don't actually have any interlacing caps. So, the get_caps() function for PIORs just sets interlacing support to true. Changes since v1: * Actually probe caps correctly this time, both on EVO and NVDisplay. Changes since v2: * Fix probing for < GF119 * Use vfunc table, in prep for adding more caps in the future. Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms/nv50-: Initialize core channel in nouveau_display_create()Lyude Paul1-1/+4
We'll need the core channel initialized and ready by the time that we start creating modesetting objects, so that we can call the NV507D_GET_CAPABILITIES method to make the hardware expose it's modesetting capabilities for later probing. So, when loading the driver prepare the core channel from within nouveau_display_create(). Everywhere else, we initialize the core channel during resume. Signed-off-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/hda/gv100-: NV_PDISP_SF_AUDIO_CNTRL0 register movedBen Skeggs5-2/+35
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/hda/gf119-: select HDA device entry based on bound headBen Skeggs1-3/+4
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/hda/gf119-: add HAL for programming device entry in SFBen Skeggs8-2/+17
Register has moved on GV100. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/hda/gt215-: pass head to nvkm_ior.hda.eld()Ben Skeggs4-6/+6
We're going to use the bound head to select HDA device entry. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/nv50-: increase timeout on pio channel free() pollingBen Skeggs1-1/+1
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Fix regression by audio component transitionTakashi Iwai1-6/+10
Since the commit 742db30c4ee6 ("drm/nouveau: Add HD-audio component notifier support"), the nouveau driver notifies and pokes the HD-audio HPD and ELD via audio component, but this seems broken. The culprit is the naive assumption that crtc->index corresponds to the HDA pin. Actually this rather corresponds to the MST dev_id (alias "pipe" in the audio component framework) while the actual port number is given from the output ior id number. This patch corrects the assignment of port and dev_id arguments in the audio component ops to recover from the HDMI/DP audio regression. Fixes: 742db30c4ee6 ("drm/nouveau: Add HD-audio component notifier support") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207223 Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/device: use regular PRI accessors in chipset detectionBen Skeggs1-18/+13
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/device: detect vGPUsKarol Herbst1-3/+12
Using ENODEV as this prevents probe failed errors in dmesg. v2: move check further down Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/device: detect if changing endianness failedKarol Herbst1-5/+21
v2: relax the checks a little Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/device: rework mmio mapping code to get rid of second mapKarol Herbst1-12/+15
Fixes warnings on GPUs with smaller a smaller mmio region like vGPUs. Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/mmu: Remove unneeded semicolonZheng Bin2-2/+2
Fixes coccicheck warning: drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h:307:2-3: Unneeded semicolon drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.c:583:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau: Use generic helper to check _PR3 presenceKai-Heng Feng1-34/+10
Replace nouveau_pr3_present() in favor of a more generic one, pci_pr3_present(). Also the presence of upstream bridge _PR3 doesn't need to go hand in hand with device's _DSM, so check _PR3 before _DSM. Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/acr: Use kmemdup instead of kmalloc and memcpyZou Wei1-8/+4
Fixes coccicheck warning: drivers/gpu/drm/nouveau/nvkm/subdev/acr/hsfw.c:103:23-30: WARNING opportunity for kmemdup drivers/gpu/drm/nouveau/nvkm/subdev/acr/hsfw.c:113:22-29: WARNING opportunity for kmemdup Fixes: 22dcda45a3d1 ("drm/nouveau/acr: implement new subdev to replace "secure boot"") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zou Wei <zou_wei@huawei.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/core/memory: remove redundant assignments to variable retColin Ian King1-1/+1
The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/svm: map pages after migrationRalph Campbell4-17/+95
When memory is migrated to the GPU, it is likely to be accessed by GPU code soon afterwards. Instead of waiting for a GPU fault, map the migrated memory into the GPU page tables with the same access permissions as the source CPU page table entries. This preserves copy on write semantics. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: "Jérôme Glisse" <jglisse@redhat.com> Cc: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/disp/gv100-: expose capabilities classBen Skeggs6-0/+69
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/bios: move ACPI _ROM handlingBen Skeggs3-76/+47
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau: remove open-coded version of remove_conflicting_pci_framebuffers()Ben Skeggs1-28/+3
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/gr/gk20a: move MODULE_FIRMWARE firmware definitionsBen Skeggs2-11/+11
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/ibus: use nvkm_subdev_new_()Ben Skeggs6-30/+6
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/core: add nvkm_subdev_new_() for bare subdevsBen Skeggs2-0/+13
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Support NVIDIA format modifiersJames Jones3-9/+104
Allow setting the block layout of a nouveau FB object using DRM format modifiers. When specified, the format modifier block layout and kind overrides the GEM buffer's implicit layout and kind. The specified format modifier is validated against the list of modifiers supported by the target display hardware. v2: Used Tesla family instead of NV50 chipset compare v4: Do not cache kind, tile_mode in nouveau_framebuffer v5: Resolved against nouveau_framebuffer cleanup Signed-off-by: James Jones <jajones@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Check framebuffer size against boJames Jones1-0/+98
Make sure framebuffer dimensions and tiling parameters will not result in accesses beyond the end of the GEM buffer they are bound to. v3: Return EINVAL when creating FB against BO with unsupported tiling v5: Resolved against nouveau_framebuffer cleanup Signed-off-by: James Jones <jajones@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Add format mod prop to base/ovly/nvdispJames Jones6-4/+112
Advertise support for the full list of format modifiers supported by each class of NVIDIA desktop GPU display hardware. Stash the array of modifiers in the nouveau_display struct for use when validating userspace framebuffer creation requests, which will be supportd in a subsequent change. Signed-off-by: James Jones <jajones@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/acr: ensure falcon providing acr functions is bootstrapped firstBen Skeggs1-0/+5
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Remove struct nouveau_framebufferThomas Zimmermann4-38/+28
After its cleanup, struct nouveau_framebuffer is only a wrapper around struct drm_framebuffer. Use the latter directly. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Remove field nvbo from struct nouveau_framebufferThomas Zimmermann7-84/+75
The buffer object stored in nvbo is also available GEM object in obj[0] of struct drm_framebuffer. Therefore remove nvbo in favor obj[0] and replace all references accordingly. This may require an additional cast. With this change we can already replace nouveau_user_framebuffer_destroy() and nouveau_user_framebuffer_create_handle() with generic GEM helpers. Calls to nouveau_framebuffer_new() receive a GEM object. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Move struct nouveau_framebuffer.vma to struct nouveau_fbdevThomas Zimmermann5-14/+14
The vma field of struct nouveau_framebuffer is a special field for the the accelerated fbdev console. Hence there's at most one single instance for the active console. Moving it into struct nouveau_fbdev makes struct nouveau_framebuffer slightly smaller and brings it closer to struct drm_framebuffer. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau/kms: Remove unused fields from struct nouveau_framebufferThomas Zimmermann1-5/+0
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-22drm/nouveau: fix out-of-tree module buildBen Skeggs1-4/+6
The $(srctree) addition a while back busted building the out-of-tree version of the module, and I've been hacking it up ever since. This allows us to work around the issue. Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-05-20drm/nouveau: remove _unlocked suffix in drm_gem_object_put_unlockedEmil Velikov4-13/+13
Spelling out _unlocked for each and every driver is a annoying. Especially if we consider how many drivers, do not know (or need to) about the horror stories involving struct_mutex. Just drop the suffix. It makes the API cleaner. Done via the following script: __from=drm_gem_object_put_unlocked __to=drm_gem_object_put for __file in $(git grep --name-only $__from); do sed -i "s/$__from/$__to/g" $__file; done Cc: Ben Skeggs <bskeggs@redhat.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-26-emil.l.velikov@gmail.com
2020-05-11mm/hmm: remove the customizable pfn format from hmm_range_faultJason Gunthorpe3-56/+61
Presumably the intent here was that hmm_range_fault() could put the data into some HW specific format and thus avoid some work. However, nothing actually does that, and it isn't clear how anything actually could do that as hmm_range_fault() provides CPU addresses which must be DMA mapped. Perhaps there is some special HW that does not need DMA mapping, but we don't have any examples of this, and the theoretical performance win of avoiding an extra scan over the pfns array doesn't seem worth the complexity. Plus pfns needs to be scanned anyhow to sort out any DEVICE_PRIVATE pages. This version replaces the uint64_t with an usigned long containing a pfn and fixed flags. On input flags is filled with the HMM_PFN_REQ_* values, on successful output it is filled with HMM_PFN_* values, describing the state of the pages. amdgpu is simple to convert, it doesn't use snapshot and doesn't use per-page flags. nouveau uses only 16 hmm_pte entries at most (ie fits in a few cache lines), and it sweeps over its pfns array a couple of times anyhow. It also has a nasty call chain before it reaches the dma map and hardware suggesting performance isn't important: nouveau_svm_fault(): args.i.m.method = NVIF_VMM_V0_PFNMAP nouveau_range_fault() nvif_object_ioctl() client->driver->ioctl() struct nvif_driver nvif_driver_nvkm: .ioctl = nvkm_client_ioctl nvkm_ioctl() nvkm_ioctl_path() nvkm_ioctl_v0[type].func(..) nvkm_ioctl_mthd() nvkm_object_mthd() struct nvkm_object_func nvkm_uvmm: .mthd = nvkm_uvmm_mthd nvkm_uvmm_mthd() nvkm_uvmm_mthd_pfnmap() nvkm_vmm_pfn_map() nvkm_vmm_ptes_get_map() func == gp100_vmm_pgt_pfn struct nvkm_vmm_desc_func gp100_vmm_desc_spt: .pfn = gp100_vmm_pgt_pfn nvkm_vmm_iter() REF_PTES == func == gp100_vmm_pgt_pfn() dma_map_page() Link: https://lore.kernel.org/r/5-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11mm/hmm: remove HMM_PFN_SPECIALJason Gunthorpe1-1/+0
This is just an alias for HMM_PFN_ERROR, nothing cares that the error was because of a special page vs any other error case. Link: https://lore.kernel.org/r/4-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-11mm/hmm: make hmm_range_fault return 0 or -1Jason Gunthorpe1-3/+3
hmm_vma_walk->last is supposed to be updated after every write to the pfns, so that it can be returned by hmm_range_fault(). However, this is not done consistently. Fortunately nothing checks the return code of hmm_range_fault() for anything other than error. More importantly last must be set before returning -EBUSY as it is used to prevent reading an output pfn as an input flags when the loop restarts. For clarity and simplicity make hmm_range_fault() return 0 or -ERRNO. Only set last when returning -EBUSY. Link: https://lore.kernel.org/r/2-v2-b4e84f444c7d+24f57-hmm_no_flags_jgg@mellanox.com Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22Merge tag 'drm-misc-next-2020-04-14' of ↵Dave Airlie3-11/+7
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.8: UAPI Changes: - drm: error out with EBUSY when device has existing master - drm: rework SET_MASTER and DROP_MASTER perm handling Cross-subsystem Changes: - mm: export two symbols from slub/slob - fbdev: savage: fix -Wextra build warning - video: omap2: Use scnprintf() for avoiding potential buffer overflow Core Changes: - Remove drm_pci.h - drm_pci_{alloc/free)() are now legacy - Introduce managed DRM resourcesA - Allow drivers to subclass struct drm_framebuffer - Introduce struct drm_afbc_framebuffer and helpers - fbdev: remove return value from generic fbdev setup - Introduce simple-encoder helper - vram-helpers: set fence on plane - dp_mst: ACT timeout improvements - dp_mst: Remove drm_dp_mst_has_audio() - TTM: ttm_trace_dma_{map/unmap}() cleanups - dma-buf: add flag for PCIP2P support - EDID: Various improvements - Encoder: cleanup semantics of possible_clones and possible_crtcs - VBLANK documentation updates - Writeback documentation updates Driver Changes: - Convert several drivers to i2c_new_client_device() - Drop explicit drm_mode_config_cleanup() calls from drivers - Auto-release device structures with drmm_add_final_kfree() - Init bfdev console after registering DRM device - Make various .debugfs functions return 0 unconditionally; ignore errors - video: Use scnprintf() to avoid buffer overflows - Convert drivers to simple encoders - drm/amdgpu: note that we can handle peer2peer DMA-buf - drm/amdgpu: add support for exporting VRAM using DMA-buf v3 - drm/kirin: Revert change to register connectors - drm/lima: Add optional devfreq and cooling device support - drm/lima: Various improvements wrt. task handling - drm/panel: nt39016: Support multiple modes and 50Hz - drm/panel: Support Leadtek LTK050H3146W - drm/rockchip: Add support for afbc - drm/virtio: Various cleanups - drm/hisilicon/hibmc: Enforce 128-byte stride alignment - drm/qxl: Fix notify port address of cursor ring buffer - drm/sun4i: Improvements to format handling - drm/bridge: dw-hdmi: Various improvements Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200414090738.GA16827@linux-uq9g
2020-04-17Merge drm/drm-next into drm-misc-nextThomas Zimmermann24-56/+276
Backmerging required to pull topic/phy-compliance. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2020-04-16Merge branch 'linux-5.7' of git://github.com/skeggsb/linux into drm-fixesDave Airlie2-0/+19
Add missing module firmware for turings. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Ben Skeggs <skeggsb@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/ <CACAvsv4njTRpiNqOC54iRjpd=nu3pBG8i_fp8o_dp7AZE6hFWA@mail.gmail.com
2020-04-16drm/nouveau/sec2/gv100-: add missing MODULE_FIRMWARE()Ben Skeggs2-0/+19
ASB was failing to load on Turing GPUs when firmware is being loaded from initramfs, leaving the GPU in an odd state and causing suspend/ resume to fail. Add missing MODULE_FIRMWARE() lines for initramfs generators. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Cc: <stable@vger.kernel.org> # 5.6
2020-04-08Merge tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drmLinus Torvalds23-50/+263
Pull drm fixes from Dave Airlie: "This is a set of fixes that have queued up, I think I might have another pull with some more before rc1 but I'd like to dequeue what I have now just in case Easter is more eggciting that expected. The main thing in here is a fix for a longstanding nouveau power management issues on certain laptops, it should help runtime suspend/resume for a lot of people. There is also a reverted patch for some drm_mm behaviour in atomic contexts. Summary: core: - revert drm_mm atomic patch - dt binding fixes fbcon: - null ptr error fix i915: - GVT fixes nouveau: - runpm fix - svm fixes amdgpu: - HDCP fixes - gfx10 fix - Misc display fixes - BACO fixes amdkfd: - Fix memory leak vboxvideo: - remove conflicting fbs vc4: - mode validation fix xen: - fix PTR_ERR usage" * tag 'drm-next-2020-04-08' of git://anongit.freedesktop.org/drm/drm: (41 commits) drm/nouveau/kms/nv50-: wait for FIFO space on PIO channels drm/nouveau/nvif: protect waits against GPU falling off the bus drm/nouveau/nvif: access PTIMER through usermode class, if available drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during init drm/nouveau: workaround runpm fail by disabling PCI power management on certain intel bridges drm/nouveau/svm: remove useless SVM range check drm/nouveau/svm: check for SVM initialized before migrating drm/nouveau/svm: fix vma range check for migration drm/nouveau: remove checks for return value of debugfs functions drm/nouveau/ttm: evict other IO mappings when running out of BAR1 space drm/amdkfd: kfree the wrong pointer drm/amd/display: increase HDCP authentication delay drm/amd/display: Correctly cancel future watchdog and callback events drm/amd/display: Don't try hdcp1.4 when content_type is set to type1 drm/amd/powerplay: move the ASIC specific nbio operation out of smu_v11_0.c drm/amd/powerplay: drop redundant BIF doorbell interrupt operations drm/amd/display: Fix dcn21 num_states drm/amd/display: Enable BT2020 in COLOR_ENCODING property drm/amd/display: LFC not working on 2.0x range monitors (v2) drm/amd/display: Support plane level CTM ...
2020-04-07drm/nouveau/kms/nv50-: wait for FIFO space on PIO channelsBen Skeggs3-6/+25
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/nvif: protect waits against GPU falling off the busBen Skeggs11-22/+102
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/nvif: access PTIMER through usermode class, if availableBen Skeggs3-5/+24
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/gr/gp107,gp108: implement workaround for HW hanging during initBen Skeggs1-0/+26
Certain boards with GP107/GP108 chipsets hang (often, but randomly) for unknown reasons during GR initialisation. The first tell-tale symptom of this issue is: nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 409800 [ TIMEOUT ] appearing in dmesg, likely followed by many other failures being logged. Karol found this WAR for the issue a while back, but efforts to isolate the root cause and proper fix have not yielded success so far. I've modified the original patch to include a few more details, limit it to GP107/GP108 by default, and added a config option to override this choice. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com>
2020-04-07drm/nouveau: workaround runpm fail by disabling PCI power management on ↵Karol Herbst2-0/+65
certain intel bridges Fixes the infamous 'runtime PM' bug many users are facing on Laptops with Nvidia Pascal GPUs by skipping said PCI power state changes on the GPU. Depending on the used kernel there might be messages like those in demsg: "nouveau 0000:01:00.0: Refused to change power state, currently in D3" "nouveau 0000:01:00.0: can't change power state from D3cold to D0 (config space inaccessible)" followed by backtraces of kernel crashes or timeouts within nouveau. It's still unkown why this issue exists, but this is a reliable workaround and solves a very annoying issue for user having to choose between a crashing kernel or higher power consumption of their Laptops. Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Rafael J. Wysocki <rjw@rjwysocki.net> Cc: Mika Westerberg <mika.westerberg@intel.com> Cc: linux-pci@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205623 Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/svm: remove useless SVM range checkRalph Campbell1-3/+0
When nouveau processes GPU faults, it checks to see if the fault address falls within the "unmanaged" range which is reserved for fixed allocations instead of addresses chosen by the core mm code. If start is greater than or equal to svmm->unmanaged.limit, then limit will also be greater than svmm->unmanaged.limit which is greater than svmm->unmanaged.start and the start = max_t(u64, start, svmm->unmanaged.limit) will change nothing. Just remove the useless lines of code. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/svm: check for SVM initialized before migratingRalph Campbell1-0/+5
When migrating system memory to GPU memory, check that SVM has been enabled. Even though most errors can be ignored since migration is a performance optimization, return an error because this is a violation of the API. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2020-04-07drm/nouveau/svm: fix vma range check for migrationRalph Campbell1-0/+1
find_vma_intersection(mm, start, end) only guarantees that end is greater than or equal to vma->vm_start but doesn't guarantee that start is greater than or equal to vma->vm_start. The calculation for the intersecting range in nouveau_svmm_bind() isn't accounting for this and can call migrate_vma_setup() with a starting address less than vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for the range instead of nouveau skipping that part of the range and migrating the rest. Signed-off-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>