path: root/drivers/gpu
Age | Commit message | Author | Files | Lines
2019-08-20 | drm/i915: Fix DP-MST crtc_mask | Ville Syrjälä | 1 | -1/+1
Each fake MST encoder is tied to a specific pipe. Fix the encoder's crtc_mask to reflect that fact. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-16-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: update DMC firmware to 2.04 | Lucas De Marchi | 1 | -2/+2
2 important fixes:
- vblank counter is now working
- PSR1 is working

Cc: Jose Souza <jose.souza@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-5-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: Move transcoders to pipes' powerwells | José Roberto de Souza | 1 | -2/+2
When trying to read registers from transcoder C and D while PG3 is ON it causes unclaimed access warnings. Adding the powerwells for the pipes fixes the issue, but doesn't match the spec. Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-4-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: add support for reading the timestamp frequency | Michel Thierry | 1 | -1/+1
There are no changes with respect to GEN11, which Paulo wrote. This gets rid of the "Missing switch case in read_timestamp_frequency" message at boot for Tiger Lake. [ Lucas: BSpec: 10742 and 9024, but there's a mismatch on the values. Let's say a glitch in the spec. Tested locally and it works. ] Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-3-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: disable DDIC | Lucas De Marchi | 1 | -2/+1
The current SKUs added for Tiger Lake don't have DDIC hooked up, even though it is supported by the SoC. The current state for these SKUs is problematic since while enabling the combo phy, PORT_COMP_DW* return 0xFFFFFFFF, which is invalid per register definition. During initialization we check what phys are not yet enabled by reading PHY_MISC_C and try to enable it by toggling the "DE to IO Comp Pwr Down" bit. But after that any read to the PORT_COMP_DW* returns invalid results.

This removes the following warning:

[56997.634353] Missing case (val == 4294967295)
[56997.639241] WARNING: CPU: 5 PID: 768 at drivers/gpu/drm/i915/display/intel_combo_phy.c:54 cnl_get_procmon_ref_values+0xc9/0xf0 [i915]
[56997.639808] Modules linked in: i915(+) prime_numbers x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e [last unloaded: prime_numbers]
[56997.639808] CPU: 5 PID: 768 Comm: insmod Tainted: G U W 5.2.0-demarchi+ #65
[56997.639808] Hardware name: Intel Corporation Tiger Lake Client Platform/TigerLake U DDR4 SODIMM RVP, BIOS TGLSFWI1.R00.2252.A03.1906270154 06/27/2019
[56997.639808] RIP: 0010:cnl_get_procmon_ref_values+0xc9/0xf0 [i915]
[56997.639808] Code: 2c a0 85 c9 74 e0 81 f9 00 00 00 01 75 09 48 c7 c0 0c a4 2c a0 eb cf 48 c7 c6 3c 3a 31 a0 48 c7 c7 40 3a 31 a0 e8 6b 4d ea e0 <0f> 0b 48 c7 c0 00 a4 2c a0 eb b1 48 c7 c0 24 a4 2 c a0 eb a8 e8 be
[56997.639808] RSP: 0018:ffffc9000068f8a8 EFLAGS: 00010286
[56997.639808] RAX: 0000000000000000 RBX: ffff88848fa90000 RCX: 0000000000000000
[56997.639808] RDX: ffff8884a08b5ef8 RSI: ffff8884a08a6658 RDI: 00000000ffffffff
[56997.639808] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000000
[56997.639808] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88848fa90000
[56997.639808] R13: 0000000000000000 R14: 0000000000000002 R15: 0006c00000162000
[56997.639808] FS: 00007f61ca3d12c0(0000) GS:ffff8884a0880000(0000) knlGS:0000000000000000
[56997.639808] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56997.639808] CR2: 00007f71be6a92c0 CR3: 0000000494750006 CR4: 0000000000760ee0
[56997.639808] PKRU: 55555554
[56997.639808] Call Trace:
[56997.639808] cnl_verify_procmon_ref_values+0x36/0xf0 [i915]
[56997.639808] ? rcu_read_lock_sched_held+0x6f/0x80
[56997.639808] ? gen11_fwtable_read32+0x257/0x290 [i915]
[56997.639808] icl_combo_phy_verify_state.part.0+0x22/0xa0 [i915]
[56997.639808] intel_combo_phy_init+0x17e/0x3e0 [i915]
[56997.639808] ? icl_display_core_init+0x2c/0x1a0 [i915]
[56997.639808] ? _raw_spin_unlock_irqrestore+0x4c/0x60
[56997.639808] icl_display_core_init+0x34/0x1a0 [i915]
[56997.639808] intel_power_domains_init_hw+0x200/0x570 [i915]
[56997.639808] i915_driver_probe+0x103b/0x17e0 [i915]
[56997.639808] ? printk+0x53/0x6a
[56997.639808] i915_pci_probe+0x3b/0x190 [i915]

We may or may not need to change the implementation to account for DDIC being available on other SKUs. For now I think the best thing to do is to just disable the port.

Cc: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190814235517.10032-1-lucas.demarchi@intel.com
2019-08-20 | drm/i915: Update DRIVER_DATE to 20190820 | Rodrigo Vivi | 1 | -2/+2
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2019-08-20 | drm/i915/gtt: Relax pd_used assertion | Chris Wilson | 1 | -1/+2
The current assertion tries to make sure that we do not over count the number of used PDE inside a page directory -- that is with an array of 512 pde, we do not expect more than 512 elements used! However, our assertion has to take into account that as we pin an element into the page directory, the caller first pins the page directory so the usage count is one higher. However, this should be one extra pin per thread, and the upper bound is that we may have one thread for each entry. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820141218.14714-1-chris@chris-wilson.co.uk
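The relaxed bound described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's code: `PD_ENTRIES` and `pd_used_in_bounds()` are hypothetical names, and the factor of two is derived only from the commit text (one count per used entry, plus at most one temporary pin per inserting thread, with at most one thread per entry).

```c
#include <stdbool.h>

/* Hypothetical constant: a page directory holds 512 entries. */
#define PD_ENTRIES 512u

/*
 * Each used entry counts once, and each thread inserting an entry holds
 * one extra temporary pin on the directory. With at most one such thread
 * per entry, the usage count is bounded by 2 * PD_ENTRIES.
 */
static bool pd_used_in_bounds(unsigned int used)
{
	return used <= 2u * PD_ENTRIES;
}
```

A count of 513 would trip a strict "at most 512" assertion but passes this relaxed check, which matches the situation the commit describes.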
2019-08-20 | drm/i915: Dynamically allocate s0ix struct for VLV | Daniele Ceraolo Spurio | 2 | -70/+106
This is only required for a single platform so no need to reserve the memory on all of them. This removes the last direct dependency of i915_drv.h on i915_reg.h (apart from the i915_reg_t definition). v2: drop unneeded diff, keep the vlv prefix, call functions unconditionally (Jani), fwd declaration of the struct (Chris) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190820020147.5667-1-daniele.ceraolospurio@intel.com
2019-08-20 | drm/i915/tgl: Gen12 render context size | Daniele Ceraolo Spurio | 1 | -0/+1
Re-use Gen11 context size for now. [ Lucas: this is a temporary enabling patch that needs to be confirmed: we need to check BSpec 46255 and recompute ] Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-27-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: Updated Private PAT programming | Michel Thierry | 2 | -1/+17
Gen12 removes the target-cache and age fields from the private PAT because MOCS now has the capability to set these itself. Only the memory-type field should be programmed in the PPAT; the remaining bits are reserved. Since there are now only 4 possible combinations, we could set only 4 PPAT entries and leave the remaining 4 as UC, but I left them as WB as we used to have before. Also these registers have been relocated to the 0x4800-0x481c range. HSDES: 1406402661 BSpec: 31654 Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-33-lucas.demarchi@intel.com
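To make the layout in the commit above concrete, here is a rough sketch of eight private PAT entries in the 0x4800-0x481c range, each carrying only a memory type. The register range comes from the commit message; the 4-byte stride follows from fitting eight registers into that range. The enum encodings, the ordering of the first four entries, and all function names are assumptions made for illustration.

```c
#include <stdint.h>

/* Assumed encodings for the memory-type field; not taken from the commit. */
enum ppat_mem_type { PPAT_UC = 0, PPAT_WC = 1, PPAT_WT = 2, PPAT_WB = 3 };

#define PPAT_BASE  0x4800u /* first register, from the commit message */
#define PPAT_COUNT 8u      /* 0x4800..0x481c in 4-byte steps */

/* Register offset of private PAT entry `index` (0..7). */
static uint32_t ppat_offset(unsigned int index)
{
	return PPAT_BASE + 4u * index;
}

/*
 * Four distinct memory types first, then WB for the remaining four
 * entries, as the commit says it chose. The ordering is illustrative.
 */
static enum ppat_mem_type ppat_value(unsigned int index)
{
	static const enum ppat_mem_type distinct[] = {
		PPAT_WB, PPAT_WC, PPAT_WT, PPAT_UC,
	};

	return index < 4 ? distinct[index] : PPAT_WB;
}
```

With this layout the last entry lands at 0x4800 + 7 * 4 = 0x481c, the upper end of the range quoted in the commit.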
2019-08-20 | drm/i915/tgl: Introduce initial Tiger Lake workarounds | Lucas De Marchi | 3 | -4/+27
Add empty workaround hooks for Tiger Lake. The workarounds will be added on separate patches. We were already applying WaRsForcewakeAddDelayForAck, which is indeed still valid, so also update the comment. Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-21-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: Gen12 csb support | Daniele Ceraolo Spurio | 1 | -2/+79
The CSB format has been reworked for Gen12 to include information on both the context we're switching away from and the context we're switching to. After the change, some of the events don't have their own bit anymore and need to be inferred from other values in the csb. One of the context IDs (0x7FF) has also been reserved to indicate the invalid ctx, i.e. engine idle. Note that the full context ID includes the SW counter as well, but since we currently only care if the context is valid or not we can ignore that part. v2: fix mask size, fix and expand comments (Tvrtko), use if-ladder (Chris) Bspec: 45555, 46144 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190820102201.29849-1-chris@chris-wilson.co.uk
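The idea of inferring events from the "away" and "to" context IDs, as described in the commit above, might look roughly like this. Only the reserved ID 0x7FF and the use of an if-ladder come from the commit text; the event names, the function names, and the assumption that the ID sits in the low 11 bits are hypothetical. The real CSB also carries status bits that this sketch ignores.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTX_ID_MASK    0x7FFu /* assumed: ID in the low 11 bits */
#define CTX_ID_INVALID 0x7FFu /* reserved: no context, engine idle */

/* Hypothetical event names for illustration only. */
enum csb_event {
	CSB_IDLE_TO_ACTIVE,
	CSB_ACTIVE_TO_IDLE,
	CSB_SWITCH,
	CSB_NONE,
};

static bool ctx_id_valid(uint32_t id)
{
	return (id & CTX_ID_MASK) != CTX_ID_INVALID;
}

/* Infer the event from the two IDs instead of a dedicated event bit. */
static enum csb_event csb_classify(uint32_t away_id, uint32_t to_id)
{
	bool away = ctx_id_valid(away_id);
	bool to = ctx_id_valid(to_id);

	if (!away && to)
		return CSB_IDLE_TO_ACTIVE;
	else if (away && !to)
		return CSB_ACTIVE_TO_IDLE;
	else if (away && to)
		return CSB_SWITCH;
	else
		return CSB_NONE;
}
```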
2019-08-20 | drm/i915/tgl: add GEN12_MAX_CONTEXT_HW_ID | Daniele Ceraolo Spurio | 2 | -1/+5
Like Gen11, Gen12 has 11 available bits for the ctx id field. However, the last value (0x7FF) is reserved to indicate engine idle, so we need to reduce the maximum number of contexts by 1 compared to Gen11. Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-29-lucas.demarchi@intel.com
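The arithmetic in the commit above is small enough to write out. In this sketch `CTX_ID_BITS`, `CTX_ID_IDLE` and `max_context_hw_id()` are hypothetical names; only the 11-bit width and the reserved value 0x7FF are taken from the commit message.

```c
#define CTX_ID_BITS 11u
#define CTX_ID_IDLE ((1u << CTX_ID_BITS) - 1u) /* 0x7FF, reserved on Gen12 */

/* Number of context IDs software may hand out on a given generation. */
static unsigned int max_context_hw_id(int gen)
{
	unsigned int ids = 1u << CTX_ID_BITS; /* 2048 encodable values */

	/* Gen12 gives up the last encoding to mean "engine idle". */
	return gen >= 12 ? ids - 1u : ids;
}
```

Under these assumptions Gen11 can use 2048 IDs and Gen12 can use 2047.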
2019-08-20 | drm/i915/tgl: add Gen12 default indirect ctx offset | Daniele Ceraolo Spurio | 2 | -0/+5
Gen12 uses a new indirect ctx offset. Bspec: 11740 Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-28-lucas.demarchi@intel.com
2019-08-20 | drm/i915/tgl: Report valid VDBoxes with SFC capability | Michel Thierry | 1 | -1/+2
In Gen11, only even numbered "logical" VDBoxes are hooked up to a SFC (Scaler & Format Converter) unit. This is not the case in Tigerlake, where each VDBox can access a SFC. We will use this information to decide when the SFC units need to be reset and also pass it to the GuC. Bspec: 48077 Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190731004902.34672-5-daniele.ceraolospurio@intel.com
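The rule in the commit above (even-numbered logical VDBoxes on Gen11, every VDBox on Gen12) can be sketched as a mask computation. The function name and the bitmask representation are assumptions for illustration, as is the reading of "logical" numbering as counting only the VDBoxes that are present.

```c
#include <stdint.h>

/*
 * Return a bitmask of the physical VDBoxes that can reach an SFC.
 * `vdbox_enabled` has bit i set when physical VDBox i is present.
 * "Logical" numbering here counts only the present VDBoxes, in order.
 */
static uint32_t vdbox_sfc_mask(int gen, uint32_t vdbox_enabled)
{
	uint32_t sfc = 0;
	unsigned int logical = 0;

	for (unsigned int i = 0; i < 32; i++) {
		if (!(vdbox_enabled & (1u << i)))
			continue;
		/* Gen12: every VDBox; Gen11: even logical VDBoxes only. */
		if (gen >= 12 || logical % 2 == 0)
			sfc |= 1u << i;
		logical++;
	}
	return sfc;
}
```

For example, with physical VDBoxes 0 and 2 present, Gen11 reports only VDBox 0 (logical 0) as SFC-capable, while Gen12 reports both.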
2019-08-20 | drm/i915: Be defensive when starting vma activity | Chris Wilson | 2 | -2/+9
Before we acquire the vma for GPU activity, ensure that the underlying object is not already in the process of being freed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820100531.8430-1-chris@chris-wilson.co.uk
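One common way to guard against an object that is already being freed is a "get unless zero" reference grab. The commit above does not say which mechanism it uses, so the following is only a generic sketch of that pattern with C11 atomics; `struct object` and `object_get_unless_zero()` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct object {
	atomic_uint refcount; /* 0 means the object is being freed */
};

/* Take a reference only if the object is still alive. */
static bool object_get_unless_zero(struct object *obj)
{
	unsigned int old = atomic_load(&obj->refcount);

	while (old != 0) {
		/* On failure, `old` is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->refcount, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller would check the return value first and skip the GPU activity setup when it is false, rather than resurrecting a dying object.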
2019-08-20 | drm/i915: Serialize insertion into the file->mm.request_list | Chris Wilson | 2 | -5/+10
Currently, we remove requests from the per-file request list for throttling and retirement under a dedicated spinlock, but insertion is governed by struct_mutex. This needs to be the same lock so that the retirement/insertion of neighbouring requests (at the tail) doesn't break the list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190820080907.4665-1-chris@chris-wilson.co.uk
2019-08-20 | drm/i915: Sanitize PHY state during display core uninit | Imre Deak | 1 | -6/+11
To work around a DMC/Punit issue on ICL where the driver's ICL_PORT_COMP_DW8/IREFGEN PHY setting is lost when entering/exiting DC6 state, make sure to reinit the PHY whenever disabling DC states. Similarly the driver's PHY/DBUF/CDCLK settings should have been preserved across DC5/6 transitions, so check this on all platforms. This gets rid of the following WARN during suspend: Combo PHY A HW state changed unexpectedly Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816095523.15800-1-imre.deak@intel.com
2019-08-20 | drm/i915: Fix HW readout for crtc_clock in HDMI mode | Imre Deak | 2 | -3/+3
The conversion during HDMI HW readout from port_clock to crtc_clock was missed when HDMI 10bpc support was added, so fix that. v2: - Unscrew the non-HDMI case. Fixes: cd9e11a8bf25 ("drm/i915/icl: Add 10-bit support for hdmi") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109593 Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808162547.7009-1-imre.deak@intel.com
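The conversion the commit above refers to can be sketched as follows, assuming the usual HDMI deep-colour relationship in which the link (port) clock runs at pipe_bpp/24 times the pixel clock. The function name and the exact condition are illustrative and not taken from the driver.

```c
/* All clocks in kHz. `pipe_bpp` is bits per pixel over three channels. */
static int hdmi_crtc_clock(int port_clock, int pipe_bpp)
{
	/* 8 bpc (24 bpp) and below: link clock equals the pixel clock. */
	if (pipe_bpp <= 24)
		return port_clock;

	/* Deep colour: undo the pipe_bpp/24 scaling of the link clock. */
	return (int)((long long)port_clock * 24 / pipe_bpp);
}
```

For a 148500 kHz mode, a 10 bpc link runs at 185625 kHz and a 12 bpc link at 222750 kHz; both convert back to 148500 kHz here. A readout that only handled the 12 bpc case would report the wrong crtc_clock for 10 bpc.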
2019-08-20 | mm: remove CONFIG_MIGRATE_VMA_HELPER | Christoph Hellwig | 1 | -1/+0
CONFIG_MIGRATE_VMA_HELPER guards helpers that are required for proper device private memory support. Remove the option and just check for CONFIG_DEVICE_PRIVATE instead. Link: https://lore.kernel.org/r/20190814075928.23766-11-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | mm: remove the unused MIGRATE_PFN_DEVICE flag | Christoph Hellwig | 1 | -2/+1
No one ever checks this flag, and we could easily get that information from the page if needed. Link: https://lore.kernel.org/r/20190814075928.23766-10-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: simplify nouveau_dmem_migrate_vma | Christoph Hellwig | 1 | -129/+55
Factor the main copy page to vram routine out into a helper that acts on a single page and which doesn't require the nouveau_dmem_migrate structure for argument passing. As an added benefit the new version only allocates the dma address array once and reuses it for each subsequent chunk of work. Link: https://lore.kernel.org/r/20190814075928.23766-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: simplify nouveau_dmem_migrate_to_ram | Christoph Hellwig | 1 | -121/+40
Factor the main copy page to ram routine out into a helper that acts on a single page and which doesn't require the nouveau_dmem_fault structure for argument passing. Also remove the loop over multiple pages as we only handle one at the moment, although the structure of the main worker function makes it relatively easy to add multi page support back if needed in the future. But at least for now this avoids the need to dynamically allocate memory for the dma addresses in what is essentially the page fault path. Link: https://lore.kernel.org/r/20190814075928.23766-7-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: remove a few function stubs | Christoph Hellwig | 1 | -11/+0
nouveau_dmem_migrate_vma and nouveau_dmem_convert_pfn are only called when CONFIG_DRM_NOUVEAU_SVM is enabled, so there is no need to provide !CONFIG_DRM_NOUVEAU_SVM stubs for them. Link: https://lore.kernel.org/r/20190814075928.23766-6-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: factor out dmem fence completion | Christoph Hellwig | 1 | -18/+15
Factor out the end of fencing logic from the two migration routines. Link: https://lore.kernel.org/r/20190814075928.23766-5-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: factor out device memory address calculation | Christoph Hellwig | 1 | -25/+17
Factor out the repeated device memory address calculation into a helper. Link: https://lore.kernel.org/r/20190814075928.23766-4-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | nouveau: reset dma_nr in nouveau_dmem_migrate_alloc_and_copy | Christoph Hellwig | 1 | -0/+1
When we start a new batch of dma_map operations we need to reset dma_nr, as we start filling a newly allocated array. Link: https://lore.kernel.org/r/20190814075928.23766-3-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | mm: turn migrate_vma upside down | Christoph Hellwig | 1 | -59/+63
There isn't any good reason to pass callbacks to migrate_vma. Instead we can just export the three steps done by this function to drivers and let them sequence the operation without callbacks. This removes a lot of boilerplate code as-is, and will allow the drivers to drastically improve code flow and error handling further on. Link: https://lore.kernel.org/r/20190814075928.23766-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
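The shape of the change in the commit above (three exported steps that the driver sequences itself, instead of one call that takes callbacks) can be illustrated with stand-in functions. Every name below is hypothetical, the step bodies are placeholders, and the point at which the driver does its copy is an assumption; this shows the control flow only, not the kernel API.

```c
/* Hypothetical state shared between the core steps and the driver. */
struct migration {
	int npages;
	int collected;
	int copied;
	int finalized;
};

/* Stand-ins for the three steps exported by the core. */
static int migrate_setup(struct migration *m)
{
	m->collected = m->npages; /* collect and isolate the pages */
	return m->npages > 0 ? 0 : -1;
}

static void migrate_pages(struct migration *m)
{
	(void)m; /* the core would switch the mappings over here */
}

static void migrate_finalize(struct migration *m)
{
	m->finalized = 1; /* release whatever setup took */
}

/* Driver-specific work that used to live in a callback. */
static void driver_copy(struct migration *m)
{
	m->copied = m->collected;
}

/* The driver now sequences the steps and handles errors inline. */
static int driver_migrate(struct migration *m)
{
	int ret = migrate_setup(m);

	if (ret)
		return ret;
	driver_copy(m);
	migrate_pages(m);
	migrate_finalize(m);
	return 0;
}
```

Because the driver owns the sequence, an early error return like the one after `migrate_setup()` needs no callback plumbing, which is the kind of simplification the commit message points to.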
2019-08-20 | drm/amdkfd: use mmu_notifier_put | Jason Gunthorpe | 2 | -9/+4
The sequence of mmu_notifier_unregister_no_release(), mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the free_notifier callback. As this is the last user of those APIs, converting it means we can drop them. Link: https://lore.kernel.org/r/20190806231548.25242-11-jgg@ziepe.ca Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | drm/amdkfd: fix a use after free race with mmu_notifer unregister | Jason Gunthorpe | 1 | -41/+37
When using mmu_notifier_unregister_no_release() the caller must ensure there is a SRCU synchronize before the mn memory is freed, otherwise use after free races are possible, for instance:

    CPU0                                          CPU1
                                                  invalidate_range_start
                                                    hlist_for_each_entry_rcu(..)
    mmu_notifier_unregister_no_release(&p->mn)
    kfree(mn)
                                                  if (mn->ops->invalidate_range_end)

The error unwind in amdkfd misses the SRCU synchronization. amdkfd keeps the kfd_process around until the mm is released, so split the flow to fully initialize the kfd_process and register it for find_process, and with the notifier. Past this point the kfd_process does not need to be cleaned up as it is fully ready. The final failable step does a vm_mmap() and does not seem to impact the kfd_process global state. Since it also cannot be undone (and already has problems with undo if it internally fails), it has to be last. This way we don't have to try to unwind the mmu_notifier_register() and avoid the problem with the SRCU. Along the way this also fixes various other error unwind bugs in the flow. Fixes: 45102048f77e ("amdkfd: Add process queue manager module") Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | drm/radeon: use mmu_notifier_get/put for struct radeon_mn | Jason Gunthorpe | 4 | -126/+38
radeon is using a device global hash table to track what mmu_notifiers have been registered on struct mm. This is better served with the new get/put scheme instead. radeon has a bug where it was not blocking notifier release() until all the BOs had been invalidated. This could result in a use after free of the BOs' pages. This is tied into a second bug where radeon left the notifiers running endlessly even once the interval tree became empty. This could result in a use after free with module unload. Both are fixed by changing the lifetime model: the BOs exist in the interval tree with their natural lifetimes independent of the mm_struct lifetime using the get/put scheme. The release runs synchronously and just does invalidate_start across the entire interval tree to create the required DMA fence. Additions to the interval tree after release are already impossible as only current->mm is used during the add. Link: https://lore.kernel.org/r/20190806231548.25242-9-jgg@ziepe.ca Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | hmm: use mmu_notifier_get/put for 'struct hmm' | Jason Gunthorpe | 2 | -0/+5
This is a significant simplification, it eliminates all the remaining 'hmm' stuff in mm_struct, eliminates krefing along the critical notifier paths, and takes away all the ugly locking and abuse of page_table_lock. mmu_notifier_get() provides the single struct hmm per struct mm which eliminates mm->hmm. It also directly guarantees that no mmu_notifier op callback is callable while concurrent free is possible, this eliminates all the krefs inside the mmu_notifier callbacks. The remaining krefs in the range code were overly cautious, drivers are already not permitted to free the mirror while a range exists. Link: https://lore.kernel.org/r/20190806231548.25242-6-jgg@ziepe.ca Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Tested-by: Ralph Campbell <rcampbell@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 | drm/komeda: Add support for 'memory-region' DT node property | Mihail Atanassov | 1 | -0/+9
The 'memory-region' property of the komeda display driver DT binding allows the use of a 'reserved-memory' node for buffer allocations. Add the requisite of_reserved_mem_device_{init,release} calls to actually make use of the memory if present. Changes since v1: - Move handling inside komeda_parse_dt Signed-off-by: Mihail Atanassov <mihail.atanassov@arm.com> Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com> Reviewed-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com> Link: https://patchwork.kernel.org/patch/11076413/
2019-08-20 | Merge branch 'for-joerg/batched-unmap' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into core | Joerg Roedel | 1 | -8/+16
2019-08-20 | dw-hdmi-cec: use cec_notifier_cec_adap_(un)register | Dariusz Marcinkiewicz | 1 | -7/+6
Use the new cec_notifier_cec_adap_(un)register() functions to (un)register the notifier for the CEC adapter. Also adds CEC_CAP_CONNECTOR_INFO capability to the adapter.

Changes since v3:
- add CEC_CAP_CONNECTOR_INFO to cec_allocate_adapter,
- replace CEC_CAP_LOG_ADDRS | CEC_CAP_TRANSMIT | CEC_CAP_RC | CEC_CAP_PASSTHROUGH with CEC_CAP_DEFAULTS.

Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Tested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190814104520.6001-4-darekm@google.com
2019-08-20 | drm: dw-hdmi: use cec_notifier_conn_(un)register | Dariusz Marcinkiewicz | 1 | -15/+30
Use the new cec_notifier_conn_(un)register() functions to (un)register the notifier for the HDMI connector, and fill in the cec_connector_info.

Changes since v6:
- move cec_notifier_conn_unregister to a bridge detach function,
- add a mutex protecting a CEC notifier.

Changes since v4:
- typo fix

Changes since v2:
- removed unnecessary NULL check before a call to cec_notifier_conn_unregister,
- use cec_notifier_phys_addr_invalidate to invalidate physical address.

Changes since v1:
Add memory barrier to make sure that the notifier becomes visible to the irq thread once it is fully constructed.

Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com> Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Tested-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190814104520.6001-9-darekm@google.com
2019-08-20 | drm/i915: Assume exclusive access to objects inside resume | Chris Wilson | 1 | -4/+7
Inside gtt_restore_mappings() we currently take the obj->resv->lock, but in the future we need to avoid taking this fs-reclaim tainted lock as we need to extend the coverage of the vm->mutex. Take advantage of the single-threaded nature of the early resume phase, and do a single wbinvd() to flush all the GTT objects en masse. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819200705.3631-1-chris@chris-wilson.co.uk
2019-08-19 | drm/i915: Use 0 for the unordered context | Chris Wilson | 6 | -16/+5
Since commit 078dec3326e2 ("dma-buf: add dma_fence_get_stub") the 0 fence context became an impossible match as it is used for an always signaled fence. We can simplify our timeline tracking by knowing that 0 always means no match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819184404.24200-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20190819175109.5241-1-chris@chris-wilson.co.uk
2019-08-19 | drm/i915: Select DMABUF_SELFTESTS for the default i915.ko debug build | Chris Wilson | 1 | -0/+1
Include the DMABUF_SELFTESTS as part of the standard build for IGT, so that they can be run by igt/dmabuf Testcase: igt/dmabuf Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819171900.4501-1-chris@chris-wilson.co.uk
2019-08-19 | drm/drv: Use // for comments in example code | Jonathan Neuschäfer | 1 | -8/+6
This improves Sphinx output in two ways:

- It avoids an unmatched single-quote ('), about which Sphinx complained:

    Documentation/gpu/drm-internals.rst:298: WARNING: Could not lex literal_block as "c". Highlighting skipped.

  An alternative approach would be to replace "can't" with a word that doesn't have a single-quote.

- It lets Sphinx format the comments in italics and grey, making the code slightly easier to read.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net> Acked-by: Daniel Vetter <daniel@ffwll.ch> [via irc] Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190808163629.14280-1-j.neuschaefer@gmx.net
2019-08-19 | drm/panfrost: Remove opp table when unloading | Steven Price | 3 | -1/+11
The devfreq opp table needs to be removed when unloading the driver to free the memory associated with it. Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190816093107.30518-3-steven.price@arm.com
2019-08-19 | drm/panfrost: Enable devfreq to work without regulator | Steven Price | 1 | -5/+2
If there is no regulator defined for the GPU then still control the frequency using the supplied clock. Some boards have clock control but no (direct) control of the regulator. For example the HiKey960 uses a mailbox protocol to a MCU to control frequencies and doesn't directly control the voltage. This patch allows frequency control of the GPU on this system. Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190816093107.30518-1-steven.price@arm.com
2019-08-19 | drm/panfrost: Implement per FD address spaces | Rob Herring | 9 | -87/+236
Up until now, a single shared GPU address space was used. This is not ideal as there's no protection between processes and doesn't work for supporting the same GPU/CPU VA feature. Most importantly, this will hopefully mitigate Alyssa's fear of WebGL, whatever that is. Most of the changes here are moving struct drm_mm and struct panfrost_mmu objects from the per device struct to the per FD struct. The critical function is panfrost_mmu_as_get() which handles allocating and switching the h/w address spaces. There are 3 states an AS can be in: free, allocated, and in use. When a job runs, it requests an address space and then marks it not in use when the job is complete (but it stays assigned). The first time thru, we find a free AS in the alloc_mask and assign the AS to the FD. Then the next time thru, we most likely already have our AS and we just mark it in use with a ref count. We need a ref count because we have multiple job slots. If the job/FD doesn't have an AS assigned and there are no free ones, then we pick an allocated one not in use from our LRU list and switch the AS from the old FD to the new one. Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190813150115.30338-1-robh@kernel.org
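The address-space scheme described in the commit above (free, allocated, in use; a reference count per AS; LRU takeover of an idle AS) can be sketched in a few dozen lines. This is a simplified single-threaded model, not the driver's code: `NUM_AS`, the struct layouts, and `as_get()`/`as_put()` are hypothetical, and an LRU stamp stands in for the LRU list and allocation mask the commit mentions.

```c
#include <stddef.h>

#define NUM_AS 4 /* hypothetical number of hardware address spaces */

struct fd_mmu {
	int as; /* assigned slot, or -1 when none */
};

struct as_slot {
	struct fd_mmu *owner;   /* NULL when free */
	int refcount;           /* > 0 while jobs are running ("in use") */
	unsigned long last_use; /* LRU stamp */
};

static struct as_slot slots[NUM_AS];
static unsigned long clock_tick;

/* Returns the AS number for `mmu`, or -1 if every AS is in use. */
static int as_get(struct fd_mmu *mmu)
{
	int i, victim = -1;

	if (mmu->as < 0) {
		/* Prefer a free slot; otherwise the least recently used idle one. */
		for (i = 0; i < NUM_AS; i++) {
			if (!slots[i].owner) {
				victim = i;
				break;
			}
			if (slots[i].refcount == 0 &&
			    (victim < 0 || slots[i].last_use < slots[victim].last_use))
				victim = i;
		}
		if (victim < 0)
			return -1;
		if (slots[victim].owner)
			slots[victim].owner->as = -1; /* take it from the old FD */
		slots[victim].owner = mmu;
		mmu->as = victim;
	}
	slots[mmu->as].refcount++;
	slots[mmu->as].last_use = ++clock_tick;
	return mmu->as;
}

/* Job finished: drop the in-use reference but keep the assignment. */
static void as_put(struct fd_mmu *mmu)
{
	slots[mmu->as].refcount--;
}
```

The reference count is what lets several job slots share one FD's address space: the AS only becomes a takeover candidate once every job using it has called `as_put()`.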
2019-08-19 | drm/panfrost: Fix missing unlock on error in panfrost_mmu_map_fault_addr() | Wei Yongjun | 1 | -1/+4
Add the missing unlock before return from function panfrost_mmu_map_fault_addr() in the error handling case. Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190814044814.102294-1-weiyongjun1@huawei.com
2019-08-19 | drm/i915: i915_active.retire() is optional | Chris Wilson | 1 | -2/+4
Check that i915_active.retire() exists before calling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819075835.20065-6-chris@chris-wilson.co.uk
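The pattern in the commit above is the usual guard around an optional callback. A minimal sketch, with a hypothetical `struct active` and a sample callback that exists only for illustration:

```c
#include <stddef.h>

struct active {
	void (*retire)(struct active *ref); /* optional; may be NULL */
};

/* Sample callback used for illustration: counts invocations. */
static int retired;

static void note_retired(struct active *ref)
{
	(void)ref;
	retired++;
}

static void active_retire(struct active *ref)
{
	/* Only call the callback when one was provided. */
	if (ref->retire)
		ref->retire(ref);
}
```

Users that have nothing to do on retirement can then leave the pointer NULL instead of supplying an empty function.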
2019-08-19 | drm/i915/gen11: Allow usage of all GPIO pins | Matt Roper | 3 | -50/+3
Our pin mapping tables for ICP and MCC currently only list the standard GPIO pins used for various output ports. Even though ICP's standard pin usage only utilizes pins 1, 2, and 9-12, and MCC's standard pin usage only uses pins 1, 2, and 9, these platforms do still have GPIO registers to address pins in the range 1-3 and 9-14. OEMs may remap GPIO usage in non-standard ways (and provide the actual mapping via VBT settings), so we shouldn't exclude pins on these platforms just because they aren't part of the standard mappings. TGP's standard pin tables contain all the possible pins, so let's rename them to "icp" and use them for all PCH >= PCH_ICP. This will prevent intel_gmbus_is_valid_pin from rejecting non-standard pin usage that an OEM specifies via the VBT. Note that this will cause pin 9 to be labeled as "tc1" instead of "dpc" in debug messages on platforms with the MCC PCH, but that may actually help avoid confusion since the text strings will now be the same on all gen11+ platforms instead of being different on just EHL. v2: Drop now-unused MCC_DDC_BUS_DDI_* names. v3: We want to compare against INTEL_PCH_TYPE, not INTEL_PCH_ID. Bspec: 8417 Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190817005041.20651-1-matthew.d.roper@intel.com
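The pin ranges quoted in the commit above can be expressed as a simple predicate. This is a simplified stand-in for illustration: the function name is hypothetical, and it does not reproduce how intel_gmbus_is_valid_pin is implemented (the commit describes that as driven by pin tables).

```c
#include <stdbool.h>

/*
 * Pins that have GPIO registers on ICP-and-later PCHs according to the
 * commit text: 1-3 and 9-14.
 */
static bool icp_pin_has_register(unsigned int pin)
{
	return (pin >= 1 && pin <= 3) || (pin >= 9 && pin <= 14);
}
```

Under this check pin 3 and pins 13-14 are accepted even though the standard ICP mapping (1, 2, 9-12) never uses them, which is what allows an OEM's VBT remapping to work.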
2019-08-19 | drm/i915: Serialize against vma moves | Chris Wilson | 11 | -20/+56
Make sure that when submitting requests, we always serialize against potential vma moves and clflushes. Time for a i915_request_await_vma() interface! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190819112033.30638-1-chris@chris-wilson.co.uk
2019-08-19 | gpu: ipu-v3: image-convert: only sample into the next tile if necessary | Philipp Zabel | 1 | -2/+2
The first pixel of the next tile is only sampled by the hardware if the fractional input position corresponding to the last written output pixel is not an integer position. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-08-19 | gpu: ipu-v3: image-convert: move tile burst alignment out of loop | Philipp Zabel | 1 | -39/+45
Burst aligned input and output width can be calculated once per column, instead of repeatedly for each tile in the column. The same goes for input and output height per row. Also don't round up the same values repeatedly. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
2019-08-19 | gpu: ipu-v3: image-convert: bail on invalid tile sizes | Philipp Zabel | 1 | -3/+24
If we managed to create tiles sized 0x0 because of a bug in the seam calculation, return with an error message instead of letting the driver run into a division by zero later. Also check for tile sizes that are larger than supported by the hardware. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
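The two checks described in the commit above (reject empty tiles, reject tiles larger than the hardware supports) can be sketched as a validation helper. The limit `TILE_MAX_DIM` is a placeholder, since the commit does not state the hardware maximum, and the struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Placeholder hardware limit; the commit does not state the real value. */
#define TILE_MAX_DIM 1024u

struct tile {
	unsigned int width;
	unsigned int height;
};

static bool tile_size_valid(const struct tile *t)
{
	if (t->width == 0 || t->height == 0)
		return false; /* would lead to a division by zero later */
	if (t->width > TILE_MAX_DIM || t->height > TILE_MAX_DIM)
		return false; /* larger than the hardware supports */
	return true;
}
```

Failing early with an error message keeps a seam-calculation bug from surfacing later as a crash in unrelated arithmetic.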