summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/i915/gvt
AgeCommit message (Collapse)AuthorFilesLines
2017-08-02drm/i915/gvt: change resetting to resetting_engChuanxiao Dong4-10/+13
Use resetting_eng to identify which engine is resetting so the rest ones' workload won't be impacted v2: - use ENGINE_MASK(ring_id) instead of (1 << ring_id). (Zhenyu) Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-27drm/i915/hsw+: Add has_fuses power well attributeImre Deak1-3/+3
The pattern of a power well backing a set of fuses whose initialization we need to wait for during power well enabling is common to all GEN9+ platforms. Adding support for this to the HSW power well enable helper allows us to use the HSW/BDW power well code for GEN9+ as well in a follow-up patch. v2: - Use an enum for power gates instead of raw numbers. (Ville) Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-6-imre.deak@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27drm/i915/hsw+: Unify the hsw/bdw and gen9+ power well req/state macrosImre Deak1-3/+5
Although on HSW/BDW there is only a single display global power well, it's programmed the same way as other GEN9+ power wells. This also means we can get at the HSW/BDW request and status flags the same way it's done on GEN9+ by assigning the corresponding HSW/BDW power well ID. This ID was assigned in a recent patch, so we can now switch to using the same macros everywhere on HSW+. Updating the HSW power well control register with RMW is not strictly necessary, but this will allow us to use the same code for GEN9+. Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1499352040-8819-13-git-send-email-imre.deak@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-20Merge tag 'drm-intel-next-2017-07-17' of ↵Dave Airlie1-1/+1
git://anongit.freedesktop.org/git/drm-intel into drm-next 2nd round of 4.14 features: - prep for deferred fbdev setup - refactor fixed 16.16 computations and skl+ wm code (Mahesh Kumar) - more cnl paches (Rodrigo, Imre et al) - tighten context cleanup and handling (Chris Wilson) - fix interlaced handling on skl+ (Mahesh Kumar) - small bits as usual * tag 'drm-intel-next-2017-07-17' of git://anongit.freedesktop.org/git/drm-intel: (84 commits) drm/i915: Update DRIVER_DATE to 20170717 drm/i915: Protect against deferred fbdev setup drm/i915/fbdev: Always forward hotplug events drm/i915/skl+: unify cpp value in WM calculation drm/i915/skl+: WM calculation don't require height drm/i915: Addition wrapper for fixed16.16 operation drm/i915: cleanup fixed-point wrappers naming drm/i915: Always perform internal fixed16 division in 64 bits drm/i915: take-out common clamping code of fixed16 wrappers drm/i915/cnl: Add missing type case. drm/i915/cnl: Add max allowed Cannonlake DC. drm/i915: Make DP-MST connector info work drm/i915/cnl: Get DDI clock based on PLLs. drm/i915/cnl: Inherit RPS stuff from previous platforms. drm/i915/cnl: Gen10 render context size. drm/i915/cnl: Don't trust VBT's alternate pin for port D for now. drm/i915: Fix the kernel panic when using aliasing ppgtt drm/i915/cnl: Cannonlake color init. drm/i915/cnl: Add force wake for gen10+. x86/gpu: CNL uses the same GMS values as SKL ...
2017-07-17drm/i915/gvt: Fix the vblank timer close issue after shutdown VMs in reversefred gao1-11/+11
Once the Windows guest is shutdown, the display pipe will be disabled and intel_gvt_check_vblank_emulation will be called to check if the vblank timer is turned off. Given the scenario of creating VM1 ,VM2, destoying VM2 in current code, VM1 has pipe enabled and continues to check VM2, the flag have_enabled_pipe is always false since all the VM2 pipes are disabled, so the vblank timer will be canceled and TDR happens in Windows VM1 guest due to the vsync timeout. In this patch the vblank timer will be never canceled once one pipe is enabled. v2: - remove have_enabled_pipe flag and check pipe enabled directly. (Zhenyu) Cc: Wang Hongbo <hongbo.wang@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11drm/i915/gvt: Use fence error from GVT request for workload statusChuanxiao Dong1-9/+12
The req->fence.error will be set if this request caused GPU hang so we can use this value to workload->status to indicate whether this GVT request caused any problem. If it caused GPU hang, we shouldn't trigger any context switch back to the guest. v2: - only take -EIO from fence->error. (Zhenyu) Fixes: 8f1117abb408 (drm/i915/gvt: handle workload lifecycle properly) Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11drm/i915/gvt: remove scheduler_mutex in per-engine workload_threadWeinan Li1-7/+0
For the vGPU workloads, now GVT-g use per vGPU scheduler, the per-ring work_thread only pick workload belongs to the current vGPU. And with time slice based scheduler, it waits all the engines become idle before do vGPU switch. So we can run free dispatch in per-ring work_thread, different ring running in different 'vGPU' won't happen. For the workloads between vGPU and Host, this scheduler_mutex can't block host to dispatch workload into other ring engines. Here remove this mutex since it impacts the performance when applications use more than 1 ring engines in 1 vgpu. ring0 running in vGPU1, ring1 running in Host. Will happen. ring0 running in vGPU1, ring1 running in vGPU2. Won't happen. Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11drm/i915/gvt: Revert "drm/i915/gvt: Fix possible recursive locking issue"Chuanxiao Dong2-48/+10
This reverts commit 62d02fd1f807bf5a259a242c483c9fb98a242630. The rwsem recursive trace should not be fixed from kvmgt side by using a workqueue and it is an issue should be fixed in VFIO. So this one should be reverted. Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11drm/i915/gvt: Audit the command buffer addressPing Gao1-0/+10
The command buffer address in context like ring buffer base address and wa_ctx address need to be audit to make sure they are in the valid GGTT range. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-11drm/i915/gvt: Fix a memory leak in intel_gvt_init_gtt()Zhou, Wenjia1-0/+2
It will causes memory leak, if the function setup_spt_oos() fail, in the function intel_gvt_init_gtt(), which allocated by get_zeroed_page() and mapped by dma_map_page(). Unmap and free the page, after STP oos initialize fail, it will fix this issue. Signed-off-by: Zhou, Wenjia <zhiyuan_zhu@htc.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-07-10Merge tag 'drm-for-v4.13' into drm-intel-next-queuedDaniel Vetter2-19/+41
Resync with the main drm-next pull request for 4.13. What we really need is to fully resync with pending drm-misc, but that's not yet possible due to the still ongoing merge window. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2017-06-30Merge tag 'gvt-fixes-2017-06-29' of https://github.com/01org/gvt-linux into ↵Jani Nikula4-32/+99
drm-intel-next-fixes gvt-fixes-2017-06-29 - two race fixes for VFIO locks from Chuanxiao - virtual display fix for BDW from Xiong Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170629065424.kxopjbvntuakbyz2@zhen-hp.sh.intel.com
2017-06-29drm/i915/gvt: Make function dpy_reg_mmio_readx safeChangbin Du1-16/+19
The dpy_reg_mmio_read_x functions directly copy 4 bytes data to the target address with considering the length. If may cause the target memory corrupted if the requested length less than 4 bytes. Fix it for safety even we already have some checking to avoid this happen. And for convince, the 3 functions are merged. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-27drm/i915/gvt: Don't read ADPA_CRT_HOTPLUG_MONITOR from hostXiong Zhang2-1/+5
When host connects a crt screen, linux guest will detect two screens: crt and dp. This is wrong as linux guest has only one dp. In order to avoid guest get host crt screen, we should set ADPA_CRT_HOTPLUG_MONITOR to none. But MMIO_RO(PCH_ADPA) prevent from that. So MMIO_DH should be used instead of MMIO_RO. v2: Clear its staus to none at initialize, so guest don't get host crt.(Zhangyu) v3: SKL doesn't have this register, limit it to pre_skl.(xiong) Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-27drm/i915/gvt: Set initial PORT_CLK_SEL vreg for BDWXiong Zhang1-0/+18
On BDW, when host physical screen and guest virtual screen aren't on the same DDI port, guest i915 driver prints the following error and stop running. [ 6.775873] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [ 6.775928] IP: intel_ddi_clock_get+0x81/0x430 [i915] [ 6.776206] Call Trace: [ 6.776233] ? vgpu_read32+0x4f/0x100 [i915] [ 6.776264] intel_ddi_get_config+0x11c/0x230 [i915] [ 6.776298] intel_modeset_setup_hw_state+0x313/0xd40 [i915] [ 6.776334] intel_modeset_init+0xe49/0x18d0 [i915] [ 6.776368] ? vgpu_write32+0x53/0x100 [i915] [ 6.776731] ? intel_i2c_reset+0x42/0x50 [i915] [ 6.777085] ? intel_setup_gmbus+0x32a/0x350 [i915] [ 6.777427] i915_driver_load+0xabc/0x14d0 [i915] [ 6.777768] i915_pci_probe+0x4f/0x70 [i915] The null pointer is guest intel_crtc_state->shared_dpll which is setted in haswell_get_ddi_pll(). When guest and host screen are on different DDI port, host driver won't set PORT_CLK_SET(guest_port), so haswell_get_ddi_pll() will return null and don't set pipe_config->shared_dpll, once the following program refernce this structure, it will print the above error. This patch set the initial val of guest PORT_CLK_SEL(guest_port) to LCPLL_810. And guest i915 driver will reset this value according to guest screen mode. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-26drm/i915/gvt: Fix inconsistent locks holding sequenceChuanxiao Dong1-5/+9
There are two kinds of locking sequence. One is in the thread which is started by vfio ioctl to do the iommu unmapping. The locking sequence is: down_read(&group_lock) ----> mutex_lock(&cached_lock) The other is in the vfio release thread which will unpin all the cached pages. The lock sequence is: mutex_lock(&cached_lock) ---> down_read(&group_lock) And, the cache_lock is used to protect the rb tree of the cache node and doing vfio unpin doesn't require this lock. Move the vfio unpin out of the cache_lock protected region. v2: - use for style instead of do{}while(1). (Zhenyu) Fixes: f30437c5e7bf ("drm/i915/gvt: add KVMGT support") Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-26drm/i915/gvt: Fix possible recursive locking issueChuanxiao Dong2-10/+48
vfio_unpin_pages will hold a read semaphore however it is already hold in the same thread by vfio ioctl. It will cause below warning: [ 5102.127454] ============================================ [ 5102.133379] WARNING: possible recursive locking detected [ 5102.139304] 4.12.0-rc4+ #3 Not tainted [ 5102.143483] -------------------------------------------- [ 5102.149407] qemu-system-x86/1620 is trying to acquire lock: [ 5102.155624] (&container->group_lock){++++++}, at: [<ffffffff817768c6>] vfio_unpin_pages+0x96/0xf0 [ 5102.165626] but task is already holding lock: [ 5102.172134] (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280 [ 5102.182522] other info that might help us debug this: [ 5102.189806] Possible unsafe locking scenario: [ 5102.196411] CPU0 [ 5102.199136] ---- [ 5102.201861] lock(&container->group_lock); [ 5102.206527] lock(&container->group_lock); [ 5102.211191] *** DEADLOCK *** [ 5102.217796] May be due to missing lock nesting notation [ 5102.225370] 3 locks held by qemu-system-x86/1620: [ 5102.230618] #0: (&container->group_lock){++++++}, at: [<ffffffff8177728f>] vfio_fops_unl_ioctl+0x5f/0x280 [ 5102.241482] #1: (&(&iommu->notifier)->rwsem){++++..}, at: [<ffffffff810de775>] __blocking_notifier_call_chain+0x35/0x70 [ 5102.253713] #2: (&vgpu->vdev.cache_lock){+.+...}, at: [<ffffffff8157b007>] intel_vgpu_iommu_notifier+0x77/0x120 [ 5102.265163] stack backtrace: [ 5102.270022] CPU: 5 PID: 1620 Comm: qemu-system-x86 Not tainted 4.12.0-rc4+ #3 [ 5102.277991] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.03.01.APER.061220151418 06/12/2015 [ 5102.289445] Call Trace: [ 5102.292175] dump_stack+0x85/0xc7 [ 5102.295871] validate_chain.isra.21+0x9da/0xaf0 [ 5102.300925] __lock_acquire+0x405/0x820 [ 5102.305202] lock_acquire+0xc7/0x220 [ 5102.309191] ? vfio_unpin_pages+0x96/0xf0 [ 5102.313666] down_read+0x2b/0x50 [ 5102.317259] ? vfio_unpin_pages+0x96/0xf0 [ 5102.321732] vfio_unpin_pages+0x96/0xf0 [ 5102.326024] intel_vgpu_iommu_notifier+0xe5/0x120 [ 5102.331283] notifier_call_chain+0x4a/0x70 [ 5102.335851] __blocking_notifier_call_chain+0x4d/0x70 [ 5102.341490] blocking_notifier_call_chain+0x16/0x20 [ 5102.346935] vfio_iommu_type1_ioctl+0x87b/0x920 [ 5102.351994] vfio_fops_unl_ioctl+0x81/0x280 [ 5102.356660] ? __fget+0xf0/0x210 [ 5102.360261] do_vfs_ioctl+0x93/0x6a0 [ 5102.364247] ? __fget+0x111/0x210 [ 5102.367942] SyS_ioctl+0x41/0x70 [ 5102.371542] entry_SYSCALL_64_fastpath+0x1f/0xbe put the vfio_unpin_pages in a workqueue can fix this. v2: - use for style instead of do{}while(1). (Zhenyu) v3: - rename gvt_cache_mark to gvt_cache_mark_remove. (Zhenyu) Fixes: 659643f7d814 ("drm/i915/gvt/kvmgt: add vfio/mdev support to KVMGT") Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-21Merge tag 'drm-intel-next-2017-06-19' of ↵Dave Airlie19-440/+604
git://anongit.freedesktop.org/git/drm-intel into drm-next Final pile of features for 4.13 New uabi: - batch bo in first slot, for faster execbuf assembly in userspace (Chris Wilson) - (sub)slice getparam, needed for mesa perf support (Robert Bragg) First pile of patches for cnl/cfl support, maintained by Rodrigo but with lots of contributions from others. Still incomplete since public review still ongoing. Features/refactoring: - Make execbuf faster (Chris Wilson), a pile of series to make execbuf buffer handling have fewer passes, use less list walking, postpone more work to async workers and shuffle buffers less, all to make the common case much faster (in some cases at least). - cold boot support for glk dsi (Madhav Chauhan) - Clean up pipe A quirk and related old platform hacks (Ville) - perf sampling support for kbl/glk (Lionel) - perf cleanups (Robert Bragg) - wire atomic state to backlight code, to avoid pipe lookup hacks (Maarten) - reduce request waiting latency/overhead to remove the spinning and associated cpu cycle wasting (Chris) - fix 90/270 rotation wm computation (Ville) - new ddb allocation algo for skl (Kumar Mahesh) - fix regression due to system suspend optimiazatino (Imre) - the usual pile of small cleanups and refactors all over GVT updates contained in this tag: - optimization for per-VM mmio save/restore (Changbin) - optimization for mmio hash table (Changbin) - scheduler optimization with event (Ping) - vGPU reset refinement (Fred) - other misc refactor and cleanups, etc. * tag 'drm-intel-next-2017-06-19' of git://anongit.freedesktop.org/git/drm-intel: (170 commits) drm/i915: Update DRIVER_DATE to 20170619 drm/i915/cfl: Introduce Coffee Lake workarounds. drm/i915: Store 9 bits of PCI Device ID for platforms with a LP PCH drm/i915: Stash a pointer to the obj's resv in the vma drm/i915: Async GPU relocation processing drm/i915: Allow execbuffer to use the first object as the batch drm/i915: Wait upon userptr get-user-pages within execbuffer drm/i915: First try the previous execbuffer location drm/i915: Store a persistent reference for an object in the execbuffer cache drm/i915: Eliminate lots of iterations over the execobjects array drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocations drm/i915: Pass vma to relocate entry drm/i915: Store a direct lookup from object handle to vma drm/i915: Fix retrieval of hangcheck stats drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty drm/i915: Mark CPU cache as dirty on every transition for CPU writes drm/i915: Make i915_vma_destroy() static drm/i915: Actually attach the tv_format property to the SDVO connector Revert "drm/i915/skl: New ddb allocation algorithm" drm/i915/glk: Add cold boot sequence for GLK DSI ...
2017-06-20drm/i915: Allow contexts to be unreferenced locklesslyChris Wilson1-1/+1
If we move the actual cleanup of the context to a worker, we can allow the final free to be called from any context and avoid undue latency in the caller. v2: Negotiate handling the delayed contexts free by flushing the workqueue before calling i915_gem_context_fini() and performing the final free of the kernel context directly v3: Flush deferred frees before new context allocations Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620110547.15947-2-chris@chris-wilson.co.uk
2017-06-16Merge tag 'gvt-next-2017-06-08' of https://github.com/01org/gvt-linux into ↵Jani Nikula19-440/+604
drm-intel-next-queued gvt-next-2017-06-08 First gvt-next pull for 4.13: - optimization for per-VM mmio save/restore (Changbin) - optimization for mmio hash table (Changbin) - scheduler optimization with event (Ping) - vGPU reset refinement (Fred) - other misc refactor and cleanups, etc. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170608093547.bjgs436e3iokrzdm@zhen-hp.sh.intel.com
2017-06-16BackMerge tag 'v4.12-rc5' into drm-nextDave Airlie2-19/+41
Linux 4.12-rc5 for nouveau fixes
2017-06-08drm/i915/gvt: Refine virtual reset functionfred gao3-11/+28
during the emulation of virtual reset: 1. only reset the engine related mmio ending with MMIO offset Master_IRQ, not include display stuff. 2. fences are not required to set default value as well to prevent screen flicking. this will fix the issue of Guest screen hang while running Force tdr in Linux guest. v2: - only reset the engine related mmio. (Zhenyu & Zhiyuan) v3: - IMR/Ring mode registers are not save/restored. (Changbin) v4: - redefine the MMIO reset offset for easy understanding. (Zhenyu) - pvinfo can be reset. (Zhenyu) v5: - add more comments for mmio reset. (Zhenyu) Cc: Changbin Du <changbin.du@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Lv zhiyuan <zhiyuan.lv@intel.com> Cc: Zhang Yulei <yulei.zhang@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Fix GDRST vreg state after resetfred gao1-0/+3
Emulating the GDRST read behavior correctly to ack the guest reset request. v2: - split the original patch into two: GDRST read handler and virtual gpu reset. (Zhenyu) v3: - emulate the GDRST read right after write. (Zhenyu) Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhang Yulei <yulei.zhang@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Tuning the size of MMIO hash lookup table to 2048Changbin Du1-1/+1
On Skylake platform, The traced virtual mmio registers are up to 2039. So tuning the hash table size to improve lookup performance. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Add helper for tuning MMIO hash tableChangbin Du2-0/+5
We count all the tracked virtual MMIO registers, which can help us to tune the MMIO hash table. v2: Move num_tracked_mmio into gvt structure. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Make the MMIO attribute wrappers be inlineChangbin Du3-87/+79
Function calls are expensive. I have see obvious overhead call to these wrappers in perf data, especially from the cmd parser side. So make these simple wrappers be inline to kill them all. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Make mmio_attribute as type u8 to save 1.5MB memoryChangbin Du2-3/+4
Type u8 is big enough to contain all MMIO attribute flags. As the total MMIO size is 2MB so we saved 1.5MB memory. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Cleanup struct intel_gvt_mmio_infoChangbin Du3-16/+3
The size, length, addr_mask fields actually are not necessary. Every tracked mmio has DWORD size, and addr_mask is a legacy field. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Optimize MMIO register handling for some large MMIO blocksChangbin Du3-114/+147
Some of traced MMIO registers are a large continuous section. These stuffed the MMIO lookup hash table and so waste lots of memory and get much lower lookup performance. Here we picked out these sections by special handling. These sections include: o Display pipe registers, total 768. o The PVINFO page, total 1024. o MCHBAR_MIRROR, total 65536. o CSR_MMIO, total 3072. So we removed 70,400 items from the hash table, and speed up guest boot time by ~500ms. v2: o add a local function find_mmio_block(). o fix comments. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: add gtt_invalidate API to flush the GTT TLBChuanxiao Dong1-6/+9
add gtt_invalidate API to handle the GTT TLB flush instead of hiding in write_pte64 function. This can avoid overkill when using write_pte64 Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Add runtime_pm get/put to proctect MMIO accessingChuanxiao Dong2-0/+22
In some cases, GVT-g is accessing MMIO without holding runtime_pm and this patch can add the inline API for doing the runtime_pm get/put to make sure when accessing HW MMIO the i915 HW is really powered on. Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: remove redundant -WallNick Desaulniers1-1/+1
This flag is already set in the top level Makefile of the kernel. Also, by having set CONFIG_DRM_I915_GVT, thereby appending -Wall to ccflags, you undo all the -Wno-* cflags previously set in the Make variable KBUILD_CFLAGS. For example: cc foo.c -Wall -Wno-format -Wall resets -Wformat. Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Legacy HSW related MMIO handler clean upfred gao2-25/+15
remove all the legacy pre-BDW mmio handlers and the corresponding usage/definition since pre-BDW platforms are not supported in GVT environment. v2: - clean up all the left dirty code before BDW, e.g all D_HSW usage and itself, D_IVB, D_PRE_BDW. (Zhenyu) v3: - change is based on gvt-staging. (Zhenyu) Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Trigger scheduling after context completePing Gao1-0/+4
The time based scheduler poll context busy status at every micro-second during vGPU switch, it will make GPU idle for a while when the context is very small and completed before the next micro-second arrival. Trigger scheduling immediately after context complete will eliminate GPU idle and improve performance. Create two vGPU with same type, run Heaven simultaneously: Before this patch: +---------+----------+----------+ | | vGPU1 | vGPU2 | +---------+----------+----------+ | Heaven | 357 | 354 | +-------------------------------+ After this patch: +---------+----------+----------+ | | vGPU1 | vGPU2 | +---------+----------+----------+ | Heaven | 397 | 398 | +-------------------------------+ v2: Let need_reschedule protect by gvt-lock. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Support event based schedulingPing Gao3-6/+18
This patch decouple the time slice calculation and scheduler, let other event be able to trigger scheduling without impact the calculation for QoS. v2: add only one new enum definition. v3: fix typo. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Delete gvt_dbg_cmd() in cmd_parser_exec()Xiong Zhang1-6/+0
Since cmd message have been recorded in trace, gvt_dbg_cmd isn't necessary. This will reduce much of dmesg as gvt_dbg_cmd is repeated on each workload. Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: Change flood gvt dmesg into traceXiong Zhang5-18/+119
Currently gvt dmesg is so heavy at drm.debug=0x2 that guest and host almost couldn't run on xengt. This patch transfer these repeated messages into trace, so dmesg is light at drm.debug=0x2, and user could get the target message through trace event and trace filter. Suggested-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: clean up the unused last_ctx_submit_time of struct intel_vgpuChangbin Du2-2/+0
Clean up it as it is not used now. Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: add RING_INSTDONE and SC_INSTDONE mmio handler in GVT-gWeinan Li2-7/+15
kernel hangcheck needs to check RING_INSTDONE and SC_INSTDONE registers' state to know if hardware is still running. In GVT-g environment, we need to emulate these registers changing for all the guests although they are not render owner. Here we return the physical state for all the guests, then if INSTDONE is changing guest can know hardware is still running although its workload is pending. Read INSTDONE isn't one correct way to know if guest trigger gfx reset, especially with Linux guest, it will read ACTH first, then check INSTDONE and SUBSLICE registers to check if hardware is still running, at last trigger gfx reset when it finds all the registers is frozen. In Windows guest, read INSTDONE usually happens when OS detect TDR. With the difference between Windows and Linux guest, "disable_warn_untrack" may let debug log run into wrong state(Linux guest trigger hangcheck with no ACTHD changed, then check INSTDONE), but actually there is no TDR happened. The new policy is always WARN with untrack MMIO r/w. Bad effect is many noisy untrack mmio warning logs exist when real TDR happen. Even so you can control the log output or not by setting the debug mask bit. v2: remove log in instdone_mmio_read Suggested-by: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Weinan Li <weinan.z.li@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: implement per-vm mmio switching optimizationChangbin Du6-12/+80
Commit ab9da627906a ("drm/i915: make context status notifier head be per engine") gives us a chance to inspect every single request. Then we can eliminate unnecessary mmio switching for same vGPU. We only need mmio switching for different VMs (including host). This patch introduced a new general API intel_gvt_switch_mmio() to replace the old intel_gvt_load/restore_render_mmio(). This function can be further optimized for vGPU to vGPU switching. To support individual ring switch, we track the owner who occupy each ring. When another VM or host request a ring we do the mmio context switching. Otherwise no need to switch the ring. This optimization is very useful if only one guest has plenty of workloads and the host is mostly idle. The best case is no mmio switching will happen. v2: o fix missing ring switch issue. (chuanxiao) o support individual ring switch. Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: refactor function intel_vgpu_submit_execlistChangbin Du1-33/+23
The function intel_vgpu_submit_execlist could be more simpler. It actually does: 1) validate the submission. The first context must be valid, and all two must be privilege_access. 2) submit valid contexts. The first one need emulate schedule_in. We do not need a bitmap, valid desc copy valid_desc. Local variable emulate_schedule_in also can be optimized out. v2: dump desc content in err msg (Zhi Wang) Signed-off-by: Changbin Du <changbin.du@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-06-08drm/i915/gvt: rewrite the trace gvt:gvt_command using trace style approachChangbin Du2-96/+32
The gvt:gvt_command trace involve unnecessary overhead even this trace is not enabled. We need improve it. The kernel trace infrastructure provide a full api to define a trace event. We should leverage them if possible. And one important thing is that a trace point should store raw data but not format string. This patch include two part work: 1) Refactor the gvt_command trace definition, including: o only store raw trace data. o use __dynamic_array() to declare a variable size buffer. o use __print_array() to format raw cmd data. o rename vm_id as vgpu_id. 2) Improve the trace invoking, including: o remove the cycles calculation for handler. We can get this data by any perf tool. o do not make a backup for raw cmd data which just doesn't make sense. With this patch, this trace has no overhead if it is not enabled. And we are trace style now. The final output example: gvt workload 0-211 [000] ...1 120.555964: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e880, raw cmd {0x4000000} gvt workload 0-211 [000] ...1 120.556014: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e884, raw cmd {0x7a000004,0x1004000,0xe1511018,0x0,0x7d,0x0} gvt workload 0-211 [000] ...1 120.556062: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e89c, raw cmd {0x7a000004,0x140000,0x0,0x0,0x0,0x0} gvt workload 0-211 [000] ...1 120.556110: gvt_command: vgpu1 ring 0: buf_type 0, ip_gma e161e8b4, raw cmd {0x10400002,0xe1511018,0x0,0x7d} Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-30Backmerge tag 'v4.12-rc3' into drm-nextDave Airlie3-3/+10
Linux 4.12-rc3 Daniel has requested this for some drm-intel-next work.
2017-05-24drm/i915/gvt: clean up unsubmited workloads before destroying kmem cacheChangbin Du1-10/+20
This is to fix a memory leak issue caused by unfreed gvtg workload objects. Walk through the workload list and free all of the remained workloads before destroying kmem cache. [179.885211] INFO: Object 0xffff9cef10003b80 @offset=7040 [179.885657] kmem_cache_destroy gvt-g_vgpu_workload: Slab cache still has objects [179.886146] CPU: 2 PID: 2318 Comm: win_lucas Tainted: G    B   W       4.11.0+ #1 [179.887223] Call Trace: [179.887394] dump_stack+0x63/0x90 [179.887617] kmem_cache_destroy+0x1cf/0x1e0 [179.887960] intel_vgpu_clean_execlist+0x15/0x20 [i915] [179.888365] intel_gvt_destroy_vgpu+0x4c/0xd0 [i915] [179.888688] intel_vgpu_remove+0x2a/0x30 [kvmgt] [179.888988] mdev_device_remove_ops+0x23/0x50 [mdev] [179.889309] mdev_device_remove+0xe4/0x190 [mdev] [179.889615] remove_store+0x7d/0xb0 [mdev] [179.889885] dev_attr_store+0x18/0x30 [179.890129] sysfs_kf_write+0x37/0x40 [179.890371] kernfs_fop_write+0x107/0x180 [179.890632] __vfs_write+0x37/0x160 [179.890865] ? kmem_cache_alloc+0xd7/0x1b0 [179.891116] ? apparmor_file_permission+0x1a/0x20 [179.891372] ? security_file_permission+0x3b/0xc0 [179.891628] vfs_write+0xb8/0x1b0 [179.891812] SyS_write+0x55/0xc0 [179.891992] entry_SYSCALL_64_fastpath+0x1e/0xad Signed-off-by: Changbin Du <changbin.du@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-23drm/i915/gvt: Disable compression workaround for Gen9Chuanxiao Dong1-9/+21
With enabling this workaround, can observe GPU hang issue on Gen9. As currently host side doesn't have this workaround, disable it from GVT side. v2: - Fix indent error.(Zhenyu) Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-15Merge tag 'gvt-fixes-2017-05-11' of https://github.com/01org/gvt-linux into ↵Jani Nikula3-3/+10
drm-intel-fixes gvt-fixes-2017-05-11 - vGPU scheduler performance regression fix (Ping) - bypass in-context mmio restore (Chuanxiao) - one typo fix (Colin) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170511054736.swpcmnzdoqi75cnl@zhen-hp.sh.intel.com
2017-05-10drm/i915/gvt: avoid unnecessary vgpu switchPing Gao1-2/+6
It's no need to switch vgpu if next vgpu is the same with current vgpu, otherwise it will make performance drop in some case. v2: correct the comments. Signed-off-by: Ping Gao <ping.a.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-08drm/i915/gvt: not to restore in-context mmioChuanxiao Dong1-0/+3
Needn't to restore the in-context MMIO when SCHEDULE_OUT. Sometimes with restoring the in-context MMIO, some GPU hang can be observed. So remove the in-context MMIO restore Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-05drm/i915/gvt: fix typo: "supporte" -> "support"Colin Ian King1-1/+1
trivial fix to typo in WARN_ONCE message Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2017-05-04drm/i915: Use engine->context_pin() to report the intel_ringChris Wilson1-2/+4
Since unifying ringbuffer/execlist submission to use engine->pin_context, we ensure that the intel_ring is available before we start constructing the request. We can therefore move the assignment of the request->ring to the central i915_gem_request_alloc() and not require it in every engine->request_alloc() callback. Another small step towards simplification (of the core, but at a cost of handling error pointers in less important callers of engine->pin_context). v2: Rearrange a few branches to reduce impact of PTR_ERR() on gcc's code generation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170504093308.4137-1-chris@chris-wilson.co.uk