summaryrefslogtreecommitdiff
path: root/include/linux/kvm_host.h
AgeCommit message (Collapse)AuthorFilesLines
2012-10-30KVM: do not treat noslot pfn as a error pfnXiao Guangrong1-8/+20
This patch filters noslot pfn out from error pfns based on Marcelo comment: noslot pfn is not a error pfn After this patch, - is_noslot_pfn indicates that the gfn is not in slot - is_error_pfn indicates that the gfn is in slot but the error is occurred when translate the gfn to pfn - is_error_noslot_pfn indicates that the pfn either it is error pfns or it is noslot pfn And is_invalid_pfn can be removed, it makes the code more clean Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-30kvm: Directly account vtime to system on guest switchFrederic Weisbecker1-2/+10
Switching to or from guest context is done on ioctl context. So by the time we call kvm_guest_enter() or kvm_guest_exit() we know we are not running the idle task. As a result, we can directly account the cputime using vtime_account_system(). There are two good reasons to do this: * We avoid some useless checks on guest switch. It optimizes a bit this fast path. * In the case of CONFIG_IRQ_TIME_ACCOUNTING, calling vtime_account() checks for irq time to account. This is pointless since we know we are not in an irq on guest switch. This is wasting cpu cycles for no good reason. vtime_account_system() OTOH is a no-op in this config option. * We can remove the irq disable/enable around kvm guest switch in s390. A further optimization may consist in introducing a vtime_account_guest() that directly calls account_guest_time(). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Joerg Roedel <joerg.roedel@amd.com> Cc: Alexander Graf <agraf@suse.de> Cc: Xiantao Zhang <xiantao.zhang@intel.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
2012-10-23KVM: Take kvm instead of vcpu to mmu_notifier_retryChristoffer Dall1-3/+3
The mmu_notifier_retry is not specific to any vcpu (and never will be) so only take struct kvm as a parameter. The motivation is the ARM mmu code that needs to call this from somewhere where we long let go of the vcpu pointer. Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-10-11Merge branch 'for-upstream' of http://github.com/agraf/linux-2.6 into queueMarcelo Tosatti1-0/+1
* 'for-upstream' of http://github.com/agraf/linux-2.6: (56 commits) arch/powerpc/kvm/e500_tlb.c: fix error return code KVM: PPC: Book3S HV: Provide a way for userspace to get/set per-vCPU areas KVM: PPC: Book3S: Get/set guest FP regs using the GET/SET_ONE_REG interface KVM: PPC: Book3S: Get/set guest SPRs using the GET/SET_ONE_REG interface KVM: PPC: set IN_GUEST_MODE before checking requests KVM: PPC: e500: MMU API: fix leak of shared_tlb_pages KVM: PPC: e500: fix allocation size error on g2h_tlb1_map KVM: PPC: Book3S HV: Fix calculation of guest phys address for MMIO emulation KVM: PPC: Book3S HV: Remove bogus update of physical thread IDs KVM: PPC: Book3S HV: Fix updates of vcpu->cpu KVM: Move some PPC ioctl definitions to the correct place KVM: PPC: Book3S HV: Handle memory slot deletion and modification correctly KVM: PPC: Move kvm->arch.slot_phys into memslot.arch KVM: PPC: Book3S HV: Take the SRCU read lock before looking up memslots KVM: PPC: bookehv: Allow duplicate calls of DO_KVM macro KVM: PPC: BookE: Support FPU on non-hv systems KVM: PPC: 440: Implement mfdcrx KVM: PPC: 440: Implement mtdcrx Document IACx/DACx registers access using ONE_REG API KVM: PPC: E500: Remove E500_TLB_DIRTY flag ...
2012-10-08KVM: x86: Convert kvm_arch_vcpu_reset into private kvm_vcpu_resetJan Kiszka1-1/+0
There are no external callers of this function as there is no concept of resetting a vcpu from generic code. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-10-06KVM: PPC: booke: Add watchdog emulationBharat Bhushan1-0/+1
This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> [bharat.bhushan@freescale.com: reworked patch] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> [agraf: adjust to new request framework] Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-04Merge tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-29/+116
Pull KVM updates from Avi Kivity: "Highlights of the changes for this release include support for vfio level triggered interrupts, improved big real mode support on older Intels, a streamlines guest page table walker, guest APIC speedups, PIO optimizations, better overcommit handling, and read-only memory." * tag 'kvm-3.7-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (138 commits) KVM: s390: Fix vcpu_load handling in interrupt code KVM: x86: Fix guest debug across vcpu INIT reset KVM: Add resampling irqfds for level triggered interrupts KVM: optimize apic interrupt delivery KVM: MMU: Eliminate pointless temporary 'ac' KVM: MMU: Avoid access/dirty update loop if all is well KVM: MMU: Eliminate eperm temporary KVM: MMU: Optimize is_last_gpte() KVM: MMU: Simplify walk_addr_generic() loop KVM: MMU: Optimize pte permission checks KVM: MMU: Update accessed and dirty bits after guest pagetable walk KVM: MMU: Move gpte_access() out of paging_tmpl.h KVM: MMU: Optimize gpte_access() slightly KVM: MMU: Push clean gpte write protection out of gpte_access() KVM: clarify kvmclock documentation KVM: make processes waiting on vcpu mutex killable KVM: SVM: Make use of asm.h KVM: VMX: Make use of asm.h KVM: VMX: Make lto-friendly KVM: x86: lapic: Clean up find_highest_vector() and count_vectors() ... Conflicts: arch/s390/include/asm/processor.h arch/x86/kvm/i8259.c
2012-09-25cputime: Use a proper subsystem naming for vtime related APIsFrederic Weisbecker1-2/+2
Use a naming based on vtime as a prefix for virtual based cputime accounting APIs: - account_system_vtime() -> vtime_account() - account_switch_vtime() -> vtime_task_switch() It makes it easier to allow for further declension such as vtime_account_system(), vtime_account_idle(), ... if we want to find out the context we account to from generic code. This also make it better to know on which subsystem these APIs refer to. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>
2012-09-23KVM: Add resampling irqfds for level triggered interruptsAlex Williamson1-1/+4
To emulate level triggered interrupts, add a resample option to KVM_IRQFD. When specified, a new resamplefd is provided that notifies the user when the irqchip has been resampled by the VM. This may, for instance, indicate an EOI. Also in this mode, posting of an interrupt through an irqfd only asserts the interrupt. On resampling, the interrupt is automatically de-asserted prior to user notification. This enables level triggered interrupts to be posted and re-enabled from vfio with no userspace intervention. All resampling irqfds can make use of a single irq source ID, so we reserve a new one for this interface. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-17KVM: make processes waiting on vcpu mutex killableMichael S. Tsirkin1-1/+1
vcpu mutex can be held for unlimited time so taking it with mutex_lock on an ioctl is wrong: one process could be passed a vcpu fd and call this ioctl on the vcpu used by another process, it will then be unkillable until the owner exits. Call mutex_lock_killable instead and return status. Note: mutex_lock_interruptible would be even nicer, but I am not sure all users are prepared to handle EINTR from these ioctls. They might misinterpret it as an error. Cleanup paths expect a vcpu that can't be used by any userspace so this will always succeed - catch bugs by calling BUG_ON. Catch callers that don't check return state by adding __must_check. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-09-06KVM: split kvm_arch_flush_shadowMarcelo Tosatti1-1/+5
Introducing kvm_arch_flush_shadow_memslot, to invalidate the translations of a single memory slot. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-27KVM: PPC: book3s: fix build error caused by gfn_to_hva_memslot()Gavin Shan1-0/+6
The build error was caused by that builtin functions are calling the functions implemented in modules. This error was introduced by commit 4d8b81abc4 ("KVM: introduce readonly memslot"). The patch fixes the build error by moving function __gfn_to_hva_memslot() from kvm_main.c to kvm_host.h and making that "inline" so that the builtin function (kvmppc_h_enter) can use that. Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-08-22KVM: introduce readonly memslotXiao Guangrong1-6/+1
In current code, if we map a readonly memory space from host to guest and the page is not currently mapped in the host, we will get a fault pfn and async is not allowed, then the vm will crash We introduce readonly memory region to map ROM/ROMD to the guest, read access is happy for readonly memslot, write access on readonly memslot will cause KVM_EXIT_MMIO exit Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-22KVM: introduce KVM_HVA_ERR_RO_BADXiao Guangrong1-2/+3
In the later patch, it indicates failure when we try to get a writable hva from the readonly memslot Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-22KVM: introduce KVM_HVA_ERR_BADXiao Guangrong1-1/+7
Then, remove bad_hva and inline kvm_is_error_hva Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-22KVM: introduce KVM_PFN_ERR_RO_FAULTXiao Guangrong1-0/+1
In the later patch, it indicates failure when we try to get a writable pfn from the readonly memslot Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-22KVM: introduce gfn_to_pfn_memslot_atomicXiao Guangrong1-1/+2
It can instead of hva_to_pfn_atomic Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-22KVM: hide KVM_MEMSLOT_INVALID from userspaceXiao Guangrong1-0/+7
Quote Avi's comment: | KVM_MEMSLOT_INVALID is actually an internal symbol, not used by | userspace. Please move it to kvm_host.h. Also, we divide the memlsot->flags into two parts, the lower 16 bits are visible for userspace, the higher 16 bits are internally used in kvm Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: let the error pfn not depend on error codeXiao Guangrong1-9/+15
Currently, we use the error code as error pfn to indicat the error condition, it is not straightforward and it will not work on PAE 32-bit cpu with huge memory, since the valid physical address can be at most 52 bits For the normal pfn, the highest 12 bits should be zero, so we can mask these bits to indicate the error. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: do not release the error pageXiao Guangrong1-1/+1
After commit a2766325cf9f9, the error page is replaced by the error code, it need not be released anymore [ The patch has been compiling tested for powerpc ] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: introduce KVM_ERR_PTR_BAD_PAGEXiao Guangrong1-2/+7
It is used to eliminate the overload of function call and cleanup the code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: remove the unused declareXiao Guangrong1-2/+0
Remove it since it is not used anymore Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: inline is_*_pfn functionsXiao Guangrong1-3/+16
These functions are exported and can not inline, move them to kvm_host.h to eliminate the overload of function call Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: introduce KVM_PFN_ERR_BADXiao Guangrong1-0/+1
Then, remove get_bad_pfn Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: introduce KVM_PFN_ERR_HWPOISONXiao Guangrong1-1/+1
Then, get_hwpoison_pfn and is_hwpoison_pfn can be removed Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: introduce KVM_PFN_ERR_FAULTXiao Guangrong1-1/+2
After that, the exported and un-inline function, get_fault_pfn, can be removed Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-08-06KVM: Push rmap into kvm_arch_memory_slotTakuya Yoshikawa1-1/+0
Two reasons: - x86 can integrate rmap and rmap_pde and remove heuristics in __gfn_to_rmap(). - Some architectures do not need rmap. Since rmap is one of the most memory consuming stuff in KVM, ppc'd better restrict the allocation to Book3S HV. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-26KVM: Move KVM_IRQ_LINE to arch-generic codeChristoffer Dall1-0/+1
Handle KVM_IRQ_LINE and KVM_IRQ_LINE_STATUS in the generic kvm_vm_ioctl() function and call into kvm_vm_ioctl_irq_line(). This is even more relevant when KVM/ARM also uses this ioctl. Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-26KVM: remove dummy pagesXiao Guangrong1-1/+2
Currently, kvm allocates some pages and use them as error indicators, it wastes memory and is not good for scalability Base on Avi's suggestion, we use the error codes instead of these pages to indicate the error conditions Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-26Merge branch 'queue' into nextAvi Kivity1-12/+50
Merge patches queued during the run-up to the merge window. * queue: (25 commits) KVM: Choose better candidate for directed yield KVM: Note down when cpu relax intercepted or pause loop exited KVM: Add config to support ple or cpu relax optimzation KVM: switch to symbolic name for irq_states size KVM: x86: Fix typos in pmu.c KVM: x86: Fix typos in lapic.c KVM: x86: Fix typos in cpuid.c KVM: x86: Fix typos in emulate.c KVM: x86: Fix typos in x86.c KVM: SVM: Fix typos KVM: VMX: Fix typos KVM: remove the unused parameter of gfn_to_pfn_memslot KVM: remove is_error_hpa KVM: make bad_pfn static to kvm_main.c KVM: using get_fault_pfn to get the fault pfn KVM: MMU: track the refcount when unmap the page KVM: x86: remove unnecessary mark_page_dirty KVM: MMU: Avoid handling same rmap_pde in kvm_handle_hva_range() KVM: MMU: Push trace_kvm_age_page() into kvm_age_rmapp() KVM: MMU: Add memslot parameter to hva handlers ... Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-24Merge tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-9/+18
Pull KVM updates from Avi Kivity: "Highlights include - full big real mode emulation on pre-Westmere Intel hosts (can be disabled with emulate_invalid_guest_state=0) - relatively small ppc and s390 updates - PCID/INVPCID support in guests - EOI avoidance; 3.6 guests should perform better on 3.6 hosts on interrupt intensive workloads) - Lockless write faults during live migration - EPT accessed/dirty bits support for new Intel processors" Fix up conflicts in: - Documentation/virtual/kvm/api.txt: Stupid subchapter numbering, added next to each other. - arch/powerpc/kvm/booke_interrupts.S: PPC asm changes clashing with the KVM fixes - arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c: Duplicated commits through the kvm tree and the s390 tree, with subsequent edits in the KVM tree. * tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits) KVM: fix race with level interrupts x86, hyper: fix build with !CONFIG_KVM_GUEST Revert "apic: fix kvm build on UP without IOAPIC" KVM guest: switch to apic_set_eoi_write, apic_write apic: add apic_set_eoi_write for PV use KVM: VMX: Implement PCID/INVPCID for guests with EPT KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check KVM: PPC: Critical interrupt emulation support KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests KVM: PPC64: booke: Set interrupt computation mode for 64-bit host KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt KVM: PPC: bookehv64: Add support for std/ld emulation. booke: Added crit/mc exception handler for e500v2 booke/bookehv: Add host crit-watchdog exception support KVM: MMU: document mmu-lock and fast page fault KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint KVM: MMU: trace fast page fault KVM: MMU: fast path of handling guest page fault KVM: MMU: introduce SPTE_MMU_WRITEABLE bit KVM: MMU: fold tlb flush judgement into mmu_spte_update ...
2012-07-23KVM: Choose better candidate for directed yieldRaghavendra K T1-0/+5
Currently, on a large vcpu guests, there is a high probability of yielding to the same vcpu who had recently done a pause-loop exit or cpu relax intercepted. Such a yield can lead to the vcpu spinning again and hence degrade the performance. The patchset keeps track of the pause loop exit/cpu relax interception and gives chance to a vcpu which: (a) Has not done pause loop exit or cpu relax intercepted at all (probably he is preempted lock-holder) (b) Was skipped in last iteration because it did pause loop exit or cpu relax intercepted, and probably has become eligible now (next eligible lock holder) Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-23KVM: Note down when cpu relax intercepted or pause loop exitedRaghavendra K T1-0/+34
Noting pause loop exited vcpu or cpu relax intercepted helps in filtering right candidate to yield. Wrong selection of vcpu; i.e., a vcpu that just did a pl-exit or cpu relax intercepted may contribute to performance degradation. Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Rik van Riel <riel@redhat.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> # on s390x Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-20KVM: remove the unused parameter of gfn_to_pfn_memslotXiao Guangrong1-3/+2
The parameter, 'kvm', is not used in gfn_to_pfn_memslot, we can happily remove it Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-20KVM: remove is_error_hpaXiao Guangrong1-4/+0
Remove them since they are not used anymore Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-20KVM: make bad_pfn static to kvm_main.cXiao Guangrong1-1/+0
bad_pfn is not used out of kvm_main.c, so mark it static, also move it near hwpoison_pfn and fault_pfn Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-20KVM: using get_fault_pfn to get the fault pfnXiao Guangrong1-4/+1
Using get_fault_pfn to cleanup the code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-18KVM: Introduce hva_to_gfn_memslot() for kvm_handle_hva()Takuya Yoshikawa1-0/+8
This restricts hva handling in mmu code and makes it easier to extend kvm_handle_hva() so that it can treat a range of addresses later in this patch series. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Cc: Alexander Graf <agraf@suse.de> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-03KVM: Guard mmu_notifier specific code with CONFIG_MMU_NOTIFIERMarc Zyngier1-2/+2
In order to avoid compilation failure when KVM is not compiled in, guard the mmu_notifier specific sections with both CONFIG_MMU_NOTIFIER and KVM_ARCH_WANT_MMU_NOTIFIER, like it is being done in the rest of the KVM code. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-07-03KVM: Pass kvm_irqfd to functionsAlex Williamson1-2/+2
Prune this down to just the struct kvm_irqfd so we can avoid changing function definition for every flag or field we use. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-06-18KVM: use KVM_CAP_IRQ_ROUTING to protect the routing related codeMarc Zyngier1-1/+1
The KVM code sometimes uses CONFIG_HAVE_KVM_IRQCHIP to protect code that is related to IRQ routing, which not all in-kernel irqchips may support. Use KVM_CAP_IRQ_ROUTING instead. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-06KVM: Cleanup the kvm_print functions and introduce pr_XX wrappersChristoffer Dall1-6/+12
Introduces a couple of print functions, which are essentially wrappers around standard printk functions, with a KVM: prefix. Functions introduced or modified are: - kvm_err(fmt, ...) - kvm_info(fmt, ...) - kvm_debug(fmt, ...) - kvm_pr_unimpl(fmt, ...) - pr_unimpl(vcpu, fmt, ...) -> vcpu_unimpl(vcpu, fmt, ...) Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-05KVM: Avoid wasting pages for small lpage_info arraysTakuya Yoshikawa1-0/+3
lpage_info is created for each large level even when the memory slot is not for RAM. This means that when we add one slot for a PCI device, we end up allocating at least KVM_NR_PAGE_SIZES - 1 pages by vmalloc(). To make things worse, there is an increasing number of devices which would result in more pages being wasted this way. This patch mitigates this problem by using kvm_kvzalloc(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-05-16KVM: MMU: Don't use RCU for lockless shadow walkingAvi Kivity1-1/+2
Using RCU for lockless shadow walking can increase the amount of memory in use by the system, since RCU grace periods are unpredictable. We also have an unconditional write to a shared variable (reader_counter), which isn't good for scaling. Replace that with a scheme similar to x86's get_user_pages_fast(): disable interrupts during lockless shadow walk to force the freer (kvm_mmu_commit_zap_page()) to wait for the TLB flush IPI to find the processor with interrupts enabled. We also add a new vcpu->mode, READING_SHADOW_PAGE_TABLES, to prevent kvm_flush_remote_tlbs() from avoiding the IPI. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-05-01KVM: s390: Implement the directed yield (diag 9c) hypervisor call for KVMKonstantin Weitz1-0/+1
This patch implements the directed yield hypercall found on other System z hypervisors. It delegates execution time to the virtual cpu specified in the instruction's parameter. Useful to avoid long spinlock waits in the guest. Christian Borntraeger: moved common code in virt/kvm/ Signed-off-by: Konstantin Weitz <WEITZKON@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-24KVM: Introduce direct MSI message injection for in-kernel irqchipsJan Kiszka1-0/+2
Currently, MSI messages can only be injected to in-kernel irqchips by defining a corresponding IRQ route for each message. This is not only unhandy if the MSI messages are generated "on the fly" by user space, IRQ routes are a limited resource that user space has to manage carefully. By providing a direct injection path, we can both avoid using up limited resources and simplify the necessary steps for user land. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-04-20KVM: Fix page-crossing MMIOAvi Kivity1-4/+27
MMIO that are split across a page boundary are currently broken - the code does not expect to be aborted by the exit to userspace for the first MMIO fragment. This patch fixes the problem by generalizing the current code for handling 16-byte MMIOs to handle a number of "fragments", and changes the MMIO code to create those fragments. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-20Merge branch 'linus' into queueMarcelo Tosatti1-0/+6
Merge reason: development work has dependency on kvm patches merged upstream. Conflicts: Documentation/feature-removal-schedule.txt Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-12KVM: unmap pages from the iommu when slots are removedAlex Williamson1-0/+6
We've been adding new mappings, but not destroying old mappings. This can lead to a page leak as pages are pinned using get_user_pages, but only unpinned with put_page if they still exist in the memslots list on vm shutdown. A memslot that is destroyed while an iommu domain is enabled for the guest will therefore result in an elevated page reference count that is never cleared. Additionally, without this fix, the iommu is only programmed with the first translation for a gpa. This can result in peer-to-peer errors if a mapping is destroyed and replaced by a new mapping at the same gpa as the iommu will still be pointing to the original, pinned memory address. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2012-04-08KVM: Remove unused dirty_bitmap_head and nr_dirty_pagesTakuya Yoshikawa1-2/+0
Now that we do neither double buffering nor heuristic selection of the write protection method these are not needed anymore. Note: some drivers have their own implementation of set_bit_le() and making it generic needs a bit of work; so we use test_and_set_bit_le() and will later replace it with generic set_bit_le(). Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>