summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2010-03-01KVM: ia64: remove redundant kvm_get_exit_data() NULL testsRoel Kluin1-15/+13
kvm_get_exit_data() cannot return a NULL pointer. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: SVM: Lazy fpu with nptAvi Kivity1-8/+0
Now that we can allow the guest to play with cr0 when the fpu is loaded, we can enable lazy fpu when npt is in use. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: SVM: Selective cr0 interceptAvi Kivity1-6/+26
If two conditions apply: - no bits outside TS and EM differ between the host and guest cr0 - the fpu is active then we can activate the selective cr0 write intercept and drop the unconditional cr0 read and write intercept, and allow the guest to run with the host fpu state. This reduces cr0 exits due to guest fpu management while the guest fpu is loaded. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: SVM: Restore unconditional cr0 intercept under nptAvi Kivity1-22/+7
Currently we don't intercept cr0 at all when npt is enabled. This improves performance but requires us to activate the fpu at all times. Remove this behaviour in preparation for adding selective cr0 intercepts. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: SVM: Initialize fpu_active in init_vmcb()Avi Kivity1-1/+2
init_vmcb() sets up the intercepts as if the fpu is active, so initialize it there. This avoids an INIT from setting up intercepts inconsistent with fpu_active. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: SVM: Fix SVM_CR0_SELECTIVE_MASKAvi Kivity1-1/+1
Instead of selecting TS and MP as the comments say, the macro included TS and PE. Luckily the macro is unused now, but fix in order to save a few hours of debugging from anyone who attempts to use it. Acked-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: Set cr0.et when the guest writes cr0Avi Kivity1-0/+2
Follow the hardware. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: VMX: Give the guest ownership of cr0.ts when the fpu is activeAvi Kivity1-2/+9
If the guest fpu is loaded, there is nothing interesing about cr0.ts; let the guest play with it as it will. This makes context switches between fpu intensive guest processes faster, as we won't trap the clts and cr0 write instructions. [marcelo: fix cr0 read shadow update on fpu deactivation; kills F8 install] Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: Lazify fpu activation and deactivationAvi Kivity5-31/+38
Defer fpu deactivation as much as possible - if the guest fpu is loaded, keep it loaded until the next heavyweight exit (where we are forced to unload it). This reduces unnecessary exits. We also defer fpu activation on clts; while clts signals the intent to use the fpu, we can't be sure the guest will actually use it. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: VMX: Allow the guest to own some cr0 bitsAvi Kivity4-0/+18
We will use this later to give the guest ownership of cr0.ts. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: Replace read accesses of vcpu->arch.cr0 by an accessorAvi Kivity7-27/+38
Since we'd like to allow the guest to own a few bits of cr0 at times, we need to know when we access those bits. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: VMX: trace clts and lmsw instructions as cr accessesAvi Kivity1-1/+4
clts writes cr0.ts; lmsw writes cr0[0:15] - record that in ftrace. Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Make large pages workAlexander Graf2-5/+8
An SLB entry contains two pieces of information related to size: 1) PTE size 2) SLB size The L bit defines the PTE be "large" (usually means 16MB), SLB_VSID_B_1T defines that the SLB should span 1 GB instead of the default 256MB. Apparently I messed things up and just put those two in one box, shaked it heavily and came up with the current code which handles large pages incorrectly, because it also treats large page SLB entries as "1TB" segment entries. This patch splits those two features apart, making Linux guests boot even when they have > 256MB. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Pass through program interruptsAlexander Graf1-0/+1
When we get a program interrupt in guest kernel mode, we try to emulate the instruction. If that doesn't fail, we report to the user and try again - at the exact same instruction pointer. So if the guest kernel really does trigger an invalid instruction, we loop forever. So let's better go and forward program exceptions to the guest when we don't know the instruction we're supposed to emulate. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Pass program interrupt flags to the guestAlexander Graf1-2/+5
When we need to reinject a program interrupt into the guest, we also need to reinject the corresponding flags into the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Fix HID5 setting codeAlexander Graf1-1/+2
The code to unset HID5.dcbz32 is broken. This patch makes it do the right rotate magic. Signed-off-by: Alexander Graf <agraf@suse.de> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Emulate trap SRR1 flags properlyAlexander Graf6-5/+14
Book3S needs some flags in SRR1 to get to know details about an interrupt. One such example is the trap instruction. It tells the guest kernel that a program interrupt is due to a trap using a bit in SRR1. This patch implements above behavior, making WARN_ON behave like WARN_ON. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Call SLB patching code in interrupt safe mannerAlexander Graf9-21/+34
Currently we're racy when doing the transition from IR=1 to IR=0, from the module memory entry code to the real mode SLB switching code. To work around that I took a look at the RTAS entry code which is faced with a similar problem and did the same thing: A small helper in linear mapped memory that does mtmsr with IR=0 and then RFIs info the actual handler. Thanks to that trick we can safely take page faults in the entry code and only need to be really wary of what to do as of the SLB switching part. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Get rid of unnecessary RFIAlexander Graf1-11/+11
Using an RFI in IR=1 is dangerous. We need to set two SRRs and then do an RFI without getting interrupted at all, because every interrupt could potentially overwrite the SRR values. Fortunately, we don't need to RFI in at least this particular case of the code, so we can just replace it with an mtmsr and b. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Implement 'skip instruction' modeAlexander Graf4-6/+59
To fetch the last instruction we were interrupted on, we enable DR in early exit code, where we are still in a very transitional phase between guest and host state. Most of the time this seemed to work, but another CPU can easily flush our TLB and HTAB which makes us go in the Linux page fault handler which totally breaks because we still use the guest's SLB entries. To work around that, let's introduce a second KVM guest mode that defines that whenever we get a trap, we don't call the Linux handler or go into the KVM exit code, but just jump over the faulting instruction. That way a potentially bad lwz doesn't trigger any faults and we can later on interpret the invalid instruction we fetched as "fetch didn't work". Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Use PACA backed shadow vcpuAlexander Graf10-237/+249
We're being horribly racy right now. All the entry and exit code hijacks random fields from the PACA that could easily be used by different code in case we get interrupted, for example by a #MC or even page fault. After discussing this with Ben, we figured it's best to reserve some more space in the PACA and just shove off some vcpu state to there. That way we can drastically improve the readability of the code, make it less racy and less complex. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Add helpers for CR, XERAlexander Graf4-10/+52
We now have helpers for the GPRs, so let's also add some for CR and XER. Having them in the PACA simplifies code a lot, as we don't need to care about where to store CC or not to overflow any integers. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: PPC: Use accessor functions for GPR accessAlexander Graf11-225/+274
All code in PPC KVM currently accesses gprs in the vcpu struct directly. While there's nothing wrong with that wrt the current way gprs are stored and loaded, it doesn't suffice for the PACA acceleration that will follow in this patchset. So let's just create little wrapper inline functions that we call whenever a GPR needs to be read from or written to. The compiled code shouldn't really change at all for now. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: Fix the explanation of write_emulatedTakuya Yoshikawa1-1/+1
The explanation of write_emulated is confused with that of read_emulated. This patch fix it. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: VMX: Enable EPT 1GB page supportSheng Yang3-4/+16
Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: x86: Rename gb_page_enable() to get_lpage_level() in kvm_x86_opsSheng Yang4-8/+10
Then the callback can provide the maximum supported large page level, which is more flexible. Also move the gb page support into x86_64 specific. Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: x86: Moving PT_*_LEVEL to mmu.hSheng Yang2-4/+4
We can use them in x86.c and vmx.c now... Signed-off-by: Sheng Yang <sheng@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: PPC: Enable lightweight exits againAlexander Graf2-53/+57
The PowerPC C ABI defines that registers r14-r31 need to be preserved across function calls. Since our exit handler is written in C, we can make use of that and don't need to reload r14-r31 on every entry/exit cycle. This technique is also used in the BookE code and is called "lightweight exits" there. To follow the tradition, it's called the same in Book3S. So far this optimization was disabled though, as the code didn't do what it was expected to do, but failed to work. This patch fixes and enables lightweight exits again. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: PPC: Fix typo in rebolting codeAlexander Graf1-1/+1
When we're loading bolted entries into the SLB again, we're checking if an entry is in use and only slbmte it when it is. Unfortunately, the check always goes to the skip label of the first entry, resulting in an endless loop when it actually gets triggered. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: avoid taking ioapic mutex for non-ioapic EOIsAvi Kivity2-0/+20
When the guest acknowledges an interrupt, it sends an EOI message to the local apic, which broadcasts it to the ioapic. To handle the EOI, we need to take the ioapic mutex. On large guests, this causes a lot of contention on this mutex. Since large guests usually don't route interrupts via the ioapic (they use msi instead), this is completely unnecessary. Avoid taking the mutex by introducing a handled_vectors bitmap. Before taking the mutex, check if the ioapic was actually responsible for the acked vector. If not, we can return early. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: Fill out ftrace exit reason stringsAvi Kivity1-19/+39
Some exit reasons missed their strings; fill out the table. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: Bump maximum vcpu count to 64Avi Kivity1-1/+1
With slots_lock converted to rcu, the entire kvm hotpath on modern processors (with npt or ept) now scales beautifully. Increase the maximum vcpu count to 64 to reflect this. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: convert slots_lock to a mutexMarcelo Tosatti11-39/+39
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: switch vcpu context to use SRCUMarcelo Tosatti6-38/+45
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: convert io_bus to SRCUMarcelo Tosatti9-75/+101
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: x86: switch kvm_set_memory_alias to SRCU updateMarcelo Tosatti4-11/+63
Using a similar two-step procedure as for memslots. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: use SRCU for dirty logMarcelo Tosatti1-8/+41
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: introduce kvm->srcu and convert kvm_set_memory_region to SRCU updateMarcelo Tosatti8-64/+136
Use two steps for memslot deletion: mark the slot invalid (which stops instantiation of new shadow pages for that slot, but allows destruction), then instantiate the new empty slot. Also simplifies kvm_handle_hva locking. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: use gfn_to_pfn_memslot in kvm_iommu_map_pagesMarcelo Tosatti3-10/+8
So its possible to iommu map a memslot before making it visible to kvm. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: introduce gfn_to_pfn_memslotMarcelo Tosatti2-8/+27
Which takes a memslot pointer instead of using kvm->memslots. To be used by SRCU convertion later. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: split kvm_arch_set_memory_region into prepare and commitMarcelo Tosatti6-47/+82
Required for SRCU convertion later. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: modify alias layout in x86s struct kvm_archMarcelo Tosatti2-7/+22
Have a pointer to an allocated region inside x86's kvm_arch. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: modify memslots layout in struct kvmMarcelo Tosatti8-37/+60
Have a pointer to an allocated region inside struct kvm. [alex: fix ppc book 3s] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: trivial document fixesWu Fengguang1-6/+6
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2010-03-01KVM: powerpc: Change maintainerAlexander Graf1-1/+1
Progress on KVM for Embedded PowerPC has stalled, but for Book3S there's quite a lot of work to do and going on. So in agreement with Hollis and Avi, we should switch maintainers for PowerPC. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: powerpc: Remove AGGRESSIVE_DECAlexander Graf1-15/+1
Because we now emulate the DEC interrupt according to real life behavior, there's no need to keep the AGGRESSIVE_DEC hack around. Let's just remove it. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: powerpc: Improve DEC handlingAlexander Graf4-1/+24
We treated the DEC interrupt like an edge based one. This is not true for Book3s. The DEC keeps firing until mtdec is issued again and thus clears the interrupt line. So let's implement this logic in KVM too. This patch moves the line clearing from the firing of the interrupt to the mtdec emulation. This makes PPC64 guests work without AGGRESSIVE_DEC defined. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: powerpc: Move vector to irqprio resolving to separate functionAlexander Graf1-3/+10
We're using a switch table to find the irqprio that belongs to a specific interrupt vector. This table is part of the interrupt inject logic. Since we'll add a new function to stop interrupts, let's move this table out of the injection logic into a separate function. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Acked-by: Hollis Blanchard <hollis@penguinppc.org> Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: Simplify coalesced mmio initializationAvi Kivity3-8/+34
- add destructor function - move related allocation into constructor - add stubs for !CONFIG_KVM_MMIO Signed-off-by: Avi Kivity <avi@redhat.com>
2010-03-01KVM: Add KVM_MMIO kconfig itemAvi Kivity4-0/+6
s390 doesn't have mmio, this will simplify ifdefing it out. Signed-off-by: Avi Kivity <avi@redhat.com>