summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2014-12-10x86, kvm: Clear paravirt_enabled on KVM guests for espfix32's benefitAndy Lutomirski2-2/+8
paravirt_enabled has the following effects: - Disables the F00F bug workaround warning. There is no F00F bug workaround any more because Linux's standard IDT handling already works around the F00F bug, but the warning still exists. This is only cosmetic, and, in any event, there is no such thing as KVM on a CPU with the F00F bug. - Disables 32-bit APM BIOS detection. On a KVM paravirt system, there should be no APM BIOS anyway. - Disables tboot. I think that the tboot code should check the CPUID hypervisor bit directly if it matters. - paravirt_enabled disables espfix32. espfix32 should *not* be disabled under KVM paravirt. The last point is the purpose of this patch. It fixes a leak of the high 16 bits of the kernel stack address on 32-bit KVM paravirt guests. Fixes CVE-2014-8134. Cc: stable@vger.kernel.org Suggested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05KVM: cpuid: recompute CPUID 0xD.0:EBX,ECXRadim Krčmář1-0/+2
We reused host EBX and ECX, but KVM might not support all features; emulated XSAVE size should be smaller. EBX depends on unknown XCR0, so we default to ECX. SDM CPUID (EAX = 0DH, ECX = 0): EBX Bits 31-00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) required by enabled features in XCR0. May be different than ECX if some features at the end of the XSAVE save area are not enabled. ECX Bit 31-00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) of the XSAVE/XRSTOR save area required by all supported features in the processor, i.e all the valid bit fields in XCR0. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05kvm: vmx: add nested virtualization support for xsavesWanpeng Li1-1/+22
Add nested virtualization support for xsaves. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05kvm: vmx: add MSR logic for XSAVESWanpeng Li2-1/+29
Add logic to get/set the XSS model-specific register. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05kvm: x86: handle XSAVES vmcs and vmexitWanpeng Li3-1/+28
Initialize the XSS exit bitmap. It is zero so there should be no XSAVES or XRSTORS exits. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05KVM: cpuid: mask more bits in leaf 0xd and subleavesPaolo Bonzini1-2/+8
- EAX=0Dh, ECX=1: output registers EBX/ECX/EDX are reserved. - EAX=0Dh, ECX>1: output register ECX bit 0 is clear for all the CPUID leaves we support, because variable "supported" comes from XCR0 and not XSS. Bits above 0 are reserved, so ECX is overall zero. Output register EDX is reserved. Source: Intel Architecture Instruction Set Extensions Programming Reference, ref. number 319433-022 Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05KVM: cpuid: set CPUID(EAX=0xd,ECX=1).EBX correctlyPaolo Bonzini1-6/+16
This is the size of the XSAVES area. This starts providing guest support for XSAVES (with no support yet for supervisor states, i.e. XSS == 0 always in guests for now). Wanpeng Li suggested testing XSAVEC as well as XSAVES, since in practice no real processor exists that only has one of them, and there is no other way for userspace programs to compute the area of the XSAVEC save area. CPUID(EAX=0xd,ECX=1).EBX provides an upper bound. Suggested-by: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Tested-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05kvm: x86: Add kvm_x86_ops hook that enables XSAVES for guestWanpeng Li5-1/+17
Expose the XSAVES feature to the guest if the kvm_x86_ops say it is available. Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05KVM: x86: use F() macro throughout cpuid.cPaolo Bonzini1-7/+7
For code that deals with cpuid, this makes things a bit more readable. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05KVM: x86: support XSAVES usage in the hostPaolo Bonzini1-7/+83
Userspace is expecting non-compacted format for KVM_GET_XSAVE, but struct xsave_struct might be using the compacted format. Convert in order to preserve userspace ABI. Likewise, userspace is passing non-compacted format for KVM_SET_XSAVE but the kernel will pass it to XRSTORS, and we need to convert back. Fixes: f31a9f7c71691569359fa7fb8b0acaa44bce0324 Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: stable@vger.kernel.org Cc: H. Peter Anvin <hpa@linux.intel.com> Tested-by: Nadav Amit <namit@cs.technion.ac.il> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05x86: export get_xsave_addrPaolo Bonzini1-0/+1
get_xsave_addr is the API to access XSAVE states, and KVM would like to use it. Export it. Cc: stable@vger.kernel.org Cc: x86@kernel.org Cc: H. Peter Anvin <hpa@linux.intel.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-05Merge tag 'kvm-s390-next-20141204' of ↵Paolo Bonzini1-15/+18
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixups for kvm/next (3.19) Here we have two fixups of the latest interrupt rework and one architectural fixup.
2014-12-04KVM: s390: clean up return code handling in irq delivery codeJens Freimann1-13/+13
Instead of returning a possibly random or'ed together value, let's always return -EFAULT if rc is set. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04KVM: s390: use atomic bitops to access pending_irqs bitmapJens Freimann1-2/+2
Currently we use a mixture of atomic/non-atomic bitops and the local_int spin lock to protect the pending_irqs bitmap and interrupt payload data. We need to use atomic bitops for the pending_irqs bitmap everywhere and in addition acquire the local_int lock where interrupt data needs to be protected. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04KVM: s390: some ext irqs have to clear the ext cpu addrDavid Hildenbrand1-0/+3
The cpu address of a source cpu (responsible for an external irq) is only to be stored if bit 6 of the ext irq code is set. If bit 6 is not set, it is to be zeroed out. The special external irq code used for virtio and pfault uses the cpu addr as a parameter field. As bit 6 is set, this implementation is correct. Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-12-04KVM: x86: allow 256 logical x2APICs againRadim Krčmář2-7/+6
While fixing an x2apic bug, 17d68b7 KVM: x86: fix guest-initiated crash with x2apic (CVE-2013-6376) we've made only one cluster available. This means that the amount of logically addressible x2APICs was reduced to 16 and VCPUs kept overwriting themselves in that region, so even the first cluster wasn't set up correctly. This patch extends x2APIC support back to the logical_map's limit, and keeps the CVE fixed as messages for non-present APICs are dropped. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: check bounds of APIC mapsRadim Krčmář1-4/+5
They can't be violated now, but play it safe for the future. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: fix APIC physical destination wrappingRadim Krčmář1-1/+4
x2apic allows destinations > 0xff and we don't want them delivered to lower APICs. They are correctly handled by doing nothing. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: deliver phys lowest-prioRadim Krčmář1-2/+0
Physical mode can't address more than one APIC, but lowest-prio is allowed, so we just reuse our paths. SDM 10.6.2.1 Physical Destination: Also, for any non-broadcast IPI or I/O subsystem initiated interrupt with lowest priority delivery mode, software must ensure that APICs defined in the interrupt address are present and enabled to receive interrupts. We could warn on top of that. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: don't retry hopeless APIC deliveryRadim Krčmář1-2/+2
False from kvm_irq_delivery_to_apic_fast() means that we don't handle it in the fast path, but we still return false in cases that were perfectly handled, fix that. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: use MSR_ICR instead of a numberRadim Krčmář1-2/+2
0x830 MSR is 0x300 xAPIC MMIO, which is MSR_ICR. Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: Fix reserved x2apic registersNadav Amit1-0/+9
x2APIC has no registers for DFR and ICR2 (see Intel SDM 10.12.1.2 "x2APIC Register Address Space"). KVM needs to cause #GP on such accesses. Fix it (DFR and ICR2 on read, ICR2 on write, DFR already handled on writes). Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-04KVM: x86: Generate #UD when memory operand is requiredNadav Amit1-4/+34
Certain x86 instructions that use modrm operands only allow memory operand (i.e., mod012), and cause a #UD exception otherwise. KVM ignores this fact. Currently, the instructions that are such and are emulated by KVM are MOVBE, MOVNTPS, MOVNTPD and MOVNTI. MOVBE is the most blunt example, since it may be emulated by the host regardless of MMIO. The fix introduces a new group for handling such instructions, marking mod3 as illegal instruction. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-12-03Merge tag 'kvm-s390-next-20141128' of ↵Paolo Bonzini9-399/+872
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Several fixes,cleanups and reworks Here is a bunch of fixes that deal mostly with architectural compliance: - interrupt priorities - interrupt handling - intruction exit handling We also provide a helper function for getting the guest visible storage key.
2014-11-28KVM: s390: allow injecting all kinds of machine checksJens Freimann1-3/+11
Allow to specify CR14, logout area, external damage code and failed storage address. Since more then one machine check can be indicated to the guest at a time we need to combine all indication bits with already pending requests. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: handle pending local interrupts via bitmapJens Freimann6-282/+380
This patch adapts handling of local interrupts to be more compliant with the z/Architecture Principles of Operation and introduces a data structure which allows more efficient handling of interrupts. * get rid of li->active flag, use bitmap instead * Keep interrupts in a bitmap instead of a list * Deliver interrupts in the order of their priority as defined in the PoP * Use a second bitmap for sigp emergency requests, as a CPU can have one request pending from every other CPU in the system. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: add bitmap for handling cpu-local interruptsJens Freimann1-0/+86
Adds a bitmap to the vcpu structure which is used to keep track of local pending interrupts. Also add enum with all interrupt types sorted in order of priority (highest to lowest) Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: refactor interrupt delivery codeJens Freimann1-177/+282
Move delivery code for cpu-local interrupt from the huge do_deliver_interrupt() to smaller functions which handle one type of interrupt. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: add defines for virtio and pfault interrupt codeJens Freimann1-2/+4
Get rid of open coded value for virtio and pfault completion interrupts. Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: external param not valid for cpu timer and ckcDavid Hildenbrand1-4/+2
The 32bit external interrupt parameter is only valid for timing-alert and service-signal interrupts. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: refactor interrupt injection codeJens Freimann1-54/+167
In preparation for the rework of the local interrupt injection code, factor out injection routines from kvm_s390_inject_vcpu(). Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: S390: Create helper function get_guest_storage_keyJason J. Herne2-0/+40
Define get_guest_storage_key which can be used to get the value of a guest storage key. This compliments the functionality provided by the helper function set_guest_storage_key. Both functions are needed for live migration of s390 guests that use storage keys. Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: trigger the right CPU exit for floating interruptsChristian Borntraeger1-1/+11
When injecting a floating interrupt and no CPU is idle we kick one CPU to do an external exit. In case of I/O we should trigger an I/O exit instead. This does not matter for Linux guests as external and I/O interrupts are enabled/disabled at the same time, but play safe anyway. The same holds true for machine checks. Since there is no special exit, just reuse the generic stop exit. The injection code inside the VCPU loop will recheck anyway and rearm the proper exits (e.g. control registers) if necessary. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
2014-11-28KVM: s390: Fix rewinding of the PSW pointing to an EXECUTE instructionThomas Huth4-13/+23
A couple of our interception handlers rewind the PSW to the beginning of the instruction to run the intercepted instruction again during the next SIE entry. This normally works fine, but there is also the possibility that the instruction did not get run directly but via an EXECUTE instruction. In this case, the PSW does not point to the instruction that caused the interception, but to the EXECUTE instruction! So we've got to rewind the PSW to the beginning of the EXECUTE instruction instead. This is now accomplished with a new helper function kvm_s390_rewind_psw(). Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-28KVM: s390: Small fixes for the PFMF handlerThomas Huth1-4/+7
This patch includes two small fixes for the PFMF handler: First, the start address for PFMF has to be masked according to the current addressing mode, which is now done with kvm_s390_logical_to_effective(). Second, the protection exceptions have a lower priority than the specification exceptions, so the check for low-address protection has to be moved after the last spot where we inject a specification exception. Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2014-11-24kvm: x86: avoid warning about potential shift wrapping bugPaolo Bonzini3-3/+3
cs.base is declared as a __u64 variable and vector is a u32 so this causes a static checker warning. The user indeed can set "sipi_vector" to any u32 value in kvm_vcpu_ioctl_x86_set_vcpu_events(), but the value should really have 8-bit precision only. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-24KVM: x86: move device assignment out of kvm_host.hPaolo Bonzini5-33/+64
Create a new header, and hide the device assignment functions there. Move struct kvm_assigned_dev_kernel to assigned-dev.c by modifying arch/x86/kvm/iommu.c to take a PCI device struct. Based on a patch by Radim Krcmar <rkrcmark@redhat.com>. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23kvm: x86: mask out XSAVESPaolo Bonzini1-1/+10
This feature is not supported inside KVM guests yet, because we do not emulate MSR_IA32_XSS. Mask it out. Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-23kvm: x86: move assigned-dev.c and iommu.c to arch/x86/Radim Krčmář5-2/+1409
Now that ia64 is gone, we can hide deprecated device assignment in x86. Notable changes: - kvm_vm_ioctl_assigned_device() was moved to x86/kvm_arch_vm_ioctl() The easy parts were removed from generic kvm code, remaining - kvm_iommu_(un)map_pages() would require new code to be moved - struct kvm_assigned_dev_kernel depends on struct kvm_irq_ack_notifier Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21kvm: remove CONFIG_X86 #ifdefs from files formerly shared with ia64Radim Krcmar2-24/+2
Signed-off-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-21kvm: x86: move ioapic.c and irq_comm.c back to arch/x86/Paolo Bonzini6-3/+1150
ia64 does not need them anymore. Ack notifiers become x86-specific too. Suggested-by: Gleb Natapov <gleb@kernel.org> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-20KVM: ia64: removePaolo Bonzini27-13235/+0
KVM for ia64 has been marked as broken not just once, but twice even, and the last patch from the maintainer is now roughly 5 years old. Time for it to rest in peace. Acked-by: Gleb Natapov <gleb@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Remove FIXMEs in emulate.cNicholas Krause1-4/+0
Remove FIXME comments about needing fault addresses to be returned. These are propaagated from walk_addr_generic to gva_to_gpa and from there to ops->read_std and ops->write_std. Signed-off-by: Nicholas Krause <xerofoify@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: emulator: remove duplicated limit checkPaolo Bonzini1-9/+4
The check on the higher limit of the segment, and the check on the maximum accessible size, is the same for both expand-up and expand-down segments. Only the computation of "lim" varies. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: emulator: remove code duplication in register_address{,_increment}Paolo Bonzini1-12/+11
register_address has been a duplicate of address_mask ever since the ancestor of __linearize was born in 90de84f50b42 (KVM: x86 emulator: preserve an operand's segment identity, 2010-11-17). However, we can put it to a better use by including the call to reg_read in register_address. Similarly, the call to reg_rmw can be moved to register_address_increment. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Move __linearize masking of la into switchNadav Amit1-2/+1
In __linearize there is check of the condition whether to check if masking of the linear address is needed. It occurs immediately after switch that evaluates the same condition. Merge them. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Non-canonical access using SS should cause #SSNadav Amit1-1/+1
When SS is used using a non-canonical address, an #SS exception is generated on real hardware. KVM emulator causes a #GP instead. Fix it to behave as real x86 CPU. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Perform limit checks when assigning EIPNadav Amit1-41/+54
If branch (e.g., jmp, ret) causes limit violations, since the target IP > limit, the #GP exception occurs before the branch. In other words, the RIP pushed on the stack should be that of the branch and not that of the target. To do so, we can call __linearize, with new EIP, which also saves us the code which performs the canonical address checks. On the case of assigning an EIP >= 2^32 (when switching cs.l), we also safe, as __linearize will check the new EIP does not exceed the limit and would trigger #GP(0) otherwise. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Emulator performs privilege checks on __linearizeNadav Amit1-15/+0
When segment is accessed, real hardware does not perform any privilege level checks. In contrast, KVM emulator does. This causes some discrepencies from real hardware. For instance, reading from readable code segment may fail due to incorrect segment checks. In addition, it introduces unnecassary overhead. To reference Intel SDM 5.5 ("Privilege Levels"): "Privilege levels are checked when the segment selector of a segment descriptor is loaded into a segment register." The SDM never mentions privilege level checks during memory access, except for loading far pointers in section 5.10 ("Pointer Validation"). Those are actually segment selector loads and are emulated in the similarily (i.e., regardless to __linearize checks). This behavior was also checked using sysexit. A data-segment whose DPL=0 was loaded, and after sysexit (CPL=3) it is still accessible. Therefore, all the privilege level checks in __linearize are removed. Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-11-19KVM: x86: Stack size is overridden by __linearizeNadav Amit1-4/+5
When performing segmented-read/write in the emulator for stack operations, it ignores the stack size, and uses the ad_bytes as indication for the pointer size. As a result, a wrong address may be accessed. To fix this behavior, we can remove the masking of address in __linearize and perform it beforehand. It is already done for the operands (so currently it is inefficiently done twice). It is missing in two cases: 1. When using rip_relative 2. On fetch_bit_operand that changes the address. This patch masks the address on these two occassions, and removes the masking from __linearize. Note that it does not mask EIP during fetch. In protected/legacy mode code fetch when RIP >= 2^32 should result in #GP and not wrap-around. Since we make limit checks within __linearize, this is the expected behavior. Partial revert of commit 518547b32ab4 (KVM: x86: Emulator does not calculate address correctly, 2014-09-30). Signed-off-by: Nadav Amit <namit@cs.technion.ac.il> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>