summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2008-04-28return pfn from direct_access, for XIPJared Hulbert1-2/+3
Alter the block device ->direct_access() API to work with the new get_xip_mem() API (that requires both kaddr and pfn are returned). Some architectures will not do the right thing in their virt_to_page() for use by XIP (to translate from the kernel virtual address returned by direct_access(), to a user mappable pfn in XIP's page fault handler. However, we can't switch it to just return the pfn and not the kaddr, because we have no good way to get a kva from a pfn, and XIP requires the kva for its read(2) and write(2) handlers. So we have to return both. Signed-off-by: Jared Hulbert <jaredeh@gmail.com> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Carsten Otte <cotte@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux-mm@kvack.org Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28pageflags: use proper page flag functions in XenChristoph Lameter1-2/+2
Xen uses bitops to manipulate page flags. Make it use proper page flag functions. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28pageflags: get rid of FLAGS_RESERVEDChristoph Lameter1-2/+14
NR_PAGEFLAGS specifies the number of page flags we are using. From that we can calculate the number of bits leftover that can be used for zone, node (and maybe the sections id). There is no need anymore for FLAGS_RESERVED if we use NR_PAGEFLAGS. Use the new methods to make NR_PAGEFLAGS available via the preprocessor. NR_PAGEFLAGS is used to calculate field boundaries in the page flags fields. These field widths have to be available to the preprocessor. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Miller <davem@davemloft.net> Cc: Andy Whitcroft <apw@shadowen.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28pageflags: standardize comment inclusion in asm-offsets.h and fix MIPSChristoph Lameter1-202/+202
Add the ability to pass comments into asm-offsets.h by generating asm output like -># comment line Mips needs this feature to preserve the comments that are in asm-mips/asm-offsets.h right now. Then remove the special handling for mips from Kbuild and convert mips to use the new string to include the comments. Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28vmallocinfo: add caller informationChristoph Lameter1-6/+9
Add caller information so that /proc/vmallocinfo shows where the allocation request for a slice of vmalloc memory originated. Results in output like this: 0xffffc20000000000-0xffffc20000801000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20000801000-0xffffc20000806000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000806000-0xffffc20000c07000 4198400 alloc_large_system_hash+0x127/0x246 pages=1024 vmalloc vpages 0xffffc20000c07000-0xffffc20000c0a000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c0a000-0xffffc20000c0c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c0c000-0xffffc20000c0f000 12288 acpi_os_map_memory+0x13/0x1c phys=cff64000 ioremap 0xffffc20000c10000-0xffffc20000c15000 20480 acpi_os_map_memory+0x13/0x1c phys=cff65000 ioremap 0xffffc20000c16000-0xffffc20000c18000 8192 acpi_os_map_memory+0x13/0x1c phys=cff69000 ioremap 0xffffc20000c18000-0xffffc20000c1a000 8192 acpi_os_map_memory+0x13/0x1c phys=fed1f000 ioremap 0xffffc20000c1a000-0xffffc20000c1c000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1c000-0xffffc20000c1e000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c1e000-0xffffc20000c20000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c20000-0xffffc20000c22000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c22000-0xffffc20000c24000 8192 acpi_os_map_memory+0x13/0x1c phys=cff68000 ioremap 0xffffc20000c24000-0xffffc20000c26000 8192 acpi_os_map_memory+0x13/0x1c phys=e0081000 ioremap 0xffffc20000c26000-0xffffc20000c28000 8192 acpi_os_map_memory+0x13/0x1c phys=e0080000 ioremap 0xffffc20000c28000-0xffffc20000c2d000 20480 alloc_large_system_hash+0x127/0x246 pages=4 vmalloc 0xffffc20000c2d000-0xffffc20000c31000 16384 tcp_init+0xd5/0x31c pages=3 vmalloc 0xffffc20000c31000-0xffffc20000c34000 12288 alloc_large_system_hash+0x127/0x246 pages=2 vmalloc 0xffffc20000c34000-0xffffc20000c36000 8192 init_vdso_vars+0xde/0x1f1 0xffffc20000c36000-0xffffc20000c38000 8192 pci_iomap+0x8a/0xb4 phys=d8e00000 ioremap 0xffffc20000c38000-0xffffc20000c3a000 8192 usb_hcd_pci_probe+0x139/0x295 [usbcore] phys=d8e00000 ioremap 0xffffc20000c3a000-0xffffc20000c3e000 16384 sys_swapon+0x509/0xa15 pages=3 vmalloc 0xffffc20000c40000-0xffffc20000c61000 135168 e1000_probe+0x1c4/0xa32 phys=d8a20000 ioremap 0xffffc20000c61000-0xffffc20000c6a000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c6a000-0xffffc20000c73000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c73000-0xffffc20000c7c000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20000c7c000-0xffffc20000c7f000 12288 e1000e_setup_tx_resources+0x29/0xbe pages=2 vmalloc 0xffffc20000c80000-0xffffc20001481000 8392704 pci_mmcfg_arch_init+0x90/0x118 phys=e0000000 ioremap 0xffffc20001481000-0xffffc20001682000 2101248 alloc_large_system_hash+0x127/0x246 pages=512 vmalloc 0xffffc20001682000-0xffffc20001e83000 8392704 alloc_large_system_hash+0x127/0x246 pages=2048 vmalloc vpages 0xffffc20001e83000-0xffffc20002204000 3674112 alloc_large_system_hash+0x127/0x246 pages=896 vmalloc vpages 0xffffc20002204000-0xffffc2000220d000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000220d000-0xffffc20002216000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002216000-0xffffc2000221f000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc2000221f000-0xffffc20002228000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002228000-0xffffc20002231000 36864 _xfs_buf_map_pages+0x8e/0xc0 vmap 0xffffc20002231000-0xffffc20002234000 12288 e1000e_setup_rx_resources+0x35/0x122 pages=2 vmalloc 0xffffc20002240000-0xffffc20002261000 135168 e1000_probe+0x1c4/0xa32 phys=d8a60000 ioremap 0xffffc20002261000-0xffffc2000270c000 4894720 sys_swapon+0x509/0xa15 pages=1194 vmalloc vpages 0xffffffffa0000000-0xffffffffa0022000 139264 module_alloc+0x4f/0x55 pages=33 vmalloc 0xffffffffa0022000-0xffffffffa0029000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc 0xffffffffa002b000-0xffffffffa0034000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa0034000-0xffffffffa003d000 36864 module_alloc+0x4f/0x55 pages=8 vmalloc 0xffffffffa003d000-0xffffffffa0049000 49152 module_alloc+0x4f/0x55 pages=11 vmalloc 0xffffffffa0049000-0xffffffffa0050000 28672 module_alloc+0x4f/0x55 pages=6 vmalloc [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28mm: move cache_line_size() to <linux/cache.h>Pekka Enberg1-0/+3
Not all architectures define cache_line_size() so as suggested by Andrew move the private implementations in mm/slab.c and mm/slob.c to <linux/cache.h>. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28mm: have zonelist contains structs with both a zone pointer and zone_idxMel Gorman1-1/+1
Filtering zonelists requires very frequent use of zone_idx(). This is costly as it involves a lookup of another structure and a substraction operation. As the zone_idx is often required, it should be quickly accessible. The node idx could also be stored here if it was found that accessing zone->node is significant which may be the case on workloads where nodemasks are heavily used. This patch introduces a struct zoneref to store a zone pointer and a zone index. The zonelist then consists of an array of these struct zonerefs which are looked up as necessary. Helpers are given for accessing the zone index as well as the node index. [kamezawa.hiroyu@jp.fujitsu.com: Suggested struct zoneref instead of embedding information in pointers] [hugh@veritas.com: mm-have-zonelist: fix memcg ooms] [hugh@veritas.com: just return do_try_to_free_pages] [hugh@veritas.com: do_try_to_free_pages gfp_mask redundant] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28mm: use two zonelist that are filtered by GFP maskMel Gorman1-4/+7
Currently a node has two sets of zonelists, one for each zone type in the system and a second set for GFP_THISNODE allocations. Based on the zones allowed by a gfp mask, one of these zonelists is selected. All of these zonelists consume memory and occupy cache lines. This patch replaces the multiple zonelists per-node with two zonelists. The first contains all populated zones in the system, ordered by distance, for fallback allocations when the target/preferred node has no free pages. The second contains all populated zones in the node suitable for GFP_THISNODE allocations. An iterator macro is introduced called for_each_zone_zonelist() that interates through each zone allowed by the GFP flags in the selected zonelist. Signed-off-by: Mel Gorman <mel@csn.ul.ie> Acked-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28hotplug-memory: make online_page() commonJeremy Fitzhardinge5-69/+3
All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [akpm@linux-foundation.org: fix indenting] Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Tested-by: KAMEZAWA Hiroyuki <kamez.hiroyu@jp.fujitsu.com> Cc: Yasunori Goto <y-goto@jp.fujitsu.com> Cc: Christoph Lameter <clameter@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-28x86: PAT fixIngo Molnar1-4/+1
Adrian Bunk noticed the following Coverity report: > Commit e7f260a276f2c9184fe753732d834b1f6fbe9f17 > (x86: PAT use reserve free memtype in mmap of /dev/mem) > added the following gem to arch/x86/mm/pat.c: > > <-- snip --> > > ... > int phys_mem_access_prot_allowed(struct file *file, unsigned long pfn, > unsigned long size, pgprot_t *vma_prot) > { > u64 offset = ((u64) pfn) << PAGE_SHIFT; > unsigned long flags = _PAGE_CACHE_UC_MINUS; > unsigned long ret_flags; > ... > ... (nothing that touches ret_flags) > ... > if (flags != _PAGE_CACHE_UC_MINUS) { > retval = reserve_memtype(offset, offset + size, flags, NULL); > } else { > retval = reserve_memtype(offset, offset + size, -1, &ret_flags); > } > > if (retval < 0) > return 0; > > flags = ret_flags; > > if (pfn <= max_pfn_mapped && > ioremap_change_attr((unsigned long)__va(offset), size, flags) < 0) { > free_memtype(offset, offset + size); > printk(KERN_INFO > "%s:%d /dev/mem ioremap_change_attr failed %s for %Lx-%Lx\n", > current->comm, current->pid, > cattr_name(flags), > offset, offset + size); > return 0; > } > > *vma_prot = __pgprot((pgprot_val(*vma_prot) & ~_PAGE_CACHE_MASK) | > flags); > return 1; > } > > <-- snip --> > > If (flags != _PAGE_CACHE_UC_MINUS) we pass garbage from the stack to > ioremap_change_attr() and/or __pgprot(). > > Spotted by the Coverity checker. the fix simplifies the code as we get rid of the 'ret_flags' complication. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-27x86 PAT: tone down debugging messages some moreLinus Torvalds1-2/+2
Ingo already fixed one of these at my request (in "x86 PAT: tone down debugging messages", commit 1ebcc654f010d4a63f3ebf8ddd2cab5a709b1824), but there was another one he missed. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-27Merge branch 'kvm-updates-2.6.26' of ↵Linus Torvalds77-555/+20457
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (147 commits) KVM: kill file->f_count abuse in kvm KVM: MMU: kvm_pv_mmu_op should not take mmap_sem KVM: SVM: remove selective CR0 comment KVM: SVM: remove now obsolete FIXME comment KVM: SVM: disable CR8 intercept when tpr is not masking interrupts KVM: SVM: sync V_TPR with LAPIC.TPR if CR8 write intercept is disabled KVM: export kvm_lapic_set_tpr() to modules KVM: SVM: sync TPR value to V_TPR field in the VMCB KVM: ppc: PowerPC 440 KVM implementation KVM: Add MAINTAINERS entry for PowerPC KVM KVM: ppc: Add DCR access information to struct kvm_run ppc: Export tlb_44x_hwater for KVM KVM: Rename debugfs_dir to kvm_debugfs_dir KVM: x86 emulator: fix lea to really get the effective address KVM: x86 emulator: fix smsw and lmsw with a memory operand KVM: x86 emulator: initialize src.val and dst.val for register operands KVM: SVM: force a new asid when initializing the vmcb KVM: fix kvm_vcpu_kick vs __vcpu_run race KVM: add ioctls to save/store mpstate KVM: Rename VCPU_MP_STATE_* to KVM_MP_STATE_* ...
2008-04-27KVM: MMU: kvm_pv_mmu_op should not take mmap_semMarcelo Tosatti1-3/+0
kvm_pv_mmu_op should not take mmap_sem. All gfn_to_page() callers down in the MMU processing will take it if necessary, so as it is it can deadlock. Apparently a leftover from the days before slots_lock. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: remove selective CR0 commentJoerg Roedel1-11/+0
There is not selective cr0 intercept bug. The code in the comment sets the CR0.PG bit. But KVM sets the CR4.PG bit for SVM always to implement the paged real mode. So the 'mov %eax,%cr0' instruction does not change the CR0.PG bit. Selective CR0 intercepts only occur when a bit is actually changed. So its the right behavior that there is no intercept on this instruction. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: remove now obsolete FIXME commentJoerg Roedel1-7/+0
With the usage of the V_TPR field this comment is now obsolete. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: disable CR8 intercept when tpr is not masking interruptsJoerg Roedel1-4/+27
This patch disables the intercept of CR8 writes if the TPR is not masking interrupts. This reduces the total number CR8 intercepts to below 1 percent of what we have without this patch using Windows 64 bit guests. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: sync V_TPR with LAPIC.TPR if CR8 write intercept is disabledJoerg Roedel1-0/+12
If the CR8 write intercept is disabled the V_TPR field of the VMCB needs to be synced with the TPR field in the local apic. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: export kvm_lapic_set_tpr() to modulesJoerg Roedel1-0/+1
This patch exports the kvm_lapic_set_tpr() function from the lapic code to modules. It is required in the kvm-amd module to optimize CR8 intercepts. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: sync TPR value to V_TPR field in the VMCBJoerg Roedel1-2/+16
This patch adds syncing of the lapic.tpr field to the V_TPR field of the VMCB. With this change we can safely remove the CR8 read intercept. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ppc: PowerPC 440 KVM implementationHollis Blanchard13-0/+2735
This functionality is definitely experimental, but is capable of running unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only tested with 440EP "Bamboo" guests so far, but with appropriate userspace support other SoC/board combinations should work.) See Documentation/powerpc/kvm_440.txt for technical details. [stephen: build fix] Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: fix lea to really get the effective addressAvi Kivity1-1/+1
We never hit this, since there is currently no reason to emulate lea. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: fix smsw and lmsw with a memory operandAvi Kivity1-12/+17
lmsw and smsw were implemented only with a register operand. Extend them to support a memory operand as well. Fixes Windows running some display compatibility test on AMD hosts. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: x86 emulator: initialize src.val and dst.val for register operandsAvi Kivity1-0/+2
This lets us treat the case where mod == 3 in the same manner as other cases. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: force a new asid when initializing the vmcbAvi Kivity1-1/+1
Shutdown interception clears the vmcb, leaving the asid at zero (which is illegal. so force a new asid on vmcb initialization. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: fix kvm_vcpu_kick vs __vcpu_run raceMarcelo Tosatti1-2/+14
There is a window open between testing of pending IRQ's and assignment of guest_mode in __vcpu_run. Injection of IRQ's can race with __vcpu_run as follows: CPU0 CPU1 kvm_x86_ops->run() vcpu->guest_mode = 0 SET_IRQ_LINE ioctl .. kvm_x86_ops->inject_pending_irq kvm_cpu_has_interrupt() apic_test_and_set_irr() kvm_vcpu_kick if (vcpu->guest_mode) send_ipi() vcpu->guest_mode = 1 So move guest_mode=1 assignment before ->inject_pending_irq, and make sure that it won't reorder after it. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: add ioctls to save/store mpstateMarcelo Tosatti3-0/+43
So userspace can save/restore the mpstate during migration. [avi: export the #define constants describing the value] [christian: add s390 stubs] [avi: ditto for ia64] Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Carsten Otte <cotte@de.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Rename VCPU_MP_STATE_* to KVM_MP_STATE_*Avi Kivity4-31/+31
We wish to export it to userspace, so move it into the kvm namespace. Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: hlt emulation should take in-kernel APIC/PIT timers into accountMarcelo Tosatti6-0/+48
Timers that fire between guest hlt and vcpu_block's add_wait_queue() are ignored, possibly resulting in hangs. Also make sure that atomic_inc and waitqueue_active tests happen in the specified order, otherwise the following race is open: CPU0 CPU1 if (waitqueue_active(wq)) add_wait_queue() if (!atomic_read(pit_timer->pending)) schedule() atomic_inc(pit_timer->pending) Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: do not intercept task switch with NPTJoerg Roedel1-0/+1
When KVM uses NPT there is no reason to intercept task switches. This patch removes the intercept for it in that case. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add kvm trace userspace interfaceFeng(Eric) Liu2-0/+14
This interface allows user a space application to read the trace of kvm related events through relayfs. Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Stub out kvmtraceAvi Kivity1-0/+3
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: s390: Stub out kvmtraceAvi Kivity1-0/+3
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: Add trace markersFeng (Eric) Liu2-1/+60
Trace markers allow userspace to trace execution of a virtual machine in order to monitor its performance. Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: add intercept for machine check exceptionJoerg Roedel1-1/+16
To properly forward a MCE occured while the guest is running to the host, we have to intercept this exception and call the host handler by hand. This is implemented by this patch. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: align shadow CR4.MCE with hostJoerg Roedel1-0/+3
This patch aligns the host version of the CR4.MCE bit with the CR4 active in the guest. This is necessary to get MCE exceptions when the guest is running. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: SVM: indent svm_set_cr4 with tabs instead of spacesJoerg Roedel1-4/+4
The svm_set_cr4 function is indented with spaces. This patch replaces them with tabs. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: MMU: Don't assume struct page for x86Anthony Liguori2-59/+56
This patch introduces a gfn_to_pfn() function and corresponding functions like kvm_release_pfn_dirty(). Using these new functions, we can modify the x86 MMU to no longer assume that it can always get a struct page for any given gfn. We don't want to eliminate gfn_to_page() entirely because a number of places assume they can do gfn_to_page() and then kmap() the results. When we support IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will succeed. This does not implement support for avoiding reference counting for reserved RAM or for IO memory. However, it should make those things pretty straight forward. Since we're only introducing new common symbols, I don't think it will break the non-x86 architectures but I haven't tested those. I've tested Intel, AMD, NPT, and hugetlbfs with Windows and Linux guests. [avi: fix overflow when shifting left pfns by adding casts] Signed-off-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Enable kvm build for ia64Xiantao Zhang4-0/+111
Update the related Makefile and KConfig for kvm build Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add kvm sal/pal virtulization supportXiantao Zhang1-0/+500
Some sal/pal calls would be traped to kvm for virtulization from guest firmware. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add guest interruption injection supportXiantao Zhang1-0/+970
process.c mainly handle interruption injection, and some faults handling. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Generate offset values for assembly code useXiantao Zhang1-0/+251
asm-offsets.c will generate offset values used for assembly code for some fileds of special structures. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add optimization for some virtulization faultsXiantao Zhang1-0/+918
optvfault.S Add optimization for some performance-critical virtualization faults. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add processor virtulization supportXiantao Zhang1-0/+2163
vcpu.c provides processor virtualization logic for kvm. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add trampoline for guest/host mode switchXiantao Zhang1-0/+1038
trampoline code targets for guest/host world switch. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add mmio decoder for kvm/ia64Xiantao Zhang1-0/+341
mmio.c includes mmio decoder, and related mmio logics. Signed-off-by: Anthony Xu <Anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add interruption vector table for vmmXiantao Zhang1-0/+1424
vmm_ivt.S includes an ivt for vmm use. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add TLB virtulization supportXiantao Zhang1-0/+636
vtlb.c includes tlb/VHPT virtulization. Signed-off-by: Anthony Xu <anthony.xu@intel.com> Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: VMM module interfacesXiantao Zhang1-0/+66
vmm.c adds the interfaces with kvm/module, and initialize global data area. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add header files for kvm/ia64Xiantao Zhang5-0/+1421
kvm_minstate.h : Marcos about Min save routines. lapic.h: apic structure definition. vcpu.h : routions related to vcpu virtualization. vti.h : Some macros or routines for VT support on Itanium. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-04-27KVM: ia64: Add kvm arch-specific core code for kvm/ia64Xiantao Zhang1-0/+1789
kvm_ia64.c is created to handle kvm ia64-specific core logic. Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com> Signed-off-by: Avi Kivity <avi@qumranet.com>