kernel/linux.git - Linux kernel stable tree (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2011-03-17	Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6	Linus Torvalds	19	-178/+248
	* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: [S390] kexec: Disable ftrace during kexec [S390] support XZ compressed kernel [S390] css_bus_type: make it static [S390] css_driver: remove duplicate members [S390] css: remove subchannel private [S390] css: move chsc_private to drv_data [S390] css: move io_private to drv_data [S390] cio: move cdev pointer to io_subchannel_private [S390] cio: move options to io_sch_private [S390] cio: move asms to generic header [S390] cio: move orb definitions to separate header [S390] Write protect module text and RO data [S390] dasd: get rid of compile warning [S390] remove superfluous check from do_IRQ [S390] remove redundant stack check option
2011-03-17	Merge branch 'sh-latest' of ↵	Linus Torvalds	25	-552/+1250
	git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (34 commits) sh: Convert to generic show_interrupts. sh: Wire up new fhandle and clock_adjtime syscalls. sh: modify platform_device for sh_eth driver sh: add GETHER's platform_device in board-sh7757lcr sh: update sh7757lcr_defconfig sh: add platform_device of tmio_mmc and sh_mmcif to sh7757lcr sh: dmaengine support for SH7757 sh: add mmc clock in clock-sh7757 sh: add spi_board_info in sh7757lcr sh: add platform_device for SPI sh: add USB_ARCH_HAS_EHCI and OHCI for SH7757 sh: Rename cpuidle states to fit general conventions serial: sh-sci: fix deadlock when resuming from S3 sleep sh: Enable CONFIG_GCOV_PROFILE_ALL for sh sh: Fix up async PCIe probing on SMP. serial: sh-sci: Kill off the special earlyprintk device. serial: sh-sci: Use dev_name() for region reservations. serial: sh-sci: Fix up earlyprintk port mapping. serial: sh-sci: Limit early console to one device. serial: sh-sci: Fix up break timer scheduling race. ...
2011-03-17	Merge git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6	Linus Torvalds	17	-1054/+787
	* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-2.6: fbdev: sh_mobile_lcdc: Add YUV framebuffer support viafb: split pll configs up viafb: remove duplicated clock storage viafb: always return the best possible clock viafb: remove duplicated clock information fbdev: sh_mobile_lcdcfb: add backlight support viafb: factor lcd scaling parameters out viafb: strip some structures viafb: remove unused data_mode and device_type viafb: kill lcd_panel_id video via: make local variables static video via: fix iomem access video/via: drop deprecated (and unused) i2c_adapter.id
2011-03-17	RPC: killing RPC tasks races fixed	Stanislav Kinsbursky	1	-1/+3
	RPC task RPC_TASK_QUEUED bit is set must be checked before trying to wake up task rpc_killall_tasks() because task->tk_waitqueue can not be set (equal to NULL). Also, as Trond Myklebust mentioned, such approach (instead of checking tk_waitqueue to NULL) allows us to "optimise away the call to rpc_wake_up_queued_task() altogether for those tasks that aren't queued". Here is an example of dereferencing of tk_waitqueue equal to NULL: CPU 0 CPU 1 CPU 2 -------------------- --------------------- -------------------------- nfs4_run_open_task rpc_run_task rpc_execute rpc_set_active rpc_make_runnable (waiting) rpc_async_schedule nfs4_open_prepare nfs_wait_on_sequence nfs_umount_begin rpc_killall_tasks rpc_wake_up_task rpc_wake_up_queued_task spin_lock(tk_waitqueue == NULL) BUG() rpc_sleep_on spin_lock(&q->lock) __rpc_sleep_on task->tk_waitqueue = q Signed-off-by: Stanislav Kinsbursky <skinsbursky@openvz.org> Cc: stable@kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17	xprt: remove redundant check	j223yang@asset.uwaterloo.ca	1	-1/+1
	remove redundant check. Signed-off-by: Jinqiu Yang <crindy646@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17	SUNRPC: Convert struct rpc_xprt to use atomic_t counters	Trond Myklebust	2	-10/+9
	Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-03-17	SUNRPC: Ensure we always run the tk_callback before tk_action	Trond Myklebust	1	-8/+6
	This fixes a race in which the task->tk_callback() puts the rpc_task to sleep, setting a new callback. Under certain circumstances, the current code may end up executing the task->tk_action before it gets round to the callback. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
2011-03-17	Merge branch 'drm-core-next' of ↵	Linus Torvalds	181	-6429/+9262
	git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (177 commits) drm/radeon: fixup refcounts in radeon dumb create ioctl. drm: radeon: *_cs_packet_parse_vline() cleanup radeon: merge list_del()/list_add_tail() to list_move_tail() drm: Retry i2c transfer of EDID block after failure drm/radeon/kms: fix typo in atom overscan setup drm: Hold the mode mutex whilst probing for sysfs status drm/nouveau: fix __nouveau_fence_wait performance drm/nv40: attempt to reserve just enough vram for all 32 channels drm/nv50: check for vm traps on every gr irq drm/nv50: decode vm faults some more drm/nouveau: add nouveau_enum_find() util function drm/nouveau: properly handle pushbuffer check failures drm/nvc0: remove vm hack forcing large/small pages to not share a PDE drm/i915: disable opregion lid detection for now. drm/i915: Only wait on a pending flip if we intend to write to the buffer drm/i915/dp: Sanity check eDP existence drm: add cap bit to denote if dumb ioctl is available or not. drm/core: add ioctl to query device/driver capabilities drm/radeon/kms: allow max clock of 340 Mhz on hdmi 1.3+ drm/radeon/kms: add cayman pci ids ...
2011-03-17	KVM: unbreak userspace that does not sets tss address	Gleb Natapov	1	-0/+13
	Commit 6440e5967bc broke old userspaces that do not set tss address before entering vcpu. Unbreak it by setting tss address to a safe value on the first vcpu entry. New userspaces should set tss address, so print warning in case it doesn't. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: MMU: cleanup pte write path	Xiao Guangrong	3	-56/+32
	This patch does: - call vcpu->arch.mmu.update_pte directly - use gfn_to_pfn_atomic in update_pte path The suggestion is from Avi. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: introduce a common function to get no-dirty-logged slot	Xiao Guangrong	1	-20/+17
	Cleanup the code of pte_prefetch_gfn_to_memslot and mapping_level_dirty_bitmap Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: fix rcu usage in init_rmode_* functions	Xiao Guangrong	1	-4/+8
	fix: [ 3494.671786] stack backtrace: [ 3494.671789] Pid: 10527, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23 [ 3494.671790] Call Trace: [ 3494.671796] [] ? lockdep_rcu_dereference+0x9d/0xa5 [ 3494.671826] [] ? kvm_memslots+0x6b/0x73 [kvm] [ 3494.671834] [] ? gfn_to_memslot+0x16/0x4f [kvm] [ 3494.671843] [] ? gfn_to_hva+0x16/0x27 [kvm] [ 3494.671851] [] ? kvm_write_guest_page+0x31/0x83 [kvm] [ 3494.671861] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm] [ 3494.671867] [] ? vmx_set_tss_addr+0x83/0x122 [kvm_intel] and: [ 8328.789599] stack backtrace: [ 8328.789601] Pid: 18736, comm: qemu-system-x86 Not tainted 2.6.38-rc6+ #23 [ 8328.789603] Call Trace: [ 8328.789609] [] ? lockdep_rcu_dereference+0x9d/0xa5 [ 8328.789621] [] ? kvm_memslots+0x6b/0x73 [kvm] [ 8328.789628] [] ? gfn_to_memslot+0x16/0x4f [kvm] [ 8328.789635] [] ? gfn_to_hva+0x16/0x27 [kvm] [ 8328.789643] [] ? kvm_write_guest_page+0x31/0x83 [kvm] [ 8328.789699] [] ? kvm_clear_guest_page+0x1a/0x1c [kvm] [ 8328.789713] [] ? vmx_create_vcpu+0x316/0x3c8 [kvm_intel] Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: fix kvmclock regression due to missing clock update	Nikola Ciprich	1	-1/+1
	commit 387b9f97750444728962b236987fbe8ee8cc4f8c moved kvm_request_guest_time_update(vcpu), breaking 32bit SMP guests using kvm-clock. Fix this by moving (new) clock update function to proper place. Signed-off-by: Nikola Ciprich <nikola.ciprich@linuxbox.cz> Acked-by: Zachary Amsden <zamsden@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: emulator: Fix permission checking in io permission bitmap	Gleb Natapov	1	-3/+2
	Currently if io port + len crosses 8bit boundary in io permission bitmap the check may allow IO that otherwise should not be allowed. The patch fixes that. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: emulator: Fix io permission checking for 64bit guest	Gleb Natapov	3	-21/+35
	Current implementation truncates upper 32bit of TR base address during IO permission bitmap check. The patch fixes this. Reported-and-tested-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: SVM: Load %gs earlier if CONFIG_X86_32_LAZY_GS=n	Avi Kivity	1	-0/+5
	With CONFIG_CC_STACKPROTECTOR, we need a valid %gs at all times, so disable lazy reload and do an eager reload immediately after the vmexit. Reported-by: IVAN ANGELOV <ivangotoy@gmail.com> Acked-By: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: x86: Remove useless regs_page pointer from kvm_lapic	Takuya Yoshikawa	3	-7/+5
	Access to this page is mostly done through the regs member which holds the address to this page. The exceptions are in vmx_vcpu_reset() and kvm_free_lapic() and these both can easily be converted to using regs. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: improve comment on rcu use in irqfd_deassign	Michael S. Tsirkin	1	-2/+3
	The RCU use in kvm_irqfd_deassign is tricky: we have rcu_assign_pointer but no synchronize_rcu: synchronize_rcu is done by kvm_irq_routing_update which we share a spinlock with. Fix up a comment in an attempt to make this clearer. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: remove unused macros	Xiao Guangrong	2	-8/+0
	These macros are not used, so removed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: cleanup page alloc and free	Xiao Guangrong	1	-5/+5
	Using __get_free_page instead of alloc_page and page_address, using free_page instead of __free_page and virt_to_page Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: do not record gfn in kvm_mmu_pte_write	Xiao Guangrong	3	-7/+2
	No need to record the gfn to verifier the pte has the same mode as current vcpu, it's because we only speculatively update the pte only if the pte and vcpu have the same mode Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: move mmu pages calculated out of mmu lock	Xiao Guangrong	1	-5/+5
	kvm_mmu_calculate_mmu_pages need to walk all memslots and it's protected by kvm->slots_lock, so move it out of mmu spinlock Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: set spte accessed bit properly	Xiao Guangrong	1	-1/+1
	Set spte accessed bit only if guest_initiated == 1 that means the really accessed Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: MMU: fix kvm_mmu_slot_remove_write_access dropping intermediate W bits	Xiao Guangrong	1	-2/+7
	Only remove write access in the last sptes. Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: Start lock documentation	Jan Kiszka	1	-0/+25
	The goal of this document shall be - overview of all locks used in KVM core - provide details on the scope of each lock - explain the lock type, specifically of a raw spin locks - provide a lock ordering guide Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: better readability of efer_reserved_bits	Lai Jiangshan	1	-2/+3
	use EFER_SCE, EFER_LME and EFER_LMA instead of magic numbers. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: Clear async page fault hash after switching to real mode	Lai Jiangshan	1	-1/+3
	The hash array of async gfns may still contain some left gfns after kvm_clear_async_pf_completion_queue() called, need to clear them. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: VMX: Initialize vm86 TSS only once.	Gleb Natapov	1	-22/+6
	Currently vm86 task is initialized on each real mode entry and vcpu reset. Initialization is done by zeroing TSS and updating relevant fields. But since all vcpus are using the same TSS there is a race where one vcpu may use TSS while other vcpu is initializing it, so the vcpu that uses TSS will see wrong TSS content and will behave incorrectly. Fix that by initializing TSS only once. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: VMX: update live TR selector if it changes in real mode	Gleb Natapov	1	-0/+1
	When rmode.vm86 is active TR descriptor is updated with vm86 task values, but selector is left intact. vmx_set_segment() makes sure that if TR register is written into while vm86 is active the new values are saved for use after vm86 is deactivated, but since selector is not updated on vm86 activation/deactivation new value is lost. Fix this by writing new selector into vmcs immediately. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: VMX: add the __noclone attribute to vmx_vcpu_run	Lai Jiangshan	1	-1/+1
	The changelog of 104f226 said "adds the __noclone attribute", but it was missing in its patch. I think it is still needed. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: x86: Convert tsc_write_lock to raw_spinlock	Jan Kiszka	2	-4/+4
	Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: remove isr_ack logic from PIC	Gleb Natapov	2	-27/+2
	isr_ack logic was added by e48258009d to avoid unnecessary IPIs. Back then it made sense, but now the code checks that vcpu is ready to accept interrupt before sending IPI, so this logic is no longer needed. The patch removes it. Fixes a regression with Debian/Hurd. Signed-off-by: Gleb Natapov <gleb@redhat.com> Reported-and-tested-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: VMX: fix detection of BIOS disabling VMX	Joseph Cihula	1	-2/+8
	This patch fixes the logic used to detect whether BIOS has disabled VMX, for the case where VMX is enabled only under SMX, but tboot is not active. Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: Convert kvm_lock to raw_spinlock	Jan Kiszka	4	-23/+23
	Code under this lock requires non-preemptibility. Ensure this also over -rt by converting it to raw spinlock. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: SVM: check for progress after IRET interception	Avi Kivity	1	-1/+9
	When we enable an NMI window, we ask for an IRET intercept, since the IRET re-enables NMIs. However, the IRET intercept happens before the instruction executes, while the NMI window architecturally opens afterwards. To compensate for this mismatch, we only open the NMI window in the following exit, assuming that the IRET has by then executed; however, this assumption is not always correct; we may exit due to a host interrupt or page fault, without having executed the instruction. Fix by checking for forward progress by recording and comparing the IRET's rip. This is somewhat of a hack, since an unchaging rip does not mean that no forward progress has been made, but is the simplest fix for now. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: Fix race between nmi injection and enabling nmi window	Avi Kivity	2	-1/+4
	The interrupt injection logic looks something like if an nmi is pending, and nmi injection allowed inject nmi if an nmi is pending request exit on nmi window the problem is that "nmi is pending" can be set asynchronously by the PIT; if it happens to fire between the two if statements, we will request an nmi window even though nmi injection is allowed. On SVM, this has disasterous results, since it causes eflags.TF to be set in random guest code. The fix is simple; make nmi_pending synchronous using the standard vcpu->requests mechanism; this ensures the code above is completely synchronous wrt nmi_pending. Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: use yield_to instead of sleep in kvm_vcpu_on_spin	Rik van Riel	2	-10/+48
	Instead of sleeping in kvm_vcpu_on_spin, which can cause gigantic slowdowns of certain workloads, we instead use yield_to to get another VCPU in the same KVM guest to run sooner. This seems to give a 10-15% speedup in certain workloads. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: keep track of which task is running a KVM vcpu	Rik van Riel	2	-0/+11
	Keep track of which task is running a KVM vcpu. This helps us figure out later what task to wake up if we want to boost a vcpu that got preempted. Unfortunately there are no guarantees that the same task always keeps the same vcpu, so we can only track the task across a single "run" of the vcpu. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	export pid symbols needed for kvm_vcpu_on_spin	Rik van Riel	2	-0/+3
	Export the symbols required for a race-free kvm_vcpu_on_spin. Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	KVM: Drop ad-hoc vendor specific instruction restriction	Avi Kivity	1	-28/+5
	Use the new support in the emulator, and drop the ad-hoc code in x86.c. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: x86 emulator: vendor specific instructions	Avi Kivity	2	-3/+10
	Mark some instructions as vendor specific, and allow the caller to request emulation only of vendor specific instructions. This is useful in some circumstances (responding to a #UD fault). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: Drop bogus x86_decode_insn() error check	Avi Kivity	1	-3/+0
	x86_decode_insn() doesn't return X86EMUL_* values, so the check for X86EMUL_PROPOGATE_FAULT will always fail. There is a proper check later on, so there is no need for a replacement for this code. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: x86: Drop obsolete warning about INIT on runnable VCPU	Jan Kiszka	1	-4/+0
	This warning was once used for debugging QEMU user space. Though uncommon, it is actually possible to send an INIT request to a running VCPU. So better drop this warning before someone misuses it to flood kernel logs this way. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: x86: release kvmclock page on reset	Glauber Costa	1	-8/+12
	When a vcpu is reset, kvmclock page keeps being written to this days. This is wrong and inconsistent: a cpu reset should take it to its initial state. Signed-off-by: Glauber Costa <glommer@redhat.com> CC: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	mm: remove is_hwpoison_address	Huang Ying	2	-40/+0
	Unused. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: Replace is_hwpoison_address with __get_user_pages	Huang Ying	1	-1/+10
	is_hwpoison_address only checks whether the page table entry is hwpoisoned, regardless the memory page mapped. While __get_user_pages will check both. QEMU will clear the poisoned page table entry (via unmap/map) to make it possible to allocate a new memory page for the virtual address across guest rebooting. But it is also possible that the underlying memory page is kept poisoned even after the corresponding page table entry is cleared, that is, a new memory page can not be allocated. __get_user_pages can catch these situations. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	mm: make __get_user_pages return -EHWPOISON for HWPOISON page optionally	Huang Ying	7	-3/+21
	Make __get_user_pages return -EHWPOISON for HWPOISON page only if FOLL_HWPOISON is specified. With this patch, the interested callers can distinguish HWPOISON pages from general FAULT pages, while other callers will still get -EFAULT for all these pages, so the user space interface need not to be changed. This feature is needed by KVM, where UCR MCE should be relayed to guest for HWPOISON page, while instruction emulation and MMIO will be tried for general FAULT page. The idea comes from Andrew Morton. Signed-off-by: Huang Ying <ying.huang@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2011-03-17	mm: export __get_user_pages	Huang Ying	5	-11/+60
	In most cases, get_user_pages and get_user_pages_fast should be used to pin user pages in memory. But sometimes, some special flags except FOLL_GET, FOLL_WRITE and FOLL_FORCE are needed, for example in following patch, KVM needs FOLL_HWPOISON. To support these users, __get_user_pages is exported directly. There are some symbol name conflicts in infiniband driver, fixed them too. Signed-off-by: Huang Ying <ying.huang@intel.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Michel Lespinasse <walken@google.com> CC: Roland Dreier <roland@kernel.org> CC: Ralph Campbell <infinipath@qlogic.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: x86: handle guest access to BBL_CR_CTL3 MSR	john cooper	2	-0/+20
	A correction to Intel cpu model CPUID data (patch queued) caused winxp to BSOD when booted with a Penryn model. This was traced to the CPUID "model" field correction from 6 -> 23 (as is proper for a Penryn class of cpu). Only in this case does the problem surface. The cause for this failure is winxp accessing the BBL_CR_CTL3 MSR which is unsupported by current kvm, appears to be a legacy MSR not fully characterized yet existing in current silicon, and is apparently carried forward in MSR space to accommodate vintage code as here. It is not yet conclusive whether this MSR implements any of its legacy functionality or is just an ornamental dud for compatibility. While I found no silicon version specific documentation link to this MSR, a general description exists in Intel's developer's reference which agrees with the functional behavior of other bootloader/kernel code I've examined accessing BBL_CR_CTL3. Regrettably winxp appears to be setting bit #19 called out as "reserved" in the above document. So to minimally accommodate this MSR, kvm msr get will provide the equivalent mock data and kvm msr write will simply toss the guest passed data without interpretation. While this treatment of BBL_CR_CTL3 addresses the immediate problem, the approach may be modified pending clarification from Intel. Signed-off-by: john cooper <john.cooper@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-03-17	KVM: make make_all_cpus_request() lockless	Xiao Guangrong	2	-12/+3
	Now, we have 'vcpu->mode' to judge whether need to send ipi to other cpus, this way is very exact, so checking request bit is needless, then we can drop the spinlock let it's collateral Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Signed-off-by: Avi Kivity <avi@redhat.com>