summaryrefslogtreecommitdiff
path: root/include/asm-x86
AgeCommit message (Collapse)AuthorFilesLines
2008-08-14Merge branch 'x86/fpu' into x86/urgentIngo Molnar1-2/+0
2008-08-14Merge commit 'v2.6.27-rc3' into x86/xsaveIngo Molnar5-4/+28
Conflicts: arch/x86/kernel/genapic_64.c include/asm-x86/kvm_host.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13crypto: padlock - fix VIA PadLock instruction usage with irq_ts_save/restore()Suresh Siddha1-0/+32
Wolfgang Walter reported this oops on his via C3 using padlock for AES-encryption: ################################################################## BUG: unable to handle kernel NULL pointer dereference at 000001f0 IP: [<c01028c5>] __switch_to+0x30/0x117 *pde = 00000000 Oops: 0002 [#1] PREEMPT Modules linked in: Pid: 2071, comm: sleep Not tainted (2.6.26 #11) EIP: 0060:[<c01028c5>] EFLAGS: 00010002 CPU: 0 EIP is at __switch_to+0x30/0x117 EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300 ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068 Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000) Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046 c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000 c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0 Call Trace: [<c03b5b43>] ? schedule+0x285/0x2ff [<c0131856>] ? pm_qos_requirement+0x3c/0x53 [<c0239f54>] ? acpi_processor_idle+0x0/0x434 [<c01025fe>] ? cpu_idle+0x73/0x7f [<c03a4dcd>] ? rest_init+0x61/0x63 ======================= Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end() around the padlock instructions fix the oops. Suresh wrote: These padlock instructions though don't use/touch SSE registers, but it behaves similar to other SSE instructions. For example, it might cause DNA faults when cr0.ts is set. While this is a spurious DNA trap, it might cause oops with the recent fpu code changes. This is the code sequence that is probably causing this problem: a) new app is getting exec'd and it is somewhere in between start_thread() and flush_old_exec() in the load_xyz_binary() b) At pont "a", task's fpu state (like TS_USEDFPU, used_math() etc) is cleared. c) Now we get an interrupt/softirq which starts using these encrypt/decrypt routines in the network stack. This generates a math fault (as cr0.ts is '1') which sets TS_USEDFPU and restores the math that is in the task's xstate. d) Return to exec code path, which does start_thread() which does free_thread_xstate() and sets xstate pointer to NULL while the TS_USEDFPU is still set. e) At the next context switch from the new exec'd task to another task, we have a scenarios where TS_USEDFPU is set but xstate pointer is null. This can cause an oops during unlazy_fpu() in __switch_to() Now: 1) This should happen with or with out pre-emption. Viro also encountered similar problem with out CONFIG_PREEMPT. 2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because kernel_fpu_begin() will manually do a clts() and won't run in to the situation of setting TS_USEDFPU in step "c" above. 3) This was working before the fpu changes, because its a spurious math fault which doesn't corrupt any fpu/sse registers and the task's math state was always in an allocated state. With out the recent lazy fpu allocation changes, while we don't see oops, there is a possible race still present in older kernels(for example, while kernel is using kernel_fpu_begin() in some optimized clear/copy page and an interrupt/softirq happens which uses these padlock instructions generating DNA fault). This is the failing scenario that existed even before the lazy fpu allocation changes: 0. CPU's TS flag is set 1. kernel using FPU in some optimized copy routine and while doing kernel_fpu_begin() takes an interrupt just before doing clts() 2. Takes an interrupt and ipsec uses padlock instruction. And we take a DNA fault as TS flag is still set. 3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts 4. We complete the padlock routine 5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes the optimized copy routine and does kernel_fpu_end(). At this point, we have cr0.ts again set to '1' but the task's TS_USEFPU is stilll set and not cleared. 6. Now kernel resumes its user operation. And at the next context switch, kernel sees it has do a FP save as TS_USEDFPU is still set and then will do a unlazy_fpu() in __switch_to(). unlazy_fpu() will take a DNA fault, as cr0.ts is '1' and now, because we are in __switch_to(), math_state_restore() will get confused and will restore the next task's FP state and will save it in prev tasks's FP state. Remember, in __switch_to() we are already on the stack of the next task but take a DNA fault for the prev task. This causes the fpu leakage. Fix the padlock instruction usage by calling them inside the context of new routines irq_ts_save/restore(), which clear/restore cr0.ts manually in the interrupt context. This will not generate spurious DNA in the context of the interrupt which will fix the oops encountered and the possible FPU leakage issue. Reported-and-bisected-by: Wolfgang Walter <wolfgang.walter@stwm.de> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2008-08-13Merge commit 'v2.6.27-rc3' into x86/urgentIngo Molnar1-0/+6
2008-08-13x86: fix xsave build errorIngo Molnar1-3/+3
fix this build failure with certain glibc versions: In file included from /usr/include/bits/sigcontext.h:28, from /usr/include/signal.h:333, from Documentation/accounting/getdelays.c:24: /home/mingo/tip/usr/include/asm/sigcontext.h:191: error: expected specifier-qualifier-list before ‘u64’ Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-13x86: propagate new nonpanic bootmem macros to CONFIG_HAVE_ARCH_BOOTMEM_NODEJohannes Weiner1-0/+6
Commit 74768ed833344b "page allocator: use no-panic variant of alloc_bootmem() in alloc_large_system_hash()" introduced two new _nopanic macros which are undefined for CONFIG_HAVE_ARCH_BOOTMEM_NODE. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: "Jan Beulich" <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-12Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds3-4/+20
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix 2.6.27rc1 cannot boot more than 8CPUs x86: make "apic" an early_param() on 32-bit, NULL check EFI, x86: fix function prototype x86, pci-calgary: fix function declaration x86: work around gcc 3.4.x bug x86: make "apic" an early_param() on 32-bit x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk x86_64: restore the proper NR_IRQS define so larger systems work. x86: Restore proper vector locking during cpu hotplug x86: Fix broken VMI in 2.6.27-rc.. x86: fdiv bug detection fix
2008-08-11EFI, x86: fix function prototypeRandy Dunlap1-1/+1
Fix function prototype in header file to match source code: linux-next-20080807/arch/x86/kernel/efi_64.c:100:14: error: symbol 'efi_ioremap' redeclared with different type (originally declared at include2/asm/efi.h:89) - different address spaces Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11x86: apic interrupts - move assignments to irqinit_32.c, v2Cyrill Gorcunov1-2/+0
64bit mode APIC interrupt handlers are set within irqinit_64.c. Lets do tha same for 32bit mode which would help in furter code merging. Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11Merge branch 'linus' into x86/cleanupsIngo Molnar11-82/+101
2008-08-11Merge branch 'linus' into x86/x2apicIngo Molnar13-79/+154
Conflicts: arch/x86/kernel/genapic_64.c Manual merge: arch/x86/kernel/genx2apic_uv_x.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11x86_64: restore the proper NR_IRQS define so larger systems work.Eric W. Biederman1-1/+9
As pointed out and tracked by Yinghai Lu <yhlu.kernel@gmail.com>: Dhaval Giani got: kernel BUG at arch/x86/kernel/io_apic_64.c:357! invalid opcode: 0000 [1] SMP CPU 24 ... his system (x3950) has 8 ioapic, irq > 256 This was caused by: commit 9b7dc567d03d74a1fbae84e88949b6a60d922d82 Author: Thomas Gleixner <tglx@linutronix.de> Date: Fri May 2 20:10:09 2008 +0200 x86: unify interrupt vector defines The interrupt vector defines are copied 4 times around with minimal differences. Move them all into asm-x86/irq_vectors.h It appears that Thomas did not notice that x86_64 does something completely different when he merge irq_vectors.h We can solve this for 2.6.27 by simply reintroducing the old heuristic for setting NR_IRQS on x86_64 to a usable value, which trivially removes the regression. Long term it would be nice to harmonize the handling of ioapic interrupts of x86_32 and x86_64 so we don't have this kind of confusion. Dhaval Giani <dhaval@linux.vnet.ibm.com> tested an earlier version of this patch by YH which confirms simply increasing NR_IRQS fixes the problem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Mike Travis <travis@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-08-11x86: Restore proper vector locking during cpu hotplugEric W. Biederman1-2/+10
Having cpu_online_map change during assign_irq_vector can result in some really nasty and weird things happening. The one that bit me last time was accessing non existent per cpu memory for non existent cpus. This locking was removed in a sloppy x86_64 and x86_32 merge patch. Guys can we please try and avoid subtly breaking x86 when we are merging files together? Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-11x86: mmconf: fix section mismatch warningMarcin Slusarz1-1/+1
WARNING: arch/x86/kernel/built-in.o(.cpuinit.text+0x1591): Section mismatch in reference from the function init_amd() to the function .init.text:check_enable_amd_mmconf_dmi() The function __cpuinit init_amd() references a function __init check_enable_amd_mmconf_dmi(). If check_enable_amd_mmconf_dmi is only used by init_amd then annotate check_enable_amd_mmconf_dmi with a matching annotation. check_enable_amd_mmconf_dmi is only called from init_amd which is __cpuinit Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-08-01Merge branch 'kvm-updates-2.6.27' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: KVM: s390: Fix kvm on IBM System z10 KVM: Advertise synchronized mmu support to userspace KVM: Synchronize guest physical memory map to host virtual memory map KVM: Allow browsing memslots with mmu_lock KVM: Allow reading aliases with mmu_lock
2008-07-31Merge branch 'x86/urgent' into x86/xenIngo Molnar1-0/+2
2008-07-31Merge branch 'linus' into x86/xenIngo Molnar1-21/+1
2008-07-30x86, xsave: keep the XSAVE feature mask as an u64H. Peter Anvin1-7/+11
The XSAVE feature mask is a 64-bit number; keep it that way, in order to avoid the mistake done with rdmsr/wrmsr. Use the xsetbv() function provided in the previous patch. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: add <asm/xcr.h> header file for XCR registersH. Peter Anvin1-0/+49
Add <asm-x86/xcr.h> header file for the XCR registers and their access functions. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: save/restore the extended state context in sigframeSuresh Siddha3-2/+13
On cpu's supporting xsave/xrstor, fpstate pointer in the sigcontext, will include the extended state information along with fpstate information. Presence of extended state information is indicated by the presence of FP_XSTATE_MAGIC1 at fpstate.sw_reserved.magic1 and FP_XSTATE_MAGIC2 at fpstate + (fpstate.sw_reserved.extended_size - FP_XSTATE_MAGIC2_SIZE). Extended feature bit mask that is saved in the memory layout is represented by the fpstate.sw_reserved.xstate_bv For RT signal frames, UC_FP_XSTATE in the uc_flags also indicate the presence of extended state information in the sigcontext's fpstate pointer. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: struct _fpstate extensions to include extended state informationSuresh Siddha3-6/+94
Bytes 464..511 in the current 512byte layout of fxsave/fxrstor frame, are reserved for SW usage. On cpu's supporting xsave/xrstor, these bytes are used to extended the fpstate pointer in the sigcontext, which now includes the extended state information along with fpstate information. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: xsave/xrstor specific routinesSuresh Siddha1-0/+52
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: reorganization of signal save/restore fpstate code layoutSuresh Siddha1-6/+7
move 64bit routines that saves/restores fpstate in/from user stack from signal_64.c to xsave.c restore_i387_xstate() now handles the condition when user passes NULL fpstate. Other misc changes for prepartion of xsave/xrstor sigcontext support. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: dynamically allocate sigframes fpstate instead of static allocationSuresh Siddha1-0/+2
dynamically allocate fpstate on the stack, instead of static allocation in the current sigframe layout on the user stack. This will allow the fpstate structure to grow in the future, which includes extended state information supporting xsave/xrstor. signal handlers will be able to access the fpstate pointer from the sigcontext structure asusual, with no change. For the non RT sigframe's (which are supported only for 32bit apps), current static fpstate layout in the sigframe will be unused(so that we don't change the extramask[] offset in the sigframe and thus prevent breaking app's which modify extramask[]). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: context switch support using xsave/xrstorSuresh Siddha4-6/+95
Uses xsave/xrstor (instead of traditional fxsave/fxrstor) in context switch when available. Introduces TS_XSAVE flag, which determine the need to use xsave/xrstor instructions during context switch instead of the legacy fxsave/fxrstor instructions. Thread-synchronous status word is already in L1 cache during this code patch and thus minimizes the performance penality compared to (cpu_has_xsave) checks. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: enable xsave/xrstor on cpus with xsave supportSuresh Siddha4-0/+40
Enables xsave/xrstor by turning on cr4.osxsave on cpu's which have the xsave support. For now, features that OS supports/enabled are FP and SSE. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30x86, xsave: xsave cpuid feature bitsSuresh Siddha1-0/+2
Add xsave CPU feature bits. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30Merge commit 'v2.6.27-rc1' into x86/coreIngo Molnar9-75/+92
Conflicts: include/asm-x86/dma-mapping.h include/asm-x86/namei.h include/asm-x86/uaccess.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-30Merge branch 'x86/fpu' into x86/coreIngo Molnar1-2/+0
2008-07-29generic, x86: fix add iommu_num_pages helper functionFUJITA Tomonori1-0/+2
This IOMMU helper function doesn't work for some architectures: http://marc.info/?l=linux-kernel&m=121699304403202&w=2 It also breaks POWER and SPARC builds: http://marc.info/?l=linux-kernel&m=121730388001890&w=2 Currently, only x86 IOMMUs use this so let's move it to x86 for now. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29Merge commit 'v2.6.27-rc1' into x86/microcodeIngo Molnar1-21/+1
Conflicts: arch/x86/kernel/microcode.c Manual resolutions: arch/x86/kernel/microcode_amd.c arch/x86/kernel/microcode_intel.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-29KVM: Synchronize guest physical memory map to host virtual memory mapAndrea Arcangeli1-0/+6
Synchronize changes to host virtual addresses which are part of a KVM memory slot to the KVM shadow mmu. This allows pte operations like swapping, page migration, and madvise() to transparently work with KVM. Signed-off-by: Andrea Arcangeli <andrea@qumranet.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-29Merge branch 'linus' into core/generic-dma-coherentIngo Molnar68-525/+1327
Conflicts: arch/x86/Kconfig Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: major refactoringPeter Oruba1-0/+3
Refactored code by introducing a two-module solution. There is one general module in which vendor specific modules can hook into. However, that is exclusive, there is only one vendor specific module allowed at a time. A CPU vendor check makes sure only the correct module for the underlying system gets called. Functinally in terms of patch loading itself there are no changes. This refactoring provides a basis for future implementations of other vendors' patch loaders. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: first step of refactoring, introducing microcode_opsPeter Oruba1-0/+13
Refactoring with the goal of having one general module and separate vendor specific modules that hook into the general one. Microcode_ops is a function pointer structure in which vendor specific modules will enter all functions that differ between vendors and that need to be accessed from the general module. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: add AMD specific declarationsPeter Oruba1-0/+30
Added AMD specific declarations to header file. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: structure declaration renamingPeter Oruba1-4/+6
Renamed common structures to vendor specific naming scheme so other vendors will be able to use the same naming convention. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: move per CPU microcode structure declaration to header filePeter Oruba1-0/+8
This structure will be later used by other modules as well and needs therfore to be moved out to a header file. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: typedef removalPeter Oruba1-3/+0
Removed typedefs. No functional changes to the code. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28x86: moved Intel microcode patch loader declarations to seperate header filePeter Oruba2-35/+34
Intel specific microcode declarations have been moved to a seperate header file. There are no code changes to the code itself and no side effects to other parts. Signed-off-by: Peter Oruba <peter.oruba@amd.com> Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28Merge branch 'x86-tracehook' of ↵Ingo Molnar3-1/+218
git://git.kernel.org/pub/scm/linux/kernel/git/frob/linux-2.6-utrace into x86/tracehook
2008-07-28Merge branch 'linus' into x86/cpuIngo Molnar30-206/+209
Conflicts: arch/x86/kernel/cpu/intel_cacheinfo.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-07-28Merge core/lib: pick up memparse() change.Ingo Molnar11-61/+92
Merge branch 'core/lib' into x86/xen
2008-07-28x86: fix initialization of 'l' bit in ldt descriptorsJeremy Fitzhardinge1-0/+5
Make sure that fill_ldt() initializes the 'l' bit in the descriptor. It always sets it to 0, ignoring 'lm' in user_desc, preserving original x86_64 behaviour. Previously it was leaving 'l' uninitialized. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Glauber de Oliveira Costa <gcosta@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Glauber de Oliveira Costa <gcosta@redhat.com>
2008-07-27KVM: SVM: allow enabling/disabling NPT by reloading only the architecture moduleJoerg Roedel1-0/+1
If NPT is enabled after loading both KVM modules on AMD and it should be disabled, both KVM modules must be reloaded. If only the architecture module is reloaded the behavior is undefined. With this patch it is possible to disable NPT only by reloading the kvm_amd module. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-07-27[PATCH] kill altrootAl Viro1-11/+0
long overdue... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-07-27x86: tracehook: TIF_NOTIFY_RESUMERoland McGrath1-1/+3
This adds TIF_NOTIFY_RESUME support for x86, both 64-bit and 32-bit. When set, we call tracehook_notify_resume() on the way to user mode. Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-27x86: tracehook: asm/syscall.hRoland McGrath2-0/+215
Add asm/syscall.h for x86 with all the required entry points. This will allow arch-independent tracing code for system calls. Signed-off-by: Roland McGrath <roland@redhat.com>
2008-07-27Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2-7/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD IOMMU: include amd_iommu_last_bdf in device initialization x86: fix IBM Summit based systems' phys_cpu_present_map on 32-bit kernels x86, RDC321x: remove gpio.h complications x86, RDC321x: add to mach-default crashdump: fix undefined reference to `elfcorehdr_addr' flag parameters: fix compile error of sys_epoll_create1
2008-07-26x86: lockless get_user_pages_fast()Nick Piggin1-0/+1
Implement get_user_pages_fast without locking in the fastpath on x86. Do an optimistic lockless pagetable walk, without taking mmap_sem or any page table locks or even mmap_sem. Page table existence is guaranteed by turning interrupts off (combined with the fact that we're always looking up the current mm, means we can do the lockless page table walk within the constraints of the TLB shootdown design). Basically we can do this lockless pagetable walk in a similar manner to the way the CPU's pagetable walker does not have to take any locks to find present ptes. This patch (combined with the subsequent ones to convert direct IO to use it) was found to give about 10% performance improvement on a 2 socket 8 core Intel Xeon system running an OLTP workload on DB2 v9.5 "To test the effects of the patch, an OLTP workload was run on an IBM x3850 M2 server with 2 processors (quad-core Intel Xeon processors at 2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel. Comparing runs with and without the patch resulted in an overall performance benefit of ~9.8%. Correspondingly, oprofiles showed that samples from __up_read and __down_read routines that is seen during thread contention for system resources was reduced from 2.8% down to .05%. Monitoring the /proc/vmstat output from the patched run showed that the counter for fast_gup contained a very high number while the fast_gup_slow value was zero." (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a counter we had for the number of times the slowpath was invoked). The main reason for the improvement is that DB2 has multiple threads each issuing direct-IO. Direct-IO uses get_user_pages, and thus the threads contend the mmap_sem cacheline, and can also contend on page table locks. I would anticipate larger performance gains on larger systems, however I think DB2 uses an adaptive mix of threads and processes, so it could be that thread contention remains pretty constant as machine size increases. In which case, we stuck with "only" a 10% gain. The downside of using get_user_pages_fast is that if there is not a pte with the correct permissions for the access, we end up falling back to get_user_pages and so the get_user_pages_fast is a bit of extra work. However this should not be the common case in most performance critical code. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: Kconfig fix] [akpm@linux-foundation.org: Makefile fix/cleanup] [akpm@linux-foundation.org: warning fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>