summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)AuthorFilesLines
2009-06-12lguest: try to batch interrupts on network receiveRusty Russell1-1/+16
Rather than triggering an interrupt every time, we only trigger an interrupt when there are no more incoming packets (or the recv queue is full). However, the overhead of doing the select to figure this out is measurable: 1M pings goes from 98 to 104 seconds, and 1G Guest->Host TCP goes from 3.69 to 3.94 seconds. It's close to the noise though. I tested various timeouts, including reducing it as the number of pending packets increased, timing a 1 gigabyte TCP send from Guest -> Host and Host -> Guest (GSO disabled, to increase packet rate). // time tcpblast -o -s 65536 -c 16k 192.168.2.1:9999 > /dev/null Timeout Guest->Host Pkts/irq Host->Guest Pkts/irq Before 11.3s 1.0 6.3s 1.0 0 11.7s 1.0 6.6s 23.5 1 17.1s 8.8 8.6s 26.0 1/pending 13.4s 1.9 6.6s 23.8 2/pending 13.6s 2.8 6.6s 24.1 5/pending 14.1s 5.0 6.6s 24.4 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: avoid sending interrupts to Guest when no activity occurs.Rusty Russell1-0/+9
If we track how many buffers we've used, we can tell whether we really need to interrupt the Guest. This happens as a side effect of spurious notifications. Spurious notifications happen because it can take a while before the Host thread wakes up and sets the VRING_USED_F_NO_NOTIFY flag, and meanwhile the Guest can more notifications. A real fix would be to use wake counts, rather than a suppression flag, but the practical difference is generally in the noise: the interrupt is usually coalesced into a pending one anyway so we just save a system call which isn't clearly measurable. Secs Spurious IRQS 1G TCP Guest->Host: 3.93 58 1M normal pings: 100 72 1M 1k pings (-l 120): 57 492904 Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: implement deferred interrupts in example LauncherRusty Russell1-19/+22
Rather than sending an interrupt on every buffer, we only send an interrupt when we're about to wait for the Guest to send us a new one. The console input and network input still send interrupts manually, but the block device, network and console output queues can simply rely on this logic to send interrupts to the Guest at the right time. The patch is cluttered by moving trigger_irq() higher in the code. In practice, two factors make this optimization less interesting: (1) we often only get one input at a time, even for networking, (2) triggering an interrupt rapidly tends to get coalesced anyway. Before: Secs RxIRQS TxIRQs 1G TCP Guest->Host: 3.72 32784 32771 1M normal pings: 99 1000004 995541 100,000 1k pings (-l 120): 5 49510 49058 After: 1G TCP Guest->Host: 3.69 32809 32769 1M normal pings: 99 1000004 996196 100,000 1k pings (-l 120): 5 52435 52361 (Note the interrupt count on 100k pings goes *up*: see next patch). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: have example Launcher service all devices in separate threadsRusty Russell1-574/+259
Currently lguest has three threads: the main Launcher thread, a Waker thread, and a thread for the block device (because synchronous block was simply too painful to bear). The Waker selects() on all the input file descriptors (eg. stdin, net devices, pipe to the block thread) and when one becomes readable it calls into the kernel to kick the Launcher thread out into userspace, which repeats the poll, services the device(s), and then tells the kernel to release the Waker before re-entering the kernel to run the Guest. Also, to make a slightly-decent network transmit routine, the Launcher would suppress further network interrupts while it set a timer: that signal handler would write to a pipe, which would rouse the Waker which would prod the Launcher out of the kernel to check the network device again. Now we can convert all our virtqueues to separate threads: each one has a separate eventfd for when the Guest pokes the device, and can trigger interrupts in the Guest directly. The linecount shows how much this simplifies, but to really bring it home, here's an strace analysis of single Guest->Host ping before: * Guest sends packet, notifies xmit vq, return control to Launcher * Launcher clears notification flag on xmit ring * Launcher writes packet to TUN device writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\366\r\224`\2058\272m\224vf\274\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108 * Launcher sets up interrupt for Guest (xmit ring is empty) write(10, "\2\0\0\0\3\0\0\0", 8) = 0 * Launcher sets up timer for interrupt mitigation setitimer(ITIMER_REAL, {it_interval={0, 0}, it_value={0, 505}}, NULL) = 0 * Launcher re-runs guest pread64(10, 0xbfa5f4d4, 4, 0) ... * Waker notices reply packet in tun device (it was in select) select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [4]) * Waker kicks Launcher out of guest: pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0 * Launcher returns from running guest: ... = -1 EAGAIN (Resource temporarily unavailable) * Launcher looks at input fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [4], left {0, 0}) * Launcher reads pong from tun device: readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"\272m\224vf\274\366\r\224`\2058\10\0E\0\0T\364\26\0\0@"..., 1518}], 2) = 108 * Launcher injects guest notification: write(10, "\2\0\0\0\2\0\0\0", 8) = 0 * Launcher rechecks fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout) * Launcher clears Waker: pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0 * Launcher reruns Guest: pread64(10, 0xbfa5f4d4, 4, 0) = ? ERESTARTSYS (To be restarted) * Signal comes in, uses pipe to wake up Launcher: --- SIGALRM (Alarm clock) @ 0 (0) --- write(8, "\0", 1) = 1 sigreturn() = ? (mask now []) * Waker sees write on pipe: select(12, [0 3 4 6 11], NULL, NULL, NULL) = 1 (in [6]) * Waker kicks Launcher out of Guest: pwrite64(10, "\3\0\0\0\1\0\0\0", 8, 0) = 0 * Launcher exits from kernel: pread64(10, 0xbfa5f4d4, 4, 0) = -1 EAGAIN (Resource temporarily unavailable) * Launcher looks to see what fd woke it: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 1 (in [6], left {0, 0}) * Launcher reads timeout fd, sets notification flag on xmit ring read(6, "\0", 32) = 1 * Launcher rechecks fds: select(7, [0 3 4 6], NULL, NULL, {0, 0}) = 0 (Timeout) * Launcher clears Waker: pwrite64(10, "\3\0\0\0\0\0\0\0", 8, 0) = 0 * Launcher resumes Guest: pread64(10, "\0p\0\4", 4, 0) .... strace analysis of single Guest->Host ping after: * Guest sends packet, notifies xmit vq, creates event on eventfd. * Network xmit thread wakes from read on eventfd: read(7, "\1\0\0\0\0\0\0\0", 8) = 8 * Network xmit thread writes packet to TUN device writev(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"J\217\232FI\37j\27\375\276\0\304\10\0E\0\0T\0\0@\0@\1\265"..., 98}], 2) = 108 * Network recv thread wakes up from read on tunfd: readv(4, [{"\0\0\0\0\0\0\0\0\0\0", 10}, {"j\27\375\276\0\304J\217\232FI\37\10\0E\0\0TiO\0\0@\1\214"..., 1518}], 2) = 108 * Network recv thread sets up interrupt for the Guest write(6, "\2\0\0\0\2\0\0\0", 8) = 0 * Network recv thread goes back to reading tunfd 13:39:42.460285 readv(4, <unfinished ...> * Network xmit thread sets up interrupt for Guest (xmit ring is empty) write(6, "\2\0\0\0\3\0\0\0", 8) = 0 * Network xmit thread goes back to reading from eventfd read(7, <unfinished ...> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: PAE supportMatias Zabaljauregui1-1/+0
This version requires that host and guest have the same PAE status. NX cap is not offered to the guest, yet. Signed-off-by: Matias Zabaljauregui <zabaljauregui@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: fix writev returning short on console outputRusty Russell1-1/+6
I've never seen it here, but I can't find anywhere that says writev will write everything. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: clean up length-used value in example launcherRusty Russell1-7/+4
The "len" field in the used ring for virtio indicates the number of bytes *written* to the buffer. This means the guest doesn't have to zero the buffers in advance as it always knows the used length. Erroneously, the console and network example code puts the length *read* into that field. The guest ignores it, but it's wrong. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: clean up example launcher compile flags.Rusty Russell1-2/+1
18 months ago 5bbf89fc260830f3f58b331d946a16b39ad1ca2d changed to loading bzImages directly, and no longer manually ungzipping them, so we no longer need libz. Also, -m32 is useful for those on 64-bit platforms (and harmless on 32-bit). Reported-by: Ron Minnich <rminnich@gmail.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: remove invalid interrupt forcing logic.Rusty Russell1-7/+1
20887611523e749d99cc7d64ff6c97d27529fbae (lguest: notify on empty) introduced lguest support for the VIRTIO_F_NOTIFY_ON_EMPTY flag, but in fact it turned on interrupts all the time. Because we always process one buffer at a time, the inflight count is always 0 when call trigger_irq and so we always ignore VRING_AVAIL_F_NO_INTERRUPT from the Guest. It should be looking to see if there are more buffers in the Guest's queue: if it's empty, then we force an interrupt. This makes little difference, since we usually have an empty queue; but that's the subject of another patch. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: get more serious about wmb() in example Launcher codeRusty Russell1-3/+4
Since the Launcher process runs the Guest, it doesn't have to be very serious about its barriers: the Guest isn't running while we are (Guest is UP). Before we change to use threads to service devices, we need to fix this. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: cleanup passing of /dev/lguest fd around example launcher.Rusty Russell1-55/+47
We hand the /dev/lguest fd everywhere; it's far neater to just make it a global (it already is, in fact, hidden in the waker_fds struct). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12lguest: be paranoid about guest playing with device descriptors.Rusty Russell1-10/+17
We can't trust the values in the device descriptor table once the guest has booted, so keep local copies. They could set them to strange values then cause us to segv (they're 8 bit values, so they can't make our pointers go too wild). This becomes more important with the following patches which read them. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2009-06-12Merge branch 'for-linus' of git://linux-arm.org/linux-2.6Linus Torvalds2-0/+146
* 'for-linus' of git://linux-arm.org/linux-2.6: kmemleak: Add the corresponding MAINTAINERS entry kmemleak: Simple testing module for kmemleak kmemleak: Enable the building of the memory leak detector kmemleak: Remove some of the kmemleak false positives kmemleak: Add modules support kmemleak: Add kmemleak_alloc callback from alloc_large_system_hash kmemleak: Add the vmalloc memory allocation/freeing hooks kmemleak: Add the slub memory allocation/freeing hooks kmemleak: Add the slob memory allocation/freeing hooks kmemleak: Add the slab memory allocation/freeing hooks kmemleak: Add documentation on the memory leak detector kmemleak: Add the base support Manual conflict resolution (with the slab/earlyboot changes) in: drivers/char/vt.c init/main.c mm/slab.c
2009-06-11Merge branch 'for-2.6.31' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds3-1/+93
* 'for-2.6.31' of git://git.kernel.dk/linux-2.6-block: (153 commits) block: add request clone interface (v2) floppy: fix hibernation ramdisk: remove long-deprecated "ramdisk=" boot-time parameter fs/bio.c: add missing __user annotation block: prevent possible io_context->refcount overflow Add serial number support for virtio_blk, V4a block: Add missing bounce_pfn stacking and fix comments Revert "block: Fix bounce limit setting in DM" cciss: decode unit attention in SCSI error handling code cciss: Remove no longer needed sendcmd reject processing code cciss: change SCSI error handling routines to work with interrupts enabled. cciss: separate error processing and command retrying code in sendcmd_withirq_core() cciss: factor out fix target status processing code from sendcmd functions cciss: simplify interface of sendcmd() and sendcmd_withirq() cciss: factor out core of sendcmd_withirq() for use by SCSI error handling code cciss: Use schedule_timeout_uninterruptible in SCSI error handling code block: needs to set the residual length of a bidi request Revert "block: implement blkdev_readpages" block: Fix bounce limit setting in DM Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txt ... Manually fix conflicts with tracing updates in: block/blk-sysfs.c drivers/ide/ide-atapi.c drivers/ide/ide-cd.c drivers/ide/ide-floppy.c drivers/ide/ide-tape.c include/trace/events/block.h kernel/trace/blktrace.c
2009-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmwLinus Torvalds2-9/+12
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-nmw: (25 commits) GFS2: Merge gfs2_get_sb into gfs2_get_sb_meta GFS2: Fix cache coherency between truncate and O_DIRECT read GFS2: Fix locking issue mounting gfs2meta fs GFS2: Remove unused variable GFS2: smbd proccess hangs with flock() call. GFS2: Remove args subdir from gfs2 sysfs files GFS2: Remove lockstruct subdir from gfs2 sysfs files GFS2: Move gfs2_unlink_ok into ops_inode.c GFS2: Move gfs2_readlinki into ops_inode.c GFS2: Move gfs2_rmdiri into ops_inode.c GFS2: Merge mount.c and ops_super.c into super.c GFS2: Clean up some file names GFS2: Be more aggressive in reclaiming unlinked inodes GFS2: Add a rgrp bitmap full flag GFS2: Improve resource group error handling GFS2: Don't warn when delete inode fails on ro filesystem GFS2: Update docs GFS2: Umount recovery race fix GFS2: Remove a couple of unused sysfs entries GFS2: Add commit= mount option ...
2009-06-11Merge branch 'for-linus' of ↵Linus Torvalds3-2/+35
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: (44 commits) nommu: Provide mmap_min_addr definition. TOMOYO: Add description of lists and structures. TOMOYO: Remove unused field. integrity: ima audit dentry_open failure TOMOYO: Remove unused parameter. security: use mmap_min_addr indepedently of security models TOMOYO: Simplify policy reader. TOMOYO: Remove redundant markers. SELinux: define audit permissions for audit tree netlink messages TOMOYO: Remove unused mutex. tomoyo: avoid get+put of task_struct smack: Remove redundant initialization. integrity: nfsd imbalance bug fix rootplug: Remove redundant initialization. smack: do not beyond ARRAY_SIZE of data integrity: move ima_counts_get integrity: path_check update IMA: Add __init notation to ima functions IMA: Minimal IMA policy and boot param for TCB IMA policy selinux: remove obsolete read buffer limit from sel_read_bool ...
2009-06-11kmemleak: Add documentation on the memory leak detectorCatalin Marinas2-0/+146
This patch adds the Documentation/kmemleak.txt file with some information about how kmemleak works. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2009-06-11Merge branch 'tracing-for-linus' of ↵Linus Torvalds6-14/+214
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (244 commits) Revert "x86, bts: reenable ptrace branch trace support" tracing: do not translate event helper macros in print format ftrace/documentation: fix typo in function grapher name tracing/events: convert block trace points to TRACE_EVENT(), fix !CONFIG_BLOCK tracing: add protection around module events unload tracing: add trace_seq_vprint interface tracing: fix the block trace points print size tracing/events: convert block trace points to TRACE_EVENT() ring-buffer: fix ret in rb_add_time_stamp ring-buffer: pass in lockdep class key for reader_lock tracing: add annotation to what type of stack trace is recorded tracing: fix multiple use of __print_flags and __print_symbolic tracing/events: fix output format of user stack tracing/events: fix output format of kernel stack tracing/trace_stack: fix the number of entries in the header ring-buffer: discard timestamps that are at the start of the buffer ring-buffer: try to discard unneeded timestamps ring-buffer: fix bug in ring_buffer_discard_commit ftrace: do not profile functions when disabled tracing: make trace pipe recognize latency format flag ...
2009-06-11Merge branch 'oprofile-for-linus' of ↵Linus Torvalds1-0/+8
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: oprofile: introduce module_param oprofile.cpu_type oprofile: add support for Core i7 and Atom oprofile: remove undocumented oprofile.p4force option oprofile: re-add force_arch_perfmon option
2009-06-11Merge branch 'rcu-for-linus' of ↵Linus Torvalds1-22/+80
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: rcu_sched_grace_period(): kill the bogus flush_signals() rculist: use list_entry_rcu in places where it's appropriate rculist.h: introduce list_entry_rcu() and list_first_entry_rcu() rcu: Update RCU tracing documentation for __rcu_pending rcu: Add __rcu_pending tracing to hierarchical RCU RCU: make treercu be default
2009-06-11Merge branch 'next' into for-linusJames Morris3-2/+35
2009-06-11Merge branch 'iommu-for-linus' of ↵Linus Torvalds2-5/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (61 commits) amd-iommu: remove unnecessary "AMD IOMMU: " prefix amd-iommu: detach device explicitly before attaching it to a new domain amd-iommu: remove BUS_NOTIFY_BOUND_DRIVER handling dma-debug: simplify logic in driver_filter() dma-debug: disable/enable irqs only once in device_dma_allocations dma-debug: use pr_* instead of printk(KERN_* ...) dma-debug: code style fixes dma-debug: comment style fixes dma-debug: change hash_bucket_find from first-fit to best-fit x86: enable GART-IOMMU only after setting up protection methods amd_iommu: fix lock imbalance dma-debug: add documentation for the driver filter dma-debug: add dma_debug_driver kernel command line dma-debug: add debugfs file for driver filter dma-debug: add variables and checks for driver filter dma-debug: fix debug_dma_sync_sg_for_cpu and debug_dma_sync_sg_for_device dma-debug: use sg_dma_len accessor dma-debug: use sg_dma_address accessor instead of using dma_address directly amd-iommu: don't free dma adresses below 512MB with CONFIG_IOMMU_STRESS amd-iommu: don't preallocate page tables with CONFIG_IOMMU_STRESS ...
2009-06-11Merge branch 'futexes-for-linus' of ↵Linus Torvalds1-0/+131
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'futexes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: futex: fix restart in wait_requeue_pi futex: fix restart for early wakeup in futex_wait_requeue_pi() futex: cleanup error exit futex: remove the wait queue futex: add requeue-pi documentation futex: remove FUTEX_REQUEUE_PI (non CMP) futex: fix futex_wait_setup key handling sparc64: extend TI_RESTART_BLOCK space by 8 bytes futex: fixup unlocked requeue pi case futex: add requeue_pi functionality futex: split out futex value validation code futex: distangle futex_requeue() futex: add FUTEX_HAS_TIMEOUT flag to restart.futex.flags rt_mutex: add proxy lock routines futex: split out fixup owner logic from futex_lock_pi() futex: split out atomic logic from futex_lock_pi() futex: add helper to find the top prio waiter of a futex futex: separate futex_wait_queue_me() logic from futex_wait()
2009-06-11Merge branch 'x86-mm-for-linus' of ↵Linus Torvalds2-9/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits) x86: fix system without memory on node0 x86, mm: Fix node_possible_map logic mm, x86: remove MEMORY_HOTPLUG_RESERVE related code x86: make sparse mem work in non-NUMA mode x86: process.c, remove useless headers x86: merge process.c a bit x86: use sparse_memory_present_with_active_regions() on UMA x86: unify 64-bit UMA and NUMA paging_init() x86: Allow 1MB of slack between the e820 map and SRAT, not 4GB x86: Sanity check the e820 against the SRAT table using e820 map only x86: clean up and and print out initial max_pfn_mapped x86/pci: remove rounding quirk from e820_setup_gap() x86, e820, pci: reserve extra free space near end of RAM x86: fix typo in address space documentation x86: 46 bit physical address support on 64 bits x86, mm: fault.c, use printk_once() in is_errata93() x86: move per-cpu mmu_gathers to mm/init.c x86: move max_pfn_mapped and max_low_pfn_mapped to setup.c x86: unify noexec handling x86: remove (null) in /sys kernel_page_tables ...
2009-06-11Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds1-0/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: cpu_debug: Remove model information to reduce encoding-decoding x86: fixup numa_node information for AMD CPU northbridge functions x86: k8 convert node_to_k8_nb_misc() from a macro to an inline function x86: cacheinfo: complete L2/L3 Cache and TLB associativity field definitions x86/docs: add description for cache_disable sysfs interface x86: cacheinfo: disable L3 ECC scrubbing when L3 cache index is disabled x86: cacheinfo: replace sysfs interface for cache_disable feature x86: cacheinfo: use cached K8 NB_MISC devices instead of scanning for it x86: cacheinfo: correct return value when cache_disable feature is not active x86: cacheinfo: use L3 cache index disable feature only for CPUs that support it
2009-06-11Merge branch 'sched-docs-for-linus' of ↵Linus Torvalds1-1/+128
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-docs-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Document memory barriers implied by sleep/wake-up primitives
2009-06-11Merge branch 'sched-core-for-linus' of ↵Linus Torvalds2-4/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: fix typo in sched-rt-group.txt file ftrace: fix typo about map of kernel priority in ftrace.txt file. sched: properly define the sched_group::cpumask and sched_domain::span fields sched, timers: cleanup avenrun users sched, timers: move calc_load() to scheduler sched: Don't export sched_mc_power_savings on multi-socket single core system sched: emit thread info flags with stack trace sched: rt: document the risk of small values in the bandwidth settings sched: Replace first_cpu() with cpumask_first() in ILB nomination code sched: remove extra call overhead for schedule() sched: use group_first_cpu() instead of cpumask_first(sched_group_cpus()) wait: don't use __wake_up_common() sched: Nominate a power-efficient ilb in select_nohz_balancer() sched: Nominate idle load balancer from a semi-idle package. sched: remove redundant hierarchy walk in check_preempt_wakeup
2009-06-11Merge branch 'x86-kbuild-for-linus' of ↵Linus Torvalds1-8/+114
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-kbuild-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (46 commits) x86, boot: add new generated files to the appropriate .gitignore files x86, boot: correct the calculation of ZO_INIT_SIZE x86-64: align __PHYSICAL_START, remove __KERNEL_ALIGN x86, boot: correct sanity checks in boot/compressed/misc.c x86: add extension fields for bootloader type and version x86, defconfig: update kernel position parameters x86, defconfig: update to current, no material changes x86: make CONFIG_RELOCATABLE the default x86: default CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN to 16 MB x86: document new bzImage fields x86, boot: make kernel_alignment adjustable; new bzImage fields x86, boot: remove dead code from boot/compressed/head_*.S x86, boot: use LOAD_PHYSICAL_ADDR on 64 bits x86, boot: make symbols from the main vmlinux available x86, boot: determine compressed code offset at compile time x86, boot: use appropriate rep string for move and clear x86, boot: zero EFLAGS on 32 bits x86, boot: set up the decompression stack as early as possible x86, boot: straighten out ranges to copy/zero in compressed/head*.S x86, boot: stylistic cleanups for boot/compressed/head_64.S ... Fixed trivial conflict in arch/x86/configs/x86_64_defconfig manually
2009-06-10ftrace/documentation: fix typo in function grapher nameMike Frysinger1-1/+1
The function graph tracer is called just "function_graph" (no trailing "_tracer" needed). Signed-off-by: Mike Frysinger <vapier@gentoo.org> LKML-Reference: <1244623722-6325-1-git-send-email-vapier@gentoo.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-09Merge branch 'master' into nextJames Morris6-25/+96
2009-06-07Merge branch 'linus' into x86/cpuIngo Molnar56-3633/+2393
2009-06-07Merge branch 'dma-debug/2.6.31' of ↵Ingo Molnar2-0/+19
git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into core/iommu
2009-06-07Merge branch 'linus' into core/iommuIngo Molnar11-58/+589
Merge reason: This branch was on an -rc5 base so pull almost-2.6.30 to resync with the latest upstream fixes and make sure the combination works fine. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-02dma-debug: add documentation for the driver filterJoerg Roedel1-0/+12
This patch adds the driver filter feature to the dma-debug documentation. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-02dma-debug: add dma_debug_driver kernel command lineJoerg Roedel1-0/+7
This patch add the dma_debug_driver= boot parameter to enable the driver filter for early boot. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-06-02Removed reference to non-existing file Documentation/PCI/PCI-DMA-mapping.txtvibi sreenivasan1-1/+1
File Documentation/PCI/PCI-DMA-mapping.txt does not exist. Documentation/DMA-mapping.txt contains DMA Mapping details Signed-off-by: vibi sreenivasan <vibi_sreenivasan@cms.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02cciss: add cciss driver sysfs entriesAndrew Patterson1-0/+33
Add sysfs entries to the cciss driver needed for the dm/multipath tools. A file for vendor, model, rev, and unique_id is added for each logical drive under directory /sys/bus/pci/devices/<dev>/ccissX/cXdY. Where X = the controller (or host) number and Y is the logical drive number. A link from /sys/bus/pci/devices/<dev>/ccissX/cXdY/block:cciss!cXdY to /sys/block/cciss!cXdY/device is also created. A bus is created in /sys/bus/cciss. A link is created from the pci ccissX entry to /sys/bus/cciss/devices/ccissX. Please consider this for inclusion. Signed-off-by: Mike Miller <mike.miller@hp.com> Cc: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-06-02ftrace: add kernel command line function filteringSteven Rostedt1-2/+15
When using ftrace=function on the command line to trace functions on boot up, one can not filter out functions that are commonly called. This patch adds two new ftrace command line commands. ftrace_notrace=function-list ftrace_filter=function-list Where function-list is a comma separated list of functions to filter. The ftrace_notrace will make the functions listed not be included in the function tracing, and ftrace_filter will only trace the functions listed. These two act the same as the debugfs/tracing/set_ftrace_notrace and debugfs/tracing/set_ftrace_filter respectively. The simple glob expressions that are allowed by the filter files can also be used by the command line interface. ftrace_notrace=rcu*,*lock,*spin* Will not trace any function that starts with rcu, ends with lock, or has the word spin in it. Note, if the self tests are enabled, they may interfere with the filtering set by the command lines. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-06-01Merge branch 'linus' into irq/numaIngo Molnar11-58/+589
Conflicts: arch/mips/sibyte/bcm1480/irq.c arch/mips/sibyte/sb1250/irq.c Merge reason: we gathered a few conflicts plus update to latest upstream fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-06-01hwmon: Update documentation on fan_maxChristian Engelmayer1-0/+6
Add fan_max description. Add fan limit alarm 'max_alarm' to the alarm section. Signed-off-by: Christian Engelmayer <christian.engelmayer@frequentis.com> Acked-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jean Delvare <khali@linux-fr.org>
2009-05-29Merge branch 'for-linus' of ↵Linus Torvalds1-24/+79
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: libps2 - better handle bad scheduler decisions Input: usb1400_ts - fix access to "device data" in resume function Input: multitouch - augment event semantics documentation Input: multitouch - add tracking ID to the protocol
2009-05-29sched: fix typo in sched-rt-group.txt fileGeunSik Lim1-1/+1
Fix typo about static priority's range. Kernel Space User Space =============================================================== 0(high) to 98(low) user RT priority 99(high) to 1(low) with SCHED_RR or SCHED_FIFO --------------------------------------------------------------- 99 sched_priority is not used in scheduling decisions(it must be specified as 0) --------------------------------------------------------------- 100(high) to 139(low) user nice -20(high) to 19(low) --------------------------------------------------------------- 140 idle task priority --------------------------------------------------------------- * ref) http://www.kernel.org/doc/man-pages/online/pages/man2/sched_setscheduler.2.html Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com> CC: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-29ftrace: fix typo about map of kernel priority in ftrace.txt file.GeunSik Lim1-3/+12
Fix typo about chart to map the kernel priority to user land priorities. * About sched_setscheduler(2) Processes scheduled under SCHED_FIFO or SCHED_RR can have a (user-space) static priority in the range 1 to 99. (reference: http://www.kernel.org/doc/man-pages/online/pages/ man2/sched_setscheduler.2.html) * From: Steven Rostedt 0 to 98 - maps to RT tasks 99 to 1 (SCHED_RR or SCHED_FIFO) 99 - maps to internal kernel threads that want to be lower than RT tasks but higher than SCHED_OTHER tasks. Although I'm not sure if any kernel thread actually uses this. I'm not even sure how this can be set, because the internal sched_setscheduler function does not allow for it. 100 to 139 - maps nice levels -20 to 19. These are not set via sched_setscheduler, but are set via the nice system call. 140 - reserved for idle tasks. Signed-off-by: GeunSik Lim <geunsik.lim@samsung.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-28amd-iommu: remove amd_iommu_size kernel parameterJoerg Roedel1-5/+0
This parameter is not longer necessary when aperture increases dynamically. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2009-05-27Merge branch 'fix/pcm-jiffies-check' into for-linusTakashi Iwai1-0/+5
* fix/pcm-jiffies-check: ALSA: Enable PCM hw_ptr_jiffies check only in xrun_debug mode ALSA: Fix invalid jiffies check after pause
2009-05-27ALSA: Enable PCM hw_ptr_jiffies check only in xrun_debug modeTakashi Iwai1-0/+5
The PCM hw_ptr jiffies check results sometimes in problems when a hardware doesn't give smooth hw_ptr updates. So far, au88x0 and some other drivers appear not working due to this strict check. However, this check is a nice debug tool, and the capability should be still kept. Hence, we disable this check now as default unless the user enables it by setting the xrun_debug mode to the specific stream via a proc file. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-05-26Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ALSA: hda - Add missing check of pin vref 50 and others in Realtek codecs ALSA: hda - Add 5stack-no-fp model for STAC927x ALSA: hda - Add forced codec-slots for ASUS W5Fm
2009-05-26kmemtrace: fix kernel parameter documentationPekka Enberg1-10/+0
The kmemtrace.enable kernel parameter no longer works. To enable kmemtrace at boot-time, you must pass "ftrace=kmemtrace" instead. [ Impact: remove obsolete kernel parameter documentation ] Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> LKML-Reference: <alpine.DEB.2.00.0905241112190.10296@rocky> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2009-05-26Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Remove remap percpu allocator for the time being x86: cpa_flush_array wbinvd should be done on all CPUs x86: bugfix wbinvd() model check instead of family check x86: introduce noxsave boot parameter x86, setup: revert ACPI 3 E820 extended attributes support x86: DMI match for the Sony VGN-Z540N as it needs BIOS reboot
2009-05-24ALSA: hda - Add 5stack-no-fp model for STAC927xTakashi Iwai1-0/+1
The recent fix for the headphone volume control on IDT/STAC codecs resulted in the removal of invalid "Side" volume eventually. But, if the front panel doesn't exist, this setup could be regarded as a sort of regression, as reported in kernel bug #13250. Now as a workaround, a new model 5stack-no-fp is added so that the user without the front panel can choose this one explicitly. Reference: bko#13250 http://bugzilla.kernel.org/show_bug.cgi?id=13250 Signed-off-by: Takashi Iwai <tiwai@suse.de>