summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2011-02-25genirq: Prepare the handling of shared oneshot interruptsThomas Gleixner2-19/+111
For level type interrupts we need to track how many threads are on flight to avoid useless interrupt storms when not all thread handlers have finished yet. Keep track of the woken threads and only unmask when there are no more threads in flight. Yes, I'm lazy and using a bitfield. But not only because I'm lazy, the main reason is that it's way simpler than using a refcount. A refcount based solution would need to keep track of various things like crashing the irq thread, spurious interrupts coming in, disables/enables, free_irq() and some more. The bitfield keeps the tracking simple and makes things just work. It's also nicely confined to the thread code pathes and does not require additional checks all over the place. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20110223234956.388095876@linutronix.de>
2011-02-25genirq: Make warning in handle_percpu_event usefulThomas Gleixner1-1/+2
The WARN_ON_ONCE in handle_percpu_event() which emits a warning when an action handler returns with interrupts enabled is not really useful. It does not reveal the interrupt number and handler function which caused it. Make it WARN_ONCE() and add the information. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-25sched: Add #ifdef around irq time accounting functionsHeiko Carstens1-0/+2
Get rid of this: kernel/sched.c:3731:13: warning: 'irqtime_account_idle_ticks' defined but not used kernel/sched.c:3732:13: warning: 'irqtime_account_process_tick' defined but not used Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20110225133228.GD7469@osiris.boeblingen.de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23perf: Simplify task_clock_event_read()Peter Zijlstra1-10/+3
There is no point in us having different code paths for nmi and !nmi here, so remove the !nmi one. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23perf_events: Fix rcu and locking issues with cgroup supportStephane Eranian1-11/+29
This patches ensures that we do not end up calling perf_cgroup_from_task() when there is no cgroup event. This avoids potential RCU and locking issues. The change in perf_cgroup_set_timestamp() ensures we check against ctx->nr_cgroups. It also avoids calling perf_clock() tiwce in a row. It also ensures we do need to grab ctx->lock before calling the function. We drop update_cgrp_time() from task_clock_event_read() because it is not needed. This also avoids having to deal with perf_cgroup_from_task(). Thanks to Peter Zijlstra for his help on this. Signed-off-by: Stephane Eranian <eranian@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d5e76b8.815bdf0a.7ac3.774f@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup: Stop claiming ownership of the root task groupMike Galbraith1-5/+6
Disown it, and only display autogroup association if one exists. Signed-off-by: Mike Galbraith <efault@gmx.de> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1298383320.8036.5.camel@marge.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup: Stop going ahead if autogroup is disabledYong Zhang2-0/+9
when autogroup is disable from the beginning, sched_autogroup_create_attach() autogroup_move_group() <== 1 sched_move_task() <== 2 task_move_group_fair() set_task_rq() task_group() autogroup_task_group() We go the whole path without doing anything useful. Then stop going further if autogroup is disabled. But there will be a race window between 1 and 2, in which sysctl_sched_autogroup_enabled is enabled. This issue will be toke by following patch. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <1298185696-4403-4-git-send-email-yong.zhang0@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched, autogroup, sysctl: Use proc_dointvec_minmax() insteadYong Zhang1-1/+1
sched_autogroup_enabled has min/max value, proc_dointvec_minmax() is be used for this case. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1298185696-4403-2-git-send-email-yong.zhang0@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched: Fix the group_imb logicPeter Zijlstra1-2/+10
On a 2*6*2 machine something like: taskset -c 3-11 bash -c 'for ((i=0;i<9;i++)) do while :; do :; done & done' _should_ result in 9 busy CPUs, each running 1 task. However it didn't quite work reliably, most of the time one cpu of the second socket (6-11) would be idle and one cpu of the first socket (0-5) would have two tasks on it. The group_imb logic is supposed to deal with this and detect when a particular group is imbalanced (like in our case, 0-2 are idle but 3-5 will have 4 tasks on it). The detection phase needed a bit of a tweak as it was too weak and required more than 2 avg weight tasks difference between idle and busy cpus in the group which won't trigger for our test-case. So cure that to be one or more avg task weight difference between cpus. Once the detection phase worked, it was then defeated by the f_b_g() tests trying to avoid ping-pongs. In particular, this_load >= max_load triggered because the pulling cpu (the (first) idle cpu in on the second socket, say 6) would find this_load to be 5 and max_load to be 4 (there'd be 5 tasks running on our socket and only 4 on the other socket). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched: Clean up some f_b_g() commentsPeter Zijlstra1-15/+13
The existing comment tends to grow state (as it already has), split it up and place it near the actual tests. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23sched: Clean up remnants of sd_idlePeter Zijlstra1-13/+10
With the wholesale removal of the sd_idle SMT logic we can clean up some more. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Nikhil Rao <ncrao@google.com> Cc: Venkatesh Pallipadi <venki@google.com> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23Merge commit 'v2.6.38-rc6' into sched/coreIngo Molnar4-22/+30
Merge reason: Pick up the latest fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-23genirq: Streamline kernel/irq/KconfigJan Beulich1-10/+10
"def_bool n" without prompt is pointless, these should be just "bool". [ tglx: Adapted to latest changes ] Signed-off-by: Jan Beulich <jbeulich@novell.com> LKML-Reference: <4D5D3309020000780003264A@vpn.id2.novell.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-23rtmutex: tester: Remove the remaining BKL leftoversThomas Gleixner1-3/+2
We just leave the numbers assinged as commemoration and in case that someone was crazy enough to reimplement the test stuff out of tree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-22Merge branch 'irq-fixes-for-linus' of ↵Linus Torvalds4-3/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Disable the SHIRQ_DEBUG call in request_threaded_irq for now genirq: Prevent access beyond allocated_irqs bitmap
2011-02-22Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds1-4/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix throttle logic perf, x86: P4 PMU: Fix spurious NMI messages
2011-02-22genirq: Use the correct variable for note_interruptThomas Gleixner1-7/+9
note_interrupt wants to be called with the combined result of all handlers called, not with the last one. If it's a shared interrupt then the last handler might return IRQ_NONE often enough to trigger the spurious dectector which turns off a perfectly fine working interrupt line. Bug was introduced in commit 1277a532(genirq: Simplify handle_irq_event()). Yes, I really messed up there. First the variable ret should not have been named differently to avoid similarity with retval. Second it should have been declared in the do {} loop. Rename it to res and move it into the do {} loop and vanish under a huge brown paperbag. Reported-bisected-tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-21timers: Export CLOCK_BOOTTIME via the posix timers interfaceJohn Stultz1-1/+20
This patch exports CLOCK_BOOTTIME through the posix timers interface CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-02-21timers: Add CLOCK_BOOTTIME hrtimer baseJohn Stultz1-4/+12
CLOCK_MONOTONIC stops while the system is in suspend. This is because to applications system suspend is invisible. However, there is a growing set of applications that are wanting to be suspend-aware, but do not want to deal with the complications of CLOCK_REALTIME (which might jump around if settimeofday is called). For these applications, I propose a new clockid: CLOCK_BOOTTIME. CLOCK_BOOTTIME is idential to CLOCK_MONOTONIC, except it also includes any time spent in suspend. This patch add hrtimer base for CLOCK_BOOTTIME, using get_monotonic_boottime/ktime_get_boottime, to allow in kernel users to set timers against. CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-02-21time: Extend get_xtime_and_monotonic_offset() to also return sleepJohn Stultz2-6/+11
Extend get_xtime_and_monotonic_offset to get_xtime_and_monotonic_and_sleep_offset(). CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-02-21time: Introduce get_monotonic_boottime and ktime_get_boottimeJohn Stultz1-1/+50
This adds new functions that return the monotonic time since boot (in other words, CLOCK_MONOTONIC + suspend time). CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-02-21hrtimers: extend hrtimer base code to handle more then 2 clockidsJohn Stultz1-13/+27
The hrtimer code is written mainly with CLOCK_REALTIME and CLOCK_MONOTONIC in mind. These are clockids 0 and 1 resepctively. However, if we are to introduce any new hrtimer bases, using new clockids, we have to skip the cputimers (clockids 2,3) as well as other clockids that may not impelement timers. This patch adds a little bit of indirection between the clockid and the base, so that we can extend the base by one when we add a new clockid at number 7 or so. CC: Jamie Lokier <jamie@shareable.org> CC: Thomas Gleixner <tglx@linutronix.de> CC: Alexander Shishkin <virtuoso@slind.org> CC: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-02-21genirq: Add missing break in __irq_set_trigger()Thomas Gleixner1-0/+1
The switch case in __irq_set_trigger() lacks a break, which emits a pr_err unconditionally on success. Reported-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-21genirq: Use IRQ_BITMAP_BITS as search size in irq_alloc_descs()Yinghai Lu1-7/+8
The runtime expansion of nr_irqs does not take into account that bitmap_find_next_zero_area() returns "start" + size in case the search for an matching zero area fails. That results in a start value which can be completely off and is not covered by the following expand_nr_irqs() and possibly outside of the absolute limit. But we use it without further checking. Use IRQ_BITMAP_BITS as the limit for the bitmap search and expand nr_irqs when the start bit is beyond nr_irqs. So start is always pointing to the correct area in the bitmap. nr_irqs is just the limit for irq enumerations, not the real limit for the irq space. [ tglx: Let irq_expand_nr_irqs() take the new upper end so we do not expand nr_irqs more than necessary. Made changelog readable ] Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4D6014F9.8040605@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-21genirq: Fix misplaced status update in irq_disable()Thomas Gleixner1-1/+1
We lazy disable interrupt lines, so only mark the line masked, when the chip provides an irq_disable callback. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-21workqueue: fix build failure introduced by s/freezeable/freezable/Tejun Heo1-5/+5
wq:fixes-2.6.38 does s/WQ_FREEZEABLE/WQ_FREEZABLE and wq:for-2.6.39 adds new usage of the flag. The combination of the two creates a build failure after merge. Fix it by renaming all freezeables to freezables. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
2011-02-21Merge branch 'master' into for-2.6.39Tejun Heo23-186/+241
2011-02-20Merge branch 'master' of ↵David S. Miller23-186/+241
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/e1000e/netdev.c net/xfrm/xfrm_policy.c
2011-02-19genirq: Implement irq_data based move_*_irq() versionsThomas Gleixner1-9/+19
No need to lookup the irq descriptor when calling from a chip callback function which has irq_data already handy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq; Add fasteoi irq_chip quirkThomas Gleixner1-1/+8
Some chips want irq_eoi() only called when an interrupt is actually handled. So they have checks for INPROGRESS and DISABLED in their irq_eoi callbacks. Add a chip flag, which allows to handle that in the generic code. No impact on the fastpath. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Add preflow handler supportThomas Gleixner2-0/+14
sparc64 needs to call a preflow handler on certain interrupts befor calling the action chain. Integrate it into handle_fasteoi_irq. Must be enabled via CONFIG_IRQ_FASTEOI_PREFLOW. No impact when disabled. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: David S. Miller <davem@davemloft.net>
2011-02-19genirq: Consolidate set_chip_handler functionsThomas Gleixner1-12/+4
No need to have separate functions if we have one plus inline wrappers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Use irq_get/put functionsThomas Gleixner2-117/+57
Convert the management functions to use the common irq_get/put function. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Implement irq_get/put_desc_[bus]locked/unlock()Thomas Gleixner2-0/+48
Most of the managing functions get the irq descriptor and lock it - either with or without buslock. Instead of open coding this over and over provide a common function to do that. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Remove real old transition functionsThomas Gleixner4-34/+5
These transition helpers are stale for years now. Remove them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Remove desc->status when GENERIC_HARDIRQS_NO_COMPAT=yThomas Gleixner2-0/+5
If everything uses the right accessors, then enabling GENERIC_HARDIRQS_NO_COMPAT should just work. If not it will tell you. Don't be lazy and use the trick which I use in the core code! git grep status_use_accessors will unearth it in a split second. Offenders are tracked down and not slapped with stinking trouts. This time we use frozen shark for a better educational value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Reflect IRQ_MOVE_PCNTXT in irq_data stateThomas Gleixner1-1/+3
Required by x86. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Move wakeup state to irq_dataThomas Gleixner3-5/+3
Some irq_chips need to know the state of wakeup mode for setting the trigger type etc. Reflect it in irq_data state. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Add IRQCHIP_SET_TYPE_MASKED flagThomas Gleixner3-5/+17
irq_chips, which require to mask the chip before changing the trigger type should set this flag. So the core takes care of it and the requirement for looking into desc->status in the chip goes away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Walleij <linus.walleij@stericsson.com> Cc: Lars-Peter Clausen <lars@metafoo.de>
2011-02-19genirq: Cleanup irq.hThomas Gleixner1-16/+0
Put the constants into an enum and document them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Force wrapped access to desc->status in core codeThomas Gleixner3-5/+8
Force the usage of wrappers by another nasty CPP substitution. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Wrap the remaning IRQ_* flagsThomas Gleixner5-12/+70
Use wrappers to keep them away from the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Mirror irq trigger type bits in irq_data.stateThomas Gleixner4-19/+62
That's the data structure chip functions get provided. Also allow them to signal the core code that they updated the flags in irq_data.state by returning IRQ_SET_MASK_OK_NOCOPY. The default is unchanged. The type bits should be accessed via: val = irqd_get_trigger_type(irqdata); and irqd_set_trigger_type(irqdata, val); Coders who access them directly will be tracked down and slapped with stinking trouts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Move IRQ_AFFINITY_SET to coreThomas Gleixner4-4/+24
Keep status in sync until last abuser is gone. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Reuse existing can set affinty checkThomas Gleixner2-4/+3
Add a !desc check while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Mirror IRQ_PER_CPU and IRQ_NO_BALANCING in irq_data.stateThomas Gleixner6-14/+69
That's the right data structure to look at for arch code. Accessor functions are provided. irqd_is_per_cpu(irqdata); irqd_can_balance(irqdata); Coders who access them directly will be tracked down and slapped with stinking trouts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Move debug code to separate headerThomas Gleixner2-44/+44
It'll break when I'm going to undefine the constants. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Remove CHECK_IRQ_PER_CPU from core codeThomas Gleixner2-3/+3
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Remove CONFIG_IRQ_PER_CPUThomas Gleixner3-11/+3
The saving of this switch is minimal versus the ifdef mess it creates. Simple enable PER_CPU unconditionally and remove the config switch. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-02-19genirq: Add IRQ_MOVE_PENDING to irq_data.stateThomas Gleixner6-6/+34
chip implementations need to know about it. Keep status in sync until all users are fixed. Accessor function: irqd_is_setaffinity_pending(irqdata) Coders who access them directly will be tracked down and slapped with stinking trouts. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>