summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2011-12-13UAPI: Guard linux/pmu.hDavid Howells1-0/+4
Place reinclusion guards on linux/pmu.h otherwise the UAPI splitter won't insert the #include to include the UAPI header from the kernel header. Signed-off-by: David Howells <dhowells@redhat.com>
2011-12-13UAPI: Guard linux/isdn_divertif.hDavid Howells1-0/+4
Place reinclusion guards on linux/isdn_divertif.h otherwise the UAPI splitter script won't insert the #include to include the UAPI header from the kernel header. Signed-off-by: David Howells <dhowells@redhat.com>
2011-12-13UAPI: Guard linux/sound.hDavid Howells1-0/+4
Place reinclusion guards on linux/sound.h otherwise the UAPI splitter script won't insert a #include to make the kernel header include the UAPI header. Signed-off-by: David Howells <dhowells@redhat.com>
2011-12-13linux/log2.h: Fix rounddown_pow_of_two(1)Linus Torvalds1-1/+0
Exactly like roundup_pow_of_two(1), the rounddown version was buggy for the case of a compile-time constant '1' argument. Probably because it originated from the same code, sharing history with the roundup version from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix roundup_pow_of_two(1)"). However, unlike the roundup version, the fix for rounddown is to just remove the broken special case entirely. It's simply not needed - the generic code 1UL << ilog2(n) does the right thing for the constant '1' argment too. The only reason roundup needed that special case was because rounding up does so by subtracting one from the argument (and then adding one to the result) causing the obvious problems with "ilog2(0)". But rounddown doesn't do any of that, since ilog2() naturally truncates (ie "rounds down") to the right rounded down value. And without the ilog2(0) case, there's no reason for the special case that had the wrong value. tl;dr: rounddown_pow_of_two(1) should be 1, not 0. Acked-by: Dmitry Torokhov <dtor@vmware.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-13Merge branch 'for-linus' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: mmc: core: Fix deadlock when the CONFIG_MMC_UNSAFE_RESUME is not defined mmc: sdhci-s3c: Remove old and misprototyped suspend operations mmc: tmio: fix clock gating on platforms with a .set_pwr() method mmc: sh_mmcif: fix clock gating on platforms with a .down_pwr() method mmc: core: Fix typo at mmc_card_sleep mmc: core: Fix power_off_notify during suspend mmc: core: Fix setting power notify state variable for non-eMMC mmc: core: Add quirk for long data read time mmc: Add module.h include to sdhci-cns3xxx.c mmc: mxcmmc: fix falling back to PIO mmc: omap_hsmmc: DMA unmap only once in case of MMC error
2011-12-13cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task()Tejun Heo1-3/+0
These three methods are no longer used. Kill them. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Menage <paul@paulmenage.org> Cc: Li Zefan <lizf@cn.fujitsu.com>
2011-12-13cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), ↵Tejun Heo1-3/+25
cancel_attach() and attach() Currently, there's no way to pass multiple tasks to cgroup_subsys methods necessitating the need for separate per-process and per-task methods. This patch introduces cgroup_taskset which can be used to pass multiple tasks and their associated cgroups to cgroup_subsys methods. Three methods - can_attach(), cancel_attach() and attach() - are converted to use cgroup_taskset. This unifies passed parameters so that all methods have access to all information. Conversions in this patchset are identical and don't introduce any behavior change. -v2: documentation updated as per Paul Menage's suggestion. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Menage <paul@paulmenage.org> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: James Morris <jmorris@namei.org>
2011-12-13threadgroup: extend threadgroup_lock() to cover exit and execTejun Heo1-6/+41
threadgroup_lock() protected only protected against new addition to the threadgroup, which was inherently somewhat incomplete and problematic for its only user cgroup. On-going migration could race against exec and exit leading to interesting problems - the symmetry between various attach methods, task exiting during method execution, ->exit() racing against attach methods, migrating task switching basic properties during exec and so on. This patch extends threadgroup_lock() such that it protects against all three threadgroup altering operations - fork, exit and exec. For exit, threadgroup_change_begin/end() calls are added to exit_signals around assertion of PF_EXITING. For exec, threadgroup_[un]lock() are updated to also grab and release cred_guard_mutex. With this change, threadgroup_lock() guarantees that the target threadgroup will remain stable - no new task will be added, no new PF_EXITING will be set and exec won't happen. The next patch will update cgroup so that it can take full advantage of this change. -v2: beefed up comment as suggested by Frederic. -v3: narrowed scope of protection in exit path as suggested by Frederic. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Menage <paul@paulmenage.org> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2011-12-13threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsemTejun Heo2-20/+19
Make the following renames to prepare for extension of threadgroup locking. * s/signal->threadgroup_fork_lock/signal->group_rwsem/ * s/threadgroup_fork_read_lock()/threadgroup_change_begin()/ * s/threadgroup_fork_read_unlock()/threadgroup_change_end()/ * s/threadgroup_fork_write_lock()/threadgroup_lock()/ * s/threadgroup_fork_write_unlock()/threadgroup_unlock()/ This patch doesn't cause any behavior change. -v2: Rename threadgroup_change_done() to threadgroup_change_end() per KAMEZAWA's suggestion. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Menage <paul@paulmenage.org>
2011-12-13netem: add cell concept to simulate special MAC behaviorHagen Paul Pfeifer1-0/+3
This extension can be used to simulate special link layer characteristics. Simulate because packet data is not modified, only the calculation base is changed to delay a packet based on the original packet size and artificial cell information. packet_overhead can be used to simulate a link layer header compression scheme (e.g. set packet_overhead to -20) or with a positive packet_overhead value an additional MAC header can be simulated. It is also possible to "replace" the 14 byte Ethernet header with something else. cell_size and cell_overhead can be used to simulate link layer schemes, based on cells, like some TDMA schemes. Another application area are MAC schemes using a link layer fragmentation with a (small) header each. Cell size is the maximum amount of data bytes within one cell. Cell overhead is an additional variable to change the per-cell-overhead (e.g. 5 byte header per fragment). Example (5 kbit/s, 20 byte per packet overhead, cell-size 100 byte, per cell overhead 5 byte): tc qdisc add dev eth0 root netem rate 5kbit 20 100 5 Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13tcp memory pressure controlsGlauber Costa1-0/+1
This patch introduces memory pressure controls for the tcp protocol. It uses the generic socket memory pressure code introduced in earlier patches, and fills in the necessary data in cg_proto struct. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com> CC: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13socket: initial cgroup code.Glauber Costa1-0/+22
The goal of this work is to move the memory pressure tcp controls to a cgroup, instead of just relying on global conditions. To avoid excessive overhead in the network fast paths, the code that accounts allocated memory to a cgroup is hidden inside a static_branch(). This branch is patched out until the first non-root cgroup is created. So when nobody is using cgroups, even if it is mounted, no significant performance penalty should be seen. This patch handles the generic part of the code, and has nothing tcp-specific. Signed-off-by: Glauber Costa <glommer@parallels.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtsu.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: David S. Miller <davem@davemloft.net> CC: Eric W. Biederman <ebiederm@xmission.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-13Merge branch 'for-next/dwc3' of ↵Greg Kroah-Hartman12-16/+34
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next * 'for-next/dwc3' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (392 commits) usb: dwc3: ep0: fix for possible early delayed_status usb: dwc3: gadget: fix stream enable bit usb: dwc3: ep0: fix GetStatus handling (again) usb: dwc3: ep0: use dwc3_request for ep0 requsts instead of usb_request usb: dwc3: use correct hwparam register for power mgm check usb: dwc3: omap: move to module_platform_driver usb: dwc3: workaround: missing disconnect event usb: dwc3: workaround: missing USB3 Reset event usb: dwc3: workaround: U1/U2 -> U0 transiton usb: dwc3: gadget: return early in dwc3_cleanup_done_reqs() usb: dwc3: ep0: handle delayed_status again usb: dwc3: ep0: push ep0state into xfernotready processing usb: dwc3: fix sparse errors usb: dwc3: fix few coding style problems usb: dwc3: move generic dwc3 code from gadget into core usb: dwc3: use a helper function for operation mode setting usb: dwc3: ep0: don't use ep0in for transfers usb: dwc3: ep0: use proper endianess in SetFeature for wIndex usb: dwc3: core: drop DWC3_EVENT_BUFFERS_MAX usb: dwc3: omap: add multiple instances support to OMAP ...
2011-12-13USB: Remove the duplicate definition of HUB_SET_DEPTHQinglin Ye1-1/+0
The macro HUB_SET_DEPTH is defined twice in ch11.h (introduced by commit 0eadcc0 "usb: USB3.0 ch11 definitions" and dbe79bb "USB 3.0 Hub Changes"), so remove the duplicate one in the USB 2.0 part. Signed-off-by: Qinglin Ye <yestyle@gmail.com> Cc: John Youn <John.Youn@synopsys.com> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-13of: create of_phandle_args to simplify return of phandle parsing dataGrant Likely2-6/+15
of_parse_phandle_with_args() needs to return quite a bit of data. Rather than making each datum a separate **out_ argument, this patch creates struct of_phandle_args to contain all the returned data and reworks the user of the function. This patch also enables of_parse_phandle_with_args() to return the device node pointer for the phandle node. This patch also ends up being fairly major surgery to of_parse_handle_with_args(). The existing structure didn't work well when extending to use of_phandle_args, and I discovered bugs during testing. I also took the opportunity to rename the function to be like the existing of_parse_phandle(). v2: - moved declaration of of_phandle_args to fix compile on non-DT builds - fixed incorrect index in example usage - fixed incorrect return code handling for empty entries Reviewed-by: Shawn Guo <shawn.guo@freescale.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-12-12Merge branch 'master' of ↵John W. Linville3-1/+23
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2011-12-12[media] DVB: Query DVB frontend delivery capabilitiesManu Abraham2-2/+4
Currently, for any multi-standard frontend it is assumed that it just has a single standard capability. This is fine in some cases, but makes things hard when there are incompatible standards in conjuction. Eg: DVB-S can be seen as a subset of DVB-S2, but the same doesn't hold the same for DSS. This is not specific to any driver as it is, but a generic issue. This was handled correctly in the multiproto tree, while such functionality is missing from the v5 API update. http://www.linuxtv.org/pipermail/vdr/2008-November/018417.html Later on a FE_CAN_2G_MODULATION was added as a hack to workaround this issue in the v5 API, but that hack is incapable of addressing the issue, as it can be used to simply distinguish between DVB-S and DVB-S2 alone, or another X vs X2 modulation. If there are more systems, then you have a potential issue. An application needs to query the device capabilities before requesting any operation from the device. Signed-off-by: Manu Abraham <abraham.manu@gmail.com> Acked-by: Andreas Oberritter <obi@linuxtv.org> Acked-by: Oliver Endriss <o.endriss@gmx.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
2011-12-12Merge branch 'mfd/wm8994' of ↵Mark Brown33-50/+287
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc into for-3.3
2011-12-12mfd: Convert wm8994 to use generic regmap irq_chipMark Brown1-2/+1
Factor out the irq_chip implementation, substantially reducing the code size for the driver. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12Merge branch 'topic/irq' of ↵Mark Brown1-0/+48
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into wm8994-mfd
2011-12-12mfd: Mark WM1811 GPIO6 register volatile for later revisionsMark Brown1-0/+1
For later chip revisions the WM1811 GPIO6 register is always volatile so store the device revision when initialising the driver and then check at runtime if we're running on a newer device. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12mfd: Add missing mutex.h inclusion to WM8994 core.hMark Brown1-0/+1
struct wm8994 includes a mutex so we need to include mutex.h before we declare it. All current users rely on this being done implicitly. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12mfd: Constify WM8994 regulator_init_dataMark Brown1-1/+1
The driver has no need to modify the regulator_init_data so declare it const to allow machine code to do so. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12mfd: Enable register cache for wm8994 devicesMark Brown1-2/+0
As part of this we provide information about the registers that exist in the device to the regmap core, drop the small amount of cache that the core had been using and let regmap do the sync. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12Merge branch 'topic/cache' of ↵Mark Brown1-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into HEAD
2011-12-12mfd: Define some additional wm8994 registersMark Brown1-0/+96
Add a bunch of definitions for wm8994 registers that are not currently used by software. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12mfd: Disable more pulls on WM8994Mark Brown1-0/+6
Disable more pulls by default on WM8994 for a small current saving. Since some designs do leave SPKMODE floating provide platform data to allow that to be left enabled. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-12-12iommu/amd: Add routines to bind/unbind a pasidJoerg Roedel1-0/+26
This patch adds routines to bind a specific process address-space to a given PASID. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12iommu/amd: Implement device aquisition code for IOMMUv2Joerg Roedel1-1/+22
This patch adds the amd_iommu_init_device() and amd_iommu_free_device() functions which make a device and the IOMMU ready for IOMMUv2 usage. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12iommu/amd: Add device errata handlingJoerg Roedel1-0/+18
Add infrastructure for errata-handling and handle two known erratas in the IOMMUv2 code. Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-12-12UAPI: elf_read_implies_exec() is a kernel-only feature - so hide from userspaceDavid Howells1-9/+9
elf_read_implies_exec() is a kernel-only feature as the second parameter is a constant that isn't exported to userspace. Not only that, but the arch-specific overrides are not exported either. So hide the macro from userspace. Similarly, struct file should not be predeclared in userspace. Signed-off-by: David Howells <dhowells@redhat.com>
2011-12-12usb: gadget: rename usb_gadget_driver::speed to max_speedMichal Nazarewicz1-2/+2
This commit renames the “speed” field of the usb_gadget_driver structure to “max_speed”. This is so that to make it more apparent that the field represents the maximum speed gadget driver can support. This also make the field look more like fields with the same name in usb_gadget and usb_composite_driver structures. All of those represent the *maximal* speed given entity supports. After this commit, there are the following fields in various structures: * usb_gadget::speed - the current connection speed, * usb_gadget::max_speed - maximal speed UDC supports, * usb_gadget_driver::max_speed - maximal speed gadget driver supports, and * usb_composite_driver::max_speed - maximal speed composite gadget supports. Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-12-12usb: gadget: replace usb_gadget::is_dualspeed with max_speedMichal Nazarewicz1-5/+5
This commit replaces usb_gadget's is_dualspeed field with a max_speed field. [ balbi@ti.com : Fixed DWC3 driver ] Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-12-12usb: gadget: renesas_usbhs: add platform power control functionKuninori Morimoto1-0/+8
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-12-12usb: gadget: renesas_usbhs: tidyup the unit of detection_delayKuninori Morimoto1-1/+1
detection_delay was assumed as msec Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-12-12Input: add driver for Sharp gp2ap002a00f proximity sensorCourtney Cavin1-0/+22
This driver adds support for Sharp's GP2AP002A00F proximity sensor. The proximity is measured as a binary switch, i.e. an object is either detected or not detected. Hence, this driver is implemented as a switch that reports SW_FRONT_PROXIMITY. Reviewed-by: Datta Shubhrajyoti <shubhrajyoti@ti.com> Signed-off-by: Courtney Cavin <courtney.cavin@sonyericsson.com> Signed-off-by: Oskar Andero <oskar.andero@sonyericsson.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-12-12Merge branch 'topic/irq' of ↵Mark Brown1-0/+48
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into mfd/da9052
2011-12-12types.h: fix comment spelling for 'architectures'Mark Einon1-1/+1
Spelling change, architetures -> architectures Signed-off-by: Mark Einon <mark.einon@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-12-12net: use IS_ENABLED(CONFIG_IPV6)Eric Dumazet4-10/+10
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-11driver-core/cpu: Expose hotpluggability to the rest of the kernelJosh Triplett1-0/+1
When architectures register CPUs, they indicate whether the CPU allows hotplugging; notably, x86 and ARM don't allow hotplugging CPU 0. Userspace can easily query the hotpluggability of a CPU via sysfs; however, the kernel has no convenient way of accessing that property in an architecture-independent way. While the kernel can simply try it and see, some code needs to distinguish between "hotplug failed" and "hotplug has no hope of working on this CPU"; for example, rcutorture's CPU hotplug tests want to avoid drowning out real hotplug failures with expected failures. Expose this property via a new cpu_is_hotpluggable function, so that the rest of the kernel can access it in an architecture-independent way. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11rcu: Document same-context read-side constraintsPaul E. McKenney2-0/+20
The intent is that a given RCU read-side critical section be confined to a single context. For example, it is illegal to invoke rcu_read_lock() in an exception handler and then invoke rcu_read_unlock() from the context of the task that received the exception. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11nohz: Remove tick_nohz_idle_enter_norcu() / tick_nohz_idle_exit_norcu()Frederic Weisbecker1-46/+1
Those two APIs were provided to optimize the calls of tick_nohz_idle_enter() and rcu_idle_enter() into a single irq disabled section. This way no interrupt happening in-between would needlessly process any RCU job. Now we are talking about an optimization for which benefits have yet to be measured. Let's start simple and completely decouple idle rcu and dyntick idle logics to simplify. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11sched: Add is_idle_task() to handle invalidated uses of idle_cpu()Paul E. McKenney1-0/+8
Commit 908a3283 (Fix idle_cpu()) invalidated some uses of idle_cpu(), which used to say whether or not the CPU was running the idle task, but now instead says whether or not the CPU is running the idle task in the absence of pending wakeups. Although this new implementation gives a better answer to the question "is this CPU idle?", it also invalidates other uses that were made of idle_cpu(). This commit therefore introduces a new is_idle_task() API member that determines whether or not the specified task is one of the idle tasks, allowing open-coded "->pid == 0" sequences to be replaced by something more meaningful. Suggested-by: Josh Triplett <josh@joshtriplett.org> Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11rcu: Introduce raw SRCU read-side primitivesPaul E. McKenney1-0/+43
The RCU implementations, including SRCU, are designed to be used in a lock-like fashion, so that the read-side lock and unlock primitives must execute in the same context for any given read-side critical section. This constraint is enforced by lockdep-RCU. However, there is a need to enter an SRCU read-side critical section within the context of an exception and then exit in the context of the task that encountered the exception. The cost of this capability is that the read-side operations incur the overhead of disabling interrupts. Note that although the current implementation allows a given read-side critical section to be entered by one task and then exited by another, all known possible implementations that allow this have scalability problems. Therefore, a given read-side critical section must be exited by the same task that entered it, though perhaps from an interrupt or exception handler running within that task's context. But if you are thinking in terms of interrupt handlers, make sure that you have considered the possibility of threaded interrupt handlers. Credit goes to Peter Zijlstra for suggesting use of the existing _raw suffix to indicate disabling lockdep over the earlier "bulkref" names. Requested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
2011-12-11nohz: Allow rcu extended quiescent state handling seperately from tick stopFrederic Weisbecker1-3/+43
It is assumed that rcu won't be used once we switch to tickless mode and until we restart the tick. However this is not always true, as in x86-64 where we dereference the idle notifiers after the tick is stopped. To prepare for fixing this, add two new APIs: tick_nohz_idle_enter_norcu() and tick_nohz_idle_exit_norcu(). If no use of RCU is made in the idle loop between tick_nohz_enter_idle() and tick_nohz_exit_idle() calls, the arch must instead call the new *_norcu() version such that the arch doesn't need to call rcu_idle_enter() and rcu_idle_exit(). Otherwise the arch must call tick_nohz_enter_idle() and tick_nohz_exit_idle() and also call explicitly: - rcu_idle_enter() after its last use of RCU before the CPU is put to sleep. - rcu_idle_exit() before the first use of RCU after the CPU is woken up. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-12-11nohz: Separate out irq exit and idle loop dyntick logicFrederic Weisbecker1-6/+7
The tick_nohz_stop_sched_tick() function, which tries to delay the next timer tick as long as possible, can be called from two places: - From the idle loop to start the dytick idle mode - From interrupt exit if we have interrupted the dyntick idle mode, so that we reprogram the next tick event in case the irq changed some internal state that requires this action. There are only few minor differences between both that are handled by that function, driven by the ts->inidle cpu variable and the inidle parameter. The whole guarantees that we only update the dyntick mode on irq exit if we actually interrupted the dyntick idle mode, and that we enter in RCU extended quiescent state from idle loop entry only. Split this function into: - tick_nohz_idle_enter(), which sets ts->inidle to 1, enters dynticks idle mode unconditionally if it can, and enters into RCU extended quiescent state. - tick_nohz_irq_exit() which only updates the dynticks idle mode when ts->inidle is set (ie: if tick_nohz_idle_enter() has been called). To maintain symmetry, tick_nohz_restart_sched_tick() has been renamed into tick_nohz_idle_exit(). This simplifies the code and micro-optimize the irq exit path (no need for local_irq_save there). This also prepares for the split between dynticks and rcu extended quiescent state logics. We'll need this split to further fix illegal uses of RCU in extended quiescent states in the idle loop. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11rcu: Make srcu_read_lock_held() call common lockdep-enabled functionPaul E. McKenney1-1/+4
A common debug_lockdep_rcu_enabled() function is used to check whether RCU lockdep splats should be reported, but srcu_read_lock() does not use it. This commit therefore brings srcu_read_lock_held() up to date. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11rcu: Warn when srcu_read_lock() is used in an extended quiescent statePaul E. McKenney1-13/+23
Catch SRCU up to the other variants of RCU by making PROVE_RCU complain if either srcu_read_lock() or srcu_read_lock_held() are used from within RCU-idle mode. Frederic reworked this to allow for the new versions of his patches that check for extended quiescent states. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11rcu: Remove one layer of abstraction from PROVE_RCU checkingPaul E. McKenney1-45/+8
Simplify things a bit by substituting the definitions of the single-line rcu_read_acquire(), rcu_read_release(), rcu_read_acquire_bh(), rcu_read_release_bh(), rcu_read_acquire_sched(), and rcu_read_release_sched() functions at their call points. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-12-11rcu: Warn when rcu_read_lock() is used in extended quiescent stateFrederic Weisbecker1-10/+42
We are currently able to detect uses of rcu_dereference_check() inside extended quiescent states (such as the RCU-free window in idle). But rcu_read_lock() and friends can be used without rcu_dereference(), so that the earlier commit checking for use of rcu_dereference() and friends while in RCU idle mode miss some error conditions. This commit therefore adds extended quiescent state checking to rcu_read_lock() and friends. Uses of RCU from within RCU-idle mode are totally ignored by RCU, hence the importance of these checks. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>